AI Machine Learning & Data Science Research

Amazon’s Sockeye 3: Neural Machine Translation With PyTorch That Is 126% Faster on GPUs

Amazon has introduced the latest version of their Sockeye toolkit for the efficient training of stronger and faster neural machine translation (NMT) models. Sockeye 3 achieves speeds up to 126 percent faster than other PyTorch implementations on GPUs and up to 292 percent faster on CPUs.

Anyone who regularly uses machine translation systems will have noticed huge performance improvements over the last few years, attributable to neural network-based models that have largely replaced the previous generation of phrase-based systems.

Introduced in 2018, Sockeye is an open-source framework for fast and reliable neural machine translation (NMT) that has been powering Amazon Translate and other NMT applications. Originally built on MXNet, the toolkit saw its second major release, Sockeye 2, in 2020.

In the new paper Sockeye 3: Fast Neural Machine Translation with PyTorch, an Amazon team presents the latest version of the Sockeye toolkit for efficient training of stronger and faster models. Sockeye 3 achieves speeds up to 126 percent faster than other PyTorch implementations on GPUs and up to 292 percent faster on CPUs.

Sockeye 3 uses a distributed mixed-precision training strategy that speeds up calculations and fits larger batches into memory. Moreover, it can scale to any number of GPUs and any size of training data by launching separate training processes that use PyTorch's distributed data parallelism to synchronize updates.
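The combination described above can be sketched in a few lines of PyTorch. This is a minimal illustration of mixed-precision training under distributed data parallelism, not Sockeye's actual code: the toy linear model, single-process "gloo" setup, and loss are stand-ins for the real NMT training loop, and real training would launch one process per GPU.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process setup for illustration; Sockeye-style training launches
# one such process per GPU and DDP synchronizes the gradients between them.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

model = DDP(torch.nn.Linear(16, 16))   # toy stand-in for an NMT model
optimizer = torch.optim.Adam(model.parameters())
use_cuda = torch.cuda.is_available()
# Loss scaling avoids fp16 gradient underflow; it is a no-op on CPU here.
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

x = torch.randn(8, 16)
with torch.autocast(device_type="cuda" if use_cuda else "cpu",
                    dtype=torch.float16 if use_cuda else torch.bfloat16):
    loss = model(x).pow(2).mean()      # forward pass runs in half precision
scaler.scale(loss).backward()          # scaled backward, grads synced by DDP
scaler.step(optimizer)
scaler.update()
dist.destroy_process_group()
```

Running the forward pass in half precision roughly halves activation memory, which is what allows the larger batches mentioned above.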

For inference design, Sockeye 3 uses static computation graphs to minimize the impacts of dynamic shapes and data-dependent control flow, enabling it to trace various model components via PyTorch’s JIT compiler.
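Tracing a model component into a static graph with PyTorch's JIT compiler looks roughly like the following. The tiny encoder here is a hypothetical stand-in, not Sockeye's actual architecture; it only shows the `torch.jit.trace` mechanism the paragraph refers to.

```python
import torch

class TinyEncoder(torch.nn.Module):
    """Hypothetical stand-in for a traceable model component."""
    def __init__(self):
        super().__init__()
        self.embed = torch.nn.Embedding(100, 32)
        self.proj = torch.nn.Linear(32, 32)

    def forward(self, tokens):
        # No data-dependent control flow, so tracing captures it faithfully.
        return self.proj(self.embed(tokens))

encoder = TinyEncoder().eval()
example = torch.randint(0, 100, (1, 10))       # example input for tracing
traced = torch.jit.trace(encoder, example)     # records a static graph
out = traced(torch.randint(0, 100, (1, 10)))   # runs the compiled graph
```

Because `trace` records the operations executed on the example input rather than compiling arbitrary Python, components are designed without data-dependent branching, which is exactly the constraint the text describes.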

The developers also maintain backward compatibility with Sockeye 2 MXNet models — all models that were trained with Sockeye 2 can be converted to models running on Sockeye 3 with PyTorch.

Sockeye 3 also introduces many advanced new features: it supports replacing the decoder's self-attention layers with Simpler Simple Recurrent Units (SSRUs), enables fine-tuning with parameter freezing, and lets users specify arbitrary prefixes (sequences of tokens) on both the source and target sides for any input.
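The target-prefix feature amounts to forcing the first decoding steps to emit given tokens before free decoding takes over. The following pure-Python sketch shows the idea with a hypothetical `score_fn` interface; it is an illustration of prefix-constrained greedy decoding in general, not Sockeye's API.

```python
def greedy_decode(score_fn, vocab, target_prefix=(), max_len=5, eos="</s>"):
    """Greedy decoding that forces an optional target-side prefix.

    score_fn(history) -> {token: score} is a hypothetical toy interface
    standing in for a real decoder step.
    """
    output = []
    for step in range(max_len):
        if step < len(target_prefix):
            token = target_prefix[step]  # forced: emit the prefix token
        else:
            scores = score_fn(tuple(output))
            token = max(vocab, key=lambda t: scores.get(t, float("-inf")))
        if token == eos:
            break
        output.append(token)
    return output

# Toy model: after "hello" it prefers "world", then ends the sentence.
vocab = ["hello", "world", "</s>"]
def toy_scores(history):
    return {"world": 1.0} if history == ("hello",) else {"</s>": 1.0}

result = greedy_decode(toy_scores, vocab, target_prefix=("hello",))
# → ["hello", "world"]
```

Without the prefix, the toy model would immediately emit `</s>`; forcing `("hello",)` steers the continuation, which is the practical use of source- and target-side prefixes.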

In their empirical studies, the team compared Sockeye with benchmark NMT toolkits that included Fairseq (Ott et al., 2019) and OpenNMT (Klein et al., 2017).

In the evaluations, Sockeye 3 achieved comparable or better performance on both GPUs and CPUs: a 15 percent speedup for batched GPU inference, 126 percent for non-batched GPU inference, and 292 percent for CPU inference.

Overall, Sockeye 3 provides much faster model implementations and more advanced features for NMT. As with previous versions, it has been open-sourced under an Apache 2.0 license, and the Amazon team welcomes pull requests from community members.

The code is available on the project’s GitHub. The paper Sockeye 3: Fast Neural Machine Translation with PyTorch is on arXiv.


Author: Hecate He | Editor: Michael Sarazen

