Encoder-decoder models have become the preferred approach for a wide range of language-related tasks. Although some common logical functions are shared between different tasks, most contemporary encoder-decoder models are trained end-to-end for a specified task. This specialization increases the compute burden during training and results in less generally interpretable architectures.
Meta AI researchers address these issues in the new paper LegoNN: Building Modular Encoder-Decoder Models, proposing a procedure for building encoder-decoder architectures with decoder modules that can be shared across sequence generation tasks such as machine translation (MT) and automatic speech recognition (ASR) without requiring finetuning or suffering significant performance reductions.
Introducing modularity to encoder-decoder architectures enables reusability, which can save computational resources and benefit under-resourced tasks by utilizing shareable components from higher-resourced tasks.
The LegoNN encoders provide an interpretable interface by outputting a sequence of distributions over a discrete vocabulary derived from the final output labels. A Connectionist Temporal Classification (CTC) loss is applied to these outputs. The researchers also build a modality-agnostic encoder for sequence prediction tasks. It relies on an output length controller (OLC) unit, which uses cross-attention between two groups of transformer layers so that a module can work with fractional length ratios between its inputs and outputs. LegoNN can thus share trained decoders and intermediate modules between different tasks and domains without jointly training for the tasks or requiring finetuning.
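To make the two ideas in this paragraph concrete, here is a minimal, untrained sketch in plain Python. It is not the paper's implementation. The function names, the use of sinusoidal positional encodings as queries, the `ceil(ratio * T)` length rule and the toy vocabulary are all assumptions made for illustration. The first part resamples a sequence to a fractional multiple of its length with cross-attention, and the second part reads a sequence of vocabulary distributions with the standard greedy CTC rule (collapse repeats, then drop blanks).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def positional_encoding(pos, dim):
    """Sinusoidal encoding of one position (sin on even, cos on odd indices)."""
    enc = []
    for i in range(dim):
        angle = pos / (10000 ** ((i - i % 2) / dim))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc


def output_length_controller(states, ratio):
    """Cross-attend from ceil(ratio * T) positional queries to T input states.

    Assumption: queries are bare positional encodings and there are no learned
    projections, so only the length-changing behaviour is illustrated.
    """
    dim = len(states[0])
    out_len = math.ceil(ratio * len(states))
    outputs = []
    for pos in range(out_len):
        query = positional_encoding(pos, dim)
        # Scaled dot-product score of this query against every input state.
        scores = [
            sum(q * k for q, k in zip(query, state)) / math.sqrt(dim)
            for state in states
        ]
        weights = softmax(scores)
        # Each output vector is a weighted average of the input states.
        outputs.append(
            [sum(w * state[j] for w, state in zip(weights, states)) for j in range(dim)]
        )
    return outputs


def ctc_greedy_readout(distributions, vocab, blank=0):
    """Greedy CTC decoding: argmax per step, collapse repeats, drop blanks."""
    tokens = []
    previous = None
    for dist in distributions:
        best = max(range(len(dist)), key=dist.__getitem__)
        if best != previous and best != blank:
            tokens.append(vocab[best])
        previous = best
    return tokens


# Toy usage: 8 input states resampled with a fractional ratio of 0.75.
toy_states = [[float(t), 1.0, 0.0, -1.0] for t in range(8)]
resampled = output_length_controller(toy_states, ratio=0.75)
print(len(toy_states), "->", len(resampled))

# Toy usage: a hand-written sequence of distributions over a 3-token vocabulary.
toy_vocab = ["<blank>", "h", "i"]
toy_distributions = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.1, 0.8],
    [0.1, 0.1, 0.8],
]
print(ctc_greedy_readout(toy_distributions, toy_vocab))
```

Because the encoder's output is a sequence of distributions over real vocabulary items, it can be inspected directly as in the second usage example, which is what makes the interface between modules interpretable.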
Given a typical German to English (De-En) MT system, the LegoNN framework can be used to build additional ASR and MT language systems without constructing a new dedicated decoder. Developers can instead build only a new encoder system and reuse the existing decoder module.
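The reuse pattern described here can be sketched as a small composition check in Python. This is an illustrative toy, not code from the paper. `Interface`, `Encoder`, `Decoder` and `compose` are hypothetical names, the encoders are stubs that return fixed distributions, and the decoder's argmax step stands in for a trained network. The point is only that any encoder emitting distributions over the same vocabulary can be paired with an existing decoder.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Distribution = List[float]


@dataclass(frozen=True)
class Interface:
    """Hypothetical description of the vocabulary shared between modules."""
    vocab: Tuple[str, ...]


@dataclass
class Encoder:
    """Any module that maps a source input to distributions over the vocabulary."""
    name: str
    interface: Interface
    encode: Callable[[object], List[Distribution]]


@dataclass
class Decoder:
    """A module that consumes distributions over the shared vocabulary."""
    name: str
    interface: Interface

    def decode(self, distributions: List[Distribution]) -> List[str]:
        for dist in distributions:
            if len(dist) != len(self.interface.vocab):
                raise ValueError("distribution size does not match the vocabulary")
        # Placeholder for a trained decoder: pick the most likely token per step.
        return [
            self.interface.vocab[max(range(len(d)), key=d.__getitem__)]
            for d in distributions
        ]


def compose(encoder: Encoder, decoder: Decoder) -> Callable[[object], List[str]]:
    """Pair an encoder with a decoder only if they share the same interface."""
    if encoder.interface != decoder.interface:
        raise ValueError(f"{encoder.name} and {decoder.name} use different interfaces")
    return lambda source: decoder.decode(encoder.encode(source))


# Assumed setup: one English-side vocabulary, one decoder trained once.
english = Interface(vocab=("<blank>", "hello", "world"))
shared_decoder = Decoder("de-en-decoder", english)

# Stub encoders for two tasks; a real system would train only these.
mt_encoder = Encoder("ro-en-text-encoder", english,
                     lambda text: [[0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
asr_encoder = Encoder("en-speech-encoder", english,
                      lambda audio: [[0.2, 0.7, 0.1], [0.05, 0.15, 0.8]])

translate = compose(mt_encoder, shared_decoder)
transcribe = compose(asr_encoder, shared_decoder)
print(translate("salut lume"), transcribe(b"raw-audio-bytes"))
```

The design choice worth noting is that the compatibility check lives in `compose`: a new task needs a new encoder that targets the existing interface, while the decoder object is left untouched.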
In their empirical study, the team evaluated the feasibility of reusing LegoNN modules on various ASR and MT tasks, where the modular models achieved competitive performance, matching or beating baseline models. A LegoNN decoder trained on the De-En WMT (Workshop on Machine Translation) task was able to replace an ASR decoder module without any performance drop, and provided better generation quality when applied to the Romanian to English (Ro-En) WMT task.
The team notes that reusable libraries are common in software development and hopes their paper can help bring a similar paradigm to sequence-to-sequence neural models. Their future research will explore combining the flexibility of LegoNN models with the proven performance of encoder pretraining methods such as those used in Google’s BERT large language model, as well as LegoNN’s zero-shot learning capabilities for speech translation scenarios that rely on a combination of ASR and MT modules.
The paper LegoNN: Building Modular Encoder-Decoder Models is on arXiv.
Author: Hecate He | Editor: Michael Sarazen