Google Brain’s Switch Transformer language model packs a whopping 1.6 trillion parameters while keeping computational cost under control: each token is routed to a single expert per layer in a sparse mixture-of-experts design, so per-token compute stays roughly constant even as parameters grow. The model achieved a 4x pretraining speedup over a strongly tuned T5-XXL baseline.
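The core trick is the top-1 "switch" routing layer. Below is a minimal PyTorch sketch of that idea; the module names and dimensions are illustrative (the actual model is implemented in Mesh TensorFlow), but the routing logic mirrors the paper's description.

```python
# Minimal sketch of Switch Transformer-style top-1 expert routing.
# Names and sizes are illustrative, not from the official codebase.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchFFN(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # token -> expert logits
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                       # x: (tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)
        gate, idx = probs.max(dim=-1)           # top-1: each token picks ONE expert
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e                     # tokens routed to expert e
            if mask.any():
                out[mask] = gate[mask, None] * expert(x[mask])
        return out

# Adding experts grows the parameter count, but each token still passes
# through exactly one expert, which is how per-token FLOPs stay flat.
y = SwitchFFN()(torch.randn(16, 512))
```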
Recent AI research on speech separation has explored ways to associate lip motions in videos with audio, but this approach suffers when speakers’ lips are occluded, which they often are in busy multi-speaker environments.
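The general recipe these systems share is to fuse a lip-motion embedding with features of the audio mixture and predict a time-frequency mask for the target speaker. The sketch below is a hypothetical, simplified version of that recipe, not any specific paper's model; all module names and shapes are assumptions.

```python
# Hypothetical sketch of audio-visual speech separation: fuse per-frame
# lip features with mixture spectrogram features, predict a soft mask.
import torch
import torch.nn as nn

class AVSeparator(nn.Module):
    def __init__(self, n_freq=257, d_lip=128, d_hid=256):
        super().__init__()
        self.audio_enc = nn.LSTM(n_freq, d_hid, batch_first=True)
        self.fuse = nn.Linear(d_hid + d_lip, d_hid)
        self.mask_head = nn.Linear(d_hid, n_freq)

    def forward(self, mix_spec, lip_emb):
        # mix_spec: (batch, frames, n_freq) magnitude spectrogram of the mixture
        # lip_emb:  (batch, frames, d_lip) lip-motion features from the video
        a, _ = self.audio_enc(mix_spec)
        h = torch.relu(self.fuse(torch.cat([a, lip_emb], dim=-1)))
        mask = torch.sigmoid(self.mask_head(h))   # time-frequency mask in [0, 1]
        return mask * mix_spec                     # estimated target speaker

est = AVSeparator()(torch.randn(2, 100, 257), torch.randn(2, 100, 128))
```

When the lips are occluded, `lip_emb` carries little or no signal and the predicted mask degrades, which is exactly the failure mode described above.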
VOGUE is an AI-powered optimization method that deforms garments according to a given body shape while preserving pattern and material details, delivering state-of-the-art photorealistic, high-resolution virtual try-on images.
This year, 22 Transformer-related research papers were accepted by NeurIPS, the world’s most prestigious machine learning conference. Synced has selected ten of these works to showcase the latest Transformer trends.
The approach dramatically reduces bandwidth requirements by transmitting only a compact keypoint representation of the speaker’s face and reconstructing the video on the receiver side, where a generative adversarial network (GAN) synthesizes the talking head from those keypoints.
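A rough sketch of the sender/receiver pipeline is below. The detector and generator are stand-ins for learned networks (the actual system uses a GAN), and the keypoint count and frame size are assumptions chosen only to show why the bandwidth savings are so large.

```python
# Hypothetical sketch of keypoint-based video compression. The functions
# stand in for learned networks; the numbers are illustrative only.
import numpy as np

N_KEYPOINTS = 10          # compact facial keypoint set (assumed)
FLOATS_PER_KP = 2         # (x, y) coordinates per keypoint

def detect_keypoints(frame):
    """Stand-in for a learned facial keypoint detector."""
    return np.zeros((N_KEYPOINTS, FLOATS_PER_KP), dtype=np.float32)

def generate_frame(reference_frame, keypoints):
    """Stand-in for the GAN generator that re-animates the reference face."""
    return reference_frame  # a real generator warps/synthesizes a new frame

# Sender: send one reference frame once, then only keypoints per frame.
reference = np.zeros((256, 256, 3), dtype=np.uint8)
payload_per_frame = detect_keypoints(reference).nbytes   # 80 bytes
raw_frame_bytes = reference.nbytes                        # 196,608 bytes

# Receiver: reconstruct each frame from the reference + incoming keypoints.
frame = generate_frame(reference, detect_keypoints(reference))

print(f"{raw_frame_bytes / payload_per_frame:.0f}x fewer bytes per frame")  # ~2458x
```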
DeepMind, Google’s UK-based AI research company, says its AlphaFold system has solved the protein folding problem, a grand challenge that has vexed the biology research community for half a century.
New Spoken Language Understanding (SLU) research from MIT CSAIL and Amazon AI introduces semi-supervised frameworks that skip the intermediate speech-to-text step, taking speech directly as input while achieving performance competitive with systems that leverage oracle text.
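The end-to-end idea can be sketched as a model that maps speech features straight to an intent label with no transcript anywhere in the loop. The architecture below is purely illustrative, not the MIT CSAIL/Amazon model; in the semi-supervised setting the speech encoder would additionally be pretrained on unlabeled or weakly labeled audio.

```python
# Illustrative sketch of end-to-end SLU: audio features in, intent out,
# with no intermediate transcription step. Shapes/sizes are assumptions.
import torch
import torch.nn as nn

class SpeechToIntent(nn.Module):
    def __init__(self, n_mels=80, d_hid=256, n_intents=31):
        super().__init__()
        self.encoder = nn.GRU(n_mels, d_hid, num_layers=2, batch_first=True)
        self.classifier = nn.Linear(d_hid, n_intents)

    def forward(self, mel):                    # mel: (batch, frames, n_mels)
        h, _ = self.encoder(mel)
        return self.classifier(h.mean(dim=1))  # pool over time -> intent logits

logits = SpeechToIntent()(torch.randn(4, 300, 80))  # no text in the pipeline
```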
“Our research provides enriched AR user experiences by enabling a more fine-grained visual recognition feature in AR, which is desirable in a wide range of application scenarios including technical support,” IBM researchers say.