An Apple research team explores multiple architectures and training procedures to develop a novel multi-speaker, multi-lingual neural TTS system. The study combines speech from 30 speakers across 15 locales in 8 languages, and demonstrates that for the vast majority of voices, such multi-lingual, multi-speaker models can yield better quality than single-speaker models.
A research team from University of California San Diego and Microsoft proposes Micro-Factorized Convolution (MF-Conv), a novel approach designed for extremely low computational budgets (4M–21M FLOPs) that achieves significant performance gains over state-of-the-art models in the low-FLOP regime.
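The core trick in this family of low-FLOP designs is to replace a dense pointwise convolution with grouped convolutions plus a channel shuffle. Below is a minimal PyTorch sketch of that factorization; the channel sizes and group counts are illustrative choices, not the paper's exact MF-Conv configuration.

```python
# Sketch: factorizing a pointwise (1x1) convolution into two group
# convolutions with a channel shuffle in between -- the general idea behind
# low-FLOP designs such as MF-Conv. Sizes here are illustrative only.
import torch
import torch.nn as nn

class FactorizedPointwise(nn.Module):
    def __init__(self, channels: int, reduced: int, groups: int):
        super().__init__()
        # Two grouped 1x1 convs cost ~C*R/G + R*C/G multiply-adds per pixel,
        # versus C*C for a dense pointwise conv.
        self.reduce = nn.Conv2d(channels, reduced, 1, groups=groups, bias=False)
        self.expand = nn.Conv2d(reduced, channels, 1, groups=groups, bias=False)
        self.groups = groups

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.reduce(x)
        # Channel shuffle so information crosses group boundaries.
        b, c, h, w = x.shape
        x = x.view(b, self.groups, c // self.groups, h, w)
        x = x.transpose(1, 2).reshape(b, c, h, w)
        return self.expand(x)

x = torch.randn(1, 64, 32, 32)
block = FactorizedPointwise(channels=64, reduced=16, groups=4)
print(block(x).shape)  # torch.Size([1, 64, 32, 32])
```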
In a 200+ page paper, Percy Liang, Fei-Fei Li, and over 100 other researchers from the Stanford University Center for Research on Foundation Models (CRFM) systematically describe the opportunities and risks of large-scale pretrained “foundation” models. The unique study aims to provide a clearer understanding of how these models work, when and how they fail, and the various capabilities provided by their emergent properties.
A research team from Università di Firenze, Università di Siena, University of Cambridge and Université Côte d’Azur proposes a general approach to explainable artificial intelligence (XAI) in neural architectures, designing interpretable deep learning models called Logic Explained Networks (LENs). The novel approach yields better performance than established white-box models while providing more compact and meaningful explanations.
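For a sense of what a LEN's output looks like: an explanation takes the form of a logic formula over input concepts. The formula below is a made-up illustration of that form, not an actual rule from the paper.

```latex
% Illustrative only: the *form* of explanation a LEN produces -- a logic
% formula over input concepts c_1, c_2, c_3 for a predicted class y.
\[
  y \;\leftrightarrow\; (c_1 \land \lnot c_2) \lor c_3
\]
```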
A Google Research team explores the design space of Transformer models in an effort to enable deep learning architectures to solve compositional tasks. The proposed approach provides models with inductive biases via design decisions that significantly impact compositional generalization, and achieves state-of-the-art results on the COGS and PCFG composition benchmarks.
A research team from the University of Science and Technology of China, Microsoft Cloud AI, City University of Hong Kong and Wormpex AI Research proposes a robust and invisible backdoor attack called “Poison Ink” and demonstrates its immunity to state-of-the-art defense techniques.
On August 5, system developers from WeChat AI and Beijing Jiaotong University released the paper WeChat Neural Machine Translation Systems for WMT21, revealing the architecture of their novel neural machine translation (NMT) system and the strategies they adopted to achieve impressive performance in the WMT21 competition.
A research team from Zhejiang University, Wuhan University and Adobe Research proposes Feature Importance-Aware Attacks (FIA) that drastically improve the transferability of adversarial examples, achieving superior performance compared to state-of-the-art transferable attacks.
A DeepMind research team proposes Perceiver IO, a single network that can easily integrate and transform arbitrary information for arbitrary tasks while scaling linearly with both input and output sizes. The general architecture achieves outstanding results on tasks with highly structured output spaces, such as natural language and visual understanding.
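The key structural idea is that inputs and outputs only ever interact with a fixed-size latent array through cross-attention, which is what makes the cost linear in both input length and output length. Here is a minimal PyTorch sketch of this encode-process-decode pattern; dimensions and layer counts are illustrative, not the paper's.

```python
# Sketch of Perceiver IO's encode-process-decode pattern: cross-attend
# inputs into a fixed latent array, self-attend on the latents, then
# cross-attend output queries against the latents.
import torch
import torch.nn as nn

class PerceiverIOSketch(nn.Module):
    def __init__(self, dim=128, num_latents=64, heads=4, process_layers=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.encode = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.process = nn.ModuleList(
            [nn.MultiheadAttention(dim, heads, batch_first=True)
             for _ in range(process_layers)]
        )
        self.decode = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, inputs, output_queries):
        b = inputs.shape[0]
        z = self.latents.expand(b, -1, -1)           # (B, N, D) latent array
        z, _ = self.encode(z, inputs, inputs)        # cross-attend: O(M*N)
        for layer in self.process:                   # latent self-attention: O(N^2)
            attn, _ = layer(z, z, z)
            z = z + attn
        out, _ = self.decode(output_queries, z, z)   # cross-attend: O(O*N)
        return out                                   # (B, O, D)

model = PerceiverIOSketch()
x = torch.randn(2, 1000, 128)   # long input sequence
q = torch.randn(2, 10, 128)     # one query per desired output element
print(model(x, q).shape)        # torch.Size([2, 10, 128])
```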
A Google Research team draws inspiration from two numerical analysis methods — Hierarchical Matrix (H-Matrix) and Multigrid — to address the quadratic complexity problem of attention mechanisms in transformer architectures, proposing a hierarchical attention scheme that has linear complexity in run time and memory.
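To see the flavor of the idea, consider a simplified two-level version: each token attends at full resolution within its local block, and to average-pooled summaries of all blocks, cutting the cost from O(L²) to roughly O(L·(b + L/b)) for block size b. The toy sketch below illustrates only this two-level structure; the paper's actual scheme is multi-level and more refined.

```python
# Toy two-level hierarchical attention: fine-grained attention within local
# blocks plus coarse attention to pooled block summaries.
import torch
import torch.nn.functional as F

def two_level_attention(q, k, v, block=64):
    b, l, d = q.shape
    assert l % block == 0
    nb = l // block
    # Level 0: full-resolution attention within each local block.
    qb = q.view(b, nb, block, d)
    kb = k.view(b, nb, block, d)
    vb = v.view(b, nb, block, d)
    local = torch.einsum('bnqd,bnkd->bnqk', qb, kb) / d**0.5
    # Level 1: each query also sees one pooled summary per block.
    k_coarse = kb.mean(dim=2)                                    # (B, nb, D)
    v_coarse = vb.mean(dim=2)                                    # (B, nb, D)
    coarse = torch.einsum('bnqd,bmd->bnqm', qb, k_coarse) / d**0.5
    # Jointly normalize the fine and coarse scores.
    probs = F.softmax(torch.cat([local, coarse], dim=-1), dim=-1)
    p_local, p_coarse = probs[..., :block], probs[..., block:]
    out = torch.einsum('bnqk,bnkd->bnqd', p_local, vb)
    out = out + torch.einsum('bnqm,bmd->bnqd', p_coarse, v_coarse)
    return out.reshape(b, l, d)

q = k = v = torch.randn(2, 512, 32)
print(two_level_attention(q, k, v).shape)  # torch.Size([2, 512, 32])
```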
A research team from Google Research and Northwestern University presents polynomial time and sample-efficient algorithms for learning an unknown depth-2 feedforward neural network with general ReLU activations, aiming to provide insights into whether efficient algorithms exist for learning ReLU networks.
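Concretely, the learning target is a network of the form below, observed only through input-output samples (x, f(x)); "general" refers to ReLU units with arbitrary biases.

```latex
% The depth-2 ReLU network being learned: k hidden units with weights w_i,
% biases b_i, and output weights a_i.
\[
  f(x) \;=\; \sum_{i=1}^{k} a_i \,\sigma\!\left(w_i^{\top} x + b_i\right),
  \qquad \sigma(t) = \max(t, 0)
\]
```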
A research team from the University of Melbourne, Facebook AI, and Twitter Cortex proposes a black-box test method for systematically assessing and debugging how neural machine translation systems translate numbers. The approach reveals novel classes of errors that recur across multiple state-of-the-art translation systems.
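The general recipe of such black-box testing is easy to sketch: generate sentences with controlled numbers, run them through the system under test, and check whether the numbers survive. The harness below is a heavily simplified illustration, far cruder than the paper's test suite; the translate function is a stand-in for any MT API.

```python
# Minimal black-box numeric test harness for a translation system.
import random
import re

def extract_numbers(text: str) -> set:
    return set(re.findall(r'\d+(?:\.\d+)?', text))

def run_numeric_tests(translate, n_cases: int = 100, seed: int = 0):
    random.seed(seed)
    templates = [
        "The package weighs {} kilograms.",
        "The meeting starts at {} o'clock.",
        "The company reported revenue of {} million dollars.",
    ]
    failures = []
    for _ in range(n_cases):
        number = str(random.choice([random.randint(0, 9999),
                                    round(random.uniform(0, 100), 2)]))
        source = random.choice(templates).format(number)
        target = translate(source)  # black box: no access to model internals
        if number not in extract_numbers(target):
            failures.append((source, target))
    return failures

# Identity "translation" as a placeholder to show the harness running.
print(len(run_numeric_tests(lambda s: s)))  # 0 failures
```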
On July 20, Quanergy Systems announced its 3D LiDAR solution has been selected to support the development of an Information and Communication Technology (ICT) system in Busan, South Korea. The ICT system is a key component of the South Korean government’s strategy to build data-driven IoT smart cities. Busan is one of the pilot cities for the initiative.
On July 21, chip giant Intel released its second-quarter financial report, showing revenue of USD 19.631 billion and net profit of USD 5.061 billion, the latter down 0.9 percent from the same period last year.
A research team from Microsoft, Zhejiang University, Johns Hopkins University, Georgia Institute of Technology and University of Denver proposes Only-Train-Once (OTO), a one-shot DNN training and pruning framework that produces a slim architecture from a full heavyweight model without fine-tuning, while maintaining high performance.
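OTO's actual machinery (zero-invariant groups trained with a specialized stochastic optimizer) is beyond a short snippet, but plain one-shot structured magnitude pruning, sketched below, shows the basic operation: zeroing whole output channels so they can later be removed without changing the network's function. In a real network the matching normalization parameters would need zeroing too.

```python
# Generic one-shot structured pruning by channel magnitude -- a much simpler
# point of reference than OTO's zero-invariant-group method.
import torch
import torch.nn as nn

@torch.no_grad()
def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5):
    # Score each output channel by the L2 norm of its filter.
    norms = conv.weight.flatten(1).norm(dim=1)
    k = max(1, int(keep_ratio * norms.numel()))
    keep = norms.topk(k).indices
    mask = torch.zeros_like(norms, dtype=torch.bool)
    mask[keep] = True
    conv.weight[~mask] = 0.0
    if conv.bias is not None:
        conv.bias[~mask] = 0.0
    return mask  # True for channels that survive

conv = nn.Conv2d(16, 32, 3, padding=1)
mask = prune_conv_channels(conv, keep_ratio=0.25)
print(int(mask.sum()), "of", mask.numel(), "channels kept")  # 8 of 32
```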
A Google Research team proposes Wordcraft, a text editor with a built-in AI-powered creative writing assistant. Wordcraft uses few-shot learning and the natural affordances of conversation to support a variety of user interactions, and can help with story planning, writing, and editing.
A research team from Taichi Graphics, MIT CSAIL, Zhejiang University, Tsinghua University and Kuaishou Technology introduces a programming language and compiler for quantized simulation that achieves both high performance and significantly reduced memory costs by enabling flexible and aggressive quantization.
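At its core, quantized simulation state means storing real values in a handful of bits as fixed-point numbers and decoding them on use; the paper's compiler automates this (plus bit-packing) across whole simulations. A bare-bones sketch of the underlying encode/decode arithmetic:

```python
# The basic arithmetic behind quantized storage: encode a real value in a
# bounded range as a low-bit unsigned fixed-point integer, decode on use.
def to_fixed(x: float, bits: int, lo: float, hi: float) -> int:
    """Encode x in [lo, hi] as an unsigned fixed-point integer of `bits` bits."""
    levels = (1 << bits) - 1
    x = min(max(x, lo), hi)
    return round((x - lo) / (hi - lo) * levels)

def from_fixed(q: int, bits: int, lo: float, hi: float) -> float:
    """Decode a fixed-point integer back to a float."""
    levels = (1 << bits) - 1
    return lo + q / levels * (hi - lo)

# A velocity in [-10, 10] m/s stored in 10 bits instead of 32:
v = 3.14159
q = to_fixed(v, bits=10, lo=-10.0, hi=10.0)
print(q, from_fixed(q, bits=10, lo=-10.0, hi=10.0))  # 672 3.1378...
```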
On July 7, the University of Texas at Austin announced that it has teamed up with major wireless companies to set up a new 6G research center, 6G@UT. Founding affiliates Samsung, AT&T, NVIDIA, Qualcomm and InterDigital will each fund at least two projects at the center for three years.