
China’s GPT-3? BAAI Introduces Superscale Intelligence Model ‘Wu Dao 1.0’

Since the May 2020 release of OpenAI’s GPT-3, AI researchers have embraced super-large-scale pretraining models. Packing an epoch-making 175 billion parameters, GPT-3 has achieved excellent performance across multiple natural language processing (NLP) tasks. Despite their size and power, however, such models still lack common sense and cognitive abilities, and so struggle with complex reasoning tasks such as open dialogue, knowledge-based Q&A and visual reasoning.

In a bid to promote the research and development of China’s own large-scale pretraining models and further explore universal intelligence from a more fundamental perspective, the Beijing Academy of Artificial Intelligence (BAAI) recently unveiled Wu Dao 1.0, China’s first homegrown super-scale intelligent model system.

The work was led by BAAI Research Academic Vice President and Tsinghua University Professor Tang Jie, with contributions from a team of more than 100 AI scientists from Peking University, Tsinghua University, Renmin University of China, Chinese Academy of Sciences and other institutes.

Wu Dao 1.0 has initiated large-scale research projects via four related models: Wu Dao – Wen Yuan, Wu Dao – Wen Lan, Wu Dao – Wen Hui, and Wu Dao – Wen Su.

Wu Dao – Wen Yuan is China’s largest-ever pretraining language model, boasting best-in-class processing power for mainstream languages, including Chinese and English. It has surpassed average human performance on benchmarks for text categorization, sentiment analysis, natural language inference, reading comprehension and more. The Wu Dao – Wen Yuan project is designed to explore universal natural language understanding (NLU) techniques and to study brain-inspired language models. With 2.6 billion parameters, the model can perform cognitive activities such as memorization, comprehension, retrieval, numerical calculation and multilingual processing, and it has achieved performance comparable to GPT-3 on 20 Chinese NLP tasks such as open-domain question answering, grammar correction and sentiment analysis.
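To make this fine-tuning workflow concrete, here is a minimal sketch using the open-source Hugging Face transformers library. The publicly available bert-base-chinese checkpoint stands in for Wen Yuan, whose weights are not assumed to be downloadable, and the two labeled sentences are toy data for illustration only.

```python
# Minimal sentiment-analysis fine-tuning sketch with Hugging Face transformers.
# "bert-base-chinese" is a stand-in checkpoint, not Wu Dao - Wen Yuan itself.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=2  # binary sentiment: 0 = negative, 1 = positive
)

# Toy labeled examples ("great movie" / "terrible service"); a real run would
# iterate over a full dataset with an optimizer.
texts = ["这部电影很好看", "服务太差了"]
labels = torch.tensor([1, 0])

inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**inputs, labels=labels)
outputs.loss.backward()  # one gradient step of the fine-tuning loop
print(outputs.logits.argmax(dim=-1))  # predicted sentiment labels
```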

Wu Dao – Wen Lan, meanwhile, is the first publicly available Chinese universal image-text multimodal pretraining model. This ultra-large-scale multimodal model aims to break through the theoretical challenges of pretraining on multimodal data that combines images, text and video, and to eventually yield industrial-grade Chinese image-text pretraining models and applications that exceed SOTA performance. The current model has 1 billion parameters and is trained on 50 million image-text pairs collected from open sources. Wu Dao – Wen Lan has reached SOTA performance, scoring 5 percent higher than the champion team on the image captioning task of the Chinese public multimodal test set AIC-ICC, and 20 percent higher than the popular UNITER model on the visual entailment task.
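Models of this kind are commonly pretrained with a two-tower contrastive objective that pulls matched image-text pairs together in a shared embedding space. The PyTorch sketch below shows a generic InfoNCE loss as an illustration of that general technique; it is not BAAI’s published Wen Lan method, and the encoder outputs and temperature value are assumptions.

```python
# Generic two-tower contrastive (InfoNCE) loss for image-text pretraining.
# An illustrative sketch of the technique, not Wen Lan's actual implementation.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor, text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # pairwise cosine similarities
    targets = torch.arange(logits.size(0))           # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy batch: 4 image-text pairs with 256-d embeddings from hypothetical encoders.
loss = contrastive_loss(torch.randn(4, 256), torch.randn(4, 256))
print(loss.item())
```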

Wu Dao – Wen Hui is an ultra-large-scale cognition-oriented pretraining model that targets a series of essential problems in general artificial intelligence from a cognitive perspective, aiming to develop and enhance the logic-, consciousness- and reasoning-based cognitive capabilities of pretraining models. Wu Dao – Wen Hui has reached 11.3 billion parameters and, with simple fine-tuning, can generate poetry, make videos, draw pictures, retrieve text, perform complex reasoning and more. BAAI says the model achieves near-human performance on poetry generation in Turing tests.
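As an illustration of this kind of prompt-driven generation, the sketch below uses the transformers library with the publicly available uer/gpt2-chinese-cluecorpussmall checkpoint as a stand-in; Wen Hui itself is not assumed to be accessible, and the prompt is simply the opening line of a classical Chinese poem.

```python
# Prompt-driven Chinese text generation sketch with a publicly available
# GPT-2 checkpoint standing in for Wu Dao - Wen Hui.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "uer/gpt2-chinese-cluecorpussmall"  # stand-in model, not Wen Hui
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "白日依山尽，"  # "The white sun sets behind the mountains," (Wang Zhihuan)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```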

[Image: Poetry generation by Wu Dao – Wen Hui]
[Image: Drawings by Wu Dao – Wen Hui]

Wu Dao – Wen Su is a large-scale pretraining model for biomolecular structure prediction. It can handle super-long biomolecular sequences, on which it has achieved SOTA performance, interpretability and robustness. Based on Google’s BERT language model, Wu Dao – Wen Su has completed protein training on the 100 GB UniParc database, as well as gene training on 5–100,000 human peripheral blood immune cells (25–30 cell types) and 10,000 drug-resistant bacteria.
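Since Wen Su builds on BERT, its pretraining presumably relies on a masked-token objective over biomolecular sequences. The dependency-free sketch below shows such masking applied to a protein sequence; the 20-letter amino-acid vocabulary and 15 percent masking rate are standard BERT-style assumptions, not Wen Su’s documented settings.

```python
# BERT-style masked-token preparation for a protein sequence.
# Vocabulary and masking rate are standard assumptions, not Wen Su's settings.
import random

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids
MASK = "<mask>"

def mask_sequence(seq: str, rate: float = 0.15):
    """Mask ~15% of residues; return model inputs and recovery targets."""
    tokens, labels = [], []
    for residue in seq:
        if residue in AMINO_ACIDS and random.random() < rate:
            tokens.append(MASK)
            labels.append(residue)  # the model must recover this residue
        else:
            tokens.append(residue)
            labels.append(None)     # unmasked positions are not scored
    return tokens, labels

# Toy fragment of a hypothetical protein sequence.
tokens, labels = mask_sequence("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
print(tokens)
print(labels)
```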

The BAAI research team has also summarized Wu Dao 1.0’s key contributions.

BAAI Research is currently in discussions with Sogou, 360, Alibaba, Zhipu.AI, Xinhua News Agency and others regarding model applications. The team also plans to build API interfaces to support high-concurrency, high-speed inference for enterprise and individual users.


Author: Hecate He | Editor: Michael Sarazen


