
China’s GPT-3? BAAI Introduces Superscale Intelligence Model ‘Wu Dao 1.0’

The Beijing Academy of Artificial Intelligence (BAAI) releases Wu Dao 1.0, China’s first large-scale pretraining model.

Since the May 2020 release of OpenAI’s GPT-3, AI researchers have embraced super-large-scale pretraining models. Packing an epoch-making 175 billion parameters, GPT-3 has achieved excellent performance across multiple natural language processing (NLP) tasks. Despite their size and power, however, such models still lack common sense and cognitive abilities, and so struggle with complex reasoning tasks such as open dialogue, knowledge-based Q&A and visual reasoning.

In a bid to promote the research and development of China’s own large-scale pretraining models and further explore universal intelligence from a more fundamental perspective, the Beijing Academy of Artificial Intelligence (BAAI) recently unveiled Wu Dao 1.0, China’s first homegrown super-scale intelligent model system.

The work was led by BAAI Research Academic Vice President and Tsinghua University Professor Tang Jie, with contributions from a team of more than 100 AI scientists from Peking University, Tsinghua University, Renmin University of China, Chinese Academy of Sciences and other institutes.

Wu Dao 1.0 has initiated large-scale research projects via four related models: Wu Dao – Wen Yuan, Wu Dao – Wen Lan, Wu Dao – Wen Hui, and Wu Dao – Wen Su.

Wu Dao – Wen Yuan is China’s largest-ever pretraining language model, designed to explore universal natural language understanding (NLU) techniques and study brain-inspired language models. With 2.6 billion parameters, it processes mainstream languages including Chinese and English, and can perform cognitive activities such as memorization, comprehension, retrieval, numerical calculation and multi-language processing. The model has surpassed average human performance benchmarks on text categorization, sentiment analysis, natural language inference, reading comprehension and more, and has achieved GPT-3-comparable performance on 20 Chinese NLP tasks such as open-domain question answering, grammar correction and sentiment analysis.

Wu Dao – Wen Lan, meanwhile, is the first publicly available Chinese universal multimodal pretraining model. The ultra-large-scale model aims to break through the theoretical challenges of pretraining on multimodal data combining images, text and video, and eventually to deliver industrial-grade Chinese multimodal pretraining models and applications that exceed state-of-the-art (SOTA) performance. The current model has 1 billion parameters and is trained on 50 million image-text pairs collected from open sources. Wu Dao – Wen Lan has reached SOTA performance, scoring 5 percent higher than the champion team on the image captioning task of the Chinese public multimodal benchmark AIC-ICC, and 20 percent higher than the popular UNITER model on the visual entailment task.
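The article does not publish Wen Lan’s actual training objective. As a rough illustration only, image-text multimodal pretraining of this kind typically uses a contrastive (InfoNCE-style) loss that pulls each image embedding toward its paired text embedding while pushing it away from the other texts serving as negatives. The vectors, temperature value and pure-Python setup below are illustrative assumptions, not Wen Lan’s implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """InfoNCE-style loss: for each image, its paired text is the positive
    sample and every other text in the batch acts as a negative sample."""
    n = len(image_embs)
    total = 0.0
    for i in range(n):
        # Similarities between image i and every text in the batch.
        sims = [cosine(image_embs[i], text_embs[j]) / temperature
                for j in range(n)]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(sims)
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        # Cross-entropy term pushing image i toward its true text pair.
        total += -(sims[i] - log_denom)
    return total / n
```

Matched image-text pairs yield a lower loss than mismatched ones; approaches like the one Wen Lan describes reportedly go further by enlarging the pool of (especially hard-to-distinguish) negatives beyond what a single batch provides.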

Wu Dao – Wen Hui is an ultra-large-scale cognition-oriented pretraining model that tackles a series of essential problems in general artificial intelligence from a cognitive perspective, aiming to develop and enhance the logic-, consciousness- and reasoning-based cognitive capabilities of pretraining models. Wu Dao – Wen Hui has reached 11.3 billion parameters, and through simple fine-tuning it can generate poetry, make videos, draw pictures, retrieve text, perform complex reasoning and more. BAAI says the model achieves near-human performance on poetry generation in Turing tests.

[Image] Poetry generation by Wu Dao – Wen Hui

[Image] Drawings by Wu Dao – Wen Hui

Wu Dao – Wen Su is a large-scale training model for biomolecular structure prediction. It can handle super-long biomolecular structures, where it has achieved SOTA performance, interpretability and robustness. Based on Google’s BERT language model, Wu Dao – Wen Su has completed protein training on the 100 GB UNIPARC database, and gene training on 5-100,000 human peripheral blood immune cells (25-30 cell types) and 10,000 drug-resistant bacteria.

The BAAI research team summarizes some of Wu Dao 1.0’s key contributions:

  • Wu Dao – Wen Yuan introduces the open-source Chinese pretraining model (CPM). Built on CPM, the CPM-Distill model reduces language model perplexity by 38 percent and achieves better results on downstream tasks.
  • Wu Dao – Wen Lan is the first Chinese generic multimodal pretraining model that can understand “connotative information” based on weak correlations between images and text. Wen Lan uses an advanced cross-modal contrastive learning algorithm: given an image-text pair, it can enlarge the number of negative samples for each modality, especially those that are difficult to distinguish, further improving the expressive ability of the neural networks. Its image and text encoders can easily be swapped for the most advanced single-modality pretraining models, and it achieves performance 20 times faster than the UNITER model.
  • Wu Dao – Wen Hui proposes a new pretraining paradigm, Generative Language Model (GLM), breaking the bottlenecks of BERT and GPT. For the first time, a single model has achieved the best results on both language understanding and generation tasks, surpassing common pretraining models such as BERT, RoBERTa and T5 trained on the same volume of data. Wen Hui’s continuous-vector-based fine-tuning method, P-tuning, makes it the first autoregressive model to surpass autoencoding models on NLU tasks, achieving SOTA results on more than 10 tasks such as knowledge extraction and SuperGLUE few-shot learning, with over 20 percent performance improvement. Wen Hui’s inverse prompting algorithm achieves close-to-human performance on Q&A and poetry generation tasks, and the model is the first that can generate classical Chinese poetry based on modern themes.
  • Wu Dao – Wen Su’s open-sourced FastMoE is the first high-performance MoE (Mixture of Experts) system that supports the PyTorch framework and a variety of hardware. Only one line of code is required to complete the MoE transformation, and model training speed is increased by 47 times compared with a traditional PyTorch implementation.
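FastMoE’s actual one-line API is not reproduced here. Conceptually, a mixture-of-experts layer replaces a single feed-forward block with many “expert” blocks plus a gate that routes each input to the top-k highest-scoring experts. The plain-Python sketch below illustrates only that routing logic; the expert functions, gate weights and `top_k` value are all illustrative assumptions, not FastMoE’s code:

```python
import math

def moe_forward(x, experts, gate_weights, top_k=2):
    """Minimal mixture-of-experts forward pass: score every expert with a
    linear gate, keep the top-k experts, and combine their outputs weighted
    by softmax-renormalized gate scores."""
    # Gate scores: dot product of the input with each expert's gate vector.
    scores = [sum(xi * wi for xi, wi in zip(x, w)) for w in gate_weights]
    # Indices of the top-k scoring experts.
    top = sorted(range(len(experts)), key=lambda i: scores[i],
                 reverse=True)[:top_k]
    # Softmax over only the selected experts' scores.
    m = max(scores[i] for i in top)
    exp_scores = {i: math.exp(scores[i] - m) for i in top}
    z = sum(exp_scores.values())
    # Weighted combination of the selected experts' outputs.
    out = [0.0] * len(x)
    for i in top:
        weight = exp_scores[i] / z
        out = [o + weight * yi for o, yi in zip(out, experts[i](x))]
    return out
```

Because only the top-k experts run per input, total parameter count can grow with the number of experts while per-example compute stays roughly constant, which is what makes super-scale MoE training tractable.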

BAAI Research is currently in discussions with Sogou, 360, Alibaba, Zhipu.AI, Xinhua News Agency and others on model applications. The team also plans to build API interfaces to support high-concurrency, high-speed inference for enterprise and individual users.


Author: Hecate He | Editor: Michael Sarazen


