On December 8, Peng Cheng Laboratory (PCL) and Baidu held a press conference to announce the release of PCL-BAIDU Wenxin (version Ernie 3.0 Titan), the world’s first knowledge-enhanced 100-billion-scale pretrained language model and the largest Chinese-language monolithic model. PCL-BAIDU Wenxin achieves state-of-the-art results on more than 60 natural language processing (NLP) tasks and significantly advances more than 30 benchmarks in zero-shot and few-shot learning.
PCL-BAIDU Wenxin is built on Peng Cheng Laboratory’s industry-leading Peng Cheng Cloud Brain II computing system and Baidu’s PaddlePaddle deep learning platform, and its 260 billion parameters exceed GPT-3’s 175 billion by roughly 50 percent. PCL-BAIDU Wenxin aims to solve common model bottlenecks such as poor generalization ability, strong reliance on expensive manually labelled data and high application cost, and to simplify the development of artificial intelligence (AI) applications.
At the press conference, PCL Director and Chinese Academy of Engineering academician Wen Gao said the large model is crucial for continuing technological development and innovation and will enable more industries to benefit from the power of AI. To accelerate PCL-BAIDU Wenxin’s industrialization process, the development team has introduced the first large-scale model online distillation technology, which raises the model’s parameter compression rate to as much as 99.98 percent.
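As a rough check on the figures quoted above, the short Python sketch below works through the arithmetic. It assumes GPT-3's published size of 175 billion parameters and, as a simplification not stated by the developers, that the 99.98 percent compression rate applies directly to the raw parameter count. The function names are illustrative only.

```python
# Figures quoted in the article; 175 billion is GPT-3's published size.
WENXIN_PARAMS = 260_000_000_000
GPT3_PARAMS = 175_000_000_000
COMPRESSION_RATE = 0.9998  # "up to 99.98 percent"


def relative_excess(larger: int, smaller: int) -> float:
    """Fraction by which `larger` exceeds `smaller`."""
    return larger / smaller - 1.0


def compressed_size(params: int, compression_rate: float) -> int:
    """Parameters left after compression.

    Assumption: the rate applies uniformly to the parameter count.
    """
    return round(params * (1.0 - compression_rate))


excess = relative_excess(WENXIN_PARAMS, GPT3_PARAMS)
student = compressed_size(WENXIN_PARAMS, COMPRESSION_RATE)

print(f"Excess over GPT-3: {excess:.1%}")
print(f"Parameters after compression: about {student / 1e6:.0f} million")
```

Under these assumptions the excess over GPT-3 comes to about 49 percent, and a model compressed at the maximum quoted rate would keep roughly 52 million parameters, or 0.02 percent of the original.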
Haifeng Wang, Baidu Chief Technology Officer and director of the National Engineering Laboratory for Deep Learning Technology and Applications, also spoke at the press conference, explaining that PCL-BAIDU Wenxin comprises basic, general and large models as well as a rich suite of tools and platforms designed to promote technological innovation and industrial development.
The PCL-BAIDU Wenxin model’s development and key characteristics can be summarized as follows:
- PCL-BAIDU Wenxin is a large-scale pretrained model for natural language understanding and generation that achieves state-of-the-art results on more than 60 tasks, including reading comprehension, text classification and semantic similarity, and advances over 30 few-shot and zero-shot benchmarks.
- The model was developed jointly by Peng Cheng Laboratory and Baidu, using PCL’s self-developed computing system Peng Cheng Cloud Brain II and Baidu’s deep learning platform PaddlePaddle.
- Researchers applied strong model compression technologies to simplify Wenxin for real-world scenarios. The compressed model retains only 0.02 percent of the original size but can achieve comparable performance.
- “The knowledge-enhanced Wenxin model learns from an integration of large-scale knowledge and massive data, improving effectiveness and efficiency while achieving great interpretability,” says Baidu CTO Haifeng Wang.
- PCL-BAIDU Wenxin is the latest addition to the Baidu Wenxin model family following the creation of Wenxin ERNIE 1.0 in March 2019. The Wenxin family encompasses not only general-purpose models but also models tailored for specific areas and tasks, and is supported by a wide range of tools and platforms.
- Wenxin has been widely applied in Baidu’s search, newsfeed, smart speakers and other Internet products. Through the Baidu AI Cloud, Wenxin empowers fields such as manufacturing, energy, finance, telecommunications, media and education.
- In the finance industry, Wenxin and the Baidu ML full-feature AI development platform have been used to build an intelligent analysis model for financial contract terms. The model classifies the different clauses in about one minute, dozens of times faster than manual processing, greatly improving work efficiency. Baidu AI Cloud’s intelligent customer service also uses Wenxin to improve accuracy, and is deployed at numerous locations across China by enterprises including China Unicom and Shanghai Pudong Development Bank.
PCL-BAIDU Wenxin’s creators believe it can solve common large language model problems and lower the real-world deployment threshold to accelerate scientific and technological innovation and advance the large-scale industrial application of AI technologies.
Author: Hecate He | Editor: Michael Sarazen