
Meet ByteDance AI’s Xiaomingbot: World’s First Multilingual and Multimodal AI News Agent

Researchers from ByteDance AILab and Shanghai Jiao Tong University have introduced Xiaomingbot, a multilingual and multimodal news reporter.

Steady improvements in natural language generation in recent years have enabled bots that can perform automatic news reporting. This has practical applications, for example, in minor league sports, where result data is available but it is not always cost-efficient to send human reporters to the contests. Most existing robot reporters, however, focus exclusively on text generation. In a bid to develop more versatile and user-friendly intelligent robot reporters, researchers from ByteDance AILab and Shanghai Jiao Tong University have introduced Xiaomingbot, a multilingual and multimodal news reporter that is able to:

  • Create news articles from input data such as scoring stats or box scores
  • Read these articles with the lifelike animation of a typical TV anchor
  • Deliver the news in multiple languages to serve global users

Xiaomingbot contains four components: a news generator, a news translator, a cross-lingual newsreader and an animated avatar. Its input is a data table containing game and event records, and its output is an animated avatar reading a news article with a synthesized voice. The system targets news delivery in areas such as sports and financial reporting.

Xiaomingbot system architecture
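
The data flow through the four components can be sketched roughly as follows. This is only an illustrative sketch, not ByteDance's implementation: the names `Broadcast`, `run_pipeline` and the `demo_*` stages are hypothetical placeholders standing in for the real generator, translator, newsreader and avatar.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]  # one row of the input data table (assumed shape)


@dataclass
class Broadcast:
    script: str        # article text in the target language
    audio: bytes       # stand-in for synthesized speech
    frames: List[str]  # stand-in for rendered avatar frames


def run_pipeline(
    records: List[Record],
    language: str,
    generate: Callable[[List[Record]], str],
    translate: Callable[[str, str], str],
    read_aloud: Callable[[str, str], bytes],
    animate: Callable[[bytes], List[str]],
) -> Broadcast:
    """Chain the four stages: generator -> translator -> newsreader -> avatar."""
    article = generate(records)
    script = translate(article, language)
    audio = read_aloud(script, language)
    frames = animate(audio)
    return Broadcast(script, audio, frames)


# Hypothetical placeholder stages; real ones would call trained models.
def demo_generate(records: List[Record]) -> str:
    return " ".join(f"{r['team']} scored {r['points']}." for r in records)


def demo_translate(article: str, language: str) -> str:
    return f"[{language}] {article}"  # tags the text instead of translating


def demo_read_aloud(script: str, language: str) -> bytes:
    return script.encode("utf-8")  # bytes stand in for a waveform


def demo_animate(audio: bytes) -> List[str]:
    return [f"frame-{i}" for i in range(3)]  # fixed number of dummy frames


result = run_pipeline(
    [{"team": "Reds", "points": "3"}],
    "en",
    demo_generate,
    demo_translate,
    demo_read_aloud,
    demo_animate,
)
print(result.script)
```

Passing each stage in as a function keeps the sketch independent of any particular model, so a stage can be swapped without touching the others.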

To ensure the system generates reasonable text, the news is written using template-based table2text technology. A text summarization module then filters and condenses the generated news into an abstract containing the most important and pertinent information. Next, a state-of-the-art large Transformer model serves as the machine translation module, translating the abstract into the user-specified language. A text-to-speech (TTS) module with multilingual capability enables Xiaomingbot to read the final script in different languages using the same voice. Finally, an animated anchor avatar with synchronized lip motion and facial expressions, along with a realistic body and clothing, makes the news broadcast more natural and viewer-friendly.
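
As a rough illustration of what template-based table-to-text generation means, the sketch below fills sentence templates from event records. The templates, the field names (`event`, `player`, `team` and so on) and the function `render_article` are assumptions made for this example; the article does not describe Xiaomingbot's actual templates.

```python
from typing import Dict, List

# Hypothetical sentence templates keyed by event type.
TEMPLATES: Dict[str, str] = {
    "goal": "{player} scored for {team} in minute {minute}.",
    "final": "{home} beat {away} {home_score}-{away_score}.",
}


def render_article(records: List[Dict[str, str]]) -> str:
    """Turn table rows into sentences, skipping events without a template."""
    sentences = []
    for record in records:
        template = TEMPLATES.get(record["event"])
        if template is None:
            continue  # no template for this event type
        # str.format ignores unused keyword arguments such as "event".
        sentences.append(template.format(**record))
    return " ".join(sentences)


sample_records = [
    {"event": "goal", "player": "Lee", "team": "Reds", "minute": "12"},
    {"event": "substitution", "player": "Kim", "team": "Blues"},
    {"event": "final", "home": "Reds", "away": "Blues",
     "home_score": "2", "away_score": "1"},
]
print(render_article(sample_records))
```

Templates trade variety for reliability: every sentence is grounded in a table row, which is one plausible reason to prefer them for factual reporting.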

User Interface of Xiaomingbot. Here, sports results are used as input and corresponding speech and visual effects are output.
Avatar animation synthesis

Xiaomingbot is the first fully functional robot reporter capable of writing, speaking and dynamic expression to be deployed online, where it actively serves users. It has already written and delivered over 600,000 news articles and gained over 150,000 followers across various social media platforms.

The paper Xiaomingbot: A Multilingual Robot News Reporter is on arXiv.


Author: Hecate He | Editor: Michael Sarazen & Yuan Yuan

