China Industry Interview

Tencent Cloud’s Huang Ming on its New DI-X Deep Learning Platform

On March 28, Tencent Cloud announced the launch of its machine learning platform DI-X (short for Data Intelligence X), with the goal of providing a one-stop shop for its machine learning and deep learning customers by lowering their barrier to entry and streamlining their AI development. Built on Tencent Cloud’s compute and storage resources, DI-X has mainstream deep learning frameworks such as Caffe, TensorFlow, and Torch built in. Combined with a library of open-source and Tencent-proprietary algorithms and models, DI-X promises a drag-and-drop user experience. For Tencent, DI-X plays a crucial role in its long-term AI strategy, and acts as the bugle call for Tencent Cloud’s acceleration in AI.

Artificial intelligence platforms have become the next battleground for Chinese tech giants. On March 29, at Alibaba Cloud’s Computing Conference, Alibaba Cloud also announced the launch of its updated machine learning platform, PAI 2.0. So what can DI-X provide its customers? What is its strategic importance? How can Tencent Cloud come out on top in this fierce competition, on both the business front and the technical front? Synced reached out to Huang Ming, a Tencent T4 expert and the technical lead of Tencent Cloud’s machine learning platform, to find out.


 

Synced: Deep learning platforms are no longer a new concept. Several companies in the West have attempted to develop deep learning platforms, with some already seeing results. Can you please explain why Tencent Cloud decided to launch the DI-X platform?

Huang: Like many other excellent products and capabilities of Tencent Cloud, DI-X started as an internal tool and was later opened to external customers. With the advancement of machine learning and AI, Tencent needed an internal platform to support the needs of our algorithm engineers and data scientists. As a result, DI-X was born. DI-X quickly became our primary machine learning platform, with tens of thousands of machine learning tasks running different algorithms and training different models on it daily. After the system had matured over a year of running in production, and in response to external demand, Tencent Cloud decided to open its capabilities to the public.

Synced: Tencent Cloud has always worked towards lowering the barrier to entry of Cloud AI, with its Big Data Service platform, and last year’s Cloud Image service. What’s the strategic goal for DI-X?

Huang: The launch of DI-X completes Tencent Cloud’s product offerings in AI: from IaaS, to AI platform service, to AI infrastructure service, to AI application service, to vertical industry solutions, we now have complete product and service coverage. Ma Huateng (Pony Ma) once said: “AI, IoT, and the autonomous vehicles and robots of the future must all have a brain in the cloud behind them.” DI-X will become the primary support for companies seeking to build their “brain in the cloud”. It also acts as the bugle call for Tencent Cloud’s acceleration in AI. Our small and medium-sized customers can use DI-X directly to hop onto the fast lane of AI.

[Figure: Tencent Cloud’s AI product and service offering matrix in March 2017]

Synced: Can you please provide an overview of DI-X’s development team? What was the development process like? On a technical level, what is the difficulty in developing a deep learning cloud platform? How was it dealt with?

Huang: The DI-X team was a combination of Tencent Cloud’s data product team and the machine learning team from Tencent’s data platform group. It was by no means a large team, with just over a dozen people. We firmly believe that small teams are better suited to today’s fast-paced development cycles. This product is still very young, and it needs to grow at a rapid pace. The project was first tested internally within Tencent for over a year, going through 3 major versions and dozens of minor updates, before its customer base and satisfaction level were good enough for it to become a cloud product. With it, we wish to provide more small and medium-sized companies with Tencent’s AI capabilities.

This platform started off as a service for internal customers, which meant it had lots of unusual customer requirements. Meeting these requirements in a way that made sense for the platform required deep understanding and careful control, and it was the biggest challenge for this product.

Another challenge we faced was machine learning itself. The incredible pace of development in AI and deep learning meant new functions and requirements kept appearing, and new platforms from which we could borrow ideas kept coming up as well. We needed to quickly catch up with the forerunners and keep pace with the latecomers. This placed a huge demand on the team’s ability to push out iteration after iteration of updates in quick succession.

Synced: One question on every user’s mind is: what deep learning frameworks does DI-X currently support? What is the platform’s compatibility like?

Huang: The first version of DI-X supports three deep learning frameworks: TensorFlow, Caffe, and Torch. In terms of compatibility, we will always run the most up-to-date open source versions of these frameworks, with our main modifications focused on seamless connectivity to Tencent Cloud’s Cloud Object Storage (COS). This allows the image, audio, and video data on Tencent Cloud to act as data inputs, so our customers can unlock the potential in their data by training algorithms and generating models directly on the cloud. Note that the first version will not support multi-machine, multi-GPU parallel execution. We expect this feature to be released in the third version.

Synced: Can you please highlight some of the key features in Tencent Cloud’s DI-X platform? In this competitive market, what are DI-X’s advantages? What features can it realize? What problems can it solve?

Huang: The DI-X platform currently has the following product features:

  1. Deep learning support: currently supporting TensorFlow, Caffe, and Torch, with more framework support and optimizations coming in the future.
  2. Ease of use: a drag-and-drop visual task design interface with 5 types of building blocks: input, package, algorithms, models, and output. By combining them flexibly, one can build complex machine learning tasks without writing a single line of code.
  3. Flexibility: users can either use the preset machine learning algorithms, or opt to use their own in various deep learning packages.
  4. Integration: seamless connectivity to Tencent Cloud’s Cloud Object Storage (COS) and compute resources (GPU compute platform). Today’s cloud customers can be easily integrated into this closed-loop ecosystem.
  5. End-to-end process: with model training, forecasting, and application all in one, along with public datasets and industry models, we can quickly help customers unlock the value in their data.

We hope our customers can experience deep learning model training, optimization, application, and forecasting all within this platform, and feel the power of a true one-stop solution.
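
As a rough illustration of how the building-block types listed above might compose into a task flow, here is a minimal Python sketch. It is not DI-X's implementation or API, which the interview does not describe: the `TaskFlow` class, its methods, and the block names are all hypothetical, and the sketch only models blocks as nodes in a directed graph and derives a valid execution order from the connections.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# The five block types named in the interview (singular spellings are our choice).
BLOCK_TYPES = {"input", "package", "algorithm", "model", "output"}


@dataclass
class TaskFlow:
    """Hypothetical in-memory model of a drag-and-drop task flow."""

    blocks: dict = field(default_factory=dict)    # block name -> block type
    upstream: dict = field(default_factory=dict)  # block name -> set of upstream names

    def add_block(self, name, block_type):
        """Place a block on the canvas; reject types outside the five known ones."""
        if block_type not in BLOCK_TYPES:
            raise ValueError(f"unknown block type: {block_type!r}")
        self.blocks[name] = block_type
        self.upstream.setdefault(name, set())

    def connect(self, source, target):
        """Draw an edge meaning `target` runs after `source`."""
        for name in (source, target):
            if name not in self.blocks:
                raise KeyError(f"unknown block: {name!r}")
        self.upstream[target].add(source)

    def run_order(self):
        """Return block names in an order that respects every edge."""
        return list(TopologicalSorter(self.upstream).static_order())


# Illustrative flow: data in object storage -> training -> model -> predictions.
flow = TaskFlow()
flow.add_block("cos_images", "input")
flow.add_block("train_cnn", "algorithm")
flow.add_block("trained_cnn", "model")
flow.add_block("predictions", "output")
flow.connect("cos_images", "train_cnn")
flow.connect("train_cnn", "trained_cnn")
flow.connect("trained_cnn", "predictions")

print(flow.run_order())
```

Representing the flow as a graph rather than a fixed list is one plausible design choice: it lets a visual editor accept blocks in any order and still reject cycles, since `TopologicalSorter` raises `CycleError` when the connections loop back.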

Synced: Who are DI-X’s target customers? How can it satisfy the needs of customers in different industries?

Huang: DI-X’s target customers are those with some basic knowledge of deep learning, as the platform does have a learning curve. These customers may have lots of unstructured data (images, audio, video) already stored in cloud object storage, and want to do some research or work in AI on this valuable data. Now they can purchase Tencent Cloud’s GPU compute platform and then, through DI-X, quickly start running deep learning algorithms, completely bypassing the cumbersome configuration process in the middle.

Also, DI-X has common deep learning algorithms, public datasets, and industry models already built in, allowing our customers to simply drag them out and run them with minimal configuration. This helps customers quickly verify an interesting idea. All of this is free across all domains, and I believe it will satisfy most of our customers’ needs.

Synced: Can you please explain what a customer might experience while using DI-X that is different from traditional cloud platforms?

Huang: DI-X is a relatively new platform, so it has learned from the strengths of many existing platforms. Some of its features come from previous products, like the drag-and-drop visual design interface, which is mostly the same. But some of its features are either exclusive to DI-X or are done much better on DI-X. For example:

1. Flexibility
All framework packages support user uploaded scripts and model network architectures. We don’t overly restrict users. We also promote users having more control during the task stream design process. Of course, we also support data streams.

2. Multiple instance processing
Every task stream can run as multiple instances, although there is a limit on the number of instances. Instance initialization supports multiple scheduling methods, such as periodic scheduling and timed scheduling. Every instance also has a snapshot page, allowing the user to monitor each instance’s state and results.

3. Automatic parameter tuning
DI-X supports tuning multiple parameters at once (currently at most 5). Each parameter is given an initial value, a step size, and a final value, and the resulting combinations dynamically replace the parameters in the input and model network. This allows multiple instances to run automatically and presents the end results for multiple models, giving the user an easy way to compare different settings.

4. Training and use of models
In terms of algorithms and models, DI-X has a “little tail” design. Unlike other platforms out there, we provide better usability and scalability for all deep learning algorithms and models, thereby providing better forecasting for models.

The four features above have all been polished through months of internal testing, with a lot of attention paid to details. We wish to provide our external customers a brand new experience.
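
The automatic parameter tuning in item 3 reads like a grid sweep: each parameter walks from an initial value to a final value in fixed steps, and every combination becomes one instance. The Python sketch below shows that idea under that assumption. The function names, the dictionary layout, and the example parameters (`learning_rate`, `batch_size`) are hypothetical and not taken from DI-X; only the limit of 5 parameters comes from the interview.

```python
import math
from itertools import product

MAX_SWEPT_PARAMS = 5  # the interview states at most 5 parameters can be tuned


def value_range(initial, step, final):
    """List initial, initial + step, ... without going past final."""
    if step <= 0:
        raise ValueError("step must be positive")
    if final < initial:
        raise ValueError("final must not be smaller than initial")
    # A small epsilon guards against float error such as (0.3 - 0.1) / 0.1 < 2.
    count = math.floor((final - initial) / step + 1e-9) + 1
    # Rounding keeps values like 0.30000000000000004 readable as 0.3.
    return [round(initial + i * step, 10) for i in range(count)]


def expand_sweep(sweeps):
    """Turn {name: (initial, step, final)} into one config dict per combination."""
    if len(sweeps) > MAX_SWEPT_PARAMS:
        raise ValueError(f"at most {MAX_SWEPT_PARAMS} parameters can be swept")
    names = list(sweeps)
    ranges = [value_range(*sweeps[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*ranges)]


# Hypothetical sweep: 3 learning rates x 4 batch sizes.
configs = expand_sweep({
    "learning_rate": (0.1, 0.1, 0.3),
    "batch_size": (16, 16, 64),
})
for config in configs:
    print(config)  # each config would correspond to one training instance
```

Because the number of combinations is the product of the individual range lengths, a cap on the number of swept parameters (and on instances, as item 2 mentions) keeps the sweep from growing out of hand.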

Synced: The security of cloud platforms has always been a focal point for customers. What did DI-X do to protect the security of its data and code?

Huang: DI-X is based on a modified implementation of Docker, so each user’s processes and access permissions are well isolated from other users’. Access to cloud object storage is independent as well, so there is no risk of code or data leaks.

Synced: Deep learning, being the mainstream AI algorithm over the past few years, is a major focus for domestic and international tech giants. Tencent once launched a deep learning platform in 2014 – Mariana, supporting DNN GPU data parallelism, CNN GPU data parallelism and model parallelism, and DNN GPU cluster. What is the relationship between DI-X and Mariana? What is DI-X’s strategic importance to Tencent’s deep learning strategy? What is Tencent’s future plan in this field?

Huang: Mariana is a deep learning package in the internal DI-X platform. We will integrate it into the open source Angel framework announced earlier, and will launch them together. DI-X plays an important role in Tencent’s deep learning strategy, by giving small to medium sized customers the freedom to deploy their deep learning algorithms and models on Tencent Cloud. In the future, Tencent Cloud might open up more algorithms and models, providing an overall AI service.

Synced: What are some areas where DI-X needs optimization and improvement? How do you plan on achieving this in the future?

Huang: DI-X is a new platform, and deep learning and AI are both developing rapidly. There are lots of things to be improved, like visual customization of the model network architecture. In the future, we will rapidly push out updates and iterations to satisfy the needs of our customers.

Synced: On January 20, Tencent, seeing the potential of FPGAs for deep learning, launched the first FPGA-based cloud server in China, accelerating cloud computing in various applications. This was met with industry-wide acclaim. What results and responses have customers reported so far from using FPGA cloud servers?

Huang: The customer response has been fantastic in several areas. First, low set-up cost: customers can purchase FPGA cloud servers as needed, instead of spending lots of capital purchasing physical servers. Second, short set-up time: the set-up time has been reduced from years or months to mere days. Lastly, low operating cost: as Tencent Cloud is responsible for operating the servers, we’ve reduced the human and capital resources our customers need to maintain FPGA servers.

Synced: Tencent Cloud’s enterprise customers can pay for FPGA as needed. How much would a typical customer pay in a year? Compared to the past, is there a cost advantage?

Huang: With an FPGA cloud server, enterprises can program the FPGA hardware directly, raising performance to a level 30 times greater than that of a typical CPU server while paying only roughly 40% of the cost of a CPU server. Tencent Cloud is the first cloud service provider to offer FPGA compute service in China.
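
To put those two figures side by side, the short Python sketch below computes the implied performance per unit of cost. It assumes that both numbers compare one FPGA server against one CPU server on the same workload, which the interview does not spell out, so the result is a back-of-the-envelope reading rather than a benchmark. The function name is ours.

```python
def performance_per_cost(speedup, relative_cost):
    """Performance gained per unit of spend, relative to the CPU baseline.

    speedup: FPGA performance as a multiple of the CPU server (30 in the interview).
    relative_cost: FPGA cost as a fraction of the CPU server cost (roughly 0.4).
    """
    if relative_cost <= 0:
        raise ValueError("relative_cost must be positive")
    return speedup / relative_cost


ratio = performance_per_cost(speedup=30, relative_cost=0.4)
print(f"about {ratio:.0f}x the performance per unit of cost")
```

Under that reading, the quoted figures imply roughly 75 times the performance per unit of cost; actual savings would depend on how well a given workload maps onto FPGA hardware.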

Synced: How have trial applications for FPGA servers gone?

Huang: Within several hours of the announcement, Tencent Cloud had already received hundreds of trial applications from developers, as well as numerous inquiries from large customers. Currently, we have a large number of customers, spanning universities, research institutes, computational genomics, financial analysis, and other industries.

Synced: Tencent Cloud shortened the FPGA set-up time from months to minutes. Can you share what technical challenges had to be overcome to achieve this breakthrough?

Huang: Tencent Cloud solved 3 main technical challenges in FPGA:

  1. We reduced the number of FPGA device types to just a few, decreasing the development work customers need when porting.
  2. FPGA development can be split into a platform portion and an application portion. Tencent Cloud provides a general, robust platform portion, including hardware logic such as PCIe, DMA, and DDR access, as well as drivers and software APIs. Users just need to focus on implementing their application, which reduces the work and time required to configure the platform.
  3. Users can package environments already configured on Tencent Cloud, and leverage Tencent Cloud to achieve a one-step set-up.

 


Original article from Synced China: http://www.jiqizhixin.com/article/2580 | Author: Jingyi Gao | Localized by Synced Global Team: Xiang Chen
