Tencent AI Lab has announced that it will open source its multi-label image dataset ML-Images and deep residual network ResNet-101 by the end of September. ML-Images contains 18 million images and more than 11,000 common object categories, while the ResNet-101 model is reported to have reached the highest accuracy in the industry.
Global tech giants are placing increasing emphasis on their AI architecture, and have built large internal image datasets such as Google’s JFT-300M and Facebook’s Instagram dataset. However, these datasets and the models trained on them are proprietary, out of reach of general scientific research institutions and SMEs.
ML-Images will be the largest open source multi-label image dataset in the industry, capable of meeting the needs of general scientific research institutions and SMEs alike, and may become a new standard for researchers in the field of computer vision. Currently, the largest available multi-label image dataset is Google’s Open Images, which includes 9 million training images and more than 6,000 object categories.
Tencent AI Lab revealed additional details:
- Dataset architecture: The Tencent AI Lab team combined various information sources including images, category semantic segmentation, and image annotations to build the ML-Images dataset.
- Training method: The team designed its loss function and training methods to suppress the negative impact that category imbalance in large-scale multi-label datasets has on model training.
- ResNet-101: Trained on ML-Images, ResNet-101 has excellent visual representation and generalization performance. Through transfer learning, the model has achieved 80.73% top-1 classification accuracy on the ImageNet validation set, exceeding the accuracy of Google’s model. ResNet-101 will provide robust support for visual tasks in image and video processing, and improve the technical level of image classification, object detection, object tracking, and semantic segmentation.
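To illustrate the category imbalance problem mentioned under the training method, here is a minimal Python sketch of one common remedy: a binary cross-entropy loss in which positive labels of rare categories are weighted more heavily. This is not Tencent's loss function, whose details the announcement does not give. The function names `positive_weights` and `weighted_bce`, the negatives-to-positives weighting heuristic, and the toy labels are all assumptions made for illustration.

```python
import math

def positive_weights(label_matrix):
    """Per-category weight = (#negatives / #positives), an assumed heuristic."""
    n_samples = len(label_matrix)
    n_classes = len(label_matrix[0])
    weights = []
    for c in range(n_classes):
        positives = sum(row[c] for row in label_matrix)
        negatives = n_samples - positives
        # Fall back to a neutral weight if a category has no positive example.
        weights.append(negatives / positives if positives > 0 else 1.0)
    return weights

def weighted_bce(probs, labels, pos_weights, eps=1e-7):
    """Mean weighted binary cross-entropy over the categories of one image."""
    total = 0.0
    for p, y, w in zip(probs, labels, pos_weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

# Toy multi-label data (hypothetical): category 0 is common, category 1 is rare.
labels = [
    [1, 0],
    [1, 0],
    [1, 1],
    [0, 0],
]
weights = positive_weights(labels)

# A prediction that misses the rare category on an image that contains it.
prediction = [0.9, 0.1]
target = [1, 1]
plain_loss = weighted_bce(prediction, target, [1.0, 1.0])
balanced_loss = weighted_bce(prediction, target, weights)
print(weights, plain_loss, balanced_loss)
```

With these weights, missing the rare category costs more than it would under an unweighted loss, which pushes training to pay attention to categories that appear in few images.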
Breakthroughs in deep learning networks have shown that the technology can be applied across many fields, especially computer vision, where it excels at essential tasks such as classifying, interpreting and generating images and videos. However, realizing the full visual potential of deep learning requires high-quality training data, a strong model structure and training methods, and powerful computing resources.
Tencent’s announcement is a major step forward in making high-quality training data more accessible.
The ML-Images dataset will be published on Tencent’s GitHub page.
Journalist: Fangyu Cai | Editor: Michael Sarazen