In the intense race to develop autonomous vehicles, LiDAR, the radar-like laser sensing system, has stood out as one of the most crucial hardware components. LiDAR systems generate accurate, computer-friendly point clouds that serve as 3D maps of the world, improving perception and safety for self-driving vehicles. However, the vital task of semantic segmentation of LiDAR point clouds remains challenging for AI researchers.
The scarcity of labelled 3D point clouds has hindered further performance improvements of deep neural networks on semantic segmentation tasks. Although several autonomous driving companies have released datasets, differing LiDAR sensor configurations and other domain discrepancies mean that deep networks trained on one dataset often perform poorly on others. To bridge the domain gap caused by differences in how LiDAR sensors sample 3D point clouds, a team of Google researchers recently proposed a novel “complete and label” domain adaptation approach.
In the paper Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds, the researchers identify a key observation that inspired the design of the novel domain adaptation approach: LiDAR samples have underlying geometric structures, and domain adaptation can be performed more effectively with a 3D model that leverages these structures. The team therefore assumed a physical world composed of 3D surfaces and approached the domain adaptation challenge as a 3D surface completion task.
“If we can recover the underlying complete 3D surfaces from sparse LiDAR point samples, and train networks that operate on the completed surfaces, then we can leverage labeled data from any LiDAR scanner to work on data from any other,” reads the paper. The team designed a Sparse Voxel Completion Network (SVCN) to complete the 3D surfaces of sparse point clouds.
The proposed pipeline consists of two stages: surface completion and semantic labelling. Unlike semantic labels, training pairs for SVCN require no manual labelling, since surface completion can be learned through self-supervision, such as from multi-view observations or synthetic datasets.
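The intuition behind this two-stage design can be illustrated with a toy sketch in plain Python. This is not the paper's learned SVCN: the 1D "surface", the two sampling patterns, and the interpolation-based stand-in for completion are all illustrative assumptions. The point is that two sensors sampling the same underlying surface differently become directly comparable once both sparse scans are completed onto a shared dense representation.

```python
def lidar_sample(f, xs):
    """Simulate one LiDAR scanner: sparse (x, height) samples of surface f."""
    return [(x, f(x)) for x in xs]

def complete(points, grid):
    """Toy stand-in for surface completion: piecewise-linear
    interpolation of sparse samples onto a shared dense grid."""
    out = []
    for gx in grid:
        left = max((p for p in points if p[0] <= gx),
                   key=lambda p: p[0], default=points[0])
        right = min((p for p in points if p[0] >= gx),
                    key=lambda p: p[0], default=points[-1])
        if left[0] == right[0]:
            out.append((gx, left[1]))
        else:
            t = (gx - left[0]) / (right[0] - left[0])
            out.append((gx, left[1] + t * (right[1] - left[1])))
    return out

surface = lambda x: 0.5 * x                        # the "true" surface, flattened to 1D
grid = list(range(13))                             # shared dense representation
scan_a = lidar_sample(surface, range(0, 13, 2))    # sensor A: every 2nd point
scan_b = lidar_sample(surface, range(0, 13, 3))    # sensor B: every 3rd point

dense_a = complete(scan_a, grid)
dense_b = complete(scan_b, grid)
# After completion, both scans recover the same dense surface, so a
# labeller trained on one sensor's data can operate on the other's.
print(max(abs(ya - yb) for (_, ya), (_, yb) in zip(dense_a, dense_b)))  # near zero
```

In the paper this role is played by the learned SVCN operating on sparse voxels rather than by interpolation, but the adaptation principle is the same: map every sensor's sparse sampling onto one canonical completed representation before labelling.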
The team trained the completion network with supervision from complete surfaces reconstructed from multiple LiDAR data frames, using 2,400 complete scene point clouds for training and 200 for testing. Once the 3D surface was recovered, the researchers used a sparse convolutional U-Net to predict a semantic label for each voxel on the completed surface. In 3D computer graphics, a voxel is a unit of graphic information that defines a point on a regular 3D grid, analogous to a pixel in 2D.
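Since the labelling stage operates on voxels, it helps to see how a point cloud maps onto a voxel grid. The snippet below is a minimal sketch; the function name and the 0.1-unit voxel size are our own choices, not from the paper.

```python
import math
from collections import defaultdict

def voxelize(points, voxel_size=0.1):
    """Group 3D points into voxels by flooring their coordinates
    onto a regular grid of the given cell size."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        voxels[key].append((x, y, z))
    return voxels

points = [(0.02, 0.03, 0.01), (0.04, 0.06, 0.02), (0.95, 0.11, 0.30)]
grid = voxelize(points, voxel_size=0.1)
print(len(grid))  # → 2: the first two points fall into the same voxel
```

A sparse convolutional network then stores and processes only the occupied voxel keys, which is what makes it efficient on LiDAR data, where the vast majority of the 3D grid is empty.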
The team evaluated the effectiveness of the new domain adaptation approach through experiments across different autonomous driving datasets, where it outperformed previous domain adaptation methods by 8.2 to 36.6 percent. For example, a network trained on the Waymo Open Dataset and evaluated on semantic segmentation for the nuScenes dataset gained 10.4 percent in mIoU (mean IoU) with the proposed approach. Intersection over Union (IoU) is one of the most commonly used metrics in semantic segmentation.
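For readers unfamiliar with the metric, the IoU for a class is the number of points where prediction and ground truth agree on that class, divided by the number of points where either assigns it; mIoU averages this over classes. A minimal sketch with toy labels (not the paper's evaluation code):

```python
def per_class_iou(pred, gt, num_classes):
    """IoU per class: |intersection| / |union| over per-point labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        ious.append(inter / union if union else float('nan'))
    return ious

def mean_iou(pred, gt, num_classes):
    """Mean IoU over all classes that appear in pred or gt (NaNs dropped)."""
    ious = [i for i in per_class_iou(pred, gt, num_classes) if i == i]
    return sum(ious) / len(ious)

pred = [0, 0, 1, 1, 2]  # predicted class per point
gt   = [0, 1, 1, 1, 2]  # ground-truth class per point
print(round(mean_iou(pred, gt, 3), 4))  # → 0.7222
```

Here class 0 scores 1/2, class 1 scores 2/3, and class 2 scores 1/1, so the mean is about 0.72; the 10.4 percent figure above is an absolute gain on this kind of score.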
The proposed domain adaptation scheme targets the domain gap in 3D point clouds with LiDAR sensors. Its ability to improve semantic segmentation shows high potential for applications such as autonomous driving, semantic mapping, and construction site monitoring.
The paper Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds is on arXiv.
Journalist: Fangyu Cai | Editor: Michael Sarazen