
‘Neural Body’ Reconstructs Dynamic Human Bodies From Sparse Camera Views


In a new paper, a group of researchers from Zhejiang University, The Chinese University of Hong Kong and Cornell University propose an implicit neural representation method called Neural Body. The novel approach tackles dynamic 3D human-body synthesis from a sparse set of camera views, bettering existing methods on key metrics by significant margins.


Typically, 3D reconstruction requires either a large number of cameras to cover all angles or the use of depth sensors, which makes the process complicated, costly, and constrained in certain environments. The researchers approach the novel view synthesis challenge with sparse multi-view video (at most four camera angles) that captures a moving human body. Since these camera angles remain constant, existing reconstruction-based methods tend to produce undesirable heavy rendering artifacts due to the occlusion of body parts at different temporal states. Meanwhile, view synthesis methods like Google’s NeRF (Neural Radiance Fields), which utilize implicit neural representations, also show degraded performance when input views are sparse.


To address these shortcomings, Neural Body generates implicit 3D representations of a human body in different video frames from the same set of latent codes anchored to the vertices of a deformable mesh. For each frame, the model transforms the code locations based on the human pose, while a network regresses the density and colour for any 3D location based on the structured latent codes. This enables images at any viewpoint to be synthesized via volume rendering.

Neural Body processing
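
The final step described above, volume rendering, can be sketched for a single camera ray. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_field` is a hypothetical stand-in for the network that regresses density and colour (here a hard-coded red ball, not something conditioned on structured latent codes), and `composite_ray`, `render_ray`, and the sampling parameters are likewise assumed names and values.

```python
import math
from typing import Sequence, Tuple

Color = Tuple[float, float, float]
Point = Tuple[float, float, float]


def composite_ray(densities: Sequence[float],
                  colors: Sequence[Color],
                  deltas: Sequence[float]) -> Color:
    """Alpha-composite samples along one ray, ordered front to back."""
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this ray segment
        weight = transmittance * alpha          # contribution of this sample
        for k in range(3):
            pixel[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return (pixel[0], pixel[1], pixel[2])


def toy_field(point: Point) -> Tuple[float, Color]:
    """Hypothetical stand-in for the learned network: a dense red ball."""
    inside = math.dist(point, (0.0, 0.0, 0.0)) < 0.5
    return (50.0 if inside else 0.0), (1.0, 0.0, 0.0)


def render_ray(origin: Point, direction: Point,
               near: float = 0.0, far: float = 2.0,
               n_samples: int = 64) -> Color:
    """Sample the field at evenly spaced depths and composite the result."""
    step = (far - near) / n_samples
    densities, colors = [], []
    for i in range(n_samples):
        t = near + (i + 0.5) * step  # midpoint of the i-th segment
        point = tuple(o + t * d for o, d in zip(origin, direction))
        sigma, rgb = toy_field(point)
        densities.append(sigma)
        colors.append(rgb)
    return composite_ray(densities, colors, [step] * n_samples)
```

Calling `render_ray((0.0, 0.0, -1.0), (0.0, 0.0, 1.0))` sends a ray through the centre of the ball and yields an almost pure red pixel, while a ray that misses the ball returns black. In Neural Body, the field queried at each sample would instead depend on the posed latent codes for that frame.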

To evaluate their approach, the researchers built a multi-view dataset of nine dynamic human videos captured with a system of 21 synchronized cameras. Four uniformly distributed cameras were selected for training, with the rest reserved for testing. On both the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) metrics used to evaluate novel view synthesis, the proposed model trained on all frames achieved the best performance, outperforming both NeRF and Neural Volumes (NV) by margins of at least 6.45 in PSNR and 0.119 in SSIM.

Multi-view dataset evaluations (higher is better), with PSNR (left) and SSIM (right)
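
For readers unfamiliar with PSNR, the sketch below shows the standard definition applied to two images flattened into lists of pixel intensities. The function name `psnr`, the flat-list input format, and the default peak value of 1.0 are assumptions made for illustration. The paper's own evaluation code is not reproduced here.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float],
         rendered: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in decibels between two equal-length images."""
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixel intensities.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a uniform error of 0.1 on intensities in [0, 1] gives about 20 dB. If the 6.45 margin is read as decibels on a single image pair, it would correspond to roughly a 4.4-fold reduction in mean squared error.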

The researchers also tested their model’s 3D reconstruction capability against a learning-based approach, PIFuHD (Pixel-aligned Implicit Function for High-Resolution 3D Human Digitization). The results show that Neural Body generates accurate geometries for humans in complex motions, while PIFuHD fails to recover correct human shapes in complex poses.


The researchers further compared the proposed method’s synthesis and reconstruction abilities on monocular videos against the People-Snapshot method. Neural Body achieved more accurate appearance and geometric details, especially for subjects wearing loose clothing.

Novel view synthesis on monocular videos
3D reconstruction on monocular videos

The paper Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans is on arXiv. The code and dataset will soon be available on the project GitHub.


Analyst: Reina Qi Wan | Editor: Michael Sarazen; Yuan Yuan


