
New Google & Oxford Model Time-Shifts People in Videos

“Timing,” it is often said, “is everything.” Our perception of an event can change dramatically depending on the timing of the human actions therein. In video, even the basic YouTube player can easily speed up or slow down a scene. But what if it were possible to temporally manipulate the individual characters in a scene, speeding them up or slowing them down independently of the rest of the action?

A group of researchers from Google Research and the University of Oxford have introduced a novel technique that does just that, by “retiming” people’s movements in videos.

The proposed method works with ordinary videos and supports various retiming effects, such as aligning the motions of different people in a scene (for example, bringing an off-beat dancer into sync with the rest of their troupe), speeding up or slowing down only certain actions, or “freezing” people and even erasing them from the video.

All the effects are achieved via a novel deep neural network-based model that learns a layered decomposition of the input video.

The researchers say their model draws inspiration from recent advances in neural rendering and combines classical elements from graphics rendering with deep learning. They leverage human-specific models and represent each person in a video with a single deep texture map that is used to render the person in each frame.
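To make that idea concrete, here is a minimal sketch of how a single texture map, shared across the whole video, might be sampled with per-frame UV coordinates to render one person. The shapes, the nearest-neighbour lookup, and the function name are illustrative assumptions, not the authors’ implementation.

```python
import numpy as np

def sample_texture(texture, uv):
    """Sample a per-person texture map at per-frame UV coordinates.

    texture: (Ht, Wt, C) learned texture map, shared across all frames
    uv:      (H, W, 2) coordinates in [0, 1] mapping each output pixel of the
             current frame onto the texture (e.g. from a human body model)
    returns: (H, W, C) sampled features for this frame
    """
    Ht, Wt, _ = texture.shape
    # Nearest-neighbour sampling (an assumption made for simplicity here).
    ui = np.clip((uv[..., 0] * (Wt - 1)).round().astype(int), 0, Wt - 1)
    vi = np.clip((uv[..., 1] * (Ht - 1)).round().astype(int), 0, Ht - 1)
    return texture[vi, ui]
```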

Layered Neural Rendering Model

The core of their “layered neural rendering” technique is a learned layer decomposition in which each layer represents the full appearance of an individual in the video. This not only disentangles the direct motions of each person in the input video, but also correlates each person automatically with the scene changes they generate, such as shadows, reflections, motion of loose clothing, etc.
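A rough way to picture the decomposition: each person contributes a colour image plus an opacity matte per frame, with the matte also carrying their correlated effects such as shadows and reflections, and the frame is rebuilt by compositing those layers over the background. The sketch below, with assumed array shapes, uses the standard “over” operator; it illustrates the idea rather than reproducing the released model.

```python
import numpy as np

def composite_over(background, layers):
    """Composite per-person layers over a background, back to front.

    background: (H, W, 3) float image in [0, 1]
    layers:     list of (rgb, alpha) pairs for one frame, where rgb is
                (H, W, 3) and alpha is (H, W, 1); alpha is assumed to already
                include correlated effects such as shadows and reflections.
    """
    frame = background.copy()
    for rgb, alpha in layers:                        # back-to-front order
        frame = alpha * rgb + (1.0 - alpha) * frame  # standard 'over' operator
    return frame
```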

The layers can be individually retimed and recombined into a new video, enabling realistic, high-quality renderings of the retiming effects for real-world videos. This “people geometry” approach is effective in depicting dynamic and complex human actions such as dancing, trampoline jumping, and group jogging.
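Retiming then amounts to sampling each person’s layer sequence with its own time mapping before recompositing. The sketch below uses hypothetical names and a nearest-frame lookup, not the authors’ code; it plays one person at half speed and freezes another while the rest of the scene runs normally.

```python
import numpy as np

def retime_frame(background, layer_sequences, time_maps, t):
    """Render output frame t with per-person retiming.

    background:      (H, W, 3) float image in [0, 1]
    layer_sequences: per-person lists of (rgb, alpha) pairs, indexed by the
                     source frame they were decomposed from
    time_maps:       per-person functions mapping output time t to a source
                     frame index (speed-up, slow-down, or a constant to freeze)
    """
    frame = background.copy()
    for seq, time_map in zip(layer_sequences, time_maps):
        src = int(np.clip(time_map(t), 0, len(seq) - 1))  # nearest source frame
        rgb, alpha = seq[src]
        frame = alpha * rgb + (1.0 - alpha) * frame       # 'over' compositing
    return frame

# Example time maps: play person 0 at half speed, freeze person 1 at frame 40.
time_maps = [lambda t: 0.5 * t, lambda t: 40]
```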

“We believe that our layered neural rendering approach holds great promise for additional types of synthesis techniques,” the researchers explain. The team plans to generalize the approach to other objects besides humans and to expand it to include additional non-trivial post-processing effects such as stylized rendering of different video components.

The paper Layered Neural Rendering for Retiming People in Video is on arXiv. The model’s code will be released at SIGGRAPH Asia 2020, which runs November 17-20.


Reporter: Yuan Yuan | Editor: Michael Sarazen


Synced Report | A Survey of China’s Artificial Intelligence Solutions in Response to the COVID-19 Pandemic — 87 Case Studies from 700+ AI Vendors

This report offers a look at how China has leveraged artificial intelligence technologies in the battle against COVID-19. It is also available on Amazon Kindle. Along with this report, we also introduced a database covering an additional 1,428 artificial intelligence solutions across 12 pandemic scenarios.

Click here to find more reports from us.


We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.
