A new study by researchers from NVIDIA, the University of Toronto, McGill University and the Vector Institute introduces an efficient neural representation that enables real-time rendering of high-fidelity neural SDFs for the first time while delivering state-of-the-art (SOTA) geometric reconstruction quality.
In the paper Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes, the researchers note that neural approximations of signed distance functions (neural SDFs) have become the go-to choice for many SOTA computer vision and graphics applications. Such systems typically encode the SDF with a large, fixed-size multi-layer perceptron (MLP) that approximates complex shapes as implicit surfaces.
Although neural SDF encoding methods can often achieve SOTA geometry reconstruction quality, they are computationally expensive for real-time graphics, as many forward passes through the large network are required for every pixel. The paper’s first author, Towaki Takikawa, elaborated on Twitter: “Neural SDFs are emerging as a 3D geometry representation for graphics. SDFs are f(x,y,z) = d, a function of position which returns nearest distance to the surface. They are differentiable and smooth, but typical neural SDFs are very slow to render. Why are they slow? SDFs are rendered with an algorithm called sphere tracing, which performs numerous distance queries along the ray. Typical neural SDFs are composed of large MLPs, which makes sphere tracing prohibitively expensive.”
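To make the cost argument concrete, the sphere tracing loop Takikawa describes can be sketched in a few lines of Python. This is a minimal illustration, not the paper's tailored tracer: the analytic `sphere_sdf` stands in for a neural SDF, and the function names, step limit and tolerances are assumptions chosen for the example. Each loop iteration is one distance query, which for a typical neural SDF would mean one forward pass through a large MLP.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: positive outside, negative inside.
    Stand-in for a neural SDF (assumption for illustration)."""
    return math.dist(p, center) - radius

def sphere_trace(sdf, origin, direction, max_steps=64, eps=1e-4, t_max=100.0):
    """March along a ray, stepping by the queried distance each time.
    Returns (hit, t, num_queries)."""
    norm = math.sqrt(sum(c * c for c in direction))
    d = tuple(c / norm for c in direction)  # normalize the ray direction
    t = 0.0
    for step in range(1, max_steps + 1):
        p = tuple(o + t * c for o, c in zip(origin, d))
        dist = sdf(p)  # one distance query per step
        if dist < eps:
            return True, t, step  # close enough to the surface
        t += dist  # safe step: no surface is nearer than dist
        if t > t_max:
            return False, t, step  # ray left the scene
    return False, t, max_steps

# A ray that grazes past the sphere needs several queries before it is rejected.
hit, t, queries = sphere_trace(sphere_sdf, (0.0, 1.5, -3.0), (0.0, 0.0, 1.0))
print(hit, round(t, 3), queries)
```

Multiplying the per-ray query count by the number of pixels in a frame gives a rough sense of why a large network per query is prohibitive for real-time use.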
A critical innovation is the researchers’ novel representation for neural SDFs. “We encode geometry using a sparse voxel octree which contains feature vectors at the corners, where the levels of the octree correspond to levels of detail,” Takikawa explains. These vectors can be decoded using a small MLP without compromising the reconstruction quality or generality.
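The idea of corner features decoded by a small MLP can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: a dense grid per level stands in for the sparse voxel octree, features from each level are trilinearly interpolated and summed up to the requested level of detail, and the tiny decoder uses random, untrained weights. All names (`make_level`, `level_feature`, `query_feature`, `tiny_mlp`) and sizes are hypothetical.

```python
import random

def trilinear(corners, u, v, w):
    """Blend 8 corner feature vectors; corners[i][j][k] sits at offset (i, j, k)."""
    dim = len(corners[0][0][0])
    out = [0.0] * dim
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                weight = (u if i else 1 - u) * (v if j else 1 - v) * (w if k else 1 - w)
                for c in range(dim):
                    out[c] += weight * corners[i][j][k][c]
    return out

def make_level(level, dim, rng):
    """Dense grid of corner features at resolution 2**level (stand-in for a sparse octree level)."""
    res = 2 ** level
    return {(i, j, k): [rng.uniform(-1, 1) for _ in range(dim)]
            for i in range(res + 1) for j in range(res + 1) for k in range(res + 1)}

def level_feature(grid, level, p):
    """Interpolate the features of the voxel containing p, with p in [0, 1]^3."""
    res = 2 ** level
    idx, frac = [], []
    for x in p:
        s = min(max(x, 0.0), 1.0) * res
        i = min(int(s), res - 1)  # voxel index along this axis
        idx.append(i)
        frac.append(s - i)        # position inside the voxel
    corners = [[[grid[(idx[0] + i, idx[1] + j, idx[2] + k)] for k in (0, 1)]
                for j in (0, 1)] for i in (0, 1)]
    return trilinear(corners, *frac)

def query_feature(grids, p, lod):
    """Sum interpolated features from level 1 up to the requested level of detail."""
    total = None
    for level in range(1, lod + 1):
        f = level_feature(grids[level], level, p)
        total = f if total is None else [a + b for a, b in zip(total, f)]
    return total

def tiny_mlp(feature, p, w1, b1, w2, b2):
    """One hidden ReLU layer mapping [feature, position] to a scalar distance."""
    x = list(feature) + list(p)
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

rng = random.Random(0)
DIM, HIDDEN, MAX_LOD = 4, 8, 3
grids = {level: make_level(level, DIM, rng) for level in range(1, MAX_LOD + 1)}
w1 = [[rng.uniform(-1, 1) for _ in range(DIM + 3)] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [rng.uniform(-1, 1) for _ in range(HIDDEN)]

point = (0.3, 0.6, 0.2)
for lod in range(1, MAX_LOD + 1):
    d = tiny_mlp(query_feature(grids, point, lod), point, w1, b1, w2, 0.0)
    print(lod, d)  # untrained weights, so the values are not meaningful distances
```

In this sketch most of the shape information lives in the stored feature vectors rather than in the network weights, which is what lets the decoder stay small; choosing a lower `lod` simply stops the sum at a coarser level.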
The proposed architecture combines a small surface extraction neural network with a sparse-octree data structure that encodes the geometry and enables neural geometric level of detail (LOD). In computer graphics, LOD refers to 3D shapes filtered to limit feature variations to approximately twice the pixel size in image space, which mitigates flickering and accelerates rendering by reducing model complexity. Combined with a tailored sphere tracing algorithm, the proposed method is both computationally efficient and highly expressive.
In experiments, the approach achieved a rendering speedup of 2-3 orders of magnitude over baseline networks and state-of-the-art reconstruction quality for complex shapes on both 3D geometric and 2D image-space metrics.
The research team says the study is a big step forward in neural implicit function-based geometry, with potential applications in scene reconstruction, ultra-precise robotics path planning, interactive content creation and beyond.
The paper Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes is on arXiv.
Reporter: Fangyu Cai | Editor: Michael Sarazen