UC Berkeley’s Instruct-NeRF2NeRF Edits 3D Scenes With Text Instructions

Recent progress in neural 3D reconstruction has greatly simplified capturing realistic digital representations of real-world 3D objects and scenes via neural radiance fields (NeRFs) built using information from multiple camera viewpoints. Current approaches for editing such 3D representations, however, are much less accessible and typically require specialized tools.

In the new paper Instruct-NeRF2NeRF: Editing 3D Scenes With Instructions, a UC Berkeley research team presents Instruct-NeRF2NeRF, an approach for editing 3D NeRF scenes through natural language text instructions alone. The proposed method is able to edit large-scale, real-world 3D scenes with improved ease of use and realism.

Instruct-NeRF2NeRF takes as its inputs a reconstructed NeRF scene, a set of captured images and their corresponding camera poses, and camera calibration information. The user’s natural-language editing instructions are then used to condition the model’s edited NeRF output.
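The inputs listed above can be pictured as a single bundle. The following is a minimal sketch only: the class name `EditInputs` and all of its field names are hypothetical and are not taken from the paper or from any library. The scene, images, and poses are treated as opaque objects.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class EditInputs:
    """Hypothetical container for the inputs described in the article."""

    scene: Any                    # the reconstructed NeRF scene (opaque here)
    images: List[Any]             # the captured images
    camera_poses: List[Any]       # one camera pose per captured image
    calibration: Dict[str, float] # camera calibration information
    instruction: str              # the natural-language editing instruction

    def __post_init__(self) -> None:
        # Each captured image needs its corresponding camera pose.
        if len(self.images) != len(self.camera_poses):
            raise ValueError("expected one camera pose per captured image")
        # An empty instruction gives the editor nothing to condition on.
        if not self.instruction.strip():
            raise ValueError("the editing instruction must not be empty")


inputs = EditInputs(
    scene=object(),
    images=["view_0", "view_1"],
    camera_poses=["pose_0", "pose_1"],
    calibration={"focal_length": 1.0},
    instruction="make it look like it just snowed",
)
```

Grouping the inputs this way makes the one-pose-per-image requirement explicit before any editing starts.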

Instruct-NeRF2NeRF uses InstructPix2Pix, a diffusion-based model specialized for image editing, to iteratively update image content at the captured viewpoints. These dataset edits are then consolidated into a globally consistent 3D representation via NeRF training. This novel Iterative Dataset Update (Iterative DU) approach enables Instruct-NeRF2NeRF to gradually percolate diffusion priors into a 3D scene reconstruction while maintaining the original scene’s structure and identity.
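The alternation between editing dataset images and training the NeRF can be sketched as a simple loop. This is a toy illustration under stated assumptions, not the authors' implementation: `ToyScene` stands in for a NeRF, `toy_edit` stands in for InstructPix2Pix, and the names `num_steps` and `edit_every` are hypothetical. The sketch also assumes that each edit starts from a render of the current scene and is conditioned on the original captured image.

```python
import random
from typing import Callable, List

Image = List[float]  # toy stand-in for an image: a flat list of pixel values


def iterative_dataset_update(
    dataset: List[Image],
    originals: List[Image],
    render: Callable[[int], Image],
    edit: Callable[[Image, Image, str], Image],
    train_step: Callable[[List[Image]], None],
    instruction: str,
    num_steps: int = 200,
    edit_every: int = 10,
    seed: int = 0,
) -> List[Image]:
    """Alternate between replacing one dataset image and training the scene."""
    rng = random.Random(seed)
    for step in range(num_steps):
        if step % edit_every == 0:
            view = rng.randrange(len(dataset))
            # Replace the stored image for this view with an edited render.
            dataset[view] = edit(render(view), originals[view], instruction)
        # Keep fitting the scene to the partially edited dataset.
        train_step(dataset)
    return dataset


class ToyScene:
    """Hypothetical stand-in for a NeRF: one brightness value for all views."""

    def __init__(self, value: float) -> None:
        self.value = value

    def render(self, view: int) -> Image:
        return [self.value] * 4

    def train_step(self, dataset: List[Image], lr: float = 0.1) -> None:
        pixels = [p for image in dataset for p in image]
        target = sum(pixels) / len(pixels)
        self.value += lr * (target - self.value)


def toy_edit(rendered: Image, original: Image, instruction: str) -> Image:
    # Hypothetical editor: moves every pixel halfway toward full brightness.
    # A real editor would also use the original image and the instruction.
    return [p + 0.5 * (1.0 - p) for p in rendered]


originals = [[0.2] * 4 for _ in range(5)]
dataset = [image[:] for image in originals]
scene = ToyScene(0.2)
iterative_dataset_update(
    dataset, originals, scene.render, toy_edit, scene.train_step, "make it brighter"
)
```

Because only one image is replaced at a time while training continues, the scene absorbs each edit gradually instead of being fitted to a set of mutually inconsistent edited images all at once.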

The team uses NeRFStudio’s Nerfacto model as the underlying NeRF implementation. Before running the NeRF optimization, they tune the parameters that control the diffusion noise level (and hence how much of the original signal is preserved) and the model’s classifier-free guidance weights. Together these settings govern edit strength and allow for different degrees of scene editing.
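To show how guidance weights control edit strength, here is a minimal sketch of the two-weight classifier-free guidance combination described in the InstructPix2Pix paper, with one weight for the image condition and one for the text condition. The function and parameter names are hypothetical, the noise predictions are plain lists of floats rather than real model outputs, and the numeric values in the example are illustrative only.

```python
from typing import List, Sequence


def combine_guidance(
    eps_uncond: Sequence[float],
    eps_image: Sequence[float],
    eps_full: Sequence[float],
    image_scale: float,
    text_scale: float,
) -> List[float]:
    """Combine three noise predictions using two guidance weights.

    eps_uncond: prediction with neither the image nor the text condition
    eps_image:  prediction with the image condition only
    eps_full:   prediction with both the image and the text condition
    """
    if not (len(eps_uncond) == len(eps_image) == len(eps_full)):
        raise ValueError("all noise predictions must have the same length")
    return [
        u + image_scale * (i - u) + text_scale * (f - i)
        for u, i, f in zip(eps_uncond, eps_image, eps_full)
    ]


# Illustrative values only: a larger text_scale pushes the result further
# along the direction contributed by the text instruction.
mild = combine_guidance([0.0], [1.0], [2.0], image_scale=1.5, text_scale=2.0)
strong = combine_guidance([0.0], [1.0], [2.0], image_scale=1.5, text_scale=7.5)
```

With both weights set to 1, the combination reduces to the fully conditioned prediction. Raising the text weight strengthens adherence to the instruction, while raising the image weight keeps the result closer to the input image.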

In their empirical study, the team applied Instruct-NeRF2NeRF to a variety of captured 3D scenes of varying complexity, including 360-degree scenes of environments and objects, and compared it qualitatively and quantitatively against ablated baseline variants. The results show that Instruct-NeRF2NeRF can perform targeted edits on 3D representations of people, objects, and large-scale real-world scenes, and that its outputs are more realistic than those of the baselines.

Result videos can be found on the project’s website. The paper Instruct-NeRF2NeRF: Editing 3D Scenes With Instructions is on arXiv.


Author: Hecate He | Editor: Michael Sarazen


