The task of simultaneously classifying, segmenting, and tracking multiple object instances in videos is referred to as video instance segmentation (VIS). Modern VIS transformers such as VisTR adopt a per-clip approach and have shown impressive end-to-end performance, but they suffer from long training times and high computation costs due to their frame-wise dense attention. Moreover, training such models requires object instance mask annotations for every video frame, which is prohibitively expensive at scale.
In the new paper MinVIS: A Minimal Video Instance Segmentation Framework Without Video-based Training, an NVIDIA research team presents MinVIS, a minimal video instance segmentation framework that outperforms state-of-the-art VIS methods without requiring video-based training or annotations.
The team summarizes their main contributions as follows:
- We show that video-based architecture and training are not required for competitive VIS performance. MinVIS outperforms the previous state-of-the-art on the YouTube-VIS 2019 and 2021 datasets by 1% and 3% AP, respectively, while only training an image instance segmentation model.
- We show that image instance segmentation models capable of segmenting occluded instances are also well suited to tracking occluded instances in videos within our framework. MinVIS outperforms its per-clip counterpart by over 13% AP on the challenging Occluded VIS (OVIS) dataset, an improvement of over 10% AP over the previous best result on the dataset.
- Our image-based approach allows us to significantly sub-sample the required segmentation annotations in training without any change to the model. With only 1% of labelled frames, MinVIS outperforms or is comparable to fully-supervised state-of-the-art approaches on all three datasets.
The proposed MinVIS method trains each query to respond strongly only to the features of its corresponding instance, while the other query embeddings respond weakly to those features because instance masks are non-overlapping. As a result, the query embeddings of different instances within a frame are well-separated, which yields temporally consistent query embeddings that can be used for object tracking without video-based training.
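As a toy illustration of why this separation matters (using synthetic vectors as a stand-in for the model's actual query outputs, purely as an assumption for this sketch): when the embeddings of different instances are well-separated and the embedding of the same instance drifts only slightly between frames, its identity can be recovered with a simple nearest-embedding lookup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, well-separated query embeddings for 10 instances in frame t
# (in MinVIS these would come from a query-based image instance segmentation model).
queries_t = rng.normal(size=(10, 256))
# Same instances in frame t+1: temporally consistent embeddings with small drift.
queries_t1 = queries_t + 0.05 * rng.normal(size=(10, 256))

def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

within = normalize(queries_t) @ normalize(queries_t).T    # different instances: low off-diagonal similarity
across = normalize(queries_t) @ normalize(queries_t1).T   # same instance across frames: high diagonal similarity

# Because the embeddings are well-separated, a simple argmax recovers the identity mapping.
assert (across.argmax(axis=1) == np.arange(10)).all()
```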
For MinVIS inference, the team first applies a query-based image instance segmentation model to each video frame independently, then tracks the segmented instances by associating their corresponding query embeddings across frames. The query embeddings thus carry the information needed to track a given instance. Because the video frames are treated as independent images during MinVIS training, there is no need to annotate every frame in a video, which enables significant sub-sampling and reduction of segmentation annotations without any change to the model.
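The tracking step therefore reduces to matching query embeddings between adjacent frames. The sketch below is a simplified, hypothetical illustration of this idea rather than the authors' implementation: it assumes per-frame query embeddings have already been produced by an image instance segmentation model, and it uses Hungarian matching on cosine similarity to propagate instance IDs.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def track_by_query_matching(per_frame_embeddings):
    """Propagate instance IDs across frames by matching query embeddings.

    per_frame_embeddings: list of (num_queries, embed_dim) arrays, one per frame,
    e.g. the query embeddings of a per-frame instance segmentation model.
    Returns one ID array per frame; matched queries share the same ID over time.
    """
    num_queries = per_frame_embeddings[0].shape[0]
    ids = [np.arange(num_queries)]
    prev = per_frame_embeddings[0]
    for curr in per_frame_embeddings[1:]:
        # Cosine similarity between every previous and current query embedding.
        a = prev / np.linalg.norm(prev, axis=1, keepdims=True)
        b = curr / np.linalg.norm(curr, axis=1, keepdims=True)
        sim = a @ b.T
        # Bipartite (Hungarian) matching: maximize total similarity.
        row, col = linear_sum_assignment(-sim)
        new_ids = np.empty(num_queries, dtype=int)
        new_ids[col] = ids[-1][row]
        ids.append(new_ids)
        prev = curr  # carry the latest embeddings forward to the next frame
    return ids
```

Because the matching relies only on embedding similarity, no temporal module needs to be trained, which is what lets MinVIS keep all learning in the image-based model.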
In their empirical study, the team compared MinVIS against state-of-the-art approaches on the YouTube-VIS 2021 dataset, where it improved average precision (AP) by 3 percent. MinVIS also outperformed its per-clip counterparts by over 13 percent AP on the challenging Occluded VIS (OVIS) dataset, again without video-based training.
The researchers note that MinVIS’s practical advantages, reducing both label and computation costs without sacrificing model performance, make it a promising new approach to VIS, and they propose extending MinVIS with sub-sampled annotations to further improve performance.
The code is available on the project’s GitHub. The paper MinVIS: A Minimal Video Instance Segmentation Framework Without Video-based Training is on arXiv.
Author: Hecate He | Editor: Michael Sarazen