Facebook & CMU’s Zero-Shot VideoCLIP Outperforms Fully-Supervised SOTA Methods for Video-Text Understanding
A research team from Facebook AI and Carnegie Mellon University presents VideoCLIP, a contrastive approach for pretraining a unified model for zero-shot video and text understanding without requiring annotated data for downstream tasks.
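For readers unfamiliar with the term, a contrastive objective pulls the embeddings of matching video-text pairs together and pushes mismatched pairs apart. The sketch below is a generic symmetric InfoNCE loss in plain Python, included only as an illustration. It is not taken from the VideoCLIP paper or code. The function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all assumptions made for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(video_embs, text_embs, temperature=0.07):
    """Generic symmetric InfoNCE loss (illustrative, not VideoCLIP's exact loss).

    Assumes video_embs[i] and text_embs[i] form the positive pair, and every
    other combination in the batch is treated as a negative.
    The temperature default is an assumed, commonly used value.
    """
    videos = [l2_normalize(v) for v in video_embs]
    texts = [l2_normalize(t) for t in text_embs]
    n = len(videos)

    # logits[i][j] = cosine similarity of video i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(videos[i], texts[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]

    def mean_cross_entropy_on_diagonal(rows):
        # Cross-entropy where the correct "class" for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            row_max = max(row)  # subtract the max for numerical stability
            log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    # Video-to-text uses the rows; text-to-video uses the columns.
    columns = [list(col) for col in zip(*logits)]
    return 0.5 * (mean_cross_entropy_on_diagonal(logits)
                  + mean_cross_entropy_on_diagonal(columns))


# Toy embeddings (made up for illustration)
video = [[1.0, 0.0], [0.0, 1.0]]
text_aligned = [[1.0, 0.0], [0.0, 1.0]]
text_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(video, text_aligned)
loss_swapped = info_nce(video, text_swapped)
```

With these toy inputs, correctly aligned pairs yield a much lower loss than swapped pairs. Minimizing a loss of this kind during pretraining is what lets a model later match unseen videos and text without task-specific labels.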