Facebook AI’s FLAVA Foundational Model Tackles Vision, Language, and Vision & Language Tasks All at Once
A Facebook AI Research team presents FLAVA, a foundational language and vision alignment model that explicitly targets language, vision, and their multimodal combination all at once, achieving impressive performance on 35 tasks across the vision, language, and multimodal domains.