Tag: visual language model

AI Machine Learning & Data Science Research

3D-LLM: Integrating the 3D World Into Language Models

In the new paper 3D-LLM: Injecting the 3D World into Large Language Models, a research team injects the 3D world into large language models, presenting 3D-LLMs, a new family of models that capture 3D spatial information to perform 3D-related tasks.
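
The core idea, turning features anchored to a 3D scene into prefix tokens a language model can read, can be sketched compactly. The following PyTorch snippet is a minimal, hypothetical illustration, not the authors' exact architecture: the class name Point3DPrefix, the dimensions, and the query-based pooling (loosely reminiscent of the Q-Former-style backbones the paper builds on) are all assumptions chosen for clarity.

```python
import torch
import torch.nn as nn

class Point3DPrefix(nn.Module):
    """Hypothetical sketch: pool per-point 3D features into a fixed
    number of tokens and project them into an LLM's embedding space,
    so they can be prepended to the text prompt as a '3D prefix'."""

    def __init__(self, point_dim=1408, llm_dim=4096, n_tokens=32, n_heads=8):
        super().__init__()
        # Learned queries cross-attend over the (variable-length) point features.
        self.queries = nn.Parameter(torch.randn(n_tokens, point_dim))
        self.attn = nn.MultiheadAttention(point_dim, n_heads, batch_first=True)
        self.proj = nn.Linear(point_dim, llm_dim)

    def forward(self, point_feats):
        # point_feats: (batch, num_points, point_dim), e.g. 2D VLM features
        # lifted from multi-view renders back onto the point cloud.
        q = self.queries.unsqueeze(0).expand(point_feats.size(0), -1, -1)
        pooled, _ = self.attn(q, point_feats, point_feats)
        return self.proj(pooled)  # (batch, n_tokens, llm_dim)

feats = torch.randn(2, 4096, 1408)   # toy per-point features
prefix = Point3DPrefix()(feats)      # (2, 32, 4096): prepend to prompt embeddings
```

In this sketch the pooled tokens would simply be concatenated with the prompt's token embeddings before being fed to the language model.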

AI Machine Learning & Data Science Research

No Images Are Needed! Allen AI’s CLOSE Learns to Complete Visual Tasks From Text Inputs Alone

In the new paper I Can’t Believe There’s No Images! Learning Visual Tasks Using only Language Data, an Allen Institute for Artificial Intelligence team proposes Cross Modal Transfer On Semantic Embeddings (CLOSE), an approach that learns high-level skills from textual data, then uses these skills to complete vision tasks without additional visual training data.
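
CLOSE exploits the fact that CLIP's text and image encoders map into a roughly shared embedding space: a task head trained only on text embeddings can be handed image embeddings at inference time. Below is a minimal sketch of that idea in PyTorch; the class name, layer sizes, and noise level are illustrative stand-ins rather than the authors' implementation, though the Gaussian noise on text embeddings during training (to bridge the residual modality gap) does echo a trick from the paper.

```python
import torch
import torch.nn as nn

class CloseStyleHead(nn.Module):
    """Toy task head in the spirit of CLOSE: trained on CLIP *text*
    embeddings, applied to CLIP *image* embeddings at test time."""

    def __init__(self, clip_dim=512, hidden=1024, num_classes=80):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(clip_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, clip_embed, noise_std=0.0):
        if noise_std > 0:
            # Training-time Gaussian noise nudges text embeddings toward
            # the image region of CLIP's shared space.
            clip_embed = clip_embed + noise_std * torch.randn_like(clip_embed)
        return self.net(clip_embed)

head = CloseStyleHead()
text_embed = torch.randn(4, 512)          # stand-in for clip.encode_text(...)
logits = head(text_embed, noise_std=0.1)  # train on text only
image_embed = torch.randn(4, 512)         # stand-in for clip.encode_image(...)
preds = head(image_embed)                 # infer on images, no visual training
```

The design choice doing the work here is the frozen, shared CLIP embedding space: because text and image features land near one another, skills learned on cheap captions transfer to pixels with no image supervision.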