DeepMind’s JetFormer: Unified Multimodal Models Without Modelling Constraints
A DeepMind research team introduces JetFormer, a Transformer designed to model raw data directly. The model is trained to maximize the likelihood of raw data without relying on any separately pre-trained components, and it can both understand and generate text and images.