Despite significant progress in artificial intelligence (AI), most existing state-of-the-art (SOTA) systems are unimodal, single-task systems. This poses a challenge for developing medical AI, as medical tasks are inherently multimodal, with rich modalities spanning text, imaging, genomics, and more.
To bridge this gap, in a new paper Towards Generalist Biomedical AI, a research team from Google Research and Google DeepMind presents Med-PaLM Multimodal (Med-PaLM M), a large multimodal generative model that can process multi-modal biomedical data including clinical language, imaging, and genomics using a single set of model weights without any task-specific modification.

The team summarizes their main contributions as follows:
- Curation of MultiMedBench: We introduce MultiMedBench, a new multimodal biomedical benchmark spanning multiple modalities including medical imaging, clinical text and genomics with 14 diverse tasks for training and evaluating generalist biomedical AI systems.
- Med-PaLM M, the first demonstration of a generalist biomedical AI system: We introduce Med-PaLM M, a single multitask, multimodal biomedical AI system that can perform medical image classification, medical question answering, visual question answering, radiology report generation and summarization, genomic variant calling, and more with the same set of model weights.
- Evidence of novel emergent capabilities in Med-PaLM M: Beyond quantitative evaluations of task performance, we observe evidence of zero-shot medical reasoning, generalization to novel medical concepts and tasks, and positive transfer across tasks.
- Human evaluation of Med-PaLM M outputs: Beyond automated metrics, we perform radiologist evaluation of chest X-ray reports generated by Med-PaLM M across different model scales.
The team starts by addressing the absence of comprehensive multimodal medical benchmarks. They propose MultiMedBench, a multimodal biomedical benchmark that covers a wide range of data sources and measures how well a general-purpose biomedical AI handles medical tasks such as visual question answering, report generation, and medical image classification.

Next, the researchers leverage MultiMedBench to develop Med-PaLM M by fine-tuning and aligning the PaLM-E model to the biomedical domain. The resulting generalist model takes multimodal medical data as input and processes it with a single set of model parameters, which lets it perform multiple tasks.

More specifically, the team trained the model with a mixture of distinct tasks simultaneously by using instruction tuning, added a text-only “one-shot exemplar” to enable the model to align with instructions, and fine-tuned the pretrained variants of PaLM-E on MultiMedBench tasks to obtain the resulting Med-PaLM M model.
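
To make the prompt structure described above more concrete, here is a minimal Python sketch of how an instruction, a text-only one-shot exemplar, and the actual query might be assembled into one task prompt. This is an illustration under stated assumptions, not the team's implementation: the placeholder strings `<img>` and `<image_embeddings>`, the `Exemplar` class, the `build_prompt` function, and the exact prompt wording are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical marker: in a text-only exemplar, the image is replaced by plain text.
TEXT_IMAGE_PLACEHOLDER = "<img>"
# Hypothetical marker for where the real image input would go in the actual query.
QUERY_IMAGE_SLOT = "<image_embeddings>"


@dataclass
class Exemplar:
    """One worked example shown to the model, with no real image attached."""
    question: str
    answer: str


def build_prompt(instruction: str, exemplar: Exemplar, question: str) -> str:
    """Join instruction, text-only one-shot exemplar, and the query (illustrative format)."""
    parts = [
        f"Instructions: {instruction}",
        f"Given {TEXT_IMAGE_PLACEHOLDER}. Q: {exemplar.question} A: {exemplar.answer}",
        f"Given {QUERY_IMAGE_SLOT}. Q: {question} A:",
    ]
    return "\n".join(parts)


if __name__ == "__main__":
    prompt = build_prompt(
        instruction="You are a helpful radiology assistant. Answer the question about the image.",
        exemplar=Exemplar(question="Is there a fracture?", answer="No."),
        question="Is there evidence of pneumonia?",
    )
    print(prompt)
```

The exemplar shows the model the expected output format without supplying a second image, while the final line leaves the answer open for the model to complete.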

In their empirical study, the team evaluated Med-PaLM M on all tasks in MultiMedBench. Med-PaLM M performs near or above SOTA on all tasks while also demonstrating strong zero-shot generalization capabilities.
To the team's best knowledge, Med-PaLM M is the first attempt at a generalist biomedical AI system. The team believes their work represents a crucial step towards the development of generalist biomedical AI.
The paper Towards Generalist Biomedical AI is on arXiv.
Author: Hecate He | Editor: Chain Zhang
