Despite significant progress in artificial intelligence (AI), most existing state-of-the-art (SOTA) systems are unimodal, single-task systems. This poses a challenge for developing medical AI, as medical tasks are inherently multimodal, with rich modalities spanning text, imaging, genomics, and more.
To bridge this gap, in a new paper Towards Generalist Biomedical AI, a research team from Google Research and Google DeepMind presents Med-PaLM Multimodal (Med-PaLM M), a large multimodal generative model that can process multimodal biomedical data including clinical language, imaging, and genomics using a single set of model weights without any task-specific modification.
The team summarizes their main contributions as follows:
- Curation of MultiMedBench: We introduce MultiMedBench, a new multimodal biomedical benchmark spanning multiple modalities including medical imaging, clinical text and genomics with 14 diverse tasks for training and evaluating generalist biomedical AI systems.
- Med-PaLM M, the first demonstration of a generalist biomedical AI system: We introduce Med-PaLM M, a single multitask, multimodal biomedical AI system that can perform medical image classification, medical question answering, visual question answering, radiology report generation and summarization, genomic variant calling, and more with the same set of model weights.
- Evidence of novel emergent capabilities in Med-PaLM M: Beyond quantitative evaluations of task performance, we observe evidence of zero-shot medical reasoning, generalization to novel medical concepts and tasks, and positive transfer across tasks.
- Human evaluation of Med-PaLM M outputs: Beyond automated metrics, we perform radiologist evaluation of chest X-ray reports generated by Med-PaLM M across different model scales.
The team starts by addressing the absence of comprehensive multimodal medical benchmarks. They propose MultiMedBench, a multimodal biomedical benchmark that covers a wide range of data sources and measures how well a general-purpose biomedical AI handles medical tasks such as visual question answering, report generation, and medical image classification.
Next, the researchers leverage MultiMedBench to develop Med-PaLM M by fine-tuning and aligning the PaLM-E model to the biomedical domain. The resulting generalist model takes multimodal medical data as input and performs multiple tasks using a single set of model parameters.
More specifically, the team used instruction tuning to train the model on a mixture of distinct tasks simultaneously, added a text-only “one-shot exemplar” to each prompt to help the model follow instructions, and fine-tuned the pretrained variants of PaLM-E on MultiMedBench tasks to obtain the resulting Med-PaLM M model.
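The prompt construction and task mixing described above can be illustrated with a minimal Python sketch. This is not the team's code: the `Example` class, the `build_prompt` and `sample_mixture` helpers, the prompt wording, and the mixture weights are all hypothetical, and image inputs are left out so that only the text side of the prompt is shown.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Example:
    """One training example (hypothetical schema, text fields only)."""
    task: str
    instruction: str
    question: str
    answer: str


def build_prompt(query: Example, exemplar: Example) -> str:
    """Combine a task instruction, a text-only one-shot exemplar, and the query."""
    return "\n".join([
        f"Instructions: {query.instruction}",
        f"Example: Q: {exemplar.question} A: {exemplar.answer}",
        f"Q: {query.question} A:",
    ])


def sample_mixture(
    datasets: Dict[str, List[Example]],
    weights: Dict[str, float],
    n: int,
    seed: int = 0,
) -> List[Tuple[str, str, str]]:
    """Draw n (task, prompt, target) triples from a weighted mixture of tasks.

    Assumes every task pool holds at least two examples, so the exemplar
    always differs from the query.
    """
    rng = random.Random(seed)
    names = list(datasets)
    probs = [weights[name] for name in names]
    batch = []
    for _ in range(n):
        task = rng.choices(names, weights=probs, k=1)[0]
        query, exemplar = rng.sample(datasets[task], 2)
        batch.append((task, build_prompt(query, exemplar), query.answer))
    return batch
```

Each sampled triple pairs a prompt with its target answer, so a single model sees examples from every task during fine-tuning rather than being trained on one task at a time.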
In their empirical study, the team evaluated Med-PaLM M on all tasks in MultiMedBench. Med-PaLM M performs near or above SOTA on all tasks while also demonstrating strong zero-shot generalization capabilities.
To the best of the team’s knowledge, Med-PaLM M is the first demonstration of a generalist biomedical AI system. The team believes their work represents a crucial step towards the development of generalist biomedical AI.
The paper Towards Generalist Biomedical AI is on arXiv.
Author: Hecate He | Editor: Chain Zhang