Climate change and extreme weather events have made weather and climate modelling a challenging yet crucial real-world task. While current state-of-the-art approaches tend to employ numerical models conditioned on physical information collected from the atmosphere, the development of powerful deep learning models and the increasing availability of massive climate datasets have advanced the possibility of a data-driven, general-purpose foundation model for such modelling.
In the new paper ClimaX: A Foundation Model for Weather and Climate, a team from Microsoft Autonomous Systems and Robotics Research, Microsoft Research AI4Science and the University of California at Los Angeles presents ClimaX, a general-purpose deep learning foundation model for weather and climate that can be efficiently adapted for various tasks related to the Earth’s atmosphere.
The team set out to train a generalizable foundation model capable of handling heterogeneous datasets that differ in their variables, spatiotemporal coverage, and physical grounding. They built ClimaX on a vision transformer (ViT) backbone and introduced two main architectural changes, variable tokenization and variable aggregation, to improve its flexibility and generality.
Variable tokenization is a novel scheme that splits each input variable into patches separately. Each patch is then linearly embedded into a vector whose dimension is the chosen embedding size, enabling ClimaX to learn from datasets with different numbers of input variables.
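To make the idea concrete, here is a minimal NumPy sketch of per-variable tokenization. It is an illustration under stated assumptions, not the ClimaX implementation: the function name `tokenize_variables`, the use of a separate weight matrix per variable, the patch size, and the random inputs are all hypothetical choices made for the example.

```python
import numpy as np


def tokenize_variables(x, patch, weights, biases):
    """Split each variable's grid into patches and embed every patch linearly.

    x:       (V, H, W) array, one 2D field per variable
    weights: (V, patch * patch, D) array; one embedding matrix per variable
             (an assumption made for this sketch)
    biases:  (V, D) array
    Returns tokens of shape (V, N, D), where N = (H // patch) * (W // patch).
    """
    num_vars, height, width = x.shape
    if height % patch or width % patch:
        raise ValueError("grid size must be divisible by the patch size")
    hp, wp = height // patch, width // patch

    # Cut each field into non-overlapping patch x patch tiles and flatten them.
    tiles = x.reshape(num_vars, hp, patch, wp, patch).transpose(0, 1, 3, 2, 4)
    tiles = tiles.reshape(num_vars, hp * wp, patch * patch)

    # Linear embedding applied independently to each variable's patches.
    return np.einsum("vnp,vpd->vnd", tiles, weights) + biases[:, None, :]


# Toy example: 3 variables on an 8 x 16 grid, 4 x 4 patches, embedding size 5.
rng = np.random.default_rng(0)
num_vars, height, width, patch, embed_dim = 3, 8, 16, 4, 5
x = rng.standard_normal((num_vars, height, width))
weights = rng.standard_normal((num_vars, patch * patch, embed_dim))
biases = np.zeros((num_vars, embed_dim))

tokens = tokenize_variables(x, patch, weights, biases)
print(tokens.shape)  # (3, 8, 5)
```

Because each variable carries its own tokens, a dataset with more or fewer variables only changes the first axis of the output; the embedding size stays the same.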
Variable tokenization, however, has two drawbacks. It is computationally expensive, because the input sequence grows with the number of variables, and the attention layers struggle to learn, because the sequence mixes tokens of different variables with very different physical groundings. The team addresses both issues with variable aggregation, which uses a cross-attention operation to output a single vector for each spatial position. This shortens the sequence and yields unified tokens with shared semantics, making it easier for the attention layers to learn.
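The aggregation step can be sketched as single-head cross-attention in which one query vector attends over the variable tokens at each spatial position. This is a simplified illustration, not the authors' code: the function name `aggregate_variables`, the single shared query, the single head, and the random weights are assumptions made for the example.

```python
import numpy as np


def aggregate_variables(tokens, query, w_k, w_v):
    """Pool the V variable tokens at each position into one vector.

    tokens: (V, N, D) array of per-variable tokens
    query:  (D,) query vector shared by all positions (assumed for this sketch)
    w_k:    (D, D) key projection
    w_v:    (D, D) value projection
    Returns (pooled, attn) with shapes (N, D) and (V, N).
    """
    embed_dim = tokens.shape[-1]
    keys = tokens @ w_k      # (V, N, D)
    values = tokens @ w_v    # (V, N, D)

    # Scaled dot-product scores of the query against every variable token.
    scores = keys @ query / np.sqrt(embed_dim)        # (V, N)

    # Softmax over the variable axis, stabilised by subtracting the maximum.
    scores = scores - scores.max(axis=0, keepdims=True)
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=0, keepdims=True)

    # Weighted sum of values gives one vector per spatial position.
    pooled = (attn[:, :, None] * values).sum(axis=0)  # (N, D)
    return pooled, attn


# Toy example: 3 variables, 8 spatial positions, embedding size 5.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((3, 8, 5))
query = rng.standard_normal(5)
w_k = rng.standard_normal((5, 5))
w_v = rng.standard_normal((5, 5))

pooled, attn = aggregate_variables(tokens, query, w_k, w_v)
print(tokens.shape[0] * tokens.shape[1], "->", pooled.shape[0])  # 24 -> 8
```

In this toy setting the sequence passed on to the transformer shrinks from V × N tokens to N tokens, which is the source of the cost saving described above.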
In their empirical study, the team compared ClimaX with existing data-driven baselines on downstream tasks such as forecasting, climate projection, and climate downscaling. In the evaluations, ClimaX achieved superior performance on all tasks, demonstrating its potential as a pioneering foundation model that enables broad scaling and generality in data-driven systems for weather and climate modelling.
The team believes it would also be interesting to explore the generalization abilities of a pretrained ClimaX backbone across other domains such as agriculture, demography, and actuarial sciences.
The paper ClimaX: A Foundation Model for Weather and Climate is on arXiv.
Author: Hecate He | Editor: Michael Sarazen