Introducing Prithvi WxC, a new general-purpose AI model for weather and climate

The future of AI-powered weather forecasting is looking promising. Some deep-learning models trained on historical weather data can already match the performance of conventional weather models that simulate physical processes on massive supercomputers that few people traditionally have access to.

In collaboration with NASA, IBM wanted to build more than just another AI forecasting model. Instead, the goal was to push things forward with a general-purpose AI model that could be customized for a range of practical weather and climate applications, at varying spatial scales. Today, with contributions from Oak Ridge National Laboratory, they are open-sourcing the result of that effort on Hugging Face, less than a year after setting out to design a full-fledged foundation model for weather and climate.

It took several weeks and dozens of GPUs to train the model on 40 years of historical weather data from NASA’s MERRA-2 reanalysis, a harmonized dataset of satellite and other historical Earth observation data. But the model can now be quickly tuned for different use cases and served from a desktop computer in seconds. Potential applications include creating targeted forecasts from local weather data, predicting extreme weather events, improving the spatial resolution of global climate simulations, and improving the representation of physical processes in conventional weather and climate models.

“We designed our foundation model so that all the hard work and GPU hours invested upfront would pay off by allowing people to quickly spin off and run new applications,” said Campbell Watson, an IBM climate researcher who helped develop the model.

In one experiment, the model took a tiny, localized sample of weather data and accurately reconstructed global surface temperatures by filling in 95% of the missing values. “The ability to generalize from a tiny sample of high-quality historical data to the entire planet is useful for a wide range of weather and climate projection tasks,” said Juan Bernabé-Moreno, the director of IBM Research Europe and IBM's lead for climate and sustainability.

Downscaling, hurricane forecasting, and capturing Earth’s elusive gravity waves

The new weather and climate foundation model is described in a paper posted on arXiv. In it, the researchers describe how they built the model and fine-tuned it on specialized data to create three applications with immediate relevance for forecasters.

Downscaling for precipitation in action on the IBM Geospatial Studio.

The first application is designed to zoom in on low-resolution data for more detail, a method known as downscaling. By localizing weather and climate projections, downscaling can provide early warning that extreme flooding or hurricane-force winds are on their way.

IBM has released a downscaling model as part of the IBM Granite family. It takes data of varying resolutions and types, like temperature and amount of rainfall, and magnifies them by up to 12 times. Through downscaling, intense rainfall leading to a flash flood that would previously have been viewed from a 150-square kilometer perspective in a traditional climate model can now be seen in 12.5-square kilometer segments. The downscaling application is available on Hugging Face.
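As a rough worked example of the figures above, the sketch below assumes they refer to grid cells about 150 km and 12.5 km on a side, which gives the 12-times magnification. The function name and the toy rainfall field are hypothetical, and the nearest-neighbor repeat only shows how the grid changes shape. A learned downscaling model would instead fill the finer cells with new detail.

```python
import numpy as np

COARSE_KM = 150.0  # assumed side length of a coarse grid cell
FINE_KM = 12.5     # assumed side length of a downscaled grid cell


def naive_upsample(field: np.ndarray, factor: int) -> np.ndarray:
    """Repeat every cell `factor` times along both axes (nearest neighbor)."""
    return np.repeat(np.repeat(field, factor, axis=0), factor, axis=1)


# Linear magnification implied by the two cell sizes.
factor = int(COARSE_KM / FINE_KM)

# Toy 2 x 3 rainfall field standing in for coarse climate-model output.
coarse = np.arange(6, dtype=float).reshape(2, 3)
fine = naive_upsample(coarse, factor)

print(f"linear factor: {factor}, fine cells per coarse cell: {factor ** 2}")
print(f"grid shape: {coarse.shape} -> {fine.shape}")
```

Under this reading, each coarse cell is covered by 144 fine cells, which is why a flash flood that blurs into a single value at coarse resolution can be resolved into a local pattern.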

The second focuses on hurricane forecasting. Researchers used the model to accurately reconstruct the track of Hurricane Ida, which struck Louisiana in 2021 and caused $75 billion in damages, making it the fourth costliest Atlantic hurricane on record. In the future, this model could be used to determine more accurately where to shore up defenses against oncoming hurricanes.

IBM and NASA’s third application is designed to improve estimates of gravity waves. In Earth’s atmosphere, gravity waves influence cloud formation and global weather patterns, including where aircraft turbulence appears. Traditional climate models fail to properly capture gravity waves at high resolution, adding uncertainty to weather and climate projections. Better gravity-wave estimates could be game-changing for the orchestration of global supply chains.

Separately, IBM is working with Canada’s weather agency, Environment and Climate Change Canada, to customize the base model for precipitation nowcasting, which involves using real-time radar data to make highly local rainfall predictions several hours out. The hope is that the data-driven foundation model approach could potentially use fewer computing resources and deliver more accurate results.

Learning to ‘think’ like a forecaster

This new weather and climate foundation model joins a growing family of open-source models designed to make NASA’s collection of satellite and other Earth observational datasets faster and easier to analyze. The model owes its flexibility to its hybrid architecture and unusual training regimen.

It’s built on a vision transformer and a masked autoencoder, allowing the model to encode spatial data unfolding through time. By extending the model’s attention mechanism to include time, it’s able to analyze MERRA-2 reanalysis data, which integrates multiple streams of observational data.
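One common way to let attention span time, assumed here for illustration and not taken from the paper, is to cut each time step into patches and place the patches from all time steps in a single token sequence. The sketch below shows only that tokenization step. The function name, patch size, and grid dimensions are hypothetical.

```python
import numpy as np


def to_spacetime_tokens(frames: np.ndarray, patch: int) -> np.ndarray:
    """Turn a (time, height, width) array into one sequence of flattened patches.

    Returns an array of shape (time * n_patches_per_frame, patch * patch),
    ordered by time step first, then by patch row and patch column.
    """
    t, h, w = frames.shape
    if h % patch or w % patch:
        raise ValueError("height and width must be divisible by the patch size")
    x = frames.reshape(t, h // patch, patch, w // patch, patch)
    x = x.transpose(0, 1, 3, 2, 4)  # (time, patch_row, patch_col, patch, patch)
    return x.reshape(t * (h // patch) * (w // patch), patch * patch)


# Two toy time steps of an 8 x 12 gridded field.
frames = np.arange(2 * 8 * 12, dtype=float).reshape(2, 8, 12)
tokens = to_spacetime_tokens(frames, patch=4)
print(tokens.shape)  # tokens from both time steps share one sequence
```

Because patches from both time steps sit in the same sequence, an attention layer applied to `tokens` could relate a location at one time to any location at the other.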

The model is also capable of running on both a sphere, as traditional gridded climate models do, and on a flat, rectangular surface. These dual representations allow the model to flip from global to regional views without sacrificing resolution.

The model was able to output the map on the left of global surface temperatures from the input on the right that had 95% of the map blacked out.

During training, researchers fed the model gridded, heavily blacked out climate reanalysis data and had it reconstruct each image pixel by pixel. They also had the model project the blacked-out image into the future. “The model effectively learns how the atmosphere evolves over time,” said Johannes Schmude, an IBM researcher who helped develop the model.

Asking the model to piece together incomplete weather data and envision its future state had two benefits. It cut in half the amount of data researchers needed to train the model, reducing GPU and energy consumption. It also taught the model how to fill in missing information, both in the present moment and beyond. This is essentially what weather forecasters do.

“Weather data is inherently sparse,” said Schmude. “To learn how to forecast, you have to learn how to fill in gaps.”
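The two training objectives described above, reconstructing a heavily masked state and projecting it forward in time, can be sketched as follows. This is a minimal illustration under stated assumptions, not the researchers' training code. The 95% mask ratio comes from the article, while the patch counts, the toy data, the stand-in `toy_model`, and the choice to score reconstruction only on hidden patches are all hypothetical.

```python
import numpy as np


def random_patch_mask(n_patches: int, mask_ratio: float,
                      rng: np.random.Generator) -> np.ndarray:
    """Return a boolean mask where True marks a patch hidden from the model."""
    n_masked = int(round(n_patches * mask_ratio))
    mask = np.zeros(n_patches, dtype=bool)
    mask[rng.choice(n_patches, size=n_masked, replace=False)] = True
    return mask


def mse(prediction: np.ndarray, target: np.ndarray, mask=None) -> float:
    """Mean squared error, restricted to masked patches when a mask is given."""
    diff = prediction - target
    if mask is not None:
        diff = diff[mask]
    return float(np.mean(diff ** 2))


def toy_model(visible: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: predict the mean visible patch everywhere."""
    return np.broadcast_to(visible[~mask].mean(axis=0), visible.shape)


rng = np.random.default_rng(0)
n_patches, patch_dim = 400, 16
state_now = rng.normal(size=(n_patches, patch_dim))                      # toy state at time t
state_next = state_now + 0.1 * rng.normal(size=(n_patches, patch_dim))   # toy state at t + 1

mask = random_patch_mask(n_patches, mask_ratio=0.95, rng=rng)
visible_input = np.where(mask[:, None], 0.0, state_now)  # blank out hidden patches

prediction = toy_model(visible_input, mask)
reconstruction_loss = mse(prediction, state_now, mask)  # fill gaps in the present
forecast_loss = mse(prediction, state_next)             # project into the future
print(int(mask.sum()), reconstruction_loss, forecast_loss)
```

A real model would replace `toy_model` with a trained network and would likely use separate outputs for the present and future states. The point here is only that both losses are computed from an input in which most patches are missing.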

What’s next

IBM and NASA plan to see if their existing open-source geospatial AI model for analyzing Earth observation data can be combined with their new model for weather and climate. Released last year, the Prithvi Earth Observation model has been developed into a wide array of applications that have together been downloaded more than 10,000 times. Among other things, the applications have been used to estimate the extent of past floods and infer the intensity of past wildfires from burn scars.

Together, the Earth Observation and weather and climate models could be applied to equally challenging tasks, from forecasting expected crop yields to predicting extreme flooding events and their impact on communities.

IBM also recently previewed a new offering, called Environmental Intelligence, which is in public preview until early 2025. It combines rich APIs on weather, geospatial conditions, carbon fluctuations, and industry-specific information that can help developers gather data and insights to build climate-resilient solutions for enterprise.