AI weather forecasting refers to the use of machine learning and deep learning models to predict atmospheric conditions, replacing or augmenting the physics-based numerical weather prediction (NWP) systems that have been the standard approach since the mid-20th century. Between 2022 and 2025, a rapid succession of AI models demonstrated that data-driven approaches trained on decades of historical weather observations can match or outperform the world's best operational forecasting systems while running thousands of times faster. This shift represents one of the most significant transformations in the history of meteorological science.
Modern weather forecasting has traditionally relied on numerical weather prediction, a method that simulates the atmosphere by solving the fundamental equations of fluid dynamics, thermodynamics, and radiative transfer on high-resolution grids. The European Centre for Medium-Range Weather Forecasts (ECMWF) operates the Integrated Forecasting System (IFS), widely regarded as the most accurate global NWP system. The IFS high-resolution deterministic model (HRES) produces 10-day forecasts at approximately 9 km horizontal resolution with 137 vertical levels, while the ensemble forecast system (ENS) generates 51 ensemble members to capture forecast uncertainty.
Running these physics-based simulations requires enormous computational resources. The ECMWF operates a supercomputing facility consisting of four Atos BullSequana XH2000 clusters with a combined performance of approximately 30 petaflops. Producing a single 10-day global forecast demands billions of calculations across millions of grid points, with the full operational forecast suite consuming significant fractions of this capacity. National weather services around the world collectively spend hundreds of millions of dollars annually on supercomputing infrastructure for NWP alone.
Despite steady improvements over many decades, NWP systems face fundamental constraints. Increasing resolution is extremely costly: halving the grid spacing refines both horizontal dimensions and forces a shorter time step, multiplying the computation roughly tenfold. Meanwhile, the chaotic nature of the atmosphere means that small errors in initial conditions grow rapidly over time, limiting the practical forecast horizon. These limitations motivated researchers to explore whether data-driven approaches could learn atmospheric dynamics directly from historical observations.
Nearly all major AI weather forecasting models are trained on ERA5, the fifth-generation global atmospheric reanalysis dataset produced by the Copernicus Climate Change Service at ECMWF. ERA5 provides hourly estimates of a large number of atmospheric, land, and oceanic climate variables on a 31 km (0.25 degree) grid, resolving the atmosphere using 137 levels from the surface up to 80 km in altitude. The dataset covers the period from January 1940 to the present, though most AI models use data from 1979 onward, when satellite observations became available and reanalysis quality improved substantially.
ERA5 was created through data assimilation, a process that combines physics-based model simulations with millions of weather observations from surface stations, radiosondes, aircraft, and satellites to produce a physically consistent estimate of the state of the atmosphere at every time step. This means that AI weather models are not trained on raw observations but on a carefully curated, gridded product that already encodes significant physical knowledge.
The dataset includes variables at multiple pressure levels (such as temperature, wind components, geopotential height, and specific humidity) as well as surface variables (such as mean sea level pressure, 2-meter temperature, and 10-meter wind speed). A typical AI weather model uses between 5 and 7 atmospheric variables at 13 to 37 pressure levels, plus 4 to 7 surface-level variables, resulting in hundreds of input and output channels per forecast time step.
AI weather models approach forecasting as a supervised learning problem. Given the current state of the atmosphere (and often the state six hours earlier), the model learns to predict the atmospheric state at a future time, typically six or twelve hours ahead. Longer forecasts are produced autoregressively, meaning the model feeds its own prediction back as input to generate the next time step, repeating this process to extend forecasts out to 10 or 15 days.
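The autoregressive rollout described above can be sketched in a few lines. This is a schematic with a toy stand-in for the trained network; `step_fn` is a hypothetical placeholder for any model that maps one atmospheric state to the next:

```python
import numpy as np

def rollout(step_fn, state, n_steps):
    """Autoregressively extend a one-step forecast.

    step_fn: maps the current atmospheric state to the state one
             model time step (e.g. 6 hours) ahead.
    state:   initial condition, e.g. an array of shape (channels, lat, lon).
    Returns the list of predicted states, one per step.
    """
    trajectory = []
    for _ in range(n_steps):
        state = step_fn(state)  # the model's own output becomes the next input
        trajectory.append(state)
    return trajectory

# Toy stand-in for a trained model: damp the state slightly each step.
toy_model = lambda x: 0.99 * x

initial = np.ones((3, 4, 8))                        # (channels, lat, lon) toy grid
forecast = rollout(toy_model, initial, n_steps=40)  # 40 x 6 h = a 10-day forecast
print(len(forecast))
```

With a 6-hour base step, a 10-day forecast is 40 applications of the model; forecast errors compound through this loop, which is why long-lead skill is hard to maintain.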
Several different neural network architectures have proven effective for this task:
Vision Transformers (ViTs) treat the atmospheric state as an image, dividing the global grid into patches that are processed as tokens through self-attention layers. FourCastNet pioneered this approach using Adaptive Fourier Neural Operators (AFNOs), which replace standard attention with operations in the Fourier domain, allowing the model to efficiently capture global spatial dependencies. Pangu-Weather extended this idea with a 3D Earth-Specific Transformer that processes atmospheric data across both horizontal and vertical dimensions.
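The core idea of Fourier-domain token mixing can be illustrated schematically. This is not the AFNO layer itself (which uses learned block-diagonal complex weights, soft thresholding, and an MLP); it only shows why a multiplication in the Fourier domain gives every token a global receptive field in one operation. The random `weights` stand in for learned per-mode parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_mix(x, weights):
    """Schematic Fourier-domain token mixing in the spirit of AFNO.

    x:       real field of shape (lat, lon).
    weights: complex multipliers of shape (lat, lon // 2 + 1),
             standing in for learned per-mode parameters.
    """
    spec = np.fft.rfft2(x)                 # to the Fourier domain
    spec = spec * weights                  # mix all spatial modes at once:
                                           # every output point depends on every input point
    return np.fft.irfft2(spec, s=x.shape)  # back to physical space

x = rng.standard_normal((16, 32))
w = rng.standard_normal((16, 17)) + 1j * rng.standard_normal((16, 17))
y = fourier_mix(x, w)
print(y.shape)  # same spatial shape as the input
```

Because the mixing happens per Fourier mode, the cost scales as O(N log N) in the number of grid points rather than the O(N^2) of full self-attention, and the periodicity of the transform matches the periodic longitude dimension of global data.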
Graph neural networks (GNNs) represent the atmosphere as a graph, where nodes correspond to geographic locations and edges encode spatial relationships. GraphCast uses a multi-mesh approach based on a refined icosahedron, which provides roughly uniform coverage of the globe (unlike latitude-longitude grids, which oversample the poles). The encoder-processor-decoder architecture maps from the regular input grid to the icosahedral mesh, processes information through multiple message-passing layers, and maps back to the output grid.
Diffusion models generate probabilistic forecasts by learning to iteratively denoise random samples. GenCast adapts this approach to spherical geometry, generating ensemble members by conditioning the denoising process on the current atmospheric state. Each ensemble member represents a plausible future weather trajectory, allowing the model to capture forecast uncertainty natively.
NeuralGCM takes a different path by combining a differentiable physics-based dynamical core with learned neural network parameterizations. Rather than replacing physics entirely, it uses neural networks to handle the processes that traditional models parameterize crudely (such as convection and cloud formation), while retaining the equations of motion for large-scale atmospheric dynamics.
The following sections describe the most influential AI weather forecasting models developed between 2022 and 2025, listed in approximate chronological order of their initial publication.
FourCastNet (Fourier Forecasting Neural Network) was developed by researchers at NVIDIA, Lawrence Berkeley National Laboratory, the California Institute of Technology, the University of Michigan, and Rice University. The initial preprint appeared on arXiv in February 2022, making it one of the earliest AI models to demonstrate competitive performance with operational NWP at global scale.
The model employs a vision transformer backbone with Adaptive Fourier Neural Operators (AFNOs), which apply token mixing in the Fourier domain rather than through standard attention mechanisms. Input variables on the 720 x 1440 latitude-longitude grid are divided into patches of size 8 x 8, with each patch embedded as a high-dimensional token. The AFNO architecture was specifically chosen for its ability to handle the global, periodic nature of atmospheric data.
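The patching arithmetic follows directly from the figures above; a quick check (the per-token value count assumes all 20 variables are stacked into each patch before embedding):

```python
lat, lon = 720, 1440  # 0.25-degree global grid used by FourCastNet
patch = 8             # 8 x 8 patch size
n_vars = 20           # ERA5 variables per grid point

tokens_lat, tokens_lon = lat // patch, lon // patch
n_tokens = tokens_lat * tokens_lon
values_per_token = patch * patch * n_vars

print(n_tokens)          # 90 * 180 = 16,200 tokens per time step
print(values_per_token)  # 8 * 8 * 20 = 1,280 raw values embedded per token
```

A sequence of roughly 16,000 tokens is far beyond what dense self-attention handles comfortably at this feature width, which is part of the motivation for the Fourier-domain mixing described above.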
FourCastNet takes 20 ERA5 variables as input, defined on the surface (single levels) and on pressure levels, and predicts these same variables at a 6-hour time step with 0.25-degree resolution. The model was trained on ERA5 data from 1979 to 2015. In terms of speed, FourCastNet generates a week-long forecast in less than 2 seconds, compared to hours for traditional NWP systems. Notably, the model was also trained on 3,072 GPUs of the JUWELS Booster supercomputer in approximately 67 minutes.
While FourCastNet did not fully match ECMWF HRES on all metrics at the time of release, it demonstrated that data-driven global weather prediction was feasible at operational resolution and paved the way for subsequent models.
Pangu-Weather, developed by researchers at Huawei Cloud, was published in Nature on July 5, 2023, under the title "Accurate medium-range global weather forecasting with 3D neural networks." It was the first AI model to comprehensively outperform traditional NWP on standard evaluation metrics.
The architecture introduces a 3D Earth-Specific Transformer (3DEST) that processes atmospheric data across both horizontal and vertical dimensions simultaneously. Built on a hierarchical encoder-decoder framework derived from the Swin Transformer, Pangu-Weather includes an Earth-Specific positional Bias (ESB) term that encodes latitude and altitude-dependent patterns. This positional bias accounts for the fact that weather dynamics are strongly influenced by absolute position on Earth (for example, the Coriolis effect varies with latitude).
The system trains four separate models for different lead times: 1 hour, 6 hours, 12 hours, and 24 hours. Forecasts at intermediate or longer lead times are composed by chaining the appropriate models. Each model has approximately 64 million trainable parameters, with the total across all four models reaching about 256 million parameters. Training required approximately 73,000 GPU-hours on NVIDIA V100s per lead-time model.
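The chaining strategy can be sketched as a greedy composition: always take the largest available lead-time model that does not overshoot the target, so that a forecast is built from as few autoregressive steps as possible (fewer steps means less error accumulation). The helper below is an illustrative sketch, not Pangu-Weather's actual scheduling code:

```python
def chain(lead_time_hours, steps=(24, 12, 6, 1)):
    """Greedily compose fixed-lead-time models (24 h, 12 h, 6 h, 1 h)
    to reach an arbitrary lead time in as few model calls as possible."""
    remaining, plan = lead_time_hours, []
    while remaining > 0:
        step = next(s for s in steps if s <= remaining)  # largest step that fits
        plan.append(step)
        remaining -= step
    return plan

print(chain(31))   # a 31-hour forecast: one 24 h step, one 6 h step, one 1 h step
print(chain(168))  # a 7-day forecast: seven 24 h steps
```

A 7-day forecast thus needs only seven model calls instead of the 28 that chaining the 6-hour model alone would require.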
Pangu-Weather operates at 0.25-degree resolution (approximately 25 km at the equator) and predicts 5 upper-air variables at 13 pressure levels plus 4 surface variables. In testing, it outperformed ECMWF HRES on all variables across all lead times from 1 hour to 7 days. It can produce a 10-day forecast in approximately 10 seconds on a single GPU, compared to the hours a traditional NWP run takes on a supercomputer with thousands of processors.
Pangu-Weather was made publicly available on the ECMWF website and demonstrated practical value in tracking tropical cyclones, including successfully predicting the track of Typhoon Mawar in May 2023.
GraphCast was developed by Google DeepMind and published in Science on November 14, 2023, under the title "Learning skillful medium-range global weather forecasting." It quickly became one of the most widely cited AI weather models.
GraphCast uses an encoder-processor-decoder graph neural network architecture with 36.7 million parameters. The encoder maps input data from the 0.25-degree latitude-longitude grid onto a multi-mesh icosahedral graph, which is constructed by iteratively refining a regular icosahedron six times, dividing each triangle into four smaller ones at each refinement. This mesh provides roughly uniform spatial coverage of the globe. The processor consists of a deep GNN with 16 message-passing layers that captures long-range dependencies across the mesh. The decoder maps the processed mesh representation back to the output grid.
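The mesh sizes implied by six refinements follow from standard triangle-subdivision counting: each split adds one vertex per edge, doubles-plus-interconnects the edges, and quadruples the faces. A small sketch to verify the numbers:

```python
def refine(v, e, f, times):
    """Count vertices/edges/faces after repeatedly splitting every
    triangle of a mesh into four smaller triangles."""
    for _ in range(times):
        v, e, f = v + e, 2 * e + 3 * f, 4 * f
    return v, e, f

# Regular icosahedron: 12 vertices, 30 edges, 20 triangular faces.
v, e, f = refine(12, 30, 20, times=6)
print(v)          # 40,962 mesh nodes after six refinements
print(v - e + f)  # Euler characteristic of a sphere: 2
```

The resulting 40,962 mesh nodes are far fewer than the roughly one million points of the 0.25-degree grid, which is why message passing on the mesh is tractable, and GraphCast's multi-mesh retains the edges of all coarser refinement levels so that information can also travel long distances in few hops.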
The model takes as input the two most recent atmospheric states (current and six hours prior) and predicts the state six hours ahead. It operates on 5 surface variables and 6 atmospheric variables at 37 pressure levels, totaling 227 input and output variables per grid point. GraphCast was trained on 39 years of ERA5 data (1979 to 2017) over approximately four weeks using 32 Google Cloud TPU v4 devices.
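The 227-variable figure is straightforward to check from the counts above:

```python
atmos_vars, levels, surface_vars = 6, 37, 5
channels = atmos_vars * levels + surface_vars
print(channels)  # 6 * 37 + 5 = 227 variables per grid point
```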
In a comprehensive evaluation, GraphCast outperformed ECMWF HRES on more than 90% of 1,380 verification targets (combinations of variables, levels, and lead times). When restricted to the troposphere (the lowest 6 to 20 km of the atmosphere, where most weather occurs), this figure rose to 99.7%. GraphCast produces a 10-day global forecast in under one minute on a single Google Cloud TPU, representing a speed advantage of several orders of magnitude.
GraphCast also demonstrated skill in predicting severe weather events. In September 2023, a live deployment of GraphCast on the ECMWF website predicted approximately nine days in advance that Hurricane Lee would make landfall in Nova Scotia, while conventional forecasts only converged on Nova Scotia about six days before landfall.
NeuralGCM was developed by Google Research in partnership with ECMWF and published in Nature on July 22, 2024, under the title "Neural general circulation models for weather and climate." Unlike the purely data-driven models described above, NeuralGCM is a hybrid system that integrates machine learning with traditional atmospheric physics.
The model combines two core components: a differentiable dynamical core and a learned physics module. The dynamical core solves the hydrostatic primitive equations (the fundamental equations governing large-scale atmospheric flow) using a pseudo-spectral discretization, implemented entirely in JAX to enable automatic differentiation. The neural network component replaces traditional parameterizations of sub-grid-scale processes such as convection, radiation, and turbulent mixing.
Both components produce tendencies (rates of change) that are integrated forward in time by an implicit-explicit ordinary differential equation solver. Because the entire system is differentiable, it can be trained end-to-end through backpropagation across multiple simulation time steps, with the rollout length gradually increasing from 6 hours to 5 days during training.
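The curriculum-style rollout training can be sketched as follows. This is a schematic with toy dynamics in place of the differentiable solver, and the schedule lengths are illustrative; the point is that the loss is accumulated across an entire autoregressive rollout, whose length grows as training proceeds:

```python
import numpy as np

def rollout_loss(step_fn, initial, targets):
    """Mean squared error accumulated over an autoregressive rollout.
    In NeuralGCM this loss is differentiated end-to-end through the
    ODE solver; here step_fn is a toy stand-in."""
    state, loss = initial, 0.0
    for target in targets:
        state = step_fn(state)
        loss += float(np.mean((state - target) ** 2))
    return loss / len(targets)

toy = lambda x: 0.9 * x  # toy "dynamics" in place of the hybrid model
x0 = np.ones(8)

# Curriculum: lengthen the rollout as training progresses
# (the paper describes going from 6 hours up to 5 days).
for n in [1, 4, 10, 20]:
    # Build "ground truth" with the same toy dynamics, so the loss is zero here.
    truth, targets = x0, []
    for _ in range(n):
        truth = toy(truth)
        targets.append(truth)
    print(n, rollout_loss(toy, x0, targets))
```

Training through long rollouts penalizes the slow drift and instability that single-step training cannot see, at the price of backpropagating through many solver steps.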
NeuralGCM operates at multiple resolutions, with its deterministic model achieving competitive performance at 0.7 degrees and its stochastic (ensemble) version running at 1.4 degrees. At 0.7-degree resolution, NeuralGCM matches the forecast accuracy of state-of-the-art models for lead times up to 5 days and performs comparably to ECMWF forecasts up to 15 days. At 1.4-degree resolution (approximately 140 km), the model generates realistic tropical cyclone statistics and captures emergent large-scale phenomena.
A key advantage of NeuralGCM is its ability to produce physically consistent simulations over extended time periods, making it suitable not only for weather forecasting but also for climate projections spanning decades. The model runs entirely on GPUs and TPUs, delivering results roughly 100,000 times faster than comparable traditional climate models.
GenCast was developed by Google DeepMind and published in Nature on December 4, 2024, under the title "Probabilistic weather forecasting with machine learning." It represents a major advance in probabilistic (ensemble) AI weather prediction.
GenCast is built on a conditional diffusion model adapted to the spherical geometry of Earth. The architecture comprises an encoder that maps atmospheric data from the latitude-longitude grid onto an icosahedral mesh, a processor based on a sparse transformer with sliding-window attention on that mesh, and a decoder that maps back to grid-based variables. The processor consists of 16 transformer blocks with a feature dimension of 512 and 4-head self-attention, where each mesh node attends to all other nodes within its 16-hop neighborhood.
Starting from an initial noise sample, GenCast iteratively refines its predictions through the diffusion denoising process, conditioned on the two most recent atmospheric states. Each run of this process produces one ensemble member representing a plausible future weather trajectory. A full GenCast forecast comprises 50 or more ensemble members, each covering 15 days at 12-hour time steps.
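The ensemble-generation loop can be sketched at a high level. This is only a caricature of conditional sampling: the toy "denoiser" simply contracts toward the conditioning state, whereas a real diffusion sampler follows a noise schedule that keeps members genuinely distinct. What it does show is the structure: every member starts from its own noise draw and is refined under the same conditioning:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_member(denoise_fn, condition, shape, n_iters=20):
    """Draw one ensemble member by iteratively refining a noise sample,
    conditioned on the current atmospheric state (schematic only)."""
    x = rng.standard_normal(shape)  # each member gets an independent noise start
    for _ in range(n_iters):
        x = denoise_fn(x, condition)
    return x

# Toy "denoiser": pull the sample toward the conditioning state.
toy_denoise = lambda x, c: x + 0.3 * (c - x)

current_state = np.full((4, 8), 5.0)  # toy stand-in for the latest analysis
ensemble = [sample_member(toy_denoise, current_state, (4, 8))
            for _ in range(8)]
print(len(ensemble))
```

Because members share no state with one another, all 50+ trajectories of a real GenCast forecast can be sampled in parallel on separate accelerators.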
GenCast was trained on 40 years of ERA5 data (1979 to 2018) at 0.25-degree resolution with 13 pressure levels using a 6-times-refined icosahedral mesh. Training took five days on 32 TPUs. At inference time, a single Google Cloud TPU v5 produces one 15-day forecast trajectory in approximately 8 minutes, and ensemble members can be generated in parallel.
In evaluation, GenCast outperformed ECMWF's ENS (the gold-standard operational ensemble system) on 97.4% of 1,320 verification targets. At lead times beyond 36 hours, this figure rose to 99.8%. GenCast also showed superior performance in predicting extreme weather events, including tropical cyclone tracks and extreme temperature and wind events, up to 15 days ahead.
A known limitation is that GenCast underpredicts the intensity of tropical cyclones, likely because the 0.25-degree training resolution does not fully resolve the inner core structure of these storms.
Aurora was developed by Microsoft Research and published in Nature in 2025 under the title "A foundation model for the Earth system." It distinguishes itself as the first large-scale foundation model for atmospheric prediction, applying the pre-train-then-fine-tune paradigm common in large language models to weather and climate.
The architecture is a flexible 3D Swin Transformer with Perceiver-based encoders and decoders. The encoder converts heterogeneous inputs (which may come from different data sources, resolutions, and variable sets) into a standard 3D representation of the atmosphere. The processor, the 3D Swin Transformer, evolves this representation forward in time. The decoder translates the processed representation back into specific predictions. The model has 1.3 billion parameters, making it by far the largest AI weather model to date.
Aurora's foundation model approach involves two phases. In the pre-training phase, the model trains on more than one million hours of diverse geophysical data, including reanalysis products, forecast simulations, and climate model outputs. This gives Aurora a general understanding of atmospheric dynamics. In the fine-tuning phase, the model is adapted to specific tasks such as 10-day global weather forecasting at 0.25-degree resolution, 5-day air pollution prediction, or ocean wave forecasting.
When fine-tuned for medium-range weather forecasting, Aurora outperforms existing numerical and AI models across 91% of forecasting targets at 0.25-degree resolution. The model runs approximately 5,000 times faster than the ECMWF IFS, producing 10-day high-resolution weather forecasts in under a minute.
Microsoft has made Aurora's source code and model weights publicly available to the research community.
The following table summarizes the key characteristics of the major AI weather forecasting models.
| Model | Developer | Year | Publication | Architecture | Parameters | Resolution | Lead Time | Training Data | Inference Speed |
|---|---|---|---|---|---|---|---|---|---|
| FourCastNet | NVIDIA | 2022 | arXiv preprint | Vision Transformer with AFNO | Not disclosed | 0.25° | 7 days (6h steps) | ERA5 (1979-2015) | < 2 seconds for 7-day forecast (single GPU) |
| Pangu-Weather | Huawei | 2023 | Nature | 3D Earth-Specific Transformer | ~256M total (4 models) | 0.25° | 7 days | ERA5 (1979-2021) | ~10 seconds for 10-day forecast (single GPU) |
| GraphCast | DeepMind | 2023 | Science | Graph Neural Network (encoder-processor-decoder) | 36.7M | 0.25° | 10 days (6h steps) | ERA5 (1979-2017) | < 1 minute for 10-day forecast (single TPU) |
| NeuralGCM | Google Research | 2024 | Nature | Hybrid differentiable physics + neural network | Not disclosed | 0.7° / 1.4° | 10-15 days | ERA5 | ~100,000x faster than traditional climate models |
| GenCast | DeepMind | 2024 | Nature | Conditional diffusion model with sparse transformer | Not disclosed | 0.25° | 15 days (12h steps) | ERA5 (1979-2018) | ~8 minutes per ensemble member (single TPU v5) |
| Aurora | Microsoft | 2025 | Nature | 3D Swin Transformer with Perceiver encoders/decoders | 1.3B | 0.25° | 10 days | ERA5 + diverse geophysical data (1M+ hours) | ~5,000x faster than IFS |
The performance gains of AI weather models over traditional NWP have been documented across multiple independent evaluations. Several patterns emerge from these comparisons.
For standard verification metrics such as root mean square error (RMSE) and anomaly correlation coefficient (ACC), the leading AI models consistently outperform ECMWF HRES on the majority of variables and lead times. GraphCast exceeded HRES on over 90% of targets, while GenCast surpassed the full ECMWF ensemble (ENS) on 97.4% of targets. These results have been validated by ECMWF itself, which began hosting AI model forecasts on its website for real-time comparison.
Tropical cyclone track prediction has been a particular strength of AI models. GraphCast demonstrated earlier and more accurate landfall predictions for Hurricane Lee in 2023. ECMWF's own AIFS system improved tropical cyclone track forecasts by up to 20% compared to the physics-based IFS. Several AI models have shown skill in predicting other severe weather phenomena, including atmospheric rivers, extreme temperature events, and high wind episodes.
The speed advantage is perhaps the most striking difference. While a traditional NWP system requires hours of computation on a supercomputer with thousands of processors to produce a 10-day forecast, AI models can generate equivalent forecasts in seconds to minutes on a single GPU or TPU. This enables applications that were previously impractical, such as generating very large ensembles (hundreds or thousands of members) to better characterize forecast uncertainty, or running rapid forecast updates as new observations become available.
The most significant institutional validation of AI weather forecasting came from ECMWF itself. Recognizing the capabilities of data-driven models, ECMWF developed the Artificial Intelligence Forecasting System (AIFS), whose name deliberately echoes its traditional IFS.
AIFS uses a graph neural network encoder and decoder combined with a sliding-window transformer processor. It is trained on a combination of ERA5 reanalysis and ECMWF's operational NWP analyses. The deterministic AIFS became operational on February 25, 2025, running alongside the physics-based IFS. The ensemble version of AIFS followed, becoming operational on July 1, 2025.
AIFS outperforms the physics-based IFS on many verification metrics, with notable improvements in tropical cyclone track forecasting (gains of up to 20%). It generates forecasts over 10 times faster than the physics-based system while consuming approximately 1,000 times less energy. An updated version, AIFS 1.1.0, was released on August 27, 2025, to address a precipitation forecast issue identified in the initial operational version.
The decision by ECMWF to make AI forecasts operational represents a watershed moment. For the first time, the world's leading weather prediction center is producing official forecasts using AI alongside its traditional physics-based system. This signals a fundamental shift in how the meteorological community views data-driven approaches.
Despite their impressive performance on standard benchmarks, AI weather models face several important limitations that the research community continues to address.
Multiple studies have found that AI models struggle with extreme weather events, particularly record-breaking conditions that are absent or underrepresented in the training data. A 2025 study demonstrated that numerical models consistently outperform AI models like GraphCast and Pangu-Weather for record-breaking heat, cold, and wind events. AI models tend to underestimate both the frequency and intensity of record-breaking events, with larger forecast errors for greater record exceedances. This is a fundamental challenge for data-driven approaches: by training on historical data, the models learn the statistical distribution of past weather, but they may not extrapolate well to unprecedented conditions.
Purely data-driven models do not explicitly enforce the physical laws that govern the atmosphere, such as conservation of mass, energy, and momentum. This can lead to forecasts that are statistically accurate on average but physically inconsistent in individual cases. For tropical cyclones, for example, the wind fields produced by AI models may not have the organized structure needed for accurate downstream predictions of storm surge, rainfall distribution, and wind damage. Hybrid approaches like NeuralGCM partially address this by retaining physics-based dynamics, but fully data-driven models remain vulnerable to this limitation.
Most AI weather models operate at 0.25-degree resolution (approximately 25 km), which is coarser than the 9 km resolution of ECMWF HRES. This limits their ability to resolve small-scale phenomena such as thunderstorms, sea breezes, and mountain-valley circulations. While the models may capture the large-scale environment conducive to these events, they cannot predict their precise location and timing in the way higher-resolution NWP models can.
AI weather models are trained on historical data from a climate that is continuously changing. As global temperatures rise and weather patterns shift, the statistical relationships learned from the past may become less reliable. This concern is particularly acute for "gray swan" events: weather extremes that are rare enough to be absent from the training dataset but may become more common under future climate conditions. Traditional NWP models, because they are grounded in physical principles, are in theory better equipped to handle novel atmospheric states.
Forecasting precipitation remains challenging for AI models. Rainfall is inherently more chaotic and localized than temperature or pressure fields, and its distribution depends on complex interactions between dynamics, thermodynamics, and microphysics. Several AI models show a tendency to produce overly smooth precipitation fields, underpredicting extreme rainfall while overestimating light precipitation. ECMWF's AIFS 1.1.0 update was specifically released to address a precipitation forecast issue in the initial version.
Operational meteorologists rely on their understanding of atmospheric physics to interpret and communicate forecasts. AI models operate as black boxes, making it difficult for forecasters to understand why a particular prediction was made or to identify when the model might be unreliable. Building trust in AI forecasts within the operational weather community requires not just demonstrated skill but also tools for interpretability, uncertainty communication, and failure mode analysis.
The development of AI weather forecasting has progressed at a remarkable pace:
| Year | Milestone |
|---|---|
| 2022 | NVIDIA publishes FourCastNet, demonstrating global AI weather prediction at 0.25° resolution using Adaptive Fourier Neural Operators |
| July 2023 | Huawei publishes Pangu-Weather in Nature, the first AI model to comprehensively outperform ECMWF HRES |
| November 2023 | DeepMind publishes GraphCast in Science, outperforming HRES on 90%+ of verification targets |
| 2023 | ECMWF begins hosting AI weather forecasts (including GraphCast and Pangu-Weather) on its website for public access |
| July 2024 | Google Research publishes NeuralGCM in Nature, demonstrating hybrid physics-AI models for both weather and climate |
| December 2024 | DeepMind publishes GenCast in Nature, the first AI model to outperform ECMWF ENS for probabilistic forecasting |
| February 2025 | ECMWF makes deterministic AIFS operational, running alongside the traditional IFS |
| 2025 | Microsoft publishes Aurora in Nature, introducing the foundation model paradigm to weather forecasting with 1.3 billion parameters |
| July 2025 | ECMWF makes ensemble AIFS operational |
Several active research directions are shaping the next generation of AI weather forecasting:
Higher resolution and regional models. Researchers are working on AI models that operate at resolutions of 5 km or finer, enabling prediction of convective storms and other mesoscale phenomena. Regional AI models tailored to specific geographic areas are also under development.
Extended-range and seasonal prediction. While current AI models focus primarily on the 1-to-15-day range, there is growing interest in applying similar techniques to sub-seasonal (2 to 6 weeks) and seasonal forecasting, where traditional models have limited skill.
Foundation models. Following Aurora's example, the trend toward large, pre-trained foundation models that can be fine-tuned for diverse tasks is expected to continue. These models could unify weather forecasting, air quality prediction, ocean modeling, and climate projection within a single framework.
Coupled Earth system modeling. Future AI models may jointly predict the atmosphere, ocean, land surface, and cryosphere, capturing the interactions between these components that are critical for extended-range prediction and climate applications.
Operational integration. As weather agencies worldwide adopt AI forecasts, significant work is needed on calibration, post-processing, uncertainty quantification, and forecaster tools to integrate AI predictions into existing operational workflows.
Physics-informed architectures. Approaches like NeuralGCM suggest that combining the strengths of physics-based modeling with data-driven learning may ultimately produce the most reliable forecasts, particularly for extreme events and climate projections.