Zeus AI awarded DOE Phase II grant to build a kilometer-scale severe weather model
February 10, 2026

We’re excited to share that Zeus AI has been awarded a Department of Energy Small Business Innovation Research (SBIR) Phase II grant to develop a high-resolution AI foundation model for severe weather forecasting over the Continental United States.

The project, A Meteorological Foundation Model for Gap-filled High-Resolution Data in Urban Environments, extends our global foundation model, EarthNet, into a multi-resolution system that combines kilometer-scale regional observations, including convection-resolving MRMS radar composites, with satellite observations (Figure 1) and point measurements from weather stations and radiosondes.

The grant supports development across the full pipeline, from data infrastructure to validation:

  • ML-ready data pipelines for historical and real-time severe weather observations, including MRMS radar, microwave sounders, and surface stations at 1-2 km resolutions
  • Kilometer-scale data assimilation and nowcasting over CONUS at 15-minute temporal resolution, with no reliance on numerical weather prediction systems
  • Continuous forecast validation through an extended WeatherBench framework adapted for high-temporal-resolution regional evaluation
  • Independent atmospheric profile verification in collaboration with MIT Lincoln Laboratory
Figure 1: TROPICS (Time-Resolved Observations of Precipitation structure and storm Intensity with a Constellation of Smallsats) provides 90-minute revisit microwave measurements that can be used to observe precipitation structure throughout a storm’s life cycle. (Credit: MIT Lincoln Laboratory)

Early Results

We’ve already begun producing ML-ready datasets at a resolution matched to severe weather. Our MRMS radar and GOES-R satellite pipelines now cover the full CONUS domain at 1.6 km and 15-minute resolution, representing a 10-20x spatial resolution increase over our previous global datasets.

Before multi-modal training, most data sources are compressed into a compact latent representation, a process we call tokenization, that reduces GPU memory requirements and allows the foundation model to ingest many sensors simultaneously. Tokenizing radar-derived variables such as precipitation and hail fields, which are zero-inflated and heavily right-skewed, is difficult for plain autoencoders. Our solution is a discrete-continuous tokenizer that jointly models occurrence probability and conditional intensity, achieving R = 0.990 against observed MRMS precipitation rates.
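The core idea of the discrete-continuous decomposition can be sketched in a few lines. This is an illustrative toy, not Zeus AI's implementation: the function names, the wet/dry threshold, and the log1p transform are all assumptions chosen to show how a zero-inflated field splits into a binary occurrence target plus a conditional intensity target, and how the two decoder heads recombine.

```python
import numpy as np

def split_targets(precip, wet_threshold=0.1):
    """Split a zero-inflated precipitation field (mm/h) into two targets:
    a binary occurrence mask and a log intensity defined only where wet."""
    occurrence = (precip >= wet_threshold).astype(np.float32)
    # log1p tames the heavy right skew; dry pixels are filled with 0 and
    # would be masked out of the intensity loss during training.
    log_intensity = np.where(occurrence > 0, np.log1p(precip), 0.0)
    return occurrence, log_intensity

def recombine(p_occurrence, log_intensity, p_threshold=0.5):
    """Reconstruct a field from the two heads: predicted occurrence
    probability and conditional log intensity."""
    wet = p_occurrence >= p_threshold
    return np.where(wet, np.expm1(log_intensity), 0.0)

# Toy field: mostly dry, a few heavy cells — typical zero-inflated radar data.
field = np.array([0.0, 0.0, 0.3, 12.5, 0.0, 48.0])
occ, logi = split_targets(field)
recon = recombine(occ, logi)  # with perfect heads, the round trip is exact
```

Separating "is it raining?" from "how hard?" lets each head see a well-behaved target, rather than forcing one regression head to fit a distribution with a large point mass at zero and a long right tail.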

We’re also exploring hyperspherical VAE representations, where latent vectors are constrained to a manifold rather than unconstrained Euclidean space (Figure 2). These produce latents with structured separation between precipitation regimes, a property we expect to be useful within the multi-modal foundation model.

Figure 2: Hyperspherical latent space representation of MRMS precipitation data. Each point represents a 4×4 pixel patch encoded by the tokenization model, with positions constrained to a 3D sphere. Colors indicate log-scaled precipitation rates, revealing structured organization of texture and intensity: precipitation intensity varies smoothly across the manifold, with distinct clustering of zero-precipitation (red) and high-intensity (blue) cases.
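The spherical constraint itself is simple to sketch. The minimal version below just L2-normalizes latent vectors onto the unit sphere; a full hyperspherical VAE would additionally use a spherical posterior (e.g. von Mises-Fisher) rather than plain normalization, so treat this as an assumed simplification rather than the actual tokenizer.

```python
import numpy as np

def to_sphere(z, eps=1e-8):
    """Project Euclidean latent vectors onto the unit hypersphere."""
    norm = np.linalg.norm(z, axis=-1, keepdims=True)
    return z / np.maximum(norm, eps)

def cosine_similarity(a, b):
    """On the sphere, comparing two latents reduces to a dot product."""
    return np.sum(to_sphere(a) * to_sphere(b), axis=-1)

rng = np.random.default_rng(0)
z = rng.normal(size=(5, 3))   # e.g. 3-D latents for 5 encoded patches
z_sph = to_sphere(z)
# Every projected latent has unit norm, so all points live on the sphere
# and distances between regimes become angles rather than magnitudes.
```

One appeal of the spherical geometry is that similarity is bounded and scale-free: two patches compare by angle alone, which can encourage the regime clustering visible in Figure 2.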

What’s Next?

With our data pipelines and tokenization models in place, we’re moving into the core modeling phase: training the multi-modal foundation model that brings together radar, satellite, and surface observations into a unified forecasting system.

We’re grateful for DOE’s continued support, and we look forward to sharing results as the work progresses.