Data Science
Operational

SatVision-TOA Geospatial Foundation Model (SatVision-TOA)

SatVision-TOA demonstrates the untapped potential of moderate- to coarse-resolution data for deep learning in Earth observation. By training a 3-billion-parameter vision transformer on a 100-million-image MODIS TOA dataset, it establishes a scalable, open-source foundation for advancing atmospheric science, cloud analysis, and Earth system modeling. Trained across diverse atmospheric and surface conditions, the model outperforms baseline methods on downstream tasks such as 3D cloud retrieval and environmental monitoring, and its released weights and workflows aim to broaden participation and foster collaboration in remote sensing applications.

Launch Date

--

Class

Data/Image

Foundation models have the potential to transform the landscape of remote sensing (RS) data analysis by enabling large computer vision models to be pre-trained on vast amounts of remote sensing data. These models can then be fine-tuned with small amounts of labeled training data and applied to a variety of downstream tasks. Most existing foundation models are designed for high-spatial-resolution, cloud-free satellite imagery or photos, limiting their applicability in scenarios that require frequent temporal monitoring or broad spectral profiles. As a result, foundation models trained solely on cloud-free images have limited utility for applications that involve atmospheric variables or require atmospheric corrections.
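
The pre-train-then-fine-tune pattern described above can be illustrated with a short, self-contained sketch. Everything here is a hypothetical stand-in rather than the SatVision-TOA code or release: TinyEncoder is a toy backbone, the checkpoint path is a placeholder, and the five-class head is invented for illustration. The point is only that a frozen, pre-trained encoder can be adapted to a new task with a small labeled set.

# Minimal sketch of the pre-train/fine-tune paradigm (PyTorch), assuming a
# generic pre-trained backbone; names, shapes, and the checkpoint path are
# hypothetical and not part of the SatVision-TOA release.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Toy stand-in for a large pre-trained vision backbone."""
    def __init__(self, in_bands: int = 14, dim: int = 256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, dim, kernel_size=8, stride=8),  # patchify the tile
            nn.GELU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )

    def forward(self, x):
        return self.features(x)

encoder = TinyEncoder()
# encoder.load_state_dict(torch.load("pretrained_weights.pt"))  # hypothetical path
for p in encoder.parameters():           # freeze the pre-trained backbone
    p.requires_grad = False

head = nn.Linear(256, 5)                 # small task head, e.g., five cloud classes
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# A "small labeled set": 32 fake 14-band tiles with integer class labels.
x = torch.randn(32, 14, 128, 128)
y = torch.randint(0, 5, (32,))

for _ in range(5):                       # a few fine-tuning steps
    optimizer.zero_grad()
    logits = head(encoder(x))
    loss = loss_fn(logits, y)
    loss.backward()
    optimizer.step()

In practice the backbone would usually be partially or fully unfrozen after a warm-up period, but even this frozen-feature setup captures why a pre-trained model needs far less labeled data than training from scratch.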

We introduce SatVision-TOA, a novel foundation model pre-trained on 14-band MODIS L1B Top-Of-Atmosphere (TOA) radiance imagery, addressing the need for models that can handle moderate- and coarse-resolution all-sky remote sensing data. SatVision-TOA is pre-trained with a Masked-Image-Modeling (MIM) framework and the SwinV2 architecture, learning detailed contextual representations through self-supervised learning without the need for labels. It is a 3-billion-parameter model trained on 100 million images; to our knowledge, this is the largest foundation model trained solely on satellite RS imagery. Results show that SatVision-TOA achieves superior performance over baseline methods on downstream tasks such as 3D cloud retrieval. Notably, the model achieves a mean intersection over union (mIOU) of 0.46, a substantial improvement over the baseline mIOU of 0.22. Additionally, the rate of false-negative results in the fine-tuning task was reduced by more than 50% compared to the baseline. Our work advances pre-trained vision modeling for multispectral RS by learning from a variety of atmospheric and aerosol conditions to improve cloud and land surface monitoring.
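
To make the pre-training objective concrete, the sketch below shows the core of Masked-Image-Modeling under simplified assumptions: a plain transformer encoder stands in for the SwinV2 backbone, and the patch size, masking ratio, and dimensions are illustrative rather than the values used for SatVision-TOA. Random patches of a 14-band tile are replaced with a learned mask token, and the network is trained to reconstruct the missing pixels, so no labels are required.

# Minimal Masked-Image-Modeling sketch (PyTorch). A generic TransformerEncoder
# is used as a stand-in for SwinV2; all hyperparameters below are illustrative.
import torch
import torch.nn as nn

BANDS, PATCH, DIM, MASK_RATIO = 14, 16, 256, 0.6

patch_embed = nn.Conv2d(BANDS, DIM, kernel_size=PATCH, stride=PATCH)
mask_token = nn.Parameter(torch.zeros(1, 1, DIM))
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=DIM, nhead=8, batch_first=True),
    num_layers=4,
)
decoder = nn.Linear(DIM, BANDS * PATCH * PATCH)  # predict raw pixels per patch

def mim_step(images):
    """One self-supervised step: mask random patches, reconstruct their pixels."""
    B = images.size(0)
    tokens = patch_embed(images).flatten(2).transpose(1, 2)       # (B, N, DIM)
    N = tokens.shape[1]

    mask = torch.rand(B, N) < MASK_RATIO                          # True = masked patch
    tokens = torch.where(mask.unsqueeze(-1), mask_token.expand(B, N, DIM), tokens)

    pred = decoder(encoder(tokens))                               # (B, N, C*P*P)

    # Targets are the original pixel values of each patch.
    target = (
        images.unfold(2, PATCH, PATCH).unfold(3, PATCH, PATCH)    # (B, C, h, w, P, P)
        .permute(0, 2, 3, 1, 4, 5).reshape(B, N, -1)
    )
    # The reconstruction loss is computed only on the masked patches.
    return (pred - target).abs()[mask].mean()

images = torch.randn(2, BANDS, 128, 128)   # fake all-sky 14-band TOA tiles
loss = mim_step(images)
loss.backward()

Scaling this recipe to a 3-billion-parameter SwinV2 backbone and roughly 100 million MODIS TOA tiles is what separates SatVision-TOA from this toy example; the underlying idea of reconstructing masked patches without labels is the same.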