trying to make Geospatial Foundation Models work
Research Fellow at @ESA PhiLab
Previously at @KULeuven, @Cnam
PhD in Data Science at @Sapienza
website: https://sites.google.com/uniroma1.it/valeriomarsocci
#AI4EO #GeoAI #SSL4EO
This paper introduces: a) a new pre-training dataset; b) a new benchmark dataset; c) a GFM, all based on a diverse set of Copernicus data.
⬆️: really appreciate the grid embeddings part
⬇️: some doubts about claims about generalizability
arxiv.org/pdf/2503.11849
New preprint out :)
Incorporating inductive biases specific to MSI can enhance the fine-tuning of large Earth observation models, pre-trained on RGB
arxiv.org/pdf/2503.09493
The authors introduce NC and discuss the characteristics of EO and climate data w.r.t. natural images
⬆️: great entry point
⬇️: no baseline exps
arxiv.org/pdf/2503.01505
This study pretrains two SSL methods on ImageNet and GeoNet. The improvement with GeoNet is minimal.
⬆️ useful to reduce computation?
⬇️ more considerations about the resolutions?
arxiv.org/pdf/2502.10669
Galileo is a family of pretrained RS models designed to flexibly process multimodal RS data. It has two losses: one in the pixel space, one in the latent space.
⬆️: multi-modal/temporal/sensor
⬇️: why just using Sentinel data?
arxiv.org/pdf/2502.09356
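To make the "two losses" idea concrete, here is a minimal generic sketch of combining a pixel-space term with a latent-space term. This is an assumption for illustration only, not Galileo's actual objective: the function names, the use of plain MSE for both terms, and the `latent_weight` parameter are all hypothetical.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally sized, non-empty vectors."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(
    pixel_pred: Sequence[float],
    pixel_target: Sequence[float],
    latent_pred: Sequence[float],
    latent_target: Sequence[float],
    latent_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Sum a pixel-space term and a weighted latent-space term.

    Both terms are plain MSE here (an illustrative assumption).
    """
    pixel_term = mse(pixel_pred, pixel_target)      # reconstruction in input space
    latent_term = mse(latent_pred, latent_target)   # agreement in embedding space
    return pixel_term + latent_weight * latent_term


if __name__ == "__main__":
    loss = combined_loss([1.0, 2.0], [1.0, 4.0], [0.0], [1.0], latent_weight=0.5)
    print(loss)
```

The pixel term pushes the model to reconstruct fine detail, while the latent term targets more abstract features; the weight only balances the two.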
It looks like they can :)
⬆️: validating it on a real-world task
⬇️: is it super-resolution or mapping S2 to NAIP?
arxiv.org/pdf/2501.15847
This paper provides a comprehensive review of the applications of diffusion models in remote sensing
⬆️ excellent entry point
⬇️ not sure about the statement about the "inherent denoising ability" of diffusion models
arxiv.org/abs/2404.08926
Last year I decided to do a #50paperschallenge
I ended up with 43. Still:
🥵 I read more than 50 papers. I just didn't post them all
😇 the strategy worked independently of the posted ones
For this reason, this year I will do a #40paperschallenge!
GNNs open new possibilities for EO, handling irregular, multi-source datasets (e.g. point clouds) for smarter weather forecasts, disaster relief, etc.
⬆️: excels at non-Euclidean spatial data
⬇️: limited scalability across diverse data (?)
arxiv.org/abs/2411.03223
1. generally speaking GFMs don't really excel when compared to supervised baselines
2. for some specific scenarios (e.g. HR data), it makes sense to use them
3. multi-temporal data are still under-estimated
other insights in the paper!
🧵
* provide a robust evaluation protocol to benchmark GFMs
* investigate GFM capabilities, with a focus on a) domain generalization, b) comparison to supervised baselines, c) performance with limited labels
🧵
* application domain
* geographical distribution
* type of task
* modality
* temporality
Spoiler: no patch-level classification tasks are included!
🧵
Are geospatial foundation models really impactful?
Check it out in our new pre-print!
Welcome to **PANGAEA: a global and inclusive benchmark for GFMs**
arxiv.org/abs/2412.04204
Also check out the public GitHub repo (more news/updates soon):
github.com/VMarsocci/pa...
a short thread 🧵
Can global SatML models solve local challenges?
This study finds local models outperform global & fine-tuned models for TCH mapping in Africa
⬆️: interesting set of research questions
⬇️: what about "generalist" geospatial foundation models?
arxiv.org/pdf/2411.14354
#32 GeoFMs for crop type mapping
it investigates the ability of geoFMs to transfer to new geographic regions in agriculture
⬆️the pivotal topic for real-world applications
⬇️the limited number of geoFMs
arxiv.org/pdf/2409.09451
TO BEAT SUPERVISED BASELINES
Specialized FMs in genomics, satellite imaging, and time series, struggle w.r.t. supervised learning pipelines
⬆️: very relevant work
⬇️: just classification, limiting the real-world capabilities*
arxiv.org/abs/2411.02796
Satlas (#ICCV '23) proposes both a dataset (SatlasPretrain) and a model (SatlasNet).
SatlasNet is a supervised Swin-based model with multiple heads for different tasks
⬆️multi-task model
⬇️supervised setting
arxiv.org/pdf/2211.15660
SpectralGPT (#TPAMI) leverages 3D token generation for spatial-spectral coupling to process images of different sizes, resolutions, etc.
⬆️great flexibility in the input
⬇️missing some modalities (e.g. SAR)
ieeexplore.ieee.org/document/104...
Prithvi is the geospatial foundation model developed by NASA and IBM. Trained on HLS, it employs an MAE with 3D positional encoding to consider multi-temporality
⬆️multi-temporality and the tasks
⬇️limited in the geographical extent
arxiv.org/pdf/2310.18660
ALISE (ALigned SITS Encoder) is a model for processing irregular and unaligned Satellite Image Time Series (SITS)
⬆️: great sparse data and label handling
⬇️: would be great to extend the domains (geographical and sensor-related)
arxiv.org/abs/2407.08448
TaxaBind introduces a unified embedding space for 6 data types, solving ecological tasks like species mapping & zero-shot classification
⬆️: robust zero-shot classification
⬇️: what if we want to add/test new tasks?
arxiv.org/abs/2411.00683
#SatCLIP learns the implicit representations of the image features that characterize a specific location
what I liked most: geo adaptation and per-continent performance
paper: arxiv.org/pdf/2311.171...