Guillaume Astruc
@gastruc.bsky.social
2nd-year PhD student at Imagine-ENPC/IGN/CNES
Working on Self-supervised Cross-modal Geospatial Learning.
Personal WebPage: https://gastruc.github.io/
Pinned
Guillaume Astruc
@gastruc.bsky.social
· Dec 19
🤔 What if embedding multimodal EO data was as easy as using a ResNet on images?
Introducing AnySat: one model for any resolution (0.2m–250m), scale (0.3–2600 hectares), and modalities (choose from 11 sensors & time series)!
Try it with just a few lines of code:
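Something along these lines should be close to that promise; the torch.hub entry point, argument names, and tensor shapes below are assumptions based on the project page (gastruc.github.io/anysat), so check the README for the exact interface.
```python
import torch

# Assumed hub entry point and options; verify against the AnySat README.
anysat = torch.hub.load("gastruc/anysat", "anysat", pretrained=True, flash_attn=False)

# Illustrative multimodal sample: a Sentinel-2 time series tile
# (batch, dates, channels, height, width) plus the day-of-year of each acquisition.
data = {
    "s2": torch.randn(1, 12, 10, 60, 60),
    "s2_dates": torch.randint(0, 365, (1, 12)),
}

with torch.no_grad():
    # 'tile' requests one embedding for the whole tile; patch_size assumed in meters.
    features = anysat(data, patch_size=20, output="tile")

print(features.shape)
```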
Reposted by Guillaume Astruc
We introduce MIRO: a new paradigm for T2I model alignment that integrates reward conditioning into pretraining, eliminating the need for separate fine-tuning/RL stages. This single-stage approach offers unprecedented efficiency and control.
- 19x faster convergence ⚡
- 370x fewer FLOPs than FLUX-dev 📉
October 31, 2025 at 11:24 AM
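For intuition only, here is a toy sketch of what "reward conditioning" can mean in a generative backbone: the reward value is embedded like a timestep and folded into the conditioning signal, so the model can be steered at sampling time by asking for a high reward. This is my own illustration, not MIRO's architecture.
```python
import torch
import torch.nn as nn

class RewardConditioning(nn.Module):
    """Toy illustration: embed a scalar reward and add it to an existing
    conditioning vector. Not MIRO's actual implementation."""

    def __init__(self, cond_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, cond_dim),
            nn.SiLU(),
            nn.Linear(cond_dim, cond_dim),
        )

    def forward(self, cond: torch.Tensor, reward: torch.Tensor) -> torch.Tensor:
        # cond: (batch, cond_dim) text/timestep conditioning
        # reward: (batch,) scalar reward labels seen during pretraining
        return cond + self.mlp(reward.unsqueeze(-1))

cond = torch.randn(4, 256)
reward = torch.tensor([0.1, 0.5, 0.9, 1.0])      # e.g. normalized aesthetic scores
print(RewardConditioning()(cond, reward).shape)  # torch.Size([4, 256])
```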
Super interesting to see pure SSL outperform text alignment on a highly competitive task that seems tailor-made for text-aligned models 🤯
🚀 DinoV3 just became the new go-to backbone for geoloc!
It outperforms CLIP-like models (SigLip2, finetuned StreetCLIP)… and that’s shocking 🤯
Why? CLIP models have an innate advantage — they literally learn place names + images. DinoV3 doesn’t.
August 18, 2025 at 3:44 PM
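One simple way to compare backbones for geolocalization is frozen-feature retrieval against a geotagged reference set; the sketch below uses DINOv2 from torch.hub as a stand-in since I'm not certain of the public DINOv3 entry point, and it is not necessarily the setup behind the numbers discussed above.
```python
import torch
import torch.nn.functional as F

# Stand-in backbone (DINOv2 ViT-S/14 via torch.hub); swap in DINOv3, SigLIP2,
# StreetCLIP, etc. to compare backbones on the same retrieval setup.
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
backbone.eval()

@torch.no_grad()
def embed(images: torch.Tensor) -> torch.Tensor:
    # images: (N, 3, 224, 224), ImageNet-normalized
    return F.normalize(backbone(images), dim=-1)

# Dummy geotagged reference set and one query (random tensors stand in for photos).
ref_images = torch.randn(8, 3, 224, 224)
ref_coords = torch.stack([torch.rand(8) * 180 - 90, torch.rand(8) * 360 - 180], dim=1)
query = torch.randn(1, 3, 224, 224)

sims = embed(query) @ embed(ref_images).T            # cosine similarities
predicted_latlon = ref_coords[sims.argmax(dim=-1)]   # location of the closest match
print(predicted_latlon)
```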
🛰️ At #CVPR2025 presenting "AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities" - Saturday afternoon, Poster 355!
If you're here and want to discuss geolocation or geospatial foundation models, let's connect!
June 11, 2025 at 9:08 PM
Reposted by Guillaume Astruc
📢 FLAIR-HUB dataset
A new large-scale, multimodal dataset for land cover and crop type mapping
🤗 Dataset: huggingface.co/datasets/IGN...
📄 Preprint: arxiv.org/abs/2506.07080
🤗 Pretrained models: huggingface.co/collections/...
💻 Code: github.com/IGNF/FLAIR-HUB
🌐 Project: arxiv.org/abs/2506.07080
FLAIR-HUB: Large-scale Multimodal Dataset for Land Cover and Crop Mapping
The growing availability of high-quality Earth Observation (EO) data enables accurate global land cover and crop type monitoring. However, the volume and heterogeneity of these datasets pose major pro...
arxiv.org
June 11, 2025 at 2:00 PM
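To explore the release, something like the following should work with the Hugging Face Hub tooling; the repo_id is a guess since the dataset link above is truncated, so check the dataset card for the exact name and file layout.
```python
from huggingface_hub import list_repo_files, snapshot_download

# Hypothetical repo id (the link above is truncated); verify it on the Hub.
repo_id = "IGNF/FLAIR-HUB"

# List what the dataset repo contains before downloading anything large.
files = list_repo_files(repo_id, repo_type="dataset")
print(files[:10])

# Download only a small subset (patterns are illustrative) rather than the full archive.
local_dir = snapshot_download(repo_id, repo_type="dataset",
                              allow_patterns=["*.md", "*.csv"])
print(local_dir)
```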
Reposted by Guillaume Astruc
I will be presenting our work on the detection of archaeological looting with satellite image time series at CVPR 2025 EarthVision workshop tomorrow!
Honored and grateful that this paper received the best student paper award!
June 11, 2025 at 4:04 AM
Reposted by Guillaume Astruc
📢 New preprint!
“When majority rules, minority loses: bias amplification of gradient descent”
We often blame biased data, but training also amplifies biases. Our paper explores how ML algorithms favor stereotypes at the expense of minority groups.
➡️ arxiv.org/abs/2505.13122
(1/3)
When majority rules, minority loses: bias amplification of gradient descent
Despite growing empirical evidence of bias amplification in machine learning, its theoretical foundations remain poorly understood. We develop a formal framework for majority-minority learning tasks, ...
arxiv.org
May 23, 2025 at 4:48 PM
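As a quick numeric illustration of the effect (a toy example of mine, not the paper's formal framework): train a linear classifier by gradient descent on data where a small minority group follows a conflicting rule, and per-group accuracy diverges sharply.
```python
import numpy as np

rng = np.random.default_rng(0)

# 95% majority / 5% minority. The informative feature points in *opposite*
# directions for the two groups, so one linear rule cannot serve both.
n_maj, n_min = 950, 50
X_maj = rng.normal(size=(n_maj, 2)); y_maj = (X_maj[:, 0] > 0).astype(float)
X_min = rng.normal(size=(n_min, 2)); y_min = (X_min[:, 0] < 0).astype(float)
X, y = np.vstack([X_maj, X_min]), np.concatenate([y_maj, y_min])

# Plain logistic regression trained by full-batch gradient descent on pooled data.
w, b = np.zeros(2), 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.1 * (X.T @ (p - y)) / len(y)
    b -= 0.1 * np.mean(p - y)

def acc(Xg, yg):
    return np.mean(((Xg @ w + b) > 0) == yg)

print(f"majority accuracy: {acc(X_maj, y_maj):.2f}")  # close to 1.0
print(f"minority accuracy: {acc(X_min, y_min):.2f}")  # close to 0.0
```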
We've added new experiments demonstrating robust generalization capabilities! Notably, AnySat shows strong performance on HLS Burn Scars, a dataset from a sensor never seen during pretraining! 🔥🛰️
Check it out:
📄 Paper: arxiv.org/abs/2412.14123
🌐 Project: gastruc.github.io/anysat
April 30, 2025 at 2:00 PM
Reposted by Guillaume Astruc
Introducing HySCDG #CVPR2025, a generative pipeline for creating a large hybrid semantic change detection dataset for Earth Observation using Stable Diffusion and ControlNet! 🗺️🛩️
📄 Paper: arxiv.org/abs/2503.15683
The Change You Want To Detect: Semantic Change Detection In Earth Observation With Hybrid Data Generation
Bi-temporal change detection at scale based on Very High Resolution (VHR) images is crucial for Earth monitoring. This remains poorly addressed so far: methods either require large volumes of annotate...
arxiv.org
April 28, 2025 at 4:48 PM
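For context, the two building blocks named above (Stable Diffusion + ControlNet) combine in diffusers roughly as follows; this is the generic semantic-map-conditioned generation recipe, not the HySCDG pipeline, and the checkpoint names are illustrative (their exact location on the Hub may change).
```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

# Generic SD-1.5 + segmentation ControlNet; HySCDG's own conditioning,
# prompts, and checkpoints will differ.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-seg", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

# A land-cover style semantic map (placeholder file name) drives the layout,
# while the prompt describes the appearance of the generated aerial scene.
semantic_map = Image.open("semantic_map.png").convert("RGB")
image = pipe("aerial orthophoto, buildings and crops, high resolution",
             image=semantic_map, num_inference_steps=30).images[0]
image.save("generated_tile.png")
```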
Reposted by Guillaume Astruc
💻We've released the code for our #CVPR2025 paper MAtCha!
🍵MAtCha reconstructs sharp, accurate and scalable meshes of both foreground AND background from just a few unposed images (e.g. 3 to 10 images)...
...While also working with dense-view datasets (hundreds of images)!
April 3, 2025 at 10:33 AM
Reposted by Guillaume Astruc
🔥🔥🔥 CV Folks, I have some news! We're organizing a 1-day meeting in central Paris on June 6th, before CVPR, called CVPR@Paris (similar to NeurIPS@Paris) 🥐🍾🥖🍷
Registration is open (it's free) with priority given to authors of accepted papers: cvprinparis.github.io/CVPR2025InPa...
Big 🧵👇 with details!
March 21, 2025 at 6:43 AM
Reposted by Guillaume Astruc
Starter pack including some of the lab members: go.bsky.app/QK8j87w
March 14, 2025 at 10:34 AM
Reposted by Guillaume Astruc
🧩 Excited to share our paper "RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges" (arxiv.org/abs/2502.19955) accepted to #CVPR2025! We created a benchmark that systematically evaluates image matching methods across well-defined geometric difficulty levels. 🔍
February 28, 2025 at 3:23 PM
Reposted by Guillaume Astruc
Weights for CAD are finally available. It's one of the smallest diffusion models on the market, achieving performance close to SD and PixArt, and featuring a Perceiver-like architecture.
We leverage our coherence-aware training to improve textual understanding.
🚨 Just a quick note that following requests, we trained a 512px version of our Coherence-Aware Diffusion model (CVPR'24) and updated the paper on arxiv: arxiv.org/abs/2405.20324
It has a package and pretrained models!
🖥️ nicolas-dufour.github.io/cad.html
🤖 github.com/nicolas-dufo...
February 20, 2025 at 12:14 PM
🤔 What if embedding multimodal EO data was as easy as using a ResNet on images?
Introducing AnySat: one model for any resolution (0.2m–250m), scale (0.3–2600 hectares), and modalities (choose from 11 sensors & time series)!
Try it with just a few lines of code:
December 19, 2024 at 10:46 AM
Reposted by Guillaume Astruc
Guillaume Astruc, Nicolas Gonthier, Clement Mallet, Loic Landrieu
AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities
https://arxiv.org/abs/2412.14123
December 19, 2024 at 6:45 AM
Reposted by Guillaume Astruc
⚠️Reconstructing sharp 3D meshes from a few unposed images is a hard and ambiguous problem.
☑️With MAtCha, we leverage a pretrained depth model to recover sharp meshes from sparse views including both foreground and background, within mins!🧵
🌐Webpage: anttwo.github.io/matcha/
December 11, 2024 at 2:59 PM
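To give a feel for the first ingredient (a pretrained monocular depth model), here is a generic sketch that lifts a single image to a point cloud with MiDaS from torch.hub and an assumed pinhole camera; MAtCha's actual alignment and mesh extraction go far beyond this.
```python
import numpy as np
import torch

# Pretrained monocular depth (MiDaS small) and its matching preprocessing.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

# Placeholder RGB image (uint8, HWC); use a real photo in practice.
img = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)
with torch.no_grad():
    pred = midas(transform(img)).squeeze().numpy()  # relative inverse-depth map

# Back-project to a point cloud with an assumed pinhole camera (fx = fy = 500).
h, w = pred.shape
fx = fy = 500.0
u, v = np.meshgrid(np.arange(w), np.arange(h))
z = 1.0 / np.maximum(pred, 1e-6)           # relative scale only, not metric depth
x, y = (u - w / 2) * z / fx, (v - h / 2) * z / fy
points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
print(points.shape)
```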
Reposted by Guillaume Astruc
🌍 Guessing where an image was taken is a hard and often ambiguous problem. Introducing diffusion-based geolocation: we predict global locations by refining random guesses into trajectories across the Earth's surface!
🗺️ Paper, code, and demo: nicolas-dufour.github.io/plonk
December 10, 2024 at 3:56 PM
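A toy sketch of the sampling idea (refining random guesses into trajectories on the globe): start from random lat/lon points and integrate a velocity field toward the prediction. The "model" below is a stub that always points to a fixed target; the real system learns this field conditioned on image features.
```python
import torch

def velocity_model(latlon: torch.Tensor, t: float) -> torch.Tensor:
    """Stub for a learned, image-conditioned velocity field.
    Here it simply points every guess toward (48.85 N, 2.35 E)."""
    target = torch.tensor([48.85, 2.35])
    return target - latlon  # straight-line, flow-matching-style velocity

# Random guesses spread over the globe (lat in [-90, 90], lon in [-180, 180]).
guesses = torch.rand(16, 2) * torch.tensor([180.0, 360.0]) - torch.tensor([90.0, 180.0])

steps = 50
for i in range(steps):
    t = i / steps
    # Euler step along the velocity field; the divisor closes the remaining gap.
    guesses = guesses + velocity_model(guesses, t) / (steps - i)

print(guesses.mean(dim=0))  # all trajectories end at the predicted location
```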