tommiekerssies.bsky.social
@tommiekerssies.bsky.social
How fast can segmentation get while still maintaining accuracy?
✅ EoMT achieves an optimal trade-off between accuracy (PQ) 📊 and speed (FPS) ⚡ on COCO, thanks to its simple encoder-only design.
❌ No complex additional components.
❌ No bottlenecks.
🚀 Just performance.
(3/6)
March 31, 2025 at 8:35 PM
How do modern segmentation models work?
🚫 They chain together complex components:
ViT → Adapter → Pixel Decoder → Transformer Decoder…
✅ EoMT removes them all.
It keeps only the ViT and adds a few query tokens that guide it to predict masks. No decoder needed.
(2/6)
March 31, 2025 at 8:35 PM
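A toy sketch of the idea in post 2/6, for readers who want to see the token flow. It is not the EoMT code. It assumes the query tokens are concatenated with the patch tokens, the joint sequence runs through the same encoder blocks, and each mask is scored as a dot product between a query and every patch. The names `self_attention` and `encoder_only_masks`, the identity attention projections, and the random tokens are all hypothetical stand-ins.

```python
import math
import random


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(tokens):
    """Stand-in for one ViT block: single-head attention with identity
    projections plus a residual connection (an assumption, not real ViT weights)."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        weights = softmax([dot(q, k) / math.sqrt(dim) for k in tokens])
        mixed = [sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(dim)]
        out.append([a + b for a, b in zip(q, mixed)])
    return out


def encoder_only_masks(patch_tokens, query_tokens, num_blocks=2):
    """Process queries and patches jointly, then score every patch per query."""
    num_queries = len(query_tokens)
    tokens = query_tokens + patch_tokens  # one joint sequence, no separate decoder
    for _ in range(num_blocks):
        tokens = self_attention(tokens)
    queries, patches = tokens[:num_queries], tokens[num_queries:]
    # One mask-logit row per query, one column per patch.
    return [[dot(q, p) for p in patches] for q in queries]


random.seed(0)
dim, grid, num_queries = 8, 4, 3  # toy sizes chosen for illustration
patch_tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(grid * grid)]
query_tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_queries)]

mask_logits = encoder_only_masks(patch_tokens, query_tokens)
print(len(mask_logits), len(mask_logits[0]))  # prints "3 16"
```

Each row of `mask_logits` is one query's score over the 4×4 patch grid. A real model would use trained ViT blocks, learned query embeddings, and a class prediction per query.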
Image segmentation doesn’t have to be rocket science. 🚀
Why build a rocket engine full of bolted-on subsystems when one elegant unit does the job? 💡
That’s what we did for segmentation.
✅ Meet the Encoder-only Mask Transformer (EoMT): tue-mps.github.io/eomt (CVPR 2025)
(1/6)
March 31, 2025 at 8:35 PM