Sasha Abramowitz
@sash-4.bsky.social
Research engineer at InstaDeep working on multi-agent RL
🧑‍🔬 Oumayma Mahjoub and Wiem Khlifi will be presenting Sable at #ICML2025.

🗓️Wednesday, 16 July, 4:30 PM PDT
📍West Exhibition Hall B2-B3, Poster Number W-820

#MARL #AI #ICML2025
July 14, 2025 at 3:46 PM
If you are interested, have a look at the full paper and code:
📜Paper: arxiv.org/abs/2410.01706
🧑‍💻Code: bit.ly/4eMUXhn
🌐Website/Data: sites.google.com/view/sable-m...

(7/N)
Sable: a Performant, Efficient and Scalable Sequence Model for MARL
As multi-agent reinforcement learning (MARL) progresses towards solving larger and more complex problems, it becomes increasingly important that algorithms exhibit the key properties of (1) strong per...
arxiv.org
July 14, 2025 at 3:46 PM
🎉 A massive thank you to my incredible co-authors Oumayma Mahjoub, Ruan De Kock, Wiem Khlifi, Simon Du Toit, Jemma Daniel, Louay Ben Nessir, Louise Beyers, Claude Formanek & Arnu Pretorius

(6/N)
July 14, 2025 at 3:44 PM
⚡Despite its power, Sable is remarkably efficient. It scales to over 1000 agents with linear memory increase and boasts 7x better GPU memory efficiency and up to a 6.5x improvement in throughput compared to MAT (previous SOTA).

(5/N)
July 14, 2025 at 3:42 PM
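To see why the retention-style recurrence keeps memory growth linear in the number of agents, here is a back-of-envelope comparison (illustrative sizes only, not measurements from the paper): attention materialises an n×n score matrix per head, while a retention recurrence only carries a fixed d×d state per head.

```python
# Illustrative numbers, not the paper's benchmarks.
n_agents, d_head, n_heads, f32 = 1024, 64, 8, 4  # f32 = bytes per float32

attn_scores = n_heads * n_agents * n_agents * f32  # O(n^2) attention map
retn_state = n_heads * d_head * d_head * f32       # constant-size retention state

print(f"attention scores: {attn_scores / 2**20:.0f} MiB")  # 32 MiB
print(f"retention state:  {retn_state / 2**20:.3f} MiB")   # 0.125 MiB
```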
🔬In a benchmark across 45 diverse tasks (the largest in the literature), Sable substantially outperformed existing methods, ranking best 11 times more often than the previous SOTA methods.

(4/N)
July 14, 2025 at 3:42 PM
💪 Our solution? Sable adapts the retention mechanism from Retentive Networks (RetNets), achieving the advantages of centralised learning without the associated drawbacks. This allows for efficient long-term memory and impressive scalability.

(3/N)
July 14, 2025 at 3:41 PM
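For intuition, here is a minimal sketch of the retention recurrence in JAX: a single head with a fixed decay, where the carried state is a d×d matrix whose size does not grow with the sequence of agents. This is only an illustration of the RetNet-style recurrence, not Sable's actual implementation (which differs in details such as multi-scale decay and how observations are encoded).

```python
import jax
import jax.numpy as jnp

def recurrent_retention(q, k, v, gamma=0.9):
    """Single-head retention in its recurrent form (RetNet-style sketch).

    q, k, v: [seq_len, d] projections of the agent sequence.
    The carried state S is a fixed [d, d] matrix, so per-step memory is
    constant in seq_len:  S_t = gamma * S_{t-1} + k_t^T v_t,  o_t = q_t S_t.
    """
    d = q.shape[-1]

    def step(S, qkv):
        q_t, k_t, v_t = qkv
        S = gamma * S + jnp.outer(k_t, v_t)  # decayed state update
        return S, q_t @ S                    # retention output at this position

    _, out = jax.lax.scan(step, jnp.zeros((d, d)), (q, k, v))
    return out

# Toy usage: a "sequence" of 8 agents with head dimension 4.
q, k, v = jax.random.normal(jax.random.PRNGKey(0), (3, 8, 4))
print(recurrent_retention(q, k, v).shape)  # (8, 4)
```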
🤔 The challenge? Centralised training in MARL performs well but cannot scale, limiting its use to scenarios with only a few agents. This creates a trade-off between performance and agent scalability.

(2/N)
July 14, 2025 at 3:41 PM
Please add me 🙏
November 26, 2024 at 11:51 AM
Totally agree with you on the filtering point, but we're all pretty bad at predicting which papers will be useful in the future, e.g. PPO was rejected.

So maybe only reviewing for soundness is a good thing?
November 20, 2024 at 7:24 PM
Can you add me 🙏
November 19, 2024 at 6:11 AM
End-to-end compiling RL algorithms and envs and running everything across multiple TPU cores/GPUs, so that you never have to communicate anything with the CPU. This gives ridiculous speed-ups, on the order of 100x depending on the environment. I don't think torch is there yet.
November 19, 2024 at 6:08 AM
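For concreteness, here is a sketch of the pattern that post describes, in JAX with hypothetical toy env/policy functions (not any particular library's API): the whole rollout loop is jitted via lax.scan, so it compiles to a single device program with no CPU round-trips, and vmap batches thousands of environments inside that same program.

```python
from functools import partial

import jax
import jax.numpy as jnp

def env_step(state, action):
    # Hypothetical pure-function env: nudge a scalar state, reward = -|state|.
    new_state = state + jnp.where(action == 1, 0.1, -0.1)
    return new_state, -jnp.abs(new_state)

def policy(params, state):
    # Hypothetical one-weight policy producing a discrete action.
    return (params * state > 0).astype(jnp.int32)

@partial(jax.jit, static_argnames="steps")
def rollout(params, init_state, steps=1000):
    # The entire loop lives on-device: lax.scan unrolls policy + env steps
    # inside one compiled program, with no host communication per step.
    def step(state, _):
        action = policy(params, state)
        state, reward = env_step(state, action)
        return state, reward

    final_state, rewards = jax.lax.scan(step, init_state, None, length=steps)
    return final_state, rewards.sum()

# vmap runs 4096 envs in the same compiled program; pmap/shard_map would
# spread it across multiple TPU cores or GPUs the same way.
final_states, returns = jax.vmap(rollout, in_axes=(None, 0))(
    jnp.float32(1.0), jnp.linspace(-1.0, 1.0, 4096)
)
```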