Kale-ab Tessera
kale-ab.bsky.social
ML PhD Student @ Uni. of Edinburgh, working on Multi-Agent Problems. | Organiser @deeplearningindaba.bsky.social @rl-agents-rg.bsky.social | 🇪🇹🇿🇦
kaleabtessera.com
I think most people judge reputation from high-level signals, e.g. the number of accepted papers, and very few people actually read these papers. This means you can game the system with LLM-generated papers with little consequence, and that makes things frustrating for everyone.
November 16, 2025 at 10:20 AM
Refreshing to see posts like this compared to "we have 15 papers accepted at X" 🙌
August 19, 2025 at 11:44 AM
🙌🎉
August 3, 2025 at 8:14 PM
Always nice to see when simpler methods + good evaluations > more complicated ones. 👌
July 23, 2025 at 9:47 AM
This has happened to me too many times 🤦‍♂️ Also doesn't help that JAX and PyTorch use different default initialisations for dense layers.
June 24, 2025 at 7:19 AM
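To make that mismatch concrete, here's a small stdlib-only sketch (not either library's actual code) of the two default schemes: PyTorch's `nn.Linear` uses Kaiming-uniform with `a=sqrt(5)`, which works out to `Uniform(-1/sqrt(fan_in), 1/sqrt(fan_in))`, while Flax's `nn.Dense` defaults to LeCun normal with std `1/sqrt(fan_in)` — so the initial weight scales differ by a factor of about `sqrt(3)`.

```python
import math

def pytorch_linear_std(fan_in):
    # PyTorch nn.Linear default: kaiming_uniform_ with a=sqrt(5),
    # which simplifies to Uniform(-1/sqrt(fan_in), 1/sqrt(fan_in)).
    bound = 1.0 / math.sqrt(fan_in)
    return bound / math.sqrt(3.0)  # std of U(-b, b) is b / sqrt(3)

def flax_dense_std(fan_in):
    # Flax nn.Dense default: lecun_normal -> std = 1/sqrt(fan_in)
    return 1.0 / math.sqrt(fan_in)

fan_in = 256
print(pytorch_linear_std(fan_in))  # ~0.036
print(flax_dense_std(fan_in))      # 0.0625, ~1.73x larger
```

Same architecture, same seed handling, and the two frameworks still start from visibly different weight scales — worth pinning initialisers explicitly when porting models.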
Well done & well deserved!! 🎉🎉 It has been awesome to see this project evolve from the early days.
June 23, 2025 at 6:45 AM
The Edinburgh one will be back and running soon. We are just updating the website and other things. There is this form for people interested - forms.gle/DAbkpN9b4cUt...
Edinburgh RL Reading Group
Please add your details so that you can remain on the mailing list for the RL Reading Group.
forms.gle
June 5, 2025 at 3:40 PM
Forgot to also add ⚡ quickstart link for people who like to experiment on notebooks: github.com/KaleabTesser...
github.com
May 28, 2025 at 9:37 AM
Thanks for checking it out! 👍 Good point, there might be an interesting link between MoEs and hypernets. We used hypernets since they're simpler (no need to pick or combine experts) and maximally expressive (they generate weights directly).

Lol yes, will add a .gitignore, missed it when copying things over.
May 28, 2025 at 7:40 AM
🎯 TL;DR: HyperMARL is a versatile approach for adaptive MARL -- no changes to the RL objective, preset diversity, or seq. updates needed. See paper & code below!

Work with Arrasy Rahman, Amos Storkey & Stefano Albrecht.

📜: arxiv.org/abs/2412.04233
👩‍💻: github.com/KaleabTessera/HyperMARL
HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
Adaptability to specialised or homogeneous behaviours is critical in cooperative multi-agent reinforcement learning (MARL). Parameter sharing (PS) techniques, common for efficient adaptation, often li...
arxiv.org
May 27, 2025 at 11:07 AM
⚠️ Limitations (+opportunity): HyperMARL uses vanilla hypernets, which can increase parameter count, especially with MLP hypernets. In RL/MARL this matters less (actor-critic nets are small), and parameters grow roughly constant with #agents, so scaling remains strong. Future work could explore chunked hypernets.
May 27, 2025 at 11:07 AM
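The parameter trade-off above can be illustrated with some back-of-the-envelope arithmetic (made-up layer sizes, not the paper's actual configs): the hypernet's last layer must emit all of the target net's parameters at once, so it is large, but it is paid once rather than per agent.

```python
# Illustrative parameter counting with assumed (hypothetical) sizes.
def mlp_params(sizes):
    # weights + biases for a plain MLP with the given layer widths
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))

P = mlp_params([64, 128, 128, 5])   # small actor net: P target params
E, H = 8, 256                       # agent-embedding dim, hypernet hidden width
hypernet = mlp_params([E, H, P])    # hypernet emits all P params in one shot

for n_agents in (2, 8, 20):
    nops = n_agents * P             # NoPS: one full net per agent
    hyper = hypernet + n_agents * E # hypernet + one tiny embedding per agent
    print(n_agents, nops, hyper)
```

The hypernet total dwarfs a single actor net (the stated limitation), but going from 2 to 20 agents only adds 18 embeddings' worth of parameters, while NoPS grows linearly in `P`.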
🔎 We also run ablations, which show the importance of the decoupling and of the simple initialisation scheme we follow.
May 27, 2025 at 11:07 AM
📊 We validate HyperMARL across various diverse envs (18 settings; up to 20 agents) and find that it achieves competitive mean episode returns compared to NoPS, FuPS, and modern diversity-focused methods -- without using diversity losses, preset diversity levels or seq. updates.
May 27, 2025 at 11:07 AM
💡To address the coupling problem, we propose 𝐇𝐲𝐩𝐞𝐫𝐌𝐀𝐑𝐋: a method that explicitly 𝐝𝐞𝐜𝐨𝐮𝐩𝐥𝐞𝐬 obs- and agent-conditioned gradients with hypernetworks. This means obs grad noise is avg. per agent (Zᵢ) before applying agent-cond. grads (Jᵢ) -- unlike FuPS, which entangles both.
May 27, 2025 at 11:07 AM
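A toy forward-pass sketch of the hypernetwork idea (stdlib only, linear hypernet, invented sizes — not the paper's architecture): the agent embedding goes through the hypernet to produce that agent's policy weights, and only then does the observation enter, so the obs-conditioned and agent-conditioned paths never mix inside one network.

```python
import random

random.seed(0)

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

OBS, ACT, EMB = 4, 3, 2
# Hypernet: maps an agent embedding to OBS*ACT generated policy weights.
H = [[random.gauss(0.0, 0.1) for _ in range(EMB)] for _ in range(OBS * ACT)]

def agent_policy_weights(embedding):
    flat = matvec(H, embedding)                          # generate weights
    return [flat[i * OBS:(i + 1) * OBS] for i in range(ACT)]  # (ACT, OBS)

def act(embedding, obs):
    W_i = agent_policy_weights(embedding)  # agent-conditioned path
    return matvec(W_i, obs)                # obs-conditioned path: logits = W_i @ obs

logits = act([1.0, 0.0], [0.5, -0.2, 0.1, 0.3])
print(len(logits))  # 3
```

Because gradients w.r.t. the shared hypernet flow through `W_i`, observation noise is aggregated per agent before it touches the agent-conditioned parameters, which is the decoupling the post describes.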
🔬 We isolate FuPS’s failure in matrix games: shared policies struggle when agents need to act differently. Inter-agent gradient interference is at play -- especially when obs and agent IDs are 𝐜𝐨𝐮𝐩𝐥𝐞𝐝. Surprisingly, using only IDs (no obs) performed better and reduced interference.
May 27, 2025 at 11:07 AM
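The interference in the coupled shared-policy case can be reproduced in a few lines. A hypothetical two-action toy (not the paper's exact matrix games): two agents share one set of logits, agent 0 is rewarded for action 0 and agent 1 for action 1. With equal reward scales, their policy gradients cancel exactly, so the shared policy cannot specialise.

```python
import math

def softmax(logits):
    m = max(logits)
    e = [math.exp(l - m) for l in logits]
    s = sum(e)
    return [x / s for x in e]

theta = [0.3, -0.1]          # shared logits for both agents
p = softmax(theta)

def policy_gradient(rewarded_action):
    # Gradient of expected reward p[a] w.r.t. shared logits:
    # d p[a] / d theta_k = p[a] * (onehot(a)[k] - p[k])
    return [p[rewarded_action] * ((1.0 if k == rewarded_action else 0.0) - p[k])
            for k in range(2)]

g0 = policy_gradient(0)      # pushes probability toward action 0
g1 = policy_gradient(1)      # pushes probability toward action 1
total = [a + b for a, b in zip(g0, g1)]
print(total)                 # components cancel (up to float error)
```

Component 0 of the sum is `p0*(1 - p0) - p1*p0 = p0*(1 - p0 - p1) = 0`, and symmetrically for component 1: the agents' gradients interfere destructively on the shared parameters.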