Marcel Hussing
marcelhussing.bsky.social
PhD student at the University of Pennsylvania. Previously an intern at MSR and Meta FAIR. Interested in reliable and replicable reinforcement learning, robotics, and knowledge discovery: https://marcelhussing.github.io/
All posts are my own.
Reposted by Marcel Hussing
Scaling Laws in Particle Physics Data! This is a result I've been itching to share and it's finally out. One of the big open questions is how much better AI-based methods at particle colliders can still become. 1/4
February 1, 2026 at 11:55 AM
Not a particle physics person but extremely curious, can you elaborate what we might hope to learn from these models in the future? What physics might we discover using them?
February 1, 2026 at 8:36 PM
"Scientific reviewers should have experience publishing scientific work in related areas" is really not that hot of a take.
January 27, 2026 at 3:25 PM
Clicking like on any relevant ICLR paper. Encourage people to post their work here more!
January 26, 2026 at 4:32 PM
How do I see this?
January 26, 2026 at 4:22 PM
The other paper accepted to @iclr-conf.bsky.social 2026 🇧🇷. Our work on replicable RL sheds some light on how to consistently make decisions in RL.

@ericeaton.bsky.social @mkearnsphilly.bsky.social @aaroth.bsky.social @sikatasengupta.bsky.social @optimistsinc.bsky.social
I think I posted about it before but never with a thread. We recently put a new preprint on arxiv.

📖 Replicable Reinforcement Learning with Linear Function Approximation

🔗 arxiv.org/abs/2509.08660

In this paper, we study formal replicability in RL with linear function approximation. The... (1/6)
Replicable Reinforcement Learning with Linear Function Approximation
Replication of experimental results has been a challenge faced by many scientific disciplines, including the field of machine learning. Recent work on the theory of machine learning has formalized rep...
arxiv.org
January 26, 2026 at 4:08 PM
Two papers accepted to @iclr-conf.bsky.social 2026! One of them is REPPO, see below! I think it deserves a lot more recognition. Let's chat about it in Rio! 🇧🇷
🤔 Want to use REPPO (cvoelcker.de/projects/rep...) but hate jax? 🤔
😮 Want to have stable on-policy RL without filling your GPU with an enormous replay buffer? 😮
🤖 Are you a roboticist and just want your RL code to run? 🤖

🎉 Fear not, we started adding new REPPO versions! 🎉
github.com/cvoelcker/rs...
Relative Entropy Pathwise Policy Optimization | Claas A. Voelcker
cvoelcker.de
January 26, 2026 at 4:05 PM
Quite disheartening that there isn't a single workshop at ICLR to present my RL work, but there are several topics that are listed 5 or 6 times, just named differently.
January 26, 2026 at 2:17 PM
That's correct, we did make it bold
January 24, 2026 at 3:46 AM
Our number went down by 0.01 but it's very expensive to run so we can't have error bars. Our algorithm is so much better than the rest, new SOTA!
January 23, 2026 at 2:46 PM
I can't believe that this paper is not yet used by literally everyone. Claas doing all he can to make your life easier. Check it out.
🤔 Want to use REPPO (cvoelcker.de/projects/rep...) but hate jax? 🤔
😮 Want to have stable on-policy RL without filling your GPU with an enormous replay buffer? 😮
🤖 Are you a roboticist and just want your RL code to run? 🤖

🎉 Fear not, we started adding new REPPO versions! 🎉
github.com/cvoelcker/rs...
Relative Entropy Pathwise Policy Optimization | Claas A. Voelcker
cvoelcker.de
January 17, 2026 at 10:06 PM
Bringing this back up.
I made a starter pack for learning theory people to gather some people around the topic. There are too many names on here that I don't know so I only added a few I do. If you believe you should be on this list, let me know. I will add people with accurate profile descriptions.

go.bsky.app/21nFz12
January 13, 2026 at 11:36 AM
Reposted by Marcel Hussing
Excited about a new paper! Multicalibration turns out to be strictly harder than marginal calibration. We prove tight Omega(T^{2/3}) lower bounds for online multicalibration, separating it from online marginal calibration for which better rates were recently discovered.
January 9, 2026 at 1:21 PM
For me, it's mostly verbalizing code I already know I want. I don't write whole apps. I take my RL code and add entropy regularization to TD3. Then I verify. Be explicit about what needs to change and know ahead what changes you expect. I still do the thinking, I just worry less about code context.
December 26, 2025 at 12:24 AM
I like this site a lot, but too few people are posting interesting ML content imo, and if they do, too infrequently. I notice this about myself a lot.
December 22, 2025 at 7:41 PM
This project is a huge team effort across @grasplab.bsky.social and Penn trauma led by PIs @ericeaton.bsky.social and CJ Taylor. Shoutout to @jasonahughes.bsky.social, Raj Kannapiran, and Edward Zhang who did a lot of the heavy lifting. Check out the technical report arxiv.org/abs/2512.08754 (5/5)
A Multi-Robot Platform for Robotic Triage Combining Onboard Sensing and Foundation Models
This report presents a heterogeneous robotic system designed for remote primary triage in mass-casualty incidents (MCIs). The system employs a coordinated air-ground team of unmanned aerial vehicles (...
arxiv.org
December 22, 2025 at 6:51 PM
Then the ML takes over. Onboard models estimate breathing and heart rate from radar and thermal, and read injuries from multi-view images and audio. Fine-tuned VLMs plus Grounding DINO and SAM2 convert data into a triage report for responders. (4/5)
December 22, 2025 at 6:51 PM
Our robots do the dangerous first pass. Falcon drones sweep the scene from above using RGB and thermal cameras to detect and geolocate casualties, day or night. Jackal ground robots then drive in for close-up sensing and send a live victim map to responders. (3/5)
December 22, 2025 at 6:51 PM
In this project, we are building a multi-robot system to facilitate the process of triage. The system consists of a fleet of Jackal ground robots and several in-house built Falcon drones which rapidly scan an area, localize people in need and help determine their injuries. (2/5)
December 22, 2025 at 6:51 PM
In emergencies, minutes can decide who lives and who dies. Our team is participating in the Triage Challenge, building AI to empower clinicians and prioritize care when resources are thin:
prontotriage.com

Also recently featured by Meta:
ai.meta.com/blog/upenn-d...

(1/5)
December 22, 2025 at 6:51 PM
Posted about this last week; I feel like it didn't get as much attention as it deserves. We have a new preprint on using diffusion models to generate compositional data. This work was conducted by Quan, a student I advised over the fall. He is currently looking for PhD positions. Check it out!
December 22, 2025 at 3:57 PM
Oh nice, this is kinda cool. Have an immediate idea on what to do with this. :D Hadn't heard of this before. Which ones are the most interesting multi-player games available and for what reason do you think they are interesting? I'm assuming those are multi-agent games?
December 19, 2025 at 6:41 PM
We have a new preprint out on iterative generation of compositional robot data. This work was conducted by Quan, one of the students I advised over the semester. Check out the thread!

P.S. Quan is currently looking for PhD positions, keep an eye out for him!
December 19, 2025 at 6:30 PM
Agreed, and that is basically what Section 4.1 in the paper I linked says. There is nuance to it of course, but that part really needs to be step 1. However, it is unclear how to start this discussion, and our proposal is to turn the discussion into research.
December 10, 2025 at 5:38 AM
What is a better benchmark? @cvoelcker.bsky.social and I wrote a Finding the Frame paper about this in the hope of starting a discussion around benchmark selection. Benchmarks are often chosen via "we did what others did" where the starting point was picked arbitrarily. arxiv.org/abs/2410.08870
Can we hop in general? A discussion of benchmark selection and design using the Hopper environment
Empirical, benchmark-driven testing is a fundamental paradigm in the current RL community. While using off-the-shelf benchmarks in reinforcement learning (RL) research is a common practice, this choic...
arxiv.org
December 10, 2025 at 1:32 AM