Marcel Hussing
@marcelhussing.bsky.social
PhD student at the University of Pennsylvania. Previously an intern at MSR; currently at Meta FAIR. Interested in reliable and replicable reinforcement learning, robotics, and knowledge discovery: https://marcelhussing.github.io/
All posts are my own.
You know what would be funny? If it comes back and the reviews aren't out yet.
November 11, 2025 at 9:09 PM
Not sure there is a single good source. Maybe we should write one @cvoelcker.bsky.social
October 27, 2025 at 3:10 PM
I don't necessarily think it's dull, but one would need a venue where work like that can be published. Only TMLR comes to mind, and even then only to some extent.
October 27, 2025 at 2:57 PM
The cynic in me wants to say "because the paper needs to confuse the reviewer to get accepted," but of course I would never say that.
October 27, 2025 at 2:51 PM
I also think it's not that they don't work, but that there were a lot of entangled problems which have been addressed over the years. I'm convinced that many of these things need to be restudied with our new algorithmic/architectural insights that simply make learning stable.
October 27, 2025 at 2:45 PM
Yea 😂 we spent a lot of time on getting the exponents in the ridge regression bounds small to avoid an explosion down the line, but that worked out only semi-well. 😅 I do think it's probably possible to get much smaller exponents, but I suspect that will require a fundamentally different approach.
October 26, 2025 at 3:26 PM
This should of course say quantizing Q-values 🤦
October 26, 2025 at 2:34 PM
This was a fun theory-practice collaboration with the theory group at Penn.

👩‍🎓👨‍🎓
@ericeaton.bsky.social
@mkearnsphilly.bsky.social
@aaroth.bsky.social
@sikatasengupta.bsky.social
@optimistsinc.bsky.social

(6/6)
October 26, 2025 at 2:16 PM
We also empirically evaluate the algorithms. We first demonstrate that the sample complexity bounds are not representative of average-case performance. Then, we derive insights for deep RL with discrete action spaces.

💡 Quantizing actions leads to agreement across policies! (5/6)
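(Per the correction above, "actions" here should read "Q-values".) A minimal sketch of the insight, with an assumed bin width and rounding scheme rather than the paper's exact procedure:

```python
import numpy as np

def quantized_greedy_action(q_values, bin_width):
    # Snap Q-values to a coarse grid before the argmax: two runs whose
    # Q-estimates differ by less than the bin width will typically
    # select the same greedy action, so the resulting policies agree.
    q_rounded = np.round(np.asarray(q_values) / bin_width) * bin_width
    return int(np.argmax(q_rounded))

# Two runs with slightly different Q-estimates: the raw argmax flips
# between actions 0 and 1, but the quantized choice agrees.
run_a = [1.02, 0.98, 0.55]
run_b = [0.99, 1.01, 0.57]
assert quantized_greedy_action(run_a, 0.25) == quantized_greedy_action(run_b, 0.25)
```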
October 26, 2025 at 2:16 PM
We build two objects near a reference point and apply randomized rounding. In ridge regression, the reference is the convex minimizer. With a Rademacher argument and uniform gradient convergence, this yields replicability. Moreover, the algorithm remains replicable even when it is not fully accurate. (4/6)
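A rough sketch of the rounding step for the ridge case, assuming the simplest coordinate-wise randomly shifted grid (the paper's construction near the convex minimizer is more refined; `grid_width` is an illustrative parameter):

```python
import numpy as np

def replicable_ridge(X, y, lam, grid_width, rng):
    # Standard ridge solution: w = (X^T X + lam * I)^{-1} X^T y.
    d = X.shape[1]
    w_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    # Shared random grid shift: replicability hinges on both runs
    # drawing the SAME shift (same seed). Ridge solutions from two
    # samples concentrate near the population minimizer, so with a
    # coarse enough grid they round to the identical vector whp.
    shift = rng.uniform(0.0, grid_width, size=d)
    return np.round((w_hat - shift) / grid_width) * grid_width + shift

# Same shared randomness, fresh data -> same output with high probability:
# w1 = replicable_ridge(X1, y1, 1.0, 0.1, np.random.default_rng(0))
# w2 = replicable_ridge(X2, y2, 1.0, 0.1, np.random.default_rng(0))
```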
October 26, 2025 at 2:16 PM
In this work, we show that we can get replicability guarantees even in RL with function approximation. The idea is to first ensure replicability of ridge regression and uncentered covariance estimation, then use these tools in common approaches for solving linear MDPs. (3/6)
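To make the covariance piece concrete, a hypothetical sketch in the same shared-randomness style (the function name, the entrywise grid, and `grid_width` are my assumptions, not the paper's algorithm):

```python
import numpy as np

def replicable_uncentered_cov(X, grid_width, rng):
    # Empirical uncentered covariance: (1/n) * X^T X.
    cov_hat = (X.T @ X) / X.shape[0]
    # Shared random shift, symmetrized so the rounded estimate stays a
    # symmetric matrix; as before, both runs must use the same seed.
    s = rng.uniform(0.0, grid_width, size=cov_hat.shape)
    shift = np.triu(s) + np.triu(s, 1).T
    return np.round((cov_hat - shift) / grid_width) * grid_width + shift
```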
October 26, 2025 at 2:16 PM
This work is motivated by the fact that, in deep RL, variation from randomness can lead to drastically different solutions when executing the same algorithm twice. An algorithm is formally replicable if, with high probability, it produces identical outcomes. I.e., run your algorithm twice and get the same policy twice. (2/6)
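For reference, the standard formalization from the replicability literature (notation is mine; the paper's exact definition may differ):

```latex
% An algorithm A is \rho-replicable if, for samples S_1, S_2 drawn
% i.i.d. from the same distribution D and shared internal randomness r,
\Pr_{S_1, S_2 \sim D^n,\ r}\bigl[\, A(S_1; r) = A(S_2; r) \,\bigr] \ge 1 - \rho .
```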
October 26, 2025 at 2:16 PM
A large chunk of CS theory orders authors alphabetically, and some even order randomly. Without any common standard, ideas like these are just gonna disadvantage people.
October 26, 2025 at 1:58 PM
This one is so accurate it hurts my soul
October 23, 2025 at 2:31 AM
arxiv.org/abs/2207.04136
We always wondered how to discover the factored structure if it is not given. It's an intriguing question for which I have a few ideas, but so far too little time.
August 23, 2025 at 2:20 AM