Willem Röpke
@willemropke.bsky.social
PhD student | Interested in all things decision-making and learning
Pinned
Exciting news! My paper on multi-objective reinforcement learning was accepted at AAMAS 2025!

We introduce IPRO (Iterated Pareto Referent Optimisation)—a principled approach to solving multi-objective problems.

🔗 Paper: arxiv.org/abs/2402.07182
💻 Code: github.com/wilrop/ipro
May 13, 2025 at 2:20 PM
I think the Qwen team is missing out on a huge opportunity to basically be the default model in all NeurIPS submissions by not releasing Qwen3
April 22, 2025 at 5:59 PM
Using LLMs to come up with prompts for LLMs to then ask the LLMs to then train the LLMs to then ....
April 10, 2025 at 9:49 AM
Manifesting Qwen 3
April 9, 2025 at 10:20 AM
RIP to my investments from the past few years, it was nice seeing the green while it lasted
April 4, 2025 at 10:57 AM
The people demand Qwen3!
April 2, 2025 at 4:08 PM
I've been bashing my head against a wall trying to make TRL and their new vllm-serve work and holy moly it's just an infinite pain

why must i suffer
March 24, 2025 at 9:49 PM
Why does reading a book feel so much more satisfying than watching a TV show? Both are ways of consuming content so I don't get the difference
March 22, 2025 at 7:11 PM
Bought a Cherry Coke by accident today.

Horrible things happening everywhere apparently
March 12, 2025 at 1:57 PM
This is actually insanely clever, I would've never thought about this. Seems very interesting and important to fix!
What happens if we tokenize cat as [ca, t] rather than [cat]?

LLMs are trained on just one tokenization per word, but they still understand alternative tokenizations. We show that this can be exploited to bypass safety filters without changing the text itself.

#AI #LLMs #tokenization #alignment
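(Not from the quoted post, but a minimal sketch of the idea, assuming the Hugging Face transformers library and the GPT-2 tokenizer: a canonical and a non-canonical split are different token sequences, yet both decode to exactly the same text.)

```python
# Minimal sketch (assumes `transformers` and the GPT-2 tokenizer; not the
# method from the quoted paper): the same surface text can correspond to
# different token sequences.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

canonical = tok.encode("cat")                      # the tokenizer's preferred split, e.g. [cat]
alternative = tok.encode("ca") + tok.encode("t")   # a non-canonical split, e.g. [ca, t]

print(canonical, repr(tok.decode(canonical)))      # different token IDs ...
print(alternative, repr(tok.decode(alternative)))  # ... but both decode to 'cat'
```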
March 12, 2025 at 10:09 AM
I don't recall seeing a video in the recent past that depressed me as much as what I just watched unfolding in the Oval Office
February 28, 2025 at 7:58 PM
Exciting news! My paper on multi-objective reinforcement learning was accepted at AAMAS 2025!

We introduce IPRO (Iterated Pareto Referent Optimisation)—a principled approach to solving multi-objective problems.

🔗 Paper: arxiv.org/abs/2402.07182
💻 Code: github.com/wilrop/ipro
February 17, 2025 at 1:22 PM
This is unholy
February 12, 2025 at 12:52 PM
How can I stop ChatGPT from talking to me with emojis? This is just the worst update I've ever experienced.

I've put it in its memory, in my details, and I even repeat it in the chat but it's just replying like 👉🥺👈
February 12, 2025 at 9:52 AM
Macron is the goat

French people don't appreciate true genius
February 11, 2025 at 11:58 AM
Why did OpenAI update ChatGPT to use emojis in its responses? I hate it, and even when I explicitly say this it just keeps doing it.
February 11, 2025 at 11:01 AM
To whomever put my email in some spam list: I fart in your general direction
February 5, 2025 at 10:57 AM
The fact that in the year 2025 we are still dealing with the stupid "make the paper fit in an arbitrary format for the camera ready submission" minigame is killing me.

Either let me group authors or let me put acknowledgements after the main text. This isn't hard.
February 4, 2025 at 8:14 AM
Does anyone have any good hacks for making the AAMAS template not suck for people with multiple affiliations? I lose a gazillion lines for basically no reason...
January 31, 2025 at 8:29 AM
I found a very promising open problem in AI

Computing a MEDIAN over a list of rows where one of the elements is just an empty array
January 29, 2025 at 2:45 PM
I think this is the best paper I’ve ever read: arxiv.org/abs/2404.03715

A strong emphasis on theoretically principled algorithms for RLHF followed by motivated practical implementations. Well-written and a clear overview of the relevant background and related work.

10/10 no comments
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
This paper studies post-training large language models (LLMs) using preference feedback from a powerful oracle to help a model iteratively improve over itself. The typical approach for post-training L...
arxiv.org
January 27, 2025 at 6:50 PM
DeepSeek making my day just a little better
January 20, 2025 at 3:07 PM
I realise I'm woefully unqualified on this topic, but can someone please explain why we still don't have personal carrier drones? This seems like an obvious next step in transportation, and given the state of our tech tree it shouldn't be that hard?
January 20, 2025 at 9:19 AM
I think we should do congestion pricing in a lot more places
January 15, 2025 at 3:35 PM
Claude just declined my attempt at bribing it to do a better job.

Not sure whether to be happy or sad
January 13, 2025 at 3:23 PM