Nico Bohlinger
@nicobohlinger.bsky.social
26 | Morphology-aware Robotics, RL Research | PhD student at @ias-tudarmstadt.bsky.social
I'm presenting four different works at IROS 2025 this week in Hangzhou 🤖
October 20, 2025 at 4:46 AM
⚡️ Can one unified policy control 10 million different robots and zero-shot transfer to completely unseen robots, even humanoids?

🔗 Yes! Check out our paper: arxiv.org/abs/2509.02815
October 2, 2025 at 4:51 AM
🇰🇷 Conferences are about finally meeting your collaborators from all around the world!

Check out our work on Embodiment Scaling Laws @CoRL2025
We investigate cross-embodiment learning as the next axis of scaling for truly generalist policies 📈

🔗 All details: embodiment-scaling-laws.github.io
September 30, 2025 at 8:10 AM
Robot Randomization is fun!
July 2, 2025 at 6:19 PM
⚙️ Architecture matters
We also explored architectures like Universal Neural Functionals (UNF) and action-based representations ("Probing").
And yes, our scaled EPVFs are competitive with PPO and SAC in their final performance.
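The "Probing" idea in one picture: condition V not on the raw weights θ, but on the actions the policy produces on a fixed set of probe states. A minimal sketch, assuming toy shapes and a generic policy callable (all names here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, act_dim = 24, 8
probe_states = rng.normal(size=(16, obs_dim))  # fixed once for the whole run

def probe_representation(policy):
    # Represent the policy by what it *does* rather than by its raw weights:
    # concatenate its actions on the fixed probe states and feed that to V
    return np.concatenate([policy(s) for s in probe_states])

# e.g. a linear toy policy
W = rng.normal(size=(act_dim, obs_dim))
rep = probe_representation(lambda s: W @ s)  # shape: (16 * act_dim,)
```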
June 13, 2025 at 11:50 AM
⚡ But it's not just about size
Key ingredients for stability and performance are weight clipping and using uniform noise scaled to the parameter magnitudes.
Our ablation studies show just how critical these components are. Without them, performance collapses.
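Both ingredients are only a couple of lines each. A minimal sketch, assuming flat parameter vectors and made-up hyperparameter values (`noise_scale`, `clip_value` are illustrative names, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(theta, noise_scale=0.1):
    # Uniform noise scaled per parameter to the weight magnitude,
    # so large and small weights are explored proportionally
    return theta + rng.uniform(-1.0, 1.0, theta.shape) * noise_scale * np.abs(theta)

def clip_weights(theta, clip_value=1.0):
    # Hard weight clipping keeps the explored parameter space bounded,
    # which keeps the value function's input distribution stable
    return np.clip(theta, -clip_value, clip_value)
```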
June 13, 2025 at 11:50 AM
📈 Massive Scaling Pays Off
We see strong scaling effects when using MJX to roll out up to 4000 differently perturbed policies in parallel.
This explores the policy space effectively, and the large batches drastically reduce the variance of the resulting gradients.
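Since MJX is built on JAX, batching rollouts over perturbed parameter vectors is one vmap away. A minimal sketch with a toy stand-in for the simulator (the real rollout would step an MJX environment for a full episode; all function and variable names here are assumptions):

```python
import jax
import jax.numpy as jnp

def rollout_return(theta, key):
    # Toy stand-in for an MJX episode rollout that scores one parameter vector
    target = jnp.linspace(-1.0, 1.0, theta.shape[0])
    noise = 0.01 * jax.random.normal(key, theta.shape)
    return -jnp.sum((theta + noise - target) ** 2)

key = jax.random.PRNGKey(0)
theta = jnp.zeros(128)              # current policy parameters
keys = jax.random.split(key, 4000)  # one RNG stream per parallel rollout
perturbs = jax.vmap(
    lambda k: 0.1 * jax.random.uniform(k, theta.shape, minval=-1.0, maxval=1.0)
)(keys)
# Evaluate all 4000 perturbed parameter vectors in parallel
returns = jax.jit(jax.vmap(rollout_return))(theta + perturbs, keys)
print(returns.shape)  # (4000,)
```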
June 13, 2025 at 11:50 AM
🧠 Simple & Powerful RL
This unlocks fully off-policy learning and policy parameter space exploration using any policy data, and leads to probably the simplest DRL algorithm one can imagine:
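Roughly this loop. A minimal sketch, assuming a `value_net` object with `fit`/`grad` methods and a `rollout_return` function that evaluates one parameter vector in the environment (all interfaces here are illustrative, not the paper's):

```python
import numpy as np

def epvf_loop(theta, value_net, rollout_return, iters=100, pop=64, sigma=0.1, lr=1e-3):
    buffer = []  # fully off-policy: any (parameters, return) pair is usable
    for _ in range(iters):
        # 1. Perturb the current parameters and evaluate each copy in the env
        perturbed = [theta + sigma * np.random.uniform(-1, 1, theta.shape)
                     for _ in range(pop)]
        buffer += [(p, rollout_return(p)) for p in perturbed]
        thetas, returns = map(np.array, zip(*buffer))
        # 2. Regress V(theta) onto the observed returns
        value_net.fit(thetas, returns)
        # 3. Improve the policy by gradient ascent directly on V
        theta = theta + lr * value_net.grad(theta)
    return theta
```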
June 13, 2025 at 11:50 AM
🔍 What are EPVFs?
Imagine a value function that understands the policy's parameters directly: V(θ).
This allows for direct, gradient-based policy updates:
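The update the colon points to, written as a sketch in the post's own notation (α is a learning rate; the form follows Faccio & Schmidhuber's policy-conditioned value functions):

θ ← θ + α ∇θ V(θ)

No policy gradient estimator and no critic over states: the value network itself supplies the ascent direction in parameter space.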
June 13, 2025 at 11:50 AM
🚀 Check out our new work at @rldmdublin2025.bsky.social today at poster #16!
We're showing how to make Explicit Policy-conditioned Value Functions V(θ) (originating from Faccio & Schmidhuber) work for more complex control tasks. The secret? Massive scaling!
June 13, 2025 at 11:50 AM
We build on the efficient CrossQ DRL algorithm and combine it with two control architectures — Joint Target Prediction for agile maneuvers and Central Pattern Generators for stable, natural gaits — to train locomotion policies directly on the HoneyBadger quadruped robot from MAB Robotics.
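For intuition on the CPG side, a generic phase-oscillator sketch, not the exact controller used on the HoneyBadger (gait frequency, phase offsets, and amplitude are made-up values):

```python
import numpy as np

# One phase oscillator per leg; a trot couples diagonal legs in phase
PHASE_OFFSETS = np.array([0.0, np.pi, np.pi, 0.0])  # FL, FR, RL, RR

def cpg_targets(t, freq=2.0, amp=0.4):
    # Each leg tracks a smooth periodic joint target; the policy can then
    # modulate frequency/amplitude/offsets instead of predicting raw joints
    phase = 2 * np.pi * freq * t + PHASE_OFFSETS
    return amp * np.sin(phase)  # one target per leg

print(cpg_targets(0.1))
```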
March 18, 2025 at 10:24 PM
⚡️ Do you think training robot locomotion needs large-scale simulation? Think again!

We train an omnidirectional locomotion policy directly on a real quadruped in just a few minutes 🚀
Top speeds of 0.85 m/s, two different control approaches, indoor and outdoor experiments, and more! 🤖🏃‍♂️
March 18, 2025 at 10:24 PM