Lerrel Pinto
@lerrelpinto.com
lerrelpinto.com
Assistant Professor of CS @nyuniversity.

I like robots!
We just released RUKA, a $1300 humanoid hand that is 3D-printable, strong, precise, and fully open sourced!

The key technical breakthrough is that we can control the joints and fingertips of the robot **without joint encoders**. All we need is self-supervised data collection and learning.
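For intuition, here is a minimal sketch of one way encoder-free control could work: drive the hand with random motor commands while recording the resulting fingertip positions, then regress the inverse mapping. Everything here (shapes, names, the data-collection step) is an assumption for illustration, not the RUKA codebase.

```python
# Illustrative sketch only: learn to command fingertips without joint encoders
# by regressing observed fingertip positions -> motor commands from
# self-collected data. Names, shapes, and the data source are assumptions.
import torch
import torch.nn as nn

NUM_MOTORS = 11          # assumed number of tendon-driving motors
FINGERTIP_DIM = 5 * 3    # 5 fingertips, 3D position each

class InverseController(nn.Module):
    """Maps desired fingertip positions to motor commands."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FINGERTIP_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, NUM_MOTORS),
        )

    def forward(self, fingertip_targets):
        return self.net(fingertip_targets)

# Self-supervised data: random motor commands and the fingertip positions they
# produce (e.g., recorded with a motion-capture glove). Fake data here so the
# sketch runs end to end.
motor_cmds = torch.rand(10_000, NUM_MOTORS)
fingertips = torch.rand(10_000, FINGERTIP_DIM)

model = InverseController()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1_000):
    idx = torch.randint(0, len(fingertips), (256,))
    pred_cmds = model(fingertips[idx])                    # commands from outcomes
    loss = nn.functional.mse_loss(pred_cmds, motor_cmds[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()

# At test time: feed desired fingertip positions, get motor commands directly,
# with no joint encoders in the loop.
```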
April 18, 2025 at 6:53 PM
When life gives you lemons, you pick them up.

(trained with robotutilitymodels.com)
March 28, 2025 at 4:02 AM
Point Policy uses sparse key points to represent both human demonstrators and robots, bridging the morphology gap. The scene is thus encoded through semantically meaningful key points obtained from minimal human annotation.
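As a rough illustration (not the Point Policy implementation), a policy over sparse key points might look like the sketch below: stack tracked object points and end-effector points over a short history and map them to actions. All dimensions and names are assumed.

```python
# Illustrative sketch: a policy that consumes sparse key points instead of
# raw images. Point tracking is assumed to happen upstream.
import torch
import torch.nn as nn

NUM_OBJECT_POINTS = 6   # semantic points annotated once on the object
NUM_ROBOT_POINTS = 3    # points shared between the human hand and the robot end-effector
HISTORY = 4             # short history of key-point positions
ACTION_DIM = 7          # e.g., 6-DoF end-effector delta + gripper

class KeyPointPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        obs_dim = (NUM_OBJECT_POINTS + NUM_ROBOT_POINTS) * 3 * HISTORY
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, ACTION_DIM),
        )

    def forward(self, keypoints):
        # keypoints: (batch, HISTORY, NUM_OBJECT_POINTS + NUM_ROBOT_POINTS, 3)
        return self.net(keypoints.flatten(start_dim=1))

policy = KeyPointPolicy()
obs = torch.rand(1, HISTORY, NUM_OBJECT_POINTS + NUM_ROBOT_POINTS, 3)
action = policy(obs)  # same observation format for human demos and the robot
```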
February 28, 2025 at 7:09 PM
The robot behaviors shown below are trained without any teleop, sim2real, genai, or motion planning. Simply show the robot a few examples of doing the task yourself, and our new method, called Point Policy, spits out a robot-compatible policy!
February 28, 2025 at 7:09 PM
We just released AnySense, an iPhone app for effortless data acquisition and streaming for robotics. We leverage Apple’s development frameworks to record and stream:

1. RGBD + Pose data
2. Audio from the mic or custom contact microphones
3. Seamless Bluetooth integration for external sensors
February 26, 2025 at 3:14 PM
Just found a new winner for the most hype-baiting, unscientific plot I have seen. (From the recent Figure AI release)
February 20, 2025 at 10:01 PM
At NYU Abu Dhabi today and in love with how cat-friendly the campus is!
December 18, 2024 at 4:39 AM
P3-PO uses a one-time “point prescription” by a human to identify key points. After this, it uses semantic correspondence to find the same points on different instances of the same object.
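A minimal sketch of the semantic-correspondence step, assuming some DINO-style dense feature extractor (the extract_dense_features placeholder below): match each prescribed point to its nearest neighbor in the new image's feature map. This is illustrative, not the P3-PO code.

```python
# Illustrative sketch of semantic correspondence: transfer a prescribed point
# from a reference image to a new object instance by nearest-neighbor matching
# in a dense feature map. `extract_dense_features` is a placeholder for any
# DINO-style dense feature extractor.
import torch
import torch.nn.functional as F

def extract_dense_features(image):
    """Placeholder: return a (H_feat, W_feat, C) per-patch feature map."""
    raise NotImplementedError

def transfer_point(ref_image, ref_point, new_image):
    """Find the location in new_image that semantically matches ref_point (row, col)
    given in feature-map coordinates of ref_image."""
    ref_feats = extract_dense_features(ref_image)        # (H, W, C)
    new_feats = extract_dense_features(new_image)        # (H, W, C)

    # Feature vector of the prescribed point in the reference image.
    y, x = ref_point
    query = F.normalize(ref_feats[y, x], dim=-1)         # (C,)

    # Cosine similarity against every location in the new image.
    H, W, C = new_feats.shape
    flat = F.normalize(new_feats.reshape(-1, C), dim=-1)  # (H*W, C)
    sims = flat @ query                                     # (H*W,)

    best = sims.argmax().item()
    return divmod(best, W)   # (row, col) of the matched point
```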
December 10, 2024 at 8:33 PM
New paper! We show that by using a keypoint-based image representation, robot policies become robust to different object types and background changes.

We call this method Prescriptive Point Priors for robot Policies, or P3-PO for short. Full project is here: point-priors.github.io
December 10, 2024 at 8:32 PM
BAKU consists of three modules:
1. Sensor encoders for vision, language, and state
2. Observation trunk to fuse multimodal inputs
3. Action head for predicting actions.

This allows BAKU to combine different action models like VQ-BeT and Diffusion Policy under one framework.
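As a rough sketch of that modular layout (assumed dimensions and names, not the released BAKU code): per-modality encoders project to a shared token size, a small transformer trunk fuses the tokens, and the action head is a swappable module.

```python
# Illustrative sketch of the three-module layout: modality encoders ->
# observation trunk -> swappable action head. Dimensions are assumptions.
import torch
import torch.nn as nn

EMB = 256

class Encoders(nn.Module):
    """One encoder per input modality, all projecting to a shared token size."""
    def __init__(self, state_dim=8, lang_dim=384):
        super().__init__()
        self.vision = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 128, EMB))
        self.language = nn.Linear(lang_dim, EMB)
        self.state = nn.Linear(state_dim, EMB)

    def forward(self, image, lang_emb, state):
        return torch.stack(
            [self.vision(image), self.language(lang_emb), self.state(state)], dim=1
        )  # (batch, 3 tokens, EMB)

class ObservationTrunk(nn.Module):
    """Fuses the modality tokens with a small transformer."""
    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=EMB, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, tokens):
        return self.encoder(tokens).mean(dim=1)  # (batch, EMB)

class MLPActionHead(nn.Module):
    """Simple deterministic head; could be swapped for a VQ-BeT or diffusion head."""
    def __init__(self, action_dim=7):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(EMB, 256), nn.ReLU(), nn.Linear(256, action_dim))

    def forward(self, features):
        return self.net(features)

class Policy(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoders, self.trunk, self.head = Encoders(), ObservationTrunk(), MLPActionHead()

    def forward(self, image, lang_emb, state):
        return self.head(self.trunk(self.encoders(image, lang_emb, state)))

policy = Policy()
action = policy(torch.rand(1, 3, 128, 128), torch.rand(1, 384), torch.rand(1, 8))
```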
December 9, 2024 at 11:34 PM
Modern policy architectures are unnecessarily complex. In our #NeurIPS2024 project called BAKU, we focus on what really matters for good policy learning.

BAKU is modular, language-conditioned, compatible with multiple sensor streams & action multi-modality, and, importantly, fully open-source!
December 9, 2024 at 11:33 PM
RUMs is the brainchild of @notmahi.bsky.social, with several insightful experiments. The most important finding is that data diversity >> data quantity.

Another insight is that, regardless of the algorithm, there is a similar-ish scaling law across tasks.

Check out the paper: arxiv.org/abs/2409.05865
December 8, 2024 at 2:45 AM
Our awesome undergrad lead on this project, @haritheja.bsky.social, took RUMs to Munich for CoRL 2024 and showed it working zero-shot, opening doors and drawers bought from a German IKEA.
December 8, 2024 at 2:37 AM
There are three main components to build RUMs: diverse expert data + multi-modal behavior cloning + mLLM feedback.

Hardware, code & pretrained policies are fully open-sourced: robotutilitymodels.com
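For intuition, one way the mLLM-feedback piece could be wired at deployment is a verify-and-retry loop like the sketch below; run_policy, ask_mllm, and the robot methods are placeholders, not the released RUMs code.

```python
# Illustrative sketch of mLLM feedback at deployment: run the policy, ask a
# multimodal LLM whether the attempt succeeded, and retry on failure.
def ask_mllm(image, task):
    """Placeholder: query a multimodal LLM, return True if the task looks done."""
    raise NotImplementedError

def run_policy(robot, task):
    """Placeholder: execute the pretrained utility model for this task."""
    raise NotImplementedError

def deploy_with_retries(robot, task, max_attempts=3):
    for attempt in range(max_attempts):
        run_policy(robot, task)
        if ask_mllm(robot.capture_image(), task):
            return True        # mLLM judges the task as completed
        robot.reset_to_home()  # back off and try again
    return False
```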
December 8, 2024 at 2:34 AM
Since we are nearing the end of the year, I'll revisit some of our work I'm most excited about from the last year, and maybe give a sneak peek of what we are up to next.

To start off: Robot Utility Models, which enable zero-shot deployment. In the video below, the robot hasn't seen these doors before.
December 8, 2024 at 2:32 AM
We got stranded for a day in rural NY without electricity, running water, heat, internet, or cell service. It is crazy how difficult it is to live without these relatively modern inventions.

I hope one day robots will join this list.
November 24, 2024 at 2:38 PM
In our latest project, we train robots to mimic a human video of a task by matching object features using RL. We only need one video and under an hour of robot training.

Project was led by Irmak Guzey w/ Yinlong Dai, Georgy Savva and Raunaq Bhirangi.

More details: object-rewards.github.io
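As a rough sketch (not the released implementation), an object-centric reward of this kind can be as simple as the negative distance between the object's tracked key points in the robot rollout and their positions at the matched timestep of the human video:

```python
# Illustrative sketch of an object-centric reward for RL: compare the object's
# tracked key points in the robot rollout against their positions in the
# single human video. Point tracking and timestep matching happen upstream.
import torch

def object_matching_reward(robot_points, human_points):
    """
    robot_points, human_points: (num_points, 3) object key points at the
    current rollout timestep and the corresponding human-video timestep.
    Reward is higher as the object trajectories align.
    """
    dist = torch.linalg.norm(robot_points - human_points, dim=-1).mean()
    return -dist  # negative mean key-point distance

# Usage (schematic): at each RL step, track object points in the robot's
# camera, look up the matched timestep of the human video, and add
# object_matching_reward(...) to the agent's reward.
```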
November 1, 2024 at 3:29 PM
It is really hard to get robot policies that are both precise (small margins for error) and general (robust to variations).

We just released ViSk, where skin sensing is used to train fine-grained policies with ~1 hour of data. I have attached a single-take video to this post.

visuoskin.github.io
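For a rough picture of what a visuo-skin policy could look like (assumed names and dimensions, not the ViSk code): embed the skin readings and the image separately, concatenate the embeddings, and predict actions.

```python
# Illustrative sketch of a visuo-tactile policy: embed skin readings and an
# image separately, then fuse them before the action head.
import torch
import torch.nn as nn

SKIN_DIM = 30     # assumed flattened skin-sensor reading
ACTION_DIM = 7    # assumed end-effector action dimension

class VisuoTactilePolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.image_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 128, 256), nn.ReLU())
        self.skin_enc = nn.Sequential(nn.Linear(SKIN_DIM, 64), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(256 + 64, 256), nn.ReLU(), nn.Linear(256, ACTION_DIM))

    def forward(self, image, skin):
        fused = torch.cat([self.image_enc(image), self.skin_enc(skin)], dim=-1)
        return self.head(fused)

policy = VisuoTactilePolicy()
action = policy(torch.rand(1, 3, 128, 128), torch.rand(1, SKIN_DIM))
```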
October 25, 2024 at 5:57 PM