Lerrel Pinto
@lerrelpinto.com
lerrelpinto.com
Assistant Professor of CS @nyuniversity.

I like robots!
We just released RUKA, a $1300 humanoid hand that is 3D-printable, strong, precise, and fully open sourced!

The key technical breakthrough is that we can control the joints and fingertips of the robot **without joint encoders**. All we need is self-supervised data collection and learning.
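For intuition, here is a minimal sketch of one way encoder-free control could work: drive the hand with random motor commands while recording the resulting fingertip positions, then regress the inverse mapping. Everything here (shapes, names, the data-collection step) is an assumption for illustration, not the RUKA codebase.

```python
# Illustrative sketch only: learn to command fingertips without joint encoders
# by regressing observed fingertip positions -> motor commands from
# self-collected data. Names, shapes, and the data source are assumptions.
import torch
import torch.nn as nn

NUM_MOTORS = 11          # assumed number of tendon-driving motors
FINGERTIP_DIM = 5 * 3    # 5 fingertips, 3D position each

class InverseController(nn.Module):
    """Maps desired fingertip positions to motor commands."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FINGERTIP_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, NUM_MOTORS),
        )

    def forward(self, fingertip_targets):
        return self.net(fingertip_targets)

# Self-supervised data: random motor commands and the fingertip positions they
# produce (e.g., recorded with a motion-capture glove). Fake data here so the
# sketch runs end to end.
motor_cmds = torch.rand(10_000, NUM_MOTORS)
fingertips = torch.rand(10_000, FINGERTIP_DIM)

model = InverseController()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1_000):
    idx = torch.randint(0, len(fingertips), (256,))
    pred_cmds = model(fingertips[idx])                    # commands from outcomes
    loss = nn.functional.mse_loss(pred_cmds, motor_cmds[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()

# At test time: feed desired fingertip positions, get motor commands directly,
# with no joint encoders in the loop.
```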
April 18, 2025 at 6:53 PM
When life gives you lemons, you pick them up.

(trained with robotutilitymodels.com)
March 28, 2025 at 4:02 AM
Point Policy uses sparse key points to represent both human demonstrators and robots, bridging the morphology gap. The scene is thus encoded through semantically meaningful key points obtained from minimal human annotation.
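As a rough illustration (not the Point Policy implementation), a policy over sparse key points might look like the sketch below: stack tracked object points and end-effector points over a short history and map them to actions. All dimensions and names are assumed.

```python
# Illustrative sketch: a policy that consumes sparse key points instead of
# raw images. Point tracking is assumed to happen upstream.
import torch
import torch.nn as nn

NUM_OBJECT_POINTS = 6   # semantic points annotated once on the object
NUM_ROBOT_POINTS = 3    # points shared between the human hand and the robot end-effector
HISTORY = 4             # short history of key-point positions
ACTION_DIM = 7          # e.g., 6-DoF end-effector delta + gripper

class KeyPointPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        obs_dim = (NUM_OBJECT_POINTS + NUM_ROBOT_POINTS) * 3 * HISTORY
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, ACTION_DIM),
        )

    def forward(self, keypoints):
        # keypoints: (batch, HISTORY, NUM_OBJECT_POINTS + NUM_ROBOT_POINTS, 3)
        return self.net(keypoints.flatten(start_dim=1))

policy = KeyPointPolicy()
obs = torch.rand(1, HISTORY, NUM_OBJECT_POINTS + NUM_ROBOT_POINTS, 3)
action = policy(obs)  # same observation format for human demos and the robot
```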
February 28, 2025 at 7:09 PM
The robot behaviors shown below are trained without any teleop, sim2real, genai, or motion planning. Simply show the robot a few examples of doing the task yourself, and our new method, called Point Policy, spits out a robot-compatible policy!
February 28, 2025 at 7:09 PM
We just released AnySense, an iPhone app for effortless data acquisition and streaming for robotics. We leverage Apple’s development frameworks to record and stream:

1. RGBD + Pose data
2. Audio from the mic or custom contact microphones
3. Seamless Bluetooth integration for external sensors
February 26, 2025 at 3:14 PM
Just found a new winner for the most hype-baiting, unscientific plot I have seen. (From the recent Figure AI release)
February 20, 2025 at 10:01 PM
At NYU Abu Dhabi today and in love with how cat-friendly the campus is!
December 18, 2024 at 4:39 AM
P3-PO uses a one-time “point prescription” by a human to identify key points. After this, it uses semantic correspondence to find the same points on different instances of the same object.
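A minimal sketch of the semantic-correspondence step, assuming some DINO-style dense feature extractor (the extract_dense_features placeholder below): match each prescribed point to its nearest neighbor in the new image's feature map. This is illustrative, not the P3-PO code.

```python
# Illustrative sketch of semantic correspondence: transfer a prescribed point
# from a reference image to a new object instance by nearest-neighbor matching
# in a dense feature map. `extract_dense_features` is a placeholder for any
# DINO-style dense feature extractor.
import torch
import torch.nn.functional as F

def extract_dense_features(image):
    """Placeholder: return a (H_feat, W_feat, C) per-patch feature map."""
    raise NotImplementedError

def transfer_point(ref_image, ref_point, new_image):
    """Find the location in new_image that semantically matches ref_point (row, col)
    given in feature-map coordinates of ref_image."""
    ref_feats = extract_dense_features(ref_image)        # (H, W, C)
    new_feats = extract_dense_features(new_image)        # (H, W, C)

    # Feature vector of the prescribed point in the reference image.
    y, x = ref_point
    query = F.normalize(ref_feats[y, x], dim=-1)         # (C,)

    # Cosine similarity against every location in the new image.
    H, W, C = new_feats.shape
    flat = F.normalize(new_feats.reshape(-1, C), dim=-1)  # (H*W, C)
    sims = flat @ query                                     # (H*W,)

    best = sims.argmax().item()
    return divmod(best, W)   # (row, col) of the matched point
```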
December 10, 2024 at 8:33 PM
New paper! We show that by using a keypoint-based image representation, robot policies become robust to different object types and background changes.

We call this method Prescriptive Point Priors for robot Policies, or P3-PO for short. Full project is here: point-priors.github.io
December 10, 2024 at 8:32 PM
BAKU consists of three modules:
1. Sensor encoders for vision, language, and state
2. Observation trunk to fuse multimodal inputs
3. Action head for predicting actions.

This allows BAKU to combine different action models like VQ-BeT and Diffusion Policy under one framework.
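As a rough sketch of that modular layout (assumed dimensions and names, not the released BAKU code): per-modality encoders project to a shared token size, a small transformer trunk fuses the tokens, and the action head is a swappable module.

```python
# Illustrative sketch of the three-module layout: modality encoders ->
# observation trunk -> swappable action head. Dimensions are assumptions.
import torch
import torch.nn as nn

EMB = 256

class Encoders(nn.Module):
    """One encoder per input modality, all projecting to a shared token size."""
    def __init__(self, state_dim=8, lang_dim=384):
        super().__init__()
        self.vision = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 128, EMB))
        self.language = nn.Linear(lang_dim, EMB)
        self.state = nn.Linear(state_dim, EMB)

    def forward(self, image, lang_emb, state):
        return torch.stack(
            [self.vision(image), self.language(lang_emb), self.state(state)], dim=1
        )  # (batch, 3 tokens, EMB)

class ObservationTrunk(nn.Module):
    """Fuses the modality tokens with a small transformer."""
    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=EMB, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, tokens):
        return self.encoder(tokens).mean(dim=1)  # (batch, EMB)

class MLPActionHead(nn.Module):
    """Simple deterministic head; could be swapped for a VQ-BeT or diffusion head."""
    def __init__(self, action_dim=7):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(EMB, 256), nn.ReLU(), nn.Linear(256, action_dim))

    def forward(self, features):
        return self.net(features)

class Policy(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoders, self.trunk, self.head = Encoders(), ObservationTrunk(), MLPActionHead()

    def forward(self, image, lang_emb, state):
        return self.head(self.trunk(self.encoders(image, lang_emb, state)))

policy = Policy()
action = policy(torch.rand(1, 3, 128, 128), torch.rand(1, 384), torch.rand(1, 8))
```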
December 9, 2024 at 11:34 PM
Modern policy architectures are unnecessarily complex. In our #NeurIPS2024 project called BAKU, we focus on what really matters for good policy learning.

BAKU is modular, language-conditioned, compatible with multiple sensor streams & action multi-modality, and, importantly, fully open-source!
December 9, 2024 at 11:33 PM
RUMs is the brainchild of @notmahi.bsky.social, with several insightful experiments. The most important finding is that data diversity >> data quantity.

Another insight is that, regardless of the algorithm, there is a similar-ish scaling law across tasks.

Check out the paper: arxiv.org/abs/2409.05865
December 8, 2024 at 2:45 AM
Our awesome undergrad lead on this project, @haritheja.bsky.social, took RUMs to Munich for CoRL 2024 and showed it working zero-shot, opening doors and drawers bought from a German IKEA.
December 8, 2024 at 2:37 AM
There are three main components to build RUMs: diverse expert data + multi-modal behavior cloning + mLLM feedback.

Hardware, code & pretrained policies are fully open-sourced: robotutilitymodels.com
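For intuition, one way the mLLM-feedback piece could be wired at deployment is a verify-and-retry loop like the sketch below; run_policy, ask_mllm, and the robot methods are placeholders, not the released RUMs code.

```python
# Illustrative sketch of mLLM feedback at deployment: run the policy, ask a
# multimodal LLM whether the attempt succeeded, and retry on failure.
def ask_mllm(image, task):
    """Placeholder: query a multimodal LLM, return True if the task looks done."""
    raise NotImplementedError

def run_policy(robot, task):
    """Placeholder: execute the pretrained utility model for this task."""
    raise NotImplementedError

def deploy_with_retries(robot, task, max_attempts=3):
    for attempt in range(max_attempts):
        run_policy(robot, task)
        if ask_mllm(robot.capture_image(), task):
            return True        # mLLM judges the task as completed
        robot.reset_to_home()  # back off and try again
    return False
```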
December 8, 2024 at 2:34 AM
Since we are nearing the end of the year, I'll revisit some of our work I'm most excited about from the last year, and maybe give a sneak peek of what we are up to next.

To start off: Robot Utility Models, which enable zero-shot deployment. In the video below, the robot hasn't seen these doors before.
December 8, 2024 at 2:32 AM
We got stranded for a day in rural NY without electricity, running water, heat, internet, or cell service. It is crazy how difficult it is to live without these relatively modern inventions.

I hope one day robots will join this list.
November 24, 2024 at 2:38 PM
In our latest project, we train robots to mimic a human video of a task by matching object features using RL. We only need one video and under an hour of robot training.

Project was led by Irmak Guzey w/ Yinlong Dai, Georgy Savva and Raunaq Bhirangi.

More details: object-rewards.github.io
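As a rough sketch (not the released implementation), an object-centric reward of this kind can be as simple as the negative distance between the object's tracked key points in the robot rollout and their positions at the matched timestep of the human video:

```python
# Illustrative sketch of an object-centric reward for RL: compare the object's
# tracked key points in the robot rollout against their positions in the
# single human video. Point tracking and timestep matching happen upstream.
import torch

def object_matching_reward(robot_points, human_points):
    """
    robot_points, human_points: (num_points, 3) object key points at the
    current rollout timestep and the corresponding human-video timestep.
    Reward is higher as the object trajectories align.
    """
    dist = torch.linalg.norm(robot_points - human_points, dim=-1).mean()
    return -dist  # negative mean key-point distance

# Usage (schematic): at each RL step, track object points in the robot's
# camera, look up the matched timestep of the human video, and add
# object_matching_reward(...) to the agent's reward.
```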
November 1, 2024 at 3:29 PM
It is really hard to get robot policies that are both precise (small margins for error) and general (robust to variations).

We just released ViSk, where skin sensing is used to train fine-grained policies with ~1 hour of data. I have attached a single-take video to this post.

visuoskin.github.io
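For a rough picture of what a visuo-skin policy could look like (assumed names and dimensions, not the ViSk code): embed the skin readings and the image separately, concatenate the embeddings, and predict actions.

```python
# Illustrative sketch of a visuo-tactile policy: embed skin readings and an
# image separately, then fuse them before the action head.
import torch
import torch.nn as nn

SKIN_DIM = 30     # assumed flattened skin-sensor reading
ACTION_DIM = 7    # assumed end-effector action dimension

class VisuoTactilePolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.image_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 128, 256), nn.ReLU())
        self.skin_enc = nn.Sequential(nn.Linear(SKIN_DIM, 64), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(256 + 64, 256), nn.ReLU(), nn.Linear(256, ACTION_DIM))

    def forward(self, image, skin):
        fused = torch.cat([self.image_enc(image), self.skin_enc(skin)], dim=-1)
        return self.head(fused)

policy = VisuoTactilePolicy()
action = policy(torch.rand(1, 3, 128, 128), torch.rand(1, SKIN_DIM))
```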
October 25, 2024 at 5:57 PM