@pablovelagomez.bsky.social
From 8 -> 5 -> 4 exocentric cameras, all visualized with @rerundotio. I'm reducing the number of cameras and collecting my own data to make sure I'm not overfitting to open-source datasets.
October 10, 2025 at 8:25 PM
It's finally done: I've finished ripping out my full-body pipeline and replacing it with a hands-only version. Critical for making it work in a lot more scenarios! I've visualized the final predictions with @rerundotio!
September 29, 2025 at 1:00 PM
If you're not labeling your own data, you're NGMI. I take this seriously, so I finished building the first version of my hand-tracking annotation app using rerun.io and gradio.app.
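The post doesn't include the app code, but the core click-to-label loop is easy to sketch. A minimal version, assuming gradio's image select event and the rerun Python SDK (the entity paths and the in-memory keypoint list are my own invention, not his app's structure):

```python
import gradio as gr
import rerun as rr

rr.init("hand_annotation", spawn=True)  # stream every log call to a local Rerun viewer

keypoints: list[tuple[int, int]] = []  # clicked (x, y) pixel coordinates for one frame

def on_click(image, evt: gr.SelectData):
    """Record one clicked hand keypoint and mirror image + labels into Rerun."""
    x, y = evt.index  # pixel position of the click
    keypoints.append((x, y))
    rr.log("frame/image", rr.Image(image))
    rr.log("frame/keypoints", rr.Points2D(keypoints, radii=4))
    return f"{len(keypoints)} keypoints labeled"

with gr.Blocks() as demo:
    img = gr.Image(label="Click hand joints")
    status = gr.Textbox(label="Status")
    img.select(on_click, inputs=img, outputs=status)

demo.launch()
```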
September 19, 2025 at 5:00 PM
Unfortunately, I've identified some serious issues with the hand tracker that necessitate manual intervention. So I decided the next best course of action was to build a labeling app with @rerun.io and @gradio-hf.bsky.social as I transition from my full-body solution to a hands-only one.
September 15, 2025 at 5:01 PM
One of the failure modes for my pipeline is the full-body tracker. This is a huge issue for ego views, where it falls apart. I've been looking for a solution that generalizes to ego/exo, and I think I found it!
August 29, 2025 at 4:43 PM
Go check out ethz-vlg.github.io/mvtracker/. It's a really clever idea for multiview tracking, and they made an awesome @rerun.io integration on their project page!
August 29, 2025 at 1:44 PM
This is a big one! I'm almost done with my MVP, built with @rerun.io and @gradio-hf.bsky.social! (Rough sketch of the pipeline after the TLDR.)

TLDR:
Inputs
- 2-N synchronized videos in a zip file

Outputs
- Multiview-consistent depth maps + point clouds (thanks VGGT + MoGe-2!)
- Depth confidence values
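A minimal sketch of that input/output shape, with `predict_depth` as a stand-in for the real VGGT / MoGe-2 inference (everything else uses only stock zipfile/OpenCV/rerun calls):

```python
from pathlib import Path
import zipfile

import cv2
import numpy as np
import rerun as rr

rr.init("multiview_mvp", spawn=True)

with zipfile.ZipFile("videos.zip") as zf:  # the 2-N synchronized videos
    zf.extractall("videos")

def predict_depth(rgb: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Placeholder for the real VGGT / MoGe-2 step: returns (depth, confidence)."""
    h, w = rgb.shape[:2]
    return np.ones((h, w), np.float32), np.ones((h, w), np.float32)

for cam_id, path in enumerate(sorted(Path("videos").glob("*.mp4"))):
    cap = cv2.VideoCapture(str(path))
    frame_idx = 0
    while True:
        ok, bgr = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
        depth, conf = predict_depth(rgb)
        rr.set_time_sequence("frame", frame_idx)  # shared timeline across cameras
        rr.log(f"cam{cam_id}/image", rr.Image(rgb))
        rr.log(f"cam{cam_id}/depth", rr.DepthImage(depth, meter=1.0))
        rr.log(f"cam{cam_id}/confidence", rr.Image(conf))
        frame_idx += 1
```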
August 22, 2025 at 4:38 PM
✨ Massive Pipeline Refactor → One Framework for Ego + Exo Datasets, Visualized with @rerun.io 🚀

After the refactor, my entire egocentric/exocentric pipeline is now modular. One codebase handles different sensor layouts and outputs a unified, multimodal timeseries file that you can open in Rerun.
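That unified file is presumably a Rerun recording (.rrd), which you can reopen later with `rerun session.rrd`. A toy sketch of the idea with invented data shapes (21 hand keypoints, a 4-camera exo rig) standing in for the real pipeline's records:

```python
import numpy as np
import rerun as rr

rr.init("ego_exo_pipeline")
rr.save("session.rrd")  # every log call below is written into one portable file

rng = np.random.default_rng(0)
for t in range(100):  # dummy stand-in for the real decoded, synchronized streams
    rr.set_time_sequence("frame", t)
    rr.log("ego/image", rr.Image(rng.integers(0, 255, (120, 160, 3), np.uint8)))
    for cam in range(4):  # exocentric rig; the camera count varies per dataset
        rr.log(f"exo/cam{cam}/image", rr.Image(rng.integers(0, 255, (120, 160, 3), np.uint8)))
    rr.log("hands/keypoints", rr.Points3D(rng.normal(size=(21, 3))))
```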
June 26, 2025 at 1:40 PM
Streaming iPhone data in real time directly to @rerun.io 🚀
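The transport isn't shown in the post; a minimal receiver, assuming the phone pushes length-prefixed JPEG frames over TCP (that wire format is my assumption, not his):

```python
import socket
import struct

import cv2
import numpy as np
import rerun as rr

rr.init("iphone_stream", spawn=True)  # frames appear live in the viewer as they arrive

srv = socket.socket()
srv.bind(("0.0.0.0", 9000))
srv.listen(1)
conn, _ = srv.accept()

def recv_exact(n: int) -> bytes:
    """Read exactly n bytes from the connection."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("stream closed")
        buf += chunk
    return buf

frame_idx = 0
while True:
    # Assumed framing: 4-byte big-endian length prefix, then one JPEG frame.
    (size,) = struct.unpack(">I", recv_exact(4))
    jpeg = np.frombuffer(recv_exact(size), np.uint8)
    bgr = cv2.imdecode(jpeg, cv2.IMREAD_COLOR)
    rr.set_time_sequence("frame", frame_idx)
    rr.log("iphone/image", rr.Image(cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)))
    frame_idx += 1
```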
May 22, 2025 at 4:54 PM
MVP of Multiview Video → Camera parameters + 3D keypoints. Visualized with @rerun.io
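What logging that output could look like with the rerun Python SDK, using dummy calibration values in place of the estimated ones:

```python
import numpy as np
import rerun as rr

rr.init("multiview_calib", spawn=True)

# Dummy extrinsics for two cameras; swap in the real estimates.
extrinsics = [(np.eye(3), np.zeros(3)),
              (np.eye(3), np.array([1.0, 0.0, 0.0]))]
for cam, (R, t) in enumerate(extrinsics):
    rr.log(f"world/cam{cam}", rr.Transform3D(translation=t, mat3x3=R))
    rr.log(f"world/cam{cam}/image", rr.Pinhole(focal_length=600.0, width=1280, height=720))

# Triangulated 3D keypoints live in the shared world frame.
rr.log("world/keypoints", rr.Points3D(np.random.default_rng(0).normal(size=(17, 3))))
```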
May 9, 2025 at 3:09 PM
First signs of life on my self-collected data =]

I still need to add synchronization between the exo/ego videos, but all of the work from the past few months is coming together.

This shows a synchronization app I'm building with @rerun.io and @gradio-hf.bsky.social. More to come!
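Once the app produces a per-video frame offset (the numbers below are hypothetical), applying it is just a shift on a shared Rerun timeline:

```python
import numpy as np
import rerun as rr

rr.init("sync_preview", spawn=True)

# Per-video frame offsets found by scrubbing in the app (hypothetical values).
offsets = {"ego": 0, "exo0": 37, "exo1": 41}

def log_video(name: str, frames: np.ndarray, offset: int) -> None:
    """Shift each stream so frame k of every video lands on one shared timeline."""
    for k, rgb in enumerate(frames):
        rr.set_time_sequence("synced_frame", k - offset)
        rr.log(f"{name}/image", rr.Image(rgb))

# Dummy stand-ins for the decoded ego/exo videos.
dummy = np.zeros((5, 120, 160, 3), np.uint8)
for name, offset in offsets.items():
    log_video(name, dummy, offset)
```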
May 5, 2025 at 8:47 PM
Trying to wrap my head around fwd/bwd kinematics for imitation learning, so I built a fully differentiable kinematic hand skeleton in JAX and visualized it with @rerun.io's new callback system in a Jupyter notebook. This shows each joint angle and how it impacts the kinematic skeleton.
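A single-finger toy version of that idea (the real skeleton has ~21 joints; the bone lengths and flexion-about-z convention here are illustrative): compose one rotation per joint, accumulate positions down the chain, and let JAX differentiate the fingertip with respect to every angle.

```python
import jax
import jax.numpy as jnp

BONE_LENGTHS = jnp.array([0.05, 0.03, 0.02])  # proximal/middle/distal, in meters

def rot_z(theta):
    """Rotation matrix for a flexion of `theta` radians about the z axis."""
    c, s = jnp.cos(theta), jnp.sin(theta)
    return jnp.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def forward_kinematics(angles: jnp.ndarray) -> jnp.ndarray:
    """Map per-joint angles to the 3D position of every joint in the chain."""
    pos, rot = jnp.zeros(3), jnp.eye(3)
    joints = [pos]
    for theta, length in zip(angles, BONE_LENGTHS):
        rot = rot @ rot_z(theta)                         # accumulate rotation down the chain
        pos = pos + rot @ jnp.array([length, 0.0, 0.0])  # step along the rotated bone
        joints.append(pos)
    return jnp.stack(joints)

angles = jnp.array([0.3, 0.5, 0.4])
joints = forward_kinematics(angles)  # (4, 3): base + three joints, ready to log
# Jacobian of the fingertip w.r.t. all joint angles: the gradient IK needs.
J = jax.jacobian(lambda a: forward_kinematics(a)[-1])(angles)
```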
May 2, 2025 at 8:59 PM
I added export settings to get the images/masks/camera parameters in NerfStudio format. I can chuck these into Brush from @arthurperpixel.bsky.social and get a splat on my M4.

Right now, such sparse input doesn't produce great results, but the depth maps aren't being used yet, and they should help once they are.
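The NerfStudio side of that export is a single transforms.json. A sketch of the writer, assuming shared intrinsics at the top level (NerfStudio also accepts per-frame intrinsics) and 4x4 camera-to-world poses:

```python
import json

import numpy as np

def export_nerfstudio(out_path, w, h, fx, fy, cx, cy, poses, image_paths, mask_paths):
    """Write a NerfStudio-style transforms.json; `poses` are 4x4 camera-to-world."""
    frames = [
        {
            "file_path": img,
            "mask_path": mask,
            "transform_matrix": np.asarray(c2w).tolist(),
        }
        for img, mask, c2w in zip(image_paths, mask_paths, poses)
    ]
    meta = {
        "camera_model": "OPENCV",
        "w": w, "h": h,
        "fl_x": fx, "fl_y": fy,
        "cx": cx, "cy": cy,
        "frames": frames,
    }
    with open(out_path, "w") as f:
        json.dump(meta, f, indent=2)
```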
April 25, 2025 at 4:36 PM
@rerun.io v0.23 is finally out! 🎉 I’ve extended my @gradio-hf.bsky.social annotation pipeline to support multiview videos using the callback system introduced in 0.23.
April 24, 2025 at 2:20 PM
I've integrated video-based depth estimation into my robot-training pipeline, visualized with @rerun.io, to make data collection as accessible as possible without requiring specialized hardware.
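The step from an estimated depth map to the point clouds in the video is plain pinhole geometry: each pixel ray K^-1 [u, v, 1] scaled by its depth. A sketch with illustrative intrinsics and a constant dummy depth map:

```python
import numpy as np
import rerun as rr

def depth_to_points(depth: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Unproject a depth map (meters) to an (N, 3) point cloud: z * K^-1 [u, v, 1]."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T       # per-pixel camera rays
    return rays * depth.reshape(-1, 1)    # scale each ray by its depth

rr.init("depth_pointcloud", spawn=True)
K = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])
depth = np.full((480, 640), 2.0, np.float32)  # dummy; swap in the model's output
rr.log("camera/points", rr.Points3D(depth_to_points(depth, K)))
```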
April 17, 2025 at 8:08 PM
I extended my previous @rerun.io and @gradio-hf.bsky.social annotation pipeline to multiple views. You can see how powerful this is when combining Meta's Segment Anything with multi-view geometry. By annotating only 2 views, I can triangulate into the other 6 and get masks extremely quickly!
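The geometry behind that trick, with the SAM prompting step omitted: triangulate the points clicked in the 2 annotated views, then reproject them into the remaining cameras (all matrices are assumed to be calibrated 3x4 projections):

```python
import cv2
import numpy as np

def triangulate_and_project(P1, P2, pts1, pts2, other_Ps):
    """Triangulate (N, 2) float pixel points from two calibrated views,
    then reproject the recovered 3D points into every other camera."""
    # cv2.triangulatePoints takes 2xN arrays and returns 4xN homogeneous points.
    X_h = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    X = (X_h[:3] / X_h[3]).T                             # (N, 3) world points
    reprojected = []
    for P in other_Ps:                                   # each P is a 3x4 projection
        x_h = (P @ np.vstack([X.T, np.ones(len(X))])).T  # homogeneous pixel coords
        reprojected.append(x_h[:, :2] / x_h[:, 2:3])     # perspective divide
    return X, reprojected
```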
April 10, 2025 at 5:00 PM
Here’s a sneak peek using @rerun.io and @gradio-hf.bsky.social for data annotation. It uses Video Depth Anything and Segment Anything 2 under the hood to generate segmentation masks and depth maps/point clouds. More to share next week.
April 1, 2025 at 7:13 PM
Continuing with my robot-training data collection pipeline, one of the challenges has been obtaining calibrated cameras from sparse multi-view inputs. I previously worked with DUSt3R but found its accuracy insufficient for generating reliable camera parameters.
March 25, 2025 at 5:57 PM
More progress towards building a straightforward method to collect first-person (ego) and third-person (exo) data for robotic training in @rerun.io. I’ve been using the HO-cap dataset to establish a baseline, and here are some updates I’ve made (code at the end)
March 18, 2025 at 3:32 PM
Finally finished porting MASt3R-SLAM to @rerun.io and adding a @gradio-hf.bsky.social interface. Really cool to see it running on any video I throw at it. I've included the code at the end!
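His actual code is linked at the end of the thread; the overall shape of such a wrapper is roughly this, with `run_slam` as a placeholder for the real MASt3R-SLAM call:

```python
import gradio as gr
import rerun as rr

def run_slam(video_path: str) -> str:
    """Placeholder for the real MASt3R-SLAM call; logs results into an .rrd file."""
    rr.init("mast3r_slam")
    rr.save("result.rrd")  # everything logged below lands in this file
    # ... run SLAM on video_path, rr.log() camera poses and points per frame ...
    return "result.rrd"

demo = gr.Interface(
    fn=run_slam,
    inputs=gr.Video(label="Any video"),
    outputs=gr.File(label="Rerun recording (.rrd)"),
)
demo.launch()
```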
March 7, 2025 at 9:52 PM
I'm working towards an easy method to collect a combined third-person and first-person pose dataset, starting from Meta's Assembly101, with near real-time performance via @rerun.io visualization. The end goal is robot imitation learning with Hugging Face LeRobot.
February 24, 2025 at 4:02 PM
Following up on my Prompt Depth Anything post, I'm starting a bit of a miniseries where I go through the LeRobot tutorials to better understand how I can get a real robot to work on my custom dataset. Using @rerun.io to visualize.
code: github.com/rerun-io/pi0...
February 11, 2025 at 5:09 PM
Recently, I've been playing with my iPhone's ToF sensor, but the problem has always been the abysmal resolution (256x192). The team behind Depth Anything released Prompt Depth Anything, which fixes this. Using @rerun.io to visualize. Links at the end of the thread.
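For context, logging a raw ToF frame at that resolution with the rerun SDK (the focal length below is invented; a Pinhole on the parent entity lets the viewer backproject the depth image into a point cloud):

```python
import numpy as np
import rerun as rr

rr.init("iphone_tof", spawn=True)

h, w = 192, 256                           # the iPhone ToF resolution in question
depth = np.full((h, w), 1.5, np.float32)  # dummy frame; real data comes off the phone

rr.log("tof", rr.Pinhole(focal_length=210.0, width=w, height=h))  # focal length made up
rr.log("tof/depth", rr.DepthImage(depth, meter=1.0))              # depth already in meters
```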
February 3, 2025 at 1:18 PM