Zubair Irshad
@zubair-irshad.bsky.social
Research Scientist @Toyota Research Institute | Prev. PhD in AI, ML and CV @GeorgiaTech | Researching 3D Perception, Generative AI for Robotics and Multimodal AI

W: https://zubairirshad.com
Shoutout to the authors of the wonderful CtRNet-X, DUSt3R, Segment Anything, CLIP, and PyTorch3D papers for open-sourcing their codebases to advance science and make this effort happen!

Please check these works out if you haven’t already!
April 24, 2025 at 12:33 AM
We have released our improved extrinsics. Try it out now at droid-dataset.github.io and read more details about it in the updated DROID paper at arxiv.org/abs/2403.12945

This was a fun collaboration with
@vitorguizilini, @SashaKhazatsky and @KarlPertsch!
April 23, 2025 at 11:50 PM
There’s room to improve. Future work could explore:

• Extending to in-the-wild scenes via foundation models for robot segmentation & keypoints.
• Ensembling predictions over time for better temporal consistency.
• Fine-tuning pointmap models on real robot data to handle cluttered tabletops.

8/n
April 23, 2025 at 11:50 PM
Large-scale auto calibration in robotics is challenging, and our pipeline has some limits:

• CtRNet-X is trained on Panda; generalization to other robots is untested.
• DUSt3R struggles with clutter or minimal view overlap.
• Steps 2️⃣ & 3️⃣ may yield false positives in tough lighting or geometry.

7/n
April 23, 2025 at 11:50 PM
Similarly, we plot the distribution of the number of matched points and the cumulative curve after 3️⃣, which helps identify the top quantile of well-calibrated camera pairs within each lab.
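As a rough sketch, selecting the top quantile of camera pairs by matched-point count could look like this (the counts and the 0.75 cutoff are made-up illustrative values, not the thresholds used in the release):

```python
import numpy as np

# Hypothetical matched-point counts for camera pairs in one lab.
matched = np.array([12, 340, 87, 955, 410, 5, 620, 230])

# Keep pairs in the top quartile of matched points (cutoff is illustrative).
thresh = np.quantile(matched, 0.75)
well_calibrated = matched >= thresh
```

The cumulative curve in the post serves the same purpose: it lets you pick a cutoff per lab rather than one global threshold.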

6/n
April 23, 2025 at 11:50 PM
Automatically calibrating a large-scale dataset is challenging. We provide quality-assessment metrics across all three stages, with the flexibility to tighten the bounds for downstream tasks as needed.

Quality metrics for 1️⃣ and 2️⃣ show the IoU and reprojection-error distributions post-calibration.
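For intuition, these two metrics can be computed roughly as follows. This is a hedged sketch with a toy pinhole camera; the function names and values are illustrative, not DROID's actual implementation:

```python
import numpy as np

def reprojection_error(K, T_cb, pts_base, px_obs):
    """Mean pixel distance between observed 2D keypoints and 3D robot
    points (base frame) projected through the estimated base-to-camera
    extrinsics T_cb (4x4) and intrinsics K (3x3)."""
    pts_h = np.hstack([pts_base, np.ones((len(pts_base), 1))])
    cam = (T_cb @ pts_h.T).T[:, :3]      # points in the camera frame
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]          # perspective divide
    return float(np.linalg.norm(uv - px_obs, axis=1).mean())

def mask_iou(a, b):
    """IoU between two boolean segmentation masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

# Toy check: a point on the optical axis projects to the principal point.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
err = reprojection_error(K, np.eye(4), np.array([[0.0, 0.0, 2.0]]),
                         np.array([[320.0, 240.0]]))
iou = mask_iou(np.array([[True, True], [False, False]]),
               np.array([[True, False], [False, False]]))
```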

5/n
April 23, 2025 at 11:50 PM
Below we show the Camera-to-Camera transformations: post-calibration, the alignment of the obtained pointclouds improves!
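Bringing one camera's point cloud into another camera's frame is just a rigid-body transform with the estimated 4x4 camera-to-camera extrinsics. A minimal sketch with toy numbers (not the released code):

```python
import numpy as np

def transform_points(T_ab: np.ndarray, pts_b: np.ndarray) -> np.ndarray:
    """Map Nx3 points from camera B's frame into camera A's frame
    using a 4x4 homogeneous camera-to-camera transform T_ab."""
    pts_h = np.hstack([pts_b, np.ones((pts_b.shape[0], 1))])  # Nx4
    return (T_ab @ pts_h.T).T[:, :3]

# Toy example: camera B sits 1 m along x relative to camera A.
T_ab = np.eye(4)
T_ab[0, 3] = 1.0
pts_b = np.array([[0.0, 0.0, 2.0], [0.5, -0.2, 3.0]])
pts_a = transform_points(T_ab, pts_b)
```

A better extrinsic estimate means the two transformed clouds overlap more tightly, which is what the post-calibration visualization shows.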

4/n
April 23, 2025 at 11:50 PM
We provide:
🤖 ~36k episodes with good-quality extrinsic (camera-to-base) calibration
🦾 ~24k multi-view episodes with good-quality camera-to-camera calibration
✅ Quality assessment metrics for all provided camera poses

3/n
April 23, 2025 at 11:50 PM
To achieve this, we utilize:
1️⃣ Automatic Segment Anything (SAM)-based filtering (Camera-to-Base Calibration)
2️⃣ A fine-tuned CtRNet-X to bring in additional cameras (Camera-to-Base Calibration)
3️⃣ Pretrained DUSt3R with depth-based pose optimization (Camera-to-Camera Calibration)
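Conceptually, the three stages act as quality gates over episodes. A minimal sketch of that gating, with purely hypothetical score names and thresholds (not the released pipeline or its actual values):

```python
# Hypothetical per-episode quality scores from the three stages:
# stage 1 (SAM-filtered mask IoU), stage 2 (CtRNet-X reprojection
# error, pixels), stage 3 (DUSt3R matched-point count).
episodes = [
    {"iou": 0.92, "reproj_px": 2.1, "matches": 800},
    {"iou": 0.40, "reproj_px": 9.5, "matches": 120},
    {"iou": 0.88, "reproj_px": 3.0, "matches": 650},
]

def passes(ep, min_iou=0.8, max_reproj=5.0, min_matches=500):
    """Keep an episode only if it clears all three (illustrative) gates."""
    return (ep["iou"] >= min_iou
            and ep["reproj_px"] <= max_reproj
            and ep["matches"] >= min_matches)

kept = [ep for ep in episodes if passes(ep)]
```

Exposing the raw scores (rather than a single pass/fail flag) is what lets downstream users tighten or loosen the bounds for their own tasks.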

2/n
April 23, 2025 at 11:50 PM
🔗 Learn more & submit your work: robo-3dvlms.github.io

Join us in shaping the future of robotics, 3D vision, and language models! 🤖📚 #CVPR2025
3D Vision Language Models (VLMs) for Robotic Manipulation: Opportunities and Challenges
February 10, 2025 at 5:01 PM
🎤 We’re honored to host top experts in the field:
⭐ Angel Chang (Simon Fraser University)
⭐ Chelsea Finn (Stanford University)
⭐ Hao Su (UC San Diego)
⭐ Katerina Fragkiadaki (CMU)
⭐ Yunzhu Li (Columbia University)
⭐ Ranjay Krishna (University of Washington)

5/N
February 10, 2025 at 5:01 PM
🎯 Key Topics:
✅ 3D Vision-Language Policy Learning
✅ Pretraining for 3D VLMs
✅ 3D Representations for Policy Learning
✅ 3D Benchmarks & Simulation Frameworks
✅ 3D Vision-Language Action Models
✅ 3D Instruction-Tuning & Pretraining Datasets for Robotics

4/N
February 10, 2025 at 5:01 PM
📢 Call for Papers: Submission opens today!
📅 Deadline: April 15, 2025 (11:59 PM PST)
📜 Format: Up to 4 pages (excluding references/appendices), CVPR template, anonymized submissions
🏆 Accepted papers: Poster presentations, with selected papers receiving spotlight talks!

3/N
February 10, 2025 at 5:01 PM
🔍 Explore how 3D perception and language models can enhance robotic manipulation in the era of foundation models. Engage with leading experts and be part of this new frontier in 3D-based VLMs/VLAs for robotics.

2/N
February 10, 2025 at 5:01 PM
Welcome onboard!
January 18, 2025 at 2:08 AM
Done, welcome aboard!
January 17, 2025 at 6:55 PM
Welcome on board!
December 20, 2024 at 9:16 AM
Hello 👋
November 28, 2024 at 2:53 PM
Just included :) Welcome @ajdavison.bsky.social!

go.bsky.app/HcQYMj
November 28, 2024 at 2:51 PM