Yash Bhalgat
ysbhalgat.bsky.social
PhD at VGG, Oxford w/ Andrew Zisserman, Andrea Vedaldi, Joao Henriques, Iro Laina. Past: Senior RS Qualcomm #AI #Research, UMich, IIT Bombay.

I occasionally post AI memes.

yashbhalgat.github.io
Pinned
from kindness import *
Excited to announce the 1st Workshop on 3D-LLM/VLA at #CVPR2025! 🚀 @cvprconference.bsky.social

Topics: 3D-VLA models, LLM agents for 3D scene understanding, Robotic control with language.

📢 Call for papers: Deadline – April 20, 2025

🌐 Details: 3d-llm-vla.github.io

#llm #3d #Robotics #ai
March 23, 2025 at 9:35 PM
Reposted by Yash Bhalgat
Our beginner's oriented accessible introduction to modern deep RL is now published in Foundations and Trends in Optimization. It is a great entry to the field if you want to jumpstart into RL!
@bernhard-jaeger.bsky.social
www.nowpublishers.com/article/Deta...
arxiv.org/abs/2312.08365
February 22, 2025 at 7:32 PM
"LLaDA: Large Language Diffusion Models" Nie et al.

Just read this fascinating paper.

The authors scale Masked Diffusion Language Models up to 8B params and show that they can match #LLMs (including Llama 3) while solving some key limitations!

Let's dive in... 🧵

(1/8)

#genai
February 18, 2025 at 3:05 PM
New work introduces a training-free method to relight entire videos, while maintaining temporal consistency! 📽️🌅

"Light-A-Video: Training-free Video Relighting via Progressive Light Fusion" Zhou et al.

(1/n) 🧵

#genai #ai #research #video
February 16, 2025 at 4:26 PM
Need to rig 3D models? 🦖

New work from UCSD and Adobe:
"RigAnything: Template-Free Autoregressive Rigging for Diverse 3D Assets" Liu et al.

tl;dr: reduces rigging time from 2 mins to 2 secs, works on any shape category & doesn't need predefined templates! 🚀
February 15, 2025 at 1:05 PM
"Latent Radiance Fields with 3D-aware 2D Representations" Zhou et al., #ICLR2025

tl;dr: Novel framework that integrates 3D awareness into VAE latent space using correspondence-aware encoding, enabling high-quality rendered images with ~50% memory savings.

(1/n) 🧵
February 14, 2025 at 10:28 AM
"EdgeRunner" (#ICLR2025) from #Nvidia & PKU introduces an auto-regressive auto-encoder for mesh generation, supporting up to 4000 faces at 512³ resolution. 🤩

Their mesh tokenization algorithm (adapted from EdgeBreaker) achieves ~50% compression (4-5 tokens per face vs 9), making training efficient.
February 13, 2025 at 10:34 PM
Just came across this fascinating paper "CraftsMan3D" - a practical approach to text/image-to-3D generation that mimics how artists actually work!

Code available (pretrained models too) 🤩: github.com/wyysf-98/Cra...

(1/n) 🧵
February 12, 2025 at 10:10 PM
Got me excited for a second here 🫠
February 10, 2025 at 12:12 PM
So, what happened this week in #AI?
January 29, 2025 at 12:57 PM
📢 Paper accepted to #ICLR2025 🎉

"GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting"

TL;DR: a novel test-time camera pose refinement framework leveraging 3DGS as the scene representation and MASt3R for 2D matching.

🔗: arxiv.org/abs/2408.11085
January 23, 2025 at 11:42 AM
Switzerland is rolling out solar panels... on railway tracks! 🇨🇭🚄

Swiss startup Sun-ways will run a pilot project turning train lines into clean energy highways.

#renewable #energy for the win 🤓

www.pv-magazine.com/2024/10/04/s...
Switzerland authorizes removable PV plant on railway track
Swiss startup Sun-ways is planning to build an 18 kW pilot PV system between the racks of a 100-m linear section of a railway line in the Swiss canton of Neuchâtel.
www.pv-magazine.com
January 21, 2025 at 9:06 PM
Reposted by Yash Bhalgat
Solar panels are becoming so cheap the Swiss are looking at installing them *between train tracks* !!!

www.pv-magazine.com/2024/10/04/s...
January 13, 2025 at 8:09 AM
Reposted by Yash Bhalgat
🚨🚨🚨 Reminder: closing in 3 weeks' time 🚨🚨🚨

Please re-post!

Note: Oxford recruits faculty at Associate Professor level - we have no Assistant Professor level.
Multiple faculty positions at University of Oxford in @oxengsci.bsky.social - Join Us!

Recruiting for 3 Information Engineering faculty - including Robotics, Computer Vision, Machine Learning. Please repost!

Faculty positions in Oxford are typically linked to a college.
⬇️ details in thread ⬇️
January 11, 2025 at 7:28 PM
Came across this LLM visualisation tool today: bbycroft.net/llm

Cool stuff! Lets you visualize each operation or layer in different Transformer architectures, and also explains them on the side. 😍

#llm #visualisation #gpt #ai #transformers
January 9, 2025 at 7:30 AM
Reposted by Yash Bhalgat
For other 3D vision newcomers to Bluesky: I highly recommend joining @chrisoffner3d.bsky.social's list to follow the right people ;): go.bsky.app/Cfm9XFe
January 8, 2025 at 2:28 PM
"NeuralSVG: An Implicit Representation for Text-to-Vector Generation"

(1/2) Encodes SVGs as implicit neural representations using a small MLP trained with Score Distillation Sampling (SDS). Maps 2D coordinates to shape/color outputs. Dropout-like technique ensures ordered, layered structures.
January 8, 2025 at 2:26 PM
"AR4D: Autoregressive 4D Generation from Monocular Videos" *without* SDS.

Autoregressively generate "3D frames" (aka 3DGS) starting from a canonical space, and using a local deformation field for each frame -- high-quality prompt-aligned generations.

#ai #nerf #GenAI #video
January 6, 2025 at 7:14 AM
Another gem from Bill Freeman, Katie Bouman & team 🌌

A differentiable rendering framework for direct #exoplanet imaging, leveraging wavefront sensing to refine starlight subtraction. Tested on JWST, it approaches noise limits and reveals faint planets like never before! 🚀

#ai #astronomy
January 6, 2025 at 7:13 AM
from kindness import *
November 28, 2024 at 2:55 PM
Excited to share that our work with VAL, IISc on "Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections" has been accepted to #3DV2025

Project page: val.cds.iisc.ac.in/reflecting-r...
November 17, 2024 at 1:44 PM
Reached 600 today. Onwards and upwards. 📈

#Research #googlescholar
November 17, 2024 at 1:41 PM
We are hosting the 2nd Workshop on Learning #3D with Multi-View Supervision (3DMV) at #CVPR2024 in Seattle on June 17th!

We are accepting paper submissions on a range of topics. More details on the website: abdullahamdi.com/3dmv2024/

#computervision #ai
February 13, 2024 at 12:08 PM