Yash Bhalgat
@ysbhalgat.bsky.social
PhD at VGG, Oxford w/ Andrew Zisserman, Andrea Vedaldi, Joao Henriques, Iro Laina. Past: Senior RS Qualcomm #AI #Research, UMich, IIT Bombay.

I occasionally post AI memes.

yashbhalgat.github.io
I think a few things will happen soon:
🚀 Scale beyond 8B
🎯 Multi-modal capabilities
⚡️ Faster inference
🔄 Reinforcement learning integration

Exciting to see alternatives to autoregressive models succeeding at scale!

Paper: ml-gsai.github.io/LLaDA-demo/

(8/8)
February 18, 2025 at 3:08 PM
Results vs. LLaMA 3 8B:

- Matches/exceeds on most tasks
- Better at math & Chinese tasks
- Strong in-context learning
- Improved dialogue capabilities

(7/8) 🧵
February 18, 2025 at 3:07 PM
A major result: LLaDA breaks the "reversal curse" that plagues autoregressive models. 🔄

On tasks requiring bidirectional reasoning, it outperforms GPT-4 and maintains consistent performance in both forward/reverse directions.

(6/8) 🧵
February 18, 2025 at 3:07 PM
For generation, they introduce clever remasking strategies:

- Low-confidence remasking: Remask tokens the model is least sure about

- Semi-autoregressive: Generate in blocks left-to-right while maintaining bidirectional context
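Low-confidence remasking can be sketched in a few lines: after each denoising step, keep the predictions the model is sure about and remask the rest. A minimal numpy illustration (the mask-token id and the per-step remask count are my assumptions, not the paper's exact settings):

```python
import numpy as np

def low_confidence_remask(probs, pred_tokens, n_remask, mask_id=0):
    """Remask the n least-confident predictions for the next denoising step.

    probs: (seq_len,) confidence of each predicted token (e.g. max softmax prob).
    """
    order = np.argsort(probs)        # ascending: least confident first
    remask_idx = order[:n_remask]
    out = pred_tokens.copy()
    out[remask_idx] = mask_id
    return out

probs = np.array([0.9, 0.2, 0.6, 0.4])
preds = np.array([11, 22, 33, 44])
# Indices 1 and 3 have the lowest confidence, so they get remasked.
remasked = low_confidence_remask(probs, preds, n_remask=2)
```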

(5/8) 🧵
February 18, 2025 at 3:07 PM
Training uses random masking ratio t ∈ [0,1] for each sequence.

The model learns to predict original tokens given partially masked sequences. No causal masking used.

The same technique also supports instruction-conditioned generation, with no architectural modifications.
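The forward (masking) process above can be sketched in numpy; the [MASK] id and the 1/t loss reweighting mentioned in the comment are assumptions based on standard masked-diffusion objectives, not the paper's exact code:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, MASK = 100, 0  # hypothetical vocab size and [MASK] token id

def masking_step(tokens):
    """One forward-process sample: mask each token independently with prob t."""
    t = rng.uniform(0.0, 1.0)            # random masking ratio t ~ U[0, 1]
    mask = rng.random(len(tokens)) < t   # positions to mask
    corrupted = np.where(mask, MASK, tokens)
    return corrupted, mask, t

tokens = rng.integers(1, VOCAB, size=16)
corrupted, mask, t = masking_step(tokens)
# The model is trained to predict tokens[mask] from `corrupted`, with the
# cross-entropy on masked positions typically reweighted by 1/t.
```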

(4/8) 🧵
February 18, 2025 at 3:06 PM
💡Core insight: Generative modeling principles, not autoregression, give LLMs their power.

LLaDA's forward process gradually masks tokens while reverse process predicts them simultaneously. This enables bidirectional modeling.

(3/8) 🧵
February 18, 2025 at 3:06 PM
Key highlights:
- Successful scaling of masked diffusion to LLM scale (8B params)
- Masking with variable ratios for forward/reverse process
- Smart remasking strategies for generation, incl. semi-autoregressive
- SOTA on reversal tasks, matching Llama 3 on others

(2/8) 🧵
February 18, 2025 at 3:05 PM
Project page: bujiazi.github.io/light-a-vide...
Code: github.com/bcmi/Light-A...

Could be a game-changer for quick video mood/lighting adjustments without complicated VFX pipelines! 🎬
Light-A-Video
February 16, 2025 at 4:28 PM
The results are pretty good ✨
They can transform regular videos into moody noir scenes, add sunlight streaming through windows, or create cyberpunk neon vibes -- works on everything from portrait videos to car commercials! 🚗
February 16, 2025 at 4:28 PM
Technical highlights 🔍:
- Consistent Light Attention (CLA) module for stable lighting across frames
- Progressive Light Fusion for smooth temporal transitions
- Works with ANY video diffusion model (AnimateDiff, CogVideoX)
- Zero-shot - no fine-tuning needed!
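One way to read "Progressive Light Fusion" is as a per-step blend between the original and relit latents whose weight ramps toward the relit target over the denoising trajectory. This is a hedged sketch of that idea, not the paper's actual schedule:

```python
import numpy as np

def progressive_light_fusion(orig, relit, step, total_steps):
    """Blend original and relit latents at one denoising step, shifting
    weight toward the relit target as denoising progresses.
    Illustrative linear ramp; the paper's schedule may differ."""
    w = step / max(total_steps - 1, 1)   # ramps 0 -> 1 across the trajectory
    return (1.0 - w) * orig + w * relit
```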
February 16, 2025 at 4:27 PM
Project page: liuisabella.com/RigAnything/
Code: not available yet

Really excited to try this out once the code is available!
RigAnything: Template-Free Autoregressive Rigging for Diverse 3D Assets
February 15, 2025 at 1:06 PM
Authors claim that the model generalizes well across diverse shapes - from humanoids to marine creatures! And works with real-world images & arbitrary poses. 🤩
February 15, 2025 at 1:06 PM
Technical highlights:
- BFS-ordered skeleton sequence representation
- Autoregressive joint prediction with diffusion sampling
- Hybrid attention masking: full self-attention for shape tokens, causal attention for skeleton
- End-to-end trainable pipeline without clustering/MST ops
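The hybrid attention mask can be made concrete. One plausible construction (whether shape tokens also attend back to skeleton tokens is my assumption; here they don't, since the skeleton is generated conditioned on the shape):

```python
import numpy as np

def hybrid_attention_mask(n_shape, n_skel):
    """Boolean mask (True = may attend). Shape tokens use full self-attention
    among themselves; skeleton tokens attend to all shape tokens and
    causally to earlier skeleton tokens."""
    n = n_shape + n_skel
    mask = np.zeros((n, n), dtype=bool)
    mask[:, :n_shape] = True                          # all rows see shape tokens
    mask[n_shape:, n_shape:] = np.tril(               # causal among skeleton
        np.ones((n_skel, n_skel), dtype=bool))
    return mask
```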
February 15, 2025 at 1:05 PM
@sebastianraschka.com this is such an interesting discussion! I haven't tried this myself, but I think this can be analyzed theoretically by looking at the rank of the attention matrix in both cases.

I have posted my thoughts on the discussion here: github.com/rasbt/LLMs-f...
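The rank argument can be made concrete: the pre-softmax logits depend on W_Q and W_K only through the product W_Q W_Kᵀ, a d_model × d_model matrix of rank at most d_head, so merging the two into one matrix is exact, while a full-rank replacement would enlarge the model class. A quick numpy check (sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_head, n = 8, 4, 5
X = rng.normal(size=(n, d_model))        # token embeddings
W_Q = rng.normal(size=(d_model, d_head))
W_K = rng.normal(size=(d_model, d_head))

# Attention logits: (X W_Q)(X W_K)^T = X (W_Q W_K^T) X^T
logits_two = (X @ W_Q) @ (X @ W_K).T
W = W_Q @ W_K.T                          # single merged matrix, rank <= d_head
logits_one = X @ W @ X.T
```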
Self attention: Merge Query matrix and Key matrix into a single covariance matrix? · rasbt LLMs-from-scratch · Discussion #517
When computing the context vector in the attention algorithm, three weight matrices are introduced. It was discussed in #454 that the value matrix W_V is not necessary. For the remaining two, the query matri...
February 14, 2025 at 2:42 PM
Interesting how they handle the domain gap between 2D latent space and 3D representations through their three-stage pipeline. The correspondence-aware encoding significantly reduces high-frequency noise while preserving geometry.

Project: latent-radiance-field.github.io/LRF/
Latent Radiance Fields with 3D-aware 2D Representations
February 14, 2025 at 10:29 AM
Technical approach:
- Correspondence-aware autoencoding to enhance 3D consistency in VAE latent space
- Builds 3D representations from 3D-aware 2D features
- VAE-Radiance Field alignment to bridge domain gap between latent and image space

#nerf #ai #research
February 14, 2025 at 10:28 AM
Project: research.nvidia.com/labs/dir/edg...
Training and inference code available here: github.com/NVlabs/EdgeR...
February 13, 2025 at 10:36 PM
The architecture uses a lightweight encoder and auto-regressive decoder to compress variable-length meshes into fixed-length codes, enabling point cloud and single-image conditioning.

Their ArAE model controls face count for varying detail while preserving mesh topology.
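The variable-length-to-fixed-length compression can be illustrated with cross-attention pooling over face tokens; the sizes and the pooling mechanism here are assumptions for illustration, not EdgeRunner's actual encoder:

```python
import numpy as np

rng = np.random.default_rng(0)
CODE_LEN, DIM = 8, 16                        # illustrative sizes
queries = rng.normal(size=(CODE_LEN, DIM))   # learned latent queries in practice

def encode_mesh(face_feats):
    """Cross-attention pooling: F variable face tokens -> CODE_LEN fixed codes.
    Stand-in for a learned encoder; only the shape contract matters here."""
    scores = queries @ face_feats.T / np.sqrt(DIM)           # (CODE_LEN, F)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)                  # row-wise softmax
    return attn @ face_feats                                 # (CODE_LEN, DIM)

small = rng.normal(size=(10, DIM))    # mesh with 10 faces
large = rng.normal(size=(500, DIM))   # mesh with 500 faces
```

Regardless of how many faces the input mesh has, the latent code has the same fixed shape, which is what lets the downstream auto-regressive decoder condition on it uniformly.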
February 13, 2025 at 10:36 PM