Semisance
semisance.bsky.social
Semisance
@semisance.bsky.social
I post daily research highlights from arXiv.com, and I gather curated AI and tech news on https://semisance.substack.com/

Reach at semisance@gmail.com
Dynamic Point Maps: A Versatile Representation for Dynamic 3D Reconstruction
arxiv.org/abs/2503.16318
March 21, 2025 at 6:54 AM
Structured-Noise Masked Modeling for Video, Audio and Beyond
arxiv.org/abs/2503.16311
March 21, 2025 at 6:42 AM
Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation
arxiv.org/abs/2503.15905
March 21, 2025 at 6:41 AM
SynCity: Training-Free Generation of 3D Worlds
arxiv.org/abs/2503.16420

Project page: research.paulengstler.com/syncity/
March 21, 2025 at 6:40 AM
SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation
arxiv.org/abs/2503.16396
March 21, 2025 at 6:35 AM
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
arxiv.org/abs/2503.16430

Project page: yuqingwang1029.github.io/TokenBridge/
March 21, 2025 at 6:28 AM
M3: 3D-Spatial MultiModal Memory
arxiv.org/abs/2503.16413

Project page: m3-spatial-memory.github.io

#ICLR2025 🎉
March 21, 2025 at 6:21 AM
Decompositional Neural Scene Reconstruction with Generative Diffusion Prior
arxiv.org/abs/2503.14830

Project page: dp-recon.github.io

#CVPR2025 🎉
March 20, 2025 at 8:11 AM
Cube: A Roblox View of 3D Intelligence
arxiv.org/abs/2503.15475

Project page: github.com/Roblox/cube
March 20, 2025 at 8:03 AM
1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
arxiv.org/abs/2503.14858

Project page: wang-kevin3290.github.io/scaling-crl/
March 20, 2025 at 7:59 AM
Temporal Regularization Makes Your Video Generator Stronger
arxiv.org/abs/2503.15417

Project page: haroldchen19.github.io/FluxFlow/
March 20, 2025 at 7:57 AM
Visual Persona: Foundation Model for Full-Body Human Customization
arxiv.org/abs/2503.15406

Project page: cvlab-kaist.github.io/Visual-Perso...

#CVPR2025 🎉
March 20, 2025 at 7:55 AM
Object-Centric Pretraining via Target Encoder Bootstrapping
arxiv.org/abs/2503.15141

#ICLR2025 🎉

Code repository: github.com/djukicn/ocebo (coming soon)
March 20, 2025 at 7:54 AM
TULIP: Towards Unified Language-Image Pretraining
arxiv.org/abs/2503.15485

Project page: tulip-berkeley.github.io
March 20, 2025 at 7:53 AM
When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning
arxiv.org/abs/2503.15096

Project page: github.com/yafeng19/T-C...

#CVPR2025 🎉
March 20, 2025 at 7:51 AM
Humanoid Policy ~ Human Policy
arxiv.org/abs/2503.13441

Project page: human-as-robot.github.io
March 19, 2025 at 7:42 AM
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
arxiv.org/abs/2503.14489

Project page stable-virtual-camera.github.io
March 19, 2025 at 7:38 AM
March 19, 2025 at 7:37 AM
Bolt3D: Generating 3D Scenes in Seconds
arxiv.org/abs/2503.14445

Project page: szymanowiczs.github.io/bolt3d
March 19, 2025 at 7:35 AM
MusicInfuser: Making Video Diffusion Listen and Dance
arxiv.org/abs/2503.14505

Project page: susunghong.github.io/MusicInfuser/
March 19, 2025 at 7:34 AM
Deeply Supervised Flow-Based Generative Models
arxiv.org/abs/2503.14494

Project page: deepflow-project.github.io
March 19, 2025 at 7:33 AM
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers
arxiv.org/abs/2503.14487

Project page: shiml20.github.io/DiffMoE/
March 19, 2025 at 7:32 AM
Seeing the Future, Perceiving the Future: A Unified Driving World Model for Future Generation and Perception
arxiv.org/abs/2503.13587

Project page: github.com/dk-liang/Uni...
March 19, 2025 at 7:30 AM
DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers
arxiv.org/abs/2503.14405

Project page: europe.naverlabs.com/research/pub...

#CVPR2025 🎉
March 19, 2025 at 7:29 AM
LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models
arxiv.org/abs/2503.14325

Project page: github.com/westlake-rep...
March 19, 2025 at 7:28 AM