Yuyang Wang
yuyangw.bsky.social
Research Scientist at @Apple ML Research (MLR) | PhD from CMU | Simulate physical world with generative models 🌎🧬
Reposted by Yuyang Wang
Check out our Apple research work on scaling laws for native multimodal models! Combined with mixtures of experts, native models develop both specialized and multimodal representations! Lots of rich findings and opportunities for follow-up research!
Shukor, Fini, da Costa, Cord, Susskind, El-Nouby: Scaling Laws for Native Multimodal Models https://arxiv.org/abs/2504.07951 https://arxiv.org/pdf/2504.07951 https://arxiv.org/html/2504.07951
April 11, 2025 at 10:37 PM
Reposted by Yuyang Wang
🚨 One question that has always intrigued me is the role of different ways to increase a model's capacity: parameters, parallelizable compute, or sequential compute?

We explored this through the lens of MoEs:
January 28, 2025 at 6:26 AM
1/n 🚨New preprint! Our work “Coordinate In and Value Out: Training Flow Transformers in Ambient Space” arxiv.org/abs/2412.03791 presents a domain-agnostic, end-to-end flow-matching generative model that effectively handles various modalities like images and point clouds.
December 6, 2024 at 7:15 PM
Reposted by Yuyang Wang
🤔Image-to-3D, monocular depth estimation, camera pose estimation, …, can we achieve all of this with just ONE model easily?

🚀Our answer is Yes -- Excited to introduce our latest work: World-consistent Video Diffusion (WVD) with Explicit 3D Modeling!

arxiv.org/abs/2412.01821
December 4, 2024 at 1:41 PM
Reposted by Yuyang Wang
I am seeking multiple PhD students passionate about Generative Intelligence and its applications in empowering AI agents to interact with the physical world to join us at UPenn CIS for the 2024-2025 academic cycle. You can find more information at www.cis.upenn.edu/graduate/pro...
Doctoral Program
www.cis.upenn.edu
November 27, 2024 at 1:18 AM