Shaokai Ye
shaokaiye.bsky.social
I am a fifth-year PhD student in Mackenzie Mathis's lab at EPFL. I am on the job market, looking for positions building multimodal agentic systems that help understand the real and augmented world and analyze one's own and others' behavior.
Reposted by Shaokai Ye
✨ Introducing a new #SOTA action recognition large multimodal language model: #LLaVAction!

By @shaokaiye.bsky.social Haozhe Qi, @trackingskills.bsky.social and me!

📝 arxiv.org/abs/2503.18712

🤖 mmathislab.github.io/llavaction/

1/n
LLaVAction: Video Action Recognition
LLaVAction: evaluating and training multi-modal large language models for action recognition
March 25, 2025 at 8:46 AM