https://hokindeng.github.io/
👉 Slack: join.slack.com/t/growingail...
👉 Early Results: grow-ai-like-a-child.com/video-reason/
📄 Paper: github.com/hokindeng/VM...
👉 GitHub: github.com/hokindeng/VM...
The age of video reasoning is here 🎬🧠
For example, Sora-2 somehow figures out how to solve chess problems, while none of the other models can.
Veo 3 and 3.1 actually handle mental rotation quite well, but fail badly on the maze problems.
1️⃣ Initial image: unsolved puzzle
2️⃣ Text instruction: “Solve this ...”
3️⃣ Final image: correct solution (hidden during generation)
Models see (1)+(2); we compare their output to (3). Simple and straightforward ✅
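The three-part setup above can be sketched in a few lines of Python. This is only a minimal illustration, not the benchmark's actual code: the `ReasoningTask` and `evaluate` names, the grid-of-integers image stand-in, the choice to score only the last generated frame, and the exact-pixel-match metric are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for an image: rows of integer pixel values.
Image = List[List[int]]
# Hypothetical model interface: (initial image, instruction) -> generated frames.
VideoModel = Callable[[Image, str], List[Image]]


@dataclass(frozen=True)
class ReasoningTask:
    initial_image: Image    # (1) unsolved puzzle, shown to the model
    instruction: str        # (2) text instruction, shown to the model
    solution_image: Image   # (3) correct solution, hidden during generation


def pixel_agreement(a: Image, b: Image) -> float:
    """Fraction of exactly matching pixels (an assumed, deliberately naive metric)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same shape")
    total = sum(len(row) for row in a)
    if total == 0:
        raise ValueError("images must not be empty")
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def evaluate(task: ReasoningTask, model: VideoModel) -> float:
    # The model only ever receives (1) and (2); (3) stays hidden.
    frames = model(task.initial_image, task.instruction)
    if not frames:
        raise ValueError("model produced no frames")
    # Assumption: the last generated frame is what gets compared to (3).
    return pixel_agreement(frames[-1], task.solution_image)


# Toy usage: a 2x2 "puzzle" in which one cell must change from 0 to 1.
task = ReasoningTask(
    initial_image=[[0, 0], [0, 0]],
    instruction="Solve this ...",
    solution_image=[[0, 0], [0, 1]],
)


def solving_model(image: Image, instruction: str) -> List[Image]:
    # Pretend model that ends on the correct board.
    return [image, [[0, 0], [0, 1]]]


def idle_model(image: Image, instruction: str) -> List[Image]:
    # Pretend model that never changes the board.
    return [image, image]


perfect_score = evaluate(task, solving_model)
idle_score = evaluate(task, idle_model)
```

A real comparison would likely need something more forgiving than exact pixel equality, since generated video frames rarely match a reference image pixel for pixel.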
#EmbodiedAI #SpatialReasoning #NeuroAI #CognitiveScience