Zizhao Chen
@ch272h.bsky.social
chenzizhao.github.io · unlearning natural stupidity
I'm presenting the poster today. Details below:

Fri, Dec 5, 2025
11:00 AM – 2:00 PM PST
Exhibit Hall C, D, E #4505

Pic: (fancy) knots at the USS Midway Museum near the San Diego Convention Center
December 5, 2025 at 5:18 PM
🔗 Why knots?

Knots are simple to see but deep to reason about.

✔ Verifiable outcomes
✔ Structured complexity (crossing number)
✔ A ladder of difficulty for generalization

Perfect for studying long-horizon visual reasoning and test-time scaling in visual space.
December 5, 2025 at 5:13 PM
🧩Natural language isn’t all you need.

We’re great at evaluating text-based reasoning (MATH, AIME…) but what about long-horizon visual reasoning?

Enter 𝗞𝗻𝗼𝘁𝗚𝘆𝗺: a minimalistic testbed for evaluating agents on spatial reasoning along a difficulty ladder
December 5, 2025 at 5:13 PM
So I was volunteering today. After folks collected their NeurIPS thermos, I randomly asked them this question:

Do you think AIs today are intelligent? Answer with yes or no.

Here is the breakdown:

Yes: 57
No: 62
Total: 119

Pretty close!
December 12, 2024 at 5:00 AM
Extra: search for our wall of shame and fame @cornelltech.bsky.social (trigger alert) (whoa CT has a bsky account?!)

7/7
November 22, 2024 at 7:21 PM
We experiment with an abstract multi-turn generalization of reference games. After 6 rounds of grounded continual learning, the human-bot game success rate improves from 31% to 82% 📈, an absolute improvement of 51 points, all without any external human annotations! 🚀

4/7
November 22, 2024 at 7:21 PM
How do we decode the reward? Implicit feedback occupies a general, easy-to-reason-about subspace of language.
→ Prompt the same LLM that does the task (really bad early on) with a task-independent prompt
→ The LLM bootstraps itself
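The decoding step above could be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `DECODER_PROMPT`, `call_llm`, and `decode_reward` are all hypothetical names, and the model call is stubbed with a trivial keyword heuristic so the sketch is self-contained.

```python
# A task-independent prompt asks the same LLM that performs the task
# to classify the user's naturally occurring reply as feedback.
DECODER_PROMPT = (
    "The user replied to the assistant's last action. "
    "Does the reply express positive or negative feedback? "
    "Answer POSITIVE or NEGATIVE.\n\nReply: {utterance}"
)

def call_llm(prompt: str) -> str:
    # Placeholder for a real model call; a keyword heuristic
    # stands in so this example runs on its own.
    text = prompt.lower()
    if any(w in text for w in ("thanks", "great", "exactly")):
        return "POSITIVE"
    return "NEGATIVE"

def decode_reward(utterance: str) -> int:
    """Map an implicit-feedback utterance to a binary reward."""
    answer = call_llm(DECODER_PROMPT.format(utterance=utterance))
    return 1 if answer.strip().upper().startswith("POS") else 0

print(decode_reward("Thanks, that's the one!"))  # 1
print(decode_reward("No, not that piece."))      # 0
```

The decoded reward can then drive continual learning on the task model itself, with no external human annotation in the loop.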

3/7
November 22, 2024 at 7:21 PM
me: let’s start with a meme
@yoavartzi.com: how about the paper’s fig1? 🙅
me: lesson learned. no memes 😭

A paper on continually learning from naturally occurring interaction signals, such as in the hypothetical conversation above.
arxiv.org/abs/2410.13852

1/7
November 22, 2024 at 7:21 PM