Lightnews — Scholar-powered news

Zizhao Chen

@ch272h.bsky.social

35 followers 46 following 35 posts

chenzizhao.github.io unlearning natural stupidity


I'm presenting the poster today. Details below:

Fri, Dec 5, 2025
11:00 AM – 2:00 PM PST
Exhibit Hall C,D,E #4505

Pic: (fancy) knots at the USS Midway Museum near the San Diego Convention Center

December 5, 2025 at 5:18 PM

🔗 Why knots?

Knots are simple to see but deep to reason about.

✔ Verifiable outcomes
✔ Structured complexity (crossing number # X)
✔ A ladder of difficulty for generalization

Perfect for studying long-horizon visual reasoning and test-time scaling in visual space.

December 5, 2025 at 5:13 PM

🧩Natural language isn’t all you need.

We’re great at evaluating text-based reasoning (MATH, AIME…) but what about long-horizon visual reasoning?

Enter 𝗞𝗻𝗼𝘁𝗚𝘆𝗺: a minimalistic testbed for evaluating agents on spatial reasoning along a difficulty ladder

December 5, 2025 at 5:13 PM

So I was volunteering today. I asked random folks this question after they collected their NeurIPS thermos:

Do you think AIs today are intelligent? Answer with yes or no.

Here is the breakdown:

Yes: 57
No: 62
Total: 119

Pretty close!

December 12, 2024 at 5:00 AM

Extra: search for our wall of shame and fame @cornelltech.bsky.social (trigger alert) (whoa CT has a bsky account?!)

7/7

November 22, 2024 at 7:21 PM

We experiment in an abstract multi-turn generalization of reference games. After 6 rounds of grounded continual learning, the human-bot game success rate improves 31→82%📈, an absolute improvement of 51 percentage points, all without any external human annotations! 🚀

4/7

November 22, 2024 at 7:21 PM

How do we decode the reward? Implicit feedback occupies a general, easy-to-reason-about subspace of language
→ Prompt the same LLM that does the task (really bad early on) with a task-independent prompt
→ LLM bootstraps itself

3/7

November 22, 2024 at 7:21 PM

me: let’s start with a meme
@yoavartzi.com: how about the paper’s fig1? 🙅
me: lesson learned. no memes 😭

A paper on continually learning from naturally occurring interaction signals, such as in the hypothetical conversation above
arxiv.org/abs/2410.13852

1/7

November 22, 2024 at 7:21 PM
