Zizhao Chen
ch272h.bsky.social
chenzizhao.github.io unlearning natural stupidity
I'm presenting the poster today. Details below:

Fri, Dec 5, 2025
11:00 AM – 2:00 PM PST
Exhibit Hall C,D,E #4505

Pic: (fancy) knots at the USS Midway Museum near the San Diego Convention Center
December 5, 2025 at 5:18 PM
✨ Why it matters

KnotGym gives us a lightweight yet expressive testbed for multi-modal long-horizon reasoning and planning.

📄 Paper: arxiv.org/abs/2505.18028
🔗 Website: lil-lab.github.io/knotgym

Joint work with @yoavartzi.com
Knot So Simple: A Minimalistic Environment for Spatial Reasoning
We propose KnotGym, an interactive environment for complex, spatial reasoning and manipulation. KnotGym includes goal-oriented rope manipulation tasks with varying levels of complexity, all requiring ...
arxiv.org
December 5, 2025 at 5:14 PM
🧠 What can agents do in KnotGym?

➡️ Untangle a knot
➡️ Tie a goal knot
➡️ Convert one knot into another

All within Gym + MuJoCo, easy to run, hard to solve.

Even strong RL baselines and VLMs cannot beat random at crossing number # X=3 (though they fail for different reasons).
December 5, 2025 at 5:14 PM
🔗 Why knots?

Knots are simple to see but deep to reason about.

✔ Verifiable outcomes
✔ Structured complexity (crossing number # X)
✔ A ladder of difficulty for generalization

Perfect for studying long-horizon visual reasoning and test-time scaling in visual space.
December 5, 2025 at 5:13 PM
also imo this is a habit that is cultivated by constant practice (say, from local collaboration/mentorship or OSS). Instead of a whopping 12-week course, a workshop talk or informal tricks-sharing is perhaps more suitable
December 28, 2024 at 11:08 PM
The Internet has almost too many resources on general SE best practices (super useful for code release). What's lacking are good programming practices in the context of day-to-day research, e.g., versioning datasets, tracking experiments, reporting prelim findings, reacting to constant pivots
December 28, 2024 at 11:00 PM
Why bother coming up with an "artificial" project when there are natural ones and the goal (I assume) is to train better researchers anyway?
December 28, 2024 at 9:47 PM
I actually relate to much of the presentation on state management.

Jupyter shines in plotting and interactive demoing. E.g., a use case not fulfilled by console or scripts: prompt engineering. Jupyter (1) does not reload model weights and (2) can fold/clear historical long outputs like logits
December 28, 2024 at 7:33 PM
A PhD *student* paranoid with code. I guess that’s what makes me a student 🥲
December 28, 2024 at 7:15 PM
You were blessed with a codebase that's easy to work with, or the ability to build one. IMO factoring is tricky for different, ever-shifting research goals. See a discussion on "single-file implementation" and "Does modularity help RL libraries?" at iclr-blog-track.github.io/2022/03/25/p...
December 28, 2024 at 12:37 AM
What’s wrong with Jupyter notebooks 😂
December 27, 2024 at 11:15 PM
That’s quite a lot of investment in a course for PhDs lol. How about allowing collaborative projects in your graduate seminar?
December 27, 2024 at 11:12 PM
Also collaborating with others in the same repo motivated both of us to write better code than we would otherwise.
December 27, 2024 at 7:07 PM
Speaking as a PhD paranoid with code:

goodresearch.dev is good.

A guilty pleasure of mine is reading not only good research repos, but also their full git history if released. Factored code is not always easy to change, and a big refactor commit says something.
December 27, 2024 at 7:03 PM
Some misread it as geopolitics instead of racism.

And caring for others, that’s not exactly part of a researcher’s job description or perf review.

I made up the second one to save myself from greater disappointment.
December 14, 2024 at 9:47 AM
All I am saying is I don't assume a prior definition, nor do I observe your latent thought process
December 13, 2024 at 5:10 AM
I’m not sure what conclusion I can draw from this poll.

And disclaimer - this is absolutely not affiliated with NeurIPS.

Credit goes to everyone who participated in this mini poll. Thank you - you made my day!
December 12, 2024 at 5:06 AM