Lightnews — Scholar-powered news

Zizhao Chen

@ch272h.bsky.social

I'm presenting the poster today. Details below:

Fri, Dec 5, 2025
11:00 AM – 2:00 PM PST
Exhibit Hall C,D,E #4505

Pic: (fancy) knots at the USS Midway Museum near the SD convention center

December 5, 2025 at 5:18 PM

Zizhao Chen

@ch272h.bsky.social

✨ Why it matters

KnotGym gives us a lightweight yet expressive testbed for multi-modal long-horizon reasoning and planning.

📄 Paper: arxiv.org/abs/2505.18028
🔗 Website: lil-lab.github.io/knotgym

Joint work with @yoavartzi.com

Knot So Simple: A Minimalistic Environment for Spatial Reasoning

We propose KnotGym, an interactive environment for complex, spatial reasoning and manipulation. KnotGym includes goal-oriented rope manipulation tasks with varying levels of complexity, all requiring ...

arxiv.org

December 5, 2025 at 5:14 PM

Zizhao Chen

@ch272h.bsky.social

🧠 What can agents do in KnotGym?

➡️ Untangle a knot
➡️ Tie a goal knot
➡️ Convert one knot into another

All within Gym + MuJoCo, easy to run, hard to solve.

Even strong RL baselines and VLMs cannot beat random at crossing number # X=3 (though they fail for different reasons).

December 5, 2025 at 5:14 PM

Zizhao Chen

@ch272h.bsky.social

🔗 Why knots?

Knots are simple to see but deep to reason about.

✔ Verifiable outcomes
✔ Structured complexity (crossing number # X)
✔ A ladder of difficulty for generalization

Perfect for studying long-horizon visual reasoning and test-time scaling in visual space.

December 5, 2025 at 5:13 PM

Zizhao Chen

@ch272h.bsky.social

also imo this is a habit that is cultivated by constant practice (say, from local collaboration/mentorship or OSS). Instead of a whopping 12-week course, a workshop talk or informal tricks-sharing is perhaps more suitable

December 28, 2024 at 11:08 PM

Zizhao Chen

@ch272h.bsky.social

The Internet has almost too many resources on general SE best practices (super useful for code release). What's lacking are good programming practices in the context of day-to-day research, e.g., versioning datasets, tracking experiments, reporting prelim findings, reacting to constant pivots

December 28, 2024 at 11:00 PM

Zizhao Chen

@ch272h.bsky.social

Why bother coming up with an "artificial" project when there are natural ones and the goal (I assume) is to train better researchers anyway?

December 28, 2024 at 9:47 PM

Zizhao Chen

@ch272h.bsky.social

I actually relate to much of the presentation on state management.

Jupyter shines in plotting and interactive demoing. E.g., a use case not fulfilled by console or scripts: prompt engineering. Jupyter (1) does not reload model weights and (2) can fold/clear historical long outputs like logits

December 28, 2024 at 7:33 PM

Zizhao Chen

@ch272h.bsky.social

A PhD *student* paranoid with code. I guess that’s what makes me a student 🥲

December 28, 2024 at 7:15 PM

Zizhao Chen

@ch272h.bsky.social

You were blessed with a codebase that's easy to work with, or the ability to build one. IMO factoring is tricky for different, ever-shifting research goals. See a discussion on "single-file implementation" and "Does modularity help RL libraries?" at iclr-blog-track.github.io/2022/03/25/p...

December 28, 2024 at 12:37 AM

Zizhao Chen

@ch272h.bsky.social

What’s wrong with Jupyter notebooks 😂

December 27, 2024 at 11:15 PM

Zizhao Chen

@ch272h.bsky.social

That’s quite a lot of investment in a course for PhDs lol. How about allowing collaborative projects in your graduate seminar?

December 27, 2024 at 11:12 PM

Zizhao Chen

@ch272h.bsky.social

Also collaborating with others in the same repo motivated both of us to write better code than we would otherwise.

December 27, 2024 at 7:07 PM

Zizhao Chen

@ch272h.bsky.social

Speaking as a PhD paranoid with code:

goodresearch.dev is good.

A guilty pleasure of mine is reading not only good research repos, but also their full git history if released. Factored code is not always easy to change, and a big refactor commit says something.

December 27, 2024 at 7:03 PM

Zizhao Chen

@ch272h.bsky.social

Some misread it as geopolitics instead of racism.

And caring for others, that’s not exactly part of a researcher’s job description or perf review.

I made up the second one to save myself from greater disappointment.

December 14, 2024 at 9:47 AM

Zizhao Chen

@ch272h.bsky.social

All I am saying is I don't assume a prior definition, nor do I observe your latent thought process

December 13, 2024 at 5:10 AM

Zizhao Chen

@ch272h.bsky.social

I’m not sure what conclusion I can draw from this poll.

And disclaimer - this is absolutely not affiliated with NeurIPS.

Credit goes to everyone who participated in this mini poll. Thank you - you made my day!

December 12, 2024 at 5:06 AM
