Lightnews — Scholar-powered news

Tianwei Ni

@twni2016.bsky.social

31 followers 12 following 5 posts

https://twni2016.github.io/ Reinforcement Learning PhD student @Mila

Tianwei Ni

@twni2016.bsky.social

Work completed during my internship at Amazon Science. Thank you to my co-authors @allenanie.bsky.social, Sapana Chaudhary, Yao Liu, Huzefa Rangwala, @rasoolfa.bsky.social!

April 23, 2025 at 10:05 PM

Tianwei Ni

@twni2016.bsky.social

Results on challenging math games, Countdown & Game-of-24:
⚡180× faster inference than search-based baseline
📈Beats CoT and inference-time search (ToT, RAP)

📄 Paper: arxiv.org/abs/2504.11364
💻 Code & data: github.com/twni2016/llm...

Teaching Large Language Models to Reason through Learning and Forgetting

Leveraging inference-time search in large language models has proven effective in further enhancing a trained model's capability to solve complex mathematical and reasoning problems. However, this app...

arxiv.org

April 23, 2025 at 10:05 PM

Tianwei Ni

@twni2016.bsky.social

2️⃣ Learn successful reasoning paths ✅ while forgetting failed reasoning paths ❌ at the same time, which we call Unlikelihood Fine-Tuning (UFT)
3️⃣ Small learning rate is crucial to preserve inference-time search capabilities

April 23, 2025 at 10:05 PM
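
The "learn while forgetting" idea in the post above can be illustrated with a small numeric sketch. It assumes the standard unlikelihood term, -log(1 - p), for tokens on failed paths, combined with the usual negative log-likelihood on successful paths. The function names and the weight `alpha` are hypothetical, and the paper's exact objective and weighting may differ.

```python
import math

def likelihood_loss(token_probs, eps=1e-12):
    """Mean negative log-likelihood: small when the model assigns
    high probability to the tokens of a successful path."""
    return -sum(math.log(max(p, eps)) for p in token_probs) / len(token_probs)

def unlikelihood_loss(token_probs, eps=1e-12):
    """Mean of -log(1 - p): small when the model assigns low
    probability to the tokens of a failed path."""
    return -sum(math.log(max(1.0 - p, eps)) for p in token_probs) / len(token_probs)

def combined_loss(success_probs, failed_probs, alpha=1.0):
    """Hypothetical combination: learn successful paths and
    forget failed ones in the same objective."""
    return likelihood_loss(success_probs) + alpha * unlikelihood_loss(failed_probs)

# Per-token probabilities the model assigns to each path (made-up values).
print(combined_loss(success_probs=[0.9, 0.8], failed_probs=[0.1, 0.2]))
```

In this sketch the loss falls as probability mass moves toward successful paths and away from failed ones. The post's point about a small learning rate concerns the optimizer step size, which this sketch does not model.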

Tianwei Ni

@twni2016.bsky.social

1️⃣ Aggregate reasoning paths from diverse sources: Chain-of-Thought, inference-time search (Tree-of-Thought, Reasoning-via-Planning), classic algorithms (BFS, DFS)

April 23, 2025 at 10:05 PM
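
As an illustration of the "classic algorithms" source mentioned in the post above, here is a minimal sketch of a depth-first search over Game-of-24 that records both successful and failed reasoning paths. It is not the authors' data pipeline: the function names, the step format, and the choice of exact `Fraction` arithmetic are all assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations

def dfs_paths(numbers, target=24, path=()):
    """Depth-first search: repeatedly merge two numbers with an
    arithmetic operation and yield (steps, solved) for every full path."""
    if len(numbers) == 1:
        yield list(path), numbers[0] == target
        return
    for i, j in combinations(range(len(numbers)), 2):
        a, b = numbers[i], numbers[j]
        rest = [n for k, n in enumerate(numbers) if k not in (i, j)]
        candidates = [
            (a + b, f"{a} + {b}"),
            (a * b, f"{a} * {b}"),
            (a - b, f"{a} - {b}"),
            (b - a, f"{b} - {a}"),
        ]
        if b != 0:
            candidates.append((a / b, f"{a} / {b}"))
        if a != 0:
            candidates.append((b / a, f"{b} / {a}"))
        for value, expr in candidates:
            step = f"{expr} = {value}"
            yield from dfs_paths(rest + [value], target, path + (step,))

def collect_paths(numbers, target=24):
    """Split all DFS paths into successful and failed ones."""
    start = [Fraction(n) for n in numbers]  # exact arithmetic, no float error
    successes, failures = [], []
    for steps, solved in dfs_paths(start, target):
        (successes if solved else failures).append(steps)
    return successes, failures

successes, failures = collect_paths([1, 2, 3, 4])
print(len(successes), "successful paths;", len(failures), "failed paths")
print(successes[0])
```

Both lists matter for the recipe described in the thread: the successful paths are the ones to learn, and the failed paths are the ones to forget.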
