https://yining610.github.io/
🔗 Paper link: arxiv.org/abs/2511.07577
🔗 Paper link: arxiv.org/abs/2511.07577
- Rebalance multiobjectives during training through dynamic reward weighting
- Build Pareto-dominant front over static baselines across online RL algorithms, datasets, and model families
- Faster convergence rate
1/8
- Rebalance multiobjectives during training through dynamic reward weighting
- Build Pareto-dominant front over static baselines across online RL algorithms, datasets, and model families
- Faster convergence rate
1/8
TL;DR: We found a mismatch between the decomposition policy and LLM verifier, and propose a dynamic training paradigm to bridge the gap.
TL;DR: We found a mismatch between the decomposition policy and LLM verifier, and propose a dynamic training paradigm to bridge the gap.
1. Dynamic Decomposition: arxiv.org/abs/2503.15354
2. RATIONALYST: arxiv.org/abs/2410.01044
Both works study the mulit-model collobration!
1. Dynamic Decomposition: arxiv.org/abs/2503.15354
2. RATIONALYST: arxiv.org/abs/2410.01044
Both works study the mulit-model collobration!
📅 11AM-12:30PM, Fri, May 2
📍 Hall 3
📝 arxiv.org/abs/2407.09007
🎥 www.youtube.com/watch?v=v1c...
📅 11AM-12:30PM, Fri, May 2
📍 Hall 3
📝 arxiv.org/abs/2407.09007
🎥 www.youtube.com/watch?v=v1c...
I'd love to chat about RL and its interpretability, data influence for post-training, CogSci for LLM. Feel free to reach out and let's have some coffee together ☕ !
TLDR— Proposed a framework for benchmarking LLMs' 𝒄𝒓𝒆𝒂𝒕𝒊𝒗𝒊𝒕𝒚.
x.com/Yining__Lu/...
I'd love to chat about RL and its interpretability, data influence for post-training, CogSci for LLM. Feel free to reach out and let's have some coffee together ☕ !
www.youtube.com/watch?v=v1c...
www.youtube.com/watch?v=v1c...
@NotreDame! Abstract submissions are due Mar 20, and registration deadline is Mar 27. Financial assistance for students (lodging, poster printing) is available. nlp.nd.edu/msld25
@NotreDame! Abstract submissions are due Mar 20, and registration deadline is Mar 27. Financial assistance for students (lodging, poster printing) is available. nlp.nd.edu/msld25