Keshav Ramji
keshavramji.bsky.social
Post-training Alignment at IBM Research AI | Prev: Penn CS + Wharton
(6/n) This mechanism largely retains or even improves performance while facilitating convergence to a fixed constitution over the iterations. The resulting model also gets better at self-correction, successfully revising a larger share of samples with each iteration.
May 23, 2025 at 9:36 PM
(3/n) We introduce a Monte Carlo EM algorithm that alternates between the *principle discovery* and *principle learning* phases, enabling the LM to self-improve over multiple iterations, bootstrapping its learned distribution of principles to discover new ones.
May 23, 2025 at 9:34 PM
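The alternation described in (3/n) can be illustrated with a toy Monte Carlo EM loop. This is only a sketch under loud assumptions, not the paper's implementation: the principles are a small fixed list of made-up strings, the "LM" is reduced to a categorical distribution over them, and `improvement` is a hypothetical stand-in for checking whether a revision guided by a principle actually improved a response. The discovery phase samples principles and keeps the ones that helped; the learning phase refits the distribution to those kept samples.

```python
import random
from collections import Counter

# Hypothetical principles with made-up probabilities that a revision
# guided by each one improves a response (illustrative values only).
RATES = {"be concise": 0.2, "cite evidence": 0.8, "be polite": 0.4}


def monte_carlo_em(principles, improvement, n_iters=5, n_samples=200, seed=0):
    """Toy Monte Carlo EM over a discrete set of candidate principles.

    principles: list of principle strings (the latent variable).
    improvement: callable mapping a principle to the probability that a
        revision under that principle improves the response (assumed here).
    Returns a list of distributions, one per iteration plus the initial one.
    """
    rng = random.Random(seed)
    names = list(principles)
    # Start from a uniform distribution over principles.
    probs = {p: 1.0 / len(names) for p in names}
    history = [dict(probs)]
    for _ in range(n_iters):
        # Discovery (Monte Carlo E-step): sample principles from the current
        # distribution and keep only those whose simulated revision improved.
        weights = [probs[p] for p in names]
        sampled = rng.choices(names, weights=weights, k=n_samples)
        accepted = [p for p in sampled if rng.random() < improvement(p)]
        if accepted:
            # Learning (M-step): refit the distribution to the accepted
            # samples, with light smoothing so no principle reaches zero.
            counts = Counter(accepted)
            smoothing = 1e-3
            total = len(accepted) + smoothing * len(names)
            probs = {p: (counts[p] + smoothing) / total for p in names}
        history.append(dict(probs))
    return history


history = monte_carlo_em(list(RATES), RATES.get)
final = history[-1]
print(max(final, key=final.get))
```

In this toy setup, probability mass shifts across iterations toward whichever principle most often leads to an accepted revision, which loosely mirrors the bootstrapping idea in the post. The real method operates on LM-generated principles and responses rather than a fixed list.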
Excited to share our new paper on language model self-improvement!

Paper: arxiv.org/abs/2505.16927

We introduce Self-Taught Principle Learning (STaPLe), a new approach for LMs to generate their own constitutions by learning the principles that are most effective for self-correcting their responses.
May 23, 2025 at 9:33 PM
I'm at #ICLR2025 🇸🇬 and will be presenting Conformal Language Model Reasoning with Coherent Factuality (arXiv to come soon) this afternoon (4/24, poster session 2)! This work is with my amazing collaborators Max Rubin-Toles, Maya Gambhir, @aaroth.bsky.social, and @surbhigoel.bsky.social!
April 23, 2025 at 11:52 PM