Reward models treat preference as a black box😶‍🌫️ but human brains🧠 decompose decisions into hidden attributes
In our recent COLM paper🎨PrefPalette✨, we built the first system to mirror how people really make decisions
Why it matters👉🏻🧵
We welcome any feedback and questions -- don't hesitate to reach out!
16/16
Excited to see where this takes the next generation of effective LLM assistants and agents!
Our new framework ALFA—ALignment with Fine-grained Attributes—teaches LLMs to PROACTIVELY seek information by asking better questions, trained via **structured rewards**🏥❓
(co-led with @jiminmun.bsky.social)
👉🏻🧵