I want to highlight progress we made in understanding the role of tokenization, developing the core incidents and mitigating its problems. 🧵👇
w/ Jingtong Su, Jianyu Zhang, @karen-ullrich.bsky.social, and Léon Bottou.
🧵
We introduce SAMD & SAMI — a novel, concept-agnostic approach to identify and manipulate attention modules in transformers.
We explore optimization scenarios where objectives align rather than conflict, introducing new scalable algorithms with theoretical guarantees. #MachineLearning #AI #Optimization
Byte-level LLMs without training and guaranteed performance? Curious how? Dive into our work! 📚✨
Paper: arxiv.org/abs/2410.09303
Github: github.com/facebookrese...
9-11am I will be at the Meta AI Booth
12.30-2pm
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs (neurips.cc/virtual/2024...)
OR
End-To-End Causal Effect Estimation from Unstructured Natural Language Data (neurips.cc/virtual/2024...)
11-12.30 WiML round tables
1.30-4 Beyond Decoding, Tutorial
Two papers have been accepted at #NeurIPS2024! 🙌🏼 These papers are the first outcomes of my growing focus on LLMs. 🍾 Cheers to Nikita Dhawan and Jingtong Su + all involved collaborators: @cmaddis.bsky.social, Leo Cotta, Rahul Krishnan, Julia Kempe
book on generative modeling (long overdue)
I have one PhD internship position available for 2025!
Interested in exploring the intersection of information theory, probabilistic reasoning, and LLMs?
📩 Send me a DM with your CV, website, and GScholar profile by October 14th.
Submit by September 30th. More info:
neuralcompression.github.io/workshop24