PhDing on Interpretable NLP + CSS @gesis.org Prev: Masters Student + Researcher at @ubuffalo.bsky.social and Sr. Data Scientist at Coursera
Language models, unlike humans, require large amounts of data, which suggests the need for an inductive bias.
But what kind of inductive biases do we need?
Language models, unlike humans, require large amounts of data, which suggests the need for an inductive bias.
But what kind of inductive biases do we need?
www.linkedin.com/posts/subbar...
www.linkedin.com/posts/subbar...
Five sections:
- Foundational Components
- Foundational Problems
- Training Resources
- Applications
- Future Directions
Five sections:
- Foundational Components
- Foundational Problems
- Training Resources
- Applications
- Future Directions
Now on arXiv: arxiv.org/abs/2508.16599
Now on arXiv: arxiv.org/abs/2508.16599
Eric's: docs.google.com/document/d/1...
Mor's: s.tech.cornell.edu/phd-syllabus/
Eric's: docs.google.com/document/d/1...
Mor's: s.tech.cornell.edu/phd-syllabus/
bsky.app/profile/pape...
bsky.app/profile/pape...
context: (some?all?) panelists & him agree the field needs more deep, careful research on smaller models to do better science. everyone is frustrated with impossibility of large-scale pretraining experiments
context: (some?all?) panelists & him agree the field needs more deep, careful research on smaller models to do better science. everyone is frustrated with impossibility of large-scale pretraining experiments
Looking for practical methods for settings where human annotations are costly.
A few examples in thread ↴
Looking for practical methods for settings where human annotations are costly.
A few examples in thread ↴
Looking for practical methods for settings where human annotations are costly.
A few examples in thread ↴
More info: www.ubjobs.buffalo.edu/postings/57734
More info: www.ubjobs.buffalo.edu/postings/57734
I really like this paper as a survey on the current literature on what CoT is, but more importantly on what it's not.
It also serves as a cautionary tale to the (apparently quite common) misuse of CoT as an interpretable method.
I really like this paper as a survey on the current literature on what CoT is, but more importantly on what it's not.
It also serves as a cautionary tale to the (apparently quite common) misuse of CoT as an interpretable method.
We will explore the future of computing for health, sustainability, human-centered AI, and policy.
Please consider submitting a 1-page abstract
We will explore the future of computing for health, sustainability, human-centered AI, and policy.
Please consider submitting a 1-page abstract
ojs.aaai.org/index.php/IC...,
ojs.aaai.org/index.php/IC...,
Phase transitions! We love to see them during LM training. Syntactic attention structure, induction heads, grokking; they seem to suggest the model has learned a discrete, interpretable concept. Unfortunately, they’re pretty rare—or are they?
Phase transitions! We love to see them during LM training. Syntactic attention structure, induction heads, grokking; they seem to suggest the model has learned a discrete, interpretable concept. Unfortunately, they’re pretty rare—or are they?
(1/6)
(1/6)
⚖️ sin discriminar,
🚫 sin reforzar estereotipos,
🔁 y sin aprender a odiar?
Esa es la gran pregunta de mi tesis.
👇 Te lo cuento en este #HiloTesis @crueuniversidades.bsky.social @filarramendi.bsky.social
⚖️ sin discriminar,
🚫 sin reforzar estereotipos,
🔁 y sin aprender a odiar?
Esa es la gran pregunta de mi tesis.
👇 Te lo cuento en este #HiloTesis @crueuniversidades.bsky.social @filarramendi.bsky.social
👇
👇
The sort of small wins that accumulate into a real career in AI.
When I started grad school AI prof's didn't have space for me in their group and when I ended I had no papers at NeurIPS/ICLR/ICML, yet the process can still work.
www.interconnects.ai/p/my-path-in...
The sort of small wins that accumulate into a real career in AI.
When I started grad school AI prof's didn't have space for me in their group and when I ended I had no papers at NeurIPS/ICLR/ICML, yet the process can still work.
www.interconnects.ai/p/my-path-in...
(not referring to "cultural alignment" techniques)
(bonus if used and *to good effect*)
(not referring to "cultural alignment" techniques)
(bonus if used and *to good effect*)