@aaditya6284.bsky.social should really take the bulk of the credit :)
And thank you so much to all our wonderful collaborators, who made fundamental contributions as well!
Ted Moskovitz, Sara Dragutinovic, Felix Hill, @saxelab.bsky.social
@aaditya6284.bsky.social should really take the bulk of the credit :)
And thank you so much to all our wonderful collaborators, who made fundamental contributions as well!
Ted Moskovitz, Sara Dragutinovic, Felix Hill, @saxelab.bsky.social
Felix Hill, who passed away recently. This is our last ever paper with him.
It was bittersweet to finish this research, which contains so much of the scientific spark that he shared with us. Rest in peace Felix, and thank you so much for everything.
Felix Hill, who passed away recently. This is our last ever paper with him.
It was bittersweet to finish this research, which contains so much of the scientific spark that he shared with us. Rest in peace Felix, and thank you so much for everything.
But we find that cIWL and ICL actually compete AND cooperate, via shared subcircuits. In fact, ICL cannot emerge if cIWL is blocked from emerging, even though ICL emerges first!
But we find that cIWL and ICL actually compete AND cooperate, via shared subcircuits. In fact, ICL cannot emerge if cIWL is blocked from emerging, even though ICL emerges first!
We call this combo "cIWL" (context-constrained in-weights learning).
We call this combo "cIWL" (context-constrained in-weights learning).
For more details please see Aaditya's thread,
or the paper itself.
bsky.app/profile/aadi...
arxiv.org/abs/2503.05631
For more details please see Aaditya's thread,
or the paper itself.
bsky.app/profile/aadi...
arxiv.org/abs/2503.05631