And you might get the best of both worlds by spending extra compute and training time to augment finetuning.
@aaditya6284.bsky.social
"Strategy coopetition explains the emergence and transience of in-context learning in transformers."
We find some surprising things!! E.g. that circuits can simultaneously compete AND cooperate ("coopetition") 😯 🧵👇