🧵5/6
Early layers remain largely unchanged.
🧵4/6
🧵3/6
These phases are separated by a peak of the data intrinsic dimension and a sharp decrease in the separation of the probability modes.
Paper: arxiv.org/abs/2409.03662
🧵2/6
Few-shot learning and fine-tuning change the layers inside LLMs in a dramatically different way, even when they perform equally well on multiple-choice question-answering tasks.
🧵1/6