diegodoimo.bsky.social
⚒️ We applied an advanced density-based clustering algorithm, showing its potential both as an interpretability tool and as a guide to novel strategies for effective fine-tuning of LLMs.
🧵5/6
December 10, 2024 at 7:54 PM
In fine-tuning, answer-focused modes rapidly emerge midway through the network, just after the intrinsic dimension peak.
Early layers remain largely unchanged.
🧵4/6
December 10, 2024 at 7:52 PM
In few-shot learning, the prompt topic defines the modes of the data distribution early in the network, and the density modes are hierarchically organized according to the similarity of the subjects.
🧵3/6
December 10, 2024 at 7:49 PM
🎯 Key results: few-shot learning and fine-tuning show two distinct processing phases inside LLMs.

These phases are separated by a peak in the intrinsic dimension of the data and a sharp decrease in the separation between the probability modes.

Paper: arxiv.org/abs/2409.03662
🧵2/6
December 10, 2024 at 7:48 PM
Just landed in Vancouver to present the results of our new work at @neuripsconf.bsky.social!

Few-shot learning and fine-tuning change the layers inside LLMs in dramatically different ways, even when they perform equally well on multiple-choice question-answering tasks.
🧵1/6
December 10, 2024 at 7:47 PM