Eyal Ben David, Hadas Orgad, Eran Ofek, Yonatan Belinkov, Idan Szpektor, Jonathan Herzig, and Roi Reichart.
Paper: arxiv.org/abs/2503.15299
17/🧵 (end)
16/🧵
15/🧵
14/🧵
13/🧵
This highlights limitations in the generation process and opens interesting directions for future research on decoding mechanisms.
12/🧵
11/🧵
This highlights the need to understand these differences and build models that better use their knowledge, for which our framework serves as a foundation.
10/🧵
Internal knowledge is measured using a linear probing classifier to score candidate answers, while external knowledge is measured using standard methods that rely on the model’s observable token-level probabilities.
9/🧵
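To make the contrast concrete, here is a minimal sketch of the two kinds of scoring. It is an illustration under stated assumptions, not the paper's implementation: the probe weights, hidden-state vectors, and token log-probabilities are invented toy values, and the helper names (`probe_score`, `sequence_logprob`, `rank`) are hypothetical. Length-normalized log-probability is used here as just one common example of a probability-based score.

```python
import math


def probe_score(hidden, weights, bias):
    """Internal score: sigmoid of a linear function of a hidden-state vector."""
    z = sum(w * h for w, h in zip(weights, hidden)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sequence_logprob(token_logprobs, length_normalize=True):
    """External score: sum of token log-probabilities, optionally per token."""
    total = sum(token_logprobs)
    return total / len(token_logprobs) if length_normalize else total


def rank(candidates, key):
    """Return candidate answers sorted from highest to lowest score."""
    return [c["answer"] for c in sorted(candidates, key=key, reverse=True)]


# Toy values, invented for illustration only (assumption, not real model output).
WEIGHTS = [1.5, -2.0]  # hypothetical trained probe weights
BIAS = 0.0

candidates = [
    {"answer": "A", "hidden": [2.0, 0.5], "token_logprobs": [-1.2, -0.9]},
    {"answer": "B", "hidden": [0.2, 1.0], "token_logprobs": [-0.3, -0.4]},
]

internal_ranking = rank(candidates, lambda c: probe_score(c["hidden"], WEIGHTS, BIAS))
external_ranking = rank(candidates, lambda c: sequence_logprob(c["token_logprobs"]))

print("internal:", internal_ranking)
print("external:", external_ranking)
```

In this toy setup the probe prefers candidate A while the token-level probabilities prefer candidate B. The sketch only shows that the two scoring methods can rank the same candidates differently.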
8/🧵
7/🧵
6/🧵
5/🧵
4/🧵
We propose such a definition, laying foundations for studying this concept, and use it in a study to demonstrate hidden knowledge.
3/🧵
Diverse evidence from prior work suggests the existence of hidden knowledge.
2/🧵