studying the foundations of machine intelligence
awni.xyz
We empirically validate our theory’s predictions in simple settings where the CoT information can be computed exactly.
We find that the theory closely predicts the sample-efficiency gains.
[8/n]
To distinguish between hypotheses with error ε, classical theory tells us we need roughly O(1/ε) samples.
We prove that under CoT supervision, the sample complexity improves to O(1/CoTInfo(ε)).
[5/n]
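A rough sketch of what that rate change can mean in practice, not the paper's actual bound: it assumes a finite hypothesis class in the realizable setting, where the standard union-bound argument gives n ≥ (log|H| + log(1/δ)) / rate. The function name `sample_bound`, the class size, δ, and the CoT information value are all hypothetical, chosen only for illustration.

```python
import math


def sample_bound(rate, log_num_hypotheses, delta):
    """Smallest n with |H| * exp(-rate * n) <= delta.

    Assumption: finite class, realizable setting, union bound over
    hypotheses. `rate` is eps for end-to-end supervision, or the CoT
    information at eps under CoT supervision.
    """
    return math.ceil((log_num_hypotheses + math.log(1.0 / delta)) / rate)


eps = 0.01              # target end-to-end error
cot_info = 0.2          # hypothetical CoT information at eps (illustrative only)
log_H = math.log(1000)  # hypothetical class of 1000 hypotheses
delta = 0.05            # allowed failure probability

n_e2e = sample_bound(eps, log_H, delta)
n_cot = sample_bound(cot_info, log_H, delta)
print(f"end-to-end supervision: n >= {n_e2e}")
print(f"CoT supervision:        n >= {n_cot}")
```

Under these assumptions the two counts differ by a factor of about cot_info / eps, so the gain comes entirely from how much larger the CoT information is than ε.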
We formalize this by introducing the “CoT Information”: a measure of the extra discriminative power gained by observing the reasoning trace, not just the label.
[4/n]
Excited to share our work developing a learning-theoretic account of the statistical advantage of chain-of-thought supervision in reasoning systems!
Blog: awni.xyz/cot-info
👇🧵
[1/n]