Julius Adebayo
@juliusad.bsky.social
ML researcher, building interpretable models at Guide Labs (guidelabs.bsky.social).
The LCMs are cool, though it is early days. They give us a knob (concept representations) to understand and change the model's outputs. There is no reason why an LCM should not also have a CoT (or be able to reason via search/planning)...we just have to ask it :)
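Rough sketch of what that knob could look like (toy numpy code, not Meta's actual LCM interface; encode_concepts/decode_concepts below are hypothetical): shift the sentence-level concept embeddings along a steering direction before decoding.

import numpy as np

def steer_concept(concept_seq, steer_vec, alpha=0.5):
    # concept_seq: (n_sentences, d) sentence-level concept embeddings
    # steer_vec:   (d,) direction for the concept you want to dial up or down
    steer_vec = steer_vec / np.linalg.norm(steer_vec)
    norms = np.linalg.norm(concept_seq, axis=1, keepdims=True)
    edited = concept_seq + alpha * steer_vec
    # keep the edited embeddings on the same scale as the originals
    return edited / np.linalg.norm(edited, axis=1, keepdims=True) * norms

# hypothetical usage:
# concepts = encode_concepts("draft answer")          # hypothetical encoder
# steered  = steer_concept(concepts, tone_direction)  # tone_direction: assumed concept vector
# print(decode_concepts(steered))                     # hypothetical decoder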
January 3, 2025 at 11:32 PM
The reasoning models are cool though; they explicitly enforce dependence on the model's CoT, so here it should be a reliable explanation (? not sure tho). Played with 'thinking' Gemini: it generates pages of CoT sometimes, and now we have to figure out what (and which part) is relevant.
January 3, 2025 at 11:32 PM
This reminds me of all the issues with heatmaps and probes. The model really has no incentive to rely on its CoT unless it is explicitly asked to do so via fine-tuning or some kind of penalty.
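What "some kind of penalty" could look like as a toy fine-tuning term (my own construction, not from any specific paper; the inputs are log-probs you would compute with the model):

import torch.nn.functional as F

def cot_reliance_penalty(logp_answer_clean_cot, logp_answer_corrupted_cot, margin=1.0):
    # Both inputs: log p(answer | prompt, CoT) as tensors, with the clean vs. corrupted trace.
    # If the answer stays about as likely after the CoT is corrupted, the model is ignoring
    # its own trace, so hinge on the gap to make that insensitivity costly.
    gap = logp_answer_clean_cot - logp_answer_corrupted_cot
    return F.relu(margin - gap)

# total_loss = task_loss + lam * cot_reliance_penalty(logp_clean, logp_corrupt)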
January 3, 2025 at 11:32 PM
You always ask the right questions :) I don't think the chain-of-thought of current models (except the reasoning ones) gives reliable insight into the model. The issue is that the CoT is an output (and input) of the model, and you can change it in all sorts of ways without affecting the model's final output.
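The kind of check I mean, as a toy sketch (generate below is a hypothetical prompt -> answer helper, not any particular API): edit the CoT in various ways, paraphrase it, drop steps, even inject a wrong step, and see whether the final answer moves at all.

def cot_sensitivity(generate, question, cot, edits):
    # edits: list of (name, edited_cot) pairs, e.g. paraphrased, truncated, or corrupted traces
    baseline = generate(f"{question}\n{cot}\nFinal answer:")
    results = {}
    for name, edited_cot in edits:
        answer = generate(f"{question}\n{edited_cot}\nFinal answer:")
        results[name] = (answer, answer.strip() == baseline.strip())
    # if the answer survives heavy edits unchanged, the CoT wasn't doing the work
    return baseline, results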
January 3, 2025 at 11:32 PM
It is too early to tell :) I like the papers on your list but I think only a few of them were instant ‘classics’.

Having said that, I like the large concept models paper from Meta.
January 2, 2025 at 3:29 PM
Is the final output actually "causally" dependent on the long CoT generated? How key are these traces to the search/planning clearly happening here? So many questions but so few answers.
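One way to poke at the "causal" part, as a toy sketch (generate below is a hypothetical prompt -> answer helper): truncate the trace at different points and see where, if anywhere, the final answer starts to depend on it.

def truncation_curve(generate, question, cot_steps):
    # cot_steps: the long CoT split into individual steps
    answers = []
    for k in range(len(cot_steps) + 1):
        partial = "\n".join(cot_steps[:k])
        answers.append(generate(f"{question}\n{partial}\nFinal answer:"))
    # if the answer is already fixed with no trace (k = 0), the trace did no causal work;
    # if it only settles near the end, the later steps are the load-bearing ones
    return answers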
December 21, 2024 at 7:12 PM