Laura
@lauraruis.bsky.social
PhD supervised by Tim Rocktäschel and Ed Grefenstette, part time at Cohere. Language and LLMs. Spent time at FAIR, Google, and NYU (with Brenden Lake). She/her.
"Rather than being animals that *think*, we are *animals* that think"; the last sentence of Tom Griffiths's characterisation of human intelligence through limited time, compute, and communication hits different today than it did 4 years ago.
December 22, 2024 at 11:04 AM
Sometimes o1's thinking time almost feels like a slight. o1 is like "oh I thought about this uninvolved question of yours for 7 seconds and here is my 20 page essay on it"
December 15, 2024 at 5:38 PM
Reviewing requires constant questioning of the motive behind your responses, every step of the way. Which btw, according to chatty, will help you become a better scientist yourself.
November 27, 2024 at 5:25 PM
We squeezed all the findings into Figure 1 of the preprint, but check out the paper or blogpost for details:

Paper: arxiv.org/abs/2411.12580
Blogpost: lauraruis.github.io/2024/11/10/i...
November 20, 2024 at 4:35 PM
For the factual questions, the answer often shows up as highly influential, whereas for reasoning questions it does not (see bottom row of documents below).

Also, we find evidence for code being both positively and negatively influential for reasoning!
November 20, 2024 at 4:35 PM
We find that a document's influence on the reasoning traces of a query is strongly predictive of that document's influence on another query with the same mathematical task, indicating that influence picks up on procedural knowledge in documents for reasoning tasks.
November 20, 2024 at 4:35 PM
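For anyone who wants intuition for this cross-query check, here is a toy sketch (not the paper's code; all arrays and numbers below are made up) of correlating per-document influence scores across two queries that share the same mathematical task. A strong correlation is the signature of shared, procedural signal rather than query-specific content.

```python
# Toy sketch of the cross-query check: correlate per-document influence scores
# for two queries of the same task. Hypothetical data only.
import numpy as np

rng = np.random.default_rng(0)
procedural = rng.normal(size=10_000)                        # shared task signal
scores_query_a = procedural + 0.3 * rng.normal(size=10_000)  # query-specific noise
scores_query_b = procedural + 0.3 * rng.normal(size=10_000)

r = np.corrcoef(scores_query_a, scores_query_b)[0, 1]
print(f"per-document influence correlation across same-task queries: r = {r:.2f}")
```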
We find that the models rely less on individual documents when generating reasoning traces than when answering factual questions (indicated below by arrow thickness), and the set of documents they rely on is more general.
November 20, 2024 at 4:35 PM
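One way to make "relies less on individual documents" concrete is to ask what share of the total positive influence the top-k documents carry. A toy sketch with made-up score distributions (the real magnitudes are in the paper):

```python
# Toy illustration of influence concentration: fraction of total positive
# influence carried by the k most influential documents. Distributions are
# illustrative, not the paper's measurements.
import numpy as np

def top_k_share(scores: np.ndarray, k: int) -> float:
    """Fraction of total positive influence in the k highest-scoring docs."""
    pos = np.clip(scores, 0.0, None)
    return float(np.sort(pos)[-k:].sum() / pos.sum())

rng = np.random.default_rng(0)
factual = rng.pareto(1.5, size=100_000)    # heavier tail: a few docs dominate
reasoning = rng.pareto(3.0, size=100_000)  # lighter tail: influence more spread out

print("top-100 share, factual:  ", round(top_k_share(factual, 100), 3))
print("top-100 share, reasoning:", round(top_k_share(reasoning, 100), 3))
```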
We use influence functions to estimate the effect pretraining data have on the likelihood of completions from two LLMs (7B and 35B): answers to factual questions (left) and reasoning traces for simple mathematical tasks (3 tasks, one shown right).
November 20, 2024 at 4:35 PM
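For context, the classic influence-function estimate asks how a query's loss would change if a training example were up-weighted slightly. Below is a minimal, self-contained sketch on a toy least-squares model; at LLM scale the Hessian has to be approximated rather than inverted exactly, and everything here (model, data, names) is illustrative, not the paper's code.

```python
# Minimal influence-function sketch on a toy least-squares model.
import numpy as np

rng = np.random.default_rng(0)

# Toy "pretraining" set and one "query" for a linear model y = x @ w.
true_w = np.array([1.0, -2.0, 0.5])
X_train = rng.normal(size=(50, 3))
y_train = X_train @ true_w + 0.1 * rng.normal(size=50)
x_query = rng.normal(size=3)
y_query = x_query @ true_w

# "Trained model": least-squares fit on the training set.
w = np.linalg.lstsq(X_train, y_train, rcond=None)[0]

# Hessian of the total squared-error training loss is X^T X.
H_inv = np.linalg.inv(X_train.T @ X_train)

# Per-example training gradients and the query gradient, for loss 0.5*(x@w - y)^2.
grad_train = (X_train @ w - y_train)[:, None] * X_train
grad_query = (x_query @ w - y_query) * x_query

# Influence score of each training example on the query:
# positive ~ up-weighting it would lower the query loss (helpful),
# negative ~ it would raise the query loss (harmful).
scores = grad_train @ (H_inv @ grad_query)

print("most helpful training examples:", np.argsort(scores)[-3:][::-1])
print("most harmful training examples:", np.argsort(scores)[:3])
```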