Laura
@lauraruis.bsky.social
PhD supervised by Tim Rocktäschel and Ed Grefenstette, part time at Cohere. Language and LLMs. Spent time at FAIR, Google, and NYU (with Brenden Lake). She/her.
"Rather than being animals that *think*, we are *animals* that think"; the last sentence of Tom Griffiths's characterisation of human intelligence through limited time, compute, and communication hits different today than it did 4 years ago.
December 22, 2024 at 11:04 AM
Sometimes o1's thinking time almost feels like a slight. o1 is like "oh I thought about this uninvolved question of yours for 7 seconds and here is my 20 page essay on it"
December 15, 2024 at 5:38 PM
Reviewing requires constant questioning of the motive behind your responses, every step of the way. Which btw, according to chatty, will help you become a better scientist yourself.
November 27, 2024 at 5:25 PM
We squeezed all the findings into Figure 1 of the preprint, but check out the paper or blogpost for details:

Paper: arxiv.org/abs/2411.12580
Blogpost: lauraruis.github.io/2024/11/10/i...
November 20, 2024 at 4:35 PM
For the factual questions, the answer often shows up as highly influential, whereas for reasoning questions it does not (see bottom row of documents below).

Also, we find evidence for code being both positively and negatively influential for reasoning!
November 20, 2024 at 4:35 PM
We find that a document's influence on the reasoning traces of a query is strongly predictive of that document's influence on another query with the same mathematical task, indicating that influence picks up on procedural knowledge in documents for reasoning tasks.
November 20, 2024 at 4:35 PM
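For anyone who wants intuition for this cross-query check, here is a toy sketch (not the paper's code; all arrays and numbers below are made up) of correlating per-document influence scores across two queries that share the same mathematical task. A strong correlation is the signature of shared, procedural signal rather than query-specific content.

```python
# Toy sketch of the cross-query check: correlate per-document influence scores
# for two queries of the same task. Hypothetical data only.
import numpy as np

rng = np.random.default_rng(0)
procedural = rng.normal(size=10_000)                        # shared task signal
scores_query_a = procedural + 0.3 * rng.normal(size=10_000)  # query-specific noise
scores_query_b = procedural + 0.3 * rng.normal(size=10_000)

r = np.corrcoef(scores_query_a, scores_query_b)[0, 1]
print(f"per-document influence correlation across same-task queries: r = {r:.2f}")
```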
We find that the models rely less on individual documents when generating reasoning traces than when answering factual questions (indicated below by arrow thickness), and the set of documents they rely on is more general.
November 20, 2024 at 4:35 PM
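One way to make "relies less on individual documents" concrete is to ask what share of the total positive influence the top-k documents carry. A toy sketch with made-up score distributions (the real magnitudes are in the paper):

```python
# Toy illustration of influence concentration: fraction of total positive
# influence carried by the k most influential documents. Distributions are
# illustrative, not the paper's measurements.
import numpy as np

def top_k_share(scores: np.ndarray, k: int) -> float:
    """Fraction of total positive influence in the k highest-scoring docs."""
    pos = np.clip(scores, 0.0, None)
    return float(np.sort(pos)[-k:].sum() / pos.sum())

rng = np.random.default_rng(0)
factual = rng.pareto(1.5, size=100_000)    # heavier tail: a few docs dominate
reasoning = rng.pareto(3.0, size=100_000)  # lighter tail: influence more spread out

print("top-100 share, factual:  ", round(top_k_share(factual, 100), 3))
print("top-100 share, reasoning:", round(top_k_share(reasoning, 100), 3))
```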
We use influence functions to estimate the effect pretraining data have on the likelihood of completions from two LLMs (7B and 35B): answers to factual questions (left) and reasoning traces for simple mathematical tasks (3 tasks, one shown right).
November 20, 2024 at 4:35 PM
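For context, the classic influence-function estimate asks how a query's loss would change if a training example were up-weighted slightly. Below is a minimal, self-contained sketch on a toy least-squares model; at LLM scale the Hessian has to be approximated rather than inverted exactly, and everything here (model, data, names) is illustrative, not the paper's code.

```python
# Minimal influence-function sketch on a toy least-squares model.
import numpy as np

rng = np.random.default_rng(0)

# Toy "pretraining" set and one "query" for a linear model y = x @ w.
true_w = np.array([1.0, -2.0, 0.5])
X_train = rng.normal(size=(50, 3))
y_train = X_train @ true_w + 0.1 * rng.normal(size=50)
x_query = rng.normal(size=3)
y_query = x_query @ true_w

# "Trained model": least-squares fit on the training set.
w = np.linalg.lstsq(X_train, y_train, rcond=None)[0]

# Hessian of the total squared-error training loss is X^T X.
H_inv = np.linalg.inv(X_train.T @ X_train)

# Per-example training gradients and the query gradient, for loss 0.5*(x@w - y)^2.
grad_train = (X_train @ w - y_train)[:, None] * X_train
grad_query = (x_query @ w - y_query) * x_query

# Influence score of each training example on the query:
# positive ~ up-weighting it would lower the query loss (helpful),
# negative ~ it would raise the query loss (harmful).
scores = grad_train @ (H_inv @ grad_query)

print("most helpful training examples:", np.argsort(scores)[-3:][::-1])
print("most harmful training examples:", np.argsort(scores)[:3])
```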