Joe Stacey
joestacey.bsky.social
NLP PhD student at Imperial College London and Apple AI/ML Scholar.
5) The best way to improve performance on the hardest OOD data was to choose more challenging training examples

Our best method (Uncertainty Sampling) picked examples with the most uncertain predictions. This identified challenging examples, but without too much label noise
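
To make "most uncertain predictions" concrete, here is a minimal sketch that ranks training examples by predictive entropy and keeps the top k. The post doesn't say which uncertainty measure the paper uses, so entropy is an assumption here, and the function names and probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def uncertainty_sample(predictions, k):
    """Return indices of the k examples with the highest predictive entropy."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model probabilities over (entailment, neutral, contradiction).
predictions = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain prediction
    [0.70, 0.20, 0.10],  # fairly confident prediction
    [0.34, 0.33, 0.33],  # close to uniform, so the most uncertain
]

selected = uncertainty_sample(predictions, k=2)
print(selected)  # indices of the two most uncertain examples
```

Other uncertainty scores (e.g. the margin between the top two classes) would slot into the same ranking step.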
May 27, 2025 at 3:50 PM
3) Replacing some training examples with LLM-generated data proved very effective on less challenging OOD data

See Standard-OOD scores below (avg), where the simplest LLM-generated data (Short & Simple Generation) performed best, with substantial improvements
May 27, 2025 at 3:50 PM
2) We experimented with 6+ ways of improving robustness:

This involved sampling methods to choose more complex examples in our training data, and generating new synthetic examples

Some methods were pretty fun, e.g. asking an LLM to assess the difficulty of training examples
May 27, 2025 at 3:50 PM
1) It's time to stop using fine-tuned encoder models:

We find that fine-tuned LLMs are substantially more robust than commonly used encoder models, despite being fine-tuned on 50x less data.

This is especially the case on challenging OOD datasets (see Challenge-OOD avg below)
May 27, 2025 at 3:50 PM
We have a fun new #NLProc paper on arXiv about improving the robustness of fine-tuned NLI models!

Have a look :)
arxiv.org/abs/2505.20209
May 27, 2025 at 3:50 PM
This paper is really cool. They decompose NLI (and defeasible NLI) hypotheses into atoms, and then use these atoms to measure the logical consistency of LLMs.

E.g. for an entailment NLI example, each hypothesis atom should also be entailed by the premise.
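
A rough sketch of that entailment rule, assuming we already have a predicted label for the full hypothesis and for each atom. The label strings and function name are hypothetical, and this covers only the entailment case described here, not the paper's actual consistency metric.

```python
def is_consistent(hypothesis_label, atom_labels):
    """Check the entailment rule for one NLI example.

    If the full hypothesis is predicted as entailed, every hypothesis atom
    must also be predicted as entailed. Other labels are not constrained
    in this sketch.
    """
    if hypothesis_label != "entailment":
        return True
    return all(label == "entailment" for label in atom_labels)


# Hypothetical predictions: the hypothesis is judged entailed, but one atom is not.
print(is_consistent("entailment", ["entailment", "entailment"]))
print(is_consistent("entailment", ["entailment", "neutral"]))
```

The first call is consistent; the second flags a model that accepts the whole hypothesis while rejecting one of its parts.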

Very nice idea 👏👏
February 18, 2025 at 4:14 PM
You can see a table like this. Would be brilliant to know which reviews are 'great' or not though!
January 7, 2025 at 3:26 PM
#10 Severn Beach Line in Bristol (England)

My local train line in Bristol is so heartwarming. My highlight was commuting to the obscure, industrial St Andrews Road station (pictured) when I was a receptionist at the nearby firefighters' hotel / training centre
January 2, 2025 at 11:32 AM
#9 Urumqi to Beijing in China

I did this journey a long time ago so I'm sure the train will have totally transformed now. I chose the cheapest 'hard seat' class, and remember curling up on the floor to sleep at night.

But wow the journey was fun, and I was sitting with the funnest maths students.
January 2, 2025 at 11:32 AM
#6 Stockholm to Narvik (northern Norway)

I stopped a few times along the way, but going up into the Arctic Circle was pretty exciting, and taking the train through snowy blizzards was also new for me.

Picture from Google.
January 2, 2025 at 11:32 AM
#4 The Canadian (4 days from Toronto to Vancouver)

I did this trip in winter, and got to see Canada in the snow. Apart from the scenery, I love when the train stops for a few hours somewhere and you can run out and explore.
January 2, 2025 at 11:32 AM
#2 The Empire Builder (Seattle to Chicago)

I love Amtrak long distance trains, with their amazing little sleeper rooms and dining cars. The views were great, but the real highlight on Amtrak is always the people I meet there
January 2, 2025 at 11:32 AM
#1 The Trans-Siberian (or more specifically the Trans-Mongolian)

From Beijing to Moscow took me ~7 days, where I lived off instant noodles and vodka from the dining car. You get to see China gradually change into Mongolia and then into Russia.
January 2, 2025 at 11:32 AM
#3

London is a great city to live in. I love all the green spaces personally, and the public transport (with the fantastic new Elizabeth line), but this city has just about everything.
December 28, 2024 at 4:12 PM
#2

The location is brilliant! Imperial is right by Hyde Park / Kensington Gardens, which are enormous green parks. My commute involves walking through them on my way to uni and I love it.

Perfect for running too if that's the sort of thing you're into.
December 28, 2024 at 4:12 PM
#1

Imperial has a fantastic reputation, and we came 2nd in the world in the last QS World University Rankings.

Help us get #1 from MIT 😊
December 28, 2024 at 4:12 PM
Made it to northern Sweden (Kiruna) by train from London. Freezing cold with northern lights 😍

Just over a week ago and I was in the crazy Miami heat for #EMNLP2024
November 26, 2024 at 10:26 PM
This paper's findings about testing LLMs on NLI align with many of my personal thoughts:

1) NLI remains a difficult task for LLMs
2) Having more few-shot examples is helpful (in my view, helping LLMs better understand class boundaries)
3) Incorrect predictions are often a result of ambiguous labels
November 24, 2024 at 4:38 PM
I’ve seen some pretty amazing metros before (like Moscow), but wow Stockholm is wild. Never seen anything like it!
November 23, 2024 at 3:37 PM
You know it’s cold when little Hamish starts hugging the radiator ❤️
November 18, 2024 at 10:57 PM
I love the Amtrak dining cars!! How pretty is this. Really good sit-down breakfasts, lunches and dinners. And the best bit is all the amazing people you meet and speak to at the meals.
November 17, 2024 at 9:28 PM
Just boarded my train from Miami to New York post #EMNLP2024 and super excited!! Amtrak trains are fantastic, and I’ve got my own little room with two seats, a bed above, and a toilet next to the bed.

The toilet thing is a bit weird though if you have two to a room
November 17, 2024 at 1:11 PM
Such a fantastic reaction to our paper today. So happy 🙂

Chocolates went down well too!

Massive thanks to everyone for all your ideas and feedback
November 15, 2024 at 1:32 AM
Excited to present our #EMNLP2024 paper as a poster this morning at 10:30 (in the downstairs poster room)!

It's cool work about creating inherently interpretable models, and (as always) I will have chocolate to give out 😀

Paper is here: aclanthology.org/2024.emnlp-m...
November 14, 2024 at 1:37 PM