Stephanie Hyland
hylandsl.bsky.social
machine learning for health at microsoft research, based in cambridge UK 🌻 she/her
Emojis and massive try: except blocks. GitHub Copilot (at least with Claude Sonnet 4) is very concerned about error handling.
August 3, 2025 at 6:46 AM
if openreview were a lot fancier you could dynamically reallocate/cancel remaining reviews once a paper meets that expected minimum.

ideally you would mark these remaining reviews as optional rather than fully cancelled, in case that reviewer has already done work
July 30, 2025 at 4:26 PM
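A rough sketch of the reallocation idea in the post above, assuming the "expected minimum" is a count of completed reviews per paper. The `Paper` class, the `mark_optional` helper, and the reviewer ids are all hypothetical and are not part of any OpenReview API:

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    paper_id: str
    completed: int = 0  # reviews already submitted
    pending: list[str] = field(default_factory=list)  # reviewers still assigned


def mark_optional(papers, expected_minimum):
    """Return {paper_id: [reviewer ids]} whose remaining reviews become optional.

    Pending reviews are marked optional rather than cancelled, so a reviewer
    who has already started can still submit.
    """
    optional = {}
    for paper in papers:
        if paper.completed >= expected_minimum and paper.pending:
            optional[paper.paper_id] = list(paper.pending)
    return optional


papers = [
    Paper("A", completed=3, pending=["r4"]),
    Paper("B", completed=2, pending=["r5", "r6"]),
]
print(mark_optional(papers, expected_minimum=3))  # {'A': ['r4']}
```

Paper B keeps its pending reviews as required because it has not yet reached the assumed minimum of three.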
How many AI researchers fold their own laundry?
July 29, 2025 at 6:29 AM
I am in the UK so feel free to discard, but I recently noticed Discord asking for age verification for some channels:
July 25, 2025 at 7:02 AM
ALSO we have released the SAEs we trained, and the automated interp for all(!!)* features:
huggingface.co/microsoft/ma...

*all features for a subset of SAEs, we didn't run the full auto-interp pipeline on the widest SAE
microsoft/maira-2-sae · Hugging Face
July 18, 2025 at 9:43 AM
We also found that the majority of the SAE features remained "uninterpretable". This indicates room for improvement in automated interpretability (we focused primarily on textual features!), and perhaps also calls into question the SAE training and modelling assumptions. More work to be done here ✌️
July 18, 2025 at 9:40 AM
... and in some cases we were able to steer MAIRA-2's generations, selectively introducing or removing concepts from its generated report.

But steering worked inconsistently! Sometimes it did nothing, or introduced off-target effects. We still don't fully understand when it will work.
July 18, 2025 at 9:35 AM
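For readers unfamiliar with SAE steering, here is a toy sketch of one common approach: add a scaled copy of a feature's decoder direction to a model activation. The posts do not say how steering was done for MAIRA-2, so this is an assumption, and the weights, dimensions, and function names are made up for illustration:

```python
def relu(v):
    return [max(0.0, x) for x in v]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def encode(x, w_enc, b_enc):
    """SAE encoder: features = ReLU(W_enc x + b_enc)."""
    return relu([h + b for h, b in zip(matvec(w_enc, x), b_enc)])


def steer(x, w_dec, feature, alpha):
    """Add alpha times one feature's decoder direction to activation x.

    Positive alpha pushes the concept in, negative alpha pushes it out.
    """
    direction = w_dec[feature]  # one row per feature (assumed layout)
    return [xi + alpha * di for xi, di in zip(x, direction)]


# Toy 2-dimensional activation with 2 hypothetical features.
w_enc = [[1.0, 0.0], [0.0, 1.0]]
b_enc = [0.0, 0.0]
w_dec = [[1.0, 0.0], [0.0, 1.0]]

x = [0.5, 0.0]
introduced = steer(x, w_dec, feature=1, alpha=2.0)
removed = steer(x, w_dec, feature=0, alpha=-1.0)
print(encode(introduced, w_enc, b_enc))  # [0.5, 2.0]
print(encode(removed, w_enc, b_enc))     # [0.0, 0.0]
```

In this toy the features are orthogonal, so steering has no off-target effects. In a real model, feature directions overlap, which is one plausible source of the inconsistency the post describes.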
We found interpretable and radiology-relevant concepts in MAIRA-2, like:
- "Aortic tortuosity or calcification"
- "Placement and position of PICC lines"
- "Presence of 'shortness of breath' in indication"
- "Describing findings without comparison to prior images"
- "Use of 'possible' or 'possibly'"
July 18, 2025 at 9:34 AM
We performed the full pipeline of SAE training, automated interpretation with LLMs, steering, and automated steering evaluation.
July 18, 2025 at 9:32 AM
Mexico is an *official* NeurIPS event: it’s an additional location for the conference, which is different to the endorsement of EurIPS.
July 17, 2025 at 7:32 PM
It’s an endorsed event but is not actually officially NeurIPS! Maybe if this experiment works well there will be more distributed (official) NeurIPS locations in future.
July 17, 2025 at 2:26 PM