Stuart Gray
@sgray.bsky.social
He/Him. AI Wrangler. Web Geek. F1 Fan. All views my own.

🤖 AI, LLMs, GenAI, NLP
🐍 Python Dev
🚀 Indie Hacker
🎮 Game Dev, ProcGen, Unity, C#
🏎️ F1 Fan
🇬🇧 UK Based

🦣 mastodonapp.uk/@StuartGray
✖️ x.com/StuartGray (inactive)
Pinned
I welcome any genuine civil discussion, challenge, or critique.

However, if you disagree with a post so strongly that you can't refrain from insults or rude, unthinking replies, then please save us both a lot of time and block me now - because I will block you.
Reposted by Stuart Gray
one to be aware of, browsers allow any website to go full screen, so now ClickFix are faking Windows update reboot prompts to get remote access for ransomware groups.

cyberplace.social/@GossiTheDog...
Kevin Beaumont (@GossiTheDog@cyberplace.social)
Attached: 1 image Interesting one spotted by Daniel B in the NHS - ClickFix (fake browser adverts to encourage people to run commands which provide remote access) have a new technique - they use brow...
cyberplace.social
November 13, 2025 at 12:15 PM
Reposted by Stuart Gray
Just tried to annotate somebody else's request thread on @whatdotheyknow.bsky.social and apparently that's not a thing anymore?

From late May – seems to be, in part at least, another tedious consequence of the Online Safety Act github.com/mysociety/wh... + github.com/mysociety/al...

#FOI #openweb
November 13, 2025 at 10:11 AM
Reposted by Stuart Gray
Already 18 hallucinatory citations in UK cases. And we thought we were smarter / smugger than this
www.damiencharlotin.com/hallucinatio...
AI Hallucination Cases Database – Damien Charlotin
Database tracking legal cases where generative AI produced hallucinated citations submitted in court filings.
www.damiencharlotin.com
November 13, 2025 at 8:56 AM
Reposted by Stuart Gray
The British government admits it is now monitoring VPN use by UK residents. Regulator Ofcom has contracted with an AI-powered surveillance service to detect the number of citizens using VPNs to evade the Online Safety Act.

The UK tech minister has said a VPN ban is on the table.
Exclusive: Ofcom is monitoring VPNs following Online Safety Act. Here's how
Ignoring VPNs risks creating ineffective laws, but tracking them threatens people's privacy
www.techradar.com
November 11, 2025 at 11:39 PM
Reposted by Stuart Gray
This is awful, UK government raising an expectation that cannot be met technologically. "to ensure AI models cannot be misused to create synthetic child sexual abuse images" - would be nice but is not possible.
New amendment to the Crime and Policing Bill announced by the Government, enabling safety testing of AI models to prevent them being used to create CSAM; the press release also notes that such testing will cover extreme pornography and non-consensual intimate imagery. ⬇️ www.gov.uk/government/n...?
New law to tackle AI child abuse images at source as reports more than double
New legislation sees government work with AI industry and child protection organisations to ensure AI models cannot be misused to create synthetic child sexual abuse images.
www.gov.uk
November 12, 2025 at 1:58 PM
Anti AI sentiment has finally jumped the shark - some people are now lumping procedural generation in with genAI & “AI Slop”.

They are literally anti-algorithm now.

And yet non-ironically posting about it on an app only possible because of algorithms, on a software powered device!

🤦
November 12, 2025 at 12:54 PM
Reposted by Stuart Gray
Banning VPNs not only compromises privacy, but it compromises safety. Being forced to use public wifi when travelling without the option of a VPN to make that safe will just lead to compromised computers. All in pursuit of the so-called "Online Safety Act", which achieves the opposite of safety.
The British government admits it is now monitoring VPN use by UK residents. Regulator Ofcom has contracted with an AI-powered surveillance service to detect the number of citizens using VPNs to evade the Online Safety Act.

The UK tech minister has said a VPN ban is on the table.
Exclusive: Ofcom is monitoring VPNs following Online Safety Act. Here's how
Ignoring VPNs risks creating ineffective laws, but tracking them threatens people's privacy
www.techradar.com
November 12, 2025 at 8:18 AM
Reposted by Stuart Gray
I'll give you a US example:

The main age-verification lobbyist — a man who largely believes porn should be outlawed — admitted the state-level bills he pushed for won't work and were really a predicate for federal action.

He wants the DOJ to seize domains.
November 12, 2025 at 12:19 AM
Reposted by Stuart Gray
The fact that the BBC has made serious culpable errors does not negate the point that there is a real and concerted right-wing media campaign to destroy it. Both points can be true at the same time and the campaign would not end even if the errors did.
November 10, 2025 at 1:08 PM
Reposted by Stuart Gray
So, just the normal stuff your carmaker knows about you.
🤯

(From Byron Tau’s Means of Control, which you should read)

#talkaboutSurveillanceCapitalism
November 10, 2025 at 8:43 AM
Reposted by Stuart Gray
BREAKING

President Trump has granted a pardon to a slew of Trump world figures, including Rudy Giuliani, Mark Meadows, and Sidney Powell, for their efforts to overturn the 2020 election.
November 10, 2025 at 4:59 AM
Reposted by Stuart Gray
Can LLMs accurately aggregate information over long, information-dense texts? Not yet…

We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!
November 7, 2025 at 5:07 PM
Reposted by Stuart Gray
Lily is one of my coworkers. imo she's the archetype of what success looks like for AI in the workforce

she's in marketing, and leverages AI in ridiculously powerful ways

i like her phrasing — "Ops" is the superpower, not AI. AI is merely the tool

www.appliedaiformops.com/p/why-ops-sk...
Why Ops Skills Are Your AI Superpower
How your ops skills supercharge AI effectiveness - plus a practical roadmap for building AI solutions for your business
www.appliedaiformops.com
November 5, 2025 at 2:51 PM
Reposted by Stuart Gray
I'm obliged for the nod.

The "Penrose Effect" seems to be a real thing - hypothesised in the 1930s and re-tested in the last decade or so:

Where you reduce your inpatient psychiatric provision, you'll see a correlated rise, within 10 years, in the number of seriously mentally ill prisoners.
November 9, 2025 at 11:37 AM
Reposted by Stuart Gray
This is a cool paper showing that first-gen college students don't realize a lot of unwritten rules that lead to success (the value of internships, student clubs, letters from professors).

But giving them access to an LLM for guidance significantly closes the gap. mgcuna.github.io/website/JMP_...
November 9, 2025 at 2:55 PM
Reposted by Stuart Gray
I'm teaching "Intro to NLP" for our grad students next semester, and I'm curious how others are teaching such courses, in our current "era of AI." I've seen ideas (no tech in class, commonplace books) for smaller seminars, but how to do this in large, structured CS classes? Any success stories?
November 9, 2025 at 3:05 PM
Reposted by Stuart Gray
... which results in many (closed and open) models showing a similar performance bias towards likely leaked samples
We split MMLU in two parts (leaked/clean) and show that almost all models tend to perform better on leaked samples
November 7, 2025 at 9:11 PM
Reposted by Stuart Gray
This contamination is not intentional: we identified websites that reframed splits of MMLU as user-friendly quizzes
These websites can then be found in CommonCrawl dumps that are generally used for pretraining data curation...
November 7, 2025 at 9:11 PM
Reposted by Stuart Gray
We used the great Infinigram from Jiacheng Liu and found numerous hints of test set leakage in DCLM, which is used in OLMo-2

For instance, the fraction of MMLU questions that are leaked in pretraining went from ~1% to 24% between OLMo-1 and 2 😬
November 7, 2025 at 9:11 PM
Reposted by Stuart Gray
Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀

We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data

(TLDR: we cheat and get good scores)

@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social
November 7, 2025 at 9:11 PM
Reposted by Stuart Gray
I'm a high demand resource and was recently re-org'd under a newly hired mgr. they have an H1B already but it needs to be transferred and can't because of the shutdown. unexpected delayed start date causing some small chaos as high priority projects vie for headcount and don't know who to talk to.
November 7, 2025 at 7:35 PM
Reposted by Stuart Gray
15 years ago, I said every social media founder should be forced to use and live with the default settings of their platforms. Now I would say that the AI founders should have their tools pointed at their families for a year before they can deploy them elsewhere.
I seem to remember whole congressional hearings about the terrible destructive force that is violence in video games and yet, there is a very strange quiet about an unregulated technology that coaches people to suicide.
There are no words for how evil this is
November 7, 2025 at 11:01 AM
Reposted by Stuart Gray
These all sound like edge cases. But as we are seeing in the US right now, edge cases have a horrible habit of becoming reality. It's not about "can you trust today's government" - it's about can you trust a completely unknown government in the future?
November 6, 2025 at 12:49 PM
Reposted by Stuart Gray
Driving without insurance sounds innocuous. How about driving to a demonstration? How about driving near the home of an MP? How about driving to an illegal meeting of gay people, if a government made being gay illegal?
November 6, 2025 at 12:49 PM
Reposted by Stuart Gray
The second problem is people think that the data will only be used for crimes that are on the statute books today. They forget that any government in the future can add to the crimes that data can be used for.
November 6, 2025 at 12:49 PM