FOIMonkey
@foimonkey.bsky.social
Recovering FOI enthusiast and polyglot with a developing machine learning habit.
TIL that over the years, I have added or updated 34,724 unique public authorities on WhatDoTheyKnow. That's 74% of the total. It's all my fault 😅
November 28, 2025 at 12:10 PM
The decision by a number of local councils to run adverts on their websites didn't quite sit right with me. I couldn't quite work out why until I saw an advert for a credit card on the crisis loan page.
November 14, 2025 at 12:11 PM
When I find failed redactions or accidental releases of PII in FOI responses, I will usually notify the authority. The response is mixed, but far too often there is just silence. Often the only way I know they've got my message is by seeing if the file has disappeared. You'd think they'd want to know.
September 22, 2025 at 11:15 AM
Microsoft seems to have pulled the larger VibeVoice TTS model from Hugging Face, and the GitHub repo 404s github.com/microsoft/Vi.... It's not been out for long, but I can't be alone in having downloaded both, and it's MIT licensed, so there is nothing to stop mirrors. I wonder what the issue is? 🤔
https://github.com/microsoft/VibeVoice
September 4, 2025 at 9:03 AM
The SSD drive of shame has hit 2,752,077 files. Of course I haven't plugged in a new one rather than face clearing it 😅
August 27, 2025 at 12:43 PM
As a bit of fun, I used some of the WDTK keywords that I created last week to generate synthetic FOI requests with mistral-small. I then fine-tuned SmolLM2-360M-Instruct on those outputs to write requests from 3 keywords and the authority name: huggingface.co/HMC83/reques...
HMC83/request_writer_smol · Hugging Face
August 18, 2025 at 12:02 PM
I've uploaded descriptive keywords for over 1 million public FOI requests. For now I have kept it to the request_id, the keywords and the name of the public authority: github.com/FOIMonkey/fo...
August 15, 2025 at 12:21 PM
As a side effect of this, I have generated keywords for over 1 million FOI requests. Figuring out how to do that well in the most lightweight way was a journey in and of itself. I've not looked yet, but combined with authority and outcome data, it should be possible to spot some interesting trends.
August 14, 2025 at 1:50 PM
Turns out you don't need to read an FOI response to start guessing the outcome. I trained a TF-IDF classifier with a 73% macro F1-score in predicting success using just 3 keywords about the request and metadata. Adding the full request text hits 76%, and a snippet of the response email 84%.
August 14, 2025 at 1:43 PM
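For readers unfamiliar with the two terms in the post above, here is a minimal pure-Python sketch of TF-IDF weighting and the macro F1 metric. It is not the author's pipeline: the smoothed IDF formula, the example keywords and the "success"/"refused" labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (each doc is a list of tokens)."""
    n = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF (an assumed variant): a term found in every
            # document still keeps a small positive weight.
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical keyword sets for two requests.
docs = [["parking", "fines", "income"], ["parking", "contracts", "spend"]]
vectors = tfidf(docs)

# Hypothetical outcomes and predictions.
y_true = ["success", "success", "refused", "refused"]
y_pred = ["success", "refused", "refused", "refused"]
score = macro_f1(y_true, y_pred)
```

Because macro F1 averages the classes without weighting them by size, a rare outcome counts as much as a common one, which matters when outcomes are imbalanced.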
I wrote up some quick notes on yesterday's journey to nowhere:
foimonkey.github.io/posts/12-hou...
12 hour Public Transport challenge
Objective Travel on as many different types of public transport as possible within a 12 hour window, starting and finishing in the same location.
August 7, 2025 at 2:50 PM
Made it back to Cowes in 11 hours 39 minutes, taking the floating bridge x 2, a double decker bus, a hovercraft, a single decker bus, a tram, the overground, the DLR x 2 (got on the wrong train), the cable car, a catamaran, the underground, an automatic people mover, 3 x trains, and the vehicle ferry.
August 6, 2025 at 5:26 PM
Going to see how many different forms of public transport it is possible to take in one day today. First up: the floating bridge.
August 6, 2025 at 5:36 AM
I took 40,000 images from emails in the WhatDoTheyKnow archive and extracted the five most dominant colours from each (excluding monochrome/shades of grey). Behold the palette of the UK public sector.
August 3, 2025 at 3:06 PM
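One simple way to pull dominant colours from an image while skipping greys is sketched below. The post does not say which method was used, so this quantise-and-count approach is a stand-in, not the author's technique. The function name and the `bucket` and `grey_tolerance` values are hypothetical.

```python
from collections import Counter


def dominant_colours(pixels, k=5, bucket=32, grey_tolerance=16):
    """Return up to k most common colours as (r, g, b) tuples, ignoring greys.

    pixels: iterable of (r, g, b) tuples with channels in 0-255.
    bucket and grey_tolerance are illustrative values only.
    """
    counts = Counter()
    for r, g, b in pixels:
        # Black, white and greys have near-equal channels, so skip them.
        if max(r, g, b) - min(r, g, b) <= grey_tolerance:
            continue
        # Snap each channel to the centre of its bucket so that
        # near-identical shades are counted together.
        key = tuple((c // bucket) * bucket + bucket // 2 for c in (r, g, b))
        counts[key] += 1
    return [colour for colour, _ in counts.most_common(k)]


# A tiny synthetic "image": two reds, one blue, plus grey and white.
pixels = (
    [(200, 30, 30)] * 5
    + [(205, 28, 25)] * 3
    + [(20, 40, 200)] * 4
    + [(128, 128, 128)] * 10
    + [(255, 255, 255)] * 10
)
palette = dominant_colours(pixels)
```

Counting quantised buckets is cheap enough to run over tens of thousands of images. A clustering method such as k-means would give smoother palettes at a higher cost.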
Then I'm going to have another go at teaching it Welsh. Did the 1B version on the weekend, which showed promising results in terms of picking up the vocab/grammar vs the base model. A larger, more capable model and a more curated dataset seem like the way to go there if it is to be useful.
July 29, 2025 at 11:43 AM
I'm starting a couple of new projects today. First up, trying to make OLMo-2 better at reasoning. I want to have a go at teaching it to generate <think> tokens showing its chain-of-thought as it answers. It will be a couple of days before I know if it has worked.
July 29, 2025 at 11:40 AM