illtud
illtud.mastodon.social.ap.brid.gy
illtud
@illtud.mastodon.social.ap.brid.gy
Technoleg a Threftadaeth, cyfri personol.
Technology & Heritage, personal account.
#Cymraeg #Aberyswyth

[bridged from https://mastodon.social/@illtud on the fediverse by https://fed.brid.gy/ ]
Reposted by illtud
So, I’ve reading up on lossless audio compression yesterday and found that based on http://www.audiograaf.nl/losslesstest/revision%206/Average%20of%20CDDA%20sources.pdf these are the most efficient free-to-use lossless audio codecs (in order):

1. OptimFROG _and_ LA (Lossless Audio) […]
Original post on tech.lgbt
tech.lgbt
February 3, 2026 at 1:22 PM
Thanks BBC for "Madness at the BBC" last night, available for 28 days after broadcast.

But there's some whiff of editorial interference. See the screenshots + I'll explain.
December 23, 2025 at 1:43 AM
Reposted by illtud
The virus of fake science is spreading: LLMs are hallucinating references that scholars are citing, and editors of real scholarly journals are accepting in published articles […]

[Original post on mastodon.social]
December 20, 2025 at 12:19 PM
Reposted by illtud
Stupidly I forgot to hit the 'record' button at the start of my talk today, but here are the slides with my notes: "my place: exploring the Library's place-based collections". https://slides.com/wragge/slv-my-place #glam #libraries #maps #digitalhumanities
my place
Presentation summarising my work as Creative Technologist-in-Residency at the State Library of Victoria LAB, December 2025
slides.com
December 3, 2025 at 5:41 AM
Reposted by illtud
Calling #digipres colleagues worldwide 📣 We’re looking for volunteers to share their workflows for the March 2026 #dpc Workflow Webinars!

Share your #digitalpreservation workflow, spark discussion, and help strengthen our community in the 2026 Workflow […]

[Original post on digipres.club]
December 2, 2025 at 10:30 AM
Reposted by illtud
My comment on plans to create a specific legal base in #gdpr which would declare the use of personal data for training #AI as justified by legitimate interest of corporations or governments […]
Original post on eupolicy.social
eupolicy.social
November 4, 2025 at 9:20 PM
Reposted by illtud
Hey UK GLAM friends, I'm super keen to get to the GLAM Labs conference in Edinburgh next June, but I need to find some funding. https://www.glamlabs.io/events/glam-labs-futures-26 Is there anything I could come and do for you around June next year that could help pay my way?
International GLAM Labs Community - GLAM Labs Futures 26
GLAM Labs Futures ● 25-26 June 2026 ● Scotland
www.glamlabs.io
November 4, 2025 at 7:36 AM
I'r gwely yn fwy tawel nawr. Falch fod @plaidcymru wedi claddu Reform er gwaetha'r arian mawr ac anwiredd tywyll a lifodd i'r ymgyrch.
#cymru #cymraeg #caerffili #senedd
October 24, 2025 at 1:26 AM
Caerphilly, Senedd constituency by-election result:

PC: 47.4% (+19.0)
REF: 36.0% (+34.2)
LAB: 11.0% (-34.9)
CON: 2.0% (-15.3)
GRN: 1.5% (+1.5)
LDEM: 1.5% (-1.2)
GWL: 0.3% (+0.3)
UKIP: 0.2% (+0.2)

Plaid Cymru GAIN from Labour.

#senedd #cymru #caerffili
October 24, 2025 at 1:19 AM
Oes na unrhyw fywyd heno yn y Fediverse Cymraeg dwi'n gweld/cyrraedd? #cymraeg #caerffili #plaidcymru #senedd #Etholiad
Oes tags dwi'n methu?
October 24, 2025 at 12:44 AM
Reposted by illtud
BBC Breakfast this morning had a long segment about the 30th anniversary of Windows 95. So what vintage computer did they put in the studio as a handy prop?

#windows95
August 29, 2025 at 7:17 AM
A reminder that the recently late Tom Lehrer relinquished his copyrights in his works and performances. Further than that, he had a website made where you can download lyrics, sheet music and mp3s, to make it easy to perform, reuse and remix his work.

https://tomlehrersongs.com/

#tomlehrer […]
Original post on mastodon.social
mastodon.social
July 27, 2025 at 10:03 PM
Reposted by illtud
For 30 years I've been experimenting with ways to see & use the online collections of libraries, archives & museums. I've made useful things, playful things, & weird things. Here's a big list of them for you to explore! https://wraggelabs.com #glam #histodons #digitalhumanities
Wragge Labs
A compilation of things I've built over the last 30 years to help people see and use the online collections of libraries, archives, and museums.
wraggelabs.com
July 4, 2025 at 6:12 AM
Reposted by illtud
More on the crisis in open-source maintenance as exemplified by libxml2: https://lwn.net/SubscriberLink/1025971/73f269ad3695186d/

Some money *has* to start flowing into this community or the foundations we all rely on will start rotting away. Given the many-billions-per-quarter in Big Tech […]
Original post on cosocial.ca
cosocial.ca
June 26, 2025 at 6:12 PM
Reposted by illtud
Mastodon (and other Fedi software) should stop stripping the C2PA manifest from the EXIF of uploaded media.

In the long run there are lots of good C2PA things to be done, but for now, just get out of the way and let the emergent ecosystem of tools work with it.

Probably best as an opt-in […]
Original post on cosocial.ca
cosocial.ca
June 20, 2025 at 6:03 PM
Reposted by illtud
This stuff about the Cambridge-Oxford ‘England and Wales’ railway scandal from @willhaycardiff.bsky.social is damning beyond words.

Evidence it has been recently reclassified from being an ‘England’ only project, robbing Wales of millions.

UK Govts (blue or red) treat #Wales abysmally.
June 8, 2025 at 7:59 AM
Reposted by illtud
Here's video, slides and a detailed annotated transcript from my talk at this week's AI Engineer World's Fair conference in San Francisco - "The last year six months in LLMs, illustrated by pelicans on bicycles" https://simonwillison.net/2025/Jun/6/six-months-in-llms/
The last year six months in LLMs, illustrated by pelicans on bicycles
I presented an invited keynote at the AI Engineer World’s Fair in San Francisco this week. This is my third time speaking at the event—here’s my talks from October 2023 …
simonwillison.net
June 6, 2025 at 8:44 PM
Reposted by illtud
Comma v0.1 1T and 2T-7B are two brand new LLMs trained exclusively on public domain and openly licensed text!

I got the 2T one running locally on my Mac after porting it to MLX - notes on that here: https://simonwillison.net/2025/Jun/7/comma/
Comma v0.1 1T and 2T—7B LLMs trained on openly licensed text
It’s been a long time coming, but we finally have some promising LLMs to try out which are trained entirely on openly licensed text! EleutherAI released the Pile four and …
simonwillison.net
June 8, 2025 at 12:10 AM
Reposted by illtud
Scoop: Texas police used Flock's license plate cameras to search *nationwide* for a woman who had a self-managed abortion. Claim it was "about her safety"

https://www.404media.co/a-texas-cop-searched-license-plate-cameras-nationwide-for-a-woman-who-got-an-abortion/
May 29, 2025 at 5:40 PM
Reposted by illtud
I put together an annotated version of the new Claude 4 system prompt, covering both the prompt Anthropic published and the missing, leaked sections that describe its various tools

It's basically the secret missing manual for Claude 4, it's […]

[Original post on fedi.simonwillison.net]
May 25, 2025 at 1:53 PM
Reposted by illtud
When lawyers first started getting yelled at by judges for citing hallucinated case law two years ago I naively assumed word would get around and they would all quickly learn not to

This new database has 116 cases from 12 countries where this happened, and 20 of them were from just this month! […]
Original post on fedi.simonwillison.net
fedi.simonwillison.net
May 25, 2025 at 5:07 PM
Reposted by illtud
The GitHub MCP server suffers from the lethal trifecta for prompt injection: access to private data, exposure to malicious instructions + the ability to exfiltrate information

Be really careful with this stuff: attackers can trick your "agent" into stealing your private data […]
Original post on fedi.simonwillison.net
fedi.simonwillison.net
May 27, 2025 at 12:28 AM
Reposted by illtud
How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation – Sean Heelan's Blog https://sean.heelan.io/2025/05/22/how-i-used-o3-to-find-cve-2025-37899-a-remote-zeroday-vulnerability-in-the-linux-kernels-smb-implementation/
How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation
In this post I’ll show you how I found a zeroday vulnerability in the Linux kernel using OpenAI’s o3 model. I found the vulnerability with nothing more complicated than the o3 API – no scaffolding, no agentic frameworks, no tool use. Recently I’ve been auditing ksmbd for vulnerabilities. ksmbd is “ _a linux kernel server which implements SMB3 protocol in kernel space for sharing files over network._ “. I started this project specifically to take a break from LLM-related tool development but after the release of o3 I couldn’t resist using the bugs I had found in ksmbd as a quick benchmark of o3’s capabilities. In a future post I’ll discuss o3’s performance across all of those bugs, but here we’ll focus on how o3 found a zeroday vulnerability during my benchmarking. The vulnerability it found is CVE-2025-37899 (fix here), a use-after-free in the handler for the SMB ‘logoff’ command. Understanding the vulnerability requires reasoning about concurrent connections to the server, and how they may share various objects in specific circumstances. o3 was able to comprehend this and spot a location where a particular object that is not referenced counted is freed while still being accessible by another thread. As far as I’m aware, this is the first public discussion of a vulnerability of that nature being found by a LLM. Before I get into the technical details, the main takeaway from this post is this: with o3 LLMs have made a leap forward in their ability to reason about code, and if you work in vulnerability research you should start paying close attention. If you’re an expert-level vulnerability researcher or exploit developer the machines aren’t about to replace you. In fact, it is quite the opposite: they are now at a stage where they can make you _significantly_ more efficient and effective. If you have a problem that can be represented in fewer than 10k lines of code there is a reasonable chance o3 can either solve it, or help you solve it. _Aside: If you work at a frontier lab and want to discuss evaluating your model’s capabilities on these sorts of tasks then drop me an email via firstname.lastname @ gmail.com_. ## o3 re-finds CVE-2025-37778 Lets first discuss CVE-2025-37778, a vulnerability I found manually but which o3 was also able to find. CVE-2025-37778 is a use-after-free vulnerability. The issue occurs during the Kerberos authentication path when handling a “ _session setup_ ” request from a remote client. To save us referring to CVE numbers, I will refer to this vulnerability as the “ _kerberos authentication vulnerability_ “. The root cause looks as follows: static int krb5_authenticate(struct ksmbd_work *work, struct smb2_sess_setup_req *req, struct smb2_sess_setup_rsp *rsp) { ... if (sess->state == SMB2_SESSION_VALID) ksmbd_free_user(sess->user); retval = ksmbd_krb5_authenticate(sess, in_blob, in_len, out_blob, &out_len); if (retval) { ksmbd_debug(SMB, "krb5 authentication failed\n"); return -EINVAL; } ... If `krb5_authenticate` detects that the session state is `SMB2_SESSION_VALID` then it frees `sess->user`. The assumption here appears to be that afterwards either `ksmbd_krb5_authenticate` will reinitialise it to a new valid value, or that after returning from `krb5_authenticate` with a return value of `-EINVAL` that `sess->user` will not be used elsewhere. As it turns out, this assumption is false. We can force `ksmbd_krb5_authenticate` to not reinitialise `sess->user`, and we can access `sess->user` even if `krb5_authenticate` returns `-EINVAL`. This vulnerability is a nice benchmark for LLM capabilities as: 1. It is interesting by virtue of being part of the remote attack surface of the Linux kernel. 2. It is not trivial as it requires: * (a) Figuring out how to get `sess->state == SMB2_SESSION_VALID` in order to trigger the free. * (b) Realising that there are paths in `ksmbd_krb5_authenticate` that do not reinitialise sess->user and reasoning about how to trigger those paths. * (c) Realising that there are other parts of the codebase that could potentially access `sess->user` after it has been freed. 3. While it is not trivial, it is also not insanely complicated. I could walk a colleague through the entire code-path in 10 minutes, and you don’t really need to understand a lot of auxiliary information about the Linux kernel, the SMB protocol, or the remainder of ksmbd, outside of connection handling and session setup code. I calculated how much code you would need to read at a minimum if you read every ksmbd function called along the path from a packet arriving to the ksmbd module to the vulnerability being triggered, and it works out at about 3.3k LoC. OK, so we have the vulnerability we want to use for evaluation, now what code do we show the LLM to see if it can find it? My goal here is to evaluate how o3 would perform were it the backend for a hypothetical vulnerability detection system, so we need to ensure we have clarity on how such a system would generate queries to the LLM. In other words, it is no good arbitrary selecting functions to give to the LLM to look at if we can’t clearly describe how an automated system would select those functions. The _ideal_ use of an LLM is we give it all the code from a repository, it ingests it and spits out results. However, due to context window limitations and regressions in performance that occur as the amount of context increases, this isn’t practically possible right now. Instead, I thought one possible way that an automated tool could generate context for the LLM was through expansion of each SMB command handler individually. So, I gave the LLM the code for the ‘session setup’ command handler, including the code for all functions it calls, and so on, up to a call depth of 3 (this being the depth required to include all of the code necessary to reason about the vulnerability). I also include all of the code for the functions that read data off the wire, parses an incoming request, selects the command handler to run, and then tears down the connection after the handler has completed. Without this the LLM would have to guess at how various data structures were set up and that would lead to more false positives. In the end, this comes out at about 3.3k LoC (~27k tokens). The final decision is what prompt to use. You can find the system prompt and the other information I provided to the LLM in the .prompt files in this Github repository. The main points to note are: 1. I told the LLM to look for use-after-free vulnerabilities. 2. I gave it a brief, high level overview of what ksmbd is, its architecture, and what its threat model is. 3. I tried to strongly guide it to not report false positives, and to favour not reporting any bugs over reporting false positives. I have no idea if this helps, but I’d like it to help, so here we are. In fact my entire system prompt is speculative in that I haven’t ran a sufficient number of evaluations to determine if it helps or hinders, so consider it equivalent to me saying a prayer, rather than anything resembling science or engineering. Once I have ran those evaluations I’ll let you know. To run the query I then use the `llm` tool (github) like: $ llm --sf system_prompt_uafs.prompt \ -f session_setup_code.prompt \ -f ksmbd_explainer.prompt \ -f session_setup_context_explainer.prompt \ -f audit_request.prompt My experiment harness executes this N times (N=100 for this particular experiement) and saves the results. It’s worth noting, if you rerun this you may not get _identical_ results to me as between running the original experiment and writing this blog post I had removed the code context in session_setup_code.prompt and had to regenerate it. I believe it is effectively identical, but have not re-run the experiment. o3 finds the kerberos authentication vulnerability in 8 of the 100 runs. In another 66 of the runs o3 concludes there is no bug present in the code (false negatives), and the remaining 28 reports are false positives. In other words, with a ratio of 1:4.5 of true positives to false positives we would have had to go through, at most, 5 false positive reports to get to one of the true positives. For comparison, Claude Sonnet 3.7 finds it 3 out of 100 runs and Claude Sonnet 3.5 does not find it in 100 runs. For the curious, I have uploaded a sample report from o3 (here) and Sonnet 3.7 (here). One aspect I found interesting is their presentation of results. With o3 you get something that feels like a human-written bug report, condensed to just present the findings, whereas with Sonnet 3.7 you get something like a stream of thought, or a work log. There are pros and cons to both. o3’s output is typically easier to follow due to its structure and focus. On the other hand, sometimes it is too brief, and clarity suffers. ## o3 finds a 0-day (CVE-2025-37899) Having confirmed that o3 can find the kerberos authentication vulnerability (CVE-2025-37778) when given the code for the session setup command handler, I wanted to see if it could find it if I give it the code for **all** of the command handlers. This is a harder problem as the command handlers are all found in smb2pdu.c, which has ~9k LoC. However, if o3 can still find vulnerabilities when given all of the handlers in one go then it suggests we can build a more straightforward wrapper for o3 that simply hands it entire files, covering a variety of functionality, rather than going handler by handler. _It’s worth noting that while the top level command handlers are found in smb2pdu.c not all of the functions they call are found in that file. There is functionality for virtual file system access, IPC, crypto etc. that these top level handlers make use of but that are found in other files, and I did not provide these to the model in this experiment_. Combining the code for all of the handlers with the connection setup and teardown code, as well as the command handler dispatch routines, ends up at about 12k LoC (~100k input tokens), and as before I ran the experiment 100 times. o3 finds the kerberos authentication vulnerability in 1 out of 100 runs with this larger number of input tokens, so a clear drop in performance, but it does still find it. More interestingly however, in the output from the other runs I found a report for a similar, but novel, vulnerability. This vulnerability is also due to a free of `sess->user`, but this time in the session logoff handler. int smb2_session_logoff(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn; struct ksmbd_session *sess = work->sess; ... ksmbd_close_session_fds(work); ksmbd_conn_wait_idle(conn); ... down_write(&conn->session_lock); sess->state = SMB2_SESSION_EXPIRED; up_write(&conn->session_lock); if (sess->user) { ksmbd_free_user(sess->user); sess->user = NULL; } ... I’ll let o3 explain the issue: Short description ================= While one ksmbd worker thread is still executing requests that use sess‑>user, another thread that processes an SMB2 LOGOFF for the same session frees that structure. No synchronisation protects the pointer, so the first thread dereferences freed memory – a classic use‑after‑free that leads to kernel memory corruption and arbitrary code execution in kernel context. Affected code path ------------------ 1. A second transport is bound to an existing session (SMB 3.0 or later, `conn->binding == true`). 2. Worker‑A (running on connection C2) receives any normal request (e.g. WRITE). • `smb2_check_user_session()` stores a pointer to the already existing `struct ksmbd_session` in `work->sess` and increments the session ref‑count, **but it does not take any reference on `sess->user`.** • The pointer to `sess` is kept for the whole request processing; every helper that needs authentication data accesses `sess->user` directly. 3. Before Worker‑A finishes, Worker‑B (running on the first connection C1) processes an SMB2 ​LOGOFF for the same session and executes `smb2_session_logoff()` (smb2pdu.c). Relevant part of smb2_session_logoff() -------------------------------------- ```c ... if (sess->user) { ksmbd_free_user(sess->user); /* (1) frees memory */ sess->user = NULL; /* (2) clear the field */ } ... ``` `ksmbd_free_user()` ultimately ends in simple `kfree(user)`. 4. Worker‑B **does not wait for the other connections that are still using the session**; it only waits for running requests on *its own* connection (`ksmbd_conn_wait_idle(conn)`). 5. Worker‑A continues to execute (e.g. inside `smb2_open()`, `smb2_write()`, …) and dereferences `sess->user`: ```c /* examples (many of them) */ if (user_guest(sess->user)) ← dereference after free ksmbd_compare_user(sess->user, …) sess->user->uid ``` Because the memory was already `kfree()`‑ed in step (1) the access is to freed memory. Depending on exact timing it is either: • a use‑after‑free (pointer still points into now‑reused slab object), enabling controlled kernel‑memory overwrite, or • a NULL–deref (if Worker‑A reads after step (2)), still a DoS. Reading this report I felt my expectations shift on how helpful AI tools are going to be in vulnerability research. If we were to never progress beyond what o3 can do right now, it would still make sense for everyone working in VR to figure out what parts of their work-flow will benefit from it, and to build the tooling to wire it in. Of course, part of that wiring will be figuring out how to deal with the the signal to noise ratio of ~1:50 in this case, but that’s something we are already making progress at. One other interesting point of note is that when I found the kerberos authentication vulnerability the fix I proposed was as follows: diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index d24d95d15d87..57839f9708bb 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -1602,8 +1602,10 @@ static int krb5_authenticate(struct ksmbd_work *work, if (prev_sess_id && prev_sess_id != sess->id) destroy_previous_session(conn, sess->user, prev_sess_id); - if (sess->state == SMB2_SESSION_VALID) + if (sess->state == SMB2_SESSION_VALID) { ksmbd_free_user(sess->user); + sess->user = NULL; + } retval = ksmbd_krb5_authenticate(sess, in_blob, in_len, out_blob, &out_len); -- 2.43.0 When I read o3’s bug report above I realised this was insufficient. The logoff handler _already_ sets `sess->user = NULL`, but is still vulnerable as the SMB protocol allows two different connections to “bind” to the same session and there is nothing on the kerberos authentication path to prevent another thread making use of `sess->user` in the short window after it has been freed and before it has been set to NULL. I had already made use of this property to hit a prior vulnerability in ksmbd but I didn’t think of it when considering the kerberos authentication vulnerability. Having realised this, I went again through o3’s results from searching for the kerberos authentication vulnerability and noticed that in some of its reports it had made the same error as me, in others it had not, and it had realised that setting `sess->user = NULL` was insufficient to fix the issue due to the possibilities offered by session binding. That is quite cool as it means that had I used o3 to find and fix the original vulnerability I would have, in theory, done a better job than without it. I say ‘in theory’ because right now the false positive to true positive ratio is probably too high to definitely say I would have gone through each report from o3 with the diligence required to spot its solution. Still, that ratio is only going to get better. ## Conclusion LLMs exist at a point in the capability space of program analysis techniques that is far closer to humans than anything else we have seen. Considering the attributes of creativity, flexibility, and generality, LLMs are far more similar to a human code auditor than they are to symbolic execution, abstract interpretation or fuzzing. Since GPT-4 there has been hints of the potential for LLMs in vulnerability research, but the results on real problems have never quite lived up to the hope or the hype. That has changed with o3, and we have a model that can do well enough at code reasoning, Q&A, programming and problem solving that it can genuinely enhance human performance at vulnerability research. o3 is not infallible. Far from it. There’s still a substantial chance it will generate nonsensical results and frustrate you. What is different, is that for the first time the chance of getting correct results is sufficiently high that it is worth your time and and your effort to try to use it on real problems. Whether it works or not, your time will not be wasted. Now is the perfect opportunity to construct a few small, targeted benchmarks of problems that matter to you, and that you can use to track progress as new and improved models land. ### Share this: * Click to share on Facebook (Opens in new window) Facebook * Click to share on X (Opens in new window) X * Like Loading... ### _Related_
sean.heelan.io
May 26, 2025 at 3:13 AM