Mathew Attlee
@codeinabox.hachyderm.io.ap.brid.gy
Code In A Box is a London-based software development consultancy run by Mathew Attlee, specialising in service and web development.

[bridged from https://hachyderm.io/@codeinabox on the fediverse by https://fed.brid.gy/ ]
Reposted by Mathew Attlee
GenAI feels like faster horses, not trains. Powerful and useful, but not yet the kind of change you can point at and say “this could not exist before”

1/4
Faster horses, not trains. Yet | Rob Bowley
blog.robbowley.net
December 16, 2025 at 2:52 PM
@graeme howdy! Does your blog have an RSS feed?
December 17, 2025 at 1:16 PM
Reposted by Mathew Attlee
Ran into a problem in prod?
Just generate a fake cloudflare error page and blame it on them - gives you time to fix.

#foss #devops #cloudflare #infosec
December 16, 2025 at 6:41 AM
Nearly every time I dictate “quick huddle”, OpenAI’s Whisper model transcribes it as "quick cuddle" 🤖 😅 🤣
December 17, 2025 at 1:05 PM
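For what it’s worth, Whisper’s transcription endpoint accepts an optional prompt that can bias it toward expected vocabulary, which usually helps with words like “huddle”. A minimal TypeScript sketch using the official openai SDK; the file name and prompt wording are just illustrative, not from the post above.

```typescript
import fs from "node:fs";
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const transcription = await client.audio.transcriptions.create({
  // "dictation.m4a" is a hypothetical clip of dictated speech.
  file: fs.createReadStream("dictation.m4a"),
  model: "whisper-1",
  // The prompt nudges the model toward expected terms,
  // making "quick huddle" more likely than "quick cuddle".
  prompt: "Workplace dictation. Expect terms like: quick huddle, stand-up, retro.",
});

console.log(transcription.text);
```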
Reposted by Mathew Attlee
it's truly amazing what LLMs can achieve. we now know it's possible to produce an html5 parsing library with nothing but the full source code of an existing html5 parsing library, all the source code of all other open source libraries ever, a meticulously maintained and extremely comprehensive […]
Original post on mastodon.social
mastodon.social
December 17, 2025 at 3:06 AM
Some interesting takes in this newsletter about the future of remote working. One thing I did find very interesting about Instagram’s return to the office policy is a call for fewer meetings. I think it’s very true that remote working has created a culture of meetings because it isn’t constrained […]
Original post on hachyderm.io
hachyderm.io
December 13, 2025 at 4:16 PM
A fascinating article by Kent Beck, arguing that now is the perfect time to invest in junior developers, because AI allows them to learn and skill up faster https://tidyfirst.substack.com/p/the-bet-on-juniors-just-got-better
The Bet On Juniors Just Got Better
Why genies can make hiring juniors more profitable, & what you need to change to get there
tidyfirst.substack.com
December 12, 2025 at 4:12 PM
On the same day that I started using Bun for the Advent of Code 2025 exercises, they announced that they've been acquired by Anthropic. I'll still keep using Bun for now because it's super handy having TypeScript and testing tools out of the box, but I am going to keep my eye on this […]
Original post on hachyderm.io
hachyderm.io
December 2, 2025 at 8:14 PM
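For anyone curious what “out of the box” means here: Bun runs TypeScript directly and ships a built-in Jest-style test runner, so a small Advent of Code helper can be tested with zero setup. A minimal sketch; the puzzle logic below is invented for illustration.

```typescript
// sum.test.ts — run with `bun test`; bun:test and TypeScript need no extra install
import { expect, test } from "bun:test";

// Trivial Advent-of-Code-style helper (illustrative only).
const sumLines = (input: string): number =>
  input
    .split("\n")
    .filter(Boolean)
    .reduce((total, line) => total + Number(line), 0);

test("sums newline-separated numbers", () => {
  expect(sumLines("1\n2\n3\n")).toBe(6);
});
```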
Reposted by Mathew Attlee
We’re still an absolute skeleton crew of 14 people, competing with teams sometimes 100x as large as ours. To get to our humble team size was only possible through the less than 1% of community members who donate to Mastodon, a handful of larger donations, & EU grants, all of which we are forever […]
Original post on mastodon.social
mastodon.social
November 18, 2025 at 8:06 AM
Reposted by Mathew Attlee
Slow Horses screen grab, gotta love it.

#xml #slowhorses
November 21, 2025 at 10:15 PM
Reposted by Mathew Attlee
Sales of AI-enabled teddy bear suspended after it gave advice on BDSM sex and where to find knives https://edition.cnn.com/2025/11/19/tech/folotoy-kumma-ai-bear-scli-intl
November 22, 2025 at 11:20 AM
Reposted by Mathew Attlee
Important #indieweb lesson in #modular website setup this morning:

Keep your DNS provider separate from your CDN separate from your webhost, so you can swap out any one of them as necessary, whether for economic or as it were today, reliability reasons. And make sure those services themselves […]
Original post on tantek.com
tantek.com
November 18, 2025 at 3:48 PM
Reposted by Mathew Attlee
I’ve been testing a theory: many people who are high on #ai and #LLMs are just new to automation and don’t realize you can automate processes with simple programming, if/then conditions, and API calls with zero AI involved.

So far it’s been working!

Whenever I’ve been asked to make an AI flow […]
Original post on hachyderm.io
hachyderm.io
November 14, 2025 at 3:29 PM
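As a concrete illustration of the point in the post above (automation with plain conditionals and API calls, no model involved), here is a minimal TypeScript sketch. The endpoints, payload shape, and threshold are hypothetical placeholders, not anything from the original post.

```typescript
// Plain automation: fetch a metric, apply a rule, notify. No AI anywhere.
// STATUS_URL and ALERT_WEBHOOK are made-up placeholders.
const STATUS_URL = "https://example.com/api/queue-depth";
const ALERT_WEBHOOK = "https://example.com/hooks/alerts";

async function checkQueueAndAlert(): Promise<void> {
  const res = await fetch(STATUS_URL);
  if (!res.ok) throw new Error(`status check failed: ${res.status}`);

  const { depth } = (await res.json()) as { depth: number };

  // The entire "decision engine": a threshold and an if-statement.
  if (depth > 100) {
    await fetch(ALERT_WEBHOOK, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text: `Queue depth is ${depth}, over the limit of 100` }),
    });
  }
}

checkQueueAndAlert().catch(console.error);
```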
Reposted by Mathew Attlee
Since this question shows up so often that it qualifies as an FAQ, here's my definitive answer to "What happens if AI labs train for pelicans riding bicycles?" https://simonwillison.net/2025/Nov/13/training-for-pelicans-riding-bicycles/
What happens if AI labs train for pelicans riding bicycles?
Almost every time I share a new example of an SVG of a pelican riding a bicycle a variant of this question pops up: how do you know the labs …
simonwillison.net
November 13, 2025 at 4:06 PM
Reposted by Mathew Attlee
A twist on @simon getting LLMs to draw pelicans on bicycles: Getting it to do it using a raytracer:

https://blog.nawaz.org/posts/2025/Oct/pelican-on-a-bike-raytracer-edition/

#llm #raytracing #povray
Pelican on a Bike - Raytracer Edition
blog.nawaz.org
October 25, 2025 at 7:36 PM
Reposted by Mathew Attlee
Go fuck yourselves, NYT.
November 6, 2025 at 12:06 PM
Reposted by Mathew Attlee
2001 was obviously a tough year, but there were bright spots in internet innovation: Wikipedia launched in January 2001, Wayback Machine in October, and iTunes and iPod both came out that year. Not to mention bloggers redesigning in Movable Type and other increasingly cool blog tools. I hope you […]
Original post on mastodon.social
mastodon.social
November 5, 2025 at 2:24 PM
Today I wondered if I could use generative AI to generate some test fixtures, and it turns out there is a paper written on this very topic. https://arxiv.org/abs/2401.17626 #genai #softwaretesting
Generative AI to Generate Test Data Generators
Generating fake data is an essential dimension of modern software testing, as demonstrated by the number and significance of data faking libraries. Yet, developers of faking libraries cannot keep up with the wide range of data to be generated for different natural languages and domains. In this paper, we assess the ability of generative AI for generating test data in different domains. We design three types of prompts for Large Language Models (LLMs), which perform test data generation tasks at different levels of integrability: 1) raw test data generation, 2) synthesizing programs in a specific language that generate useful test data, and 3) producing programs that use state-of-the-art faker libraries. We evaluate our approach by prompting LLMs to generate test data for 11 domains. The results show that LLMs can successfully generate realistic test data generators in a wide range of domains at all three levels of integrability.
arxiv.org
November 5, 2025 at 4:44 PM
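To give a flavour of the paper's third level of integrability (programs that use existing faker libraries), this is roughly the kind of generator an LLM might be prompted to produce. A hypothetical sketch using @faker-js/faker; the fixture shape is invented, not taken from the paper.

```typescript
import { faker } from "@faker-js/faker";

// Illustrative fixture shape — not from the paper.
interface UserFixture {
  id: string;
  name: string;
  email: string;
  signedUpAt: Date;
}

// A small, library-backed generator of the sort the paper's level-3 prompts target.
function makeUser(overrides: Partial<UserFixture> = {}): UserFixture {
  return {
    id: faker.string.uuid(),
    name: faker.person.fullName(),
    email: faker.internet.email(),
    signedUpAt: faker.date.past(),
    ...overrides,
  };
}

const fixtures = Array.from({ length: 5 }, () => makeUser());
console.log(fixtures);
```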
A great analysis of the DX AI-assisted engineering 2025 impact report. A key point is that “existing bottlenecks dwarf AI time savings” https://blog.robbowley.net/2025/11/05/findings-from-dxs-2025-report-ai-wont-save-you-from-your-engineering-culture/
Findings from DX’s 2025 report: AI won’t save you from your engineering culture
The DX AI-assisted engineering: Q4 (2025) impact report offers one of the most substantial empirical views yet of how AI coding assistants are affecting software development, and largely corroborates the key findings from the 2025 DORA State of AI-assisted Software Development Report: quality outcomes vary dramatically based on existing engineering practices, and both the biggest limitation and the biggest benefit come from adopting modern software engineering best practices – which remain rare even in 2025. AI accelerates whatever culture you already have.

## Who are DX and why the report matters

DX is probably the leading and most respected developer intelligence platform. They sell productivity measurement tools to engineering organisations including Dropbox, Block, Pinterest, and BNY Mellon. They combine telemetry from development tools with periodic developer surveys to help engineering leaders track and improve productivity. This creates potential bias – DX’s business depends on organisations believing productivity can be measured. But it also means they have access to data most researchers don’t.

### Data collection

The report examines data collected between July and October 2025. Drawing on data from 135,000 developers across 435 companies, the data set is substantially larger than most productivity research, and the methodology is transparent. It combines:

* **System telemetry** from AI coding assistants (GitHub Copilot, Cursor, Claude Code), Git (PR throughput, time to 10th PR), and quality metrics (Change Failure Rate).
* **Self-reported surveys** asking about time savings, AI-authored code percentage, maintainability perception, and enablement quality.

## Key Findings

### Quality impact varies dramatically

The report tracks Change Failure Rate (CFR) – the percentage of changes causing production issues. Results split sharply: some organisations see CFR improvements, others see degradation. The report calls this “varied,” but I’d argue it’s the most important signal in the entire dataset. What differentiates organisations seeing improvement from those seeing degradation? The report doesn’t fully unpack this.

### Existing bottlenecks dwarf AI time savings

This should be the headline: **meetings, interruptions, review delays, and CI wait times cost developers more time than AI saves.** Meeting-heavy days are reported as the single biggest obstacle to productivity, followed by interruption frequency (context switching). Individual task-level gains from AI are being swamped by organisational dysfunction. This corroborates the 2025 DORA State of AI-assisted Software Development Report finding that systemic constraints limit AI impact. You can save 4 hours writing code faster, but if you lose 6 hours to slow builds, context switching, and poorly run meetings, the net effect is negative.

### Modest time savings claimed, but they seem to have hit a wall

Developers _report_ saving 3.6 hours per week on average, with daily users reporting 4.1 hours. But this is self-reported, not measured (see limitations). More interesting: **time savings have plateaued around 4 hours even as adoption climbed from ~50% to 91%**. The report initially presents this as a puzzle, but the data actually explains it. The biggest finding, buried on page 20, is – as above – that **non-AI bottlenecks dwarf AI gains**.

### Throughput gains measured, but problematic

Daily AI users merge 60% more PRs per week than non-users (2.3 vs 1.4). That’s a measurable difference in activity. Whether it represents productivity is another matter entirely. (More on this in the limitations section.)

### Traditional enterprises show higher adoption

Non-tech companies in regulated industries show higher adoption rates than big tech. The report attributes this to deliberate, structured rollouts with strong governance. There’s likely a more pragmatic explanation: traditional enterprises are aggressively rolling out AI tools in hopes of compensating for weak underlying engineering practices. The question is whether this works. If the goal is to shortcut or leapfrog organisational dysfunction without fixing the root causes, the quality degradation data suggests it won’t. AI can’t substitute for modern engineering practices; it can only accelerate whatever practices already exist.

### Other findings

* **Adoption is near-universal:** 91% of developers now use AI coding assistants, matching DORA’s 2025 findings. The report also reveals significant “shadow AI” usage: developers using tools they pay for themselves, even when their organisation provides approved alternatives.
* **Onboarding acceleration:** Time to 10th PR dropped from 91 days to 49 days for daily AI users. The report cites Microsoft research showing early output patterns predict long-term performance.
* **Junior devs use AI most, senior devs save most time:** Junior developers have the highest adoption, but Staff+ engineers report the biggest time savings (4.4 hours/week). Staff+ engineers also have the _lowest_ adoption rates. Why aren’t senior engineers adopting as readily? Scepticism about quality? Lack of compelling use cases for complex architectural work?

## Limitations and Flaws

### Pull requests as a productivity metric

The report treats “60% more PRs merged” as evidence of productivity gains. This is where I need to call out a significant problem – and interestingly, DX themselves have previously written about why this is flawed. PRs are a poor productivity metric because:

* **They measure motion, not progress.** Counting PRs shows how many code changes occurred, not whether they improved product quality, reliability, or customer value.
* **They’re highly workflow-dependent.** Some teams merge once per feature, others many times daily. Comparing PR counts between teams or over time is meaningless unless workflows are identical.
* **They’re easily gamed and inflated.** Developers (or AI) can create more, smaller, or trivial PRs without increasing real output. “More PRs” often just means more noise.
* **They’re actively misleading in mature Continuous Delivery environments.** Teams practising trunk-based development integrate continuously with few or no PRs. Low PR counts in that model actually indicate _higher_ productivity.

### Self-reported time savings can’t be trusted

The “3.6 hours saved per week” is self-reported, not measured. People overestimate time savings. As an example, the METR study _Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity_ found developers predicted a 24% speedup from AI but were actually 19% slower.

### Quality findings under-explored

The varied CFR results are the most important finding, but they’re presented briefly and then the report moves on. What differentiates organisations seeing improvement from those seeing degradation? Code review practices? Testing infrastructure? Team maturity? The enablement data hints at answers but doesn’t fully investigate. This is a missed opportunity to identify the practices that make AI a quality accelerator rather than a debt accelerator.

### Missing DORA Metrics

The report covers Lead Time (poorly, approximated via PR throughput) and Change Failure Rate. But it doesn’t measure deployment frequency or Mean Time to Recovery. That means we’re missing the end-to-end delivery picture. We know code is written and merged faster, but we don’t know if it’s deployed faster or if failures are resolved more quickly. Without deployment frequency and MTTR, we can’t assess full delivery-cycle productivity.

## Conclusion

This is one of the better empirical datasets on AI’s impact, corroborating DORA 2025’s key findings. But the real story isn’t in the headline numbers about time saved or PRs merged. It’s in two findings:

### Non-AI bottlenecks still dominate

Meetings, interruptions, review delays, and slow CI pipelines cost more than AI saves. Individual productivity tools can’t fix organisational dysfunction. As with DORA’s findings, the biggest limitation and the biggest opportunity both come from adopting modern engineering practices: small batch sizes, trunk-based development, automated testing, fast feedback loops. AI makes their presence more valuable and their absence more costly.

### AI is an accelerant, not a fix

It reveals and amplifies existing engineering culture. Strong quality practices get faster. Weak practices accumulate debt faster. The variation in CFR outcomes isn’t noise – it’s the signal. The organisations seeing genuine gains are those already practising modern software engineering. Those practices remain rare.

My advice for engineering leaders:

1. **Tackle system-level friction first.** Four hours saved writing code doesn’t matter if you lose six to meetings, context switching, and poor CI infrastructure and tooling.
2. **Adopt modern engineering practices.** The gains from adopting a continuous delivery approach dwarf what AI alone can deliver.
3. **Don’t expect AI to fix broken processes.** If review is shallow, testing is weak, or deployment is slow, AI amplifies those problems.
4. **Invest in structured enablement.** The correlation between training quality and outcomes is strong.
5. **Track throughput properly alongside quality.** More PRs merged isn’t a win if it isn’t actually resulting in shipping faster and your CFR goes up. Measure end-to-end cycle times, CFR, MTTR, and maintainability.
blog.robbowley.net
November 5, 2025 at 11:42 AM
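To make two of the metrics discussed in the piece above concrete: Change Failure Rate is failed changes divided by total changes, and MTTR is the average time from failure to recovery. A minimal TypeScript sketch with invented record shapes (they are not from the DX report):

```typescript
// Hypothetical shapes for change and incident records.
interface Change {
  id: string;
  causedFailure: boolean;
}

interface Incident {
  startedAt: Date;
  resolvedAt: Date;
}

// Change Failure Rate: share of changes that caused a production issue.
function changeFailureRate(changes: Change[]): number {
  if (changes.length === 0) return 0;
  const failures = changes.filter((c) => c.causedFailure).length;
  return failures / changes.length;
}

// Mean Time to Recovery: average hours from failure start to resolution.
function meanTimeToRecoveryHours(incidents: Incident[]): number {
  if (incidents.length === 0) return 0;
  const totalMs = incidents.reduce(
    (sum, i) => sum + (i.resolvedAt.getTime() - i.startedAt.getTime()),
    0,
  );
  return totalMs / incidents.length / (1000 * 60 * 60);
}
```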
This is a very handy technique for when you’re making local changes for testing, but you don’t want to commit them https://www.brandonpugh.com/til/git/skip-worktree-ignore-modified-files/
Use skip-worktree to ignore modified files
Today I learned about the `--skip-worktree` command in git, which will treat a file like it hasn’t been modified. This is useful if you have to modify a file locally but don’t ever want to commit it (config files are a common scenario).

Like me, you may have seen `--assume-unchanged` used in this way, but that’s not what it’s meant for, since it’s “designed for cases where it is expensive to check whether a group of files have been modified”. As a result you’re likely to lose the changes you have made to those files. This post shows a good summary of the outcomes of common operations with each command.

The advantage of `--skip-worktree` is that git really tries to preserve the changes you’ve made to those files. This works pretty well if the files aren’t changed very often, but it can be pretty tedious if they change frequently, even when those changes wouldn’t have caused merge conflicts, as git will refuse to modify the files.

Say for example you need to make a change to a config file for your environment. You would run:

    git update-index --skip-worktree config/local.conf

If changes are rarely committed to this then you may not have to think about it again. However, if you need to switch to a branch with changes to this file then you’ll get an error like:

> error: Your local changes to the following files would be overwritten by checkout: path/to/file

If you run `git stash` now, that file won’t be affected. So now you need to run:

    git update-index --no-skip-worktree config/local.conf
    # now you can run stash
    git stash
    git switch other-branch
    git stash pop
    # you'll need to resolve conflicts if any, otherwise skip the file again
    git update-index --skip-worktree config/local.conf
    # you can run this to see which files have skip-worktree set
    git ls-files -v | grep '^S'

Depending on how frequently you have to deal with this, you’ll quickly end up making an alias or script for it.
www.brandonpugh.com
November 3, 2025 at 4:15 PM
Reposted by Mathew Attlee
Hi all, it's been a while 😅

Just posting here to let you know that BeaconDB has been down for nearly 20 hours due to what appears to be a fraudulent abuse claim.

Someone has contacted netcup with a few MB of logs that show the IP of BeaconDB's server being used to scan ports. netcup has […]
Original post on mapstodon.space
mapstodon.space
October 29, 2025 at 8:30 AM
Any recommendations for Vim or Neovim plugins when working with an OpenAPI schema? #vim #neovim #openapi
October 28, 2025 at 10:10 AM
Reposted by Mathew Attlee
«tl;dr: #futo is not being honest about their “grant program”, they don’t have permission to pass off these logos or project names as endorsements, and they collaborate with and promote mask-off, self-proclaimed fascists.»

oh no, another group centered around 1-2 rich people suck! […]
Original post on scholar.social
scholar.social
October 22, 2025 at 1:41 PM