The Art && Science of Ruby
rahoulb.theartandscienceofruby.com.ap.brid.gy
Ruby - ??? - Profit

🌉 bridged from https://theartandscienceofruby.com/ on the fediverse by https://fed.brid.gy/
Do you need a driving licence?
The other day, I asked Cher, my OpenClaw instance, if it could read my email and notify me if something important came in. It said it would be easy; then I mentioned I used ProtonMail (which is end-to-end encrypted and, as a result, does not use standard protocols). Cher paused, did a search, then found the Proton Mail Bridge - a local SMTP/IMAP server that connects to Proton Mail and then makes it available to the local machine (but nowhere else). I said "of course, I already use that on my Mac" - but Cher was running on Linux.

So I got Cher to install the bridge and was about to give it the connection parameters, when I was suddenly struck by a thought. "Isn't this a massive security risk? Am I opening myself up to prompt injection attacks?"

"You are," Cher confidently replied.

Oof.

So I asked it "How about this? We have a sub-agent that is sandboxed - it can read the IMAP feed and write to a single folder only - when it wakes up, it checks the feed and writes a summary of the important emails into the folder. Then another agent wakes up, reads the file and acts on it - so we're adding a layer of separation". Cher replied "it's not infallible but it's a much better way of organising things - shall I set that up for you?". I said yes - and we called this pair of sub-agents Charles and Eddie (would they lie to you?)

But there's a very important lesson there - especially with OpenClaw, which has access to almost everything on the machine it's running on. What I asked for is a pretty reasonable request - look at my emails and alert me to the important ones. And Cher was all set to do exactly what I asked. But _because I'm a software developer_ I stopped myself, thought about the security risk of what I was doing (before Cher actually wired everything up) and put a simple (but still relatively unsafe) barrier in place to protect me and my data.

In other words, these tools are incredibly powerful and also incredibly dangerous. Just like my car (Alfa Romeo Giulia Veloce, if you're interested). Cars may be dangerous, but we don't just let anyone drive one. Even with a driving licence, they're still dangerous and cause injuries and deaths every day. Yet AI tools are just as dangerous and they're about to be unleashed on everyone's data, all around the world.

Maybe we need a driver's licence for these too?
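For illustration, the "Charles" half of that pair could be as small as this - a sandboxed script that only reads the bridge's local IMAP feed and only writes to a single folder. This is a sketch of the idea, not what Cher actually set up; the port, folder and credential handling are assumptions you would adjust for your own bridge.

```ruby
# Sketch of "Charles": read-only access to the local Proton Mail Bridge,
# write access to one summaries folder - and nothing else.
require "net/imap"
require "json"

SUMMARY_DIR = File.expand_path("~/email-summaries") # the only place Charles may write

imap = Net::IMAP.new("127.0.0.1", port: 1143, ssl: false) # the bridge's local IMAP port (check yours)
imap.login(ENV.fetch("BRIDGE_USER"), ENV.fetch("BRIDGE_PASS"))
imap.select("INBOX")

summaries = imap.search(["UNSEEN"]).map do |id|
  envelope = imap.fetch(id, "ENVELOPE").first.attr["ENVELOPE"]
  sender = envelope.from.first
  { from: "#{sender.mailbox}@#{sender.host}", subject: envelope.subject }
end

File.write(File.join(SUMMARY_DIR, "#{Time.now.to_i}.json"), JSON.pretty_generate(summaries))
imap.logout
```

Eddie then reads that folder on its own schedule and decides what to tell me. It's not infallible - a malicious email can still poison a summary - but the agent that touches untrusted input has far fewer privileges than the one that can act.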
theartandscienceofruby.com
February 11, 2026 at 6:11 PM
Vibes and Engineering
<h2 id="the-jobs-crisis">The Jobs Crisis</h2> <p>My job, as I have known it for the past twenty-five years, is no more.</p> <p>As someone who's only ever worked at small companies <sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup> or on my own, I probably had to do much more than software developers at large corporate places.</p> <p>Most of my work life, I've been a solo freelancer, which meant I would spend my time:</p> <ul> <li>marketing (which for me was mainly in-person networking - the absolute best way to get work)</li> <li>pre-sales (talking to prospects about what they wanted, then figuring out what it would entail to build it and coming up with a proposal)</li> <li>sales (delivering the proposal, trying to get the contract signed)</li> <li>specifications (taking what the client had asked for and turning it into something concrete)</li> <li>coding (taking the specifications and turning them into working code)</li> <li>operations (taking the working code and deploying it to servers, which then need to be maintained and kept secure)</li> <li>support (fielding calls and emails from people who got stuck, didn't know how to use the software or found bugs)</li> <li>feedback (dealing with, scheduling, specifying, writing and deploying change requests)</li> </ul> <p>When I started working for <a href="https://www.collabor8online.co.uk/">Collabor8Online</a>, that took the marketing, sales and some of the support tasks out of the equation<sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup> - but we are a small company so everyone gets involved in everything.</p> <p>That leaves specifications, coding, operations and feedback. And now, these LLM coding agents are coming for the coding part of the job.</p> <p>This is causing great angst amongst many software people.</p> <h2 id="vibes">Vibes</h2> <p>When I was a teenager, in the late 1900s, I was playing around with "home computers", which became a big deal in the 1980s. I didn't have many computer games, so I tried to learn to programme so I could write my own. In those days, that meant BASIC (we had a Commodore-64) and learning meant books or magazines<sup class="footnote-ref"><a href="#fn3" id="fnref3">[3]</a></sup>. As the 80s progressed, my friends got Amigas and Atari STs (great computers), whilst my dad got given a PC for work (not very good) - so when I went to my friend's, we would fire up AMOS or STOS (versions of Basic that were designed for creating games) and try and create the next big thing.</p> <p>BASIC was OK<sup class="footnote-ref"><a href="#fn4" id="fnref4">[4]</a></sup> but I never really <em>got</em> it. I could make the computer do things, but it was a struggle and I never felt like I knew what I was doing - the code never sat right in my head. One friend, Ben, who had the ST, started getting frustrated with the limitations of STOS and started learning 68K assembler<sup class="footnote-ref"><a href="#fn5" id="fnref5">[5]</a></sup>. Meanwhile, one of my magazines had an article about <a href="https://en.wikipedia.org/wiki/Smalltalk">Smalltalk</a>.</p> <p>This blew me away - suddenly, programming made sense. It wasn't about data structures, it wasn't about algorithms, it was about <em>objects sending each other messages</em>. This was something I could easily visualise, it was something I could easily model. 
I <em>had</em> to learn Object-Orientated Programming.</p> <p>On top of that, Smalltalk had revolutionary ideas like images, byte-code and <em>garbage collection</em> - as well as an in-built core library with Collections and other useful classes (most programming languages were literally that - just the language and you had to deal with everything else yourself).</p> <p>I felt that, not only would I be able to write code that I understood, but lots of the minutiae of coding - like keeping track of your memory allocations - would just go away. I was the exact opposite of Ben - he wanted to dive deeper into the machine so he could exert control over what it did. I wanted the machine to handle all the boring plumbing so I could get on with building stuff<sup class="footnote-ref"><a href="#fn6" id="fnref6">[6]</a></sup>.</p> <p>All of this is a long-winded way of saying, I was not an <em>engineer</em> - I was in it for the <em>vibes</em>.</p> <h2 id="llms-and-coding-agents">LLMs and Coding Agents</h2> <p>As of last year (2025), <em><a href="https://karpathy.ai/vibe-coding">vibe-coding</a></em> became a thing. When I first used Claude Code (exactly a year ago, in February), I was really impressed. Suddenly, here was an AI that could actually do stuff - not just talk to you and sometimes give you made-up answers. And, as a rubyist, who has invested heavily in test-first development, it was perfect. I could write the tests, the specifications, and Claude Code could make them pass.</p> <p>But I didn't use Claude Code <em>that</em> much.</p> <p>Mainly because it wasn't like test-first development. That was interactive, taking baby-steps, adding a new clause here, implementing it, refactoring - exploring the problem in front of you. Whereas writing whole tests up-front - that's <a href="https://en.wikipedia.org/wiki/Waterfall_model">waterfall</a> on a small scale.</p> <p>But, recently, especially since the release of Claude Opus 4.5, the coding agents have got a lot better. And people have been learning how to use these tools effectively.</p> <hr /> <p>This is an important point.</p> <p>A lot of developers, who are anti-AI, have basically given the LLM a minimal set of prompts and then been disappointed with the results.</p> <p>Whereas, I've been using Claude Code for at least a few hours every week for the last year, trying different things and experimenting with it.</p> <p>These are complex, powerful and often unpredictable tools. You need to learn how to use it effectively - a couple of hours of mucking around is not going to get you decent results.</p> <hr /> <p>There's been stuff about <a href="https://agentskills.io/home">commands and skills</a>, about <a href="https://speckit.org/">spec-kits</a>, the <a href="https://ghuntley.com/loop/">Ralph Wiggum Loop</a>, about orchestrating swarms of agents in <a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04">Gas Town</a>. Lots of people have been trying lots of things to make these things more effective. I've come up with a method that is working for me extremely well (and I'll type it up soon).</p> <p>But the key thing is, the physical act of sitting at a keyboard and typing code. Then running that code and seeing if it works (either through compilation or tests) and assembling all your pieces of code into something that meets the specification. That is no longer part of the job of being a software developer. 
The agents need guidance from us, but, for the most part, are better at it than us humans.</p> <p>So the "vibes" part of my job is still alive - talking to humans, dealing with feedback, making it work the way people expect. It's the engineering part that has changed.</p> <p>But that <em>doesn't</em> mean software engineering is dead.</p> <p>The difference between "craft" and "engineering" is craft is about making the thing, while engineering is about making sure it meets the constraints around it. There are economic constraints, acceptable tolerances to errors and breakages, safety requirements, social contracts. Those have always been the most important parts of engineering. And now the actual "making" is out of our hands, it's the part we need to concentrate on.</p> <hr /> <hr class="footnotes-sep" /> <section class="footnotes"> <ol class="footnotes-list"> <li id="fn1" class="footnote-item"><p>The biggest had about 100 employees, but the majority were based in India working on a different project, so I never interacted with them. I worked with about eight others. <a href="#fnref1" class="footnote-backref">↩︎</a></p> </li> <li id="fn2" class="footnote-item"><p>I still get involved in pre-sales depending upon the potential customer's requirements. <a href="#fnref2" class="footnote-backref">↩︎</a></p> </li> <li id="fn3" class="footnote-item"><p>Amazingly, some magazines had pages and pages of code listed in them - you would type it all in by hand and, if you had made no errors, you would have a working game (that you could then save to tape to play it again later) <a href="#fnref3" class="footnote-backref">↩︎</a></p> </li> <li id="fn4" class="footnote-item"><p>And Commodore 64 BASIC was much more limited than many of its contemporaries, so I had to learn about memory registers and how the display adapter worked if I wanted to get the machine to do anything fancy. <a href="#fnref4" class="footnote-backref">↩︎</a></p> </li> <li id="fn5" class="footnote-item"><p>Both the Amiga and ST used a Motorola 68000 chip. The Amiga was better than the ST because it used co-processors for graphics and sound (a dedicated GPU and SPU - is that a thing?). The ST had a built in MIDI interface, so it became the thing for musicians, like we wanted to be. <a href="#fnref5" class="footnote-backref">↩︎</a></p> </li> <li id="fn6" class="footnote-item"><p>Ironically, I never used Smalltalk professionally (Ruby is inspired by and very very similar to it though). Ben actually ended up working at a bank where their entire system was written in Smalltalk - he wanted low-level but got higher-level than I ever did. <a href="#fnref6" class="footnote-backref">↩︎</a></p> </li> </ol> </section>
theartandscienceofruby.com
February 5, 2026 at 4:25 PM
The future of software
Now, when it comes to technology, I'm actually pretty conservative.

There have only really been two key moments where I've got excited about technology.

* [Seeing a Mac for the first time](https://theartandscienceofruby.com/why-i-love-apple/)
* Writing applications in Ruby on Rails

All the rest were broken promises.

But I think I've now got a third moment. I have seen the future of software.

At the risk of sounding like a Youtuber: "**this changes everything**".

## Meet Cher

![cher.png](https://theartandscienceofruby.com/content/images/2026/01/cher.png)

This is Cher Horowitz. She/it is my installation of ~~Clawdbot~~ ~~Moltbot~~ [OpenClaw](https://openclaw.ai/blog/introducing-openclaw) on my old 2015 iMac running ElementaryOS. That machine was sat there as an emergency spare in case I needed to SSH in from somewhere on my iPad - now it's actually doing something useful. Not just useful - really, _really_ useful.

For those who haven't heard the hype, OpenClaw is an AI Assistant. Yes, another one. But there are a couple of differences about this one that lead to what, I think, is going to be the defining factor of software in the future.

Firstly, I can communicate how I want with Cher. It has access to a few channels on our work Slack (I have to manually approve each person or channel it talks to), but I've also set up a WhatsApp channel for my phone. We have Anthropic, OpenAI and ElevenLabs API tokens and accounts already set up, so I gave it access to those. Which means that if I send Cher a voice note it responds with a [voice note](https://elevenlabs.io/app/voice-library?voiceId=gE0owC0H9C8SzfDyIUtB) too.

Secondly, Cher is installed on my own machine. This means that it can do things that ChatGPT or Claude cannot - it automatically has access to any files and folders on that box. Obviously this has severe security risks (there are some steps you can take to reduce the "blast radius", but if this blows up, it _really_ blows up). And because it's a persistent service with its own CPU and storage, it can also do things in the background - unlike Claude Code - it has a "heartbeat" file where it wakes up and checks on stuff, plus it can set up its own cron jobs.

And it's this second capability that allows Cher to be revolutionary.

## Creating the claw

There's an [excellent interview with Peter Steinberger](https://podcasts.apple.com/gb/podcast/the-pragmatic-engineer/id1769051199?i=1000747059706), the creator of OpenClaw. It establishes that he does, in fact, know what he's doing when it comes to software development (he wrote PSPDFKit). And then he explains how he burnt out, didn't switch on a computer for years and, when he did, it was just after the beta of Claude Code was released. And that's how he wrote Clawdbot (although he says OpenAI's Codex is more capable now).

The final 45 minutes of the podcast are about his process. And how he doesn't really care about the code that gets written, as long as it's got tests (written by the LLM) that prove it does what he wants. All he cares about is how it _feels_ to use it (and I love that he used the word "feel" - I've got a draft post that's been sat awaiting completion for ages about emotions and vibes).

So he starts by "chatting" to the AI - "give me a few ideas on how we could incorporate this feature into the codebase". In fact, he says the LLMs like to use the word "weave", so he's started using it too - "how can we weave this into the codebase". They have a "discussion" and he defines the feature's "end state". Many apps (such as native iOS apps) are difficult to test - so he gets the LLM to define a CLI. And then he can specify what the CLI should output given a particular input.

In other words, it's test-driven development but _he's not writing the tests_.

The LLM writes the tests (red), writes the code (green), refactors. Then he tries it out and feeds back on the user experience.

## Changing the game

None of this screams "the future of software" though.

The thing that's amazing about OpenClaw, and therefore Cher, is that it is _self-modifying_.

The software is anything you want it to be.

OpenClaw has a number of "channels". I installed the WhatsApp channel myself, by running the CLI tool and looking at the changes in the configuration JSON file. But when it came to adding the Slack channel, I asked Cher to do it for me. Cher checked the Clawdbot documentation, figured out the changes it needed to make and updated its own configuration file, restarting the gateway so it reloaded. Then it gave me instructions on what to do next to ensure it was set up securely.

I asked it to look over some of my code and help me out with a few tasks. It did well - as it was running Opus 4.5, which is the same model I use in Claude Code. But I had Cher set to use Opus 4.5 all the time and I soon discovered, after about three days, that I had used my whole $20/month allowance. I extended it, switched the default model to Sonnet and asked Cher if it was possible to run any local models on this ageing iMac. It suggested installing Ollama with Mistral 7B, saying "it won't run in the GPU so it will be slow, but we can test it and see if it's any good". Ollama reported 5-7 tokens per second, so Cher said "that's too slow for conversations - but I do a lot of background tasks - periodic heartbeat checking and so on where speed isn't an issue, so let's use Ollama for that - it's free!".

I designed a "team" of sub-agents, from "Bishop", who runs Opus 4.5 and is used for advanced coding tasks and detailed planning, down to "Hicks", who runs GPT5.2-mini and is used for monitoring log files and managing simple commands. Cher is given a task, decides which level of expertise it needs and assigns it accordingly.

And then I tried installing Ollama with Qwen3-Coder-30B on my M4 Pro MBP. Ollama reported 70 tokens per second. I told Cher and it immediately wrote a shell script for testing whether my MacBook Pro is switched on, with Ollama accessible over my Tailscale network. If it is, then Cher passes a lot of coding tasks to Qwen3 (it's 70 tok/s and it's free!), otherwise it passes the task to Bishop or Ripley (faster and more capable, but I have to pay Anthropic or OpenAI).

![qwen3.png](https://theartandscienceofruby.com/content/images/2026/01/qwen3.png)

Notice that Cher _wrote a script to do this_ and chooses _when_ it needs to use that script.

Likewise, we use Linear for issue tracking. I asked if Cher could connect to MCP servers and it replied no. I told it about Linear and it immediately suggested writing a script that calls the Linear API to fetch data from it.

In fact, almost any time that I ask Cher something that it cannot do, it does a quick web search, figures out how it might be possible, then asks if it should write some code to enhance its own capabilities.

I've never seen a piece of software that can grow and shape itself to match its user's needs and wants in this way. Peter Steinberger gave the example of how it was running on his computer in the office while he was on holiday. He had told it that he needed to wake up early and, when he didn't message it at 6am, it connected to his MacBook Pro (which was in his hotel room) and started playing music, gradually increasing the volume until he woke up and asked it to stop.

This is software that listens to what you're telling it, figures out a way of doing it and then updates and modifies itself so that it can comply. I'm sure there will be horrendous security failures and terrible stuff will happen as a result of it. We're in entirely new territory.

Because this is something **the likes of which we've never seen before**. Both amazing and utterly terrifying.
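As a rough illustration of that routing decision, the logic boils down to something like this. This is my sketch, not Cher's actual script - the Tailscale hostname, the model tags and the `task` helpers are assumptions.

```ruby
# Sketch of the routing idea: prefer the free local model when the MacBook Pro is awake
# and Ollama is reachable over Tailscale, otherwise fall back to a paid hosted model.
require "net/http"

OLLAMA = URI("http://macbook-pro.tailnet.example:11434/api/tags") # hypothetical Tailscale hostname

def local_model_available?
  response = Net::HTTP.start(OLLAMA.host, OLLAMA.port, open_timeout: 2, read_timeout: 2) do |http|
    http.get(OLLAMA.path)
  end
  response.is_a?(Net::HTTPSuccess)
rescue StandardError
  false # machine asleep, off the tailnet, or Ollama not running
end

def model_for(task)
  return "qwen3-coder:30b" if task.coding? && local_model_available? # free, and fast enough
  task.complex? ? "claude-opus-4-5" : "claude-sonnet-4-5"            # "Bishop" vs the cheaper default
end
```

The interesting part isn't the script - it's that Cher wrote its own version of it and decides when to run it.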
theartandscienceofruby.com
January 30, 2026 at 10:42 AM
Things that happened - January 2026, Week Two
(Published a few days after week two.)

I've been reading the Murderbot books and I've been pretty hooked. My wife started reading them, then the series came on Apple TV and I watched it - but I've turned to the books as they are (unsurprisingly) much, much better. They were written a decade ago and what is described is a pretty accurate description of a system integrating multiple data streams feeding into multiple LLM agents, each with the ability to start multiple sub-agents and write and deploy code. Which sounds pretty familiar to me, in the present.

Anyway, I started designing a "HubSystem". This is basically a directory of users (both human and bot), a dynamic collection of communication channels and a whole number of bots - LLM agents that run autonomously, receiving input and posting output to any of the channels they are subscribed to. Then I read about Gas Town and thought it sounded conceptually similar too (although aimed purely at coding, whereas HubSystem is a bit more general).

During the week I also heard about Charm for Ruby. Charm is a collection of libraries, written in Go, based upon the Elm framework. But Marco Roth has written Ruby bindings for it - meaning that amazing-looking, text-based, interactive terminal applications are now easy to build. I'm very excited about this - thanks Marco.

Then I discovered Checkend - a self-hosted error reporting/tracking application. Which is a great thing to have full control over (instead of sending your most vulnerable data over to some third party). Couple that with RailsPulse and that's two important parts of your runtime monitoring that you can bring under your own control.

Finally, I heard about the Ralph Wiggum method for coding agents. And Anthropic released a Ralph Loop plugin, so I thought I'd give it a go. I got Claude to build me a SvelteKit application for tracking my progress using Casey Johnston's training plan. I've never done any SvelteKit before (beyond a couple of toys), which meant I would not be great at evaluating the quality of the code that Claude produced. And I really like "Outside In" development, with Gherkin stories, nowadays. So I wrote the stories, then got Claude to produce the JavaScript steps files and make them pass (using a TDD approach for the rest of the code). This means that, as long as the feature specs pass (driving a real browser via Playwright), I can refactor the rest of the code (or get Claude to do it) without fear. I just made sure that I evaluated the steps files it produced, to ensure they were actually doing what the feature required. This means I get the benefits of vibe coding (quickly building an application in an environment I don't know), but I can apply some software engineering rigour to it. And after reading the steps I _did_ get Claude to change a few things about its implementation. The application isn't quite finished yet - I want to make it a fully offline PWA - but I'll let you know when it's available.

As for Anthropic's Ralph Loop plugin? I'm less than convinced. It just seemed to burn through tokens whilst blindly bashing its head against a wall (which I guess is pretty Ralph Wiggum). And Geoffrey Huntley (who came up with the original idea and name) reckons Anthropic has missed the point - the plugin does not clear the context window following each iteration, meaning it gains no benefits from the Ralph Loop. Certainly I won't be using Anthropic's plugin again in the near future.

But I _have_ known for months that managing the context window is the most important thing you can do with LLM agents, so I'll be building something similar into HubSystem. I'm thinking of saying that the bot has gone to sleep - because humans need sleep to refresh their brains too.
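The shape of the loop, as Geoffrey Huntley describes it, is roughly this - a conceptual sketch, where `run_fresh_agent` stands in for whatever CLI or API you use to start an agent process with an empty context:

```ruby
# Conceptual sketch of a Ralph loop: the same prompt, fed to a brand-new agent each
# time round, so every iteration starts with a clean context window. Progress lives
# in the files and the git history, not in the conversation.
loop do
  prompt = File.read("PROMPT.md")      # the unchanging instructions
  result = run_fresh_agent(prompt)     # hypothetical helper - new process, empty context
  break if result.include?("NOTHING LEFT TO DO")
end
```

If the context carries over between iterations, you get the head-banging without the fresh start - which, as far as I can tell, is Huntley's complaint about the plugin.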
theartandscienceofruby.com
January 19, 2026 at 10:10 PM
Things that happened - January 2026, Week One
## Claude builds a UI

I did a sketch on my iPad, showing a dashboard in both desktop and mobile layouts. I gave it to Claude, with a short description (for example "the hero section has a fixed height, so the news articles there have a vertical scroll bar in desktop layout; but the service sections have variable height so all announcements can be seen without scrolling") and asked it to generate an HTML mockup, using Tailwind classes. Claude took my crappy sketch and produced a Tailwind HTML page that matched it perfectly. I gave the HTML to the developer and said "follow this template so the scrolling and responsiveness works correctly, but rebuild it using our standard components". Which in turn means she doesn't waste her time getting the CSS right, which is something she's not so good at.

## Specification driven development

I love "outside in" development - starting with a feature specification, then working from the user-interface (the outside of the application) to the database (the inside of the application). It helps me because my starting point is describing the functionality of the application in English, so I'm not even thinking about code. Then I write the individual steps as code and start working inwards.

Simon Willison has been talking about conformance suites and how coding agents can write better code than humans, _if_ they have a specification to work to. I've already found that if I give Claude RSpec tests it sometimes writes better implementations than I would have. So the next step is to try this with a whole feature. My first go at this will be me writing the steps and saying "make this pass". Then, if it's good at this, I'll just give it the feature and say "one at a time, write a step, then make it pass".

## I went to my daughter's graduation

Not work-related but I'm very proud of her.
theartandscienceofruby.com
January 11, 2026 at 9:49 PM
Oh 25
I've never written an annual review before. Mainly because I have zero memory and everything passes by in a blur. But this year I'm going to give it a go. Mainly because I have zero memory and everything passes by in a blur.

Let's start with the basics. This year has been a lot. I turned fifty-one (the first time I've thought "oh, people will think I'm _old_"), I became a granddad ("oh, people will think I'm _old_") and I lost my dad (sort of, it would be much easier if he had died).

## Music

This one is easy. K-Pop Demon Hunters. I don't think I've listened to an album on repeat like this since I was a teenager. One evening, I had the house to myself, and I noticed a number of people on Mastodon mentioning it. So I started watching the film and wasn't convinced for the first few minutes. Then came the Saja Boys and the little shoulder dance and I was totally hooked. Even better, later that evening, my daughter came round (she has similar taste in music to me) and I played her "Golden" and "Soda Pop" - and she started the shoulder dance herself.

Apart from the animated stuff, I've been listening to a lot of Poppy, Electric Callboy, Babymetal and Bloodywood on the metal side of things. And Jade Thirlwall, Sabrina Carpenter, Lisa, Sophie Powers and Lady Gaga on the pop side.

## Gigs

Best Gig: Babymetal (with Bambie Thug and Poppy) - this wasn't a gig, it was a show. And the O2 is a great venue.

Also good:

* Alt Blk Era - so good I actually moshed for the first time in over 30 years
* Scene Queen - very funny, especially as she tried to explain sororities to a load of emo, non-binary Brits
* Slipknot - doing the 25th anniversary tour; like Babymetal, an amazing show

Biggest Disappointment: Electric Callboy - I think Alexandra Palace is a crappy venue.

## Family

As I mentioned, I became a granddad in May. I've heard a few people say it's not like having kids, and for me, it really isn't. It's pure joy with (almost) none of the terror. He's also the happiest little baby I've ever met and his mum is doing an amazing job.

I also lost my dad - mentally if not physically. I was having lunch with Jeremy from Brightbox when I got a call from my mum. Dad had collapsed and wasn't moving. He had had a massive stroke and has lost movement in one side, is unable to speak and unable to keep his attention on anything for more than a few seconds. My mum is now living alone for the first time in her life and my dad is in a home with round-the-clock nursing care. My dad didn't get to meet his great-grandson until yesterday (he enjoyed it, but I'm not sure he knew who the baby or its mother were).

## Films

One word: Sinners. It's like "From Dusk till Dawn" (which I love) but also about the blues and racism. I also love the fact that the vampires (represented by an Irishman) have music too, but the KKK-types do not. Finally, it's great to see Buddy Guy appear in the film too - my dad took me to see him and Eric Clapton when I was younger and they were both amazing.

I also liked Better Man (but then I love Robbie and almost everything he touches), Predator: Killer of Killers and Wake Up Dead Man, as well as the aforementioned K-Pop Demon Hunters. I didn't see One Battle After Another, which a lot of people seemed to like.

## TV

Andor. There was a lot I've liked this year - it's been pretty good all round - but Andor was perfect for me. Especially episode 10, where Kleya has to deal with Luthen.

Pluribus was good (it's nice to have a post-apocalyptic show where the message isn't "who are the real monsters?" - although Carol is a bit of a monster). Slow Horses is still great, Dept Q was a good copy, The Diplomat is still ridiculous (and fun) and Alien Earth was good because it was an Alien thing that wasn't awful (Aliens is one of my all-time favourite films and everything else disappoints). Ignoring the subject matter, I love a single-shot tracking scene, so I was mesmerised by Adolescence (although the hand-wringing reaction was a bit much - where have people been for the last ten years?).

Finally, a mention for Big Boys (because no-one else seems to include it) - a sitcom that, on the surface, is about a young man coming to terms with his homosexuality - but actually it's much more about the struggles of his heterosexual "one of the lads" best friend and his difficulties fitting into the modern world (so I guess a similar theme to Adolescence, just put together in a gentler way).

## Work

I rediscovered "Outside In Development" - writing specifications in English, then implementing them by starting at the user-interface, drilling in to the database and then returning the results to the UI. And I've missed it so much - the code I write is simpler, the UI is simpler, everything is just simpler.

I've been using a lot of Claude Code. It blew me away at the start of the year - "_now I understand how AI could actually be useful_" - and I'm using it more and more. Code review (which I hate), refactoring, bug fixing, adding in tests which I'd forgotten - it saves me a lot of time.

I've also started to move away from Ruby on Rails - for the first time in 20 years of professional development. I'm playing around with some toy projects using JavaScript, CapacitorJS, Lit.dev and PouchDB (so an offline-first, sync-capable, mobile application or PWA). Lit is fantastic - it's something I've really missed over the last 20 years of web development - writing a component that can actually interact with the user (and having all the code for that interaction in one place). And PouchDB/CouchDB having automatic syncing is fantastic - the quote is "_CouchDB is a database that's shit at everything, except syncing. But syncing is so important you'll love it anyway_". This is true.

## Football

Nuno Nuno Nuno, we're on the piss with Nuno. What a fucking season. Yes, we tailed off towards the end, because our squad wasn't big enough and the injuries started to hit. But seventh in the Premier League. Europe again, Ole, Ole.

And this season - what an absolute shitshow. I knew we would struggle in the league - Chris Wood had totally overachieved and we were no longer the surprise package. But fuck me. I don't know what happened with Nuno but I wish it hadn't. The Australian was just the wrong choice (never mind his merits as a manager, Forest are a defensive-minded counter-attacking club and have been for 50 years; his style just wasn't a good fit). And I'm not 100% convinced by Dyche, although he was exactly what we needed to steady the ship. I still hold out for Glasner; when Nuno left there was a chance of that, but the owner doesn't like to hire managers who are already in a job - and now Glasner will be in high demand. Still, my initial prediction - stay up and win the Europa League - that would be an incredible result for this season.

## Cars

I started the year with my Subaru BRZ. I don't think there's much better - certainly not for sensible amounts of money. It feels analogue, the controls are super-responsive and, while you're driving, the car constantly talks to you and tells you what it's feeling. The only things that could make it better were if it delivered power earlier (it's very flat till you get to 3000 revs) and if you could take the roof off (I love a convertible). However, because of the baby, a 2+2 wasn't really big enough (we couldn't even fit the car seat in the back). So I got an Alfa Giulia Veloce.

I've always complained about modern cars, saying they have no personality. In the olden days of carburettors, you had to learn what your car liked and treat it correctly. As I've said, the BRZ felt _analogue_ - it didn't have a personality as such, but it was expressive and chatty. Whereas the Alfa is a _diva_. It definitely has a personality - that personality is total spoilt brat.

* Reversing into a parking space? _Stop making me do these menial tasks._
* Driving at 20mph in a residential zone? _You're so fucking boring._
* Taking corners at 50mph? _Just let me run free._
* Holding at 70mph on the motorway? _You little bitch, you know you want to go faster._

## And on to the next one

So that was 2025. Let's hope 2026 goes a bit better.
theartandscienceofruby.com
December 31, 2025 at 11:45 AM
Outside In: the return
Recently I've returned to "outside in" style development. This used to be really popular ten to fifteen years ago, but kind of vanished. I suspect the reason for this is that it's a style of development that does not fit when your application is split across multiple code-bases - not for micro-services and, more importantly, not for mobile applications with an API back-end. But that's not what I'm writing. And the trend, in general, seems to be heading back towards server-side rendering and monolithic applications (with a small number of external services). So it's a style of development that makes sense.

The basic idea is that you start from the point of view of a user of the application - someone who is sat "outside" the system. You write a specification that states what they want to do, why they want to do it and the steps they take to make it happen. Again, all of this is written _from the end-user's point of view_. There's no mention of routes or end points or models or databases. End-users don't see those things - they see screens and buttons and menus and fields - so we describe the steps in their terms.

And what I've found is that working this way fits perfectly with _YAGNI_ - "you aren't going to need it". We end up writing less, much simpler, code because, if it's not required to meet the specification, we don't need to write it.

* * *

Aside: one of the things I've noticed over the years is that, when listing their "requirements" for an application, people do not fully understand what they are asking for (and we, as developers, don't fully understand what they have said). Something they thought was "high priority, it _must_ do this" quickly becomes "oh, we don't need that" when they actually use the system. And something that was a "nice to have" (or was never even imagined at the start) quickly becomes "we must have this" when they see the application in action. So YAGNI - writing the minimum amount of code for the _one_ thing we are working on right now - means we don't end up wasting time.

Of course, we still need to design and think through exactly what it is that we _are_ building. And we must be careful to make sure that the database is structured correctly, because database structure changes in tables full of live data can be tricky. But, as you'll see, writing tests at each step means we have confidence, in the future, when we need to make amendments to previous work.

* * *

So, once we have a specification, we start to implement each step in turn. For a web application, it's a browser that the user will be interacting with. So it makes sense to make our steps remote control a browser, making sure it shows the information the user wants to see and behaves in the way that the user is expecting. For Ruby apps, this means using Capybara and headless Chrome or Firefox. This approach does have issues, especially timing issues in Javascript causing flaky tests, but I've found that it's not much of a problem when using Turbo and Stimulus. Plus the flakiness can be minimised by running your application and headless browser in Docker containers. This means that the browser is not affected by whatever else is happening to the browser instance on your desktop. I've also heard that using Playwright instead of Selenium improves reliability too.

The specification is split into "setup", "action" and "expectation" sections. When using the Gherkin specification language, these map to "Given", "When" and "Then" steps.
```gherkin
Feature: Logging in
  Scenario: Successful login
    Given I am an administrator at an account
    When I log in
    Then I should see my dashboard
```

"Given" steps set up the environment into a known state. While we are writing these, we will have to start thinking about models. For example, that first line in the "successful login" scenario implies that we have "accounts", "administrators" and a "user" of some kind. In the spirit of YAGNI, we just write the bare minimum code - if we were using my Fabrik gem, I might write:

```ruby
step "I am an administrator at an account" do
  @me = Fabrik.db.users.create
  @account = Fabrik.db.accounts.create
  @role = Fabrik.db.roles.create account: @account, user: @me, role_type: "administrator"
end
```

I have not designed the database or created these models yet - but at this point in time, it seems likely that this will be enough to make the specification work.

"When" steps are the actions that the user (or the system) takes. Most of the time, these are our Capybara commands to remote control the browser - commands like `visit "/some/page"`, `click_on "The Menu"`, `fill_in "Email address", with: "someone@example.com"`. Again, note that we're not really designing anything - there's nothing about routes or controllers or views, beyond taking note of things that the user will see and can interact with.

```ruby
step "I log in" do
  visit root_path
  fill_in "Email address", with: @me.email_address
  fill_in "Password", with: "password123"
  click_on "Log in"
end
```

"Then" steps test that things have happened as we expected. For the user, we test that the output displayed on screen matches what they are looking for. For the system, we can test the database or check if API calls were made. The user expectations are more important than the system expectations though.

```ruby
step "I should see my dashboard" do
  expect(page).to have_text "#{@me.name}'s Dashboard"
end
```

Once we have an outline set of steps, written in Ruby, we can run the spec. Of course, it will fail - the first step references models that don't exist. It's only now that our design work begins and we create some skeleton models.

I do a quick sketch of some possible database tables and models and decide that a `User` and `Account` make sense, with a `Role` joining the two. I'll also violate YAGNI at this point and make all three of these tables soft-deletable. This is because I know, from experience, that, when using foreign keys with cascade deletes, deleting a `user` or an `account` can cause lots of live client data to be deleted that we really want to keep.

In a Rails application I'd use the Authentication Generator to build the `User` model (adding in additional first and last name fields). I'll add in an `Account` model that has a `name` string field and a `Role` model that `belongs_to :account` and `belongs_to :user`, with an `enum :role_type, user: 0, administrator: 1`. For soft-deletes, I'll add an `enum :status, active: 0, deleted: -1` column to each of these models. Then I'll make sure the correct indexes are added to the migrations and add the validations, associations and normalisations to the ActiveRecord models.

I'll probably write model specs for those validations and normalisations too - because they are part of the business rules of the system. Email addresses _must_ be in a valid format, otherwise you just get lots of errors appearing in your logs when sending notifications - someone will always type the address in incorrectly. This rule is so important that I want to make sure that it is documented in the user spec.
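As an example of what I mean, that rule might be documented with something like this - a sketch, assuming the model uses a format validation and the Rails `normalizes` macro (the exact code on the `User` model will vary):

```ruby
# Sketch of a model spec documenting the email address rules.
# Assumes the model declares something like:
#   validates :email_address, format: { with: URI::MailTo::EMAIL_REGEXP }
#   normalizes :email_address, with: ->(e) { e.strip.downcase }
RSpec.describe User do
  it "rejects badly formatted email addresses" do
    user = User.new email_address: "not-an-email"

    expect(user).to_not be_valid
    expect(user.errors[:email_address]).to be_present
  end

  it "normalises email addresses" do
    user = User.new email_address: "  Alice@Example.COM "

    expect(user.email_address).to eq "alice@example.com"
  end
end
```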
Likewise, an account must _always_ have a name - it's an _invariant_ of the system, so we document it and make sure the specs enforce it.

Finally, I'll add a `Fabrik` configuration so that we can create users, roles and accounts without having to specify all the required fields every time.

```ruby
Fabrik.db.configure do
  with User do
    unique :email_address
    first_name { Faker::Name.first_name }
    last_name { Faker::Name.last_name }
    email_address { |u| Faker::Internet.unique.email(name: u.first_name) }
    password "password123"
    password_confirmation "password123"
    status "active"
  end

  with Account do
    unique :name
    name { Faker::Company.unique.name }
    status "active"
  end

  with Role do
    unique :account, :user
    account { accounts.create }
    user { users.create }
    role_type "user"
    status "active"
  end
end
```

Now that first step passes - and the second step fails because it tries to get the browser to visit the root path, which we haven't created yet. So we add the root path to the `config/routes.rb` file, pointing it at `DashboardController#show`.

The Rails authentication generator has already added a login page and it automatically forces any new controllers to require authentication. So when the browser visits `root_path` it redirects to `new_session_path` and shows the login form. Capybara fills in and submits the form (we may need to tidy it up to meet the specification and make it fit our application's style) and then the spec fails on the final step - it expects to find some text saying we are on the user's dashboard - but our dashboard does not even have a `show` action at this point.

Here I add in a controller spec. This is important because controllers represent the public access points to our system - we need to be sure that users can only read and write data that they have permission for; _data security is our most important responsibility_. So I add in a controller spec that looks something like this:

```ruby
RSpec.describe DashboardController do
  include Login

  describe "showing the dashboard - GET /" do
    it "shows the login page if not logged in" do
      get root_path

      expect(response).to redirect_to new_session_path
    end

    it "shows the dashboard if logged in" do
      @user = Fabrik.db.users.create
      login_as @user

      get root_path

      expect(response).to have_http_status 200
    end
  end
end
```

This is really simple - if we try to view the dashboard without a login, it redirects. But if we are logged in then it renders a page. I've added in a `Login` module that simulates a login for a user (the implementation depends on how your controllers and test framework work - you could just set the session cookie, or you may need to go to the login page and perform an actual login).

However, the controller spec still fails - because our controller does not return any content - and hence no `200` status. We create an empty view for `DashboardController#show`, update the action to render that view and now the controller spec passes. But our actual feature specification is still failing - it's looking for the text "Alice Aardvark's Dashboard". Again, we update the view, with `<h1><%= Current.user.to_s %>'s Dashboard</h1>`, and now the feature spec passes.

In real life I'd actually use `I18n` for the view, because it's easier to start with `I18n` than retrofit it later. Even if we are only ever working in English, there are differences between American English and English English that users will complain about, so we might as well use I18n from the start.

And there we have it - a working feature for logging in a user.
It's completely bare-bones - there's no styling on the UI, the models barely store any data at all. But we can ship it today and say "we can _guarantee_ that this feature works as expected". Now, on to the next feature.
theartandscienceofruby.com
October 5, 2025 at 12:10 PM
LLMs for Software Developers (notes from my talk at NWRUG)
I recently gave a talk at the North West Ruby User Group about how I use LLMs for software development. This was an update on a previous demo I had given on Claude Code. We didn't record the talk, but here are my (adapted) notes.

# LLMs for Software Developers

This is an update on how I use LLMs - mainly Claude and Claude Code - in my day to day software development life. It's a follow-on to the demo I did a few months ago, but what I do now is very different to how I used them then.

## A quick history lesson

I think I look at LLMs slightly differently to many other people. This is because I never learnt formal computer science or software engineering; my degree was in Cognitive Science, as I was (and still am) interested in cognitive neuroscience, linguistics and philosophy of mind. However, I've been a professional software developer for thirty years now; that's because there have been two occasions when computers have had a profound influence on my life.

Firstly, when I was a kid, in the 1980s, my dad got a PC from work. It ran DOS, so nothing was graphical. I got extremely frustrated trying to teach him how to use the word processor; he could not grasp that "blue text on-screen" meant "prints out in bold" and "green text on-screen" meant "prints out double-width". My friend Ben's dad was an academic and they got a Mac. I remember walking in to their front room and seeing it - and, as my (no doubt embellished) memory recalls it, Ben's dad _was teaching Ben_ how to use the computer. This was **sorcery** - and since that moment, I've always tried (but often failed) to make the interfaces to my software as friendly and accommodating of human sensibilities as possible.

Secondly, when I discovered Ruby and Ruby on Rails, I loved it because the APIs were designed to look like English. Not bad considering neither the language nor the framework author were native English speakers. The underlying issue is that _code is easy to write but hard to read_ - so if you can make your code read like English, it reduces that little bit of friction in your head. Which, in theory, should make the code more understandable and more maintainable.

Then, I was watching the latest series of Black Mirror and there's an episode, called Eulogy, starring Paul Giamatti. He has to prepare some memories and recollections for a funeral and is guided through the process by an AI device. It's a great episode with a strong emotional pay-off. But I also realised that the AI device was basically doing a project configuration and data gathering exercise. In a few years' time, the idea that you would have to manually search through stuff, learn a load of settings and options and organise your information by hand will seem antiquated. A computer can simply have a conversation with you, ask the relevant questions, sift out the unimportant stuff and then put the relevant data into the right places.

Two very important ideas there - making computers more accessible to people who do not understand how they work, and code being harder to read than it is to write.

## Ethics

It's not possible to discuss LLMs without mentioning the ethics of these things.

Most importantly, be careful who you listen to. I was reading one _very_ angry blog-post the other day, slating LLMs, saying anyone who uses them was a misguided fool.
Then at the end, the author mentioned that their experience with them was 2 hours pasting some Python code into ChatGPT (they didn't say, but I assume it was the free version, using older, less capable models) and reading Google's famously terrible AI search result summaries. On the other hand, you have loud tech-bros who think "AI" is the second coming of crypto when actually they're just massive arseholes.

With regards to energy usage, it's hard to say - because the "AI" companies are generally private, so do not need to break down their spending. What we can do is think about the energy usage in two forms: inference and training. With regards to inference, even if they are subsidised, the costs for using OpenAI or Anthropic's APIs are probably a good guide - and they have been trending downwards with each model that is released. However, for training costs, the attitude is "MOAR MOAR MOAR". But I feel that's probably because these companies are VC funded, meaning they are using the billions of dollars that were released to already rich people after 2008, which never reached the normal economy. They want to justify their existence and so engage in this huge dick waving contest over who can spend the most. The Chinese DeepSeek models caused such a stir because, even if you can't trust their costings, they must be a fraction of the cost of the American models. Simply because the Chinese do not have access to the same power-hungry hardware.

There's a narrative doing the rounds that AI is already causing massive job losses. I'm not sure this is true; I think it's just being used as cover for job-cutting. But there's a good chance that it will cause job losses in the future. That's because technology always creates change. In the 1980s with desktop publishing, in the 1990s with the web, in the 2000s and 2010s with mobile - they all destroyed entire industries, but also created new ones. Back in the 1800s the Luddites smashed up machines - but not because they were anti-technology. It was because the technology was dehumanising and they had no other way to get their lofty overlords, the factory bosses, to listen.

With regards to copyright, I have a different view to most people. I absolutely believe you should get paid for what you create. But I also grew up in a time of musical "remix culture" - dub reggae, hip hop, British rave and house music - they all sampled liberally from other people's ideas but they created brand new forms, the likes of which had never been heard before. The real issue I have with copyright violations is who those violations benefit. LLMs are owned by VCs, tech bros and Silicon Valley giants. These people live in a different world to the rest of us; they don't look at ordinary people as something they need to worry about (with the likes of the transhumanists and effective altruists openly encouraging their followers to ignore the suffering of billions today because we may be able to alleviate the suffering of hundreds of billions in 20,000 years' time).

If it weren't for Big Tech, I would have no problem with the copyrighted training material. If LLMs were publicly owned, I wouldn't mind about the energy usage. 35% of the US stock exchange is in tech stocks and the value is rising because of LLMs - investors are pouring money into a technology that is demanding ever-growing capital expenditure and is yet to show any signs of making a profit. It's a bubble that's about to burst - and when it does, it will be the ordinary folk, who don't have billions, who will suffer.
The real problem here is how power and control is distributed across society. _Which is as it's always been_.

## How I use LLMs (summer 2025 edition)

The most important thing that I've learnt about LLMs is that you have to control their "context". LLMs are stateless - as you have a "conversation" with them, the previous messages are sent back and forth, between you and the LLM, growing in size every time a new question or reply is added. The maximum size of the state they can pass is called the "context window". Newer models have bigger context windows.

You might think that bigger is better, but there's actually a sweet spot. Too little context and the LLM will "fill in the gaps" by delving into its training data - and if it can't find something directly relevant, it will choose the next best thing. This phenomenon is commonly known as "hallucinations" and is one of those things people who don't really use LLMs use as proof that these things are useless - when actually you've not given it enough information. But too much context and the LLM gets overwhelmed. It gets stuck in loops, it goes off on a tangent and needs the context clearing before it can do anything useful.

That's why there's a new, emerging discipline called "Prompt Engineering". Some people scoff at this - "it's just a sentence, how can that be engineering?" Well, it may not be _engineering_ in the strict sense but it's definitely more than "just a sentence" - it's how you make sure that the context that the LLM is working with is just the right size, with the important details it needs without any of the extraneous stuff. And arguably, like engineering, you need to understand the constraints and test the results to ensure that they are within acceptable tolerances.

When using Claude Code, I use the following files to control the context.

* CLAUDE.md - Claude's instruction file. Claude can generate this for you, but I find it puts too much in there. Mine basically says:
  * This is a Ruby on Rails application
  * Use `bundle exec standardrb --fix` to run the linter
  * Use `bin/rails spec` to run the full test suite
  * Use `bin/rspec path/to/file_spec.rb:LINE_NUMBER` to run an individual spec
  * Details on the models and application structure are in `docs/glossary.md`
  * Details on coding conventions are in `docs/style-guide.md`
* Glossary - describing the structure and ubiquitous language used in the application
* Style Guide - conventions and notes on how the code itself is structured (for example, using Phlex components or always using resourceful routes)
* Commands - Claude Code has a `commands` folder, where you can define specific commands (prompts); more on these later.

Note that the CLAUDE.md file is very short but includes _references_ to the other files. If the LLM needs to know which models to look at, it can read the glossary; if it needs to write code, it can look at the style guide - but it won't load those files into its context unless they're necessary.

### Writing Code

Ever had a ticket that says something like "the customer wants more widgets"? It's not really helpful - _why_ do they want more widgets, what are they trying to achieve? So I've written some prompts for bug reports and feature specifications that ask the reporter for more details. Unlike filling out a generic form, the LLM has been instructed to ask certain questions based on the previous answers - plus it comes across as a conversation, so is much more natural for non-technical users.
When I'm designing a new function I often ask the LLM for advice. Especially when it's something technical, such as an external API. Recently I had to amend the contents of a Word document's XML. The specification was a 5000-page PDF (of which I have read about 600). But the LLM immediately knew which tags and structures I needed to look at.

Every now and then I need to make a change to a load of files. I could figure out a load of regexes, look up the syntax for `sed` and `awk` and write a script to do it. Or I can say "find all ruby classes that do something like this ... then adjust them like this and if they should be name-spaced, update the module and move them to the correct folder". Not an exact regex in sight. In fact the LLM can correctly respond to instructions that are _nothing_ like regexes - such as "find classes that are structured like this one".

I used this to do a major refactoring on a large application. The test suite took over 40 minutes to run, but I knew how to speed it up. The problem was there were over 30,000 test cases - I just couldn't face doing the work. So instead, I updated a couple of the specs myself and got the LLM to do the rest for me. "Look at how I've edited these files and make the same changes across all the rest of the specs - after each one, run the linter, then the individual test, fixing if needed; then move on to the next spec". I started it running on Friday afternoon; by Sunday evening, the entire test suite took less than 15 minutes to run.

If you've got tests, the LLM is really good at bug fixing. I spent 3 hours banging my head on the desk trying to fix a weird routing error; then I asked Claude Code. At first it tried a load of things I had tried, then it "thought" "the issue is to do with the routing, so I'll replace that". And the test passed! I looked at what it had done; it replaced `documents_form_path(@form)` with `Rails.application.routes.url_helpers.documents_form_path(@form)`. I'm not really sure what caused the routing problem - but I wasted 3 hours while Claude fixed it in under 10 minutes. And as for that Word XML processor - Claude can spot typos and issues in the XML in seconds.

Speaking of tests, I often write entire specifications first. "I need a new class ... it will do X, it won't do Y, it will do Z". Then I go through them and make each one pass, one at a time. Except now I often ask Claude to make them pass - it goes away and writes code, running the test suite until it's got a working implementation. In a few cases it's ended up with better code than I would have done. And even in the cases where it doesn't, the tests pass, so I know it's safe to ship and safe to refactor later.

### Reading Code

Remember, code is easy to write but hard to read. At least for humans. LLMs are actually quite good at reading code. So you can ask one "how does this work?" when looking at a new project, and it will probably give you a decent answer.

However, the thing I hate most about reading code is code reviews. So I thought I'd see if I could get the LLM to do the boring bits for me.
I added a "code_review.md" file to Claude's commands folder, detailing the process for code reviews:

* Read the issue from Linear (our issue tracking system)
* Read the project style guide
* Run the linter and ensure all tests pass
* Do a diff between the feature branch and the develop branch
* Briefly evaluate whether the diff implements everything required in the issue ticket
* Check that the changes match the style guide
* Perform a security check on the changed code - ensure all endpoints have automated tests to verify authentication and authorisation
* Ensure that the project glossary, README and other documentation have been updated to include details of these changes
* Make a final recommendation on the changes:
  * Accepted - the code meets the requirements
  * Accepted with UI Review - the code meets the requirements but includes user-interface changes, so requires a visual review
  * Rejected - the code does not meet the requirements and should be returned to the developer with feedback

That doesn't mean I can just trust Claude's code review. But it does do a lot of the boring, tedious stuff and tells me how much effort I need to put in. If Claude says it's OK, I scan the diff to see what it's missed. If Claude says it's not, I dive in and do a full review. Also, Claude updates the documentation for me.

### Integrating LLMs into Rails applications

#### RubyLLM

The RubyLLM gem helped me understand how to make these things useful. I'd read the various APIs and bits of documentation, but nothing really made sense until I saw RubyLLM's Ruby-ish interface. It lets you connect to an LLM's API, send it messages and receive responses. In one of our projects, I started with some simple tasks - "if this image does not include ALT text then ask the LLM to summarise the image" and "extract the keywords from this document" (which then get used to generate a Postgres full-text search index). Then I got started on a more complex task - importing a PDF document. The instructions included "see if there is an existing document with this filename; if there is, add a new revision, otherwise create a new document". To make this work, you add in "tools" - functions that the LLM can call. These are implemented as Ruby classes with an `execute` method; they also include a description and a list of parameters (and their types). When you start the RubyLLM chat, you pass it the tool instances and, as the LLM is working, it decides if it needs to make a tool call, based upon its current context and the tool descriptions (there's a rough sketch of one of these tools just after the MCP notes below).

#### MCP

The next step was to try building a "conversational" interface. As I mentioned at the start, this is the thing that could be transformative for computer interfaces. To make that work, I investigated the Model Context Protocol - a very simple JSON API that runs over `stdio` or streaming HTTP/SSE. The key thing about it is that it also includes discovery (think an OpenAPI specification, but much, much simpler). The Fast-MCP gem is Rack middleware that implements a streaming HTTP/SSE server inside your Rack application. You add "resources" (such as documents) and "tools" (functions that the LLM can call, just like with RubyLLM); the gem then publishes these, and the protocol allows the LLM to discover which resources and tools are available and call them whenever needed. In addition, MCP includes OAuth2 - so if a resource or tool returns a 401, the client tries OAuth2 discovery and then asks the user to authenticate.
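Here's that tool sketch. It's illustrative rather than lifted from the real import code - the class, model and parameters are all made up - but it shows the shape of the thing: a description, declared parameters and an `execute` method that the LLM can decide to call.

```ruby
# Illustrative only - a made-up tool for the "import a PDF" task above.
class FindDocumentByFilename < RubyLLM::Tool
  description "Looks up an existing document by its original filename"

  param :filename, desc: "The filename of the uploaded PDF, e.g. 'contract.pdf'"

  def execute(filename:)
    document = Document.find_by(original_filename: filename)
    if document
      { found: true, document_id: document.id, revisions: document.revisions.count }
    else
      { found: false }
    end
  end
end

# The chat is given the tool instance; the LLM decides when (and whether) to call it.
chat = RubyLLM.chat
chat.with_tool(FindDocumentByFilename.new)
response = chat.ask("Import contract.pdf - add a revision if the document already exists")
```

The exact DSL is in the RubyLLM documentation; the point is how little ceremony is involved.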
Unfortunately I've only had time to do a very basic investigation into Fast-MCP (one resource and one tool) - but I have added OAuth2 discovery and client registration to my application (which the doorkeeper gem does not include), and that's an important starting point. It looks pretty simple; the only thing I'm not sure about is the best way to organise a lot of repeated functionality - HTML controllers, JSON API controllers, and MCP tools and resources.

## King Ludd

As I said at the start, the two big things about LLMs are that they are very good at reading code and that they could be the basis of a computer interface that is much less alienating for a lot of people. But a lot of questions remain about this technology, and the situation is changing very fast.

However, I'm from Nottingham; we call ourselves the Rebel City, because of Robin Hood, the Civil War, Brian Clough (and now Evangelos Marinakis) and the Luddites. I'm happy to call myself a Luddite - they didn't hate the technology, they wanted the owners of that technology to stop treating them with contempt. And that's how I feel today. Embrace the technology, but don't trust the people who own it.
theartandscienceofruby.com
August 24, 2025 at 9:03 PM