Lightnews — Scholar-powered news

Ben Brumfield

@benwbrum.bsky.social

1.3K followers 1.7K following 130 posts

Open Source #DigitalHumanities software engineer.
Founder of FromThePage.com, a platform for collaborative #manuscript #transcription to engage the public in #archives and create digital scholarly editions.


Ben Brumfield

@benwbrum.bsky.social

This one made a big impression on me: pulitzercenter.org/stories/lega...

However this might be a bit closer to the time-frame you're looking for: www.theatlantic.com/magazine/arc...

The Legacy of India’s Quest to Sterilize Millions of Men

Breaking the Cycle: Part 1 In 1976, men across India were drastically changing their behavior. Some were abandoning the beds inside their homes to sleep in fields; others were skipping major festivals...

pulitzercenter.org

September 15, 2025 at 4:00 PM

Ben Brumfield

@benwbrum.bsky.social

I'd love to hear how and whether other developers in #digitalhumanities or libraries and archives have done similar experiments and evaluations.

For now, we're exercising a lot more discipline with these tools to keep from wasting our time on shiny new things.

September 15, 2025 at 3:53 PM

Ben Brumfield

@benwbrum.bsky.social

3. Pitching in on specific tasks that we don't have enough skills to do as well ourselves, like translating messages for our recurring "Werk/Arbeit;obra/trabajo" problem or tweaking our UI for A11Y issues (which it does well with).

September 15, 2025 at 3:51 PM

Ben Brumfield

@benwbrum.bsky.social

2. Helping refactor or isolate our legacy test suite. Asking an agent to isolate a single test at a time might finally get us out of dependency hell in our test suite.

September 15, 2025 at 3:50 PM

Ben Brumfield

@benwbrum.bsky.social

1. Fixing small user-reported/developer-noticed problems so that we don't have to interrupt developers working on other efforts. This lets us fix bugs on a manager's schedule instead of a maker's schedule (cf. Paul Graham). (Sara and I spend most of our days on a manager's schedule, unfortunately.)

September 15, 2025 at 3:49 PM

Ben Brumfield

@benwbrum.bsky.social

Our friends asked us an important question: what are we trying to accomplish? What would be the sweet spot for agentic development?

Here's what we identified:

September 15, 2025 at 3:48 PM

Ben Brumfield

@benwbrum.bsky.social

And these interactions are very costly. AI-authored PRs averaged three times the interventions that are needed when we work with PRs created by our developer Will. (Will is amazing!) And our current AI agents don't seem to be able to actually run our test suite, which we may be able to fix.

Mean numbers of comments on PRs
Will: 2.6
Copilot: 7.9

Ben/Sara interventions per commit:
Will: 0
Copilot: 3-4

Environment issues:
Rubocop (linter)
Test environment
I18n-tasks
Do we invest in solving these?
More time sunk into unpleasant AI environment problems
Might make this actually work?

September 15, 2025 at 3:46 PM

Ben Brumfield

@benwbrum.bsky.social

The quantity of AI work also doesn't indicate quality. Fully a third of the PRs opened by Copilot Agent or Codex were so bad that we abandoned them rather than trying to fix them -- often after several interactions with the AI agent.

Pie charts showing pull requests accepted (two thirds) vs. abandoned (one third) by Copilot Agent or Codex.

September 15, 2025 at 3:42 PM

Ben Brumfield

@benwbrum.bsky.social

The bad news is that while our issue backlog dropped, the backlog of pull requests needing review, test and approval skyrocketed. Sara and I started reviewing AI PRs instead of more important PRs.

Some of the issue backlog wasn't due to AI at all, but simple grooming (closing duplicates, etc.)

September 15, 2025 at 3:40 PM

Ben Brumfield

@benwbrum.bsky.social

At the end of July, we tried an experiment going through our old issue backlog. Low-stakes bug fixes and enhancements seemed ideal for turning over to an AI. The good news was that we made a lot of progress on our backlog.

Graph showing a spike in issues closed at the end of July through August, as well as overall reduction in open issues.

September 15, 2025 at 3:37 PM

Ben Brumfield

@benwbrum.bsky.social

Oh, I agree completely. We're doing different research from yours, but we're definitely not turning our backs on LLMs altogether.

September 5, 2025 at 3:53 PM

Ben Brumfield

@benwbrum.bsky.social

I worry that the non-corporate models have many of the threats to good scholarship--seductive plausibility, atrophy of human critical skills--that the corporate models do. Hitching the DH bandwagon to that seems like it would not improve the reputation of #digitalhumanities among trad scholars.

September 5, 2025 at 1:17 PM

Ben Brumfield

@benwbrum.bsky.social

Congratulations, Sheila. I just spotted this and am very glad for you and the team.

September 2, 2025 at 7:38 PM

Ben Brumfield

@benwbrum.bsky.social

I'll just wait for it to show up in my podcast feed.

August 28, 2025 at 8:03 PM

Ben Brumfield

@benwbrum.bsky.social

If I click the link from Chrome (in Texas), I see a brief flash of the episode page, followed by a replacement with a 404 page. Loading it in Lynx shows the episode page (but without any player, of course).

August 28, 2025 at 7:52 PM

Ben Brumfield

@benwbrum.bsky.social

I've subscribed to the InOurTime podcast for years. For the last year or so, each episode has been punctuated by some medium-obnoxious ads, and prefaced by an annoying "BBC Sounds is Supported by Advertising outside the UK" announcement.

Still very much worth it when I've got easy access to a FF button.

August 28, 2025 at 7:49 PM

Ben Brumfield

@benwbrum.bsky.social

For a more extensive commercial solution, we support crowdsourced transcription and translation at FromThePage.com -- you'll probably need to do your own outreach, however, since we don't have a lot of volunteers who work on Polish processes.

FromThePage.com

August 14, 2025 at 2:20 PM

Ben Brumfield

@benwbrum.bsky.social

Thanks for this lovely photo!

July 23, 2025 at 9:15 PM

Reposted by Ben Brumfield

Martin R. Kalfatovic

@udcmrk.bsky.social

What's the Character Error Rate of a Volunteer? Analyzing accuracy in cultural heritage crowdsourcing projects | @benwbrum.bsky.social | #DH2025

July 18, 2025 at 11:22 AM

Ben Brumfield

@benwbrum.bsky.social

Oh, that's awesome! I had no idea #bots was a thing.

June 17, 2025 at 9:38 PM
