Ben Brumfield
benwbrum.bsky.social
Open Source #DigitalHumanities software engineer.
Founder of FromThePage.com, a platform for collaborative #manuscript #transcription to engage the public in #archives and create digital scholarly editions.
Reposted by Ben Brumfield
Quick blog post noting some thoughts on 'Documenting AI-created/enhanced records in catalogues/metadata/displays' - I'd love to know who's already doing it, and how? www.openobjects.org.uk/2026/02/docu... #AI4LAM #MuseTech
Documenting AI-created/enhanced records in catalogues/metadata/displays? – Open Objects
Open Objects
www.openobjects.org.uk
February 9, 2026 at 2:38 PM
Reposted by Ben Brumfield
I am very excited to report that a project I have been working on for 3+ years is finally seeing the light of day! Along with @byzcapp.bsky.social and Jesse Torgerson, we debut a new annual section in Digital Philology that proposes the dataset as a new genre of publication. odfms.hcommons.org
Open Datasets For Medieval Studies – ODFMS is a showcase of world-class research on the Middle Ages in dataset format.
odfms.hcommons.org
February 6, 2026 at 8:20 PM
Reposted by Ben Brumfield
Come work with me at the Library of Virginia in Richmond!

We're an awesome state library/archive with 100m+ items going back 400+ years.

Grant writer / donor comms ($50k-$55k): lvafoundation.org/grants-and-e...

Director of org excellence & assessment ($90k+): www.jobs.virginia.gov/jobs/directo...
February 6, 2026 at 1:59 PM
This is the first time we've been willing to add LLM-supported transcription to FromThePage. Last week's webinar: www.youtube.com/watch?v=UhqR...
Introducing Gemini 3.0 Support in FromThePage
YouTube video by fromthepage
www.youtube.com
December 15, 2025 at 2:23 PM
Reposted by Ben Brumfield
I'm not ready to discard the entire ecosystem we've built around making data openly available to possibly, hopefully slightly impair commercial AI training. Users uploading data (they often don't have the right to relicense for AI training) into AI models is a drop in the ocean.
December 11, 2025 at 1:44 PM
Reposted by Ben Brumfield
My worry here is that tabular, numerical info is much easier to decipher and standardize, so it gets turned into a dataset without…
Think about:
- Scientists' lab notebooks with abandoned experiments that might be promising with today's technology
- Rejected grant proposals addressing problems we can now tackle
- Field notes from geological surveys
- Medical case files, engineering reports, meteorological observations
December 11, 2025 at 12:49 PM
In four hours we will present our experiments integrating #gemini AI #transcription into the #crowdsourcing platform #FromThePage

content.fromthepage.com/dec-2025-web...
Introducing Gemini 3.0 Support in FromThePage (December 11, 2025) - FromThePage Blog
content.fromthepage.com
December 11, 2025 at 12:58 PM
Reposted by Ben Brumfield
"From a transcribers point of view I will be very unlikely now to continue devoting time to working on straightforward handwritten documents without an AI draft as a starting point ."

She plans to attend our webinar this Thursday and might be willing to talk about her experience.
December 9, 2025 at 2:43 PM
Reposted by Ben Brumfield
An update: after a week of testing, the same volunteer is now very happy:
"I have been very pleased to use the AI facility in transcribing Nicholas Piper Log books for Whitby Literary & Philosophical Society."
...
December 9, 2025 at 2:43 PM
It looks like we passed another milestone on FromThePage last week:
December 8, 2025 at 2:28 PM
Reposted by Ben Brumfield
But it really feels like libraries & archives as a field suddenly just went from "we aren't generally attempting to do automated handwriting recognition because it's at the edge of what's possible" to "oh boy now we have another doable but labor-intensive collections enhancement task on the backlog"
December 3, 2025 at 3:03 PM
Reposted by Ben Brumfield
People who like to transcribe tend to be hands-on types, or puzzle solvers, or people who read between the lines. Transcribing is a way of thinking. It is not for every project, but for some projects, it can be crucial.
November 26, 2025 at 3:40 PM
Reposted by Ben Brumfield
As someone with FtP projects, I think there are still several good reasons for people to choose transcribing. The goal is not necessarily to record the words, but to find meaning. Sometimes that comes from reading the words, other times it comes from closely reading each mark.+
November 26, 2025 at 3:38 PM
Reposted by Ben Brumfield
I have never felt that AI is a threat to transcription projects. Transcription is such a fulfilling experience and the words run through you in ways that reading alone can never do. Meeting up with scribes of the past will always be thrilling.
November 26, 2025 at 2:09 PM
Introducing #Gemini 3 Support in #FromThePage content.fromthepage.com/introducing-...

We're still developing capabilities and guardrails now, but plan to present it all at a webinar December 11.
Introducing Gemini 3.0 Support in FromThePage - FromThePage Blog
When Ben sent me Mark Humphries’ report on testing a new, unreleased Gemini model, I got scared. And excited. Mark is a historian and digital humanist who’s gone deep on analyzing AI tools for textual...
content.fromthepage.com
November 24, 2025 at 1:54 PM
Yesterday, Google released Gemini 3, which has gotten really interesting reviews from Mark Humphries: generativehistory.substack.com/p/the-sugar-...

Also yesterday, we shipped an integration between FromThePage and Gemini, allowing transcribers the option of starting with an AI draft.
November 19, 2025 at 3:48 PM
Should a volunteer use #AI to help them transcribe pages for a #crowdsourcing project? That question got me thinking about why, exactly, my answer is "no" and what kinds of purposes different transcriptions may be used for.

content.fromthepage.com/can-voluntee...
Is That Transcription Really Human? - FromThePage Blog
Last month, someone asked this question on the Genealogy and AI Facebook group: If volunteers use AI to transcribe documents, is that OK? I have strong opinions, but want to explain them. First off, th...
content.fromthepage.com
October 23, 2025 at 5:25 PM
Yesterday we presented a serious analysis of our experiments adding AI coding agents (Codex, GitHub Copilot Agent) to our development process at FromThePage to a group of friends in software, and I thought I'd share it here as well. After a couple of months of experiments, the results are very mixed.
September 15, 2025 at 3:34 PM
A little feature we shipped yesterday in FromThePage: you can now reference a work by page ranges, so fromthepage.com/unclibraries... shows a single two-page letter from the fifty-page folder uploaded as a single work!
folder 1251: Correspondence, 1865 (Cameron Family Papers - Records of Enslavement) | FromThePage
fromthepage.com
August 20, 2025 at 12:31 PM
Reposted by Ben Brumfield
What's the Character Error Rate of a Volunteer? Analyzing accuracy in cultural heritage crowdsourcing projects | @benwbrum.bsky.social | #DH2025
July 18, 2025 at 11:22 AM
Sitting in a coffeeshop listening to three strangers at the bar getting into a detailed conversation about the different accents of Louisiana in both English and French. It's a good start to the day.
July 2, 2025 at 2:09 PM
Reposted by Ben Brumfield
Come hear Dreanna Belden share about "One Great Document: a Woman’s Perspective on the Marquis de Lafayette’s 1824 Visit to Yorktown and Norfolk," during our lightning talks at next week's ADE Annual Meeting.
June 13, 2025 at 1:17 PM
Does Oxford University Press no longer publicize conference discounts or holiday sales?
June 9, 2025 at 2:53 PM
I'm really enjoying the inaugural Alliance for Texas History conference at Texas State University. Best session I've attended so far? It's a toss-up between one on a freedman blacksmith and another on highway construction through African-American neighborhoods in Houston: roadstaken.org #ATxH
Roadstaken
roadstaken.org
May 16, 2025 at 6:58 PM