Matt Miller
@thisismattmiller.com
Libraries/Data -- thisismattmiller.com
Halloween blog post: Italian Giallo Horror Films

thisismattmiller.com/post/giallo/

- Using a vision language model to analyze a 70-film corpus (🧟) / 80,000 frames
- Building and plotting “trope clusters” across movies

Probably the longest eye acting supercut you've seen: youtu.be/cGrmkOwut6k
October 31, 2025 at 6:50 PM
New Post: PEN America Banned Books 2025 dataset
thisismattmiller.com/post/book-ba...

Looking at school district book bans

- Interactive Map interface to the books banned in 2024-2025
- A faceted browse interface to the 3700 books
- Subject heading analysis
October 17, 2025 at 7:38 PM
New Blog Post.
Library of Congress & Flickr Commons: Analysis of user interactions on 40,000 images
thisismattmiller.com/post/lc-flic...
- Organizing 95K photo comments.
- Viewer to explore user georectified images
- Folksonomy tagging vs LCSH Vocabulary
- Placing into the Wiki* knowledge graph
October 7, 2025 at 7:03 PM
One output: 1 hour 40 minutes of Siskel and Ebert summaries:
www.youtube.com/watch?v=hFLM...
July 30, 2025 at 8:10 PM
Trying out workflows that use multimodal LLMs for validation and QA.

In this blog I walk through a test using 1000 Siskel and Ebert videos to extract key video frames and other data.

thisismattmiller.com/post/buildin...
Building datasets from video collections using local & cloud LLMs
Using Qwen2.5-VL, Gemini 2.5 and Whisper to build a Siskel and Ebert dataset
July 30, 2025 at 7:43 PM
Reposted by Matt Miller
maintenance
July 26, 2025 at 7:06 PM
Reposted by Matt Miller
New dataset on bestsellers from 40+ countries, with consistent coverage for France, Germany, Spain, Italy, and the U.S.

Congrats to the authors @sdileonardi.bsky.social, @beccacohen.bsky.social, and @dan-sinnamon.bsky.social on this major contribution! 🎉

🔗: doi.org/10.18737/386...
July 29, 2025 at 2:49 PM
Reposted by Matt Miller
Gremlins 2: The New Batch (1990)
Director: Joe Dante
Cast: Phoebe Cates-Kline, Sylvester Stallone, Hulk Hogan, Zach Galligan, Christopher Lee
Watch Review
wp / wd
July 24, 2025 at 9:19 AM
thisismattmiller.com/post/glitch/

New blog post about @glitch.com shutdown, how I migrated my apps, and how I used glitch for teaching and creative projects.
July 23, 2025 at 7:28 PM
The Library of Congress BIBFRAME Update is online today at 1PM EDT.
Talks about:
- Hubs (BF ontology)
- BF Cataloging at Penn Libraries
- BF Validation Tooling
listserv.loc.gov/cgi-bin/wa?A...
June 30, 2025 at 1:56 PM
Yeah, we have bots endlessly flooding id.loc.gov, stressing servers to the limit trying to scrape millions of HTML pages, even though we offer pretty much all of it as bulk downloads: id.loc.gov/download/
June 17, 2025 at 10:00 PM
Reposted by Matt Miller
ideal
May 16, 2025 at 11:06 AM
A new very chill bot, for these very un-chill times.
Posts FERNS from "Ferns: British and exotic..." by E. J. Lowe. 8 vols 1856-1860.
Makes a new collage every 8 hours.
judicious
May 15, 2025 at 4:09 AM
Reposted by Matt Miller
Interesting! Because they just terminated their grant for the Post45 Data Collective, which preserves and establishes access to collections of literary and cultural data!
NEH announces a new funding opportunity to support "projects that develop and implement educational programs for professionals who preserve and provide access to humanities collections" bit.ly/43aviek
Preservation and Access Education and Training
Supports the development of knowledge and skills among professionals responsible for preserving and establishing access to humanities collections.
May 1, 2025 at 11:04 PM
If you need me I’ll be in…
April 14, 2025 at 5:22 AM
Reposted by Matt Miller
waiting for the dead bodies to arrive, but they're not doing me any favors
March 24, 2025 at 3:06 AM
It’s true, and /r/pslf is full of community bureaucratic evocation tips, like “did you try the wet signature doc upload ritual?” or “are you sure you used this exact text in your reconsideration request summoning?”, though it's unclear if they ever actually work.
New minizine: "$93,605 : A Student Loan Ghost Story."

Framing it as a 30-year haunting starts to get at the impact #StudentDebt has had on my life. 👻 Read it online for free at violetbfox.info/minizines/.
March 10, 2025 at 7:17 PM
I wrote a bit about turning triples into pie charts.
March 5, 2025 at 7:36 PM
Reposted by Matt Miller
We’ve posted a job ad to join our team at LC Labs.

I am very proud and excited about the work we have planned. Please share.

I’ll also mention that we are part of the legislative branch and this is a partner-supported project.

www.usajobs.gov/job/832669800
Sr. Innovation Specialist
This position is located in the Digital Innovation Division, Digital Strategy Directorate, Office of the Chief Information Officer. The position description number for this position is 35903...
February 26, 2025 at 11:33 PM
Reposted by Matt Miller
Untitled
February 12, 2025 at 11:06 AM
Less Gemini AI in my gmail and more fixing the broken Bus Stop theme that I've used for the last 15+ years
February 10, 2025 at 6:24 PM
"How to tell the birds from the flowers and other wood-cuts." 1929 public domain book.
babel.hathitrust.org/cgi/pt?id=co...
found via
thisismattmiller.github.io/hathi-pd-202...
February 7, 2025 at 4:31 AM
New blog post: Three interfaces to explore the 50,000 HathiTrust resources from 1929 that entered the public domain last month

thisismattmiller.com/post/hathi-p...
February 6, 2025 at 7:06 PM
preparing for department of ed shutdown potentiality, downloading all my info, and just looking at my pslf dashboard... "he was just three days from retirement"...almost made it :(
February 6, 2025 at 5:32 PM
Reposted by Matt Miller
New publication! “Knowledge Graphing Art Archives: Methods and Tools from the Semantic Lab’s E.A.T. Project”

Highlighting work creating a knowledge graph for archival materials from the avant-garde movement, Experiments in Art and Technology (E.A.T.).

openhumanitiesdata.metajnl.com/articles/10....
January 28, 2025 at 4:10 PM