Matteo Cargnelutti
matteocargnelutti.dev
@matteocargnelutti.dev
Principal Engineer @ Institutional Data Initiative, Harvard Law School.
Fellow @ Library Innovation Lab, Harvard Law School.
Working on digital preservation, open knowledge, and the data that underpins AI. https://matteocargnelutti.dev
Reposted by Matteo Cargnelutti
Today we released Institutional Books 1.0, a 242B token dataset from Harvard Library's collections, refined for accuracy and usability. 🧵
June 12, 2025 at 9:12 PM
Reposted by Matteo Cargnelutti
1M public domain books now available digitally, through our Institutional Data Initiative at Harvard.
Today we released Institutional Books 1.0, a 242B token dataset from Harvard Library's collections, refined for accuracy and usability. 🧵
June 12, 2025 at 9:34 PM
Reposted by Matteo Cargnelutti
I'm pleased to announce we're expanding our mission at the @institutionaldatainitiative.org with an open call for institutional collaborators, new digitization at Harvard Law School Library, and additional support to advance this work. institutionaldatainitiative.org/posts/open-c...
Expanding Our Mission: An Open Call for Collaborators
Today, we’re pleased to announce an open call for institutional collaborators as new support expands the research capacity of the Institutional Data Initiative.
March 5, 2025 at 3:36 PM
Reposted by Matteo Cargnelutti
I noticed someone mentioning that they couldn't afford "How Git Works" right now.

we have a "buy one give one" program where we give away 1 free copy for every zine sold. You can use code BUYONEGIVEONE to get a free PDF copy of How Git Works if $12 is a lot for you.

wizardzines.com/zines/git/
How Git Works
February 25, 2025 at 4:46 PM
Reposted by Matteo Cargnelutti
What insights emerge when a librarian, a software engineer, and a legal scholar come together to experiment with Retrieval Augmented Generation (RAG) to explore over 800,000 French legal articles 🇫🇷?

Blog post: lil.law.harvard.edu/blog/2025/01...
Case study: lil.law.harvard.edu/open-french-...
January 22, 2025 at 8:35 PM
Reposted by Matteo Cargnelutti
"We picked a century scale because most physical objects can survive 100 years in good care. It is attainable, and yet we selected it because the design of mainstream digital storage mediums are nowhere close to even considering this mark."

lil.law.harvard.edu/century-scale-storage
December 13, 2024 at 2:08 PM
Reposted by Matteo Cargnelutti
Yesterday we launched the Institutional Data Initiative at Harvard Law School Library to work with libraries, government agencies, and other knowledge institutions to help refine and publish their collections as data, with an eye toward AI. 🧵 bsky.app/profile/inst...
December 13, 2024 at 2:03 PM