Greg Leppert
@leppert.me
Working on AI and access to knowledge at Harvard. Executive Director of the Institutional Data Initiative; Chief Technologist of the Berkman Klein Center.
Amanda Watson, @institutionaldatainitiative.org's Library Chair and leader of Harvard Law School Library, spoke about the importance of publishing library collections as data to guide the future of AI. hls.harvard.edu/today/food-f...
Food for (AI) thought and the library initiative improving AI’s digital diet - Harvard Law School
Amanda Watson of the Harvard Law School Library says the release of Harvard’s digitized collection is only the beginning of collaborations between libraries and tech firms.
hls.harvard.edu
July 24, 2025 at 7:34 PM
Reposted by Greg Leppert
With @yh-huang.bsky.social, I'm excited to share our Digital Collections Explorer, an open-source, multimodal viewer for digital collections! Users can search with both natural language inputs and reverse image search.
Paper: arxiv.org/abs/2507.00961
Public demo: digital-collections-explorer.com
Digital Collections Explorer: An Open-Source, Multimodal Viewer for Searching Digital Collections
We present Digital Collections Explorer, a web-based, open-source exploratory search platform that leverages CLIP (Contrastive Language-Image Pre-training) for enhanced visual discovery of digital col...
arxiv.org
July 2, 2025 at 8:56 PM
This starts in an hour.
This Monday, @institutionaldatainitiative.org will host Petr Knoth to share his experience leading CORE ("The world’s largest collection of open access research papers") as the rise of AI brings new meaning, and challenges, to stewarding knowledge repositories. Join us virtually via the link below.
June 23, 2025 at 3:41 PM
This Monday, @institutionaldatainitiative.org will host Petr Knoth to share his experience leading CORE ("The world’s largest collection of open access research papers") as the rise of AI brings new meaning, and challenges, to stewarding knowledge repositories. Join us virtually via the link below.
June 20, 2025 at 5:43 PM
Tomorrow, it's our pleasure to host @ayahbdeir.bsky.social to talk about the power of data in building an AI ecosystem that's open, transparent, and fair. 11am ET on June 17th. Register at the link below to attend virtually.
June 16, 2025 at 7:48 PM
The @institutionaldatainitiative.org is proud to support The New Commons challenge. $100k grants along with mentorship. Let's get impactful data into the AI ecosystem.
(1/4) CALL FOR APPLICATIONS FOR DATA COMMONS FOR AI
🏆Today, The Open Data Policy Lab (a collaboration btwn The GovLab & @microsoft.com) launched The New Commons Challenge—an innovation challenge to foster the creation of data commons that can support generative AI developed in the public interest.
April 14, 2025 at 3:46 PM
Reposted by Greg Leppert
To start the weekend, we've got a brand new experience for case law on CourtListener. It has better typography, more features and metadata, five million scanned decisions from @harvardlil.bsky.social, and a lot more. Read all about it and let us know what you think: free.law/2025/03/21/c...
A Faster, Smarter, Unified Case Law Experience
A redesigned case law modernizes the reading experience with enhanced layout and typography, more advanced features, better speed, and more.
free.law
March 21, 2025 at 10:53 PM
As the @institutionaldatainitiative.org expands its mission, we’re announcing a collaboration with @bpl.boston.gov to develop AI-driven tools capable of accelerating new digitization at libraries across the world, starting at the Boston Public Library. institutionaldatainitiative.org/posts/using-...
Using AI to Accelerate Digitization at Boston Public Library
Today, as part of our mission expansion, we’re announcing a collaboration with BPL to develop AI-driven tools capable of accelerating new digitization of large collections at libraries across the worl...
institutionaldatainitiative.org
March 12, 2025 at 1:23 PM
I'm pleased to announce we're expanding our mission at the @institutionaldatainitiative.org with an open call for institutional collaborators, new digitization at Harvard Law School Library, and additional support to advance this work. institutionaldatainitiative.org/posts/open-c...
Expanding Our Mission: An Open Call for Collaborators
Today, we’re pleased to announce an open call for institutional collaborators as new support expands the research capacity of the Institutional Data Initiative.
institutionaldatainitiative.org
March 5, 2025 at 3:36 PM
Great op-ed from @shaynelongpre.bsky.social on the effects AI — as a technology and as a market — is having on the web. www.technologyreview.com/2025/02/11/1...
AI crawler wars threaten to make the web more closed for everyone
There’s an accelerating cat-and-mouse game between web publishers and AI crawlers, and we all stand to lose.
www.technologyreview.com
February 12, 2025 at 7:27 PM
In 15mins (1pm ET), I'll be giving a talk about @institutionaldatainitiative.org and our quest to build the most boring dataset in the world. Tune in here: lu.ma/iqkqvcus
Greg Leppert, Harvard | The Most Boring Dataset in the World · Luma
Foresight Institute’s Intelligent Cooperation Group
The Most Boring Dataset in the World
Abstract: Data is a critical raw material in the construction of AI.…
lu.ma
January 29, 2025 at 5:47 PM
@karaswisher.bsky.social and @ylecun.bsky.social discuss our @institutionaldatainitiative.org in the latest Pivot. It's essential that we establish equitable and sustainable models for data access and stewardship in the age of AI, and IDI is working on exactly that. podcasts.apple.com/us/podcast/m...
Meta's Chief AI Scientist Yann LeCun Makes the Case for Open Source | On With Kara Swisher
Podcast Episode · Pivot · 12/21/2024 · 55m
podcasts.apple.com
December 21, 2024 at 9:26 PM
Yesterday we launched the Institutional Data Initiative at Harvard Law School Library to work with libraries, government agencies, and other knowledge institutions to help refine and publish their collections as data, with an eye toward AI. 🧵 bsky.app/profile/inst...
Hello world. institutionaldatainitiative.org/hello-world....
How Knowledge Institutions Can Build a Promethean Moment
Why we’re launching the Institutional Data Initiative to work with libraries, government agencies, and other knowledge institutions to develop data collections and best practices for artificial intell...
institutionaldatainitiative.org
December 13, 2024 at 2:03 PM