Sebastian Majstorovic
@storytracer.com
Open Data Consultant for @eleutherai.bsky.social & Digital History Advisor for @eui-history.bsky.social. Co-founder of @datarescueproject.org and @sucho-org.bsky.social. Website: https://www.storytracer.com/
Reposted by Sebastian Majstorovic
BREAKING NEWS
Bureau of Labor Statistics announced cancellations of several key data releases

🔺 Job Openings and Labor Turnover (JOLTS)
🔺 Employment Situation
🔺 Consumer Price Index
... and more

This has knock-on effects for other products, like GDP (produced at BEA)

www.bls.gov/bls/2025-lap...
Revised news release dates following the 2025 lapse in appropriations
www.bls.gov
November 21, 2025 at 4:50 PM
Reposted by Sebastian Majstorovic
Building datasets to train smaller, task-focused models used to be incredibly time-consuming.

Very excited to see SAM3 massively lower that barrier. Describe the class you want to detect and get annotated datasets automatically!

Try it yourself: huggingface.co/datasets/uv-...!
November 21, 2025 at 1:30 PM
Reposted by Sebastian Majstorovic
1/ Announcing GovScape – a public search system for 10 million U.S. government PDFs (70 million pages)! GovScape offers visual search, semantic text search, and keyword search. Explore below:

Website: www.govscape.net
ArXiv link: arxiv.org/abs/2511.11010
www.govscape.net
November 18, 2025 at 8:19 PM
The @mozilla.org team has done a spectacular job for MozFest 2025. If you're also in Barcelona and would like to chat about Open Data and Open Source AI, send me a DM. I'm here until Monday! #mozfest #mozfest2025 #mozillafestival #mozilla #opensource #ai
November 7, 2025 at 2:36 PM
Reposted by Sebastian Majstorovic
The John D. and Catherine T. MacArthur Foundation has generously awarded us funding to secure our own storage. This critical processing space will be instrumental in ensuring that large datasets can be temporarily stored, curated, and described.

Thank you, MacArthur Foundation, for your support!
Data Rescue Project receives support from the John D. and Catherine T. MacArthur Foundation to support data rescue efforts
FOR IMMEDIATE RELEASE Since launching in February 2025, the Data Rescue Project has grown substantially. At this point, the DRP has enabled the rescue of more than 1,000 datasets from US Federal…
www.datarescueproject.org
November 4, 2025 at 5:20 PM
Reposted by Sebastian Majstorovic
Members of our Steering Committee, @lyndamk.bsky.social and @storytracer.com, are in Strasbourg, France, today and tomorrow to talk about the DRP at Numérique en Commune[s]. Some of the earliest interest in our work came from the French media, so it is exciting to be here.
October 29, 2025 at 1:50 PM
Reposted by Sebastian Majstorovic
A neat tool I just came across: Viabundus, a digital road map of northern Europe 1350-1650, that lets you calculate contemporary travel routes/times. In 1500, going Amiens → Köln by horse took almost 7 days and 13 toll payments.

#medievalsky

www.landesgeschichte.uni-goettingen.de/handelsstras...
October 24, 2025 at 10:58 PM
Reposted by Sebastian Majstorovic
Very nice work! IMO, this is the kind of topic that more libraries/GLAM/DH people should be working on. The training of these models is *relatively* simple. As always, the missing ingredient is readily accessible data.
It's been brewing for months: @inriaparisnlp.bsky.social releases CoMMA (Corpus of Multilingual Medieval Archives)!

📚 2.5bn tokens of mostly Latin and French texts
🕰️ 800→1600 CE
📜 23k manuscripts
🖥️ 18k on the reading interface: comma.inria.fr
🔍 Paper: inria.hal.science/hal-05299220v1

(1/🧵)
CoMMA
comma.inria.fr
October 15, 2025 at 3:55 PM
Reposted by Sebastian Majstorovic
We are honored to receive an NDSA Digital Preservation Excellence award. In accepting the award, @lyndamk.bsky.social expressed how this work is only possible due to our volunteers who "have spent countless hours working to ensure that public data remains a public good that is publicly accessible."
October 10, 2025 at 8:25 PM
Reposted by Sebastian Majstorovic
Making things in Global Asia

🗺️ Explore the new online exhibition by our @erc.europa.eu project CAPASIA 👉 loom.ly/ig-EXQY

The exhibit highlights the overlooked, but nonetheless worldmaking, role of Asian manufacturing in the global history of #earlymodern capitalism

🌐 #onlineexhibition #capitalism
Making Things in Global Asia Exhibition
Long before Asia became the epicentre of global manufacturing at the end of the twentieth century, it was home to sophisticated cultures of production in the early modern period (1500-1800). This rich...
loom.ly
September 22, 2025 at 12:26 PM
Reposted by Sebastian Majstorovic
Robert Redford (1936-2025) 🤍
September 16, 2025 at 1:16 PM
Reposted by Sebastian Majstorovic
Y'all are ducking fantastic! Because of you and your efforts to #SaveOurSigns, we've collected more than 6000 photos from over 300 #NationalParks. Keep up the great work!
September 3, 2025 at 8:30 PM
Reposted by Sebastian Majstorovic
New DRP post is up. If you aren't familiar with the Federal Statistical System, we encourage you to learn more. And tell others. Be that data nerd in your family.

www.datarescueproject.org/the-federal-...
The Federal Statistical System Under Threat
Learn about the Federal Statistical System and ways to support it. Public data is a public good!
www.datarescueproject.org
September 2, 2025 at 6:28 PM
Reposted by Sebastian Majstorovic
Powerful presentation from @lyndamk.bsky.social of @datarescueproject.org on ongoing work to rescue and preserve digital and physical federal data targeted by the Trump administration www.datarescueproject.org #OpenFest25
Data Rescue Project
Preserving public data
www.datarescueproject.org
September 2, 2025 at 2:00 PM
Reposted by Sebastian Majstorovic
EPFL, ETH Zurich, and CSCS today released Apertus, Switzerland's first large-scale, multilingual language model (LLM). As a fully open LLM, it serves as a building block for developers and organizations to create their own applications.
ethz.ch/en/news-and-...
Apertus: a fully open, transparent, multilingual language model
EPFL, ETH Zurich and the Swiss National Supercomputing Centre (CSCS) released Apertus today, Switzerland’s first large-scale, open, multilingual language model — a milestone in generative AI for trans...
ethz.ch
September 2, 2025 at 9:07 AM
Reposted by Sebastian Majstorovic
Struggling to participate in the Indian Ocean sea trade

Read the 🔓 #OpenAccess article by Michael O'Sullivan on the evolution of Ottoman shipping in the Indian Ocean from 1650 to 1900 👉 buff.ly/GN9hF8G

Part of our CAPASIA @erc.europa.eu project research 👉 buff.ly/K57rhlr

📚 #Fridayreads
August 29, 2025 at 1:08 PM
Here we go, they're seriously coming after Wikipedia now: www.heise.de/en/news/Wiki...
Wikipedia: Republicans launch investigation
An investigative committee in the USA is demanding information from the Wikimedia Foundation within two weeks about suspected manipulation of content.
www.heise.de
August 29, 2025 at 6:21 AM
Reposted by Sebastian Majstorovic
🎙️ Say hello to OLMoASR—our fully open, from-scratch speech-to-text (STT) model. Trained on a curated audio-text set, it boosts zero-shot ASR and now powers STT in the Ai2 Playground. 👇
August 28, 2025 at 4:13 PM
Reposted by Sebastian Majstorovic
Great article about our work, profiling @storytracer.com. Thanks, @zeit.de and @cendt.de!
@zeit.de published a long article about the @datarescueproject.org today. In his portrait, @cendt.de manages to explain vividly and clearly why public data in the USA must be rescued from deletion. www.zeit.de/2025/37/date...
Data deletion in the USA: A backup of reality
The US government is erasing information from the internet that does not fit the MAGA worldview. A man from Cologne is trying to back it up faster than Trump can delete it.
www.zeit.de
August 28, 2025 at 12:50 PM
@zeit.de published a long article about the @datarescueproject.org today. In his portrait, @cendt.de manages to explain vividly and clearly why public data in the USA must be rescued from deletion. www.zeit.de/2025/37/date...
Data deletion in the USA: A backup of reality
The US government is erasing information from the internet that does not fit the MAGA worldview. A man from Cologne is trying to back it up faster than Trump can delete it.
www.zeit.de
August 28, 2025 at 11:14 AM
Reposted by Sebastian Majstorovic
Thank you for speaking out! We all work together to #SaveOurSigns

saveoursigns.org
August 24, 2025 at 10:45 AM
Reposted by Sebastian Majstorovic
#DataRescuers, let's get #InFormation! Help us stay informed and submit any datasets that need rescuing to our Nominations Form:
Nominate Data in Need of Rescue | Baserow
Please fill out this form to nominate public datasets, data resources, and tools created and maintained by U.S. government organizations which may be in danger of deletion or alteration. After…
baserow.datarescueproject.org
August 14, 2025 at 8:15 PM
Reposted by Sebastian Majstorovic
Absolutely necessary investment and good move by NVIDIA. Very hard to do science when you’re simultaneously trying to answer your research question, and figure out how your instruments were built and what they do.
NSF and NVIDIA award Ai2 a combined $152M to support building a national level fully open AI ecosystem | Ai2
Ai2 has been awarded a combined $152 million from the U.S. National Science Foundation (NSF) and NVIDIA as part of a jointly funded project to advance our research and develop truly open AI models and...
allenai.org
August 14, 2025 at 12:20 PM
Reposted by Sebastian Majstorovic
Trump's pick for Bureau of Labor Statistics commissioner suggests suspending monthly jobs report. cnn.it/4fvvr0u
August 12, 2025 at 4:15 PM
Reposted by Sebastian Majstorovic
Amazing work by @lyndamk.bsky.social and @datarescueproject.org to preserve history in the face of fascism. #SaveOurSigns
In roughly six weeks, more than a dozen exhibits about slavery at Independence National Historical Park could be removed or covered up by the Department of Interior.

Philadelphians are trying to preserve or archive these sites before it could be too late.
Inside the fight to save more than a dozen Independence Park exhibits from potential Trump admin removal in September
Two Philadelphians are working to preserve or archive historic sites at Independence National Historical Park before items are removed or covered by the Trump administration in the fall.
inquirer.com
August 1, 2025 at 8:19 PM