David Steinberg
@david4096.bsky.social
I make stuff to help scientists focus on their research david4096.github.io
Catch you at the next one I hope ;)
October 6, 2025 at 10:18 AM
Catch you at the next one I hope ;)
Nice to meet you too!
April 4, 2025 at 10:16 PM
Nice to meet you too!
The photo we saw reminded me immediately of some of the goals of @dynamicland.org as seen here dynamicland.org/2023/Improvi...
Improvising cellular playgrounds in Realtalk
dynamicland.org
February 26, 2025 at 4:46 PM
The photo we saw reminded me immediately of some of the goals of @dynamicland.org as seen here dynamicland.org/2023/Improvi...
Another important direction is making immersive visual experiences that make data models accessible in a visual and humane way. I hope to experience this in person at a museum github.com/dbcls/dive
GitHub - dbcls/dive: Data Integration Visual Exploration (DIVE)
Data Integration Visual Exploration (DIVE). Contribute to dbcls/dive development by creating an account on GitHub.
github.com
February 26, 2025 at 4:40 PM
Another important direction is making immersive visual experiences that make data models accessible in a visual and humane way. I hope to experience this in person at a museum github.com/dbcls/dive
Slide from a course at @maastrichtu.bsky.social that’s up on GitHub github.com/MaastrichtU-...
GitHub - MaastrichtU-IDS/UM_KEN4256_KnowledgeGraphs: Resources for the KG course at IDS, Maastricht University
Resources for the KG course at IDS, Maastricht University - MaastrichtU-IDS/UM_KEN4256_KnowledgeGraphs
github.com
February 25, 2025 at 9:21 AM
Slide from a course at @maastrichtu.bsky.social that’s up on GitHub github.com/MaastrichtU-...
From Prof Anna Fensel’s keynote a roundup of some of the connections between AI and semantic
February 25, 2025 at 9:11 AM
From Prof Anna Fensel’s keynote a roundup of some of the connections between AI and semantic
Check out our first preprint from #biohacakathon Fukushima 2024 and expect more on this work 🤓 files.osf.io/v1/resources...
files.osf.io
February 17, 2025 at 10:02 PM
Check out our first preprint from #biohacakathon Fukushima 2024 and expect more on this work 🤓 files.osf.io/v1/resources...
We found some low hanging fruit for improvement and tested out bringing a bio dataset into Croissant. We think that continually increasing the use of ontologies and controlled vocabularies will be crucial for data harmonization and the new era of multimodal models!
February 17, 2025 at 10:02 PM
We found some low hanging fruit for improvement and tested out bringing a bio dataset into Croissant. We think that continually increasing the use of ontologies and controlled vocabularies will be crucial for data harmonization and the new era of multimodal models!
We made a simple tool for converting CroissantML to #RDF so it could be analyzed using #SPARQL and looked for differences between its usage between Kaggle and Hugging Face github.com/david4096/cr...
GitHub - david4096/croissant-rdf: Tools for working with RDF from Croissant JSON-LD resources
Tools for working with RDF from Croissant JSON-LD resources - GitHub - david4096/croissant-rdf: Tools for working with RDF from Croissant JSON-LD resources
github.com
February 17, 2025 at 10:02 PM
We made a simple tool for converting CroissantML to #RDF so it could be analyzed using #SPARQL and looked for differences between its usage between Kaggle and Hugging Face github.com/david4096/cr...
It works by providing a controlled vocabulary for high level dataset metadata as well as specific metadata for columnar data, which might seem like a small thing but is a huge step forward for bringing tools to data
February 17, 2025 at 10:02 PM
It works by providing a controlled vocabulary for high level dataset metadata as well as specific metadata for columnar data, which might seem like a small thing but is a huge step forward for bringing tools to data
@hf.co , @kaggle.com , OpenML, DataVerse and others are all implementing some or part of the CroissantML spec that interoperates with tooling like Tensorflow so you can load datasets directly into your AI training code
February 17, 2025 at 10:02 PM
@hf.co , @kaggle.com , OpenML, DataVerse and others are all implementing some or part of the CroissantML spec that interoperates with tooling like Tensorflow so you can load datasets directly into your AI training code