Chris Mungall
cmungall.bsky.social
Berkeley Lab, Environmental Genomics and Systems Biology division. #GeneOntology #MonarchInitiative #AllianceGenome #NationalMicrobiomeDataCollaborative #OBOFoundry.
Pinned
Wow, AlphaGenome is a huge deal: 1 Mb context windows and prediction of a variety of features, with cell and tissue specificity! Read @anshulkundaje.bsky.social's excellent thread for the details. I also want to highlight one more thing for my structured data nerd friends...
This is a really exciting leap forward for genomic sequence-to-activity gene regulation models. It is a genuine improvement over pretty much all SOTA models, spanning a wide range of regulatory, transcriptional and post-transcriptional processes. 1/
Excited to launch our AlphaGenome API goo.gle/3ZPUeFX along with the preprint goo.gle/45AkUyc describing and evaluating our latest DNA sequence model powering the API. Looking forward to seeing how scientists use it! @googledeepmind
You can also find training material, how-to guides, links to tools, tips for making agentic curation of your knowledge base hallucination-resistant, etc. here: ai4curation.io/aidocs/
AI4Curators - AI Guides
Documentation for AI for curation
ai4curation.io
February 12, 2026 at 3:59 PM
For staying up to date, I recommend joining the Monarch/OBO Academy (free) and following along with the excellent training material on all things semantics and AI. Find a link to our Slack here: obofoundry.org
OBO Foundry
obofoundry.org
February 12, 2026 at 3:59 PM
The first part was 4 hours long, a mix of foundational basics and hands-on activities (thanks to Jonah Cool of @anthropic.com for complimentary Pro accounts!). Slides + recordings here:
doi.org/10.5281/zeno...
Gene Ontology Curators AI Workshop (Part 1)
Goals:   Equip curators with general purpose AI skills and literacy, usable in a variety of different contexts, and to unblock paths to continued exploration.    By the end of training curators will b...
doi.org
February 12, 2026 at 3:59 PM
If you have more time and are looking for more of a foundational introduction to genAI (with lots of bio examples, and no maths) we are running a training series for members of the @geneontology.bsky.social consortium.
February 12, 2026 at 3:59 PM
By the way, Codex is also great, and we'd love to incorporate more material on opencode and other tools, but we have limited time and resources, and more of us were familiar with Claude Code (CC), so we went with that.
February 12, 2026 at 3:59 PM
For these kinds of workshops it can be a challenge getting everyone set up with both coding agent installation AND subscription access. We found GitHub Codespaces works great for this: everyone gets VS Code + Claude Code + skills directly in their browser! github.com/ai4curation/...
GitHub - ai4curation/icbo-ai-tutorial: Material for 2025 ICBO AI Tutorial.
Material for 2025 ICBO AI Tutorial. Contribute to ai4curation/icbo-ai-tutorial development by creating an account on GitHub.
github.com
February 12, 2026 at 3:59 PM
We also ran a workshop at ICBO last year, "Accelerating Ontology Curation with Agentic AI and GitHub", aimed at ontology developers, where we had a hands-on session using CC for live ontology editing. www.youtube.com/watch?v=_9Re...
ICBO 2025: Accelerating Ontology Curation with Agentic AI and GitHub
YouTube video by Monarch Initiative
www.youtube.com
February 12, 2026 at 3:59 PM
A key message here is that coding agents are not just for code! Yes, most run in the terminal or VS Code, and they get a lot of their power from running command line tools. But you don't need to know anything about the command line! Coding agents can edit any kind of file (and run any kind of verifier).
February 12, 2026 at 3:59 PM
Part 2 will be posted here shortly: oboacademy.github.io/obook/course...
Monarch Ontology Training - OBO Semantic Engineering Training
oboacademy.github.io
February 12, 2026 at 3:59 PM
As part of the @monarchinitiative.bsky.social / OBO Academy series, we had @christabone.bsky.social give us a two-part introduction to "Efficient Biocuration and Bioinformatics with Claude Code". Part 1 (video and hands-on material) is here: oboacademy.github.io/obook/tutori...
Getting Started with Claude Code - OBO Semantic Engineering Training
oboacademy.github.io
February 12, 2026 at 3:59 PM
What are those tools? I have been waiting for the agent harness that marries the power of a coding agent with a less intimidating UI. There are some great candidates: Goose (has CLI + UI), Claude Desktop, now Claude Co-work. But increasingly I'm recommending: go straight for a coding agent tool!
February 12, 2026 at 3:59 PM
This week I participated in the excellent @biocurator.bsky.social virtual AI workshop. I presented some general tips for learning about agents. zenodo.org/records/1861... A lot of the advice comes down to: find time to learn+don't wait for the perfect curation tool, start using existing agent tools!
Staying in the Loop: A Biocurator's Guide to Agentic AI Developments
Large language models have rapidly become part of everyday scientific workflows, yet most biocurators still interact with AI primar...
zenodo.org
February 12, 2026 at 3:59 PM
Over the last few months I've been helping organize various tutorials and workshops on agentic AI, aimed mostly at biocurators, ontology developers, and PIs of knowledge bases / data resources. Some of this might be generally useful to folks who don't identify as a 'technical' or an 'AI' person.🧵
February 12, 2026 at 3:59 PM
To be fair, the main finding was the delta between LLMs alone and LLMs in the hands of users: “We identify user interactions as a challenge to the deployment of LLMs for medical advice”. Current models blow away 4o, and are likely more forgiving of inexperienced users, but I suspect the delta remains.
February 12, 2026 at 6:19 AM
Claude code in a loop plus some markdown files for skills and agents is all you need!
January 16, 2026 at 6:19 AM
Last year we made a CLI wrapper for different deep research APIs. As a baseline implementation we simply run Claude Code in a loop. It works rather well!
Well, I discovered there is a name for this pattern: Ralph. We made a Ralph Wiggum deep researcher. monarch-initiative.github.io/deep-researc...
January 12, 2026 at 2:41 AM
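For anyone curious what "an agent in a loop" means mechanically, here is a minimal sketch of the pattern. It is not the deep research wrapper's actual code: `run_agent_loop`, `make_stub_agent` and the `TASK_COMPLETE` sentinel are hypothetical names made up for illustration, and the agent is a stub so the example runs without any subscription. In practice `agent_step` would hand the prompt to a real coding agent (for example by shelling out to its CLI).

```python
from typing import Callable, List

# Hypothetical sentinel the agent is asked to emit when it considers the task done.
DONE_MARKER = "TASK_COMPLETE"


def run_agent_loop(
    agent_step: Callable[[str], str],
    task: str,
    max_iterations: int = 5,
) -> List[str]:
    """Re-send the same task, plus notes from earlier passes, until done or out of budget."""
    notes: List[str] = []
    for _ in range(max_iterations):
        prompt = task + "\n\nNotes so far:\n" + "\n".join(notes)
        prompt += f"\n\nReply with {DONE_MARKER} when nothing is left to do."
        output = agent_step(prompt)
        notes.append(output)
        if DONE_MARKER in output:
            break
    return notes


def make_stub_agent(finish_on_call: int) -> Callable[[str], str]:
    """Stand-in for a real agent: reports completion on the given call number."""
    calls = {"n": 0}

    def step(prompt: str) -> str:
        calls["n"] += 1
        if calls["n"] >= finish_on_call:
            return f"pass {calls['n']}: {DONE_MARKER}"
        return f"pass {calls['n']}: found more to read"

    return step


if __name__ == "__main__":
    for line in run_agent_loop(make_stub_agent(3), "Summarize the literature on gene X"):
        print(line)
```

The `max_iterations` cap is the important design choice here: the loop stops either when the agent says it is finished or when the budget runs out, whichever comes first.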
Ontologically, ontogenetically, and phylogenetically, yes
December 20, 2025 at 7:29 PM
Reposted by Chris Mungall
📣 New preprint from us at phagefoundry.org 📣
A solid machine learning framework to predict strain-level phage-host interactions across diverse bacterial genera from genome sequences alone. Avery Noonan from the Arkin Lab led this massive effort.
www.biorxiv.org/content/10.1...
Phage Foundry
phagefoundry.org
November 16, 2025 at 5:58 PM
See the thread (from the original arXiv preprint) over on Mastodon: genomic.social/@Cmungall/11...
Chris Mungall (@Cmungall@genomic.social)
How can we scale up manual classification of chemical structures in databases like ChEBI? Can we help curators place new structures into classes like "terpenoid", based on their che...
genomic.social
October 3, 2025 at 2:58 PM
We developed and evaluated a method to learn Python chemical structure classifiers using LLMs. These can give classifications + explanations at runtime. With @jannahastings.bsky.social @justaddcoffee.bsky.social Noel O'Boyle, Daniel Korn, Adnan Malik jcheminf.biomedcentral.com/articles/10....
Chemical classification program synthesis using generative artificial intelligence - Journal of Cheminformatics
Accurately classifying chemical structures is essential for cheminformatics and bioinformatics, including tasks such as identifying bioactive compounds of interest, screening molecules for toxicity to humans, finding non-organic compounds with desirable material properties, or organizing large chemical libraries for drug discovery or environmental monitoring. However, manual classification is labor-intensive and difficult to scale to large chemical databases. Existing automated approaches either rely on manually constructed classification rules, or are deep learning methods that lack explainability. This work presents an approach that uses generative artificial intelligence to automatically write chemical classifier programs for classes in the Chemical Entities of Biological Interest (ChEBI) database. These programs can be used for efficient deterministic run-time classification of SMILES structures, with natural language explanations. The programs themselves constitute an explainable computable ontological model of chemical class nomenclature, which we call the ChEBI Chemical Class Program Ontology (C3PO). We validated our approach against the ChEBI database, and compared our results against deep learning models and a naive SMARTS pattern based classifier. C3PO outperforms the naive classifier, but does not reach the performance of state of the art deep learning methods. However, C3PO has a number of strengths that complement deep learning methods, including explainability and reduced data dependence. C3PO can be used alongside deep learning classifiers to provide an explanation of the classification, where both methods agree. The programs can be used as part of the ontology development process, and iteratively refined by expert human curators.
jcheminf.biomedcentral.com
October 3, 2025 at 2:57 PM
Reposted by Chris Mungall
Hiding in plain sight - how close are we to mapping ALL 🧬enhancers🧬 in the genome?

Our new paper by Mannion et al. takes a systematic look at "hidden enhancers" and why they remain so hard to find. With @mosterwalder.bsky.social, @jlopezrios.bsky.social & many more

www.nature.com/articles/s41...
August 8, 2025 at 6:10 PM
One super pedantic minor ontological pet peeve is the use of the term "simulation", since that leads me to expect an agent-based or physics-style simulation of cell perturbations. But in fact this pattern could be used for those too! And I guess the terminological horse has long bolted here...
August 25, 2025 at 12:41 AM
But of course rBio is very cool independent of my nerdy obsession with FMs using ontologies/KGs! This general distillation pattern is likely to be very useful for integrating knowledge with the weights in massive omics FMs..
Alt: Dalek meme: Daleks saying "EX-PLAIN" (alluding to use of technique to make foundation models, the "daleks", more explainable)
media.tenor.com
August 25, 2025 at 12:41 AM
For another use of ontologies in genomic foundation models, see the recent AlphaGenome paper bsky.app/profile/cmun...
August 25, 2025 at 12:41 AM