MetaboliteAnnotator: AI-Assisted Name Harmonization and Metadata Enrichment Tool for Metabolomics
Metabolite metadata enrichment remains a significant challenge in metabolomics due to the limitations of static databases, incomplete metabolite coverage, and the labor-intensive nature of manual verification. Here, we present MetaboliteAnnotator, an R Shiny-based application for AI-assisted metabolite name harmonization and metadata enrichment. MetaboliteAnnotator implements a hierarchical procedure comprising preprocessing of input metabolite names, matching against a curated local resource (covering ∼640,000 metabolite names), real-time retrieval from PubChem, and AI-assisted matching for ambiguous compounds, followed by real-time integration of information from KEGG, CTD, Reactome, and ChEBI. Compared with MetaboAnalyst 6.0 and MetaboliteIDmapping, MetaboliteAnnotator achieved significantly higher name hit rates across all six MetaboLights data sets, with 93.2% in positive mode (4021/4314 names) and 93.5% in negative mode (2344/2510 names). MetaboliteAnnotator outputs standardized identifiers (e.g., InChIKey, PubChem CID), endogenous/exogenous information, pathway mappings, and metabolite-gene/phenotype associations for downstream biological interpretation.
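The hierarchical procedure described above can be illustrated as a tiered fallback, in which each name is passed to the next stage only if the previous one fails to resolve it. The following is a minimal sketch in Python, not the tool's actual R implementation: the function names (`preprocess`, `harmonize`), the `Match` record, the normalization rule, and the placeholder identifiers are all assumptions made for illustration, and the PubChem and AI stages are represented by stub callables rather than real service calls.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical type: a resolver maps a cleaned name to an identifier or None.
Resolver = Callable[[str], Optional[str]]


@dataclass
class Match:
    query: str                 # name as supplied by the user
    identifier: Optional[str]  # resolved identifier, or None if unmatched
    stage: str                 # tier that resolved the name


def preprocess(name: str) -> str:
    """Assumed minimal cleanup: trim, lowercase, collapse whitespace."""
    return re.sub(r"\s+", " ", name.strip().lower())


def harmonize(
    name: str,
    local_table: Dict[str, str],
    remote_lookup: Resolver,
    ai_lookup: Resolver,
) -> Match:
    """Try each tier in order and stop at the first one that resolves the name."""
    key = preprocess(name)
    tiers = [
        ("local", local_table.get),  # curated local resource
        ("pubchem", remote_lookup),  # stand-in for real-time retrieval
        ("ai", ai_lookup),           # stand-in for AI-assisted matching
    ]
    for stage, resolve in tiers:
        identifier = resolve(key)
        if identifier is not None:
            return Match(name, identifier, stage)
    return Match(name, None, "unmatched")


# Toy data with placeholder identifiers (not real database entries).
local_table = {"citric acid": "ID_LOCAL_1"}
remote_stub = lambda key: "ID_REMOTE_1" if key == "caffeine" else None
ai_stub = lambda key: "ID_AI_1" if key == "vit c" else None

names = ["  Citric   Acid ", "Caffeine", "Vit C", "not a metabolite"]
results = [harmonize(n, local_table, remote_stub, ai_stub) for n in names]
hit_rate = sum(r.identifier is not None for r in results) / len(results)
```

In this sketch, the name hit rate is simply the fraction of input names resolved by any tier. Ordering the tiers from cheapest to most expensive means that remote and AI-assisted lookups are attempted only for names the local resource cannot resolve.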