Center Computational Health at Helmholtz Munich
Helmholtz Munich | Daniela Barreto

RepoRT: Data Repository Enhances Metabolomics Research

Core Facilities, Featured Publication

Metabolomics is the comprehensive study of small molecules, known as metabolites, within a biological system. In a major step towards advancing metabolomics research, a team led by Michael Witting from the Core Facility Metabolomics and Proteomics at Helmholtz Munich has developed RepoRT, an innovative data repository aimed at transforming small molecule retention time prediction in liquid chromatography. RepoRT paves the way for improved metabolite identification and a better understanding of biological processes. The results have now been published in Nature Methods.

Metabolomics is a central method for studying health and disease and focuses on the analysis of small molecules, so-called metabolites, from different biological specimens. The investigation of metabolites provides valuable insights into the biochemical pathways and metabolic activity occurring in cells, tissues or organisms. However, identifying and characterizing metabolites poses a significant challenge due to the complexity and dynamics of these samples. Retention time, a crucial parameter in metabolite identification, is the time a molecule takes to pass through a liquid chromatography column, which is used for high-resolution separation of the components in a sample. Despite decades of research and numerous machine learning models, predicting small molecule retention times has remained difficult because of variations in compound structure and chromatographic systems.

So far, new approaches for metabolite identification have focused on the development of data analysis strategies for mass spectrometric data. However, it has become clear that these approaches have limitations and that orthogonal information such as retention times needs to be leveraged, yet only a few methods for doing so exist. In this study, the team of analytical chemists and bioinformaticians established RepoRT, a repository for metabolite retention times. Their goal: to enable other bioinformatics and machine learning groups to develop algorithms for improved retention time prediction to aid metabolite identification.
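
To illustrate how retention time can serve as orthogonal information, the following minimal Python sketch narrows a list of candidate metabolites by comparing predicted retention times with an observed one. It is not taken from RepoRT or the publication: the `Candidate` class, the `filter_by_retention_time` function, the tolerance and all values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    predicted_rt_min: float  # hypothetical predicted retention time, in minutes


def filter_by_retention_time(candidates, observed_rt_min, tolerance_min=0.5):
    """Keep candidates whose predicted retention time lies within
    tolerance_min of the observed retention time."""
    return [
        c for c in candidates
        if abs(c.predicted_rt_min - observed_rt_min) <= tolerance_min
    ]


# Made-up candidates, e.g. structures that a mass spectrum alone cannot tell apart.
candidates = [
    Candidate("candidate_A", 2.1),
    Candidate("candidate_B", 5.4),
    Candidate("candidate_C", 5.9),
]

kept = filter_by_retention_time(candidates, observed_rt_min=5.5, tolerance_min=0.5)
print([c.name for c in kept])  # ['candidate_B', 'candidate_C']
```

In practice the usable tolerance would depend on how accurate the prediction model is for the chromatographic system in question.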

“This is the first repository delivering small molecule retention times as training data in a machine learning-ready fashion, which hopefully leads to a fast uptake of retention time prediction in metabolite identification workflows,” says Michael Witting, the corresponding author of the study.

Original publication 

Fleming Kretschmer et al. (2024): RepoRT: a comprehensive repository for small molecule retention times. Nature Methods. DOI:


Dr. Michael Witting