Understanding Diabetes: Single-Cell Atlas Leverages Machine Learning to Decipher Diabetes at the Molecular Level

A collaborative endeavor between computer scientists and diabetes researchers at Helmholtz Munich has yielded novel insights into the mechanisms underlying type 1 and type 2 diabetes. This collaboration has resulted in the creation of the first mouse islet atlas (MIA). Leveraging the power of machine learning, the team of scientists integrated single-cell datasets to reveal the molecular alterations that occur during the progression of diabetes and to highlight the distinctions between type 1 and type 2 diabetes. Their findings have been published in Nature Metabolism.

Type 1 diabetes (T1D) and type 2 diabetes (T2D) are caused by the loss or dysfunction of the insulin-producing cells in the pancreas, the β-cells, which disrupts blood glucose regulation. The β-cells and other cell types in the islets of Langerhans of the pancreas communicate with each other and jointly regulate blood glucose levels via hormone secretion. Currently, β-cell function or dysfunction is mainly assessed by measuring the levels of the hormone insulin in the bloodstream. Unfortunately, this is not sufficient to reveal the exact disease-causing mechanisms that lead to β-cell failure during autoimmune attack in T1D or that are associated with high blood glucose and lipid levels in T2D. Therefore, researchers worldwide have studied gene expression in the pancreatic islets of mouse models by analyzing the RNA content of individual cells. This has produced multiple single-cell RNA sequencing (scRNA-seq) gene expression datasets. These datasets contain hundreds of thousands of cells with thousands of genes measured in each cell. However, the datasets differ in disease progression, islet cell types, mouse strains, diabetes models, laboratory procedures and data processing. This complexity has so far prevented a consensus on why and how β-cells become dysfunctional during diabetes progression, hindering the understanding of the underlying causes of T1D and T2D.

A team of computer scientists from the research group of Prof. Fabian Theis, together with a team led by the diabetes expert Prof. Heiko Lickert, both from Helmholtz Munich, leveraged recent advances in machine learning, particularly deep representation learning and data integration, to develop the mouse islet atlas (MIA). In cell biology, an atlas is a comprehensive collection of data that provides detailed information about cellular function. Such atlases are valuable resources for researchers studying cellular biology. The MIA integrates nine scRNA-seq datasets comprising over 300,000 single cells, with 1,000 to 8,000 genes measured per cell. It is the first comprehensive single-cell gene expression resource of mouse pancreatic islets. By integrating these large datasets, the scientists were able to decipher how β-cells change their gene expression from a healthy to a diseased state in T1D and T2D, or in other conditions of dysfunction such as aging, enabling the discovery of potential molecular pathways and targets for the prevention of β-cell failure. Furthermore, the integration and direct comparison of data across different datasets and laboratories help to build consensus in the research community.
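To make the idea of data integration more concrete, here is a minimal sketch of the problem it addresses: measurements from different laboratories carry dataset-wide offsets and cover different sets of genes. The sketch keeps only the genes measured in every dataset and subtracts each dataset's per-gene mean. This is a deliberately simple stand-in and not the deep representation learning approach used for MIA; the dataset names, gene values and function names are all hypothetical. Note that plain mean-centering would also erase real biological differences between datasets, which is one reason more sophisticated methods are needed in practice.

```python
from statistics import mean

# Hypothetical toy data: dataset name -> list of cells,
# each cell a dict mapping gene name -> expression value.
datasets = {
    "lab_a": [{"Ins1": 10.0, "Gcg": 1.0, "Sst": 0.0},
              {"Ins1": 12.0, "Gcg": 3.0, "Sst": 2.0}],
    "lab_b": [{"Ins1": 20.0, "Gcg": 5.0},
              {"Ins1": 24.0, "Gcg": 7.0}],
}


def shared_genes(datasets):
    """Return the sorted genes measured in every cell of every dataset."""
    gene_sets = [set(cell) for cells in datasets.values() for cell in cells]
    return sorted(set.intersection(*gene_sets))


def center_per_dataset(datasets):
    """Subtract each dataset's per-gene mean so dataset-wide offsets cancel.

    Returns the shared gene list and a combined list of
    (dataset name, centered expression dict) pairs.
    """
    genes = shared_genes(datasets)
    combined = []
    for name, cells in datasets.items():
        gene_means = {g: mean(cell[g] for cell in cells) for g in genes}
        for cell in cells:
            centered = {g: cell[g] - gene_means[g] for g in genes}
            combined.append((name, centered))
    return genes, combined


genes, combined = center_per_dataset(datasets)
```

After centering, cells from both toy datasets are expressed relative to their own dataset average and can be placed in one combined table restricted to the shared genes.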

Mouse Islet Atlas Deciphers β-Cell Dysfunction

The authors explored the data integrated within MIA, identifying molecular changes that are shared between or specific to T1D and T2D models during disease progression. This revealed that all healthy adult samples contain heterogeneous β-cells, which vary in their levels of insulin-production-induced cell stress and in aging-associated patterns. MIA also helped to resolve the question of which mouse diabetes models should be used to study human T1D and T2D, thus supporting the design of future experimental studies. The researchers showed that the streptozotocin model, a widely used experimental model of chemical β-cell destruction that has so far been used to model both T1D and T2D, corresponds better to T2D when β-cell identities are compared. Additionally, an intermediate β-cell state between healthy and diabetic cells was observed in all diabetes models. This β-cell state and the molecular pathways that are active in these β-cells might be involved in diabetes progression or remission, potentially offering molecular targets for future treatment strategies. Overall, this study showcases how big data integration and the resulting MIA can be used to assess a large number of single-cell gene expression datasets generated in many laboratories around the world, in order to find a consensus about diabetes development and to gain new insights that could not have been reached with individual datasets.

The Atlas Represents a Resource for the Future

MIA is a comprehensive resource that enables both interactive exploration via cellxgene and computational analysis. For example, MIA can be used to look up which cells express a gene of interest and how strongly the gene is expressed. New samples can also be interpreted in light of the conditions within MIA by mapping them onto the atlas. In the future, MIA may be further extended and updated to continuously capture newly generated data.
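The two uses described above, looking up a gene and mapping a new sample, can be illustrated with a small toy sketch. It summarizes per cell type how many cells express a gene and how strongly, and assigns a new cell the label of its nearest atlas cell. The toy atlas, its values and the function names are hypothetical, and the nearest-cell rule on raw expression values is an assumption chosen for simplicity; it is not how MIA or cellxgene work internally.

```python
from collections import defaultdict
from math import dist

# Hypothetical toy atlas: (cell type label, gene -> expression value).
atlas = [
    ("beta", {"Ins1": 9.0, "Gcg": 0.0}),
    ("beta", {"Ins1": 7.0, "Gcg": 0.0}),
    ("alpha", {"Ins1": 0.0, "Gcg": 8.0}),
    ("alpha", {"Ins1": 1.0, "Gcg": 6.0}),
]


def expression_summary(atlas, gene):
    """Per cell type: fraction of cells expressing the gene and mean expression."""
    values = defaultdict(list)
    for cell_type, expr in atlas:
        values[cell_type].append(expr.get(gene, 0.0))
    return {
        cell_type: {
            "fraction_expressing": sum(v > 0 for v in vals) / len(vals),
            "mean_expression": sum(vals) / len(vals),
        }
        for cell_type, vals in values.items()
    }


def map_to_atlas(atlas, query, genes):
    """Return the label of the atlas cell nearest to the query cell.

    Distance is Euclidean over the listed genes; missing genes count as 0.
    """
    def vector(expr):
        return [expr.get(g, 0.0) for g in genes]

    nearest = min(atlas, key=lambda item: dist(vector(item[1]), vector(query)))
    return nearest[0]


summary = expression_summary(atlas, "Ins1")
label = map_to_atlas(atlas, {"Ins1": 8.0, "Gcg": 1.0}, ["Ins1", "Gcg"])
```

In this toy example, the query cell with high Ins1 and low Gcg expression is assigned to the β-cell group. In practice, mapping is usually done in a learned low-dimensional representation rather than on raw expression values.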


Original publication

Hrovatin, K. et al. (2023): Delineating mouse β-cell identity during lifetime and in diabetes with a single cell atlas. Nature Metabolism.


About the scientists

Karin Hrovatin, PhD student enrolled in the MUDS graduate school at Helmholtz Munich

Prof. Dr. Heiko Lickert, Director of the Institute of Diabetes and Regeneration Research (IDR) at Helmholtz Munich and Professor and Chair of Beta Cell Biology at the Medical Faculty of the Technical University of Munich (TUM)

Prof. Dr. Dr. Fabian Theis, Head of the Computational Health Center (CHC), Director of the Institute of Computational Biology (ICB) at Helmholtz Munich and Director of Helmholtz AI


The mouse islet atlas (MIA) offers new insights into diabetes by enhancing the study of pancreatic islets and β-cells.

Multiple single-cell gene expression datasets from diverse biological conditions were integrated into a single resource, which led to new insights that could not have been obtained from individual datasets.

The illustration was created by Karin Hrovatin.

Prof. Dr. Fabian Theis

Director of the Computational Health Center, Director of the Institute for Computational Biology