
Nicheformer: a foundation model for single-cell and spatial omics

We are excited to present our latest interview with Anna Schaar, who led the research on Nicheformer jointly with Alejandro Tejada-Lapuerta.

This transformer-based foundation model integrates single-cell and spatial transcriptomics data to predict spatially resolved cellular information, enhancing our understanding of tissue microenvironments and advancing spatial omics analysis. Nicheformer lays the foundation for future pioneering work in spatial single-cell analysis and was developed in the lab of Prof. Dr. Dr. Fabian Theis, who is head of the Computational Health Center.

Congratulations on your recent paper about Nicheformer. Can you please briefly explain the paper's main findings and its significance for your field?

Anna Schaar: Nicheformer is a transformer-based foundation model that combines human and mouse dissociated single-cell and targeted spatial transcriptomics data to learn a cellular representation useful for a large variety of downstream tasks. Nicheformer is pretrained on the largest collection to date of over 57 million dissociated and 53 million spatially resolved cells across 73 tissues from both human and mouse. Subsequently, the model is fine-tuned on spatial tasks for spatial omics data to decode spatially resolved cellular information. Nicheformer is evaluated on a novel set of relevant downstream tasks such as spatial density prediction or niche and region label prediction. In particular, we show that Nicheformer enables the prediction of the spatial context of dissociated cells, allowing the transfer of rich spatial information to scRNA-seq datasets. We define a series of novel spatial prediction problems and observe consistent top performance of Nicheformer, demonstrating the advantage of the improved model capacity of the underlying transformer. Altogether, our large-scale resource of more than 110 million cells in a partial spatial context, together with the set of novel spatial learning tasks and the Nicheformer model itself, will pave the way for the next generation of machine-learning models for spatial single-cell analysis.
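To make the spatial density prediction task mentioned above more concrete, here is a minimal sketch of how a per-cell density label could be derived from spatial coordinates: for each cell, count the other cells within a fixed radius. The function name `neighborhood_density`, the radius, and the toy coordinates are assumptions made for illustration; they are not taken from the Nicheformer paper or its code, and the actual task definition may differ.

```python
import numpy as np


def neighborhood_density(coords: np.ndarray, radius: float) -> np.ndarray:
    """Count, for each cell, how many other cells lie within `radius`.

    Hypothetical helper for illustration only, not part of Nicheformer.
    """
    # Pairwise Euclidean distances (O(n^2) memory, fine for a toy example only).
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Subtract 1 so that a cell does not count itself as a neighbor.
    return (dists <= radius).sum(axis=1) - 1


# Made-up 2D coordinates: three cells close together, one far away.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
density = neighborhood_density(coords, radius=1.5)
print(density)  # [2 2 2 0]
```

In a setup like the one described in the interview, a label of this kind would be computed on spatially resolved cells and then used as a prediction target, so that a model can estimate the local density of a dissociated cell for which no coordinates exist.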

What inspired you to pursue this research topic, and what challenges did you encounter during your study?

Anna Schaar: The growing number of datasets that measure single cells, both dissociated from their microenvironment and in their native cellular microenvironment, opens new possibilities in terms of biological questions as well as machine learning approaches. Our work is greatly inspired by foundation model approaches in other fields, such as computer vision and natural language processing.

What are the next steps in your research, and how do you plan to build on these findings?

Anna Schaar: Nicheformer is a step towards creating a generalizable multiscale model for single-cell and spatial biology, bridging the gap from the single-cell to the tissue modality. In the future, we plan to extend our efforts by using additional components of spatial transcriptomic measurements and additional modalities to build a multi-modal single-cell foundation model.

The authors would like to thank Hongkui Zeng, Michael Kunst, and the Allen Brain Atlas consortium for providing early access to their MERFISH whole mouse brain atlas as well as the additional unpublished MERFISH mouse brain datasets. Additionally, they would like to thank Mats Nilsson and Sergio Marco Salas for providing early access to their unpublished Xenium and ISS datasets.
