Systems Genetics and Machine Learning

Francesco Paolo Casale Lab

Paolo’s research group focuses on the development and application of machine learning and statistical approaches to advance our understanding of complex trait and disease biology.

About our research

Our research centers on developing and applying machine learning and statistical tools to analyze large genetic cohorts with deep molecular and phenotypic data, with the ultimate goal of furthering our understanding of complex trait biology. We aim to address fundamental biomedical questions such as: Which molecular, cellular and organ-level traits are associated with disease severity and progression? Which of these are likely to drive disease pathogenesis? How does the interplay of genetic and environmental factors affect these traits?

Our approach combines principles from machine learning, statistical inference and systems genetics, with a strong focus on model scalability, robustness and interpretability. Current major research areas include the development of scalable tools for genetic association studies, deep learning models for imaging genetics, and computational methods to study gene-environment interactions and disease subtypes.

Leverage scalable machine learning and statistical tools together with large systems genetics datasets to further our understanding of human disease biology.

Publications

Genome Res., DOI: 10.1101/gr.280689.125 (2025)

Nappi, A. ; Shilova, L. ; Karaletsos, T. ; Cai, N. ; Casale, F.P.

BayesRVAT enhances rare-variant association testing through Bayesian aggregation of functional annotations.
In: (Research in Computational Molecular Biology). 2025. 428-431 (Lect. Notes Comput. Sc. ; 15647 LNBI)

Nappi, A. ; Cai, N. ; Casale, F.P.

Bayesian aggregation of multiple annotations enhances rare variant association testing.

Hölzlwimmer, F.R. ; Lindner, J. ; Tsitsiridis, G. ; Wagner, N. ; Casale, F.P. ; Yépez, V.A. ; Gagneur, J.

Aberrant gene expression prediction across human tissues.

Han, S. ; Yu, S. ; Shi, M. ; Harada, M. ; Ge, J. ; Lin, J. ; Prehn, C. ; Petrera, A. ; Li, Y. ; Sam, F. ; Matullo, G. ; Adamski, J. ; Suhre, K. ; Gieger, C. ; Hauck, S.M. ; Herder, C. ; Roden, M. ; Casale, F.P. ; Cai, N. ; Peters, A. ; Wang-Sattler, R.

LEOPARD: Missing view completion for multi-timepoint omics data via representation disentanglement and temporal knowledge transfer.
Genome Res. 34, 1276-1285 (2024)

Sens, D.W. ; Shilova, L. ; Gräf, L. ; Grebenshchikova, M. ; Eskofier, B.M. ; Casale, F.P.

Genetics-driven risk predictions leveraging the Mendelian randomization framework.
In: (Research in Computational Molecular Biology). Gewerbestrasse 11, Cham, CH-6330, Switzerland: Springer International Publishing AG, 2024. 385-389 (Lect. Notes Comput. Sc. ; 14758 LNCS)

Gräf, L. ; Sens, D.W. ; Shilova, L. ; Casale, F.P.

Disease risk predictions with differentiable Mendelian randomization.
In: Proceedings of Machine Learning Research (27th International Conference on Artificial Intelligence and Statistics (AISTATS), May 2-4, 2024, Valencia, Spain). 2024. 3664-3672 (Int. Conf. art. intell. stat. ; 238)

Engelmann, J.P. ; Palma, A. ; Tomczak, J.M. ; Theis, F.J. ; Casale, F.P.

Mixed models with multiple instance learning.
In: (Proceedings - International Symposium on Biomedical Imaging, 18-21 April 2023, Cartagena, Colombia). 345 E 47th St, New York, NY 10017 USA: IEEE, 2023. 5 ( ; 2023-April)

Sens, D. ; Sadafi, A. ; Casale, F.P. ; Navab, N. ; Marr, C.

BEL: A Bag Embedding Loss for Transformer Enhances Multiple Instance Whole Slide Image Classification.

McCaw, Z.R. ; O'Dushlaine, C. ; Somineni, H. ; Bereket, M. ; Klein, C. ; Karaletsos, T. ; Casale, F.P. ; Koller, D. ; Soare, T.W.

An allelic-series rare-variant association test for candidate-gene discovery.
Nat. Biotechnol. 33, 155-160 (2015)

Buettner, F. ; Natarajan, K.N. ; Casale, F.P. ; Proserpio, V. ; Scialdone, A. ; Theis, F.J. ; Teichmann, S.A. ; Marioni, J.C. ; Stegle, O.

Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells.

Contact PioneerCampus

Francesco Paolo Casale

PI "Systems Genetics & Machine Learning"