Systems Genetics and Machine Learning
Francesco Paolo Casale Lab
Paolo’s research group focuses on the development and application of machine learning and statistical approaches to advance our understanding of complex trait and disease biology.
About our research
Our research interests lie in the development and application of machine learning and statistical tools to analyze large genetic cohorts with deep molecular and phenotypic data, with the ultimate goal of furthering our understanding of complex trait biology. We aim to address fundamental biomedical questions such as: Which molecular, cellular and organ-level traits are associated with disease severity and progression? Which of these are likely to drive disease pathogenesis? How does the interplay of genetic and environmental factors affect these traits?
Our approach combines principles from machine learning, statistical inference and systems genetics, with a strong focus on model scalability, robustness and interpretability. Current major research areas include the development of scalable tools for genetic association studies, deep learning models for imaging genetics, and computational methods to study gene-environment interactions and disease subtypes.
Publications
Nappi, A.; Shilova, L.; Karaletsos, T.; Cai, N.; Casale, F.P.
BayesRVAT enhances rare-variant association testing through Bayesian aggregation of functional annotations.

Nappi, A.; Cai, N.; Casale, F.P.
Bayesian aggregation of multiple annotations enhances rare variant association testing.

Hölzlwimmer, F.R.; Lindner, J.; Tsitsiridis, G.; Wagner, N.; Casale, F.P.; Yépez, V.A.; Gagneur, J.
Aberrant gene expression prediction across human tissues.

Han, S.; Yu, S.; Shi, M.; Harada, M.; Ge, J.; Lin, J.; Prehn, C.; Petrera, A.; Li, Y.; Sam, F.; Matullo, G.; Adamski, J.; Suhre, K.; Gieger, C.; Hauck, S.M.; Herder, C.; Roden, M.; Casale, F.P.; Cai, N.; Peters, A.; Wang-Sattler, R.
LEOPARD: Missing view completion for multi-timepoint omics data via representation disentanglement and temporal knowledge transfer.

Sens, D.W.; Shilova, L.; Gräf, L.; Grebenshchikova, M.; Eskofier, B.M.; Casale, F.P.
Genetics-driven risk predictions leveraging the Mendelian randomization framework.

Gräf, L.; Sens, D.W.; Shilova, L.; Casale, F.P.
Disease risk predictions with differentiable Mendelian randomization.

Engelmann, J.P.; Palma, A.; Tomczak, J.M.; Theis, F.J.; Casale, F.P.
Mixed models with multiple instance learning.

Sens, D.; Sadafi, A.; Casale, F.P.; Navab, N.; Marr, C.
BEL: A Bag Embedding Loss for Transformer Enhances Multiple Instance Whole Slide Image Classification.

McCaw, Z.R.; O'Dushlaine, C.; Somineni, H.; Bereket, M.; Klein, C.; Karaletsos, T.; Casale, F.P.; Koller, D.; Soare, T.W.
An allelic-series rare-variant association test for candidate-gene discovery.

Buettner, F.; Natarajan, K.N.; Casale, F.P.; Proserpio, V.; Scialdone, A.; Theis, F.J.; Teichmann, S.A.; Marioni, J.C.; Stegle, O.
Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells.