Helmholtz Munich

Genetic and Epigenetic Gene Regulation

Overview

The Heinig lab is located at the Institute of Computational Biology, which is part of the Computational Health Center of Helmholtz Munich.

Together with my research group, I develop AI solutions for personalized, network-based precision medicine.

Technological advances allow for an unprecedented in-depth characterization of the molecular basis of complex diseases. In particular, SNP genotyping, DNA methylation assays and gene expression profiling in large cohorts have been used to identify numerous disease-associated loci and genes. However, a deeper mechanistic or systems-level understanding of disease processes remains elusive in most cases.

The aim of our research is the development and application of computational and statistical tools for the identification of molecular regulatory networks underlying common diseases, and of the genetic and epigenetic mechanisms controlling these networks, from population-level DNA and multi-omics data sets. In a second step, we aim to personalize the networks based on single-cell data. This will enable us to implement new concepts for precision medicine. A special focus is the molecular characterization of metabolic and cardiovascular diseases, in particular diabetes and arrhythmias like atrial or ventricular fibrillation.

Motivated by the fact that most disease-associated variants identified to date are located in non-coding parts of the genome, which likely harbor regulatory elements, we are studying the effect of naturally occurring sequence variation on gene regulation. To characterize regulatory sequence variants, two related challenges have to be met: 1) regulatory elements have to be recognized and 2) the corresponding target genes have to be identified. Epigenetic marks such as histone modifications have proved instrumental for the identification of regulatory elements in the genome, while the integrated analysis of genetic variation and gene expression provides a strategy (expression QTL mapping) to identify targets of regulatory variants. Ultimately, the integration of genetic, genomic and epigenomic data sets is expected to lead to a comprehensive understanding of regulatory sequence variation and its role in disease. Towards these goals we have:

developed a random walk approach to identify the regulatory networks underlying common disease from DNA-methylation data (Hawe Nature Genetics 2022) - code
extended this approach using Bayesian graphical models to make full use of prior networks and the correlation structure in the data (Hawe Genome Medicine 2022) - code
integrated proteomics and gene expression to identify cis- and trans-acting eQTL/pQTL in the human atrial appendage (Assum Nature Communications 2022) - code
reviewed current methods for network reconstruction from multi-omics data (Hawe Frontiers in Genetics 2019)
performed one of the largest eQTL studies to date in the human heart (Heinig Genome Biology 2017)
developed the computational tool histoneHMM for the identification of differentially modified regions for histone modifications with broad genomic footprints (Heinig BMC Bioinformatics 2015)
developed predictive models to identify functional genomic elements predictive of regulatory variants (Budach, Heinig* and Marsico* Genetics 2016)
performed an integrated analysis of the consequences of genetic variation for multiple levels of epigenetic and transcriptional regulation (Rintisch*, Heinig* Genome Res 2014)
developed a statistical approach for the identification of a transcription factor driven regulatory network, including its master regulator and the interpretation of disease association (type 1 diabetes) using this regulatory network (Heinig Nature 2010)
developed the computational tool sTRAP for the identification of causative cis regulatory variants affecting transcription factor binding (Manke*, Heinig* Hum Mutat 2010) and successfully applied this tool in a disease gene study (Monti Nat Genet 2008) for heart failure
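
The expression QTL mapping strategy mentioned above can be illustrated with a minimal sketch: for one variant and one gene, expression is regressed on the genotype dosage (0, 1 or 2 copies of the alternative allele), and the slope and its t-statistic summarize the association. The function name and the toy data are assumptions made for illustration and are not taken from any of the cited studies, which use considerably more elaborate models (covariates, multiple testing correction, and so on).

```python
from math import sqrt


def eqtl_association(dosages, expression):
    """Simple linear regression of expression on genotype dosage.

    Returns the slope (expression change per allele) and its t-statistic.
    Illustrative only: no covariates, no multiple testing correction.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need at least three paired samples")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("genotype is monomorphic in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    # Residual sum of squares around the fitted line.
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(dosages, expression))
    se = sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else float("inf")
    return slope, t_stat


# Hypothetical toy data: six individuals, one SNP, one gene.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, t_stat = eqtl_association(dosages, expression)
print(f"effect per allele: {slope:.2f}, t-statistic: {t_stat:.1f}")
```

In a genome-wide setting this test would be repeated for every variant-gene pair of interest, which is why dedicated eQTL software and stringent multiple testing control are used in practice.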

Single-cell RNA-seq not only enables us to explore individual cell types and their transcriptional programs. It also enables us to study the effects of common and rare genetic variation at cell type resolution. Importantly, it enables us to personalize gene regulatory networks.

Current approaches infer a single network, which can be thought of as a static reference for the whole population. In reality, however, inter-individual differences in the genome and the environment are expected to cause differences in the network topology. Therefore, not just a single reference core gene network but a personalizable core gene network is required for precision medicine applications. Single-cell RNA-seq measures the full transcriptomes of multiple cells of the same person. This allows us to infer person- and cell-type-specific gene regulatory networks.
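
As a minimal sketch of this idea, the following example computes one co-expression value (a Pearson correlation across cells) per donor and cell type for a single gene pair. The data layout, the function names and the gene labels are hypothetical and chosen for illustration; this is not the method of the cited publications, and real single-cell data would additionally require normalization and handling of sparsity.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long lists (0.0 if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def personalized_coexpression(cells, gene_a, gene_b):
    """Correlate two genes across cells, separately per (donor, cell type).

    `cells` is a list of (donor, cell_type, {gene: expression}) tuples,
    a hypothetical layout used only for this sketch.
    """
    groups = {}
    for donor, cell_type, expr in cells:
        xs, ys = groups.setdefault((donor, cell_type), ([], []))
        xs.append(expr[gene_a])
        ys.append(expr[gene_b])
    return {key: pearson(xs, ys) for key, (xs, ys) in groups.items()}


# Toy data: the same gene pair is co-expressed in donor1 but
# anti-correlated in donor2, so one population-wide network would
# misrepresent both individuals.
cells = [
    ("donor1", "cardiomyocyte", {"geneA": 1.0, "geneB": 2.0}),
    ("donor1", "cardiomyocyte", {"geneA": 2.0, "geneB": 4.0}),
    ("donor1", "cardiomyocyte", {"geneA": 3.0, "geneB": 6.0}),
    ("donor2", "cardiomyocyte", {"geneA": 1.0, "geneB": 6.0}),
    ("donor2", "cardiomyocyte", {"geneA": 2.0, "geneB": 4.0}),
    ("donor2", "cardiomyocyte", {"geneA": 3.0, "geneB": 2.0}),
]
networks = personalized_coexpression(cells, "geneA", "geneB")
for (donor, cell_type), r in sorted(networks.items()):
    print(donor, cell_type, round(r, 2))
```

Repeating this for all gene pairs would yield one co-expression matrix per person and cell type, which is the kind of object a personalized network analysis starts from.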

To fully leverage the potential of single cell data we have:

teamed up with the seed network for the human cell atlas to build the first reference cell atlas of the human heart (Litvinukova Nature 2020) - code
studied the effect of rare pathogenic variants on cardiac cell type composition and expression programs in patients with dilated cardiomyopathy (Reichart Science 2022) - code
teamed up with the single cell eQTLgen consortium to outline a strategy to identify cell type specific eQTL (van der Wijst Elife 2020)
developed scPower: the first scalable and generally applicable power analysis tool for multi-sample single cell transcriptomics experiments such as eQTL (Schmid Nature Communications 2021) - code
developed computational approaches to personalize co-expression networks (Li Genome Biology 2023 - in press)

Complex traits are associated with hundreds if not thousands of non-coding variants throughout the whole genome. Theoretical models such as the omnigenic core gene model have been proposed to reconcile this observed genetic architecture with the potential molecular mechanisms: when small effects of multiple risk loci converge on the same downstream core genes in regulatory networks, a large proportion of the heritability can be explained. The key challenges are that downstream targets are difficult to identify using QTL data and that the core genes for specific diseases are unknown.
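
The convergence argument can be made concrete with a toy calculation. In the sketch below, each risk locus has a cis effect on a peripheral gene, the peripheral gene acts on a shared core gene through a network edge, and the core gene acts on the trait. All numbers, names and the multiplicative effect model are assumptions made for illustration; the only standard ingredient is the additive variance of a biallelic locus under Hardy-Weinberg equilibrium, 2p(1-p)β², with loci assumed to be independent.

```python
def locus_trait_effect(cis_effect, edge_weight, core_effect):
    """Effect of a locus on the trait, mediated through one core gene.

    Toy assumption: effects multiply along the path
    locus -> peripheral gene -> core gene -> trait.
    """
    return cis_effect * edge_weight * core_effect


def additive_variance(allele_freq, effect):
    """Additive genetic variance of one biallelic locus (Hardy-Weinberg)."""
    return 2 * allele_freq * (1 - allele_freq) * effect ** 2


# Ten hypothetical loci: (name, allele frequency, cis effect, edge weight).
loci = [(f"locus{i}", 0.5, 0.5, 0.2) for i in range(1, 11)]
core_effect = 1.0  # hypothetical effect of the core gene on the trait

per_locus = {
    name: additive_variance(p, locus_trait_effect(cis, w, core_effect))
    for name, p, cis, w in loci
}
total_variance = sum(per_locus.values())
largest_share = max(per_locus.values()) / total_variance
print(f"total variance via core gene: {total_variance:.3f}")
print(f"largest single-locus share:   {largest_share:.0%}")
```

No single locus stands out in this toy setting, yet all of the genetic variance is routed through one core gene, which illustrates why identifying such genes is attractive even when individual association signals are weak.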

To address these challenges, we have:

developed Speos - a graph neural network approach to predict core genes of complex disease from network, GWAS and gene expression data (Ratajczak bioRxiv 2023) - code - docs
developed a data integration approach that makes use of polygenic risk scores and pathway annotations to identify trans-acting QTL from protein and transcript expression data. We applied it to the atrial fibrillation cohort of the symAtrial consortium to identify candidate core genes of atrial fibrillation (Assum Nature Communications 2022) - code

Lab members

PhD Student

Simon is interested in characterising mitochondrial sequence variations across human tissues. To this end, he develops novel computational approaches to learn about their effects on donor phenotypes and the molecular pathways through which they are mediated. On top of that, he works on the analysis of single-cell and spatial transcriptomic data to study early mouse brain development.


2024 Nature Medicine

Kami Pekayvaz*, Corinna Losert*, Viktoria Knottenberg*, Christoph Gold, Irene V. van Blokland, Roy Oelen, Hilde E. Groot, Jan Walter Benjamins, Sophia Brambs, Rainer Kaiser, Adrian Gottschlich, Gordon Victor Hoffmann, Luke Eivers, Alejandro Martinez-Navarro, Nils Bruns, Susanne Stiller, Sezer Akgöl, Keyang Yue, Vivien Polewka, Raphael Escaig, Markus Joppich, Aleksandar Janjic, Oliver Popp, Sebastian Kobold, Tobias Petzold, Ralf Zimmer, Wolfgang Enard, Kathrin Saar, Philipp Mertins, Norbert Huebner, Pim van der Harst, Lude H. Franke, Monique G. P. van der Wijst, Steffen Massberg, Matthias Heinig†, Leo Nicolai†, Konstantin Stark†

Multiomic analyses uncover immunological signatures in acute and chronic coronary syndromes

2023 Nature Communications

Florin Ratajczak, Mitchell Joblin, Marcel Hildebrandt, Martin Ringsquandl, Pascal Falter-Braun, Matthias Heinig

Speos: an ensemble graph representation learning framework to predict core gene candidates for complex diseases

2022 Nature Genetics

Hawe JS*, Wilson R*, Schmid KT*, Zhou L, Lakshmanan LN, Lehne BC, Kühnel B, Scott WR, Wielscher M, Yew YW, Baumbach C, Lee DP, Marouli E, Bernard M, Pfeiffer L, Matías-García PR, Autio MI, Bourgeois S, Herder C, Karhunen V, Meitinger T, Prokisch H, Rathmann W, Roden M, Sebert S, Shin J, Strauch K, Zhang W, Tan WLW, Hauck SM, Merl-Pham J, Grallert H, Barbosa EGV; MuTHER Consortium, Illig T, Peters A, Paus T, Pausova Z, Deloukas P, Foo RSY, Jarvelin MR, Kooner JS, Loh M†, Heinig M†, Gieger C†, Waldenberger M†, Chambers JC†.

Genetic variation influencing DNA methylation provides insights into molecular mechanisms regulating genomic function

2022 Nature Communications

Assum I*, Krause J*, Scheinhardt MO, Müller C, Hammer E, Börschel CS, Völker U, Conradi L, Geelhoed B, Zeller T, Schnabel RB†, Heinig M†

Tissue-specific multi-omics analysis of atrial fibrillation

2021 Nature Communications

Schmid KT, Höllbacher B, Cruceanu C, Böttcher A, Lickert H, Binder EB, Theis FJ, Heinig M

scPower accelerates and optimizes the design of multi-sample single cell transcriptomic studies