Network Embedding Across Multiple Tissues and Data Modalities Elucidates the Context of Host Factors Important for COVID-19 Infection
COVID-19 is a heterogeneous disease caused by SARS-CoV-2. Beyond infection of the lungs, its highly variable symptom severity is influenced by genetic predispositions and pre-existing diseases. We developed a holistic framework, based on graph inference and graph embedding, to understand the molecular pathways affected by SARS-CoV-2 in the context of pre-pandemic data, such as gene expression data, polygenic predispositions, and disease phenotypes, across more than 900 patients and 50 tissues.
“Node embeddings capture the topological structure of the highly complex biological graph. They allow for efficient investigation of the relationship between the different data and allow us to find important associations between COVID-19 genes and diseases such as ischemic heart disease, cerebrovascular disease, and hypertension,” says Emy Yue Hu, the PhD student who led the project.