Machine Learning

2022
Pardo, M. ; Offer, S. ; Hartner, E. ; Di Bucchianico, S. ; Bisig, B. ; Bauer, S. ; Pantzke, J. ; Zimmermann, E. ; Cao, X. ; Binder, S. ; Kuhn, E. ; Huber, A. ; Jeong, S. ; Käfer, U. ; Schneider, E. ; Mesceriakovas, A. ; Bendl, J. ; Brejcha, R. ; Buchholz, A. ; Gat, D. ; Hohaus, T. ; Rastak, N. ; Karg, E.W. ; Jakobi, G. ; Kalberer, M. ; Kanashova, T. ; Hu, Y. ; Ogris, C. ; Marsico, A. ; Theis, F.J. ; Shalit, T. ; Gröger, T.M. ; Rüger, C.P. ; Oeder, S. ; Orasche, J. ; Paul, A. ; Ziehm, T. ; Zhang, Z.H. ; Adam, T. ; Sippula, O. ; Sklorz, M. ; Schnelle-Kreis, J. ; Czech, H. ; Kiendler-Scharr, A. ; Zimmermann, R. ; Rudich, Y.
Environ. Int. 166:107366 (2022)
Knowledge of the health effects of exposure to secondary organic aerosols (SOAs) is still limited. Here, we investigated and compared the toxicities of soot particles (SP) coated with β-pinene SOA (SOAβPin-SP) and SP coated with naphthalene SOA (SOANap-SP) in a human bronchial epithelial cell line (BEAS-2B) residing at the air-liquid interface. SOAβPin-SP mostly contained oxygenated aliphatic compounds from β-pinene photooxidation, whereas SOANap-SP contained a significant fraction of oxygenated aromatic products under similar conditions. Following exposure, genome-wide transcriptome responses showed an Nrf2 oxidative stress response, particularly for SOANap-SP. Other signaling pathways, such as redox signaling, inflammatory signaling, and the involvement of matrix metalloproteinase, were identified to have a stronger impact following exposure to SOANap-SP. SOANap-SP also induced a stronger genotoxicity response than that of SOAβPin-SP. This study elucidated the mechanisms that govern SOA toxicity and showed that, compared to SOAs derived from a typical biogenic precursor, SOAs from a typical anthropogenic precursor have higher toxicological potency, which was accompanied by the activation of varied cellular mechanisms, such as aryl hydrocarbon receptor. This can be attributed to the difference in chemical composition; specifically, the aromatic compounds in the naphthalene-derived SOA had higher cytotoxic potential than that of the β-pinene-derived SOA.
Scientific Article
Lopez, J.P. ; Luecken, M. ; Brivio, E. ; Karamihalev, S. ; Kos, A. ; De Donno, C. ; Benjamin, A. ; Yang, H. ; Dick, A.L.W. ; Stoffel, R. ; Flachskamm, C. ; Ressle, A. ; Roeh, S. ; Huettl, R.E. ; Parl, A. ; Eggert, C. ; Novak, B. ; Yan, Y. ; Yeoh, K. ; Holzapfel, M. ; Hauger, B. ; Harbich, D. ; Schmid, B. ; Di Giaimo, R. ; Turck, C.W. ; Schmidt, M.V. ; Deussing, J.M. ; Eder, M. ; Dine, J. ; Theis, F.J. ; Chen, A.
Neuron 110, 2283-2298.e9 (2022)
A single sub-anesthetic dose of ketamine produces a rapid and sustained antidepressant response, yet the molecular mechanisms responsible for this remain unclear. Here, we identified cell-type-specific transcriptional signatures associated with a sustained ketamine response in mice. Most interestingly, we identified the Kcnq2 gene as an important downstream regulator of ketamine action in glutamatergic neurons of the ventral hippocampus. We validated these findings through a series of complementary molecular, electrophysiological, cellular, pharmacological, behavioral, and functional experiments. We demonstrated that adjunctive treatment with retigabine, a KCNQ activator, augments ketamine's antidepressant-like effects in mice. Intriguingly, these effects are ketamine specific, as they do not modulate a response to classical antidepressants, such as escitalopram. These findings significantly advance our understanding of the mechanisms underlying the sustained antidepressant effects of ketamine, with important clinical implications.
Scientific Article
Lutz, K. ; Musumeci, A. ; Sie, C. ; Dursun, E. ; Winheim, E. ; Bagnoli, J. ; Ziegenhain, C. ; Rausch, L. ; Bergen, V. ; Luecken, M. ; Oostendorp, R.A.J. ; Schraml, B.U. ; Theis, F.J. ; Enard, W. ; Korn, T. ; Krug, A.B.
Nat. Commun. 13:3456 (2022)
Plasmacytoid and conventional dendritic cells (pDC and cDC) are generated from progenitor cells in the bone marrow and commitment to pDCs or cDC subtypes may occur in earlier and later progenitor stages. Cells within the CD11c+MHCII-/loSiglec-H+CCR9lo DC precursor fraction of the mouse bone marrow generate both pDCs and cDCs. Here we investigate the heterogeneity and commitment of subsets in this compartment by single-cell transcriptomics and high-dimensional flow cytometry combined with cell fate analysis: Within the CD11c+MHCII-/loSiglec-H+CCR9lo DC precursor pool cells expressing high levels of Ly6D and lacking expression of transcription factor Zbtb46 contain CCR9loB220hi immediate pDC precursors and CCR9loB220lo (lo-lo) cells which still generate pDCs and cDCs in vitro and in vivo under steady state conditions. cDC-primed cells within the Ly6DhiZbtb46- lo-lo precursors rapidly upregulate Zbtb46 and pass through a Zbtb46+Ly6D+ intermediate stage before acquiring cDC phenotype after cell division. Type I IFN stimulation limits cDC and promotes pDC output from this precursor fraction by arresting cDC-primed cells in the Zbtb46+Ly6D+ stage preventing their expansion and differentiation into cDCs. Modulation of pDC versus cDC output from precursors by external factors may allow for adaptation of DC subset composition at later differentiation stages.
Scientific Article
Falkai, P. ; Koutsouleris, N. ; Bertsch, K. ; Bialas, M. ; Binder, E. ; Bühner, M. ; Buyx, A. ; Cai, N. ; Cappello, S. ; Ehring, T. ; Gensichen, J. ; Hamann, J. ; Hasan, A. ; Henningsen, P. ; Leucht, S. ; Möhrmann, K.H. ; Nagelstutz, E. ; Padberg, F. ; Peters, A. ; Pfäffel, L. ; Reich-Erkelenz, D. ; Riedl, V. ; Rueckert, D. ; Schmitt, A. ; Schulte-Körne, G. ; Scheuring, E. ; Schulze, T.G. ; Starzengruber, R. ; Stier, S. ; Theis, F.J. ; Winkelmann, J. ; Wurst, W. ; Priller, J.
Front. Psychiatr. 13:815718 (2022)
The Federal Ministry of Education and Research (BMBF) issued a call for a new nationwide research network on mental disorders, the German Center of Mental Health (DZPG). The Munich/Augsburg consortium was selected to participate as one of six partner sites with its concept "Precision in Mental Health (PriMe): Understanding, predicting, and preventing chronicity." PriMe bundles interdisciplinary research from the Ludwig-Maximilians-University (LMU), Technical University of Munich (TUM), University of Augsburg (UniA), Helmholtz Center Munich (HMGU), and Max Planck Institute of Psychiatry (MPIP) and has a focus on schizophrenia (SZ), bipolar disorder (BPD), and major depressive disorder (MDD). PriMe takes a longitudinal perspective on these three disorders from the at-risk stage to the first-episode, relapsing, and chronic stages. These disorders pose a major health burden because in up to 50% of patients they cause untreatable residual symptoms, which lead to early social and vocational disability, comorbidities, and excess mortality. PriMe aims at reducing mortality on different levels, e.g., reducing death by psychiatric and somatic comorbidities, and will approach this goal by addressing interdisciplinary and cross-sector approaches across the lifespan. PriMe aims to add a precision medicine framework to the DZPG that will propel deeper understanding, more accurate prediction, and personalized prevention to prevent disease chronicity and mortality across mental illnesses. This framework is structured along the translational chain and will be used by PriMe to innovate the preventive and therapeutic management of SZ, BPD, and MDD from rural to urban areas and from patients in early disease stages to patients with long-term disease courses. 
Research will build on platforms that include one on model systems, one on the identification and validation of predictive markers, one on the development of novel multimodal treatments, one on the regulation and strengthening of the uptake and dissemination of personalized treatments, and finally one on testing of the clinical effectiveness, utility, and scalability of such personalized treatments. In accordance with the translational chain, PriMe's expertise includes the ability to integrate understanding of bio-behavioral processes based on innovative models, to translate this knowledge into clinical practice and to promote user participation in mental health research and care.
Review
Stirm, L. ; Huypens, P. ; Sass, S. ; Batra, R. ; Fritsche, L. ; Brucker, S. ; Abele, H. ; Hennige, A.M. ; Theis, F.J. ; Beckers, J. ; Hrabě de Angelis, M. ; Fritsche, A. ; Häring, H.-U. ; Staiger, H.
Sci. Rep. 12:6793 (2022)
This Article contains an error in Table 1 where the mean value and standard deviation of pregnancy week for the "screening group:NGT women" were incorrectly given as 23.0 +/- 9.5. The correct numbers are 26.5 +/- 2.1.
Brunner, A.D. ; Thielert, M. ; Vasilopoulou, C.G. ; Ammar, C. ; Coscia, F. ; Mund, A. ; Hoerning, O.B. ; Bache, N. ; Apalategui, A. ; Lubeck, M. ; Richter, S. ; Fischer, D.S. ; Raether, O. ; Park, M.A. ; Meier, F. ; Theis, F.J. ; Mann, M.
Mol. Syst. Biol. 18:e10798 (2022)
Single-cell technologies are revolutionizing biology but are today mainly limited to imaging and deep sequencing. However, proteins are the main drivers of cellular function and in-depth characterization of individual cells by mass spectrometry (MS)-based proteomics would thus be highly valuable and complementary. Here, we develop a robust workflow combining miniaturized sample preparation, very low flow-rate chromatography, and a novel trapped ion mobility mass spectrometer, resulting in a more than 10-fold improved sensitivity. We precisely and robustly quantify proteomes and their changes in single, FACS-isolated cells. Arresting cells at defined stages of the cell cycle by drug treatment retrieves expected key regulators. Furthermore, it highlights potential novel ones and allows cell phase prediction. Comparing the variability in more than 430 single-cell proteomes to transcriptome data revealed a stable-core proteome despite perturbation, while the transcriptome appears stochastic. Our technology can readily be applied to ultra-high sensitivity analyses of tissue material, posttranslational modifications, and small molecule studies from small cell counts to gain unprecedented insights into cellular heterogeneity in health and disease.
Scientific Article
Giehrl-Schwab, J. ; Giesert, F. ; Rauser, B. ; Lao, C.L. ; Hembach, S. ; Lefort, S. ; Ibarra Del Rio, I.A. ; Koupourtidou, C. ; Luecken, M. ; Truong, D.-J.J. ; Fischer-Sternjak, J. ; Masserdotti, G. ; Prakash, N. ; Ninkovic, J. ; Hölter, S.M. ; Vogt Weisenhorn, D.M. ; Theis, F.J. ; Götz, M. ; Wurst, W.
EMBO Mol. Med.:e14797 (2022)
Direct reprogramming based on genetic factors represents a promising strategy to replace lost cells in degenerative diseases such as Parkinson's disease. For this, we developed a knock-in mouse line carrying a dual dCas9 transactivator system (dCAM) allowing the conditional in vivo activation of endogenous genes. To enable a translational application, we additionally established an AAV-based strategy carrying intein-split-dCas9 in combination with activators (AAV-dCAS). Both approaches were successful in reprogramming striatal astrocytes into induced GABAergic neurons confirmed by single-cell transcriptome analysis of reprogrammed neurons in vivo. These GABAergic neurons functionally integrate into striatal circuits, alleviating voluntary motor behavior aspects in a 6-OHDA Parkinson's disease model. Our results suggest a novel intervention strategy beyond the restoration of dopamine levels. Thus, the AAV-dCAS approach might enable an alternative route for clinical therapies of Parkinson's disease.
Scientific Article
Gayoso, A. ; Lopez, R. ; Xing, G. ; Boyeau, P. ; Valiollah Pour Amiri, V. ; Hong, J. ; Wu, K. ; Jayasuriya, M. ; Mehlman, E. ; Langevin, M. ; Liu, Y. ; Samaran, J. ; Misrachi, G. ; Nazaret, A. ; Clivio, O. ; Xu, C. ; Ashuach, T. ; Gabitto, M. ; Lotfollahi, M. ; Svensson, V. ; da Veiga Beltrame, E. ; Kleshchevnikov, V. ; Talavera-López, C. ; Pachter, L. ; Theis, F.J. ; Streets, A. ; Jordan, M.I. ; Regier, J. ; Yosef, N.
Nat. Biotechnol. 40, 163-166 (2022)
Letter to the Editor
Palla, G. ; Spitzer, H. ; Klein, M. ; Fischer, D.S. ; Schaar, A. ; Kuemmerle, L. ; Rybakov, S. ; Ibarra Del Rio, I.A. ; Holmberg, O. ; Virshup, I. ; Lotfollahi, M. ; Richter, S. ; Theis, F.J.
Nat. Methods 19, 171–178 (2022)
Spatial omics data are advancing the study of tissue organization and cellular communication at an unprecedented scale. Flexible tools are required to store, integrate and visualize the large diversity of spatial omics data. Here, we present Squidpy, a Python framework that brings together tools from omics and image analysis to enable scalable description of spatial molecular data, such as transcriptome or multivariate proteins. Squidpy provides efficient infrastructure and numerous analysis methods that make it possible to efficiently store, manipulate and interactively visualize spatial omics data. Squidpy is extensible and can be interfaced with a variety of already existing libraries for the scalable analysis of spatial omics data.
Scientific Article
Offer, S. ; Hartner, E. ; Di Bucchianico, S. ; Bisig, B. ; Bauer, S. ; Pantzke, J. ; Zimmermann, E. ; Cao, X. ; Binder, S. ; Kuhn, E. ; Huber, A. ; Jeong, S. ; Käfer, U. ; Martens, P. ; Mesceriakovas, A. ; Bendl, J. ; Brejcha, R. ; Buchholz, A. ; Gat, D. ; Hohaus, T. ; Rastak, N. ; Jakobi, G. ; Kalberer, M. ; Kanashova, T. ; Hu, Y. ; Ogris, C. ; Marsico, A. ; Theis, F.J. ; Pardo, M. ; Gröger, T.M. ; Oeder, S. ; Orasche, J. ; Paul, A. ; Ziehm, T. ; Zhang, Z.H. ; Adam, T. ; Sippula, O. ; Sklorz, M. ; Schnelle-Kreis, J. ; Czech, H. ; Kiendler-Scharr, A. ; Rudich, Y. ; Zimmermann, R.
Environ. Health Perspect. 130:27003 (2022)
BACKGROUND: Secondary organic aerosols (SOAs) formed from anthropogenic or biogenic gaseous precursors in the atmosphere substantially contribute to the ambient fine particulate matter [PM ≤2.5μm in aerodynamic diameter (PM2.5)] burden, which has been associated with adverse human health effects. However, there is only limited evidence on their differential toxicological impact. OBJECTIVES: We aimed to discriminate toxicological effects of aerosols generated by atmospheric aging on combustion soot particles (SPs) of gaseous biogenic (β-pinene) or anthropogenic (naphthalene) precursors in two different lung cell models exposed at the air-liquid interface (ALI). METHODS: Mono- or cocultures of lung epithelial cells (A549) and endothelial cells (EA.hy926) were exposed at the ALI for 4 h to different aerosol concentrations of a photochemically aged mixture of primary combustion SP and β-pinene (SOAβPIN-SP) or naphthalene (SOANAP-SP). The internally mixed soot/SOA particles were comprehensively characterized in terms of their physical and chemical properties. We conducted toxicity tests to determine cytotoxicity, intracellular oxidative stress, primary and secondary genotoxicity, as well as inflammatory and angiogenic effects. RESULTS: We observed considerable toxicity-related outcomes in cells treated with either SOA type. Greater adverse effects were measured for SOANAP-SP compared with SOAβPIN-SP in both cell models, whereas the nano-sized soot cores alone showed only minor effects. At the functional level, we found that SOANAP-SP augmented the secretion of malondialdehyde and interleukin-8 and may have induced the activation of endothelial cells in the coculture system. This activation was confirmed by comet assay, suggesting secondary genotoxicity and greater angiogenic potential. Chemical characterization of PM revealed distinct qualitative differences in the composition of the two secondary aerosol types. 
DISCUSSION: In this study using A549 and EA.hy926 cells exposed at ALI, SOA compounds had greater toxicity than primary SPs. Photochemical aging of naphthalene was associated with the formation of more oxidized, more aromatic SOAs with a higher oxidative potential and toxicity compared with β-pinene. Thus, we conclude that the influence of atmospheric chemistry on the chemical PM composition plays a crucial role for the adverse health outcome of emissions. https://doi.org/10.1289/EHP9413.
Scientific Article
Palla, G. ; Fischer, D.S. ; Regev, A. ; Theis, F.J.
Nat. Biotechnol. 40, 308–318 (2022)
Methods for profiling RNA and protein expression in a spatially resolved manner are rapidly evolving, making it possible to comprehensively characterize cells and tissues in health and disease. To maximize the biological insights obtained using these techniques, it is critical to both clearly articulate the key biological questions in spatial analysis of tissues and develop the requisite computational tools to address them. Developers of analytical tools need to decide on the intrinsic molecular features of each cell that need to be considered, and how cell shape and morphological features are incorporated into the analysis. Also, optimal ways to compare different tissue samples at various length scales are still being sought. Grouping these biological problems and related computational algorithms into classes across length scales, thus characterizing common issues that need to be addressed, will facilitate further progress in spatial transcriptomics and proteomics.
Review
Luecken, M. ; Zaragosi, L.E. ; Madissoon, E. ; Sikkema, L. ; Firsova, A.B. ; De Domenico, E. ; Kuemmerle, L. ; Saglam, A. ; Berg, M. ; Gay, A.C.A. ; Schniering, J. ; Mayr, C. ; Abalo, X.M. ; Larsson, L. ; Sountoulidis, A. ; Teichmann, S. ; van Eunen, K. ; Koppelman, G.H. ; Saeb-Parsy, K. ; Leroy, S. ; Powell, P. ; Sarkans, U. ; Timens, W. ; Lundeberg, J. ; van den Berge, M. ; Nilsson, M. ; Horváth, P. ; Denning, J. ; Papatheodorou, I. ; Schultze, J.L. ; Schiller, H.B. ; Barbry, P. ; Petoukhov, I. ; Misharin, A.V. ; Adcock, I. ; von Papen, M. ; Theis, F.J. ; Samakovlis, C. ; Meyer, K.B. ; Nawijn, M.C.
Eur. Respir. J. 59:2102057 (2022)
The Human Cell Atlas (HCA) consortium aims to establish an atlas of all organs in the healthy human body at single-cell resolution to increase our understanding of basic biological processes that govern development, physiology and anatomy, and to accelerate diagnosis and treatment of disease. The lung biological network of the HCA aims to generate the Human Lung Cell Atlas as a reference for the cellular repertoire, molecular cell states and phenotypes, and the cell-cell interactions that characterise normal lung homeostasis in healthy lung tissue. Such a reference atlas of the healthy human lung will facilitate mapping the changes in the cellular landscape in disease. The discovAIR project is one of six pilot actions for the HCA funded by the European Commission in the context of the H2020 framework program. DiscovAIR aims to establish the first draft of an integrated Human Lung Cell Atlas, combining single-cell transcriptional and epigenetic profiling with spatially resolving techniques on matched tissue samples, as well as including a number of chronic and infectious diseases of the lung. The integrated Lung Cell Atlas will be available as a resource for the wider respiratory community, including basic and translational scientists, clinical medicine, and the private sector, as well as for patients with lung disease and the interested lay public. We anticipate that the Lung Cell Atlas will be the founding stone for a more detailed understanding of the pathogenesis of lung diseases, guiding the design of novel diagnostics and preventive or curative interventions.
Review
Lange, M. ; Bergen, V. ; Klein, M. ; Setty, M. ; Reuter, B. ; Bakhti, M. ; Lickert, H. ; Ansari, M. ; Schniering, J. ; Schiller, H.B. ; Pe'er, D. ; Theis, F.J.
Nat. Methods 19, 159–170 (2022)
Computational trajectory inference enables the reconstruction of cell state dynamics from single-cell RNA sequencing experiments. However, trajectory inference requires that the direction of a biological process is known, largely limiting its application to differentiating systems in normal development. Here, we present CellRank ( https://cellrank.org ) for single-cell fate mapping in diverse scenarios, including regeneration, reprogramming and disease, for which direction is unknown. Our approach combines the robustness of trajectory inference with directional information from RNA velocity, taking into account the gradual and stochastic nature of cellular fate decisions, as well as uncertainty in velocity vectors. On pancreas development data, CellRank automatically detects initial, intermediate and terminal populations, predicts fate potentials and visualizes continuous gene expression trends along individual lineages. Applied to lineage-traced cellular reprogramming data, predicted fate probabilities correctly recover reprogramming outcomes. CellRank also predicts a new dedifferentiation trajectory during postinjury lung regeneration, including previously unknown intermediate cell states, which we confirm experimentally.
Scientific Article
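The fate probabilities mentioned in the CellRank abstract above can be illustrated with a toy example. The sketch below is not CellRank's implementation or API. It assumes that cell-state transitions are modelled as a Markov chain and that a cell's fate probability is the probability of eventually being absorbed in a given terminal state. The transition matrix, the state labels, and the function name `fate_probabilities` are all hypothetical.

```python
def fate_probabilities(P, terminal, n_iter=1000):
    """Approximate absorption probabilities of a Markov chain by fixed-point iteration.

    P        -- row-stochastic transition matrix as a list of lists
    terminal -- indices of absorbing (terminal) states
    Returns probs[i][k]: probability that a walk started in state i
    ends in terminal[k].
    """
    n = len(P)
    # Terminal states are absorbed in themselves; others start at zero.
    probs = [[1.0 if i == t else 0.0 for t in terminal] for i in range(n)]
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            if i in terminal:
                updated.append(probs[i])
            else:
                # One-step lookahead: average the fates of the successor states.
                updated.append([
                    sum(P[i][j] * probs[j][k] for j in range(n))
                    for k in range(len(terminal))
                ])
        probs = updated
    return probs


# Hypothetical toy chain: 0 = progenitor, 1 = intermediate,
# 2 = terminal fate A, 3 = terminal fate B.
P = [
    [0.2, 0.4, 0.3, 0.1],
    [0.0, 0.5, 0.1, 0.4],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
probs = fate_probabilities(P, terminal=[2, 3])
for state in (0, 1):
    print(state, [round(p, 3) for p in probs[state]])
```

In this toy chain the progenitor state is almost evenly split between the two fates, while the intermediate state is biased toward fate B.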
Magaletta, M.E. ; Lobo, M. ; Kernfeld, E.M. ; Aliee, H. ; Huey, J.D. ; Parsons, T.J. ; Theis, F.J. ; Maehr, R.
Nat. Commun. 13:457 (2022)
Maldevelopment of the pharyngeal endoderm, an embryonic tissue critical for patterning of the pharyngeal region and ensuing organogenesis, ultimately contributes to several classes of human developmental syndromes and disorders. Such syndromes are characterized by a spectrum of phenotypes that currently cannot be fully explained by known mutations or genetic variants due to gaps in characterization of critical drivers of normal and dysfunctional development. Despite the disease-relevance of pharyngeal endoderm, we still lack a comprehensive and integrative view of the molecular basis and gene regulatory networks driving pharyngeal endoderm development. To close this gap, we apply transcriptomic and chromatin accessibility single-cell sequencing technologies to generate a multi-omic developmental resource spanning pharyngeal endoderm patterning to the emergence of organ-specific epithelia in the developing mouse embryo. We identify cell-type specific gene regulation, distill GRN models that define developing organ domains, and characterize the role of an immunodeficiency-associated forkhead box transcription factor.
Scientific Article
Luecken, M. ; Büttner, M. ; Chaichoompu, K. ; Danese, A. ; Interlandi, M. ; Müller, M.F. ; Strobl, D.C. ; Zappia, L. ; Dugas, M. ; Colomé-Tatché, M. ; Theis, F.J.
Nat. Methods 19, 41-50 (2022)
Single-cell atlases often include samples that span locations, laboratories and conditions, leading to complex, nested batch effects in data. Thus, joint analysis of atlas datasets requires reliable data integration. To guide integration method choice, we benchmarked 68 method and preprocessing combinations on 85 batches of gene expression, chromatin accessibility and simulation data from 23 publications, altogether representing >1.2 million cells distributed in 13 atlas-level integration tasks. We evaluated methods according to scalability, usability and their ability to remove batch effects while retaining biological variation using 14 evaluation metrics. We show that highly variable gene selection improves the performance of data integration methods, whereas scaling pushes methods to prioritize batch removal over conservation of biological variation. Overall, scANVI, Scanorama, scVI and scGen perform well, particularly on complex integration tasks, while single-cell ATAC-sequencing integration performance is strongly affected by choice of feature space. Our freely available Python module and benchmarking pipeline can identify optimal data integration methods for new data, benchmark new methods and improve method development.
Scientific Article
Hrovatin, K. ; Fischer, D.S. ; Theis, F.J.
Mol. Metab. 57:101396 (2022)
Background: Single-cell metabolic studies bring new insights into cellular function, which can often not be captured on other omics layers. Metabolic information has wide applicability, such as for the study of cellular heterogeneity or for the understanding of drug mechanisms and biomarker development. However, metabolic measurements on single-cell level are limited by insufficient scalability and sensitivity, as well as resource intensiveness, and are currently not possible in parallel with measuring transcript state, commonly used to identify cell types. Nevertheless, because omics layers are strongly intertwined, it is possible to make metabolic predictions based on measured data of more easily measurable omics layers together with prior metabolic network knowledge. Scope of review: We summarize the current state of single-cell metabolic measurement and modeling approaches, motivating the use of computational techniques. We review three main classes of computational methods used for prediction of single-cell metabolism: pathway-level analysis, constraint-based modeling, and kinetic modeling. We describe the unique challenges arising when transitioning from bulk to single-cell modeling. Finally, we propose potential model extensions and computational methods that could be leveraged to achieve these goals. Major conclusions: Single-cell metabolic modeling is a rising field that provides a new perspective for understanding cellular functions. The presented modeling approaches vary in terms of input requirements and assumptions, scalability, modeled metabolic layers, and newly gained insights. We believe that the use of prior metabolic knowledge will lead to more robust predictions and will pave the way for mechanistic and interpretable machine-learning models.
Review
2021
Wendisch, D. ; Dietrich, O. ; Mari, T. ; von Stillfried, S. ; Ibarra Del Rio, I.A. ; Mittermaier, M. ; Mache, C. ; Chua, R.L. ; Knöll, R. ; Timm, S. ; Brumhard, S. ; Krammer, T. ; Zauber, H. ; Hiller, A.L. ; Pascual-Reguant, A. ; Mothes, R. ; Bülow, R.D. ; Schulze, J. ; Leipold, A.M. ; Djudjaj, S. ; Erhard, F. ; Geffers, R. ; Pott, F. ; Kazmierski, J. ; Radke, J. ; Pergantis, P. ; Baßler, K. ; Conrad, C. ; Aschenbrenner, A.C. ; Sawitzki, B. ; Landthaler, M. ; Wyler, E. ; Horst, D. ; Hippenstiel, S. ; Hocke, A.C. ; Heppner, F.L. ; Uhrig, A. ; Garcia, C. ; Machleidt, F. ; Herold, S. ; Elezkurtaj, S. ; Thibeault, C. ; Witzenrath, M. ; Cochain, C. ; Suttorp, N. ; Drosten, C. ; Goffinet, C. ; Kurth, F. ; Schultze, J.L. ; Radbruch, H. ; Ochs, M. ; Eils, R. ; Müller-Redetzky, H. ; Hauser, A.E. ; Luecken, M. ; Theis, F.J. ; Wolff, T. ; Boor, P. ; Selbach, M. ; Saliba, A.E. ; Sander, L.E.
Cell 184, 6243-6261.e27 (2021)
COVID-19-induced “acute respiratory distress syndrome” (ARDS) is associated with prolonged respiratory failure and high mortality, but the mechanistic basis of lung injury remains incompletely understood. Here, we analyze pulmonary immune responses and lung pathology in two cohorts of patients with COVID-19 ARDS using functional single-cell genomics, immunohistology, and electron microscopy. We describe an accumulation of CD163-expressing monocyte-derived macrophages that acquired a profibrotic transcriptional phenotype during COVID-19 ARDS. Gene set enrichment and computational data integration revealed a significant similarity between COVID-19-associated macrophages and profibrotic macrophage populations identified in idiopathic pulmonary fibrosis. COVID-19 ARDS was associated with clinical, radiographic, histopathological, and ultrastructural hallmarks of pulmonary fibrosis. Exposure of human monocytes to SARS-CoV-2, but not influenza A virus or viral RNA analogs, was sufficient to induce a similar profibrotic phenotype in vitro. In conclusion, we demonstrate that SARS-CoV-2 triggers profibrotic macrophage responses and pronounced fibroproliferative ARDS.
Scientific Article
Büttner, M. ; Ostner, J. ; Müller, C. ; Theis, F.J. ; Schubert, B.
Nat. Commun. 12:6876 (2021)
Compositional changes of cell types are main drivers of biological processes. Their detection through single-cell experiments is difficult due to the compositionality of the data and low sample sizes. We introduce scCODA ( https://github.com/theislab/scCODA ), a Bayesian model addressing these issues enabling the study of complex cell type effects in disease, and other stimuli. scCODA demonstrated excellent detection performance, while reliably controlling for false discoveries, and identified experimentally verified cell type changes that were missed in original analyses.
Scientific Article
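The "compositionality of the data" mentioned in the scCODA abstract above can be made concrete with a small illustration. This is only a sketch with made-up cell counts and hypothetical names; it does not use scCODA or reproduce its Bayesian model. It shows why per-cell-type proportions cannot be tested independently: when one cell type expands, the proportions of all other types fall even if their absolute numbers are unchanged.

```python
def proportions(counts):
    """Convert absolute cell counts per type into proportions that sum to 1."""
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


# Made-up counts: only T cells change in absolute number.
control = {"T cells": 400, "B cells": 300, "Monocytes": 300}
treated = {"T cells": 800, "B cells": 300, "Monocytes": 300}

p_control = proportions(control)
p_treated = proportions(treated)

for cell_type in control:
    print(cell_type, round(p_control[cell_type], 3), "->", round(p_treated[cell_type], 3))
```

Here the B cell and monocyte proportions drop although their counts are identical in both conditions, which is the kind of spurious change a compositional model has to account for.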
Cruceanu, C. ; Dony, L. ; Krontira, A.C. ; Fischer, D.S. ; Roeh, S. ; Di Giaimo, R. ; Kyrousi, C. ; Kaspar, L. ; Knauer-Arloth, J. ; Czamara, D. ; Martinelli, S. ; Wehner, S. ; Breen, M.S. ; Koedel, M. ; Sauer, S. ; Sportelli, V. ; Rex-Haffner, M. ; Cappello, S. ; Theis, F.J. ; Binder, E.B.
Am. J. Psychiatry 179, 375-387 (2021)
OBJECTIVE: A fine-tuned balance of glucocorticoid receptor (GR) activation is essential for organ formation, with disturbances influencing many health outcomes. In utero, glucocorticoids have been linked to brain-related negative outcomes, with unclear underlying mechanisms, especially regarding cell-type-specific effects. An in vitro model of fetal human brain development, induced human pluripotent stem cell (hiPSC)-derived cerebral organoids, was used to test whether cerebral organoids are suitable for studying the impact of prenatal glucocorticoid exposure on the developing brain. METHODS: The GR was activated with the synthetic glucocorticoid dexamethasone, and the effects were mapped using single-cell transcriptomics across development. RESULTS: The GR was expressed in all cell types, with increasing expression levels through development. Not only did its activation elicit translocation to the nucleus and the expected effects on known GR-regulated pathways, but also neurons and progenitor cells showed targeted regulation of differentiation- and maturation-related transcripts. Uniquely in neurons, differentially expressed transcripts were significantly enriched for genes associated with behavior-related phenotypes and disorders. This human neuronal glucocorticoid response profile was validated across organoids from three independent hiPSC lines reprogrammed from different source tissues from both male and female donors. CONCLUSIONS: These findings suggest that excessive glucocorticoid exposure could interfere with neuronal maturation in utero, leading to increased disease susceptibility through neurodevelopmental processes at the interface of genetic susceptibility and environmental exposure. Cerebral organoids are a valuable translational resource for exploring the effects of glucocorticoids on early human brain development.
Scientific Article
Aliee, H. ; Massip, F. ; Qi, C. ; Stella de Biase, M. ; van Nijnatten, J. ; Kersten, E.T.G. ; Kermani, N.Z. ; Khuder, B. ; Vonk, J.M. ; Vermeulen, R.C.H. ; U-BIOPRED study group ; Cambridge Lung Cancer Early Detection Programme ; INER-Ciencias Mexican Lung Program ; Neighbors, M. ; Tew, G.W. ; Grimbaldeston, M.A. ; Ten Hacken, N.H.T. ; Hu, S. ; Guo, Y. ; Zhang, X. ; Sun, K. ; Hiemstra, P.S. ; Ponder, B.A. ; Makela, M.J. ; Malmström, K. ; Rintoul, R.C. ; Reyfman, P.A. ; Theis, F.J. ; Brandsma, C.A. ; Adcock, I.M. ; Timens, W. ; Xu, C.J. ; van den Berge, M. ; Schwarz, R.F. ; Koppelman, G.H. ; Nawijn, M.C. ; Faiz, A.
Allergy, DOI: 10.1111/all.15152 (2021)
Scientific Article
Schmid, K. ; Höllbacher, B. ; Cruceanu, C. ; Böttcher, A. ; Lickert, H. ; Binder, E.B. ; Theis, F.J. ; Heinig, M.
Nat. Commun. 12:6625 (2021)
Single cell RNA-seq has revolutionized transcriptomics by providing cell type resolution for differential gene expression and expression quantitative trait loci (eQTL) analyses. However, efficient power analysis methods for single cell data and inter-individual comparisons are lacking. Here, we present scPower, a statistical framework for the design and power analysis of multi-sample single cell transcriptomic experiments. We modelled the relationship between sample size, the number of cells per individual, sequencing depth, and the power of detecting differentially expressed genes within cell types. We systematically evaluated these optimal parameter combinations for several single cell profiling platforms, and generated broad recommendations. In general, shallow sequencing of high numbers of cells leads to higher overall power than deep sequencing of fewer cells. The model, including priors, is implemented as an R package and is accessible as a web tool. scPower is a highly customizable tool that experimentalists can use to quickly compare a multitude of experimental designs and optimize for a limited budget.
Scientific Article
Reddy, K.D. ; Lan, A. ; Boudewijn, I.M. ; Rathnayake, S.N.H. ; Koppelman, G.H. ; Aliee, H. ; Theis, F.J. ; Oliver, B.G. ; van den Berge, M. ; Faiz, A.
Am. J. Respir. Cell Mol. Biol. 65, 366-377 (2021)
Current smoking contributes to worsened asthma prognosis and more severe symptoms and limits the beneficial effects of corticosteroids. As the nasal epithelium can reflect smoking-induced changes in the lower airways, it is a relevant source to investigate changes in gene expression and DNA methylation. This study explores gene expression and DNA methylation changes in current and ex-smokers with asthma. Matched gene expression and epigenome-wide DNA methylation samples collected from nasal brushings of 55 patients enrolled in a clinical trial investigation of current and ex-smoker patients with asthma were analyzed. Differential gene expression and DNA methylation analyses were conducted comparing current smokers with ex-smokers. Expression quantitative trait methylation (eQTM) analysis was completed to explore smoking-relevant genes by CpG sites that differ between current and ex-smokers. To investigate the relevance of the smoking-associated DNA methylation changes for the lower airways, significant CpG sites were explored in bronchial biopsies from patients who had stopped smoking. A total of 809 genes and 18,814 CpG sites were differentially associated with current smoking in the nose. The cis-eQTM analysis uncovered 171 CpG sites with a methylation status associated with smoking-related gene expression, including AHRR, ALDH3A1, CYP1A1, and CYP1B1. The methylation status of CpG sites altered by current smoking reversed with 1 year of smoking cessation. We confirm that current smoking alters epigenetic patterns and affects gene expression in the nasal epithelium of patients with asthma, which is partially reversible in bronchial biopsies after smoking cessation. We demonstrate the ability to discern molecular changes in the nasal epithelium, presenting this as a tool in future investigations into disease-relevant effects of tobacco smoke.
Scientific Article
Way, G.P. ; Greene, C.S. ; Carninci, P. ; Carvalho, B.S. ; de Hoon, M. ; Finley, S. ; Gosline, S.J.C. ; Le Cao, K.A. ; Lee, J.S.H. ; Marchionni, L. ; Robine, N. ; Sindi, S.S. ; Theis, F.J. ; Yang, J.Y.H. ; Carpenter, A.E. ; Fertig, E.J.
PLoS Biol. 19:e3001419 (2021)
Evolving in sync with the computation revolution over the past 30 years, computational biology has emerged as a mature scientific field. While the field has made major contributions toward improving scientific knowledge and human health, individual computational biology practitioners at various institutions often languish in career development. As optimistic biologists passionate about the future of our field, we propose solutions for both eager and reluctant individual scientists, institutions, publishers, funding agencies, and educators to fully embrace computational biology. We believe that in order to pave the way for the next generation of discoveries, we need to improve recognition for computational biologists and better align pathways of career success with pathways of scientific progress. With 10 outlined steps, we call on all adjacent fields to move away from the traditional individual, single-discipline investigator research model and embrace multidisciplinary, data-driven, team science.
Review
Nguyen, B.H.P. ; Ohnmacht, A. ; Galhoz, A. ; Büttner, M. ; Theis, F.J. ; Menden, M.
Diabetologe, DOI: 10.1007/s11428-021-00817-w (2021)
Background: Diabetes mellitus is becoming a global health problem that requires a transformation of research and medical practice to achieve better patient management. In this regard, the abundance of data and the advances in technology and artificial intelligence offer opportunities for such an undertaking. Objectives: This review aims to give an overview of artificial intelligence and of current research on its application in the field of diabetes, in particular for risk prediction, diagnosis, prognosis, and the prediction of complications. Conclusion: Artificial intelligence is transforming diabetes research in many technical and organizational aspects. Although its use is still limited and faces many challenges, it will probably influence medical treatment in the future by offering automated and personalized health care for patients.
Review
Zappia, L. ; Theis, F.J.
Genome Biol. 22:301 (2021)
Recent years have seen a revolution in single-cell RNA-sequencing (scRNA-seq) technologies, datasets, and analysis methods. Since 2016, the scRNA-tools database has cataloged software tools for analyzing scRNA-seq data. With the number of tools in the database passing 1000, we provide an update on the state of the project and the field. This data shows the evolution of the field and a change of focus from ordering cells on continuous trajectories to integrating multiple samples and making use of reference datasets. We also find that open science practices reward developers with increased recognition and help accelerate the field.
Review
Oppenländer, L. ; Palit, S. ; Stemmer, K. ; Greisle, T. ; Sterr, M. ; Salinno, C. ; Bastidas-Ponce, A. ; Feuchtinger, A. ; Böttcher, A. ; Ansarullah ; Theis, F.J. ; Lickert, H.
Mol. Metab. 54:101330 (2021)
While the effectiveness of bariatric surgery in restoring β-cell function has been described in type-2 diabetes (T2D) patients and animal models for years, the mechanistic underpinnings are largely unknown. The possibility of vertical sleeve gastrectomy (VSG) to rescue a clinically-relevant, late-stage T2D condition and to promote β-cell recovery has not been investigated on a single-cell level. Nevertheless, characterization of the heterogeneity and functional states of β-cells after VSG is a fundamental step to understand mechanisms of glycaemic recovery and to ultimately develop alternative, less-invasive therapies. Here, we report that VSG was superior to calorie restriction in late-stage T2D and rapidly restored normoglycaemia in morbidly obese and overt diabetic db/db mice. Single-cell profiling of islets of Langerhans showed that VSG induced distinct, intrinsic changes in the β-cell transcriptome, but not in that of α-, δ-, and PP-cells. VSG triggered fast β-cell redifferentiation and functional improvement within only two weeks of intervention, which is not seen upon calorie restriction. Furthermore, VSG expanded β-cell area by means of redifferentiation and by creating a proliferation competent β-cell state. Collectively, our study reveals the superiority of VSG in the remission of far-progressed T2D and presents paths of β-cell regeneration and molecular pathways underlying the glycaemic benefits of VSG.
Scientific Article
Verdun, C.M. ; Fuchs, T. ; Harar, P. ; Elbrächter, D. ; Fischer, D.S. ; Berner, J. ; Grohs, P. ; Theis, F.J. ; Krahmer, F.
Front. Publ. Health 9:583377 (2021)
Background: Due to the ongoing COVID-19 pandemic, demand for diagnostic testing has increased drastically, resulting in shortages of necessary materials to conduct the tests and overwhelming the capacity of testing laboratories. The supply scarcity and capacity limits affect test administration: priority must be given to hospitalized patients and symptomatic individuals, which can prevent the identification of asymptomatic and presymptomatic individuals and hence effective tracking and tracing policies. We describe optimized group testing strategies applicable to SARS-CoV-2 tests in scenarios tailored to the current COVID-19 pandemic and assess significant gains compared to individual testing. Methods: We account for biochemically realistic scenarios in the context of dilution effects on SARS-CoV-2 samples and consider evidence on specificity and sensitivity of PCR-based tests for the novel coronavirus. Because of the current uncertainty and the temporal and spatial changes in the prevalence regime, we provide analysis for several realistic scenarios and propose fast and reliable strategies for massive testing procedures. Key Findings: We find significant efficiency gaps between different group testing strategies in realistic scenarios for SARS-CoV-2 testing, highlighting the need for an informed decision of the pooling protocol depending on estimated prevalence, target specificity, and high- vs. low-risk population. For example, using one of the presented methods, all 1.47 million inhabitants of Munich, Germany, could be tested using only around 141 thousand tests if an infection rate below 0.4% is assumed. Using 1 million tests, the 6.69 million inhabitants from the city of Rio de Janeiro, Brazil, could be tested as long as the infection rate does not exceed 1%.
Moreover, we provide an interactive web application, available at www.grouptexting.com, for visualizing the different strategies and designing pooling schemes according to specific prevalence scenarios and test configurations. Interpretation: Altogether, this work may help provide a basis for an efficient upscaling of current testing procedures, which takes the population heterogeneity into account and is fine-grained towards the desired study populations, e.g., mild/asymptomatic individuals vs. symptomatic ones but also mixtures thereof. Funding: German Science Foundation (DFG), German Federal Ministry of Education and Research (BMBF), Chan Zuckerberg Initiative DAF, and Austrian Science Fund (FWF).
Scientific Article
Danese, A. ; Richter, M. ; Chaichoompu, K. ; Fischer, D.S. ; Theis, F.J. ; Colomé-Tatché, M.
Nat. Commun. 12:5228 (2021)
EpiScanpy is a toolkit for the analysis of single-cell epigenomic data, namely single-cell DNA methylation and single-cell ATAC-seq data. To address the modality specific challenges from epigenomics data, epiScanpy quantifies the epigenome using multiple feature space constructions and builds a nearest neighbour graph using epigenomic distance between cells. EpiScanpy makes the many existing scRNA-seq workflows from scanpy available to large-scale single-cell data from other -omics modalities, including methods for common clustering, dimension reduction, cell type identification and trajectory learning techniques, as well as an atlas integration tool for scATAC-seq datasets. The toolkit also features numerous useful downstream functions, such as differential methylation and differential openness calling, mapping epigenomic features of interest to their nearest gene, or constructing gene activity matrices using chromatin openness. We successfully benchmark epiScanpy against other scATAC-seq analysis tools and show its outperformance at discriminating cell types.
Scientific Article
Bergen, V. ; Soldatov, R.A. ; Kharchenko, P.V. ; Theis, F.J.
Mol. Syst. Biol. 17:e10282 (2021)
RNA velocity has enabled the recovery of directed dynamic information from single-cell transcriptomics by connecting measurements to the underlying kinetics of gene expression. This approach has opened up new ways of studying cellular dynamics. Here, we review the current state of RNA velocity modeling approaches, discuss various examples illustrating limitations and potential pitfalls, and provide guidance on how the ensuing challenges may be addressed. We then outline future directions on how to generalize the concept of RNA velocity to a wider variety of biological systems and modalities.
Review
Fischer, D.S. ; Dony, L. ; König, M. ; Moeed, A. ; Zappia, L. ; Heumos, L. ; Tritschler, S. ; Holmberg, O. ; Aliee, H. ; Theis, F.J.
Genome Biol. 22:248 (2021)
Single-cell RNA-seq datasets are often first analyzed independently without harnessing model fits from previous studies, and are then contextualized with public data sets, requiring time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public data sets paired with a model zoo for executable pre-trained models. The data zoo is designed to facilitate contribution of data sets using ontologies for metadata. We propose an adaption of cross-entropy loss for cell type classification tailored to datasets annotated at different levels of coarseness. We demonstrate the utility of sfaira by training models across anatomic data partitions on 8 million cells.
Scientific Article
Aliluev, A. ; Tritschler, S. ; Sterr, M. ; Oppenländer, L. ; Hinterdobler, J. ; Greisle, T. ; Irmler, M. ; Beckers, J. ; Sun, N. ; Walch, A.K. ; Stemmer, K. ; Kindt, A. ; Krumsiek, J. ; Tschöp, M.H. ; Luecken, M. ; Theis, F.J. ; Lickert, H. ; Böttcher, A.
Nat. Metab. 3, 1202-1216 (2021)
Excess nutrient uptake and altered hormone secretion in the gut contribute to a systemic energy imbalance, which causes obesity and an increased risk of type 2 diabetes and colorectal cancer. This functional maladaptation is thought to emerge at the level of the intestinal stem cells (ISCs). However, it is not clear how an obesogenic diet affects ISC identity and fate. Here we show that an obesogenic diet induces ISC and progenitor hyperproliferation, enhances ISC differentiation and cell turnover and changes the regional identities of ISCs and enterocytes in mice. Single-cell resolution of the enteroendocrine lineage reveals an increase in progenitors and peptidergic enteroendocrine cell types and a decrease in serotonergic enteroendocrine cell types. Mechanistically, we link increased fatty acid synthesis, Ppar signaling and the Insr-Igf1r-Akt pathway to mucosal changes. This study describes molecular mechanisms of diet-induced intestinal maladaptation that promote obesity and therefore underlie the pathogenesis of the metabolic syndrome and associated complications.
Scientific Article
Lotfollahi, M. ; Naghipourfar, M. ; Luecken, M. ; Khajavi, M. ; Büttner, M. ; Wagenstetter, M. ; Avsec, Z. ; Gayoso, A. ; Yosef, N. ; Interlandi, M. ; Rybakov, S. ; Misharin, A.V. ; Theis, F.J.
Nat. Biotechnol., DOI: 10.1038/s41587-021-01001-7 (2021)
Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introduce a deep learning strategy for mapping query datasets on top of a reference called single-cell architectural surgery (scArches). scArches uses transfer learning and parameter optimization to enable efficient, decentralized, iterative reference building and contextualization of new datasets with existing references without sharing raw data. Using examples from mouse brain, pancreas, immune and whole-organism atlases, we show that scArches preserves biological state information while removing batch effects, despite using four orders of magnitude fewer parameters than de novo integration. scArches generalizes to multimodal reference mapping, allowing imputation of missing modalities. Finally, scArches retains coronavirus disease 2019 (COVID-19) disease variation when mapping to a healthy reference, enabling the discovery of disease-specific cell states. scArches will facilitate collaborative projects by enabling iterative construction, updating, sharing and efficient use of reference atlases.
Scientific Article
Fischer, D.S. ; Ansari, M. ; Wagner, K.I. ; Jarosch, S. ; Huang, Y. ; Mayr, C. ; Strunz, M. ; Lang, N.J. ; D'Ippolito, E. ; Hammel, M. ; Mateyka, L. ; Weber, S. ; Wolff, L.S. ; Witter, K. ; Fernandez, I.E. ; Leuschner, G. ; Milger, K. ; Frankenberger, M. ; Nowak, L. ; Heinig-Menhard, K. ; Koch, I. ; Stoleriu, M.-G. ; Hilgendorff, A. ; Behr, J. ; Pichlmair, A. ; Schubert, B. ; Theis, F.J. ; Busch, D.H. ; Schiller, H. B. ; Schober, K.
Nat. Commun. 12:4515 (2021)
The in vivo phenotypic profile of T cells reactive to severe acute respiratory syndrome (SARS)-CoV-2 antigens remains poorly understood. Conventional methods to detect antigen-reactive T cells require in vitro antigenic re-stimulation or highly individualized peptide-human leukocyte antigen (pHLA) multimers. Here, we use single-cell RNA sequencing to identify and profile SARS-CoV-2-reactive T cells from Coronavirus Disease 2019 (COVID-19) patients. To do so, we induce transcriptional shifts by antigenic stimulation in vitro and take advantage of natural T cell receptor (TCR) sequences of clonally expanded T cells as barcodes for 'reverse phenotyping'. This allows identification of SARS-CoV-2-reactive TCRs and reveals phenotypic effects introduced by antigen-specific stimulation. We characterize transcriptional signatures of currently and previously activated SARS-CoV-2-reactive T cells, and show correspondence with phenotypes of T cells from the respiratory tract of patients with severe disease in the presence or absence of virus in independent cohorts. Reverse phenotyping is a powerful tool to provide an integrated insight into cellular states of SARS-CoV-2-reactive T cells across tissues and activation states.
Scientific Article
Aliee, H. ; Theis, F.J.
Cell Syst. 12, 706-715.e4 (2021)
Knowing cell-type proportions in a tissue is very important to identify which cells or cell types are targeted by a disease or perturbation. Hence, several deconvolution methods have been proposed to infer cell-type proportions from bulk RNA samples. Their performance with noisy reference profiles and closely correlated cell types highly depends on the set of genes undergoing deconvolution. In this work, we introduce AutoGeneS, a platform that automatically extracts discriminative genes and reveals the cellular heterogeneity of bulk RNA samples. AutoGeneS requires no prior knowledge about marker genes and selects genes by simultaneously optimizing multiple criteria: minimizing the correlation and maximizing the distance between cell types. AutoGeneS can be applied to reference profiles from various sources like single-cell experiments or sorted cell populations. Ground truth cell proportions analyzed by flow cytometry confirmed the accuracy of AutoGeneS in identifying cell-type proportions. AutoGeneS is available for use via a standalone Python package (https://github.com/theislab/AutoGeneS).
Scientific Article
Scheibner, K. ; Schirge, S. ; Burtscher, I. ; Büttner, M. ; Sterr, M. ; Yang, D. ; Böttcher, A. ; Ansarullah ; Irmler, M. ; Beckers, J. ; Cernilogar, F.M. ; Schotta, G. ; Theis, F.J. ; Lickert, H.
Nat. Cell Biol., DOI: 10.1038/s41556-021-00735-5 (2021)
In the version of this Article originally published, text referencing ATAC-seq data was incorrectly retained. References to ATAC-seq data, which are not included in this study, should be removed from the text in the Results sections ‘In vitro-generated definitive endoderm forms by partial EMT’ and ‘Foxa2 suppresses a complete EMT during endoderm formation’, as well as from the author contributions section. The Methods subsection ‘ChIP-seq and ATAC-seq data visualization’ should also be completely removed. The errors have been corrected.
Ditz, B. ; Boekhoudt, J.G. ; Aliee, H. ; Theis, F.J. ; Nawijn, M.C. ; Brandsma, C.A. ; Hiemstra, P.S. ; Timens, W. ; Tew, G.W. ; Grimbaldeston, M.A. ; Neighbors, M. ; Guryev, V. ; van den Berge, M. ; Faiz, A.
ERJ Open Res. 7:00104-2021 (2021)
More DEGs are detected by RNA-Seq than microarrays in COPD lung biopsies and are associated with immunological pathways. Performing bulk tissue cell-type deconvolution in microarray lung samples, using the SVR method, reflects RNA-Seq results. https://bit.ly/2N8sY3s.
Scientific Article
Ji, Y. ; Lotfollahi, M. ; Wolf, F.A. ; Theis, F.J.
Cell Syst. 12, 522-537 (2021)
Cell biology is fundamentally limited in its ability to collect complete data on cellular phenotypes and the wide range of responses to perturbation. Areas such as computer vision and speech recognition have addressed this problem of characterizing unseen or unlabeled conditions with the combined advances of big data, deep learning, and computing resources in the past 5 years. Similarly, recent advances in machine learning approaches enabled by single-cell data start to address prediction tasks in perturbation response modeling. We first define objectives in learning perturbation response in single-cell omics; survey existing approaches, resources, and datasets (https://github.com/theislab/sc-pert); and discuss how a perturbation atlas can enable deep learning models to construct an informative perturbation latent space. We then examine future avenues toward more powerful and explainable modeling using deep neural networks, which enable the integration of disparate information sources and an understanding of heterogeneous, complex, and unseen systems.
Review
Scheibner, K. ; Schirge, S. ; Burtscher, I. ; Büttner, M. ; Sterr, M. ; Yang, D. ; Böttcher, A. ; Ansarullah ; Irmler, M. ; Beckers, J. ; Cernilogar, F.M. ; Schotta, G. ; Theis, F.J. ; Lickert, H.
Nat. Cell Biol. 23, 692-703 (2021)
It is generally accepted that epiblast cells ingress into the primitive streak by epithelial-to-mesenchymal transition (EMT) to give rise to the mesoderm; however, it is less clear how the endoderm acquires an epithelial fate. Here, we used embryonic stem cell and mouse embryo knock‐in reporter systems to combine time-resolved lineage labelling with high-resolution single-cell transcriptomics. This allowed us to resolve the morphogenetic programs that segregate the mesoderm from the endoderm germ layer. Strikingly, while the mesoderm is formed by classical EMT, the endoderm is formed independent of the key EMT transcription factor Snail1 by mechanisms of epithelial cell plasticity. Importantly, forkhead box transcription factor A2 (Foxa2) acts as an epithelial gatekeeper and EMT suppressor to shield the endoderm from undergoing a mesenchymal transition. Altogether, these results not only establish the morphogenetic details of germ layer formation, but also have broader implications for stem cell differentiation and cancer metastasis.
Scientific Article
Warnat-Herresthal, S. ; Schultze, H. ; Shastry, K.L. ; Manamohan, S. ; Mukherjee, S. ; Garg, V. ; Sarveswara, R. ; Händler, K. ; Pickkers, P. ; Aziz, N.A. ; Ktena, S. ; Tran, F. ; Bitzer, M. ; Ossowski, S. ; Casadei, N. ; Herr, C. ; Petersheim, D. ; Behrends, U. ; Kern, F. ; Fehlmann, T. ; Schommers, P. ; Lehmann, C. ; Augustin, M. ; Rybniker, J. ; Altmüller, J. ; Mishra, N. ; Bernardes, J.P. ; Krämer, B.F. ; Bonaguro, L. ; Schulte-Schrepping, J. ; De Domenico, E. ; Siever, C. ; Kraut, M. ; Desai, M. ; Monnet, B. ; Saridaki, M. ; Siegel, C.M. ; Drews, A. ; Nuesch-Germano, M. ; Theis, H. ; Heyckendorf, J. ; Schreiber, S. ; Kim-Hellmuth, S. ; Nattermann, J. ; Skowasch, D. ; Kurth, I. ; Keller, A. ; Bals, R. ; Nürnberg, P. ; Rieß, O. ; Rosenstiel, P. ; Netea, M.G. ; Theis, F.J. ; Backes, M. ; Aschenbrenner, A.C. ; Ulas, T. ; Deutsche COVID-19 Omics Initiative (DeCOI) (De La Rosa Velázquez, I.A.) ; Breteler, M.M.B. ; Giamarellos-Bourboulis, E.J. ; Kox, M. ; Beck, M. ; Cheran, S. ; Woodacre, M.S. ; Lim Goh, E. ; Schultze, J.L.
Nature (2021)
Fast and reliable detection of patients with severe and heterogeneous illnesses is a major goal of precision medicine. Patients with leukaemia can be identified using machine learning on the basis of their blood transcriptomes. However, there is an increasing divide between what is technically possible and what is allowed, because of privacy legislation. Here, to facilitate the integration of any medical data from any data owner worldwide without violating privacy laws, we introduce Swarm Learning—a decentralized machine-learning approach that unites edge computing, blockchain-based peer-to-peer networking and coordination while maintaining confidentiality without the need for a central coordinator, thereby going beyond federated learning. To illustrate the feasibility of using Swarm Learning to develop disease classifiers using distributed data, we chose four use cases of heterogeneous diseases (COVID-19, tuberculosis, leukaemia and lung pathologies). With more than 16,400 blood transcriptomes derived from 127 clinical studies with non-uniform distributions of cases and controls and substantial study biases, as well as more than 95,000 chest X-ray images, we show that Swarm Learning classifiers outperform those developed at individual sites. In addition, Swarm Learning completely fulfils local confidentiality regulations by design. We believe that this approach will notably accelerate the introduction of precision medicine.
Scientific Article
Klinger, E. ; Motta, A. ; Marr, C. ; Theis, F.J. ; Helmstaedter, M.
Nat. Commun. 12:2785 (2021)
With the availability of cellular-resolution connectivity maps, connectomes, from the mammalian nervous system, it is in question how informative such massive connectomic data can be for the distinction of local circuit models in the mammalian cerebral cortex. Here, we investigated whether cellular-resolution connectomic data can in principle allow model discrimination for local circuit modules in layer 4 of mouse primary somatosensory cortex. We used approximate Bayesian model selection based on a set of simple connectome statistics to compute the posterior probability over proposed models given a to-be-measured connectome. We find that the distinction of the investigated local cortical models is faithfully possible based on purely structural connectomic data with an accuracy of more than 90%, and that such distinction is stable against substantial errors in the connectome measurement. Furthermore, mapping a fraction of only 10% of the local connectome is sufficient for connectome-based model distinction under realistic experimental constraints. Together, these results show for a concrete local circuit example that connectomic data allows model selection in the cerebral cortex and define the experimental strategy for obtaining such connectomic data.
Scientific Article
Böttcher, A. ; Büttner, M. ; Tritschler, S. ; Sterr, M. ; Aliluev, A. ; Oppenländer, L. ; Burtscher, I. ; Sass, S. ; Irmler, M. ; Beckers, J. ; Ziegenhain, C. ; Enard, W. ; Schamberger, A.C. ; Verhamme, F.M. ; Eickelberg, O. ; Theis, F.J. ; Lickert, H.
Nat. Cell Biol. 23, 566-576 (2021)
A Correction to this paper has been published: https://doi.org/10.1038/s41556-021-00667-0.
Türei, D. ; Valdeolivas, A. ; Gul, L. ; Palacio-Escat, N. ; Klein, M. ; Ivanova, O. ; Ölbei, M. ; Gábor, A. ; Theis, F.J. ; Módos, D. ; Korcsmáros, T. ; Saez-Rodriguez, J.
Mol. Syst. Biol. 17:e9923 (2021)
Molecular knowledge of biological processes is a cornerstone in omics data analysis. Applied to single-cell data, such analyses provide mechanistic insights into individual cells and their interactions. However, knowledge of intercellular communication is scarce, scattered across resources, and not linked to intracellular processes. To address this gap, we combined over 100 resources covering interactions and roles of proteins in inter- and intracellular signaling, as well as transcriptional and post-transcriptional regulation. We added protein complex information and annotations on function, localization, and role in diseases for each protein. The resource is available for human, and via homology translation for mouse and rat. The data are accessible via OmniPath's web service (https://omnipathdb.org/), a Cytoscape plug-in, and packages in R/Bioconductor and Python, providing access options for computational and experimental scientists. We created workflows with tutorials to facilitate the analysis of cell-cell interactions and affected downstream intracellular signaling processes. OmniPath provides a single access point to knowledge spanning intra- and intercellular processes for data analysis, as we demonstrate in applications studying SARS-CoV-2 infection and ulcerative colitis.
Scientific Article
Suwandhi, L. ; Altun, I. ; Karlina, R. ; Miok, V. ; Wiedemann, T. ; Fischer, D.S. ; Walzthoeni, T. ; Lindner, C. ; Böttcher, A. ; Heinzmann, S.S. ; Israel, A. ; Khalil, A. ; Braun, A. ; Pramme-Steinwachs, I. ; Burtscher, I. ; Schmitt-Kopplin, P. ; Heinig, M. ; Elsner, M. ; Lickert, H. ; Theis, F.J. ; Ussar, S.
Nat. Commun. 12:1588 (2021)
Adipose tissue expansion, as seen in obesity, is often metabolically detrimental, causing insulin resistance and the metabolic syndrome. However, white adipose tissue expansion at early ages is essential to establish a functional metabolism. To understand the differences between adolescent and adult adipose tissue expansion, we studied the cellular composition of the stromal vascular fraction of subcutaneous adipose tissue of two- and eight-week-old mice using single cell RNA sequencing. We identified a subset of adolescent preadipocytes expressing the mature white adipocyte marker Asc-1 that showed a low ability to differentiate into beige adipocytes compared to Asc-1 negative cells in vitro. Loss of Asc-1 in subcutaneous preadipocytes resulted in spontaneous differentiation of beige adipocytes in vitro and in vivo. Mechanistically, this was mediated by a function of the amino acid transporter ASC-1 specifically in proliferating preadipocytes involving the intracellular accumulation of the ASC-1 cargo D-serine.
Scientific Article
Weberpals, J. ; Becker, T. ; Davies, J. ; Schmich, F. ; Rüttinger, D. ; Theis, F.J. ; Bauer-Mehren, A.
Epidemiology 32, 378-388 (2021)
BACKGROUND: Due to the non-randomized nature of real-world data, prognostic factors need to be balanced, which is often done by propensity scores (PS). This study aimed to investigate whether autoencoders, which are unsupervised deep learning architectures, might be leveraged to compute PS. METHODS: We selected patient-level data of 128,368 first-line treated cancer patients from the Flatiron Health EHR-derived de-identified database. We trained an autoencoder architecture to learn a lower-dimensional patient representation, which we used to compute PS. To compare the performance of an autoencoder-based PS with established methods, we performed a simulation study. We assessed the balancing and adjustment performance using standardized mean differences (SMD), root-mean-square-errors (RMSE), percent bias and confidence interval (CI) coverage. To illustrate the application of the autoencoder-based PS, we emulated the PRONOUNCE trial by applying the trial's protocol elements within an observational database setting, comparing two chemotherapy regimens. RESULTS: All methods but the manual variable selection approach led to well-balanced cohorts with average SMDs <0.1. LASSO yielded on average the lowest deviation of resulting estimates (RMSE 0.0205) followed by the autoencoder approach (RMSE 0.0248). Altering the hyperparameter setup in sensitivity analysis, the autoencoder approach led to similar results as LASSO (RMSE 0.0203 and 0.0205, respectively). In the case study, all methods provided a similar conclusion with point estimates clustered around the null (e.g. HRautoencoder 1.01 [95% CI 0.80-1.27] vs. HRPRONOUNCE 1.07 [0.83-1.36]). INTERPRETATION: Autoencoder-based PS computation was a feasible approach to control for confounding but did not perform better than some established approaches like LASSO.
Scientific Article
Salinno, C. ; Büttner, M. ; Cota, P. ; Tritschler, S. ; Tarquis-Medina, M. ; Bastidas-Ponce, A. ; Scheibner, K. ; Burtscher, I. ; Böttcher, A. ; Theis, F.J. ; Bakhti, M. ; Lickert, H.
Mol. Metab. 49:101188 (2021)
OBJECTIVE: Islets of Langerhans contain heterogeneous populations of insulin-producing β-cells. Surface markers and respective antibodies for isolation, tracking, and analysis are urgently needed to study β-cell heterogeneity and explore the mechanisms to harness the regenerative potential of immature β-cells. METHODS: We performed single-cell mRNA profiling of early postnatal mouse islets and re-analyzed several single-cell mRNA sequencing datasets from mouse and human pancreas and islets. We used mouse primary islets, iPSC-derived endocrine cells, Min6 insulinoma, and human EndoC-βH1 β-cell lines and performed FAC sorting, Western blotting, and imaging to support and complement the findings from the data analyses. RESULTS: We found that all endocrine cell types expressed the cluster of differentiation 81 (CD81) during pancreas development, but the expression levels of this protein were gradually reduced in β-cells during postnatal maturation. Single-cell gene expression profiling and high-resolution imaging revealed an immature signature of β-cells expressing high levels of CD81 (CD81high) compared to a more mature population expressing no or low levels of this protein (CD81low/-). Analysis of β-cells from different diabetic mouse models and in vitro β-cell stress assays indicated an upregulation of CD81 expression levels in stressed and dedifferentiated β-cells. Similarly, CD81 was upregulated and marked stressed human β-cells in vitro. CONCLUSIONS: We identified CD81 as a novel surface marker that labels immature, stressed, and dedifferentiated β-cells in the adult mouse and human islets. This novel surface marker will allow us to better study β-cell heterogeneity in healthy subjects and diabetes progression.
Scientific Article
Mayr, C. ; Simon, L. ; Leuschner, G. ; Ansari, M. ; Schniering, J. ; Geyer, P.E. ; Angelidis, I. ; Strunz, M. ; Singh, P. ; Kneidinger, N. ; Reichenberger, F. ; Silbernagel, E. ; Böhm, S. ; Adler, H. ; Lindner, M. ; Maurer, B. ; Hilgendorff, A. ; Prasse, A. ; Behr, J. ; Mann, M. ; Eickelberg, O. ; Theis, F.J. ; Schiller, H. B.
EMBO Mol. Med. 13:e12871 (2021)
The correspondence of cell state changes in diseased organs to peripheral protein signatures is currently unknown. Here, we generated and integrated single-cell transcriptomic and proteomic data from multiple large pulmonary fibrosis patient cohorts. Integration of 233,638 single-cell transcriptomes (n = 61) across three independent cohorts enabled us to derive shifts in cell type proportions and a robust core set of genes altered in lung fibrosis for 45 cell types. Mass spectrometry analysis of lung lavage fluid (n = 124) and plasma (n = 141) proteomes identified distinct protein signatures correlated with diagnosis, lung function, and injury status. A novel SSTR2+ pericyte state correlated with disease severity and was reflected in lavage fluid by increased levels of the complement regulatory factor CFHR1. We further discovered CRTAC1 as a biomarker of alveolar type-2 epithelial cell health status in lavage fluid and plasma. Using cross-modal analysis and machine learning, we identified the cellular source of biomarkers and demonstrated that information transfer between modalities correctly predicts disease status, suggesting feasibility of clinical cell state monitoring through longitudinal sampling of body fluid proteomes.
Scientific Article
Conlon, T.M. ; John-Schuster, G. ; Heide, D. ; Pfister, D. ; Lehmann, M. ; Hu, Y. ; Ertüz, Z. ; López, M.A. ; Ansari, M. ; Strunz, M. ; Mayr, C. ; Angelidis, I. ; Ciminieri, C. ; Costa, R. ; Kohlhepp, M.S. ; Guillot, A. ; Güneş, G. ; Jeridi, A. ; Funk, M.C. ; Beroshvili, G. ; Prokosch, S. ; Hetzer, J. ; Verleden, S.E. ; Alsafadi, H.N. ; Lindner, M. ; Burgstaller, G. ; Becker, L. ; Irmler, M. ; Dudek, M. ; Janzen, J. ; Goffin, E. ; Gosens, R. ; Knolle, P. ; Pirotte, B. ; Stöger, T. ; Beckers, J. ; Wagner, D.E. ; Singh, I. ; Theis, F.J. ; Hrabě de Angelis, M. ; O’Connor, T. ; Tacke, F. ; Boutros, M. ; Dejardin, E. ; Eickelberg, O. ; Schiller, H. B. ; Königshoff, M. ; Heikenwalder, M. ; Yildirim, A.Ö.
Nature 589, E6 (2021)
In the HTML version of this Article, owing to a typesetting error, the affiliations for author Indrabahadur Singh were incorrect. The correct affiliation is ‘Emmy Noether Research Group Epigenetic Machineries and Cancer, Division of Chronic Inflammation and Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany’. The PDF and print versions of the Article are correct. In addition, Ilias Angelidis should have been listed as an author, with the affiliation: ‘Comprehensive Pneumology Center (CPC), Institute of Lung Biology and Disease, Helmholtz Zentrum München, Member of the German Center for Lung Research (DZL), Neuherberg, Germany’. They designed, undertook, and analysed scRNA-seq experiments, and analysed and interpreted data (see ‘Author contributions’). Finally, in the original Article, authors Mathias Heikenwalder and Ali Önder Yildirim were listed as ‘jointly supervising’ authors instead of ‘equally contributing’ authors, alongside authors Thomas M. Conlon and Gerrit John-Schuster. The original Article has been corrected online.
Rajewsky, N. ; Almouzni, G. ; Gorski, S.A. ; Aerts, S. ; Amit, I. ; Bertero, M.G. ; Bock, C. ; Bredenoord, A.L. ; Cavalli, G. ; Chiocca, S. ; Clevers, H. ; de Strooper, B. ; Eggert, A. ; Ellenberg, J. ; Fernández, X.M. ; Figlerowicz, M. ; Gasser, S.M. ; Hubner, N. ; Kjems, J. ; Knoblich, J.A. ; Krabbe, G. ; Lichter, P. ; Linnarsson, S. ; Marine, J.C. ; Marioni, J.C. ; Marti-Renom, M.A. ; Netea, M.G. ; Nickel, D. ; Nollmann, M. ; Novak, H.R. ; Parkinson, H. ; Piccolo, S. ; Pinheiro, I. ; Pombo, A. ; Popp, C. ; Reik, W. ; Roman-Roman, S. ; Rosenstiel, P. ; Schultze, J.L. ; Stegle, O. ; Tanay, A. ; Testa, G. ; Thanos, D. ; Theis, F.J. ; Torres-Padilla, M.E. ; Valencia, A. ; Vallot, C. ; van Oudenaarden, A. ; Vidal, M. ; Voet, T. ; LifeTime Community (Schiller, H. B. ; Ziegler, A.-G.)
Nature 592:E8 (2021)
In this Perspective, owing to an error in the HTML, the surname of author Alejandro López-Tobón of the LifeTime Community Working Groups consortium was indexed as ‘Tobon’ rather than ‘López-Tobón’ and the accents were missing. The HTML version of the original Perspective has been corrected; the PDF and print versions were always correct. *A list of authors and their affiliations appears online.
Lopez, J.P. ; Brivio, E. ; Santambrogio, A. ; De Donno, C. ; Kos, A. ; Peters, M. ; Rost, N. ; Czamara, D. ; Brückl, T.M. ; Roeh, S. ; Pöhlmann, M.L. ; Engelhardt, C. ; Ressle, A. ; Stoffel, R. ; Tontsch, A. ; Villamizar, J.M. ; Reincke, M. ; Riester, A. ; Sbiera, S. ; Fassnacht, M. ; Mayberg, H.S. ; Craighead, W.E. ; Dunlop, B.W. ; Nemeroff, C.B. ; Schmidt, M.V. ; Binder, E.B. ; Theis, F.J. ; Beuschlein, F. ; Andoniadou, C.L. ; Chen, A.
Sci. Adv. 7:eabe4497 (2021)
Chronic activation and dysregulation of the neuroendocrine stress response have severe physiological and psychological consequences, including the development of metabolic and stress-related psychiatric disorders. We provide the first unbiased, cell type-specific, molecular characterization of all three components of the hypothalamic-pituitary-adrenal axis, under baseline and chronic stress conditions. Among others, we identified a previously unreported subpopulation of Abcb1b+ cells involved in stress adaptation in the adrenal gland. We validated our findings in a mouse stress model, adrenal tissues from patients with Cushing's syndrome, adrenocortical cell lines, and peripheral cortisol and genotyping data from depressed patients. This extensive dataset provides a valuable resource for researchers and clinicians interested in the organism's nervous and endocrine responses to stress and the interplay between these tissues. Our findings raise the possibility that modulating ABCB1 function may be important in the development of treatment strategies for patients suffering from metabolic and stress-related psychiatric disorders.
Scientific Article
Meier, F. ; Köhler, N. ; Brunner, A.D. ; Wanka, J.-M.H. ; Voytik, E. ; Strauss, M.T. ; Theis, F.J. ; Mann, M.
Nat. Commun. 12:1185 (2021)
The size and shape of peptide ions in the gas phase are an under-explored dimension for mass spectrometry-based proteomics. To investigate the nature and utility of the peptide collisional cross section (CCS) space, we measure more than a million data points from whole-proteome digests of five organisms with trapped ion mobility spectrometry (TIMS) and parallel accumulation-serial fragmentation (PASEF). The scale and precision (CV < 1%) of our data is sufficient to train a deep recurrent neural network that accurately predicts CCS values solely based on the peptide sequence. Cross section predictions for the synthetic ProteomeTools peptides validate the model within a 1.4% median relative error (R > 0.99). Hydrophobicity, proportion of prolines and position of histidines are main determinants of the cross sections in addition to sequence-specific interactions. CCS values can now be predicted for any peptide and organism, forming a basis for advanced proteomics workflows that make full use of the additional information.
Scientific Article
Thomas, J. ; Wang, R. ; Batra, R. ; Böhner, A. ; Garzorz-Stark, N. ; Eberlein, B. ; Theis, F.J. ; Biedermann, T. ; Schmidt-Weber, C.B. ; Zink, A. ; Eyerich, K. ; Eyerich, S.
J. Invest. Dermatol. 141, 681-685.e6 (2021)
Scientific Article
Kunze, S. ; Cecil, A. ; Prehn, C. ; Möller, G. ; Ohlmann, A. ; Wildner, G. ; Thurau, S. ; Unger, K. ; Rößler, U. ; Hölter, S.M. ; Tapio, S. ; Wagner, F. ; Beyerlein, A. ; Theis, F.J. ; Zitzelsberger, H. ; Kulka, U. ; Adamski, J. ; Graw, J. ; Dalke, C.
Int. J. Radiat. Biol. 97, 529-540 (2021)
PURPOSE: The long-term effect of low and moderate doses of ionizing radiation on the lens is still a matter of debate and needs to be evaluated in more detail. MATERIAL AND METHODS: We conducted a detailed histological analysis of eyes from B6C3F1 mice cohorts after acute gamma irradiation (60Co source; 0.063 Gy/min) at young adult age of 10 weeks with doses of 0.063, 0.125 and 0.5 Gy. Sham irradiated (0 Gy) mice were used as controls. To test for genetic susceptibility, heterozygous Ercc2 mutant mice were used and compared to wild type mice of the same strain background. Mice of both sexes were included in all cohorts. Eyes were collected 4 hours, 12, 18 and 24 months after irradiation. For a better understanding of the underlying mechanisms, metabolomics analyses were performed in lenses and plasma samples of the same mouse cohorts at 4 and 12 hours as well as 12, 18 and 24 months after irradiation. For this purpose, a targeted analysis was chosen. RESULTS: This analysis revealed histological changes particularly in the posterior part of the lens that rarely can be observed by using Scheimpflug imaging, as we reported previously. We detected a significant increase of posterior subcapsular cataracts 18 and 24 months after irradiation with 0.5 Gy (odds ratio 9.3; 95% confidence interval 2.1-41.3) independent of sex and genotype. Doses below 0.5 Gy (i.e. 0.063 and 0.125 Gy) did not significantly increase the frequency of posterior subcapsular cataracts at any time point. In lenses, we observed a clear effect of sex and aging but not of irradiation or genotype, whereas metabolomics analyses of plasma from the same mice showed only a sex effect. CONCLUSIONS: This paper demonstrates a significant radiation-induced increase in the incidence of posterior subcapsular cataracts, which could not be identified using Scheimpflug imaging as the only diagnostic tool.
Scientific Article
Böttcher, A. ; Büttner, M. ; Tritschler, S. ; Sterr, M. ; Aliluev, A. ; Oppenländer, L. ; Burtscher, I. ; Sass, S. ; Irmler, M. ; Beckers, J. ; Ziegenhain, C. ; Enard, W. ; Schamberger, A.C. ; Verhamme, F.M. ; Eickelberg, O. ; Theis, F.J. ; Lickert, H.
Nat. Cell Biol. 23, 23-31 (2021)
A detailed understanding of intestinal stem cell (ISC) self-renewal and differentiation is required to treat chronic intestinal diseases. However, the different models of ISC lineage hierarchy (1–6) and segregation (7–12) are subject to debate. Here, we have discovered non-canonical Wnt/planar cell polarity (PCP)-activated ISCs that are primed towards the enteroendocrine or Paneth cell lineage. Strikingly, integration of time-resolved lineage labelling with single-cell gene expression analysis revealed that both lineages are directly recruited from ISCs via unipotent transition states, challenging the existence of formerly predicted bi- or multipotent secretory progenitors (7–12). Transitory cells that mature into Paneth cells are quiescent and express both stem cell and secretory lineage genes, indicating that these cells are the previously described Lgr5+ label-retaining cells (7). Finally, Wnt/PCP-activated Lgr5+ ISCs are molecularly indistinguishable from Wnt/β-catenin-activated Lgr5+ ISCs, suggesting that lineage priming and cell-cycle exit is triggered at the post-transcriptional level by polarity cues and a switch from canonical to non-canonical Wnt/PCP signalling. Taken together, we redefine the mechanisms underlying ISC lineage hierarchy and identify the Wnt/PCP pathway as a new niche signal preceding lateral inhibition in ISC lineage priming and segregation.
Scientific Article
Karlina, R. ; Lutter, D. ; Miok, V. ; Fischer, D.S. ; Altun, I. ; Schöttl, T. ; Schorpp, K.K. ; Israel, A. ; Cero, C. ; Johnson, J.W. ; Kapser-Fischer, I. ; Böttcher, A. ; Keipert, S. ; Feuchtinger, A. ; Graf, E. ; Strom, T.M. ; Walch, A.K. ; Lickert, H. ; Walzthoeni, T. ; Heinig, M. ; Theis, F.J. ; García-Cáceres, C. ; Cypess, A.M. ; Ussar, S.
Life Sci. All. 4:e202000924 (2021)
Brown adipose tissue (BAT) plays an important role in the regulation of body weight and glucose homeostasis. Although increasing evidence supports white adipose tissue heterogeneity, little is known about heterogeneity within murine BAT. Recently, UCP1 high and low expressing brown adipocytes were identified, but a developmental origin of these subtypes has not been studied. To obtain more insights into brown preadipocyte heterogeneity, we use single-cell RNA sequencing of the BAT stromal vascular fraction of C57/BL6 mice and characterize brown preadipocyte and adipocyte clonal cell lines. Statistical analysis of gene expression profiles from brown preadipocyte and adipocyte clones identifies markers distinguishing brown adipocyte subtypes. We confirm the presence of distinct brown adipocyte populations in vivo using the markers EIF5, TCF25, and BIN1. We also demonstrate that loss of Bin1 enhances UCP1 expression and mitochondrial respiration, suggesting that BIN1 marks dormant brown adipocytes. The existence of multiple brown adipocyte subtypes suggests distinct functional properties of BAT depending on its cellular composition, with potentially distinct functions in thermogenesis and the regulation of whole body energy homeostasis.
Scientific Article
Krautenbacher, N. ; Kabesch, M. ; Horak, E. ; Braun-Fahrländer, C. ; Genuneit, J. ; Boznanski, A. ; von Mutius, E. ; Theis, F.J. ; Fuchs, C. ; Ege, M.J. ; GABRIELA, PASTURE study groups
Pediatr. Allergy Immunol. 32, 295-304 (2021)
Background: The asthma syndrome is influenced by hereditary and environmental factors. With the example of farm exposure, we study whether genetic and environmental factors interact for asthma. Methods: Statistical learning approaches based on penalized regression and decision trees were used to predict asthma in the GABRIELA study with 850 cases (9% farm children) and 857 controls (14% farm children). Single-nucleotide polymorphisms (SNPs) were selected from a genome-wide dataset based on a literature search or by statistical selection techniques. Prediction was assessed by receiver operating characteristics (ROC) curves and validated in the PASTURE cohort. Results: Prediction by family history of asthma and atopy yielded an area under the ROC curve (AUC) of 0.62 [0.57-0.66] in the random forest machine learning approach. By adding information on demographics (sex and age) and 26 environmental exposure variables, the quality of prediction significantly improved (AUC = 0.65 [0.61-0.70]). In farm children, however, environmental variables did not improve prediction quality. Rather SNPs related to IL33 and RAD50 contributed significantly to the prediction of asthma (AUC = 0.70 [0.62-0.78]). Conclusions: Asthma in farm children is more likely predicted by other factors as compared to non-farm children though in both forms, family history may integrate environmental exposure, genotype and degree of penetrance.
Scientific Article
2020
Radon, K. ; Saathoff, E. ; Pritsch, M. ; Guggenbühl Noller, J.M. ; Kroidl, I. ; Olbrich, L. ; Thiel, V. ; Diefenbach, M. ; Riess, F. ; Förster, F. ; Theis, F.J. ; Wieser, A. ; Hoelscher, M. ; the KoCo19 collaboration group (Hasenauer, J. ; Fuchs, C. ; Castelletti, N. ; Zeggini, E. ; Laxy, M. ; Leidl, R. ; Schwettmann, L.)
BMC Public Health 20:1335 (2020)
An amendment to this paper has been published and can be accessed via the original article.
Chlis, N.-K. ; Rausch, L. ; Brocker, T. ; Kranich, J. ; Theis, F.J.
Nucleic Acids Res. 48, 11335-11346 (2020)
High-content imaging and single-cell genomics are two of the most prominent high-throughput technologies for studying cellular properties and functions at scale. Recent studies have demonstrated that information in large imaging datasets can be used to estimate gene mutations and to predict the cell-cycle state and the cellular decision making directly from cellular morphology. Thus, high-throughput imaging methodologies, such as imaging flow cytometry, can potentially aim beyond simple sorting of cell populations. We introduce IFC-seq, a machine learning methodology for predicting the expression profile of every cell in an imaging flow cytometry experiment. Since it is to date unfeasible to observe single-cell gene expression and morphology in flow, we integrate uncoupled imaging data with an independent transcriptomics dataset by leveraging common surface markers. We demonstrate that IFC-seq successfully models gene expression of a moderate number of key gene-markers for two independent imaging flow cytometry datasets: (i) human blood mononuclear cells and (ii) mouse myeloid progenitor cells. In the case of mouse myeloid progenitor cells IFC-seq can predict gene expression directly from brightfield images in a label-free manner, using a convolutional neural network. The proposed method promises to add gene expression information to existing and new imaging flow cytometry datasets, at no additional cost.
Scientific Article
Lotfollahi, M. ; Naghipourfar, M. ; Theis, F.J. ; Wolf, F.A.
Bioinformatics 36, 2, i610-i617 (2020)
MOTIVATION: While generative models have shown great success in sampling high-dimensional samples conditional on low-dimensional descriptors (stroke thickness in MNIST, hair color in CelebA, speaker identity in WaveNet), their generation out-of-distribution poses fundamental problems due to the difficulty of learning a compact joint distribution across conditions. The canonical example of the conditional variational autoencoder (CVAE), for instance, does not explicitly relate conditions during training and, hence, has no explicit incentive of learning such a compact representation. RESULTS: We overcome the limitation of the CVAE by matching distributions across conditions using maximum mean discrepancy in the decoder layer that follows the bottleneck. This introduces a strong regularization both for reconstructing samples within the same condition and for transforming samples across conditions, resulting in much improved generalization. As this amounts to solving a style-transfer problem, we refer to the model as transfer VAE (trVAE). Benchmarking trVAE on high-dimensional image and single-cell RNA-seq data, we demonstrate higher robustness and higher accuracy than existing approaches. We also show qualitatively improved predictions by tackling previously problematic minority classes and multiple conditions in the context of cellular perturbation response to treatment and disease based on high-dimensional single-cell gene expression data. For generic tasks, we improve Pearson correlations of high-dimensional estimated means and variances with their ground truths from 0.89 to 0.97 and 0.75 to 0.87, respectively. We further demonstrate that trVAE learns cell-type-specific responses after perturbation and improves the prediction of most cell-type-specific genes by 65%. AVAILABILITY AND IMPLEMENTATION: The trVAE implementation is available via github.com/theislab/trvae. The results of this article can be reproduced via github.com/theislab/trvae_reproducibility.
Scientific Article
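The trVAE abstract above relies on maximum mean discrepancy (MMD) to match distributions across conditions. As an illustration only, the following minimal sketch computes a biased squared-MMD estimate between two sample sets with a Gaussian kernel. It is not the trVAE implementation: the function names `rbf_kernel` and `mmd2`, the single-bandwidth kernel, and the parameter `gamma` are assumptions made for this example.

```python
import numpy as np


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-gamma * ||x_i - y_j||^2)."""
    # Pairwise squared Euclidean distances via the expansion of ||x - y||^2.
    sq_dist = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared MMD between sample sets x and y.

    x and y are arrays of shape (n_samples, n_features). The bandwidth
    `gamma` is a hypothetical illustration parameter.
    """
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy
```

Used as a penalty, a term of this kind is small when two sets of representations are similarly distributed and grows as they drift apart, which is the property the abstract exploits for regularization across conditions.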
Kranich, J. ; Chlis, N.-K. ; Rausch, L. ; Latha, A. ; Schifferer, M. ; Kurz, T. ; Foltyn-Arfa Kia, A. ; Simons, M. ; Theis, F.J. ; Brocker, T.
J. Extra. Vesicles 9:1792683 (2020)
The in vivo detection of dead cells remains a major challenge due to technical hurdles. Here, we present a novel method, where injection of fluorescent milk fat globule-EGF factor 8 protein (MFG-E8) in vivo combined with imaging flow cytometry and deep learning allows the identification of dead cells based on their surface exposure of phosphatidylserine (PS) and other image parameters. A convolutional autoencoder (CAE) was trained on defined pictures and successfully used to identify apoptotic cells in vivo. However, unexpectedly, these analyses also revealed that the great majority of PS+ cells were not apoptotic, but rather live cells associated with PS+ extracellular vesicles (EVs). During acute viral infection apoptotic cells increased slightly, while up to 30% of lymphocytes were decorated with PS+ EVs of antigen-presenting cell (APC) exosomal origin. The combination of recombinant fluorescent MFG-E8 and the CAE-method will greatly facilitate analyses of cell death and EVs in vivo.
Scientific Article
Chlis, N.-K. ; Karlas, A. ; Fasoula, N.-A. ; Kallmayer, M. ; Eckstein, H.H. ; Theis, F.J. ; Ntziachristos, V. ; Marr, C.
Photoacoustics 20:100203 (2020)
Multispectral Optoacoustic Tomography (MSOT) resolves oxy- (HbO2) and deoxy-hemoglobin (Hb) to perform vascular imaging. MSOT suffers from gradual signal attenuation with depth due to light-tissue interactions: an effect that hinders the precise manual segmentation of vessels. Furthermore, vascular assessment requires functional tests, which last several minutes and result in recording thousands of images. Here, we introduce a deep learning approach with a sparse-UNET (S-UNET) for automatic vascular segmentation in MSOT images to avoid the rigorous and time-consuming manual segmentation. We evaluated the S-UNET on a test set of 33 images, achieving a median DICE score of 0.88. Apart from high segmentation performance, our method based its decision on two wavelengths with physical meaning for the task at hand: 850 nm (peak absorption of oxy-hemoglobin) and 810 nm (isosbestic point of oxy- and deoxy-hemoglobin). Thus, our approach achieves precise data-driven vascular segmentation for automated vascular assessment and may boost MSOT further towards its clinical translation.
Scientific Article
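The S-UNET abstract above reports a median DICE score over a test set of segmented images. For readers unfamiliar with the metric, here is a minimal sketch of how a Dice coefficient between two binary masks and its median over several images could be computed. The function names and the convention for two empty masks are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks are empty; treating this as perfect agreement is a
        # convention assumed here, not something the source specifies.
        return 1.0
    return 2.0 * intersection / denom


def median_dice(preds, targets):
    """Median Dice coefficient over paired lists of masks."""
    return float(np.median([dice_score(p, t) for p, t in zip(preds, targets)]))
```

A score of 1 means the predicted and reference masks coincide exactly, and 0 means they share no foreground pixels.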
Ansari, M. ; Fischer, D.S. ; Theis, F.J.
Lect. Notes Comput. Sc. 12396 LNCS, 105-114 (2020)
Technological advances in the last decade resulted in an explosion of biological data. Sequencing methods in particular provide large-scale data sets as a resource for incorporation of machine learning in the biological field. By measuring DNA accessibility for instance, enzymatic hypersensitivity assays facilitate identification of regions of open chromatin in the genome, marking potential locations of regulatory elements. ATAC-seq is the primary method of choice to determine these footprints. It allows measurements on the cellular level, complementing the recent progress in single cell transcriptomics. However, as the method-specific enzymes tend to bind preferentially to certain sequences, the accessibility profile is confounded by binding specificity. The inference of open chromatin should be adjusted for this bias [1]. To enable such corrections, we built a deep learning model that learns the sequence specificity of ATAC-seq’s enzyme Tn5 on naked DNA. We found binding preferences and demonstrate that cleavage patterns specific to Tn5 can successfully be discovered by means of convolutional neural networks. Such models can be combined with accessibility analysis in the future in order to predict bias on new sequences and furthermore provide a better picture of the regulatory landscape of the genome.
Scientific Article
Holmberg, O. ; Köhler, N. ; Martins, T. ; Siedlecki, J. ; Herold, T. ; Keidel, L. ; Asani, B. ; Schiefelbein, J. ; Priglinger, S. ; Kortuem, K.U. ; Theis, F.J.
Nat. Mach. Intell. 2, 719-726 (2020)
Access to large, annotated samples represents a considerable challenge for training accurate deep-learning models in medical imaging. Although at present transfer learning from pre-trained models can help with cases lacking data, this limits design choices and generally results in the use of unnecessarily large models. Here we propose a self-supervised training scheme for obtaining high-quality, pre-trained networks from unlabelled, cross-modal medical imaging data, which will allow the creation of accurate and efficient models. We demonstrate the utility of the scheme by accurately predicting retinal thickness measurements based on optical coherence tomography from simple infrared fundus images. Subsequently, learned representations outperformed advanced classifiers on a separate diabetic retinopathy classification task in a scenario of scarce training data. Our cross-modal, three-stage scheme effectively replaced 26,343 diabetic retinopathy annotations with 1,009 semantic segmentations on optical coherence tomography and reached the same classification accuracy using only 25% of fundus images, without any drawbacks, since optical coherence tomography is not required for predictions. We expect this concept to apply to other multimodal clinical imaging, health records and genomics data, and to corresponding sample-starved learning problems.
Scientific Article
Conlon, T.M. ; John-Schuster, G. ; Heide, D. ; Pfister, D. ; Lehmann, M. ; Hu, Y. ; Ertüz, Z. ; López, M.A. ; Ansari, M. ; Strunz, M. ; Mayr, C. ; Ciminieri, C. ; Costa, R. ; Kohlhepp, M.S. ; Guillot, A. ; Güneş, G. ; Jeridi, A. ; Funk, M.C. ; Beroshvili, G. ; Prokosch, S. ; Hetzer, J. ; Verleden, S.E. ; Alsafadi, H.N. ; Lindner, M. ; Burgstaller, G. ; Becker, L. ; Irmler, M. ; Dudek, M. ; Janzen, J. ; Goffin, E. ; Gosens, R. ; Knolle, P. ; Pirotte, B. ; Stöger, T. ; Beckers, J. ; Wagner, D.E. ; Singh, I. ; Theis, F.J. ; Hrabě de Angelis, M. ; O’Connor, T. ; Tacke, F. ; Boutros, M. ; Dejardin, E. ; Eickelberg, O. ; Schiller, H. B. ; Königshoff, M. ; Heikenwalder, M. ; Yildirim, A.Ö.
Nature 588, 151–156 (2020)
Blockade of lymphotoxin β-receptor (LTβR) signalling restores WNT signalling and epithelial repair in a model of chronic obstructive pulmonary disease. Lymphotoxin β-receptor (LTβR) signalling promotes lymphoid neogenesis and the development of tertiary lymphoid structures (1,2), which are associated with severe chronic inflammatory diseases that span several organ systems (3-6). How LTβR signalling drives chronic tissue damage particularly in the lung, the mechanism(s) that regulate this process, and whether LTβR blockade might be of therapeutic value have remained unclear. Here we demonstrate increased expression of LTβR ligands in adaptive and innate immune cells, enhanced non-canonical NF-κB signalling, and enriched LTβR target gene expression in lung epithelial cells from patients with smoking-associated chronic obstructive pulmonary disease (COPD) and from mice chronically exposed to cigarette smoke. Therapeutic inhibition of LTβR signalling in young and aged mice disrupted smoking-related inducible bronchus-associated lymphoid tissue, induced regeneration of lung tissue, and reverted airway fibrosis and systemic muscle wasting. Mechanistically, blockade of LTβR signalling dampened epithelial non-canonical activation of NF-κB, reduced TGFβ signalling in airways, and induced regeneration by preventing epithelial cell death and activating WNT/β-catenin signalling in alveolar epithelial progenitor cells. These findings suggest that inhibition of LTβR signalling represents a viable therapeutic option that combines prevention of tertiary lymphoid structures (1) and inhibition of apoptosis with tissue-regenerative strategies.
Scientific Article
Rajewsky, N. ; Almouzni, G. ; Gorski, S.A. ; Aerts, S. ; Amit, I. ; Bertero, M.G. ; Bock, C. ; Bredenoord, A.L. ; Cavalli, G. ; Chiocca, S. ; Clevers, H. ; de Strooper, B. ; Eggert, A. ; Ellenberg, J. ; Fernández, X.M. ; Figlerowicz, M. ; Gasser, S.M. ; Hubner, N. ; Kjems, J. ; Knoblich, J.A. ; Krabbe, G. ; Lichter, P. ; Linnarsson, S. ; Marine, J.C. ; Marioni, J. ; Marti-Renom, M.A. ; Netea, M.G. ; Nickel, D. ; Nollmann, M. ; Novak, H.R. ; Parkinson, H. ; Piccolo, S. ; Pinheiro, I. ; Pombo, A. ; Popp, C. ; Reik, W. ; Roman-Roman, S. ; Rosenstiel, P. ; Schultze, J.L. ; Stegle, O. ; Tanay, A. ; Testa, G. ; Thanos, D. ; Theis, F.J. ; Torres-Padilla, M.E. ; Valencia, A. ; Vallot, C. ; van Oudenaarden, A. ; Vidal, M. ; Voet, T. ; LifeTime Community (Schiller, H. B.) ; LifeTime Community (Ziegler, A.-G.)
Nature 587, 377–386 (2020)
Here we describe the LifeTime Initiative, which aims to track, understand and target human cells during the onset and progression of complex diseases, and to analyse their response to therapy at single-cell resolution. This mission will be implemented through the development, integration and application of single-cell multi-omics and imaging, artificial intelligence and patient-derived experimental disease models during the progression from health to disease. The analysis of large molecular and clinical datasets will identify molecular mechanisms, create predictive computational models of disease progression, and reveal new drug targets and therapies. The timely detection and interception of disease embedded in an ethical and patient-centred vision will be achieved through interactions across academia, hospitals, patient associations, health data management systems and industry. The application of this strategy to key medical challenges in cancer, neurological and neuropsychiatric disorders, and infectious, chronic inflammatory and cardiovascular diseases at the single-cell level will usher in cell-based interceptive medicine in Europe over the next decade. The LifeTime initiative is an ambitious, multidisciplinary programme that aims to improve healthcare by tracking individual human cells during disease processes and responses to treatment in order to develop and implement cell-based interceptive medicine in Europe.
Review
Fischer, D.S. ; Wu, Y. ; Schubert, B. ; Theis, F.J.
Mol. Syst. Biol. 16:e9416 (2020)
It has recently become possible to simultaneously assay T-cell specificity with respect to large sets of antigens and the T-cell receptor sequence in high-throughput single-cell experiments. Leveraging this new type of data, we propose and benchmark a collection of deep learning architectures to model T-cell specificity in single cells. In agreement with previous results, we found that models that treat antigens as categorical outcome variables outperform those that model the TCR and antigen sequence jointly. Moreover, we show that variability in single-cell immune repertoire screens can be mitigated by modeling cell-specific covariates. Lastly, we demonstrate that the number of bound pMHC complexes can be predicted in a continuous fashion providing a gateway to disentangle cell-to-dextramer binding strength and receptor-to-pMHC affinity. We provide these models in the Python package TcellMatch to allow imputation of antigen specificities in single-cell RNA-seq studies on T cells without the need for MHC staining.
Scientific Article
Strunz, M. ; Simon, L. ; Ansari, M. ; Kathiriya, J.J. ; Angelidis, I. ; Mayr, C. ; Tsidiridis, G. ; Lange, M. ; Mattner, L. ; Yee, M. ; Ogar, P. ; Sengupta, A. ; Kukhtevich, I. ; Schneider, R. ; Zhao, Z. ; Voss, C. ; Stöger, T. ; Neumann, J.H.L. ; Hilgendorff, A. ; Behr, J. ; O'Reilly, M. ; Lehmann, M. ; Burgstaller, G. ; Königshoff, M. ; Chapman, H.A. ; Theis, F.J. ; Schiller, H. B.
Nat. Commun. 11:3559 (2020)
The cell type specific sequences of transcriptional programs during lung regeneration have remained elusive. Using time-series single cell RNA-seq of the bleomycin lung injury model, we resolved transcriptional dynamics for 28 cell types. Trajectory modeling together with lineage tracing revealed that airway and alveolar stem cells converge on a unique Krt8+ transitional stem cell state during alveolar regeneration. These cells have squamous morphology, feature p53 and NFkB activation and display transcriptional features of cellular senescence. The Krt8+ state appears in several independent models of lung injury and persists in human lung fibrosis, creating a distinct cell-cell communication network with mesenchyme and macrophages during repair. We generated a model of gene regulatory programs leading to Krt8+ transitional cells and their terminal differentiation to alveolar type-1 cells. We propose that in lung fibrosis, perturbed molecular checkpoints on the way to terminal differentiation can cause aberrant persistence of regenerative intermediate stem cell states. Injury repair is characterized by the generation of transient cell states important for tissue recovery. Here, the authors present a single cell RNA-seq map of recovery from bleomycin lung injury in mice and uncover a Krt8+ transitional stem cell state that precedes the regeneration of AT1 cells and persists in human lung fibrosis.
Scientific Article
Bergen, V. ; Lange, M. ; Peidli, S. ; Wolf, F.A. ; Theis, F.J.
Nat. Biotechnol. 38, 1408–1414 (2020)
scVelo reconstructs transient cell states and differentiation pathways from single-cell RNA-sequencing data. RNA velocity has opened up new ways of studying cellular differentiation in single-cell RNA-sequencing data. It describes the rate of gene expression change for an individual gene at a given time point based on the ratio of its spliced and unspliced messenger RNA (mRNA). However, errors in velocity estimates arise if the central assumptions of a common splicing rate and the observation of the full splicing dynamics with steady-state mRNA levels are violated. Here we present scVelo, a method that overcomes these limitations by solving the full transcriptional dynamics of splicing kinetics using a likelihood-based dynamical model. This generalizes RNA velocity to systems with transient cell states, which are common in development and in response to perturbations. We apply scVelo to disentangling subpopulation kinetics in neurogenesis and pancreatic endocrinogenesis. We infer gene-specific rates of transcription, splicing and degradation, recover each cell's position in the underlying differentiation processes and detect putative driver genes. scVelo will facilitate the study of lineage decisions and gene regulation.
Scientific Article
Radon, K. ; Saathoff, E. ; Pritsch, M. ; Guggenbühl Noller, J.M. ; Kroidl, I. ; Olbrich, L. ; Thiel, V. ; Diefenbach, M. ; Riess, F. ; Förster, F. ; Theis, F.J. ; Wieser, A. ; Hoelscher, M. ; the KoCo19 collaboration group (Hasenauer, J. ; Castelletti, N. ; Zeggini, E. ; Laxy, M. ; Leidl, R. ; Schwettmann, L.) ; the KoCo19 collaboration group (Fuchs, C.)
BMC Public Health 20:1036 (2020)
Background: Due to the SARS-CoV-2 pandemic, public health interventions have been introduced globally in order to prevent the spread of the virus and avoid the overload of health care systems, especially for the most severely affected patients. Scientific studies to date have focused primarily on describing the clinical course of patients, identifying treatment options and developing vaccines. In Germany, as in many other regions, current tests for SARS-CoV-2 are not conducted on a representative basis and in a longitudinal design. Furthermore, knowledge about the immune status of the population is lacking. Nonetheless, these data are needed to understand the dynamics of the pandemic and hence to appropriately design and evaluate interventions. For this purpose, we recently started a prospective population-based cohort in Munich, Germany, with the aim to develop a better understanding of the state and dynamics of the pandemic. Methods: In 100 out of 755 randomly selected constituencies, 3000 Munich households are identified via random route and offered enrollment into the study. All household members are asked to complete a baseline questionnaire and subjects ≥14 years of age are asked to provide a venous blood sample of ≤3 ml for the determination of SARS-CoV-2 IgG/IgA status. The residual plasma and the blood pellet are preserved for later genetic and molecular biological investigations. For twelve months, each household member is asked to keep a diary of daily symptoms, whereabouts and contacts via WebApp. If symptoms suggestive of COVID-19 are reported, family members, including children <14 years, are offered a pharyngeal swab taken at the Division of Infectious Diseases and Tropical Medicine, LMU University Hospital Munich, for molecular testing for SARS-CoV-2. In case of severe symptoms, participants will be transferred to a Munich hospital. For one year, the study teams revisit the households for blood sampling every six weeks. Discussion: With the planned study we will establish a reliable epidemiological tool to improve the understanding of the spread of SARS-CoV-2 and to better assess the effectiveness of public health measures as well as their socio-economic effects. This will support policy makers in managing the epidemic based on scientific evidence.
Scientific Article
Müller, J.B. ; Geyer, P.E. ; Colaço, A.R. ; Treit, P.V. ; Strauss, M.T. ; Oroshi, M. ; Doll, S. ; Virreira Winter, S. ; Bader, J.M. ; Koehler, N. ; Theis, F.J. ; Santos, A. ; Mann, M.
Nature 582, 592–596 (2020)
Proteins carry out the vast majority of functions in all biological domains, but for technological reasons their large-scale investigation has lagged behind the study of genomes. Since the first essentially complete eukaryotic proteome was reported (1), advances in mass-spectrometry-based proteomics (2) have enabled increasingly comprehensive identification and quantification of the human proteome (3-6). However, there have been few comparisons across species (7,8), in stark contrast with genomics initiatives (9). Here we use an advanced proteomics workflow, in which the peptide separation step is performed by a microstructured and extremely reproducible chromatographic system, for the in-depth study of 100 taxonomically diverse organisms. With two million peptide and 340,000 stringent protein identifications obtained in a standardized manner, we double the number of proteins with solid experimental evidence known to the scientific community. The data also provide a large-scale case study for sequence-based machine learning, as we demonstrate by experimentally confirming the predicted properties of peptides from Bacteroides uniformis. Our results offer a comparative view of the functional organization of organisms across the entire evolutionary range. A remarkably high fraction of the total proteome mass in all kingdoms is dedicated to protein homeostasis and folding, highlighting the biological challenge of maintaining protein structure in all branches of life. Likewise, a universally high fraction is involved in supplying energy resources, although these pathways range from photosynthesis through iron sulfur metabolism to carbohydrate metabolism. Generally, however, proteins and proteomes are remarkably diverse between organisms, and they can readily be explored and functionally compared at www.proteomesoflife.org.
Scientific Article
Fischer, A. ; Koopmans, T. ; Ramesh, P. ; Christ, S. ; Strunz, M. ; Wannemacher, J. ; Aichler, M. ; Feuchtinger, A. ; Walch, A.K. ; Ansari, M. ; Theis, F.J. ; Schorpp, K.K. ; Hadian, K. ; Neumann, P.A. ; Schiller, H. B. ; Rinkevich, Y.
Nat. Commun. 11:3068 (2020)
Surgical adhesions are bands of scar tissues that abnormally conjoin organ surfaces. Adhesions are a major cause of post-operative and dialysis-related complications, yet their patho-mechanism remains elusive, and prevention agents in clinical trials have thus far failed to achieve efficacy. Here, we uncover the adhesion initiation mechanism by coating beads with human mesothelial cells that normally line organ surfaces, and viewing them under adhesion stimuli. We document expansive membrane protrusions from mesothelia that tether beads with massive accompanying adherence forces. Membrane protrusions precede matrix deposition, and can transmit adhesion stimuli to healthy surfaces. We identify cytoskeletal effectors and calcium signaling as molecular triggers that initiate surgical adhesions. A single, localized dose targeting these early germinal events completely prevented adhesions in a preclinical mouse model, and in human assays. Our findings classify the adhesion pathology as originating from mesothelial membrane bridges and offer a radically new therapeutic approach to treat adhesions.
Scientific Article
Ziegler, C.G.K. ; Allon, S.J. ; Nyquist, S.K. ; Mbano, I.M. ; Miao, V.N. ; Tzouanas, C.N. ; Cao, Y. ; Yousif, A.S. ; Bals, J. ; Hauser, B.M. ; Feldman, J. ; Muus, C. ; Wadsworth, M.H. ; Kazer, S.W. ; Hughes, T.K. ; Doran, B. ; Gatter, G.J. ; Vukovic, M. ; Taliaferro, F. ; Mead, B.E. ; Guo, Z. ; Wang, J.P. ; Gras, D. ; Plaisant, M. ; Ansari, M. ; Angelidis, I. ; Adler, H. ; Sucre, J.M.S. ; Taylor, C.J. ; Lin, B. ; Waghray, A. ; Mitsialis, V. ; Dwyer, D.F. ; Buchheit, K.M. ; Boyce, J.A. ; Barrett, N.A. ; Laidlaw, T.M. ; Carroll, S.L. ; Colonna, L. ; Tkachev, V. ; Peterson, C.W. ; Yu, A. ; Zheng, H.B. ; Gideon, H.P. ; Winchell, C.G. ; Lin, P.L. ; Bingle, C.D. ; Snapper, S.B. ; Kropski, J.A. ; Theis, F.J. ; Schiller, H. B. ; Zaragosi, L.E. ; Barbry, P. ; Leslie, A. ; Kiem, H.P. ; Flynn, J.L. ; Fortune, S.M. ; Berger, B. ; Finberg, R.W. ; Kean, L.S. ; Garber, M. ; Schmidt, A.G. ; Lingwood, D. ; Shalek, A.K. ; Ordovas-Montanes, J.
Cell 181, 1016-1035 (2020)
There is pressing urgency to understand the pathogenesis of the severe acute respiratory syndrome coronavirus clade 2 (SARS-CoV-2), which causes the disease COVID-19. SARS-CoV-2 spike (S) protein binds angiotensin-converting enzyme 2 (ACE2), and in concert with host proteases, principally transmembrane serine protease 2 (TMPRSS2), promotes cellular entry. The cell subsets targeted by SARS-CoV-2 in host tissues and the factors that regulate ACE2 expression remain unknown. Here, we leverage human, non-human primate, and mouse single-cell RNA-sequencing (scRNA-seq) datasets across health and disease to uncover putative targets of SARS-CoV-2 among tissue-resident cell subsets. We identify ACE2 and TMPRSS2 co-expressing cells within lung type II pneumocytes, ileal absorptive enterocytes, and nasal goblet secretory cells. Strikingly, we discovered that ACE2 is a human interferon-stimulated gene (ISG) in vitro using airway epithelial cells and extend our findings to in vivo viral infections. Our data suggest that SARS-CoV-2 could exploit species-specific interferon-driven upregulation of ACE2, a tissue-protective mediator during lung injury, to enhance infection.
Scientific Article
Sungnak, W. ; Huang, N. ; Bécavin, C. ; Berg, M. ; Queen, R. ; Litvinukova, M. ; Talavera-López, C. ; Maatz, H. ; Reichart, D. ; Sampaziotis, F. ; Worlock, K.B. ; Yoshida, M. ; Barnes, J.L. ; HCA Lung Biological Network (Schiller, H. B. ; Theis, F.J.)
Nat. Med. 26, 681–687 (2020)
We investigated SARS-CoV-2 potential tropism by surveying expression of viral entry-associated genes in single-cell RNA-sequencing data from multiple tissues from healthy human donors. We co-detected these transcripts in specific respiratory, corneal and intestinal epithelial cells, potentially explaining the high efficiency of SARS-CoV-2 transmission. These genes are co-expressed in nasal epithelial cells with genes involved in innate immunity, highlighting the cells’ potential role in initial viral infection, spread and clearance. The study offers a useful resource for further lines of inquiry with valuable clinical samples from COVID-19 patients and we provide our data in a comprehensive, open and user-friendly fashion at www.covid19cellatlas.org.
Scientific Article
Sachs, S. ; Bastidas-Ponce, A. ; Tritschler, S. ; Bakhti, M. ; Böttcher, A. ; Sánchez-Garrido, M.A. ; Tarquis Medina, M. ; Kleinert, M. ; Fischer, K. ; Jall, S. ; Harger, A. ; Bader, E. ; Roscioni, S. ; Ussar, S. ; Feuchtinger, A. ; Yesildag, B. ; Neelakandhan, A. ; Jensen, C.B. ; Cornu, M. ; Yang, B. ; Finan, B. ; DiMarchi, R.D. ; Tschöp, M.H. ; Theis, F.J. ; Hofmann, S.M. ; Müller, T.D. ; Lickert, H.
Nat. Metab. 2, 380 (2020)
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Kozak, E.L. ; Palit, S. ; Miranda Rodriguez, J.R. ; Janjic, A. ; Böttcher, A. ; Lickert, H. ; Enard, W. ; Theis, F.J. ; López-Schier, H.
Curr. Biol. 30, 1142-1151 (2020)
Most plane-polarized tissues are formed by identically oriented cells [1, 2]. A notable exception occurs in the vertebrate vestibular system and lateral-line neuromasts, where mechanosensory hair cells orient along a single axis but in opposite directions to generate bipolar epithelia [3-5]. In zebrafish neuromasts, pairs of hair cells arise from the division of a non-sensory progenitor [6, 7] and acquire opposing planar polarity via the asymmetric expression of the polarity-determinant transcription factor Emx2 [8-11]. Here, we reveal the initial symmetry-breaking step by decrypting the developmental trajectory of hair cells using single-cell RNA sequencing (scRNA-seq), diffusion pseudotime analysis, lineage tracing, and mutagenesis. We show that Emx2 is absent in non-sensory epithelial cells, begins expression in hair-cell progenitors, and is downregulated in one of the sibling hair cells via signaling through the Notch1a receptor. Analysis of Emx2-deficient specimens, in which every hair cell adopts an identical direction, indicates that Emx2 asymmetry does not result from auto-regulatory feedback. These data reveal a two-tiered mechanism by which the symmetric monodirectional ground state of the epithelium is inverted by deterministic initiation of Emx2 expression in hair-cell progenitors and a subsequent stochastic repression of Emx2 in one of the sibling hair cells breaks directional symmetry to establish planar bipolarity.
Scientific Article
Lähnemann, D. ; Köster, J. ; Szczurek, E. ; McCarthy, D.J. ; Hicks, S.C. ; Robinson, M.D. ; Vallejos, C.A. ; Campbell, K.R. ; Beerenwinkel, N. ; Mahfouz, A. ; Pinello, L. ; Skums, P. ; Stamatakis, A. ; Attolini, C.S.O. ; Aparicio, S. ; Baaijens, J. ; Balvert, M. ; Barbanson, B.d. ; Cappuccio, A. ; Corleone, G. ; Dutilh, B.E. ; Florescu, M. ; Guryev, V. ; Holmer, R. ; Jahn, K. ; Lobo, T.J. ; Keizer, E.M. ; Khatri, I. ; Kielbasa, S.M. ; Korbel, J.O. ; Kozlov, A.M. ; Kuo, T.H. ; Lelieveldt, B.P.F. ; Mandoiu, I.I. ; Marioni, J.C. ; Marschall, T. ; Mölder, F. ; Niknejad, A. ; Raczkowski, L. ; Reinders, M. ; Ridder, J.d. ; Saliba, A.E. ; Somarakis, A. ; Stegle, O. ; Theis, F.J. ; Yang, H. ; Zelikovsky, A. ; McHardy, A.C. ; Raphael, B.J. ; Shah, S.P. ; Schönhuth, A.
Genome Biol. 21:31 (2020)
The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands - or even millions - of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.
Review
Strunz, M. ; Simon, L. ; Ansari, M. ; Mattner, L. ; Angelidis, I. ; Mayr, C. ; Kathiriya, J. ; Yee, M. ; Ogar, P. ; Voss, C. ; Stöger, T. ; Kukhtevich, I. ; Schneider, R. ; Lehmann, M. ; Koenigshoff, M. ; Burgstaller, G. ; O'Reilly, M. ; Chapman, H. ; Theis, F.J. ; Schiller, H. B.
Wound Repair Regen. 28, A7-A7 (2020)
Meeting abstract
van der Wijst, M. ; de Vries, D.H. ; Groot, H.E. ; Trynka, G. ; Hon, C.C. ; Bonder, M.J. ; Stegle, O. ; Nawijn, M.C. ; Idaghdour, Y. ; van der Harst, P. ; Ye, C.J. ; Powell, J. ; Theis, F.J. ; Mahfouz, A. ; Heinig, M. ; Franke, L.
eLife 9:e52155 (2020)
In recent years, functional genomics approaches combining genetic information with bulk RNA-sequencing data have identified the downstream expression effects of disease-associated genetic risk factors through so-called expression quantitative trait locus (eQTL) analysis. Single-cell RNA-sequencing creates enormous opportunities for mapping eQTLs across different cell types and in dynamic processes, many of which are obscured when using bulk methods. Rapid increase in throughput and reduction in cost per cell now allow this technology to be applied to large-scale population genetics studies. To fully leverage these emerging data resources, we have founded the single-cell eQTLGen consortium (sc-eQTLGen), aimed at pinpointing the cellular contexts in which disease-causing genetic variants affect gene expression. Here, we outline the goals, approach and potential utility of the sc-eQTLGen consortium. We also provide a set of study design considerations for future single-cell eQTL studies.
Scientific Article
Angerer, P. ; Fischer, D.S. ; Theis, F.J. ; Scialdone, A. ; Marr, C.
Bioinformatics 36, 4291-4295 (2020)
MOTIVATION: Dimensionality reduction is a key step in the analysis of single-cell RNA-sequencing data. It produces a low-dimensional embedding for visualization and as a calculation base for downstream analysis. Nonlinear techniques are most suitable to handle the intrinsic complexity of large, heterogeneous single-cell data. However, with no linear relation between gene and embedding coordinate, there is no way to extract the identity of genes driving any cell's position in the low-dimensional embedding, making it difficult to characterize the underlying biological processes. RESULTS: In this article, we introduce the concepts of local and global gene relevance to compute an equivalent of principal component analysis loadings for non-linear low-dimensional embeddings. Global gene relevance identifies drivers of the overall embedding, while local gene relevance identifies those of a defined sub-region. We apply our method to single-cell RNA-seq datasets from different experimental protocols and to different low-dimensional embedding techniques. This shows our method's versatility to identify key genes for a variety of biological processes. AVAILABILITY AND IMPLEMENTATION: To ensure reproducibility and ease of use, our method is released as part of destiny 3.0, a popular R package for building diffusion maps from single-cell transcriptomic data. It is readily available through Bioconductor. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Scientific Article
Raimundez-Alvarez, E. ; Keller, S. ; Ebert, K. ; Hug, S. ; Theis, F.J. ; Maier, D. ; Luber, B. ; Hasenauer, J.
PLoS Comput. Biol. 16:e1007147 (2020)
Author summary: Unraveling the causal differences between drug responders and non-responders is an important challenge. The information can help to understand molecular mechanisms and to guide the selection and design of targeted therapies. Here, we approach this problem for cetuximab treatment for gastric cancer using mechanistic mathematical modeling. The proposed model describes responder and non-responder gastric cancer cell lines and can predict the response in several validation experiments. Our analysis provides a differentiated view on mutations and explains, for instance, the relevance of MET mutations and the insignificance of PIK3CA mutation in the considered cell lines. The model might potentially provide the basis for understanding the recent failure of several clinical studies. Targeted cancer therapies are powerful alternatives to chemotherapies or can be used to complement them. Yet, the response to targeted treatments depends on a variety of factors, including mutations and expression levels, and therefore their outcome is difficult to predict. Here, we develop a mechanistic model of gastric cancer to study response and resistance factors for cetuximab treatment. The model captures the EGFR, ERK and AKT signaling pathways in two gastric cancer cell lines with different mutation patterns. We train the model using a comprehensive selection of time and dose response measurements, and provide an assessment of parameter and prediction uncertainties. We demonstrate that the proposed model facilitates the identification of causal differences between the cell lines. Furthermore, our study shows that the model provides predictions for the responses to different perturbations, such as knockdown and knockout experiments. Among other results, the model predicted the effect of MET mutations on cetuximab sensitivity. These predictive capabilities render the model a basis for the assessment of gastric cancer signaling and possibly for the development and discovery of predictive biomarkers.
Scientific Article
Knauer-Arloth, J. ; Eraslan, G. ; Andlauer, T.F.M. ; Martins, J. ; Iurato, S. ; Kühnel, B. ; Waldenberger, M. ; Frank, J. ; Gold, R. ; Hemmer, B. ; Luessi, F. ; Nischwitz, S. ; Paul, F. ; Wiendl, H. ; Gieger, C. ; Heilmann-Heimbach, S. ; Kacprowski, T. ; Laudes, M. ; Meitinger, T. ; Peters, A. ; Rawal, R. ; Strauch, K. ; Lucae, S. ; Müller-Myhsok, B. ; Rietschel, M. ; Theis, F.J. ; Binder, E.B. ; Müller, N.S.
PLoS Comput. Biol. 16:e1007616 (2020)
Genome-wide association studies (GWAS) identify genetic variants associated with traits or diseases. GWAS never directly link variants to regulatory mechanisms. Instead, the functional annotation of variants is typically inferred by post hoc analyses. A specific class of deep learning-based methods allows for the prediction of regulatory effects per variant on several cell type-specific chromatin features. We here describe "DeepWAS", a new approach that integrates these regulatory effect predictions of single variants into a multivariate GWAS setting. Thereby, single variants associated with a trait or disease are directly coupled to their impact on a chromatin feature in a cell type. Up to 61 regulatory SNPs, called dSNPs, were associated with multiple sclerosis (MS, 4,888 cases and 10,395 controls), major depressive disorder (MDD, 1,475 cases and 2,144 controls), and height (5,974 individuals). These variants were mainly non-coding and reached at least nominal significance in classical GWAS. The prediction accuracy was higher for DeepWAS than for classical GWAS models for 91% of the genome-wide significant, MS-specific dSNPs. DSNPs were enriched in public or cohort-matched expression and methylation quantitative trait loci and we demonstrated the potential of DeepWAS to generate testable functional hypotheses based on genotype data alone. DeepWAS is available at https://github.com/cellmapslab/DeepWAS.
Scientific Article
Sachs, S. ; Bastidas-Ponce, A. ; Tritschler, S. ; Bakhti, M. ; Böttcher, A. ; Sánchez-Garrido, M.A. ; Tarquis Medina, M. ; Kleinert, M. ; Fischer, K. ; Jall, S. ; Harger, A. ; Bader, E. ; Roscioni, S. ; Ussar, S. ; Feuchtinger, A. ; Yesildag, B. ; Neelakandhan, A. ; Jensen, C.B. ; Cornu, M. ; Yang, B. ; Finan, B. ; DiMarchi, R.D. ; Tschöp, M.H. ; Theis, F.J. ; Hofmann, S.M. ; Müller, T.D. ; Lickert, H.
Nat. Metab. 2, 192-209 (2020)
Dedifferentiation of insulin-secreting β cells in the islets of Langerhans has been proposed to be a major mechanism of β-cell dysfunction. Whether dedifferentiated β cells can be targeted by pharmacological intervention for diabetes remission, and ways in which this could be accomplished, are unknown as yet. Here we report the use of streptozotocin-induced diabetes to study β-cell dedifferentiation in mice. Single-cell RNA sequencing (scRNA-seq) of islets identified markers and pathways associated with β-cell dedifferentiation and dysfunction. Single and combinatorial pharmacology further show that insulin treatment triggers insulin receptor pathway activation in β cells and restores maturation and function for diabetes remission. Additional β-cell selective delivery of oestrogen by Glucagon-like peptide-1 (GLP-1-oestrogen conjugate) decreases daily insulin requirements by 60%, triggers oestrogen-specific activation of the endoplasmic-reticulum-associated protein degradation system, and further increases β-cell survival and regeneration. GLP-1-oestrogen also protects human β cells against cytokine-induced dysfunction. This study not only describes mechanisms of β-cell dedifferentiation and regeneration, but also reveals pharmacological entry points to target dedifferentiated β cells for diabetes remission.
Scientific Article
Lauffer, F. ; Jargosch, M. ; Baghin, V. ; Krause, L. ; Kempf, W. ; Absmaier-Kijak, M. ; Morelli, M. ; Madonna, S. ; Marsais, F. ; Lepescheux, L. ; Albanesi, C. ; Müller, N.S. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Biedermann, T. ; Vandeghinste, N. ; Steidl, S. ; Eyerich, K.
J. Eur. Acad. Dermatol. Venereol. 34, 800-809 (2020)
Background: Key pathogenic events of psoriasis and atopic eczema (AE) are misguided immune reactions of the skin. IL-17C is an epithelial-derived cytokine, whose impact on skin inflammation is unclear. Objective: We sought to characterize the role of IL-17C in human inflammatory skin diseases (ISD). Methods: IL-17C gene and protein expression was assessed by immunohistochemistry and transcriptome analysis. Primary human keratinocytes were stimulated and expression of cytokines and chemokines was determined by qRT-PCR and luminex assay. Neutrophil migration towards supernatant of stimulated keratinocytes was assessed. IL-17C was depleted using a new IL-17C-specific antibody (MOR106) in murine models of psoriasis (IL-23 injection model) and AE (MC903 model) as well as in human skin biopsies of psoriasis and AE. Effects on cell influx (mouse models) and gene expression (human explant cultures) were determined. Results: Expression of IL-17C mRNA and protein was elevated in various ISD. We demonstrate that IL-17C potentiates the expression of innate cytokines, antimicrobial peptides (IL-36G, S100A7 and HBD2) and chemokines (CXCL8, CXCL10, CCL5 and VEGF) and the autocrine induction of IL-17C in keratinocytes. Cell-free supernatant of keratinocytes stimulated with IL-17C was strongly chemotactic for neutrophils, thus demonstrating a critical role for IL-17C in immune cell recruitment. IL-17C depletion significantly reduced cell numbers of T cells, neutrophils and eosinophils in murine models of psoriasis and AE and led to a significant downregulation of inflammatory mediators in human skin biopsies of psoriasis and AE ex vivo. Conclusion: IL-17C amplifies epithelial inflammation in Th2 and Th17 dominated skin inflammation and represents a promising target for the treatment of ISD.
Scientific Article
Förster, K. ; Ertl-Wagner, B. ; Ehrhardt, H. ; Busen, H. ; Sass, S. ; Pomschar, A. ; Naehrlich, L. ; Schulze, A. ; Flemmer, A.W. ; Hübener, C. ; Eickelberg, O. ; Theis, F.J. ; Dietrich, O. ; Hilgendorff, A.
Thorax 75, 184-187 (2020)
We developed an MRI protocol using transverse (T2) and longitudinal (T1) mapping sequences to characterise lung structural changes in preterm infants with bronchopulmonary dysplasia (BPD). We prospectively enrolled 61 infants to perform 3-Tesla MRI of the lung in quiet sleep. Statistical analysis was performed using logistic Group Lasso regression and logistic regression. Increased lung T2 relaxation time and decreased lung T1 relaxation time indicated BPD, yielding an area under the curve (AUC) of 0.80. Results were confirmed in an independent study cohort (AUC 0.75) and mirrored by lung function testing, indicating the high potential for MRI in future BPD diagnostics.
Scientific Article
Pitea, A. ; Kondofersky, I. ; Sass, S. ; Theis, F.J. ; Müller, N.S. ; Unger, K.
Brief. Bioinform. 21, 272-281 (2020)
Copy number aberrations (CNAs) are known to strongly affect oncogenes and tumour suppressor genes. Given the critical role CNAs play in cancer research, it is essential to accurately identify CNAs from tumour genomes. One particular challenge in finding CNAs is the effect of confounding variables. To address this issue, we assessed how commonly used CNA identification algorithms perform on SNP 6.0 genotyping data in the presence of confounding variables. We simulated realistic synthetic data with varying levels of three confounding variables (the tumour purity, the length of a copy number region and the CNA burden, i.e. the percentage of CNAs present in a profiled genome) and evaluated the performance of OncoSNP, ASCAT, GenoCNA, GISTIC and CGHcall. Furthermore, we implemented and assessed CGHcall*, an adjusted version of CGHcall accounting for high CNA burden. Our analysis on synthetic data indicates that tumour purity and the CNA burden strongly influence the performance of all the algorithms. No algorithm can correctly find lost and gained genomic regions across all tumour purities. The length of CNA regions influenced the performance of ASCAT, CGHcall and GISTIC. OncoSNP, GenoCNA and CGHcall* showed little sensitivity. Overall, CGHcall* and OncoSNP showed reasonable performance, particularly in samples with high tumour purity. Our analysis on the HapMap data revealed a good overlap between CGHcall, CGHcall* and GenoCNA results and experimentally validated data. Our exploratory analysis on the TCGA HNSCC data revealed plausible results of CGHcall, CGHcall* and GISTIC in consensus HNSCC CNA regions.
Scientific Article
2019
Cruceanu, C. ; Dony, L. ; Kontira, A.C. ; Fischer, D.S. ; Roeh, S. ; DiGiaimo, R. ; Cappello, S. ; Theis, F.J. ; Binder, E.B.
Eur. Neuropsychopharmacol. 29, S7-S8 (2019)
Meeting abstract
Thomas, J. ; Küpper, M. ; Batra, R. ; Jargosch, M. ; Atenhan, A. ; Baghin, V. ; Krause, L. ; Lauffer, F. ; Biedermann, T. ; Theis, F.J. ; Eyerich, K. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Garzorz-Stark, N.
J. Eur. Acad. Dermatol. Venereol. 33, 2380-2380 (2019)
Authorship correction on Is the humoral immunity dispensable for the pathogenesis of psoriasis? Thomas J, Küpper M, Batra R, Jargosch M, Atenhan A, Baghin V, Krause L, Lauffer F, Biedermann T, Theis FJ, Eyerich K, Eyerich S, Garzorz-Stark N. J Eur Acad Dermatol Venereol. 2019 Jan; 33(1): 115–122. https://doi.org/10.1111/jdv.15101. Epub 2018 Jul 2. This corrigendum is to note that the name of Prof. Carsten Schmidt-Weber was inadvertently omitted as an author in the initial version of the paper. Schmidt-Weber CB has been added for his participation and contributions in this project.
Theis, F.J. ; Ludwig, T.
Inf.-Spektrum, DOI: 10.1007/s00287-019-01220-y (2019)
Other: Opinion
Holmberg, O. ; Kortuem, K.U. ; Koehler, N. ; Theis, F.J.
Invest. Ophthalmol. Vis. Sci. 60 (2019)
Meeting abstract
Böttcher, A. ; Tritschler, S. ; Yang, K. ; Theis, F.J. ; Lickert, H. ; Wolf, E. ; Kemter, E.
Xenotransplantation 26 (2019)
Meeting abstract
Knauer-Arloth, J. ; Eraslan, G. ; Andlauer, T. ; Gieger, C. ; Gold, R. ; Heilmann-Heimbach, S. ; Kacprowski, T. ; Meitinger, T. ; Laudes, M. ; Luessi, F. ; Müller-Myhsok, B. ; Nischwitz, S. ; Peters, A. ; Paul, F. ; Rawal, R. ; Strauch, K. ; Wiendl, H. ; Hemmer, B. ; Theis, F.J. ; Binder, E. ; Müller, N.S.
Mult. Scler. J. 25, 906-907 (2019)
Meeting abstract
Tetko, I.V. ; Theis, F.J. ; Karpov, P. ; Kůrková, V.
Lect. Notes Comput. Sc. 11731 LNCS, v-vii (2019)
Editorial
Rausch, L. ; Kranich, J. ; Chlis, N.-K. ; Schifferer, M. ; Simons, M. ; Theis, F.J. ; Brocker, T.
Eur. J. Immunol. 49, 88-88 (2019)
Meeting abstract
Musumeci, A. ; Lutz, K. ; Dursun, E. ; Sie, C. ; Ziegenhain, C. ; Bagnoli, J. ; Luecken, M. ; Korn, T. ; Enard, W. ; Theis, F.J. ; Krug, A.
Eur. J. Immunol. 49, 48-48 (2019)
Meeting abstract
Bakhti, M. ; Scheibner, K. ; Tritschler, S. ; Bastidas-Ponce, A. ; Tarquis-Medina, M. ; Theis, F.J. ; Lickert, H.
Mol. Metab. 30, 16-29 (2019)
Objective: Translation of basic research from bench-to-bedside relies on a better understanding of similarities and differences between mouse and human cell biology, tissue formation, and organogenesis. Thus, establishing ex vivo modeling systems of mouse and human pancreas development will help not only to understand evolutionary conserved mechanisms of differentiation and morphogenesis but also to understand pathomechanisms of disease and design strategies for tissue engineering. Methods: Here, we established a simple and reproducible Matrigel-based three-dimensional (3D) cyst culture model system of mouse and human pancreatic progenitors (PPs) to study pancreatic epithelialization and endocrinogenesis ex vivo. In addition, we reanalyzed previously reported single-cell RNA sequencing (scRNA-seq) of mouse and human pancreatic lineages to obtain a comprehensive picture of differential expression of key transcription factors (TFs), cell-cell adhesion molecules and cell polarity components in PPs during endocrinogenesis. Results: We generated mouse and human polarized pancreatic epithelial cysts derived from PPs. This system allowed us to monitor the establishment of pancreatic epithelial polarity and lumen formation at cellular and subcellular resolution in a dynamic, time-resolved fashion. Furthermore, both mouse and human pancreatic cysts were able to differentiate towards the endocrine fate. This differentiation system together with scRNA-seq analysis revealed how apical-basal polarity and tight and adherens junctions change during endocrine differentiation. Conclusions: We have established a simple 3D pancreatic cyst culture system that allows cellular and subcellular processes to be resolved spatiotemporally at the mechanistic level, which is otherwise not possible in vivo.
Scientific Article
Tetko, I.V. ; Theis, F.J. ; Karpov, P. ; Kůrková, V.
Lect. Notes Comput. Sc. 11727 LNCS, v-vii (2019)
Editorial
Tetko, I.V. ; Theis, F.J. ; Karpov, P. ; Kůrková, V.
Lect. Notes Comput. Sc. 11728 LNCS, v-vii (2019)
Editorial
Tetko, I.V. ; Theis, F.J. ; Karpov, P. ; Kůrková, V.
Lect. Notes Comput. Sc. 11729 LNCS, v-vii (2019)
Editorial
Mameishvili, E. ; Serafimidis, I. ; Iwaszkiewicz, S. ; Lesche, M. ; Reinhardt, S. ; Bölicke, N. ; Büttner, M. ; Stellas, D. ; Papadimitropoulou, A. ; Szabolcs, M. ; Anastassiadis, K. ; Dahl, A. ; Theis, F.J. ; Efstratiadis, A. ; Gavalas, A.
Proc. Natl. Acad. Sci. U.S.A. 116, 20679-20688 (2019)
The presence of progenitor or stem cells in the adult pancreas and their potential involvement in homeostasis and cancer development remain unresolved issues. Here, we show that mouse centroacinar cells can be identified and isolated by virtue of the mitochondrial enzyme Aldh1b1 that they uniquely express. These cells are necessary and sufficient for the formation of self-renewing adult pancreatic organoids in an Aldh1b1-dependent manner. Aldh1b1-expressing centroacinar cells are largely quiescent, self-renew, and, as shown by genetic lineage tracing, contribute to all 3 pancreatic lineages in the adult organ under homeostatic conditions. Single-cell RNA sequencing analysis of these cells identified a progenitor cell population, established its molecular signature, and determined distinct differentiation pathways to early progenitors. A distinct feature of these progenitor cells is the preferential expression of small GTPases, including Kras, suggesting that they might be susceptible to Kras-driven oncogenic transformation. This finding and the overexpression of Aldh1b1 in human and mouse pancreatic cancers, driven by activated Kras, prompted us to examine the involvement of Aldh1b1 in oncogenesis. We demonstrated genetically that ablation of Aldh1b1 completely abrogates tumor development in a mouse model of Kras(G12D)-induced pancreatic cancer.
Scientific Article
Ashuach, T. ; Fischer, D.S. ; Kreimer, A. ; Ahituv, N. ; Theis, F.J. ; Yosef, N.
Genome Biol. 20:183 (2019)
Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite growing popularity, MPRA studies are limited by a lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze: a statistical framework for analyzing MPRA count data. Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences' activity across different conditions, and provide necessary flexibility in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods.
Scientific Article
Czamara, D. ; Eraslan, G. ; Lahti, J. ; Figueiredo, A.S. ; Girchenko, P. ; Lahti-Pulkkinen, M. ; Hämäläinen, E. ; Kajantie, E. ; Laivuori, H. ; Villa, P. ; Reynolds, R. ; Müller, N.S. ; Theis, F.J. ; Räikkönen, K. ; Binder, E.
Eur. Neuropsychopharmacol. 29, 1037-1037 (2019)
Meeting abstract
Schaupp, S. ; Budde, M. ; Kondofersky, I. ; Papiol, S. ; Heilbronner, U. ; Gade, K. ; Anderson-Schmidt, H. ; Kalman, J. ; Senner, F. ; Andlauer, T.F.M. ; Rietschel, M. ; Degenhardt, F. ; Müller, N.S. ; Theis, F.J. ; Schulze, T.
Eur. Neuropsychopharmacol. 29, 1161-1161 (2019)
Meeting abstract
Schulte, E. ; Kondofersky, I. ; Budde, M. ; Adorjan, K. ; Aldinger, F. ; Anderson-Schmidt, H. ; Gade, K. ; Heilbronner, U. ; Kalman, J. ; Papiol, S. ; Theis, F.J. ; Falkai, P. ; Müller, N.S. ; Schulze, T.G.
Eur. Neuropsychopharmacol. 29, 1257-1258 (2019)
Meeting abstract
Weberpals, J. ; Becker, T. ; Schmich, F. ; Ruettinger, D. ; Theis, F.J. ; Bauer-Mehren, A.
Pharmacoepidemiol. Drug Saf. 28, 585-586 (2019)
Meeting abstract
Tirier, S.M. ; Park, J. ; Preußer, F. ; Amrhein, L. ; Gu, Z. ; Steiger, S. ; Mallm, J.P. ; Krieger, T. ; Waschow, M. ; Eismann, B. ; Gut, M. ; Gut, I.G. ; Rippe, K. ; Schlesner, M. ; Theis, F.J. ; Fuchs, C. ; Ball, C.R. ; Glimm, H. ; Conrad, C.
Sci. Rep. 9:12367 (2019)
Patient-derived 3D cell culture systems are currently advancing cancer research since they potentiate the molecular analysis of tissue-like properties and drug response under well-defined conditions. However, our understanding of the relationship between the heterogeneity of morphological phenotypes and the underlying transcriptome is still limited. To address this issue, we here introduce "pheno-seq" to directly link visual features of 3D cell culture systems with profiling their transcriptome. As prototypic applications breast and colorectal cancer (CRC) spheroids were analyzed by pheno-seq. We identified characteristic gene expression signatures of epithelial-to-mesenchymal transition that are associated with invasive growth behavior of clonal breast cancer spheroids. Furthermore, we linked long-term proliferative capacity in a patient-derived model of CRC to a lowly abundant PROX1-positive cancer stem cell subtype. We anticipate that the ability to integrate transcriptome analysis and morphological patho-phenotypes of cancer cells will provide novel insight on the molecular origins of intratumor heterogeneity.
Scientific Article
Lotfollahi, M. ; Wolf, F.A. ; Theis, F.J.
Nat. Methods 16, 715-721 (2019)
Accurately modeling cellular response to perturbations is a central goal of computational biology. While such modeling has been based on statistical, mechanistic and machine learning models in specific settings, no generalization of predictions to phenomena absent from training data (out-of-sample) has yet been demonstrated. Here, we present scGen (https://github.com/theislab/scgen), a model combining variational autoencoders and latent space vector arithmetics for high-dimensional single-cell gene expression data. We show that scGen accurately models perturbation and infection response of cells across cell types, studies and species. In particular, we demonstrate that scGen learns cell-type and species-specific responses implying that it captures features that distinguish responding from non-responding genes and cells. With the upcoming availability of large-scale atlases of organs in a healthy state, we envision scGen to become a tool for experimental design through in silico screening of perturbation response in the context of disease and drug treatment.
Scientific Article
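The latent-space vector arithmetic mentioned in the scGen abstract above can be illustrated with a minimal sketch. This is not the scGen implementation: a linear PCA projection stands in for the trained variational autoencoder, and the function names `fit_linear_autoencoder` and `predict_perturbed` are hypothetical. The idea shown is only the arithmetic step: estimate the mean latent shift between perturbed and control cells where both are observed, then add that shift to control cells of a held-out cell type.

```python
import numpy as np


def fit_linear_autoencoder(X, n_latent):
    """PCA-based stand-in for a trained VAE (assumption, not scGen itself).

    Returns an (encode, decode) pair of functions.
    """
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    W = vt[:n_latent].T  # shape: (n_genes, n_latent)

    def encode(x):
        return (x - mean) @ W

    def decode(z):
        return z @ W.T + mean

    return encode, decode


def predict_perturbed(encode, decode, ctrl_train, pert_train, ctrl_target):
    """Shift target control cells by the mean latent perturbation vector."""
    # delta: difference of latent means between perturbed and control cells.
    delta = encode(pert_train).mean(axis=0) - encode(ctrl_train).mean(axis=0)
    return decode(encode(ctrl_target) + delta)
```

With a nonlinear autoencoder the same two steps apply, but the decoded shift is no longer a constant offset in gene space, which is what would allow cell-type-specific responses.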
Uzbas, F. ; Opperer, F. ; Sönmezer, C. ; Shaposhnikov, D. ; Sass, S. ; Krendl, C. ; Angerer, P. ; Theis, F.J. ; Müller, N.S. ; Drukker, M.
Genome Biol. 20:155 (2019)
We describe a highly sensitive, quantitative, and inexpensive technique for targeted sequencing of transcript cohorts or genomic regions from thousands of bulk samples or single cells in parallel. Multiplexing is based on a simple method that produces extensive matrices of diverse DNA barcodes attached to invariant primer sets, which are all pre-selected and optimized in silico. By applying the matrices in a novel workflow named Barcode Assembly foR Targeted Sequencing (BART-Seq), we analyze developmental states of thousands of single human pluripotent stem cells, either in different maintenance media or upon Wnt/beta-catenin pathway activation, which identifies the mechanisms of differentiation induction. Moreover, we apply BART-Seq to the genetic screening of breast cancer patients and identify BRCA mutations with very high precision. The processing of thousands of samples and dynamic range measurements that outperform global transcriptomics techniques make BART-Seq the first targeted sequencing technique suitable for numerous research applications.
Scientific Article
Caicedo, J.C. ; Roth, J. ; Goodman, A. ; Becker, T. ; Karhohs, K.W. ; Broisin, M. ; Molnar, C. ; McQuin, C. ; Singh, S. ; Theis, F.J. ; Carpenter, A.E.
Cytometry A 95, 952-965 (2019)
Identifying nuclei is often a critical first step in analyzing microscopy images of cells and classical image processing algorithms are most commonly used for this task. Recent developments in deep learning can yield superior accuracy, but typical evaluation metrics for nucleus segmentation do not satisfactorily capture error modes that are relevant in cellular images. We present an evaluation framework to measure accuracy, types of errors, and computational efficiency; and use it to compare deep learning strategies and classical approaches. We publicly release a set of 23,165 manually annotated nuclei and source code to reproduce experiments and run the proposed evaluation methodology. Our evaluation framework shows that deep learning improves accuracy and can reduce the number of biologically relevant errors by half. © 2019 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry.
Scientific Article
Patil, S. ; Heuser, C. ; de Almeida, G.P. ; Theis, F.J. ; Zielinski, C.E.
Front. Immunol. 10:1515 (2019)
Recent advances in cytometry have radically altered the fate of single-cell proteomics by allowing a more accurate understanding of complex biological systems. Mass cytometry (CyTOF) provides simultaneous single-cell measurements that are crucial to understand cellular heterogeneity and identify novel cellular subsets. High-dimensional CyTOF data were traditionally analyzed by gating on bivariate dot plots, which are not only laborious given the quadratic increase of complexity with dimension but are also biased through manual gating. This review aims to discuss the impact of new analysis techniques for in-depth insights into the dynamics of immune regulation obtained from static snapshot data and to provide tools to immunologists to address the high dimensionality of their single-cell data.
Review
Erhard, F. ; Baptista, M.A.P. ; Krammer, T. ; Hennig, T. ; Lange, M. ; Arampatzi, P. ; Jürges, C.S. ; Theis, F.J. ; Saliba, A.-E. ; Dölken, L.
Nature 571, 419-423 (2019)
Single-cell RNA sequencing (scRNA-seq) has highlighted the important role of intercellular heterogeneity in phenotype variability in both health and disease(1). However, current scRNA-seq approaches provide only a snapshot of gene expression and convey little information on the true temporal dynamics and stochastic nature of transcription. A further key limitation of scRNA-seq analysis is that the RNA profile of each individual cell can be analysed only once. Here we introduce single-cell, thiol-(SH)-linked alkylation of RNA for metabolic labelling sequencing (scSLAM-seq), which integrates metabolic RNA labelling(2), biochemical nucleoside conversion(3) and scRNA-seq to record transcriptional activity directly by differentiating between new and old RNA for thousands of genes per single cell. We use scSLAM-seq to study the onset of infection with lytic cytomegalovirus in single mouse fibroblasts. The cell-cycle state and dose of infection deduced from old RNA enable dose-response analysis based on new RNA. scSLAM-seq thereby both visualizes and explains differences in transcriptional activity at the single-cell level. Furthermore, it depicts 'on-off' switches and transcriptional burst kinetics in host gene expression with extensive gene-specific differences that correlate with promoter-intrinsic features (TBP-TATA-box interactions and DNA methylation). Thus, gene-specific, and not cell-specific, features explain the heterogeneity in transcriptomes between individual cells and the transcriptional response to perturbations.
Scientific Article
Hawe, J. ; Theis, F.J. ; Heinig, M.
Front. Genet. 10:535 (2019)
A major goal in systems biology is a comprehensive description of the entirety of all complex interactions between different types of biomolecules (also referred to as the interactome) and how these interactions give rise to higher, cellular and organism level functions or diseases. Numerous efforts have been undertaken to define such interactomes experimentally, for example yeast-two-hybrid based protein-protein interaction networks or ChIP-seq based protein-DNA interactions for individual proteins. To complement these direct measurements, genome-scale quantitative multi-omics data (transcriptomics, proteomics, metabolomics, etc.) enable researchers to predict novel functional interactions between molecular species. Moreover, these data make it possible to distinguish relevant functional from non-functional interactions in specific biological contexts. However, integration of multi-omics data is not straightforward due to their heterogeneity. Numerous methods for the inference of interaction networks from homogeneous functional data exist, but with the advent of large-scale paired multi-omics data a new class of methods for inferring comprehensive networks across different molecular species began to emerge. Here we review state-of-the-art techniques for inferring the topology of interaction networks from functional multi-omics data, encompassing graphical models with multiple node types and quantitative-trait-loci (QTL) based approaches. In addition, we will discuss Bayesian aspects of network inference, which allow for leveraging already established biological information such as known protein-protein or protein-DNA interactions, to guide the inference process.
Review
Tritschler, S. ; Büttner, M. ; Fischer, D.S. ; Lange, M. ; Bergen, V. ; Lickert, H. ; Theis, F.J.
Development 146:dev170506 (2019)
Single cell genomics has become a popular approach to uncover the cellular heterogeneity of progenitor and terminally differentiated cell types with great precision. This approach can also delineate lineage hierarchies and identify molecular programmes of cell-fate acquisition and segregation. Nowadays, tens of thousands of cells are routinely sequenced in single cell-based methods and even more are expected to be analysed in the future. However, interpretation of the resulting data is challenging and requires computational models at multiple levels of abstraction. In contrast to other applications of single cell sequencing, where clustering approaches dominate, developmental systems are generally modelled using continuous structures, trajectories and trees. These trajectory models carry the promise of elucidating mechanisms of development, disease and stimulation response at very high molecular resolution. However, their reliable analysis and biological interpretation requires an understanding of their underlying assumptions and limitations. Here, we review the basic concepts of such computational approaches and discuss the characteristics of developmental processes that can be learnt from trajectory models.
Review
Luecken, M. ; Theis, F.J.
Mol. Syst. Biol. 15:e8746 (2019)
Single-cell RNA-seq has enabled gene expression to be studied at an unprecedented resolution. The promise of this technology is attracting a growing user base for single-cell analysis methods. As more analysis tools are becoming available, it is becoming increasingly difficult to navigate this landscape and produce an up-to-date workflow to analyse one's data. Here, we detail the steps of a typical single-cell RNA-seq analysis, including pre-processing (quality control, normalization, data correction, feature selection, and dimensionality reduction) and cell- and gene-level downstream analysis. We formulate current best-practice recommendations for these steps based on independent comparison studies. We have integrated these best-practice recommendations into a workflow, which we apply to a public dataset to further illustrate how these steps work in practice. Our documented case study can be found at . This review will serve as a workflow tutorial for new entrants into the field, and help established users update their analysis pipelines.
Review
Vieira Braga, F.A. ; Kar, G. ; Berg, M. ; Carpaij, O.A. ; Polanski, K. ; Simon, L. ; Brouwer, S. ; Gomes, T. ; Hesse, L. ; Jiang, J. ; Fasouli, E.S. ; Efremova, M. ; Vento-Tormo, R. ; Talavera-López, C. ; Jonker, M.R. ; Affleck, K. ; Palit, S. ; Strzelecka, P.M. ; Firth, H.V. ; Mahbubani, K.T. ; Cvejic, A. ; Meyer, K.B. ; Saeb-Parsy, K. ; Luinge, M. ; Brandsma, C.A. ; Timens, W. ; Angelidis, I. ; Strunz, M. ; Koppelman, G.H. ; van Oosterhout, A.J. ; Schiller, H. B. ; Theis, F.J. ; van den Berge, M. ; Nawijn, M.C. ; Teichmann, S.A.
Nat. Med. 25, 1153-1163 (2019)
Human lungs enable efficient gas exchange and form an interface with the environment, which depends on mucosal immunity for protection against infectious agents. Tightly controlled interactions between structural and immune cells are required to maintain lung homeostasis. Here, we use single-cell transcriptomics to chart the cellular landscape of upper and lower airways and lung parenchyma in healthy lungs, and lower airways in asthmatic lungs. We report location-dependent airway epithelial cell states and a novel subset of tissue-resident memory T cells. In the lower airways of patients with asthma, mucous cell hyperplasia is shown to stem from a novel mucous ciliated cell state, as well as goblet cell hyperplasia. We report the presence of pathogenic effector type 2 helper T cells (T(H)2) in asthmatic lungs and find evidence for type 2 cytokines in maintaining the altered epithelial cell states. Unbiased analysis of cell-cell interactions identifies a shift from airway structural cell communication in healthy lungs to a T(H)2-dominated interactome in asthmatic lungs.
Scientific Article
Czamara, D. ; Eraslan, G. ; Page, C.M. ; Lahti, J. ; Lahti-Pulkkinen, M. ; Hämäläinen, E. ; Kajantie, E. ; Laivuori, H. ; Villa, P.M. ; Reynolds, R.M. ; Nystad, W. ; Håberg, S.E. ; London, S.J. ; O'Donnell, K.J. ; Garg, E. ; Meaney, M.J. ; Entringer, S. ; Wadhwa, P.D. ; Buss, C. ; Jones, M.J. ; Lin, D.T.S. ; MacIsaac, J.L. ; Kobor, M.S. ; Koen, N. ; Zar, H.J. ; Koenen, K.C. ; Dalvie, S. ; Stein, D.J. ; Kondofersky, I. ; Müller, N.S. ; Theis, F.J. ; Räikkönen, K. ; Binder, E.B.
Nat. Commun. 10:2548 (2019)
Epigenetic processes, including DNA methylation (DNAm), are among the mechanisms allowing integration of genetic and environmental factors to shape cellular function. While many studies have investigated either environmental or genetic contributions to DNAm, few have assessed their integrated effects. Here we examine the relative contributions of prenatal environmental factors and genotype on DNA methylation in neonatal blood at variably methylated regions (VMRs) in 4 independent cohorts (overall n = 2365). We use Akaike's information criterion to test which factors best explain variability of methylation in the cohort-specific VMRs: several prenatal environmental factors (E), genotypes in cis (G), or their additive (G + E) or interaction (GxE) effects. Genetic and environmental factors in combination best explain DNAm at the majority of VMRs. The CpGs best explained by either G, G + E or GxE are functionally distinct. The enrichment of genetic variants from GxE models in GWAS for complex disorders supports their importance for disease risk.
Scientific Article
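The model comparison described in the abstract above (choosing among G, E, G + E and GxE by Akaike's information criterion) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes a single genotype vector `g`, a single environmental variable `e`, ordinary least squares with Gaussian errors, and the hypothetical helper names `gaussian_aic` and `best_model`.

```python
import numpy as np


def gaussian_aic(y, X):
    """AIC of an OLS fit with Gaussian errors (additive constants dropped)."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # regression coefficients plus error variance
    return n * np.log(rss / n) + 2 * k


def best_model(y, g, e):
    """Return the lowest-AIC model name among G, E, G+E, GxE and all scores."""
    candidates = {
        "G": np.column_stack([g]),
        "E": np.column_stack([e]),
        "G+E": np.column_stack([g, e]),
        "GxE": np.column_stack([g, e, g * e]),  # main effects plus interaction
    }
    scores = {name: gaussian_aic(y, X) for name, X in candidates.items()}
    return min(scores, key=scores.get), scores
```

Here `y` would hold the methylation level at one region across samples. The `2 * k` penalty is what lets a simpler model win when the interaction term adds little explanatory power.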
Bastidas-Ponce, A. ; Tritschler, S. ; Dony, L. ; Scheibner, K. ; Tarquis-Medina, M. ; Salinno, C. ; Schirge, S. ; Burtscher, I. ; Böttcher, A. ; Theis, F.J. ; Lickert, H. ; Bakhti, M.
Development 146:dev173849 (2019)
Deciphering mechanisms of endocrine cell induction, specification and lineage allocation in vivo will provide valuable insights into how the islets of Langerhans are generated. Currently, it is ill defined how endocrine progenitors segregate into different endocrine subtypes during development. Here, we generated a novel neurogenin 3 (Ngn3)-Venus fusion (NVF) reporter mouse line, that closely mirrors the transient endogenous Ngn3 protein expression. To define an in vivo roadmap of endocrinogenesis, we performed single cell RNA sequencing of 36,351 pancreatic epithelial and NVF+ cells during secondary transition. This allowed Ngn3(low) endocrine progenitors, Ngn3(high) endocrine precursors, Fev(+) endocrine lineage and hormone(+) endocrine subtypes to be distinguished and time-resolved, and molecular programs during the step-wise lineage restriction steps to be delineated. Strikingly, we identified 58 novel signature genes that show the same transient expression dynamics as Ngn3 in the 7260 profiled Ngn3-expressing cells. The differential expression of these genes in endocrine precursors associated with their cell-fate allocation towards distinct endocrine cell types. Thus, the generation of an accurately regulated NVF reporter allowed us to temporally resolve endocrine lineage development to provide a fine-grained single cell molecular profile of endocrinogenesis in vivo.
Scientific Article
Theis, F.J. ; Lickert, H.
Nature 569, 342-343 (2019)
The use of stem-cell-derived beta-cells to replace those destroyed in pancreatic islets has the potential to cure diabetes. A new analysis provides a deep mechanistic understanding of islet-cell differentiation from stem cells. SEE ARTICLE P. 368
Editorial
Matejka, K. ; Stückler, F. ; Salomon, M. ; Ensenauer, R. ; Reischl, E. ; Hoerburger, L. ; Grallert, H. ; Kastenmüller, G. ; Peters, A. ; Daniel, H. ; Krumsiek, J. ; Theis, F.J. ; Hauner, H. ; Laumen, H.
PLoS ONE 14:e0216110 (2019)
Background: Genome-wide association studies of common diseases or metabolite quantitative traits often identify common variants of small effect size, which may contribute to phenotypes by modulation of gene expression. Thus, there is growing demand for cellular models that make it possible to assess the impact of gene regulatory variants with moderate effects on gene expression. Mitochondrial fatty acid oxidation is an important energy metabolism pathway. Common noncoding acyl-CoA dehydrogenase short chain (ACADS) gene variants are associated with plasma C4-acylcarnitine levels, and allele-specific modulation of ACADS expression may contribute to the observed phenotype. Methods and findings: We assessed ACADS expression and intracellular acylcarnitine levels in human lymphoblastoid cell lines (LCL) genotyped for a common ACADS variant associated with plasma C4-acylcarnitine and found a significant genotype-dependent decrease of ACADS mRNA and protein. Next, we modelled gradual decrease of ACADS expression using a tetracycline-regulated shRNA-knockdown of ACADS in Huh7 hepatocytes, a cell line with high fatty acid oxidation (FAO) capacity. Assessing acylcarnitine flux in both models, we found increased C4-acylcarnitine levels with decreased ACADS expression levels. Moreover, assessing time-dependent changes of acylcarnitine levels in shRNA-hepatocytes with altered ACADS expression levels revealed an unexpected effect on long- and medium-chain fatty acid intermediates. Conclusions: Both genotyped LCL and regulated shRNA-knockdown are valuable tools to model moderate, gradual gene-regulatory effects of common variants on cellular phenotypes. Decreasing ACADS expression levels modulate short and, surprisingly, also long/medium chain acylcarnitines, and may contribute to increased plasma acylcarnitine levels.
Scientific Article
Schiller, H. B. ; Montoro, D.T. ; Simon, L. ; Rawlins, E.L. ; Meyer, K.B. ; Strunz, M. ; Vieira Braga, F. ; Timens, W. ; Koppelman, G.H. ; Budinger, G.R.S. ; Burgess, J.K. ; van den Berge, M. ; Theis, F.J. ; Regev, A. ; Kaminski, N. ; Rajagopal, J. ; Teichmann, S.A. ; Misharin, A.V. ; Nawijn, M.C.
Am. J. Respir. Cell Mol. Biol. 61, 31-41 (2019)
Lung disease accounts for every sixth death globally. Profiling the molecular state of all lung cell types in health and disease is currently revolutionizing the identification of disease mechanisms and will aid the design of novel diagnostic and personalized therapeutic regimens. Recent progress in high-throughput techniques for single-cell genomic and transcriptomic analyses has opened up new possibilities to study individual cells within a tissue, classify these into cell types, and characterize variations in their molecular profiles as a function of genetics, environment, cell-cell interactions, developmental processes, aging, or disease. Integration of these cell state definitions with spatial information allows the in-depth molecular description of cellular neighborhoods and tissue microenvironments, including the tissue resident structural and immune cells, the tissue matrix, and the microbiome. The Human Cell Atlas consortium aims to characterize all cells in the healthy human body and has prioritized lung tissue as one of the flagship projects. Here, we present the rationale, the approach, and the expected impact of a Human Lung Cell Atlas.
Review
Laimighofer, M. ; Lickert, R. ; Fuerst, R. ; Theis, F.J. ; Winkler, C. ; Bonifacio, E. ; Ziegler, A.-G. ; Krumsiek, J.
Sci. Rep. 9:6250 (2019)
Birth by Cesarean section increases the risk of developing type 1 diabetes later in life. We aimed to elucidate common regulatory processes observed after Cesarean section and the development of islet autoimmunity, which precedes type 1 diabetes, by investigating the transcriptome of blood cells in the developing immune system. To investigate Cesarean section effects, we analyzed longitudinal gene expression profiles from peripheral blood mononuclear cells taken at several time points from children with increased familial and genetic risk for type 1 diabetes. For islet autoimmunity, we compared gene expression differences between children after initiation of islet autoimmunity and age-matched children who did not develop islet autoantibodies. Finally, we compared both results to identify common regulatory patterns. We identified the pentose phosphate pathway and pyrimidine metabolism - both involved in nucleotide synthesis and cell proliferation - to be differentially expressed in children born by Cesarean section and after islet autoimmunity. Comparison of global gene expression signatures showed that transcriptomic changes were systematically and significantly correlated between Cesarean section and islet autoimmunity. Moreover, signatures of both Cesarean section and islet autoimmunity correlated with transcriptional changes observed during activation of isolated CD4+ T lymphocytes. In conclusion, we identified shared molecular changes relating to immune cell activation in children born by Cesarean section and children who developed autoimmunity. Our results serve as a starting point for further investigations on how a type 1 diabetes risk factor impacts the young immune system at a molecular level.
Scientific Article
Eraslan, G. ; Avsec, Ž. ; Gagneur, J. ; Theis, F.J.
Nat. Rev. Genet. 20, 389-403 (2019)
As a data-driven science, genomics largely utilizes machine learning to capture dependencies in data and derive novel biological hypotheses. However, the ability to extract new insights from the exponentially increasing volume of genomics data requires more expressive machine learning models. By effectively leveraging large data sets, deep learning has transformed fields such as computer vision and natural language processing. Now, it is becoming the method of choice for many genomics modelling tasks, including predicting the impact of genetic variation on gene regulatory mechanisms such as DNA accessibility and splicing.
Review
Fischer, D.S. ; Fiedler, A. ; Kernfeld, E.M. ; Genga, R.M.J. ; Bastidas-Ponce, A. ; Bakhti, M. ; Lickert, H. ; Hasenauer, J. ; Maehr, R. ; Theis, F.J.
Nat. Biotechnol. 37, 461-468 (2019)
Recent single-cell RNA-sequencing studies have suggested that cells follow continuous transcriptomic trajectories in an asynchronous fashion during development. However, observations of cell flux along trajectories are confounded with population size effects in snapshot experiments and are therefore hard to interpret. In particular, changes in proliferation and death rates can be mistaken for cell flux. Here we present pseudodynamics, a mathematical framework that reconciles population dynamics with the concepts underlying developmental trajectories inferred from time-series single-cell data. Pseudodynamics models population distribution shifts across trajectories to quantify selection pressure, population expansion, and developmental potentials. Applying this model to time-resolved single-cell RNA-sequencing of T-cell and pancreatic beta cell maturation, we characterize proliferation and apoptosis rates and identify key developmental checkpoints, data inaccessible to existing approaches.
Scientific Article
Krautenbacher, N. ; Flach, N. ; Böck, A. ; Laubhahn, K. ; Laimighofer, M. ; Theis, F.J. ; Ankerst, D.P. ; Fuchs, C. ; Schaub, B.
Allergy 74, 1364-1373 (2019)
Background: Associations between childhood asthma phenotypes and genetic, immunological, and environmental factors have been previously established. Yet, strategies to integrate high-dimensional risk factors from multiple distinct data sets, and thereby increase the statistical power of analyses, have been hampered by a preponderance of missing data and lack of methods to accommodate them. Methods: We assembled questionnaire, diagnostic, genotype, microarray, RT-qPCR, flow cytometry, and cytokine data (referred to as data modalities) to use as input factors for a classifier that could distinguish healthy children, mild-to-moderate allergic asthmatics, and nonallergic asthmatics. Based on data from 260 German children aged 4-14 from our university outpatient clinic, we built a novel multilevel prediction approach for asthma outcome that could handle the complex missing data structure present. Results: The optimal learning method was boosting based on all data sets, achieving an area underneath the receiver operating characteristic curve (AUC) for three classes of phenotypes of 0.81 (95%-confidence interval (CI): 0.65-0.94) using leave-one-out cross-validation. Besides improving the AUC, our integrative multilevel learning approach led to tighter CIs than using smaller complete predictor data sets (AUC = 0.82 [0.66-0.94] for boosting). The most important variables for classifying childhood asthma phenotypes comprised newly identified genes, namely PKN2 (protein kinase N2), PTK2 (protein tyrosine kinase 2), and ALPP (alkaline phosphatase, placental). Conclusion: Our combination of several data modalities using a novel strategy improved classification of childhood asthma phenotypes but requires validation in external populations. The generic approach is applicable to other multilevel data-based risk prediction settings, which typically suffer from incomplete data.
Scientific Article
Wang, R. ; Thomas, J. ; Batra, R. ; Biedermann, T. ; Theis, F.J. ; Eyerich, S. ; Eyerich, K.
Exp. Dermatol. 28, E29-E30 (2019)
Meeting abstract
Wolf, F.A. ; Hamey, F.K. ; Plass, M. ; Solana, J. ; Dahlin, J.S. ; Göttgens, B. ; Rajewsky, N. ; Simon, L. ; Theis, F.J.
Genome Biol. 20:59 (2019)
Single-cell RNA-seq quantifies biological heterogeneity across both discrete cell types and continuous cell transitions. Partition-based graph abstraction (PAGA) provides an interpretable graph-like map of the arising data manifold, based on estimating connectivity of manifold partitions (https://github.com/theislab/paga). PAGA maps preserve the global topology of data, allow analyzing data at different resolutions, and result in much higher computational efficiency of the typical exploratory data analysis workflow. We demonstrate the method by inferring structure-rich cell maps with consistent topology across four hematopoietic datasets, adult planaria and the zebrafish embryo and benchmark computational performance on one million neurons.
Scientific Article
Fröhlich, F. ; Reiser, A. ; Fink, L. ; Woschée, D. ; Ligon, T. ; Theis, F.J. ; Rädler, J.O. ; Hasenauer, J.
NPJ Syst. Biol. Appl. 5:11 (2019)
The original version of this Article had an incorrect Article number of 1, an incorrect Volume of 5 and an incorrect Publication year of 2019. These errors have now been corrected in the PDF and HTML versions of the Article.
Angelidis, I. ; Simon, L. ; Fernandez, I.E. ; Strunz, M. ; Mayr, C. ; Greiffo, F.R. ; Tsitsiridis, G. ; Ansari, M. ; Graf, E. ; Strom, T.M. ; Nagendran, M. ; Desai, T. ; Eickelberg, O. ; Mann, M. ; Theis, F.J. ; Schiller, H. B.
Nat. Commun. 10:963 (2019)
Aging promotes lung function decline and susceptibility to chronic lung diseases, which are the third leading cause of death worldwide. Here, we use single cell transcriptomics and mass spectrometry-based proteomics to quantify changes in cellular activity states across 30 cell types and chart the lung proteome of young and old mice. We show that aging leads to increased transcriptional noise, indicating deregulated epigenetic control. We observe cell type-specific effects of aging, uncovering increased cholesterol biosynthesis in type-2 pneumocytes and lipofibroblasts and altered relative frequency of airway epithelial cells as hallmarks of lung aging. Proteomic profiling reveals extracellular matrix remodeling in old mice, including increased collagen IV and XVI and decreased Fraser syndrome complex proteins and collagen XIV. Computational integration of the aging proteome with the single cell transcriptomes predicts the cellular source of regulated proteins and creates an unbiased reference map of the aging lung.
Scientific Article
Sheng, X. ; Nenseth, H.Z. ; Qu, S. ; Kuzu, O.F. ; Frahnow, T. ; Simon, L. ; Greene, S. ; Zeng, Q. ; Fazli, L. ; Rennie, P.S. ; Mills, I.G. ; Danielsen, H. ; Theis, F.J. ; Patterson, J.B. ; Jin, Y. ; Saatcioglu, F.
Nat. Commun. 10:323 (2019)
Activation of endoplasmic reticulum (ER) stress/the unfolded protein response (UPR) has been linked to cancer, but the molecular mechanisms are poorly understood and there is a paucity of reagents to translate this for cancer therapy. Here, we report that an IRE1 alpha RNase-specific inhibitor, MKC8866, strongly inhibits prostate cancer (PCa) tumor growth as monotherapy in multiple preclinical models in mice and shows synergistic antitumor effects with current PCa drugs. Interestingly, global transcriptomic analysis reveals that IRE1 alpha-XBP1s pathway activity is required for c-MYC signaling, one of the most highly activated oncogenic pathways in PCa. XBP1s is necessary for optimal c-MYC mRNA and protein expression, establishing, for the first time, a direct link between UPR and oncogene activation. In addition, an XBP1-specific gene expression signature is strongly associated with PCa prognosis. Our data establish IRE1 alpha-XBP1s signaling as a central pathway in PCa and indicate that its targeting may offer novel treatment strategies.
Scientific Article
Eraslan, G. ; Simon, L. ; Mircea, M. ; Müller, N.S. ; Theis, F.J.
Nat. Commun. 10:390 (2019)
Single-cell RNA sequencing (scRNA-seq) has enabled researchers to study gene expression at a cellular resolution. However, noise due to amplification and dropout may obstruct analyses, so scalable denoising methods for increasingly large but sparse scRNA-seq data are needed. We propose a deep count autoencoder network (DCA) to denoise scRNA-seq datasets. DCA takes the count distribution, overdispersion and sparsity of the data into account using a negative binomial noise model with or without zero-inflation, and nonlinear gene-gene dependencies are captured. Our method scales linearly with the number of cells and can, therefore, be applied to datasets of millions of cells. We demonstrate that DCA denoising improves a diverse set of typical scRNA-seq data analyses using simulated and real datasets. DCA outperforms existing methods for data imputation in quality and speed, enhancing biological discovery.
Scientific Article
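The negative binomial noise model with optional zero-inflation named in the DCA abstract above can be illustrated with a minimal sketch. This is not the DCA implementation: the function names and the mean/dispersion parametrization (`mu`, `theta`, dropout probability `pi`) are assumptions chosen for illustration, and the sketch only evaluates log-probabilities rather than training an autoencoder.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0 (assumed parametrization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability under a zero-inflated negative binomial:
    with probability pi (0 <= pi < 1) the count is a structural zero,
    otherwise it is drawn from the negative binomial."""
    if x == 0:
        return math.log(pi + (1.0 - pi) * math.exp(nb_log_pmf(0, mu, theta)))
    return math.log1p(-pi) + nb_log_pmf(x, mu, theta)
```

In a denoising setting, the negative sum of such log-probabilities over all genes and cells would serve as the reconstruction loss, with `mu`, `theta` and `pi` predicted per entry by the network.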
Kaderali, L. ; Theis, F.J. ; Ganusov, V.V. ; Ciupe, S.M. ; Mehr, R. ; Ribeiro, R.M. ; Hernandez-Vargas, E.A.
Front. Microbiol. 9:3338 (2019)
Editorial
Büttner, M. ; Miao, Z. ; Wolf, F.A. ; Teichmann, S.A. ; Theis, F.J.
Nat. Methods 16, 43-49 (2019)
Single-cell transcriptomics is a versatile tool for exploring heterogeneous cell populations, but as with all genomics experiments, batch effects can hamper data integration and interpretation. The success of batch-effect correction is often evaluated by visual inspection of low-dimensional embeddings, which are inherently imprecise. Here we present a user-friendly, robust and sensitive k-nearest-neighbor batch-effect test (kBET; https://github.com/theislab/kBET) for quantification of batch effects. We used kBET to assess commonly used batch-regression and normalization approaches, and to quantify the extent to which they remove batch effects while preserving biological variability. We also demonstrate the application of kBET to data from peripheral blood mononuclear cells (PBMCs) from healthy donors to distinguish cell-type-specific inter-individual variability from changes in relative proportions of cell populations. This has important implications for future data-integration efforts, central to projects such as the Human Cell Atlas.
Scientific Article
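The idea behind the k-nearest-neighbor batch-effect test described in the kBET abstract above can be sketched as follows. This is a simplified illustration, not the kBET package: the helper names are hypothetical, neighbours are found by brute force, the cell itself is excluded from its neighbourhood (an assumption), and only the Pearson chi-squared statistic is computed, without the p-value or the rejection rate over many sampled cells.

```python
import math
from collections import Counter


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i]
    (Euclidean distance, excluding i itself)."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    return [j for _, j in sorted(dists)[:k]]


def local_batch_chi2(points, batches, i, k):
    """Pearson chi-squared statistic comparing the batch counts among the
    neighbours of cell i with the counts expected from global batch frequencies."""
    neighbours = knn_indices(points, i, k)
    local = Counter(batches[j] for j in neighbours)
    global_counts = Counter(batches)
    n, m = len(batches), len(neighbours)
    stat = 0.0
    for batch, count in global_counts.items():
        expected = m * count / n
        stat += (local.get(batch, 0) - expected) ** 2 / expected
    return stat
```

A statistic near zero indicates a neighbourhood whose batch composition matches the whole dataset (well mixed); large values indicate a local batch effect.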
Tischler, J. ; Gruhn, W.H. ; Reid, J. ; Allgeyer, E. ; Buettner, F. ; Marr, C. ; Theis, F.J. ; Simons, B.D. ; Wernisch, L. ; Surani, M.A.
EMBO J. 38:e99518 (2019)
An intricate link is becoming apparent between metabolism and cellular identities. Here, we explore the basis for such a link in an in vitro model for early mouse embryonic development: from naive pluripotency to the specification of primordial germ cells (PGCs). Using single-cell RNA-seq with statistical modelling and modulation of energy metabolism, we demonstrate a functional role for oxidative mitochondrial metabolism in naive pluripotency. We link mitochondrial tricarboxylic acid cycle activity to IDH2-mediated production of alpha-ketoglutarate and through it, the activity of key epigenetic regulators. Accordingly, this metabolite has a role in the maintenance of naive pluripotency as well as in PGC differentiation, likely through preserving a particular histone methylation status underlying the transient state of developmental competence for the PGC fate. We reveal a link between energy metabolism and epigenetic control of cell state transitions during a developmental trajectory towards germ cell specification, and establish a paradigm for stabilizing fleeting cellular states through metabolic modulation.
Scientific Article
Thomas, J. ; Küpper, M. ; Batra, R. ; Jargosch, M. ; Atenhan, A. ; Baghin, V. ; Krause, L. ; Lauffer, F. ; Biedermann, T. ; Theis, F.J. ; Eyerich, K. ; Eyerich, S. ; Garzorz-Stark, N.
J. Eur. Acad. Dermatol. Venereol. 33, 115-122 (2019)
Background Imbalances of T-cell subsets are hallmarks of disease-specific inflammation in psoriasis. However, the relevance of B cells for psoriasis remains poorly investigated. Objective To analyse the role of B cells and immunoglobulins for the disease-specific immunology of psoriasis. Methods We characterized B-cell subsets and immunoglobulin levels in untreated psoriasis patients (n = 37) and compared them to healthy controls (n = 20) as well as to psoriasis patients under disease-controlling systemic treatment (n = 28). B-cell subsets were analysed following the flow cytometric gating strategy based on the surface markers CD24, CD38 and CD138. Moreover, immunofluorescence stainings were used to detect IgA in psoriatic skin. Results We found significantly increased levels of IgA in the serum of treatment-naive psoriasis patients correlating with disease score. However, IgA was only observed in dermal vessels of skin sections. Concerning B-cell subsets, we only found a moderately positive correlation of CD138(+) plasma cells with IgA levels and disease score in treatment-naive psoriasis patients. Confirming our hypothesis that psoriasis can develop in the absence of functional humoral immunity, we investigated a patient who suffered concomitantly from both psoriasis and a hereditary common variable immune defect (CVID) characterized by a lack of B cells and immunoglobulins. We detected variants in three of the 13 described genes of CVID and a so far undescribed variant in the ligand of the TNFRSF13B receptor leading to disturbed B-cell maturation and antibody production. However, this patient showed typical psoriasis regarding clinical presentation, histology or T-cell infiltrate. Finally, in a group of psoriasis patients under systemic treatment, neither did IgA levels drop nor did plasma cells correlate with IgA levels and disease score. 
Conclusion B-cell alterations might rather be an epiphenomenal finding in psoriasis with a clear dominance of T cells over shifts in B-cell subsets.
Scientific Article
2018
Backofen, R. ; Costa, F. ; Theis, F.J. ; Marr, C. ; Preusse, M. ; Becker, C. ; Saunders, S. ; Palme, K. ; Dovzhenko, O.
In: Lecture Notes in Bioengineering. 2018. 85-100
MicroRNAs, gene-encoded small RNA molecules, play an integral part in gene regulation by binding to target mRNAs and preventing their translation. The prediction of microRNA–mRNA-binding sites and the resulting interaction network are essential to understand, and thus influence, regulation of the genetic information flow inside the living organism. Numerous algorithms have been proposed based on various heuristics; however, the predictions often vary considerably. In this proposal we will extend a physical model for the binding of microRNAs to the corresponding target and establish an extended set of features influencing binding probabilities. We will be faced with the challenge of (i) too many features and (ii) few known interactions on which to train any prediction algorithm. This problem will be solved using (i) information-theoretical criteria for feature reduction, (ii) regularization, (iii) application of the Infomax approach to guarantee minimal loss of information after dimension reduction, and (iv) experimental validation of theoretical predictions using a novel test system. This strategy will allow (i) statistical analysis of the predicted microRNA–mRNA hypergraph, (ii) characterization of network motifs and hierarchies, (iii) identification of missing links, and (iv) removal of false interactions.
Müller, N.S. ; Sass, S. ; Offermann, B. ; Singh, A. ; Knauer, S. ; Schüttler, A. ; Minardi, J.N. ; Theis, F.J. ; Busch, H. ; Boerries, M.
In: Lecture Notes in Bioengineering. 2018. 115-136
Cell–cell communication is a complex process regulating the homeostasis and cellular decisions in a multicellular organism. The correct information flow is a necessity for a healthy cellular microenvironment and proper response to external stimuli, such as inflammation and wound healing. Altered cell–cell communication is a hallmark of aging and disease. In particular, tumor–stroma interactions have attracted increased attention in recent years as putative therapeutic targets of intervention. Most studies so far have investigated individual cytokines or analyzed steady-state feedback-entangled cell–cell communication. Here, we study the onset of cell–cell communication by a defined double paracrine experimental setup of skin cells. We build on the experimental model systems developed in the first funding period and use conditioned supernatant stimulation to record whole transcriptome response time series as well as changes in the whole secretome to correlate cytokine patterns with phenotype responses. Moreover, we model the changes in gene expression and cytokine secretion with communication-theoretic approaches, using independent component analysis and Gaussian processes. The information from these general models is used for mechanistic, whole cell modeling using gene regulatory networks and Boolean models that comprise long-term dynamics of the cellular responses as well as multiple time scales of protein signaling, gene expression, and auto- and paracrine feedbacks. Such approaches will elucidate bi-stability of cellular homeostasis locking the cells into inflammatory or migratory states. Lastly, we will test the generic regulatory schemes by comparison of our currently investigated skin communication model with a tumor–stroma interaction system of human melanoma and fibroblast cells.
Kyncl, M. ; Bast, L. ; Henkel, L. ; Theis, F.J. ; Oostendorp, R.A.J. ; Marr, C. ; Goetze, K.S.
Blood 132 (2018)
Meeting abstract
Hross, S. ; Theis, F.J. ; Sixt, M. ; Hasenauer, J.
J. R. Soc. Interface 15:20180600 (2018)
Spatial patterns are ubiquitous on the subcellular, cellular and tissue level, and can be studied using imaging techniques such as light and fluorescence microscopy. Imaging data provide quantitative information about biological systems; however, mechanisms causing spatial patterning often remain elusive. In recent years, spatio-temporal mathematical modelling has helped to overcome this problem. Yet, outliers and structured noise limit modelling of whole imaging data, and models often consider spatial summary statistics. Here, we introduce an integrated data-driven modelling approach that can cope with measurement artefacts and whole imaging data. Our approach combines mechanistic models of the biological processes with robust statistical models of the measurement process. The parameters of the integrated model are calibrated using a maximum-likelihood approach. We used this integrated modelling approach to study in vivo gradients of the chemokine (C-C motif) ligand 21 (CCL21). CCL21 gradients guide dendritic cells and are important in the adaptive immune response. Using artificial data, we verified that the integrated modelling approach provides reliable parameter estimates in the presence of measurement noise and that bias and variance of these estimates are reduced compared to conventional approaches. The application to experimental data allowed the parametrization and subsequent refinement of the model using additional mechanisms. Among other results, model-based hypothesis testing predicted lymphatic vessel-dependent concentration of heparan sulfate, the binding partner of CCL21. The selected model provided an accurate description of the experimental data and was partially validated using published data. Our findings demonstrate that integrated statistical modelling of whole imaging data is computationally feasible and can provide novel biological insights.
Scientific Article
Bast, L. ; Calzolari, F. ; Strasser, M. ; Hasenauer, J. ; Theis, F.J. ; Ninkovic, J. ; Marr, C.
Cell Rep. 25, 3231-3240.e8 (2018)
Adult murine neural stem cells (NSCs) generate neurons in drastically declining numbers with age. How cellular dynamics sustain neurogenesis and how alterations with age may result in this decline are unresolved issues. We therefore clonally traced NSC lineages using confetti reporters in young and middle-aged adult mice. To understand the underlying mechanisms, we derived mathematical models that explain observed clonal cell type abundances. The best models consistently show self-renewal of transit-amplifying progenitors and rapid neuroblast cell cycle exit. In middle-aged mice, we identified an increased probability of asymmetric stem cell divisions at the expense of symmetric differentiation, accompanied by an extended persistence of quiescence between activation phases. Our model explains existing longitudinal population data and identifies particular cellular properties underlying adult NSC homeostasis and the aging of this stem cell compartment.
Scientific Article
Fröhlich, F. ; Reiser, A. ; Fink, L. ; Woschée, D. ; Ligon, T. ; Theis, F.J. ; Rädler, J.O. ; Hasenauer, J.
NPJ Syst. Biol. Appl. 5 (2018)
Single-cell time-lapse studies have advanced the quantitative understanding of cellular pathways and their inherent cell-to-cell variability. However, parameters retrieved from individual experiments are model dependent and their estimation is limited, if based on solely one kind of experiment. Hence, methods to integrate data collected under different conditions are expected to improve model validation and information content. Here we present a multi-experiment nonlinear mixed effect modeling approach for mechanistic pathway models, which allows the integration of multiple single-cell perturbation experiments. We apply this approach to the translation of green fluorescent protein after transfection using a massively parallel read-out of micropatterned single-cell arrays. We demonstrate that the integration of data from perturbation experiments allows the robust reconstruction of cell-to-cell variability, i.e., parameter densities, while each individual experiment provides insufficient information. Indeed, we show that the integration of the datasets on the population level also improves the estimates for individual cells by breaking symmetries, although each of them is only measured in one experiment. Moreover, we confirmed that the suggested approach is robust with respect to batch effects across experimental replicates and can provide mechanistic insights into the nature of batch effects. We anticipate that the proposed multi-experiment nonlinear mixed effect modeling approach will serve as a basis for the analysis of cellular heterogeneity in single-cell dynamics.
Scientific Article
Fröhlich, F. ; Kessler, T. ; Weindl, D. ; Shadrin, A. ; Schmiester, L. ; Hache, H. ; Muradyan, A. ; Schütte, M. ; Lim, J.H. ; Heinig, M. ; Theis, F.J. ; Lehrach, H. ; Wierling, C. ; Lange, B. ; Hasenauer, J.
Cell Syst. 7, 567-579 (2018)
Mechanistic models are essential to deepen the understanding of complex diseases at the molecular level. Nowadays, high-throughput molecular and phenotypic characterizations are possible, but the integration of such data with prior knowledge on signaling pathways is limited by the availability of scalable computational methods. Here, we present a computational framework for the parameterization of large-scale mechanistic models and its application to the prediction of drug response of cancer cell lines from exome and transcriptome sequencing data. This framework is over 10^4 times faster than state-of-the-art methods, which enables modeling at previously infeasible scales. By applying the framework to a model describing major cancer-associated pathways (>1,200 species and >2,600 reactions), we could predict the effect of drug combinations from single drug data. This is the first integration of high-throughput datasets using large-scale mechanistic models. We anticipate this to be the starting point for development of more comprehensive models allowing a deeper mechanistic insight.
Scientific Article
Foerster, K. ; Ertl-Wagner, B. ; Sass, S. ; Stoecklein, S. ; Schoeppe, F. ; Dietrich, O. ; Pomschar, A. ; Schulze, A. ; Huebener, C. ; Theis, F.J. ; Ehrhardt, H. ; Flemmer, A. ; Hilgendorff, A.
Am. J. Respir. Crit. Care Med. 197 (2018)
Meeting abstract
Foerster, K. ; Sass, S. ; Pomschar, A. ; Ehrhardt, H. ; Naehrlich, L. ; Schulze, A. ; Flemmer, A. ; Huebener, C. ; Eickelberg, O. ; Theis, F.J. ; Dietrich, O. ; Ertl-Wagner, B. ; Hilgendorff, A.
Am. J. Respir. Crit. Care Med. 197 (2018)
Meeting abstract
Nawijn, M.C. ; Rajagopal, J. ; Koppelman, G.H. ; van den Berge, M. ; Rose-Zerilli, M.J. ; Holloway, J.W. ; Schultze, J.L. ; Barbry, P. ; Teichmann, S.A. ; Prabhakar, S. ; Theis, F.J. ; Schiller, H. B.
Am. J. Respir. Crit. Care Med. 197 (2018)
Meeting abstract
Fischer, D.S. ; Theis, F.J. ; Yosef, N.
Nucleic Acids Res. 46:e119 (2018)
Temporal changes to the concentration of molecular species such as mRNA, which take place in response to various environmental cues, can often be modeled as simple continuous functions such as a single pulse (impulse) model. The simplicity of such functional representations can provide an improved performance on fundamental tasks such as noise reduction, imputation and differential expression analysis. However, temporal gene expression profiles are often studied with models that treat time as a categorical variable, neglecting the dependence between time points. Here, we present ImpulseDE2, a framework for differential expression analysis that combines the power of the impulse model as a continuous representation of temporal responses along with a noise model tailored specifically to sequencing data. We compare the simple categorical models to ImpulseDE2 and to other continuous models based on natural cubic splines and demonstrate the utility of the continuous approach for studying differential expression in time course sequencing experiments. A unique feature of ImpulseDE2 is the ability to distinguish permanently from transiently up- or down-regulated genes. Using an in vitro differentiation dataset, we demonstrate that this gene classification scheme can be used to highlight distinct transcriptional programs that are associated with different phases of the differentiation process.
Scientific Article
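The single-pulse (impulse) model mentioned in the ImpulseDE2 abstract above can be sketched as a product of two sigmoids. The parametrization below (initial level `h0`, peak level `h1`, final level `h2`, onset time `t1`, offset time `t2`, slope `beta`) is the commonly used impulse form and is an assumption here; it is not taken from the abstract, and the sketch covers only the mean function, not the sequencing noise model or the differential expression test.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def impulse(t, beta, h0, h1, h2, t1, t2):
    """Impulse model value at time t: rises from h0 towards h1 around t1,
    then moves from h1 towards h2 around t2 (h1 must be non-zero)."""
    rise = h0 + (h1 - h0) * sigmoid(beta * (t - t1))
    fall = h2 + (h1 - h2) * sigmoid(-beta * (t - t2))
    return rise * fall / h1
```

Under this form, a gene with `h2` close to `h0` but a distinct `h1` behaves as transiently regulated, while `h2` close to `h1` describes a permanent change, which mirrors the distinction the abstract draws.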
Neschen, S. ; Wu, M. ; Fuchs, C. ; Kondofersky, I. ; Theis, F.J. ; Hrabě de Angelis, M. ; Häring, H.-U. ; Sartorius, T.
Exp. Clin. Endocrinol. Diabet. 126, 20-29 (2018)
Aims and Methods Glucose homeostasis and energy balance are under control by peripheral and brain processes. Especially insulin signaling in the brain seems to impact whole body glucose homeostasis and interacts with fatty acid signaling. In humans circulating saturated fatty acids are negatively associated with brain insulin action while animal studies suggest both positive and negative interactions of fatty acids and insulin brain action. This apparent discrepancy might reflect a difference between acute and chronic fatty acid signaling. To address this question we investigated the acute effect of an intracerebroventricular palmitic acid administration on peripheral glucose homeostasis. We developed and implemented a method for simultaneous monitoring of brain activity and peripheral insulin action in freely moving mice by combining radiotelemetry electrocorticography (ECoG) and euglycemic-hyperinsulinemic clamps. This method allowed gaining insight into the early kinetics of brain fatty acid signaling and its contemporaneous effect on liver function in vivo, which, to our knowledge, has not been assessed so far in mice. Results Insulin-induced brain activity in the theta and beta band was decreased by acute intracerebroventricular application of palmitic acid. Peripherally it amplified insulin action as demonstrated by a significant inhibition of endogenous glucose production and increased glucose infusion rate. Moreover, our results further revealed that the brain effect of peripheral insulin is modulated by palmitic acid load in the brain. Conclusion These findings suggest that insulin action is amplified in the periphery and attenuated in the brain by acute palmitic acid application. Thus, our results indicate that acute palmitic acid signaling in the brain may be different from chronic effects.
Scientific Article
Do, K.T. ; Wahl, S. ; Raffler, J. ; Molnos, S. ; Laimighofer, M. ; Adamski, J. ; Suhre, K. ; Strauch, K. ; Peters, A. ; Gieger, C. ; Langenberg, C. ; Stewart, I.D. ; Theis, F.J. ; Grallert, H. ; Kastenmüller, G. ; Krumsiek, J.
Metabolomics 14:128 (2018)
BACKGROUND: Untargeted mass spectrometry (MS)-based metabolomics data often contain missing values that reduce statistical power and can introduce bias in biomedical studies. However, a systematic assessment of the various sources of missing values and strategies to handle these data has received little attention. Missing data can occur systematically, e.g. from run day-dependent effects due to limits of detection (LOD); or it can be random as, for instance, a consequence of sample preparation. METHODS: We investigated patterns of missing data in an MS-based metabolomics experiment of serum samples from the German KORA F4 cohort (n = 1750). We then evaluated 31 imputation methods in a simulation framework and biologically validated the results by applying all imputation approaches to real metabolomics data. We examined the ability of each method to reconstruct biochemical pathways from data-driven correlation networks, and the ability of the method to increase statistical power while preserving the strength of established metabolic quantitative trait loci. RESULTS: Run day-dependent LOD-based missing data accounts for most missing values in the metabolomics dataset. Although multiple imputation by chained equations performed well in many scenarios, it is computationally and statistically challenging. K-nearest neighbors (KNN) imputation on observations with variable pre-selection showed robust performance across all evaluation schemes and is computationally more tractable. CONCLUSION: Missing data in untargeted MS-based metabolomics data occur for various reasons. Based on our results, we recommend that KNN-based imputation is performed on observations with variable pre-selection since it showed robust results in all evaluation schemes.
Scientific Article
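The KNN imputation on observations recommended in the metabolomics abstract above can be illustrated with a small sketch. It is a simplified stand-in, not the study's procedure: the function name, the root-mean-square distance over shared variables, and the default `k` are assumptions, and the variable pre-selection step described in the abstract is omitted.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of rows (samples x variables) in which each None entry
    is replaced by the mean of that variable over the k nearest samples
    that have the variable observed."""

    def distance(a, b):
        # Root-mean-square difference over variables observed in both samples.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:  # leave the gap if no sample has this variable observed
                imputed[i][j] = sum(v for _, v in donors) / len(donors)
    return imputed
```

Distances are always computed on the original data, so an imputed value never influences the imputation of another entry.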
Feigelman, J. ; Weindl, D. ; Theis, F.J. ; Marr, C. ; Hasenauer, J.
In: Lecture Notes in Computer Science (16th International Conference on Computational Methods in Systems Biology, 12-14 September 2018, Brno, Czech Republic). 2018. 300-306 (LNBI 11095)
The linear noise approximation (LNA) provides an approximate description of the statistical moments of stochastic chemical reaction networks (CRNs). LNA is a commonly used modeling paradigm describing the probability distribution of systems of biochemical species in the intracellular environment. Unlike exact formulations, the LNA remains computationally feasible even for CRNs with many reactions. The tractability of the LNA makes it a common choice for inference of unknown chemical reaction parameters. However, this task is impeded by a lack of suitable inference tools for arbitrary CRN models. In particular, no available tool provides temporal cross-correlations, parameter sensitivities and efficient numerical integration. In this manuscript we present LNA++, which allows for fast derivation and simulation of the LNA including the computation of means, covariances, and temporal cross-covariances. For efficient parameter estimation and uncertainty analysis, LNA++ implements first and second order sensitivity equations. Interfaces are provided for easy integration with Matlab and Python. Implementation and availability: LNA++ is implemented as a combination of C/C++, Matlab and Python scripts. Code base and the release used for this publication are available on GitHub (https://github.com/ICB-DCM/LNAplusplus) and Zenodo (https://doi.org/10.5281/zenodo.1287771).
Thomas, J. ; Kuepper, M.K. ; Batra, R. ; Jargosch, M. ; Atenhan, A. ; Baghin, V. ; Krause, L. ; Lauffer, F. ; Biedermann, T. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Eyerich, K. ; Eyerich, S. ; Garzorz-Stark, N.
Allergy 73, 732-733 (2018)
Meeting abstract
Jarasch, A. ; Glaser, A. ; Häring, H.-U. ; Roden, M. ; Schürmann, A. ; Solimena, M. ; Theis, F.J. ; Tschöp, M.H. ; Wess, G. ; Hrabě de Angelis, M.
Diabetologe 14, 486-492 (2018)
Since 1980, the number of people with diabetes worldwide has quadrupled. In Germany alone, almost 7 million people suffer from this metabolic disease, and up to 500,000 new cases occur every year. These figures make clear how urgently new, effective prevention measures and innovative forms of treatment are needed. Digitalization makes it possible to study the widespread disease diabetes in a new dimension, in order to detect subtypes of this metabolic disease very early and to offer suitable personalized prevention measures. By establishing a digital diabetes prevention center, health and research data from a wide variety of sources could be brought together and analyzed and evaluated with innovative information technologies (IT), in order to identify different diabetes subtypes and to offer specific prevention and therapy measures that, through close cooperation with the population, could be put to use directly.
Review
Ballnus, B. ; Schaper, S. ; Theis, F.J. ; Hasenauer, J.
Bioinformatics 34, 494-501 (2018)
Motivation: Mathematical models have become standard tools for the investigation of cellular processes and the unraveling of signal processing mechanisms. The parameters of these models are usually derived from the available data using optimization and sampling methods. However, the efficiency of these methods is limited by the properties of the mathematical model, e.g. nonidentifiabilities, and the resulting posterior distribution. In particular, multi-modal distributions with long valleys or pronounced tails are difficult to optimize and sample. Thus, the development or improvement of optimization and sampling methods is subject to ongoing research. Results: We suggest a region-based adaptive parallel tempering algorithm which adapts to the problem-specific posterior distributions, i.e. modes and valleys. The algorithm combines several established algorithms to overcome their individual shortcomings and to improve sampling efficiency. We assessed its properties for established benchmark problems and two ordinary differential equation models of biochemical reaction networks. The proposed algorithm outperformed state-of-the-art methods in terms of calculation efficiency and mixing. Since the algorithm does not rely on a specific problem structure, but adapts to the posterior distribution, it is suitable for a variety of model classes.
Scientific Article
Colomé-Tatché, M. ; Theis, F.J.
Curr. Opin. Syst. Biol. 7, 54-59 (2018)
Single cell high throughput genomic measurements are revolutionizing the fields of biology and medicine, providing a means to tackle biological problems that have thus far been inaccessible, such as the systematic discovery of new cell types, the identification of cellular heterogeneity in health and disease, or the cell-fate decisions taking place during differentiation and reprogramming. Recently implemented multi-omics measurements of genomes, transcriptomes, epigenomes, proteomes and chromatin organization are opening up new avenues to begin to disentangle the causal relationship between -omics layers and how these co-determine higher-order cellular phenotypes. This technological revolution is not restricted to basic science but promises major breakthroughs in medical diagnostics and treatments. In this paper we review existing computational methods for the analysis and integration of different -omics layers and discuss what new approaches are needed to leverage the full potential of single cell multi-omics data.
Review
Strasser, M. ; Hoppe, P.S. ; Loeffler, D. ; Kokkaliaris, K.D. ; Schroeder, T. ; Theis, F.J. ; Marr, C.
Nat. Commun. 9:2697 (2018)
Molecular regulation of cell fate decisions underlies health and disease. To identify molecules that are active or regulated during a decision, and not before or after, the decision time point is crucial. However, cell fate markers are usually delayed and the time of decision therefore unknown. Fortunately, dividing cells induce temporal correlations in their progeny, which allow for retrospective inference of the decision time point. We present a computational method to infer decision time points from correlated marker signals in genealogies and apply it to differentiating hematopoietic stem cells. We find that myeloid lineage decisions happen generations before lineage marker onsets. Inferred decision time points are in agreement with data from colony assay experiments. The levels of the myeloid transcription factor PU.1 do not change during, but long after the predicted lineage decision event, indicating that the PU.1/GATA1 toggle switch paradigm cannot explain the initiation of early myeloid lineage choice.
Scientific Article
Simon, L. ; Karg, S. ; Westermann, A.J. ; Engel, M. ; Elbehery, A.H.A. ; Hense, B.A. ; Heinig, M. ; Deng, L. ; Theis, F.J.
GigaScience 7, 1-8 (2018)
Background: With the advent of the age of big data in bioinformatics, large volumes of data and high-performance computing power enable researchers to perform re-analyses of publicly available datasets at an unprecedented scale. Ever more studies implicate the microbiome in both normal human physiology and a wide range of diseases. RNA sequencing technology (RNA-seq) is commonly used to infer global eukaryotic gene expression patterns under defined conditions, including human disease-related contexts; however, its generic nature also enables the detection of microbial and viral transcripts. Findings: We developed a bioinformatic pipeline to screen existing human RNA-seq datasets for the presence of microbial and viral reads by re-inspecting the non-human-mapping read fraction. We validated this approach by recapitulating outcomes from six independent, controlled infection experiments of cell line models and compared them with an alternative metatranscriptomic mapping strategy. We then applied the pipeline to close to 150 terabytes of publicly available raw RNA-seq data from more than 17,000 samples from more than 400 studies relevant to human disease using state-of-the-art high-performance computing systems. The resulting data from this large-scale re-analysis are made available in the presented MetaMap resource. Conclusions: Our results demonstrate that common human RNA-seq data, including those archived in public repositories, might contain valuable information to correlate microbial and viral detection patterns with diverse diseases. The presented MetaMap database thus provides a rich resource for hypothesis generation toward the role of the microbiome in human disease. Additionally, codes to process new datasets and perform statistical analyses are made available.
Scientific Article
Hastreiter, S. ; Skylaki, S. ; Loeffler, D. ; Reimann, A. ; Hilsenbeck, O. ; Hoppe, P.S. ; Coutu, D.L. ; Kokkaliaris, K.D. ; Schwarzfischer, M. ; Anastassiadis, K. ; Theis, F.J. ; Schroeder, T.
Stem Cell Rep. 10, 58-69 (2018)
Embryonic stem cells (ESCs) display heterogeneous expression of pluripotency factors such as Nanog when cultured with serum and leukemia inhibitory factor (LIF). In contrast, dual inhibition of the signaling kinases GSK3 and MEK (2i) converts ESC cultures into a state with more uniform and high Nanog expression. However, it is so far unclear whether 2i acts through an inductive or selective mechanism. Here, we use continuous time-lapse imaging to quantify the dynamics of death, proliferation, and Nanog expression in mouse ESCs after 2i addition. We show that 2i has a dual effect: it both leads to increased cell death of Nanog low ESCs (selective effect) and induces and maintains high Nanog levels (inductive effect) in single ESCs. Genetic manipulation further showed that presence of NANOG protein is important for cell viability in 2i medium. This demonstrates complex Nanog-dependent effects of 2i treatment on ESC cultures.
Scientific Article
Plass, M. ; Solana, J. ; Wolf, F.A. ; Ayoub, S. ; Misios, A. ; Glažar, P. ; Obermayer, B. ; Theis, F.J. ; Kocks, C. ; Rajewsky, N.
Science 360:eaaq1723 (2018)
Flatworms of the species Schmidtea mediterranea are immortal: adult animals contain a large pool of pluripotent stem cells that continuously differentiate into all adult cell types. Therefore, single-cell transcriptome profiling of adult animals should reveal mature and progenitor cells. By combining perturbation experiments, gene expression analysis, a computational method that predicts future cell states from transcriptional changes, and a lineage reconstruction method, we placed all major cell types onto a single lineage tree that connects all cells to a single stem cell compartment. We characterized gene expression changes during differentiation and discovered cell types important for regeneration. Our results demonstrate the importance of single-cell transcriptome analysis for mapping and reconstructing fundamental processes of developmental and regenerative biology at high resolution.
Scientific Article
Apweiler, R. ; Beissbarth, T. ; Berthold, M.R. ; Blüthgen, N. ; Burmeister, Y. ; Dammann, O. ; Deutsch, A.J. ; Feuerhake, F. ; Franke, A. ; Hasenauer, J. ; Hoffmann, S. ; Höfer, T. ; Jansen, P.L.M. ; Kaderali, L. ; Klingmüller, U. ; Koch, I. ; Kohlbacher, O. ; Kuepfer, L. ; Lammert, F. ; Maier, D. ; Pfeifer, N. ; Radde, N. ; Rehm, M. ; Roeder, I. ; Saez-Rodriguez, J. ; Sax, U. ; Schmeck, B.T. ; Schuppert, A. ; Seilheimer, B. ; Theis, F.J. ; Vera, J. ; Wolkenhauer, O.
Exp. Mol. Med. 50:e453 (2018)
New technologies to generate, store and retrieve medical and research data are inducing a rapid change in clinical and translational research and health care. Systems medicine is the interdisciplinary approach wherein physicians and clinical investigators team up with experts from biology, biostatistics, informatics, mathematics and computational modeling to develop methods to use new and stored data to the benefit of the patient. We here provide a critical assessment of the opportunities and challenges arising out of systems approaches in medicine and from this provide a definition of what systems medicine entails. Based on our analysis of current developments in medicine and healthcare and associated research needs, we emphasize the role of systems medicine as a multilevel and multidisciplinary methodological framework for informed data acquisition and interdisciplinary data analysis to extract previously inaccessible knowledge for the benefit of patients.
Review
Lauffer, F. ; Jargosch, M. ; Krause, L. ; Garzorz-Stark, N. ; Franz, R. ; Roenneberg, S. ; Böhner, A. ; Müller, N.S. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Biedermann, T. ; Eyerich, S. ; Eyerich, K.
J. Invest. Dermatol., DOI: 10.1016/j.jid.2018.02.034 (2018)
Interface dermatitis is a characteristic histological pattern that occurs in autoimmune and chronic inflammatory skin diseases. It is unknown whether a common mechanism orchestrates this distinct type of skin inflammation. Here we investigated the overlap of two different interface dermatitis positive skin diseases, lichen planus and lupus erythematosus. The shared transcriptome signature pointed toward a strong type I immune response, and biopsy-derived T cells were dominated by IFN-γ and tumor necrosis factor alpha (TNF-α) positive cells. The transcriptome of keratinocytes stimulated with IFN-γ and TNF-α correlated significantly with the shared gene regulations of lichen planus and lupus erythematosus. IFN-γ, TNF-α, or mixed supernatant of lesional T cells induced signs of keratinocyte cell death in three-dimensional skin equivalents. We detected a significantly enhanced epidermal expression of receptor-interacting-protein-kinase 3, a key regulator of necroptosis, in interface dermatitis. Phosphorylation of receptor-interacting-protein-kinase 3 and mixed lineage kinase domain like pseudokinase was induced in keratinocytes on stimulation with T-cell supernatant, an effect that was dependent on the presence of either IFN-γ or TNF-α in the T-cell supernatant. Small hairpin RNA knockdown of receptor-interacting-protein-kinase 3 prevented cell death of keratinocytes on stimulation with IFN-γ or TNF-α. In conclusion, type I immunity is associated with lichen planus and lupus erythematosus and induces keratinocyte necroptosis. These two mechanisms are potentially involved in interface dermatitis.
Scientific Article
Benedetti, E. ; Pučić-Baković, M. ; Keser, T. ; Wahl, A. ; Hassinen, A. ; Yang, J.Y. ; Liu, L. ; Trbojević-Akmačić, I. ; Razdorov, G. ; Štambuk, J. ; Klarić, L. ; Ugrina, I. ; Selman, M.H.J. ; Wuhrer, M. ; Rudan, I. ; Polasek, O. ; Hayward, C. ; Grallert, H. ; Strauch, K. ; Peters, A. ; Meitinger, T. ; Gieger, C. ; Vilaj, M. ; Boons, G.J. ; Moremen, K.W. ; Ovchinnikova, T. ; Bovin, N. ; Kellokumpu, S. ; Theis, F.J. ; Lauc, G. ; Krumsiek, J.
Nat. Commun. 9:706 (2018)
Correction to: Nature Communications (2017) 8:1231. doi:10.1038/s41467-017-01525-0.
Wolf, F.A. ; Angerer, P. ; Theis, F.J.
Genome Biol. 19:15 (2018)
SCANPY is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells (https://github.com/theislab/Scanpy). Along with SCANPY, we present ANNDATA, a generic class for handling annotated data matrices (https://github.com/theislab/anndata).
Scientific Article
Stirm, L. ; Huypens, P. ; Sass, S. ; Batra, R. ; Fritsche, L. ; Brucker, S. ; Abele, H. ; Hennige, A.M. ; Theis, F.J. ; Beckers, J. ; Hrabě de Angelis, M. ; Fritsche, A. ; Häring, H.-U. ; Staiger, H.
Sci. Rep. 8:1366 (2018)
The number of pregnancies complicated by gestational diabetes (GDM) is increasing worldwide. To identify novel characteristics of GDM, we studied miRNA profiles of maternal and fetal whole blood cells (WBCs) from GDM and normal glucose tolerant (NGT) pregnant women matched for body mass index and maternal age. After adjustment for maternal weight gain and pregnancy week, we identified 29 mature micro-RNAs (miRNAs) up-regulated in GDM, one of which, i.e., miRNA-340, was validated by qPCR. mRNA and protein expression of PAIP1, a miRNA-340 target gene, was found down-regulated in GDM women, accordingly. In lymphocytes derived from the mothers' blood and treated in vitro, insulin increased and glucose reduced miRNA-340 expression. In fetal cord blood samples, no associations of miRNA-340 with maternal GDM were observed. Our results provide evidence for insulin-induced epigenetic, i.e., miRNA-dependent, programming of maternal WBCs in GDM.
Scientific Article
Gonzalez-Vallinas, M. ; Rodriguez-Paredes, M. ; Albrecht, M. ; Sticht, C. ; Stichel, D. ; Gutekunst, J. ; Pitea, A. ; Sass, S. ; Sánchez-Rivera, F.J. ; Bermejo, J.L. ; Schmitt, J. ; De La Torre, C. ; Warth, A. ; Theis, F.J. ; Müller, N.S. ; Gretz, N. ; Muley, T. ; Meister, M. ; Tschaharganeh, D.F. ; Schirmacher, P. ; Matthäus, F. ; Breuhahn, K.
Mol. Cancer Res. 16, 390-402 (2018)
Most lung cancer deaths are related to metastases, which indicates the necessity of detecting and inhibiting tumor cell dissemination. Here, we aimed to identify miRNAs involved in metastasis of lung adenocarcinoma as prognostic biomarkers and therapeutic targets. To that end, lymph node metastasis-associated miRNAs were identified in The Cancer Genome Atlas lung adenocarcinoma patient cohort (sequencing data; n = 449) and subsequently validated by qRT-PCR in an independent clinical cohort (n = 108). Overexpression of miRNAs located on chromosome 14q32 was associated with metastasis in lung adenocarcinoma patients. Importantly, Kaplan-Meier analysis and log-rank test revealed that higher expression levels of individual 14q32 miRNAs (mir-539, mir-323b, and mir-487a) were associated with worse disease-free survival of never-smoker patients. Epigenetic analysis including DNA methylation microarray data and bisulfite sequencing validation demonstrated that the induction of the 14q32 cluster correlated with genomic hypomethylation of the 14q32 locus. CRISPR activation technology, applied for the first time to functionally study the increase of clustered miRNA levels in a coordinated manner, showed that simultaneous overexpression of 14q32 miRNAs promoted tumor cell migratory and invasive properties. Analysis of individual miRNAs by mimic transfection further illustrated that miR-323b-3p, miR-487a-3p, and miR-539-5p significantly contributed to the invasive phenotype through the indirect regulation of different target genes. In conclusion, overexpression of 14q32 miRNAs, associated with the respective genomic hypomethylation, promotes metastasis and correlates with poor patient prognosis in lung adenocarcinoma. Implications: This study points to chromosome 14q32 miRNAs as promising targets to inhibit tumor cell dissemination and to predict patient prognosis in lung adenocarcinoma. Mol Cancer Res; 16(3); 390-402.
Scientific Article
Dalke, C. ; Neff, F. ; Bains, S.K. ; Bright, S. ; Lord, D.J. ; Reitmeir, P. ; Rößler, U. ; Samaga, D. ; Unger, K. ; Braselmann, H. ; Wagner, F. ; Greiter, M. ; Gomolka, M. ; Hornhardt, S. ; Kunze, S. ; Kempf, S.J. ; Garrett, L. ; Hölter, S.M. ; Wurst, W. ; Rosemann, M. ; Azimzadeh, O. ; Tapio, S. ; Aubele, M. ; Theis, F.J. ; Hoeschen, C. ; Slijepcevic, P. ; Kadhim, M. ; Atkinson, M.J. ; Zitzelsberger, H. ; Kulka, U. ; Graw, J.
Radiat. Environ. Biophys. 57, 99-113 (2018)
Because of the increasing application of ionizing radiation in medicine, quantitative data on effects of low-dose radiation are needed to optimize radiation protection, particularly with respect to cataract development. Using mice as a mammalian animal model, we applied a single dose of 0, 0.063, 0.125 and 0.5 Gy at 10 weeks of age, determined lens opacities for up to 2 years and compared them with overall survival, cytogenetic alterations and cancer development. The highest dose was significantly associated with increased body weight and reduced survival rate. Chromosomal aberrations in bone marrow cells showed a dose-dependent increase 12 months after irradiation. Pathological screening indicated a dose-dependent risk for several types of tumors. Scheimpflug imaging of the lens revealed a significant dose-dependent effect of 1% of lens opacity. Comparison of different biological end points demonstrated long-term effects of low-dose irradiation for several biological end points.
Scientific Article
Förster, K. ; Sass, S. ; Ehrhardt, H. ; Mous, D.S. ; Rottier, R.J. ; Oak, P. ; Schulze, A. ; Flemmer, A.W. ; Gronbach, J. ; Hübener, C. ; Desai, T. ; Eickelberg, O. ; Theis, F.J. ; Hilgendorff, A.
Am. J. Respir. Crit. Care Med. 197, 1076-1080 (2018)
Scientific Article
Garzorz-Stark, N. ; Lauffer, F. ; Krause, L. ; Thomas, J. ; Atenhan, A. ; Franz, R. ; Roenneberg, S. ; Boehner, A. ; Jargosch, M. ; Batra, R. ; Müller, N.S. ; Haak, S. ; Groß, C. ; Groß, O. ; Traidl-Hoffmann, C. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Biedermann, T. ; Eyerich, S. ; Eyerich, K.
J. Allergy Clin. Immunol. 141, 1320-1333.e11 (2018)
BACKGROUND: A standardized human model to study early pathogenic events in patients with psoriasis is missing. Activation of Toll-like receptor 7/8 by means of topical application of imiquimod is the most commonly used mouse model of psoriasis. OBJECTIVE: We sought to investigate the potential of a human imiquimod patch test model to resemble human psoriasis. METHODS: Imiquimod (Aldara 5% cream; 3M Pharmaceuticals, St Paul, Minn) was applied twice a week to the backs of volunteers (n = 18), and development of skin lesions was monitored over a period of 4 weeks. Consecutive biopsy specimens were taken for whole-genome expression analysis, histology, and T-cell isolation. Plasmacytoid dendritic cells (pDCs) were isolated from whole blood, stimulated with Toll-like receptor 7 agonist, and analyzed by means of extracellular flux analysis and real-time PCR. RESULTS: We demonstrate that imiquimod induces a monomorphic and self-limited inflammatory response in healthy subjects, as well as patients with psoriasis or eczema. The clinical and histologic phenotype, as well as the transcriptome, of imiquimod-induced inflammation in human skin resembles acute contact dermatitis rather than psoriasis. Nevertheless, the imiquimod model mimics the hallmarks of psoriasis. In contrast to classical contact dermatitis, in which myeloid dendritic cells sense haptens, pDCs are primary sensors of imiquimod. They respond with production of proinflammatory and T17-skewing cytokines, resulting in a T17 immune response with IL-23 as a key driver. In a proof-of-concept setting systemic treatment with ustekinumab diminished imiquimod-induced inflammation. CONCLUSION: In human subjects imiquimod induces contact dermatitis with the distinctive feature that pDCs are the primary sensors, leading to an IL-23/T17 deviation. Despite these shortcomings, the human imiquimod model might be useful to investigate early pathogenic events and prove molecular concepts in patients with psoriasis.
Scientific Article
Frohnert, B.I. ; Laimighofer, M. ; Krumsiek, J. ; Theis, F.J. ; Winkler, C. ; Norris, J.M. ; Ziegler, A.-G. ; Rewers, M.J. ; Steck, A.K.
Pediatr. Diabetes 19, 277-283 (2018)
Background: Genetic predisposition for type 1 diabetes (T1D) is largely determined by human leukocyte antigen (HLA) genes; however, over 50 other genetic regions confer susceptibility. We evaluated a previously reported 10-factor weighted model derived from the Type 1 Diabetes Genetics Consortium to predict the development of diabetes in the Diabetes Autoimmunity Study in the Young (DAISY) prospective cohort. Performance of the model, derived from individuals with first-degree relatives (FDR) with T1D, was evaluated in DAISY general population (GP) participants as well as FDR subjects. Methods: The 10-factor weighted risk model (HLA, PTPN22, INS, IL2RA, ERBB3, ORMDL3, BACH2, IL27, GLIS3, RNLS), 3-factor model (HLA, PTPN22, INS), and HLA alone were compared for the prediction of diabetes in children with complete SNP data (n = 1941). Results: Stratification by risk score significantly predicted progression to diabetes by Kaplan-Meier analysis (GP: P=.00006; FDR: P=.0022). The 10-factor model performed better in discriminating diabetes outcome than HLA alone (GP, P=.03; FDR, P=.01). In GP, the restricted 3-factor model was superior to HLA (P=.03), but not different from the 10-factor model (P=.22). In contrast, for FDR the 3-factor model did not show improvement over HLA (P=.12) and performed worse than the 10-factor model (P=.02). Conclusions: We have shown that a 10-factor risk model predicts development of diabetes in both GP and FDR children. While this model was superior to a minimal model in FDR, it did not confer improvement in GP. Differences in model performance in FDR vs GP children may lead to important insights into screening strategies specific to these groups.
Scientific Article
2017
Regev, A. ; Teichmann, S.A. ; Lander, E.S. ; Amit, I. ; Benoist, C. ; Birney, E. ; Bodenmiller, B. ; Campbell, P.J. ; Carninci, P. ; Clatworthy, M. ; Clevers, H. ; Deplancke, B. ; Dunham, I. ; Eberwine, J. ; Eils, R. ; Enard, W. ; Farmer, A. ; Fugger, L. ; Göttgens, B. ; Hacohen, N. ; Haniffa, M. ; Hemberg, M. ; Kim, S.K. ; Klenerman, P. ; Kriegstein, A. ; Lein, E. ; Linnarsson, S. ; Lundberg, E. ; Lundeberg, J. ; Majumder, P. ; Marioni, J.C. ; Merad, M. ; Mhlanga, M. ; Nawijn, M.C. ; Netea, M. ; Nolan, G. ; Pe'er, D. ; Phillipakis, A. ; Ponting, C.P. ; Quake, S.R. ; Reik, W. ; Rozenblatt-Rosen, O. ; Sanes, J.R. ; Satija, R. ; Schumacher, T.N. ; Shalek, A.K. ; Shapiro, E. ; Sharma, P. ; Shin, J.W. ; Stegle, O. ; Stratton, M.R. ; Stubbington, M.J.T. ; Theis, F.J. ; Uhlén, M. ; van Oudenaarden, A. ; Wagner, A. ; Watt, F.M. ; Weissman, J.S. ; Wold, B.J. ; Xavier, R.J. ; Yosef, N.
eLife 6:e27041 (2017)
The recent advent of methods for high-throughput single-cell molecular profiling has catalyzed a growing sense in the scientific community that the time is ripe to complete the 150-year-old effort to identify all cell types in the human body. The Human Cell Atlas Project is an international collaborative effort that aims to define all human cell types in terms of distinctive molecular profiles (such as gene expression profiles) and to connect this information with classical cellular descriptions (such as location and morphology). An open comprehensive reference map of the molecular state of cells in healthy human tissues would propel the systematic study of physiological states, developmental trajectories, regulatory circuitry and interactions of cells, and also provide a framework for understanding cellular dysregulation in human disease. Here we describe the idea, its potential utility, early proofs-of-concept, and some design considerations for the Human Cell Atlas, including a commitment to open data, code, and community.
Scientific Article
Benedetti, E. ; Pučić-Baković, M. ; Keser, T. ; Wahl, A. ; Hassinen, A. ; Yang, J.Y. ; Liu, L. ; Trbojević-Akmačić, I. ; Razdorov, G. ; Štambuk, J. ; Klarić, L. ; Ugrina, I. ; Selman, M.H.J. ; Wuhrer, M. ; Rudan, I. ; Polasek, O. ; Hayward, C. ; Grallert, H. ; Strauch, K. ; Peters, A. ; Meitinger, T. ; Gieger, C. ; Vilaj, M. ; Boons, G.J. ; Moremen, K.W. ; Ovchinnikova, T. ; Bovin, N. ; Kellokumpu, S. ; Theis, F.J. ; Lauc, G. ; Krumsiek, J.
Nat. Commun. 8:1483 (2017)
Immunoglobulin G (IgG) is a major effector molecule of the human immune response, and aberrations in IgG glycosylation are linked to various diseases. However, the molecular mechanisms underlying protein glycosylation are still poorly understood. We present a data-driven approach to infer reactions in the IgG glycosylation pathway using large-scale mass-spectrometry measurements. Gaussian graphical models are used to construct association networks from four cohorts. We find that glycan pairs with high partial correlations represent enzymatic reactions in the known glycosylation pathway, and then predict new biochemical reactions using a rule-based approach. Validation is performed using data from a GWAS and results from three in vitro experiments. We show that one predicted reaction is enzymatically feasible and that one rejected reaction does not occur in vitro. Moreover, in contrast to previous knowledge, enzymes involved in our predictions colocalize in the Golgi of two cell lines, further confirming the in silico predictions.
Scientific Article
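The partial correlations mentioned in the abstract above can be illustrated with a minimal Python sketch. It computes the partial correlation of two variables given a third from pairwise Pearson correlations. The toy data and function names are assumptions made for illustration and are not the authors' pipeline; a full Gaussian graphical model would condition on all remaining variables at once rather than on a single one.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def partial_corr(a, b, c):
    """Correlation of a and b after removing the linear effect of c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# Hypothetical toy data: a and b both track c, with unrelated noise patterns
c = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise_a = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
noise_b = [0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5]
a = [ci + e for ci, e in zip(c, noise_a)]
b = [ci + e for ci, e in zip(c, noise_b)]

marginal = pearson(a, b)             # high: both variables follow c
conditional = partial_corr(a, b, c)  # near zero: no direct link beyond c
```

In this toy setting the marginal correlation is high only because both variables depend on a shared third one, which is the kind of indirect association that partial correlations are meant to filter out.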
Dirmeier, S. ; Fuchs, C. ; Müller, N.S. ; Theis, F.J.
Bioinformatics 34, 896-898 (2017)
Modelling biological associations or dependencies using linear regression is often complicated when the analyzed data-sets are high-dimensional and fewer observations than variables are available. Recently proposed regression models utilize prior knowledge on dependencies, e.g. in the form of graphs, arguing that this information will lead to more reliable estimates for regression coefficients. However, none of the proposed models for multivariate genomic response variables have been implemented as a computationally efficient, freely available library. In this paper we propose netReg, a package for graph-penalized regression models that use large networks and thousands of variables. netReg incorporates a priori generated biological graph information into linear models yielding sparse or smooth solutions for regression coefficients. netReg is implemented as both an R package and a C++ command-line tool. The main computations are done in C++, where we use Armadillo for fast matrix calculations and Dlib for optimization. The R package is freely available on https://bioconductor.org/packages/netReg. The command-line tool can be installed using the conda channel Bioconda. Installation details, issue reports, development versions, documentation and tutorials for the R and C++ versions and the R package vignette can be found on GitHub at https://dirmeier.github.io/netReg/. The GitHub page also contains code for benchmarking and example datasets used in this paper.
Scientific Article
Sobotta, S. ; Raue, A. ; Huang, X. ; Vanlier, J. ; Jünger, A. ; Bohl, S. ; Albrecht, U. ; Hahnel, M.J. ; Wolf, S. ; Müller, N.S. ; D'Alessandro, L.A. ; Mueller-Bohl, S. ; Boehm, M.E. ; Lucarelli, P. ; Bonefas, S. ; Damm, G. ; Seehofer, D. ; Lehmann, W.D. ; Rose-John, S. ; van der Hoeven, F. ; Gretz, N. ; Theis, F.J. ; Ehlting, C. ; Bode, J.G. ; Timmer, J. ; Schilling, M. ; Klingmüller, U.
Front. Physiol. 8:775 (2017)
IL-6 is a central mediator of the immediate induction of hepatic acute phase proteins (APP) in the liver during infection and after injury, but increased IL-6 activity has been associated with multiple pathological conditions. In hepatocytes, IL-6 activates JAK1-STAT3 signaling that induces the negative feedback regulator SOCS3 and expression of APPs. While different inhibitors of IL-6-induced JAK1-STAT3-signaling have been developed, understanding their precise impact on signaling dynamics requires a systems biology approach. Here we present a mathematical model of IL-6-induced JAK1-STAT3 signaling that quantitatively links physiological IL-6 concentrations to the dynamics of IL-6-induced signal transduction and expression of target genes in hepatocytes. The mathematical model consists of coupled ordinary differential equations (ODE) and the model parameters were estimated by a maximum likelihood approach, whereas identifiability of the dynamic model parameters was ensured by the Profile Likelihood. Using model simulations coupled with experimental validation we could optimize the long-term impact of the JAK-inhibitor Ruxolitinib, a therapeutic compound that is quickly metabolized. Model-predicted doses and timing of treatments help to improve the reduction of inflammatory APP gene expression in primary mouse hepatocytes close to levels observed during regenerative conditions. The concept of improved efficacy of the inhibitor through multiple treatments at optimized time intervals was confirmed in primary human hepatocytes. Thus, combining quantitative data generation with mathematical modeling suggests that repetitive treatment with Ruxolitinib is required to effectively target excessive inflammatory responses without exceeding doses recommended by the clinical guidelines.
Scientific Article
Krendl, C. ; Shaposhnikov, D. ; Rishko, V. ; Ori, C. ; Ziegenhain, C. ; Sass, S. ; Simon, L. ; Müller, N.S. ; Straub, T. ; Brooks, K.E. ; Chavez, S.L. ; Enard, W. ; Theis, F.J. ; Drukker, M.
Proc. Natl. Acad. Sci. U.S.A. 114, E9579-E9588 (2017)
To elucidate the molecular basis of BMP4-induced differentiation of human pluripotent stem cells (PSCs) toward progeny with trophectoderm characteristics, we produced transcriptome, epigenome H3K4me3, H3K27me3, and CpG methylation maps of trophoblast progenitors, purified using the surface marker APA. We combined them with the temporally resolved transcriptome of the preprogenitor phase and of single APA+ cells. This revealed a circuit of bivalent TFAP2A, TFAP2C, GATA2, and GATA3 transcription factors, coined collectively the "trophectoderm four" (TEtra), which are also present in human trophectoderm in vivo. At the onset of differentiation, the TEtra factors occupy multiple sites in epigenetically inactive placental genes and in OCT4. Functional manipulation of GATA3 and TFAP2A indicated that they directly couple trophoblast-specific gene induction with suppression of pluripotency. In accordance, knocking down GATA3 in primate embryos resulted in a failure to form trophectoderm. The discovery of the TEtra circuit indicates how trophectoderm commitment is regulated in human embryogenesis.
Scientific Article
Angerer, P. ; Simon, L. ; Tritschler, S. ; Wolf, F.A. ; Fischer, D. ; Theis, F.J.
Curr. Opin. Syst. Biol. 4, 85-91 (2017)
Recent technological advances have enabled unprecedented insight into transcriptomics at the level of single cells. Single cell transcriptomics enables the measurement of transcriptomic information of thousands of single cells in a single experiment. The volume and complexity of resulting data make it a paradigm of big data. Consequently, the field is presented with new scientific and, in particular, analytical challenges where currently no scalable solutions exist. At the same time, exciting opportunities arise from increased resolution of single-cell RNA sequencing data and improved statistical power of ever growing datasets. Big single cell RNA sequencing data promises valuable insights into cellular heterogeneity which may significantly improve our understanding of biology and human disease. This review focuses on single cell transcriptomics and highlights the inherent opportunities and challenges in the context of big data analytics.
Scientific Article
Molnos, S. ; Baumbach, C. ; Wahl, S. ; Müller-Nurasyid, M. ; Strauch, K. ; Wang-Sattler, R. ; Waldenberger, M. ; Meitinger, T. ; Adamski, J. ; Kastenmüller, G. ; Suhre, K. ; Peters, A. ; Grallert, H. ; Theis, F.J. ; Gieger, C.
BMC Bioinformatics 18:429 (2017)
Background: Genome-wide association studies allow us to understand the genetics of complex diseases. Human metabolism provides information about the disease-causing mechanisms, so it is usual to investigate the associations between genetic variants and metabolite levels. However, only considering genetic variants and their effects on one trait ignores the possible interplay between different “omics” layers. Existing tools only consider single-nucleotide polymorphism (SNP)–SNP interactions, and no practical tool is available for large-scale investigations of the interactions between pairs of arbitrary quantitative variables. Results: We developed an R package called pulver to compute p-values for the interaction term in a very large number of linear regression models. Comparisons based on simulated data showed that pulver is much faster than the existing tools. This is achieved by using the correlation coefficient to test the null-hypothesis, which avoids the costly computation of inversions. Additional tricks are a rearrangement of the order, when iterating through the different “omics” layers, and implementing this algorithm in the fast programming language C++. Furthermore, we applied our algorithm to data from the German KORA study to investigate a real-world problem involving the interplay among DNA methylation, genetic variants, and metabolite levels. Conclusions: The pulver package is a convenient and rapid tool for screening huge numbers of linear regression models for significant interaction terms in arbitrary pairs of quantitative variables. pulver is written in R and C++, and can be downloaded freely from CRAN at https://cran.r-project.org/web/packages/pulver/.
Scientific Article
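The correlation-based test mentioned in the pulver abstract above can be sketched in plain Python. By the Frisch-Waugh-Lovell theorem, the t-statistic of the interaction coefficient in y ~ 1 + x + z + x·z equals r·sqrt(df / (1 − r²)), where r is the correlation between y and x·z after both have been residualized on the intercept, x, and z, and df = n − 4. This is an illustrative sketch only: the function names and toy data are hypothetical, and it should not be read as the pulver implementation, whose internals the abstract does not spell out.

```python
import math

def residualize(v, basis):
    """Remove from v its projection onto each vector of an orthogonal basis."""
    out = list(v)
    for b in basis:
        coef = sum(o * bi for o, bi in zip(out, b)) / sum(bi * bi for bi in b)
        out = [o - coef * bi for o, bi in zip(out, b)]
    return out

def interaction_t(y, x, z):
    """t-statistic of the x*z term in y ~ 1 + x + z + x*z, via a correlation."""
    n = len(y)
    # Gram-Schmidt: build an orthogonal basis spanning (1, x, z)
    basis = []
    for col in ([1.0] * n, x, z):
        basis.append(residualize(col, basis))
    ry = residualize(y, basis)
    rxz = residualize([xi * zi for xi, zi in zip(x, z)], basis)
    # Residuals have zero mean, so no further centering is needed
    r = sum(a * b for a, b in zip(ry, rxz)) / math.sqrt(
        sum(a * a for a in ry) * sum(b * b for b in rxz))
    df = n - 4  # intercept, x, z, and the interaction term
    return r * math.sqrt(df / (1.0 - r * r))

# Hypothetical toy data with a built-in interaction effect of 0.8
x = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 0.0, 1.0]
z = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 3.0, 3.0]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, 0.05, -0.15]
y = [1 + 0.5 * xi + 0.3 * zi + 0.8 * xi * zi + e
     for xi, zi, e in zip(x, z, noise)]

t_stat = interaction_t(y, x, z)
```

No matrix inversion is needed: only dot products and one square root per model, which is why a correlation-based formulation lends itself to screening very large numbers of variable pairs.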
Garzorz-Stark, N. ; Lauffer, F. ; Krause, L. ; Gross, O. ; Traidl-Hoffmann, C. ; Theis, F.J. ; Schmidt-Weber, C. ; Biedermann, T. ; Eyerich, S. ; Eyerich, K.
J. Invest. Dermatol. 137, S275-S275 (2017)
Meeting abstract
Lauffer, F. ; Jargosch, M. ; Krause, L. ; Garzorz-Stark, N. ; Franz, R. ; Roenneberg, S. ; Biedermann, T. ; Theis, F.J. ; Schmidt-Weber, C. ; Eyerich, S. ; Eyerich, K.
J. Invest. Dermatol. 137, S254-S254 (2017)
Meeting abstract
Eulenberg, P. ; Köhler, N. ; Blasi, T. ; Filby, A. ; Carpenter, A.E. ; Rees, P. ; Theis, F.J. ; Wolf, F.A.
Nat. Commun. 8:463 (2017)
We show that deep convolutional neural networks combined with nonlinear dimension reduction enable reconstructing biological processes based on raw image data. We demonstrate this by reconstructing the cell cycle of Jurkat cells and disease progression in diabetic retinopathy. In further analysis of Jurkat cells, we detect and separate a subpopulation of dead cells in an unsupervised manner and, in classifying discrete cell cycle stages, we reach a sixfold reduction in error rate compared to a recent approach based on boosting on image features. In contrast to previous methods, deep learning based predictions are fast enough for on-the-fly analysis in an imaging flow cytometer. The interpretation of information-rich, high-throughput single-cell data is a challenge requiring sophisticated computational tools. Here the authors demonstrate a deep convolutional neural network that can classify cell cycle status on-the-fly.
Scientific Article
Krautenbacher, N. ; Theis, F.J. ; Fuchs, C.
Comput. Math. Methods Med. 2017:7847531 (2017)
Epidemiological studies often utilize stratified data in which rare outcomes or exposures are artificially enriched. This design can increase precision in association tests but distorts predictions when applying classifiers on non-stratified data. Several methods correct for this so-called sample selection bias, but their performance remains unclear especially for machine learning classifiers. With an emphasis on two-phase case-control studies, we aim to assess which corrections to perform in which setting and to obtain methods suitable for machine learning techniques, especially the random forest. We propose two new resampling-based methods to resemble the original data and covariance structure: stochastic inverse-probability oversampling and parametric inverse-probability bagging. We compare all techniques for the random forest and other classifiers, both theoretically and on simulated and real data. Empirical results show that the random forest profits from only the parametric inverse-probability bagging proposed by us. For other classifiers, correction is mostly advantageous, and methods perform uniformly. We discuss consequences of inappropriate distribution assumptions and reason for different ....
Scientific Article
Tritschler, S. ; Theis, F.J. ; Lickert, H. ; Böttcher, A.
Mol. Metab. 6, 974-990 (2017)
Background: Diabetes mellitus is characterized by loss or dysfunction of insulin-producing β-cells in the pancreas, resulting in failure of blood glucose regulation and devastating secondary complications. Thus, β-cells are currently the prime target for cell-replacement and regenerative therapy. Triggering endogenous repair is a promising strategy to restore β-cell mass and normoglycemia in diabetic patients. Potential strategies include targeting specific β-cell subpopulations to increase proliferation or maturation. Alternatively, transdifferentiation of pancreatic islet cells (e.g. α- or δ-cells), extra-islet cells (acinar and ductal cells), hepatocytes, or intestinal cells into insulin-producing cells might improve glycemic control. To this end, it is crucial to systematically characterize and unravel the transcriptional program of all pancreatic cell types at the molecular level in homeostasis and disease. Furthermore, it is necessary to better determine the underlying mechanisms of β-cell maturation, maintenance, and dysfunction in diabetes, to identify and molecularly profile endocrine subpopulations with regenerative potential, and to translate the findings from mice to man. Recent approaches in single-cell biology started to illuminate heterogeneity and plasticity in the pancreas that might be targeted for β-cell regeneration in diabetic patients. Scope of review: This review discusses recent literature on single-cell analysis including single-cell RNA sequencing, single-cell mass cytometry, and flow cytometry of pancreatic cell types in the context of mechanisms of endogenous β-cell regeneration. We discuss new findings on the regulation of postnatal β-cell proliferation and maturation. We highlight how single-cell analysis recapitulates described principles of functional β-cell heterogeneity in animal models and adds new knowledge on the extent of β-cell heterogeneity in humans as well as its role in homeostasis and disease. 
Furthermore, we summarize the findings on cell subpopulations with regenerative potential that might enable the formation of new β-cells in diseased state. Finally, we review new data on the transcriptional program and function of rare pancreatic cell types and their implication in diabetes. Major conclusion: Novel, single-cell technologies offer high molecular resolution of cellular heterogeneity within the pancreas and provide information on processes and factors that govern β-cell homeostasis, proliferation, and maturation. Eventually, these technologies might lead to the characterization of cells with regenerative potential and unravel disease-associated changes in gene expression to identify cellular and molecular targets for therapy.
Scientific Article
Budde, M. ; Kondofersky, I. ; Adorjan, K. ; Aldinger, F. ; Anderson-Schmidt, H. ; Andlauer, T.F. ; Flatau, L. ; Gade, K. ; Heilbronner, U. ; Kalman, J. ; Papiol, S. ; Theis, F.J. ; Falkai, P. ; Müller, N.S. ; Schulze, T.G.
Eur. Neuropsychopharmacol. 27, 3, S406 (2017)
Background Bipolar disorder (BD), schizoaffective disorder (SZA) and schizophrenia (SZ) are severe mental illnesses that share - at least in part - psychopathological features and an underlying polygenic nature. One characteristic of all three diagnoses is the highly variable disease course and outcome. This heterogeneity is one of the biggest challenges in studying the underlying biological mechanisms. Therefore, defining more homogeneous subgroups across diagnoses is a promising approach. However, there are no clear criteria for how to define a “good” or “poor” course of illness as different domains can be considered such as psychopathology, cognitive performance, psychosocial functioning, or quality of life. We aim to integrate these domains and define longitudinal clusters of patients across diagnoses. Furthermore, we explore the characteristics of these clusters and the association of cluster membership with the individual load on schizophrenia polygenic risk scores (SZ-PRS). Methods Participants were selected from an ongoing longitudinal project carried out at several centers in Germany and Austria (www.kfo241.de; www.PsyCourse.de). We characterize patients at four time-points over an 18-month period with a comprehensive phenotyping battery. The selected sample comprised a total of 198 participants (age(SD)=46.93(12.43); 46% females) with a DSM-IV diagnosis of SZ, SZA or BD, who completed the entire study period. DNA samples were genotyped using the Illumina PsychChip and imputed using the 1000 Genomes Phase 3 reference panel. SZ-PRS were calculated for all individuals based on the PGC2 SZ summary results. Factor analysis for mixed data (FAMD) was applied to compute abstract data dimensions in a set of 117 longitudinally measured variables, i.a. on psychopathology, cognitive performance, functioning and quality of life. Longitudinal trajectories of patients on the first dimension were used as inputs for k-means clustering for longitudinal data. 
This, in turn, resulted in the identification of three distinct clusters of patients, which we used as predictive variables for SZ-PRS at 11 p-value thresholds in a linear regression model. Results Strongest loadings on the first dimension computed by FAMD were observed for quality of life items, a global depression rating and level of functioning. Three clusters of longitudinal trajectories were identified on this dimension: A) patients who scored highly on the dimension across all time points (58.1%); B) patients with consistently low scores (26.3%); C) patients who improved from baseline to the last follow up (15.7%). There were no significant between-group differences regarding sex, age, diagnoses, center, age at onset, and duration of illness. Cluster membership was significantly associated with the SZ-PRS with highest polygenic burden in cluster B. Discussion Although the reported results are preliminary and therefore have to be interpreted with caution, the approach of longitudinal clustering in order to identify cross-diagnostic, homogeneous subgroups of patients for genetic studies is promising. The next steps will be refinement of clusters by taking more than one dimension from the FAMD into account, verification of cluster solutions in an external dataset, and exploration of associations with other biological markers.  
Scientific Article
Heilbronner, U. ; Jain, G. ; Kaurani, L. ; Kondofersky, I. ; Budde, M. ; Gade, K. ; Kalman, J. ; Adorjan, K. ; Aldinger, F. ; Anderson-Schmidt, H. ; Müller, N.S. ; Theis, F.J. ; Falkai, P. ; Fischer, A. ; Schulze, T.G.
Eur. Neuropsychopharmacol. 27, 3, S456-S457 (2017)
Background Illnesses from the schizophrenia-to-bipolar spectrum have a highly variable course. Determinants of these different individual trajectories have been of particular interest to scholars during the past century. Beyond rudimentary understanding, however, different course types have been difficult to delineate in categorical disease phenotypes. We have therefore embarked upon a project in which we seek to delineate different course types in a large longitudinal sample of deeply phenotyped patients suffering from disorders of the schizophrenia-to-bipolar continuum. With respect to biology, a dysregulation of microRNAs, small non-coding RNA molecules that flexibly influence transcription, in mental disorders is increasingly recognized. To combine both of these novel approaches, we plan to investigate the role of microRNAs in different course types identified using longitudinal cluster analysis. Methods Longitudinal clustering Participants were selected from an ongoing longitudinal, multi-site study (www.kfo241.de, www.PsyCourse.de). Patients with a DSM-IV diagnosis of the schizophrenia-to-bipolar spectrum were comprehensively phenotyped at four time-points over a period of 18 months. A set of longitudinally measured variables on current psychopathology, medication adherence, substance use, cognitive performance, level of psychosocial functioning and various questionnaires was analyzed using factor analysis for mixed data followed by longitudinal cluster analyses. This resulted in the identification of distinct subpopulations of patients, each being heterogeneous in terms of diagnostic composition. MicroRNA sequencing So far, we have compared four different methods to isolate blood borne small non-coding RNAs for RNA-sequencing. By this we were able to establish SOPs for the reliable analysis of circulating small non-coding RNAs in longitudinal cohorts. Results We will present results of our research project at the meeting. 
Discussion We will discuss our research project at the meeting.  
Scientific Article
Comes, A. ; Aldinger, F. ; Kondofersky, I. ; Adorjan, K. ; Anderson-Schmidt, H. ; Andlauer, T.F. ; Budde, M. ; Gade, K. ; Heilbronner, U. ; Kalman, J. ; Papiol, S. ; Theis, F.J. ; Falkai, P. ; Müller, N.S. ; Schulze, T.G.
Eur. Neuropsychopharmacol. 27, 3, S408-S409 (2017)
Background Psychiatric illnesses such as bipolar disorder, schizophrenia and schizoaffective disorder are severe, disabling disorders associated with decreased quality of life (QOL) and functioning (Bobes, Garcia-Portilla, Bascaran, Saiz, & Bousoño, 2007; Latalova, Prasko, Diveky, Kamaradova, & Velartova, 2010; Merikangas et al., 2012). Stigmatization, co-morbidities, adverse effects of medications, care models with deficits in personal and social recovery needs and chronic symptoms due to treatment resistance are factors that can lead to severe reductions in quality of life and functioning (Kahn et al., 2015; Sum, Ho, & Sim, 2015). In this study we aim to characterize patients with good and poor outcomes according to QOL and functioning scores. Using cluster analysis, we sought to identify longitudinal trajectories and investigate whether levels of QOL and functioning are associated with polygenic risk scores. Determining clusters of patients at higher risk of poorer outcomes is critical to provide early and effective interventions. Methods Longitudinal data was used from the Clinical Research Group 241 and PsyCourse studies in Germany. Participants were phenotyped using a comprehensive battery which included data on socio-demographics, history of illness, symptomatology, QOL and functioning. Data was collected at four equidistant time points over an 18-month period. The Infinium Psycharray from Illumina was used to genotype patients. Relevant questionnaire items (i.e. QOL, functioning scores, and socio-demographic data) were pre-selected and factor analysis for mixed data was applied to identify trends in the data. This allowed for the computation of abstract data dimensions which were used for calculation of longitudinal trajectories. 
These trajectories can be seen as a representation of the overall status of patients and both the overall level as well as the longitudinal change of this status were used as inputs for a k-means clustering for longitudinal data (Genolini et al., 2013). This, in turn, resulted in the identification of three distinct subpopulations of patients. In a linear regression model we used clusters as predictive variables for polygenic risk scores at 11 thresholds. Results The dimension which explained the most variance was used for cluster analysis. This dimension was mainly driven by scores for self-satisfaction, life enjoyment, ability to cope with daily tasks, energy, and quality of life. In a sample of 198 patients, three clusters were observed; cluster A (39.4%) consisted of participants with the highest average scores for functioning and QOL, cluster B (33.8%) included participants with the lowest average scores for functioning and QOL, and cluster C (26.8%) consisted of participants who had great improvement in functioning and QOL scores over the course of the longitudinal study. Male patients were substantially overrepresented in cluster A and the inverse effect was observed in cluster B. No significant differences were seen for age of onset, age at interview, or duration of illness within the clusters. Polygenic risk scores at certain thresholds can be predicted by the clusters. In cluster B there was a trend for higher polygenic risk scores. Discussion Phenotypic data provide insight to target sufferers of severe mental illness with worse outcomes. Levels of functioning and QOL seem to be associated with polygenic risk scores. Further investigations are needed.
Scientific Article
Schulte, E. ; Kondofersky, I. ; Budde, M. ; Adorjan, K. ; Aldinger, F. ; Anderson-Schmidt, H. ; Andlauer, T.F. ; Gade, K. ; Heilbronner, U. ; Kalman, J. ; Papiol, S. ; Theis, F.J. ; Falkai, P. ; Müller, N.S. ; Schulze, T.G.
Eur. Neuropsychopharmacol. 27, 3, S401-S402 (2017)
Background Bipolar disorder (BD), schizophrenia (SZ) and schizoaffective disorder (SZA) can be disabling disorders associated with severe psychiatric symptomatology. Individual psychopathological features often overlap between these diagnostic groups and their severity can vary widely. More severe psychopathological features are generally associated with a less favorable outcome. Further, all three diseases are common complex genetic disorders with a polygenic genetic architecture in the majority of cases. The inherent heterogeneity with regard to disease severity has posed a significant challenge to both the study of the underlying disease mechanism and the clinical management. Therefore, stratification of cases into more homogeneous subgroups across diagnoses using both longitudinal clusters derived from psychometric data and genetic information could provide a means to identify individuals with higher risk for severe illness, mandating earlier and intensified clinical intervention. Methods Individuals included herein partake in an ongoing multisite cohort study across Germany and Austria (www.kfo241.de; www.PsyCourse.de). Participants were characterized at 4 time points over an 18-month period using a comprehensive phenotyping battery. The subsample used here totals 198 participants (46.9±12.4 yrs; 46% female) with DSM-IV diagnoses of SZ, SZA or BD. Blood DNA samples were genotyped using Illumina’s Infinium PsychArray and imputed using the 1000 Genomes. SZ-PRS were calculated using PLINK 1.07. Effect sizes and p-values were determined with the PGC2 SZ summary results as discovery sample. A set of 67 longitudinally measured variables derived from the Positive and Negative Syndrome Scale (PANSS), the Inventory of Depressive Symptoms (IDS) and the Young Mania Rating Scale (YMRS) entered the cluster analyses. 
Factor analysis for mixed data (FAMD) was applied to compute abstract data dimensions, subsequently used to derive the longitudinal trajectories which then served as inputs for a k-means clustering for longitudinal data. Identified clusters were employed in a linear regression model as predictive variables for SZ-PRS at 11 thresholds. Results Computed by FAMD, the strongest loadings were observed for PANSS and IDS on the first dimension and for IDS on the second dimension. Two clusters of longitudinal trajectories were identified in these dimensions: (A) individuals with continuously low scores on both PANSS and IDS (70.7%) and (B) individuals with consistently high scores on both PANSS and IDS (29.3%). Clusters differed significantly with regard to Global Assessment of Functioning (GAF; higher in (A); FDR-adjusted p-value = 2.23 × 10⁻¹⁰), while there were no significant differences regarding sex, age, diagnoses, center, age at onset, family history or duration of illness. Cluster membership was not significantly associated with the SZ-PRS in either cluster. Discussion Although the results are preliminary and have to be interpreted with caution, the approach of longitudinal clustering to identify cross-diagnostic homogeneous subgroups of individuals appears to be feasible. The fact that more severe psychopathological features were not associated with increased genetic risk burden will also be interesting to explore further.
Scientific Article
Kazeroonian, A. ; Theis, F.J. ; Hasenauer, J.
Bioinformatics 33, i293-i300 (2017)
Motivation: Stochastic molecular processes are a leading cause of cell-to-cell variability. Their dynamics are often described by continuous-time discrete-state Markov chains and simulated using stochastic simulation algorithms. As these stochastic simulations are computationally demanding, ordinary differential equation models for the dynamics of the statistical moments have been developed. The number of state variables of these approximating models, however, grows at least quadratically with the number of biochemical species. This limits their application to small- and medium-sized processes. Results: In this article, we present a scalable moment-closure approximation (sMA) for the simulation of statistical moments of large-scale stochastic processes. The sMA exploits the structure of the biochemical reaction network to reduce the covariance matrix. We prove that sMA yields approximating models whose number of state variables depends predominantly on local properties, i.e. the average node degree of the reaction network, instead of the overall network size. The resulting complexity reduction is assessed by studying a range of medium- and large-scale biochemical reaction networks. To evaluate the approximation accuracy and the improvement in computational efficiency, we study models for JAK2/STAT5 signalling and NF-κB signalling. Our method is applicable to generic biochemical reaction networks and we provide an implementation, including an SBML interface, which renders the sMA easily accessible.
Scientific Article
Peng, T. ; Thorn, K. ; Schroeder, T. ; Wang, L. ; Theis, F.J. ; Marr, C. ; Navab, N.
Nat. Commun. 8:14836 (2017)
Quantitative analysis of bioimaging data is often skewed by both shading in space and background variation in time. We introduce BaSiC, an image correction method based on low-rank and sparse decomposition which solves both issues. In comparison to existing shading correction tools, BaSiC achieves high accuracy with significantly fewer input images, works for diverse imaging conditions and is robust against artefacts. Moreover, it can correct temporal drift in time-lapse microscopy data and thus improve continuous single-cell quantification. BaSiC requires no manual parameter setting and is available as a Fiji/ImageJ plugin.
Scientific Article
Milger, K. ; Götschke, J. ; Krause, L. ; Nathan, P. ; Alessandrini, F. ; Tufman, A. ; Fischer, R. ; Bartel, S. ; Theis, F.J. ; Behr, J. ; Dehmel, S. ; Müller, N.S. ; Kneidinger, N. ; Krauss-Etschmann, S.
Allergy 72, 1962-1971 (2017)
BACKGROUND: Asthma is a heterogeneous chronic disease with different phenotypes and treatment responses. Thus, there is a high clinical need for molecular disease biomarkers to aid in differentiating these distinct phenotypes. As microRNAs (miRNAs), which regulate gene expression at the post-transcriptional level, are altered in experimental and human asthma, circulating miRNAs are attractive candidates for the identification of novel biomarkers. This study aimed to identify plasmatic miRNA-based biomarkers of asthma, through a translational approach. METHODS: We pre-screened miRNAs in plasma samples from two different murine models of experimental asthma (ovalbumin and house dust mite); miRNAs deregulated in both models were further tested in a human training cohort of 20 asthma patients and 9 healthy controls. Candidate miRNAs were then validated in a second, independent group of 26 asthma patients and 12 healthy controls. RESULTS: 10 miRNA ratios consisting of 13 miRNAs were differentially regulated in both murine models. Measuring these miRNAs in the training cohort identified a biomarker signature consisting of 5 miRNA ratios (7 miRNAs). This signature showed a good sensitivity and specificity in the test cohort with an area under the receiver operating characteristic curve (AUC) of 0.92. Correlation of miRNA ratios with clinical characteristics further revealed associations with FVC % predicted, and oral corticosteroid or antileukotriene use. CONCLUSION: Distinct plasma miRNAs are differentially regulated both in murine and human allergic asthma and were associated with clinical characteristics of patients. Thus, we suggest that miRNA levels in plasma might have future potential to subphenotype patients with asthma.
Scientific Article
Ballnus, B. ; Hug, S. ; Hatz, K. ; Görlitz, L. ; Hasenauer, J. ; Theis, F.J.
BMC Syst. Biol. 11:63 (2017)
BACKGROUND: In quantitative biology, mathematical models are used to describe and analyze biological processes. The parameters of these models are usually unknown and need to be estimated from experimental data using statistical methods. In particular, Markov chain Monte Carlo (MCMC) methods have become increasingly popular as they allow for a rigorous analysis of parameter and prediction uncertainties without the need for assuming parameter identifiability or removing non-identifiable parameters. A broad spectrum of MCMC algorithms have been proposed, including single- and multi-chain approaches. However, selecting and tuning sampling algorithms suited for a given problem remains challenging and a comprehensive comparison of different methods is so far not available. RESULTS: We present the results of a thorough benchmarking of state-of-the-art single- and multi-chain sampling methods, including Adaptive Metropolis, Delayed Rejection Adaptive Metropolis, Metropolis adjusted Langevin algorithm, Parallel Tempering and Parallel Hierarchical Sampling. Different initialization and adaptation schemes are considered. To ensure a comprehensive and fair comparison, we consider problems with a range of features such as bifurcations, periodical orbits, multistability of steady-state solutions and chaotic regimes. These problem properties give rise to various posterior distributions including uni- and multi-modal distributions and non-normally distributed mode tails. For an objective comparison, we developed a pipeline for the semi-automatic comparison of sampling results. CONCLUSION: The comparison of MCMC algorithms, initialization and adaptation schemes revealed that overall multi-chain algorithms perform better than single-chain algorithms. In some cases this performance can be further increased by using a preceding multi-start local optimization scheme. 
These results can inform the selection of sampling methods and the benchmark collection can serve for the evaluation of new algorithms. Furthermore, our results confirm the need to address exploration quality of MCMC chains before applying the commonly used quality measure of effective sample size to prevent false analysis conclusions.
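For orientation, the simplest single-chain scheme that the benchmarked samplers build on can be sketched as a random-walk Metropolis sampler. This is a minimal illustration and not the benchmark pipeline of the paper; the function name, the fixed Gaussian proposal and the step size are assumptions made for the example, and the adaptive or multi-chain variants named in the abstract add proposal adaptation or chain exchange on top of this basic accept/reject step.

```python
import numpy as np


def random_walk_metropolis(log_post, theta0, n_steps, step_size, seed=0):
    """Sample from an unnormalized log-posterior with a Gaussian random walk."""
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    current = log_post(theta)
    chain = np.empty((n_steps, theta.size))
    accepted = 0
    for i in range(n_steps):
        # Propose a symmetric Gaussian perturbation of the current state.
        proposal = theta + step_size * rng.normal(size=theta.size)
        candidate = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
            accepted += 1
        chain[i] = theta
    return chain, accepted / n_steps


def standard_normal_log_density(theta):
    """Illustrative target: standard normal, up to an additive constant."""
    return -0.5 * np.sum(theta**2)
```

Calling `random_walk_metropolis(standard_normal_log_density, [0.0], 20000, 1.0)` returns the chain and the acceptance rate. On multi-modal posteriors a single chain of this kind can remain in one mode, which is the exploration issue the abstract raises.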
Scientific Article
Conlon, T.M. ; Hackl, M. ; Bartel, J. ; Krumsiek, J. ; Irmler, M. ; Beckers, J. ; Theis, F.J. ; Eickelberg, O. ; Yildirim, A.Ö.
Am. J. Respir. Crit. Care Med. 195 (2017)
Meeting abstract
Bartel, S. ; Schulz, N. ; Alessandrini, F. ; Schamberger, A.C. ; Pagel, P. ; Theis, F.J. ; Milger, K. ; Nößner, E. ; Stick, S.M. ; Kicic, A. ; Eickelberg, O. ; Freishtat, R.J. ; Krauss-Etschmann, S.
Sci. Rep. 7:46026 (2017)
Asthma is highly prevalent, but current therapies cannot influence the chronic course of the disease. It is thus important to understand underlying early molecular events. In this study, we aimed to use microRNAs (miRNAs) - which are critical regulators of signaling cascades - to identify so far uncharacterized asthma pathogenesis pathways. Therefore, deregulation of miRNAs was assessed in whole lungs from mice with ovalbumin (OVA)-induced allergic airway inflammation (AAI). In silico predicted target genes were confirmed in reporter assays and in house-dust-mite (HDM) induced AAI and primary human bronchial epithelial cells (NHBE) cultured at the air-liquid interface. We identified and validated the transcription factor cAMP-responsive element binding protein (Creb1) and its transcriptional co-activators (Crtc1-3) as targets of miR-17, miR-144, and miR-21. Sec14-like 3 (Sec14l3) - a putative target of Creb1 - was down-regulated in both asthma models and in NHBE cells upon IL13 treatment, while its expression correlated with ciliated cell development and decreased along with increasing goblet cell metaplasia. Finally, we propose that Creb1/Crtc1-3 and Sec14l3 could be important for early responses of the bronchial epithelium to Th2-stimuli. This study shows that miRNA profiles can be used to identify novel targets that would be overlooked in mRNA based strategies.
Scientific Article
Chlis, N.-K. ; Wolf, F.A. ; Theis, F.J.
Bioinformatics 33, 3211-3219 (2017)
Motivation: The identification of heterogeneities in cell populations by utilizing single-cell technologies such as single-cell RNA-Seq enables inference of cellular development and lineage trees. Several methods have been proposed for such inference from high-dimensional single-cell data. They typically assign each cell to a branch in a differentiation trajectory. However, they commonly assume specific geometries such as tree-like developmental hierarchies and lack statistically sound methods to decide on the number of branching events. Results: We present K-Branches, a solution to the above problem by locally fitting half-lines to single-cell data, introducing a clustering algorithm similar to K-Means. These half-lines are proxies for branches in the differentiation trajectory of cells. We propose a modified version of the GAP statistic for model selection, in order to decide on the number of lines that best describe the data locally. In this manner, we identify the location and number of subgroups of cells that are associated with branching events and full differentiation, respectively. We evaluate the performance of our method on single-cell RNA-Seq data describing the differentiation of myeloid progenitors during hematopoiesis, single-cell qPCR data of mouse blastocyst development, single-cell qPCR data of human myeloid monocytic leukemia and artificial data. Availability: An R implementation of K-Branches is freely available at https://github.com/theislab/kbranches.
Scientific Article
Hilgendorff, A. ; Foerster, K. ; Sass, S. ; Stoecklein, S. ; Dietrich, O. ; Pomschar, A. ; Schulze, A. ; Huebener, C. ; Theis, F.J. ; Erhardt, H. ; Flemmer, A.W. ; Ertl-Wagner, B.
Am. J. Respir. Crit. Care Med. 195 (2017)
Meeting abstract
Piasecka, J. ; Hennig, H. ; Theis, F.J. ; Rees, P. ; Summers, H.D. ; Thornton, C.A.
J. Allergy Clin. Immunol. 139, AB163-AB163 (2017)
Meeting abstract
Florea, A.M. ; Varghese, E. ; McCallum, J.E. ; Mahgoub, S. ; Helmy, I. ; Varghese, S. ; Gopinath, N. ; Sass, S. ; Theis, F.J. ; Reifenberger, G. ; Büsselberg, D.
Oncotarget 8, 22876-22893 (2017)
Neuroblastoma (NB) is a pediatric cancer treated with poly-chemotherapy including platinum complexes (e.g. cisplatin (CDDP), carboplatin), DNA alkylating agents, and topoisomerase I inhibitors (e.g. topotecan (TOPO)). Despite aggressive treatment, NB may become resistant to chemotherapy. We investigated whether CDDP and TOPO treatment of NB cells interacts with the expression and function of proteins involved in regulating calcium signaling. Human neuroblastoma cell lines SH-SY5Y, IMR-32 and NLF were used to investigate the effects of CDDP and TOPO on cell viability, apoptosis, calcium homeostasis, and expression of selected proteins regulating intracellular calcium concentration ([Ca2+]i). In addition, the impact of pharmacological inhibition of [Ca2+]i-regulating proteins on neuroblastoma cell survival was studied. Treatment of neuroblastoma cells with increasing concentrations of CDDP (0.1-10 μM) or TOPO (0.1 nM-1 μM) induced cytotoxicity and increased apoptosis in a concentration- and time-dependent manner. Both drugs increased [Ca2+]i over time. Treatment with CDDP or TOPO also modified mRNA expression of selected genes encoding [Ca2+]i-regulating proteins. Differentially regulated genes included S100A6, ITPR1, ITPR3, RYR1 and RYR3. With FACS and confocal laser scanning microscopy experiments we validated their differential expression at the protein level. Importantly, treatment of neuroblastoma cells with pharmacological modulators of [Ca2+]i-regulating proteins in combination with CDDP or TOPO increased cytotoxicity. Thus, our results confirm an important role of calcium signaling in the response of neuroblastoma cells to chemotherapy and suggest [Ca2+]i modulation as a promising strategy for adjunctive treatment.
Scientific Article
Kuepper, M.K. ; Thomas, J. ; Krause, L. ; Müller, N.S. ; Biedermann, T. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Eyerich, K. ; Eyerich, S. ; Garzorz-Stark, N.
Exp. Dermatol. 26, E69-E70 (2017)
Meeting abstract
Garzorz-Stark, N. ; Krause, L. ; Lauffer, F. ; Atenhan, A. ; Thomas, J. ; Theis, F.J. ; Müller, N.S. ; Biedermann, T. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
Exp. Dermatol. 26, E68-E68 (2017)
Meeting abstract
Lauffer, F. ; Krause, L. ; Franz, R. ; Garzorz-Stark, N. ; Biedermann, T. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
Exp. Dermatol. 26, E61-E61 (2017)
Meeting abstract
Blasi, T. ; Buettner, F. ; Strasser, M. ; Marr, C. ; Theis, F.J.
Phys. Biol. 14:036001 (2017)
MOTIVATION: Accessing gene expression at the single cell level has unraveled often large heterogeneity among seemingly homogeneous cells, which remained obscured in traditional population based approaches. The computational analysis of single-cell transcriptomics data, however, still imposes unresolved challenges with respect to normalization, visualization and modeling the data. One such issue is differences in cell size, which introduce additional variability into the data, for which appropriate normalization techniques are needed. Otherwise, these differences in cell size may obscure genuine heterogeneities among cell populations and lead to overdispersed steady-state distributions of mRNA transcript numbers. RESULTS: We present cgCorrect, a statistical framework to correct for differences in cell size that are due to cell growth in single-cell transcriptomics data. We derive the probability for the cell growth corrected mRNA transcript number given the measured, cell size dependent mRNA transcript number, based on the assumption that the average number of transcripts in a cell increases proportionally to the cell's volume during cell cycle. cgCorrect can be used both for data normalization and to analyze steady-state distributions used to infer the gene expression mechanism. We demonstrate its applicability on both simulated data and single-cell quantitative real-time PCR data from mouse blood stem and progenitor cells and to quantitative single-cell RNA-sequencing data obtained from mouse embryonic stem cells. We show that correcting for differences in cell size affects the interpretation of the data obtained by typically performed computational analysis.
Scientific Article
Hilsenbeck, O. ; Schwarzfischer, M. ; Loeffler, D. ; Dimopoulos, S. ; Hastreiter, S. ; Marr, C. ; Theis, F.J. ; Schroeder, T.
Bioinformatics 33, 2020-2028 (2017)
Motivation: Quantitative large-scale cell microscopy is widely used in biological and medical research. Such experiments produce huge amounts of image data and thus require automated analysis. However, automated detection of cell outlines (cell segmentation) is typically challenging due to, e.g., high cell densities, cell-to-cell variability and low signal-to-noise ratios. Results: Here, we evaluate accuracy and speed of various state-of-the-art approaches for cell segmentation in light microscopy images using challenging real and synthetic image data. The results vary between datasets and show that the tested tools are either not robust enough or computationally expensive, thus limiting their application to large-scale experiments. We therefore developed fastER, a trainable tool that is orders of magnitude faster while producing state-of-the-art segmentation quality. It supports various cell types and image acquisition modalities, but is easy-to-use even for non-experts: it has no parameters and can be adapted to specific image sets by interactively labelling cells for training. As a proof of concept, we segment and count cells in over 200,000 brightfield images (1388 × 1040 pixels each) from a six day time-lapse microscopy experiment; identification of over 46,000,000 single cells requires only about two and a half hours on a desktop computer.
Scientific Article
Buggenthin, F. ; Buettner, F. ; Hoppe, P.S. ; Endele, M. ; Kroiss, M. ; Strasser, M. ; Schwarzfischer, M. ; Loeffler, D. ; Kokkaliaris, K.D. ; Hilsenbeck, O. ; Schroeder, T. ; Theis, F.J. ; Marr, C.
Nat. Methods, DOI: 10.1038/nmeth.4182 (2017)
Differentiation alters molecular properties of stem and progenitor cells, leading to changes in their shape and movement characteristics. We present a deep neural network that prospectively predicts lineage choice in differentiating primary hematopoietic progenitors using image patches from brightfield microscopy and cellular movement. Surprisingly, lineage choice can be detected up to three generations before conventional molecular markers are observable. Our approach allows identification of cells with differentially expressed lineage-specifying genes without molecular labeling.
Scientific Article
Jagiella, N. ; Rickert, D. ; Theis, F.J. ; Hasenauer, J.
Cell Syst. 4, 194–206.e9 (2017)
Mechanistic understanding of multi-scale biological processes, such as cell proliferation in a changing biological tissue, is readily facilitated by computational models. While tools exist to construct and simulate multi-scale models, the statistical inference of the unknown model parameters remains an open problem. Here, we present and benchmark a parallel approximate Bayesian computation sequential Monte Carlo (pABC SMC) algorithm, tailored for high-performance computing clusters. pABC SMC is fully automated and returns reliable parameter estimates and confidence intervals. By running the pABC SMC algorithm for ∼10^6 hr, we parameterize multi-scale models that accurately describe quantitative growth curves and histological data obtained in vivo from individual tumor spheroid growth in media droplets. The models capture the hybrid deterministic-stochastic behaviors of 10^5-10^6 cells growing in a 3D dynamically changing nutrient environment. The pABC SMC algorithm reliably converges to a consistent set of parameters. Our study demonstrates a proof of principle for robust, data-driven modeling of multi-scale biological systems and the feasibility of multi-scale model parameterization through statistical inference. A new parallel approximate Bayesian computation sequential Monte Carlo (pABC SMC) algorithm allows for robust, data-driven modeling of multi-scale biological systems and demonstrates the feasibility of multi-scale model parameterization through statistical inference.
Scientific Article
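As a rough illustration of the approximate Bayesian computation idea named in the Jagiella et al. abstract above, the following sketch shows plain ABC rejection sampling, the simplest relative of ABC SMC. It is not the pABC SMC algorithm from the paper: there is no sequential tolerance schedule, no particle weighting and no parallelism. The toy simulator, the uniform prior, the tolerance and all function names are assumptions made up for this example.

```python
import random


def simulate_mean_wait(rate, n_obs, rng):
    """Toy simulator (hypothetical): mean of n_obs exponential waiting times."""
    return sum(rng.expovariate(rate) for _ in range(n_obs)) / n_obs


def abc_rejection(observed, prior_low, prior_high, tolerance,
                  n_draws, n_obs, seed=0):
    """Keep prior draws whose simulated summary lies within `tolerance`
    of the observed summary; the kept draws approximate the posterior."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(prior_low, prior_high)       # draw from the uniform prior
        summary = simulate_mean_wait(rate, n_obs, rng)  # run the toy simulator
        if abs(summary - observed) <= tolerance:        # compare summary statistics
            accepted.append(rate)
    return accepted


if __name__ == "__main__":
    # An observed mean waiting time of 0.5 corresponds to a rate of about 2.
    draws = abc_rejection(observed=0.5, prior_low=0.1, prior_high=5.0,
                          tolerance=0.05, n_draws=5000, n_obs=50)
    print(len(draws), sum(draws) / len(draws))
```

Sequential variants such as ABC SMC reuse accepted draws as proposals for the next round while shrinking the tolerance, which is what makes them practical when, as in the abstract, each simulation is expensive.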
Fröhlich, F. ; Theis, F.J. ; Rädler, J.O. ; Hasenauer, J.
Bioinformatics 33, 1049-1056 (2017)
Motivation: Ordinary differential equation (ODE) models are frequently used to describe the dynamic behaviour of biochemical processes. Such ODE models are often extended by events to describe the effect of fast latent processes on the process dynamics. To exploit the predictive power of ODE models, their parameters have to be inferred from experimental data. For models without events, gradient based optimization schemes perform well for parameter estimation, when sensitivity equations are used for gradient computation. Yet, sensitivity equations for models with parameter- and state-dependent events and event-triggered observations are not supported by existing toolboxes. Results: In this manuscript, we describe the sensitivity equations for differential equation models with events and demonstrate how to estimate parameters from event-resolved data using event-triggered observations in parameter estimation. We consider a model for GFP expression after transfection and a model for spiking neurons and demonstrate that we can improve computational efficiency and robustness of parameter estimation by using sensitivity equations for systems with events. Moreover, we demonstrate that, by using event-outputs, it is possible to consider event-resolved data, such as time-to-event data, for parameter estimation with ODE models. By providing a user-friendly, modular implementation in the toolbox AMICI, the developed methods are made publicly available and can be integrated in other systems biology toolboxes. Availability and Implementation: We implement the methods in the open-source toolbox Advanced MATLAB Interface for CVODES and IDAS (AMICI, https://github.com/ICB-DCM/AMICI).
Scientific Article
Fröhlich, F. ; Kaltenbacher, B. ; Theis, F.J. ; Hasenauer, J.
PLoS Comput. Biol. 13:e1005331 (2017)
Mechanistic mathematical modeling of biochemical reaction networks using ordinary differential equation (ODE) models has improved our understanding of small- and medium-scale biological processes. While the same should in principle hold for large- and genome-scale processes, the computational methods for the analysis of ODE models which describe hundreds or thousands of biochemical species and reactions are missing so far. While individual simulations are feasible, the inference of the model parameters from experimental data is computationally too intensive. In this manuscript, we evaluate adjoint sensitivity analysis for parameter estimation in large-scale biochemical reaction networks. We present the approach for time-discrete measurements and compare it to state-of-the-art methods used in systems and computational biology. Our comparison reveals a significantly improved computational efficiency and a superior scalability of adjoint sensitivity analysis. The computational complexity is effectively independent of the number of parameters, enabling the analysis of large- and genome-scale models. Our study of a comprehensive kinetic model of ErbB signaling shows that parameter estimation using adjoint sensitivity analysis requires a fraction of the computation time of established methods. The proposed method will facilitate mechanistic modeling of genome-scale cellular processes, as required in the age of omics.
Scientific Article
von Toerne, C. ; Laimighofer, M. ; Achenbach, P. ; Beyerlein, A. ; de Las Heras Gala, T. ; Krumsiek, J. ; Theis, F.J. ; Ziegler, A.-G. ; Hauck, S.M.
Diabetologia 60, 287-295 (2017)
Aims/hypothesis: We sought to identify minimal sets of serum peptide signatures as markers for islet autoimmunity and predictors of progression rates to clinical type 1 diabetes in a case–control study. Methods: A double cross-validation approach was applied to first prioritise peptides from a shotgun proteomic approach in 45 islet autoantibody-positive and -negative children from the BABYDIAB/BABYDIET birth cohorts. Targeted proteomics for 82 discriminating peptides were then applied to samples from another 140 children from these cohorts. Results: A total of 41 peptides (26 proteins) enriched for the functional category lipid metabolism were significantly different between islet autoantibody-positive and autoantibody-negative children. Two peptides (from apolipoprotein M and apolipoprotein C-IV) were sufficient to discriminate autoantibody-positive from autoantibody-negative children. Hepatocyte growth factor activator, complement factor H, ceruloplasmin and age predicted progression time to type 1 diabetes with a significant improvement compared with age alone. Conclusion/interpretation: Distinct peptide signatures indicate islet autoimmunity prior to the clinical manifestation of type 1 diabetes and enable refined staging of the presymptomatic disease period.
Scientific Article
Senís, E. ; Mockenhaupt, S. ; Rupp, D. ; Bauer, T. ; Paramasivam, N. ; Knapp, B. ; Gronych, J. ; Grosse, S.D. ; Windisch, M.P. ; Schmidt, F. ; Theis, F.J. ; Eils, R. ; Lichter, P. ; Schlesner, M. ; Bartenschlager, R. ; Grimm, D.
Nucleic Acids Res. 45:e3 (2017)
Successful RNAi applications depend on strategies allowing robust and persistent expression of minimal gene silencing triggers without perturbing endogenous gene expression. Here, we propose a novel avenue which is integration of a promoterless shmiRNA, i.e. a shRNA embedded in a micro-RNA (miRNA) scaffold, into an engineered genomic miRNA locus. For proof-of-concept, we used TALE or CRISPR/Cas9 nucleases to site-specifically integrate an anti-hepatitis C virus (HCV) shmiRNA into the liver-specific miR-122/hcr locus in hepatoma cells, with the aim to obtain cellular clones that are genetically protected against HCV infection. Using reporter assays, Northern blotting and qRT-PCR, we confirmed anti-HCV shmiRNA expression as well as miR-122 integrity and functionality in selected cellular progeny. Moreover, we employed a comprehensive battery of PCR, cDNA/miRNA profiling and whole genome sequencing analyses to validate targeted integration of a single shmiRNA molecule at the expected position, and to rule out deleterious effects on the genomes or transcriptomes of the engineered cells. Importantly, a subgenomic HCV replicon and a full-length reporter virus, but not a Dengue virus control, were significantly impaired in the modified cells. Our original combination of DNA engineering and RNAi expression technologies benefits numerous applications, from miRNA, genome and transgenesis research, to human gene therapy.
Scientific Article
2016
Förster, K. ; Sass, S. ; Dietrich, O. ; Pomschar, A. ; Nährlich, L. ; Schulze, A. ; Flemmer, A.W. ; Ehrhardt, H. ; Hübener, C. ; Eickelberg, O. ; Theis, F.J. ; Ertl-Wagner, B. ; Hilgendorff, A.
Eur. Respir. J. 48, PP104 (2016)
Neonatal chronic lung disease, i.e. bronchopulmonary dysplasia (BPD), determines long-term pulmonary and neurologic development. Early markers are urgently needed for timely diagnosis and personalized treatment. This prospective study determined structural and functional changes in the preterm lung at the time of diagnosis and identified early disease markers by proteome screening in plasma in the first week of life. 40 infants (27.7±2.09 wks, 984±332 g) were included for advanced MRI measurements (3 Tesla), complemented by infant lung function testing (ILFT) in spontaneously breathing infants. Plasma samples were processed for proteomic screening by SOMAscan™. Key findings were confirmed in an independent study cohort (n=21 infants). Statistical analysis used penalized and Poisson regression analysis; for protein analysis, confounder effects were subtracted by lasso regression. Statistical analysis confirmed a high correlation of MRI and lung function variables and identified a pattern characterizing changes in the lungs of preterm infants by T2- and T1-weighted image analysis and lung volume measurements as well as ILFT. Functional enrichment analysis showed overrepresentation of the GO categories 'immune function', 'extracellular matrix', 'cellular proliferation/migration', 'organ development' and 'angiogenesis' in infants with BPD. One protein was identified as a potential biomarker. We identified a structural pattern characterizing BPD by advanced MRI, confirmed by ILFT. The identified protein indicated BPD development in the first week of life, enabling personalized treatment strategies.
Meeting abstract
Much, D. ; Beyerlein, A. ; Kindt, A. ; Krumsiek, J. ; Rossbauer, M. ; Hofelich, A. ; Hivner, S. ; Herbst, M. ; Römisch-Margl, W. ; Prehn, C. ; Adamski, J. ; Kastenmüller, G. ; Theis, F.J. ; Ziegler, A.-G. ; Hummel, S.
Diabetologia 59, S187-S187 (2016)
Meeting abstract
Stoecker, K. ; Sass, S. ; Theis, F.J. ; Hauner, H. ; Pfaffl, M.W.
Biomol. Detect. Quantif. 11, 31-44 (2016)
The process of adipogenesis is controlled in a highly orchestrated manner, including transcriptional and post-transcriptional events. In developing 3T3-L1 pre-adipocytes, this program can be interrupted by all-trans retinoic acid (ATRA). To examine this inhibiting impact of ATRA, we generated large-scale transcriptomic data on the microRNA and mRNA level. Non-coding RNAs such as microRNAs represent a field in RNA turnover, which is very important for understanding the regulation of mRNA gene expression. High-throughput mRNA and microRNA expression profiling was performed using mRNA hybridisation microarray technology and a multiplexed expression assay for microRNA quantification. After quantitative measurements, we merged the expression data sets, integrated the results and analysed the molecular regulation of in vitro adipogenesis. For this purpose, we applied local enrichment analysis on the integrative microRNA-mRNA network determined by a linear regression approach. This approach includes the target predictions of TargetScan Mouse 5.2 and 23 pre-selected, significantly regulated microRNAs as well as Affymetrix microarray mRNA data. We found that the cellular lipid metabolism is negatively affected by ATRA. Furthermore, we were able to show that microRNA 27a and/or microRNA 96 are important regulators of gap junction signalling, the rearrangement of the actin cytoskeleton as well as the citric acid cycle, which represent the most affected pathways with regard to inhibitory effects of ATRA in 3T3-L1 pre-adipocytes. In conclusion, the experimental workflow and the integrative microRNA-mRNA data analysis shown in this study represent a possibility for illustrating interactions in highly orchestrated biological processes. Further, the applied global microRNA-mRNA interaction network may also be used for the pre-selection of potential new biomarkers with regard to obesity or for the identification of new pharmaceutical targets.
Scientific Article
Huypens, P. ; Sass, S. ; Wu, M. ; Dyckhoff, D. ; Tschöp, M.H. ; Theis, F.J. ; Marschall, S. ; Hrabě de Angelis, M. ; Beckers, J.
Obstet. Gynecol. Surv. 71, 719-720 (2016)
Letter to the Editor
Luber, B. ; Keller, S. ; Zwingenberger, G. ; Ebert, K. ; Maier, D. ; Geier, B. ; Theis, F.J. ; Hasenauer, J. ; Huge, S. ; Meyer-Hermann, M. ; Dehghany, J. ; Walch, A.K. ; Aichler, M. ; Lordick, F. ; Haffner, I.
Eur. J. Cancer 68, S135-S135 (2016)
Meeting abstract
Fröhlich, F. ; Shadrin, A. ; Kessler, T. ; Wierling, C. ; Heinig, M. ; Theis, F.J. ; Lange, B.M. ; Lehrach, H. ; Hasenauer, J.
Eur. J. Cancer 68, S44-S44 (2016)
Meeting abstract
Hross, S. ; Fiedler, A. ; Theis, F.J. ; Hasenauer, J.
In: (Foundations of Systems Biology in Engineering - FOSBE 2016, 9-12 October 2016, Magdeburg). 2016. 264-269 (IFAC PapersOnline ; 49-26)
Gradient formation of Pom1 is a key regulator of cell cycle and cell growth in fission yeast (Schizosaccharomyces pombe). A variety of models to explain Pom1 gradient formation have been proposed; a quantitative analysis and comparison of these models is, however, still missing. In this work we present four models from the literature and perform a quantitative comparison using published single-cell images of the gradient formation process. For the comparison of these partial differential equation (PDE) models we use state-of-the-art techniques for parameter estimation together with model selection. The model selection supports the hypothesis that buffering of the gradient is achieved via clustering. The selected model does not, however, ensure mass conservation, which might be considered problematic.
Kong, B. ; Bruns, P. ; Behler, N. ; Schlitter, A.M. ; Friess, H. ; Erkan, M. ; Theis, F.J. ; Esposito, I. ; Michalski, C.W. ; Kleeff, J.
Pancreatol. 16, S. 11 (2016)
Meeting abstract
González-Vallinas, M. ; Albrecht, M. ; Pitea, A. ; Rodriguez-Paredes, M. ; Stichel, D. ; Sass, S. ; Gutekunst, J. ; Schmitt, J. ; Muley, T. ; Meister, M. ; Warth, A. ; Schirmacher, P. ; Theis, F.J. ; Müller, N.S. ; Matthäus, F. ; Breuhahn, K.
Cancer Res. 76 (2016)
Background: Non-small cell lung cancer (NSCLC) is one of the most aggressive tumor entities and first data indicate that microRNAs (miRNAs) are central regulators of NSCLC dissemination. Since each miRNA is able to modulate the expression of several transcripts, they are promising targets for the development of drugs that cause efficient antitumor effects and low resistance. However, the relevant network of miRNA/mRNA driving NSCLC metastasis has not been identified yet. Methods: The differential expression of miRNAs was compared between NSCLC samples from patients with and without lymph node metastasis (N1, N2 and N3 vs. N0) in a cohort of The Cancer Genome Atlas (TCGA) database (n = 449). The dysregulation of the miRNAs in tumors versus normal lung samples (n = 39) was also analyzed. For validation, fresh-frozen samples from an independent patient cohort (n = 108) were analyzed by qRT-PCR. The role of selected miRNAs in tumor dissemination was assessed by migration and invasion experiments after transfection of respective antagomirs and agomirs in NSCLC cells (time-lapse microscopy). The novel algorithm “miRlastic” was used to identify potential miRNA targets through the integration of miRNA-mRNA expression data by negative multiple linear regression analysis. Moreover, differential methylation of the miRNA genomic locations was studied as a possible mechanism of miRNA dysregulation by analyzing Illumina Infinium 450k DNA methylation TCGA data (n = 29). Results: By using a stringent selection process, we identified 135 miRNAs differentially induced or reduced in NSCLCs with lymph node metastasis (p≤0.05). Interestingly, 22/135 (16.3%) of the selected miRNAs were located in the chromosomal cluster 14q32.31. Elevated expression of miR-323b, miR-487a and miR-539, which are located in 14q32.31, significantly correlated with poor patient survival. 
Time-resolved and quantitative analysis of lateral migration illustrated that these miRNAs increased tumor migration without affecting cell viability. Moreover, miRlastic identified several metastasis-related genes as potential downstream targets of these miRNAs. The connection between miRNAs encoded in 14q32.31 and candidate targets was confirmed in NSCLC cell lines (e.g. Pumilio RNA-Binding Family Member-2; PUM2). Lastly, hypomethylation of the 14q32.31 cluster in tumor tissues might explain increased expression of these miRNAs. Conclusions: Our results demonstrate that miRNAs located in the chromosomal cluster 14q32.31 are driving NSCLC dissemination. Therefore, we hypothesize that the coordinated overexpression of these miRNAs is part of a genetic network supporting cancer progression and that they represent promising cancer biomarkers and therapeutic targets. Citation Format: Margarita González-Vallinas, Marco Albrecht, Adriana Pitea, Manuel Rodríguez-Paredes, Damian Stichel, Steffen Sass, Julian Gutekunst, Jennifer Schmitt, Thomas Muley, Michael Meister, Arne Warth, Peter Schirmacher, Fabian J. Theis, Nikola S. Müller, Franziska Matthäus, Kai Breuhahn. Identification of a miRNA/mRNA network driving non-small cell lung cancer (NSCLC) dissemination. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 1945.
Meeting abstract
Sacher, A. ; Theis, F.J.
Laborpraxis (2016)
Modern drug research is becoming ever more complex. To understand the underlying processes, the mechanism increasingly has to be observed in individual cells. Read how mathematical methods can provide support here and thereby help to yield new insights.
Scientific Article
Theis, F.J.
https://www.bsse.ethz.ch/csd/software/ttt-and-qtfy.html (2016)
Together with colleagues from ETH Zürich, scientists at Helmholtz Zentrum München and Technische Universität München have developed software that makes it possible to observe individual cells over weeks while simultaneously measuring molecular properties. It is freely available and has now been presented in 'Nature Biotechnology'.
Feigelman, J. ; Ganscha, S. ; Hastreiter, S. ; Schwarzfischer, M. ; Filipczyk, A. ; Schröder, T. ; Theis, F.J. ; Marr, C. ; Claassen, M.
Cell Syst. 3, 480-490 (2016)
Many cellular effectors of pluripotency are dynamically regulated. In principle, regulatory mechanisms can be inferred from single-cell observations of effector activity across time. However, rigorous inference techniques suitable for noisy, incomplete, and heterogeneous data are lacking. Here, we introduce stochastic inference on lineage trees (STILT), an algorithm capable of identifying stochastic models that accurately describe the quantitative behavior of cell fate markers observed using time-lapse microscopy data collected from proliferating cell populations. STILT performs exact Bayesian parameter inference and stochastic model selection using a particle-filter-based algorithm. We use STILT to investigate the autoregulation of Nanog, a heterogeneously expressed core pluripotency factor, in mouse embryonic stem cells. STILT rejects the possibility of positive Nanog autoregulation with high confidence; instead, model predictions indicate weak negative feedback. We use STILT for rational experimental design and validate model predictions using novel experimental data.
Scientific Article
Bortsova, G. ; Sterr, M. ; Wang, L. ; Milletari, F. ; Navab, N. ; Böttcher, A. ; Lickert, H. ; Theis, F.J. ; Peng, T.
Lect. Notes Comput. Sc. 10019, 287-295 (2016)
Intestinal enteroendocrine cells secrete hormones that are vital for the regulation of glucose metabolism, but their differentiation from intestinal stem cells is not fully understood. Asymmetric stem cell divisions have been linked to intestinal stem cell homeostasis and secretory fate commitment. We monitored cell divisions using 4D live cell imaging of cultured intestinal crypts to characterize division modes by means of measurable features such as orientation or shape. A statistical analysis of these measurements requires annotation of mitosis events, which is currently a tedious and time-consuming task that has to be performed manually. To assist data processing, we developed a learning-based method to automatically detect mitosis events. The method contains a dual-phase framework for joint detection of dividing cells (mothers) and their progeny (daughters). In the first phase we detect mothers and daughters independently using a Hough Forest, whilst in the second phase we associate mothers and daughters by modelling their joint probability as a Conditional Random Field (CRF). The method has been evaluated on 32 movies and has achieved an AUC of 72%; it can be used in conjunction with manual correction to dramatically speed up the processing pipeline.
Scientific Article
Thomas, J. ; Garzorz-Stark, N. ; Kuepper, L. ; Krause, L. ; Müller, N.S. ; Biedermann, T. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Eyerich, K. ; Eyerich, S.
Exp. Dermatol. 25, 27-28 (2016)
Meeting abstract
Lauffer, F. ; Krause, L. ; Franz, R. ; Garzorz-Stark, N. ; Biedermann, T. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
Exp. Dermatol. 25, 30-31 (2016)
Meeting abstract
Kuepper, M.K. ; Thomas, J. ; Garzorz-Stark, N. ; Krause, L. ; Müller, N.S. ; Biedermann, T. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Eyerich, K. ; Eyerich, S.
J. Invest. Dermatol. 136, S217-S217 (2016)
Meeting abstract
Lauffer, F. ; Krause, L. ; Franz, R. ; Garzorz-Stark, N. ; Biedermann, T. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
J. Invest. Dermatol. 136, S219-S219 (2016)
Meeting abstract
Kong, B. ; Bruns, P. ; Behler, N.A. ; Chang, L. ; Schlitter, A.M. ; Cao, J. ; Gewies, A. ; Ruland, J. ; Fritzsche, S. ; Valkovskaya, N. ; Jian, Z. ; Regel, I. ; Raulefs, S. ; Irmler, M. ; Beckers, J. ; Friess, H. ; Erkan, M. ; Müller, N.S. ; Roth, S. ; Hackert, T. ; Esposito, I. ; Theis, F.J. ; Kleeff, J. ; Michalski, C.W.
Gut 67, 146-156 (2016)
OBJECTIVE: The initial steps of pancreatic regeneration versus carcinogenesis are insufficiently understood. Although a combination of oncogenic Kras and inflammation has been shown to induce malignancy, molecular networks of early carcinogenesis remain poorly defined. DESIGN: We compared early events during inflammation, regeneration and carcinogenesis on histological and transcriptional levels with a high temporal resolution using a well-established mouse model of pancreatitis and of inflammation-accelerated Kras(G12D)-driven pancreatic ductal adenocarcinoma. Quantitative expression data were analysed and extensively modelled in silico. RESULTS: We defined three distinctive phases (termed inflammation, regeneration and refinement) following induction of moderate acute pancreatitis in wild-type mice. These corresponded to different waves of proliferation of mesenchymal, progenitor-like and acinar cells. Pancreas regeneration required a coordinated transition of proliferation between progenitor-like and acinar cells. In mice harbouring an oncogenic Kras mutation and challenged with pancreatitis, there was an extended inflammatory phase and a parallel, continuous proliferation of mesenchymal, progenitor-like and acinar cells. Analysis of high-resolution transcriptional data from wild-type animals revealed that organ regeneration relied on a complex interaction of a gene network that normally governs acinar cell homeostasis, exocrine specification and intercellular signalling. In mice with oncogenic Kras, a specific carcinogenic signature was found, which was preserved in full-blown mouse pancreas cancer. CONCLUSIONS: These data define a transcriptional signature of early pancreatic carcinogenesis and a molecular network driving formation of preneoplastic lesions, which allows for more targeted biomarker development in order to detect cancer earlier in patients with pancreatitis.
Scientific Article
Fiedler, A. ; Raeth, S. ; Theis, F.J. ; Hausser, A. ; Hasenauer, J.
BMC Syst. Biol. 10:80 (2016)
Background: Ordinary differential equation (ODE) models are widely used to describe (bio-)chemical and biological processes. To enhance the predictive power of these models, their unknown parameters are estimated from experimental data. These experimental data are mostly collected in perturbation experiments, in which the processes are pushed out of steady state by applying a stimulus. The knowledge that the initial condition is a steady state of the unperturbed process provides valuable information, as it restricts the dynamics of the process and thereby the parameters. However, implementing steady-state constraints in the optimization often results in convergence problems. Results: In this manuscript, we propose two new methods for solving optimization problems with steady-state constraints. The first method exploits ideas from optimization algorithms on manifolds and introduces a retraction operator, essentially reducing the dimension of the optimization problem. The second method is based on the continuous analogue of the optimization problem. This continuous analogue is an ODE whose equilibrium points are the optima of the constrained optimization problem. This equivalence enables the use of adaptive numerical methods for solving optimization problems with steady-state constraints. Both methods are tailored to the problem structure and exploit the local geometry of the steady-state manifold and its stability properties. A parameterization of the steady-state manifold is not required. The efficiency and reliability of the proposed methods are evaluated using one toy example and two applications. The first application example uses published data while the second uses a novel dataset for Raf/MEK/ERK signaling. The proposed methods demonstrated better convergence properties than state-of-the-art methods employed in systems and computational biology. Furthermore, the average computation time per converged start is significantly lower. 
In addition to the theoretical results, the analysis of the dataset for Raf/MEK/ERK signaling provides novel biological insights regarding the existence of feedback regulation. Conclusion: Many optimization problems considered in systems and computational biology are subject to steady-state constraints. While most optimization methods have convergence problems if these steady-state constraints are highly nonlinear, the methods presented recover the convergence properties of optimizers which can exploit an analytical expression for the parameter-dependent steady state. This renders them an excellent alternative to methods which are currently employed in systems and computational biology.
Scientific Article
Haghverdi, L. ; Büttner, M. ; Wolf, F.A. ; Buettner, F. ; Theis, F.J.
Nat. Methods 13, 845-848 (2016)
The temporal order of differentiating cells is intrinsically encoded in their single-cell expression profiles. We describe an efficient way to robustly estimate this order according to diffusion pseudotime (DPT), which measures transitions between cells using diffusion-like random walks. Our DPT software implementations make it possible to reconstruct the developmental progression of cells and identify transient or metastable states, branching decisions and differentiation endpoints.
Scientific Article
Carpenter, A. ; Eddy, S.E. ; Flicek, P. ; Gymrek, M. ; Hammell, M. ; Jaqaman, K. ; Jenkins, J. ; Koller, D. ; Lappalainen, T. ; Oshlack, A. ; Shamir, R. ; Singh, M. ; Teichmann, S. ; Theis, F.J. ; Troyanskaya, O.
Cell Syst. 3, 7-11 (2016)
Other: Opinion
Theis, F.J.
MEDICA Magazin (2016)
Interview with Prof. Dr. Dr. Fabian Theis, Director of the Institute of Computational Biology (ICB) at Helmholtz Zentrum München and holder of the Chair of Mathematical Modelling of Biological Systems at TU München.
Fröhlich, F. ; Thomas, P. ; Kazeroonian, A. ; Theis, F.J. ; Grima, R. ; Hasenauer, J.
PLoS Comput. Biol. 12:e1005030 (2016)
Quantitative mechanistic models are valuable tools for disentangling biochemical pathways and for achieving a comprehensive understanding of biological systems. However, to be quantitative the parameters of these models have to be estimated from experimental data. In the presence of significant stochastic fluctuations this is a challenging task as stochastic simulations are usually too time-consuming and a macroscopic description using reaction rate equations (RREs) is no longer accurate. In this manuscript, we therefore consider moment-closure approximation (MA) and the system size expansion (SSE), which approximate the statistical moments of stochastic processes and tend to be more precise than macroscopic descriptions. We introduce gradient-based parameter optimization methods and uncertainty analysis methods for MA and SSE. Efficiency and reliability of the methods are assessed using simulation examples as well as by an application to data for Epo-induced JAK/STAT signaling. The application revealed that even if merely population-average data are available, MA and SSE improve parameter identifiability in comparison to RRE. Furthermore, the simulation examples revealed that the resulting estimates are more reliable for an intermediate volume regime. In this regime the estimation error is reduced and we propose methods to determine the regime boundaries. These results illustrate that inference using MA and SSE is feasible and possesses a high sensitivity.
Scientific Article
Sacher, A. ; Theis, F.J.
Laborjournal 7-8, 59-61 (2016)
Bioinformaticians teach computers to solve biological questions. Machine learning programs help the computers get up to speed.
Much, D. ; Beyerlein, A. ; Kindt, A. ; Krumsiek, J. ; Stückler, F. ; Rossbauer, M. ; Hofelich, A. ; Wiesenäcker, D. ; Hivner, S. ; Herbst, M. ; Römisch-Margl, W. ; Prehn, C. ; Adamski, J. ; Kastenmüller, G. ; Theis, F.J. ; Ziegler, A.-G. ; Hummel, S.
Diabetologia 59, 2193-2202 (2016)
AIMS/HYPOTHESIS: Lactation for >3 months in women with gestational diabetes is associated with a reduced risk of type 2 diabetes that persists for up to 15 years postpartum. However, the underlying mechanisms are unknown. We examined whether in women with gestational diabetes lactation for >3 months is associated with altered metabolomic signatures postpartum. METHODS: We enrolled 197 women with gestational diabetes at a median of 3.6 years (interquartile range 0.7-6.5 years) after delivery. Targeted metabolomics profiles (including 156 metabolites) were obtained during a glucose challenge test. Comparisons of metabolite concentrations and ratios between women who lactated for >3 months and women who lactated for ≤3 months or not at all were performed using linear regression with adjustment for age and BMI at the postpartum visit, time since delivery, and maternal education level, and correction for multiple testing. Gaussian graphical modelling was used to generate metabolite networks. RESULTS: Lactation for >3 months was associated with a higher total lysophosphatidylcholine/total phosphatidylcholine ratio; in women with short-term follow-up, it was also associated with lower leucine concentrations and a lower total branched-chain amino acid concentration. Gaussian graphical modelling identified subgroups of closely linked metabolites within phosphatidylcholines and branched-chain amino acids that were affected by lactation for >3 months and have been linked to the pathophysiology of type 2 diabetes in previous studies. CONCLUSIONS/INTERPRETATION: Lactation for >3 months in women with gestational diabetes is associated with changes in the metabolomics profile that have been linked to the early pathogenesis of type 2 diabetes.
Scientific Article
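The Gaussian graphical modelling mentioned in the abstract above links two metabolites only when they remain correlated after conditioning on all others, which is read off the inverse covariance (precision) matrix. The following is a minimal sketch of that idea, not the study's pipeline: the function name `partial_correlations` and the three-variable toy covariance are assumptions made for illustration.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation matrix from a covariance matrix.

    Entry (i, j) is the correlation of variables i and j after
    conditioning on all remaining variables.
    """
    precision = np.linalg.inv(cov)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Hypothetical toy example: a chain A - B - C encoded by a tridiagonal
# precision matrix, so A and C interact only through B.
toy_precision = np.array([[2.0, -1.0, 0.0],
                          [-1.0, 2.0, -1.0],
                          [0.0, -1.0, 2.0]])
toy_cov = np.linalg.inv(toy_precision)
pcor = partial_correlations(toy_cov)
marginal_ac = toy_cov[0, 2] / np.sqrt(toy_cov[0, 0] * toy_cov[2, 2])
```

In this toy case A and C are marginally correlated, yet their partial correlation vanishes, so a Gaussian graphical model would draw no edge between them.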
Hilsenbeck, O. ; Schwarzfischer, M. ; Skylaki, S. ; Schauberger, B. ; Hoppe, P.S. ; Loeffler, D. ; Kokkaliaris, K.D. ; Hastreiter, S. ; Skylaki, E. ; Filipczyk, A. ; Strasser, M. ; Buggenthin, F. ; Feigelman, J. ; Krumsiek, J. ; van den Berg, A.J. ; Endele, M. ; Etzrodt, M. ; Marr, C. ; Theis, F.J. ; Schroeder, T.
Nat. Biotechnol. 34, 703-706 (2016)
Scientific Article
Hoppe, P.S. ; Schwarzfischer, M. ; Loeffler, D. ; Kokkaliaris, K.D. ; Hilsenbeck, O. ; Moritz, N. ; Endele, M. ; Filipczyk, A. ; Gambardella, A. ; Ahmed, N. ; Etzrodt, M. ; Coutu, D.L. ; Rieger, M.A. ; Marr, C. ; Strasser, M. ; Schauberger, B. ; Burtscher, I. ; Ermakova, O. ; Bürger, A. ; Lickert, H. ; Nerlov, C. ; Theis, F.J. ; Schroeder, T.
Nature 535, 299-302 (2016)
The mechanisms underlying haematopoietic lineage decisions remain disputed. Lineage-affiliated transcription factors with the capacity for lineage reprogramming, positive auto-regulation and mutual inhibition have been described as being expressed in uncommitted cell populations. This led to the assumption that lineage choice is cell-intrinsically initiated and determined by stochastic switches of randomly fluctuating cross-antagonistic transcription factors. However, this hypothesis was developed on the basis of RNA expression data from snapshot and/or population-averaged analyses. Alternative models of lineage choice therefore cannot be excluded. Here we use novel reporter mouse lines and live imaging for continuous single-cell long-term quantification of the transcription factors GATA1 and PU.1 (also known as SPI1). We analyse individual haematopoietic stem cells throughout differentiation into megakaryocytic-erythroid and granulocytic-monocytic lineages. The observed expression dynamics are incompatible with the assumption that stochastic switching between PU.1 and GATA1 precedes and initiates megakaryocytic-erythroid versus granulocytic-monocytic lineage decision-making. Rather, our findings suggest that these transcription factors are only executing and reinforcing lineage choice once made. These results challenge the current prevailing model of early myeloid lineage choice.
Scientific Article
Scheel, C. ; Linnemann, J.R. ; Miura, H. ; Meixner, L.K. ; Irmler, M. ; Kloos, U.J. ; Hirschi, B. ; Bartsch, H.S. ; Sass, S. ; Beckers, J. ; Theis, F.J. ; Gabka, C. ; Sotlar, K.
Cancer Res. 76:P1-06-02 (2016)
Meeting abstract
Garzorz-Stark, N. ; Krause, L. ; Lauffer, F. ; Atenhan, A. ; Thomas, J. ; Stark, S.P. ; Franz, R. ; Weidinger, S. ; Balato, A. ; Müller, N.S. ; Theis, F.J. ; Ring, J. ; Schmidt-Weber, C.B. ; Biedermann, T. ; Eyerich, S. ; Eyerich, K.
Exp. Dermatol. 25, 767-774 (2016)
Novel specific therapies for psoriasis and eczema have been developed and they mark a new era in the treatment of these complex inflammatory skin diseases. However, within their broad clinical spectrum, psoriasis and eczema phenotypes overlap, making an accurate diagnosis impossible in special cases, let alone predicting the clinical outcome of an individual patient. Here, we present a novel robust molecular classifier (MC) consisting of the NOS2 and CCL27 genes that diagnosed psoriasis and eczema with a sensitivity and specificity of >95% in a cohort of 129 patients suffering from 1) classical forms, 2) subtypes and 3) clinically and histologically indistinct variants of psoriasis and eczema. NOS2 and CCL27 correlated with clinical and histological hallmarks of psoriasis and eczema in a mutually antagonistic way, thus highlighting their biological relevance. In line with this, the MC could be transferred to the level of immunofluorescence stainings for iNOS and CCL27 protein on paraffin-embedded sections, where patients were diagnosed with sensitivity and specificity >88%. Our MC proved superior to current gold standard methods for distinguishing psoriasis and eczema and may therefore form the basis for molecular diagnosis of chronic inflammatory skin diseases required to establish personalized medicine in the field.
Scientific Article
Styczynski, M.P. ; Theis, F.J.
Curr. Opin. Biotechnol. 39, IV-VI (2016)
Editorial
Krause, L. ; Mourantchanian, V. ; Brockow, K. ; Theis, F.J. ; Schmidt-Weber, C.B. ; Knapp, B. ; Müller, N.S. ; Eyerich, S.
J. Allergy Clin. Immunol. 138, 1207-1210.e2 (2016)
Scientific Article
Theis, F.J.
Laborjournal 23, 22-25 (2016)
Fabian Theis is head of the Institute of Computational Biology at Helmholtz Zentrum München. In this Laborjournal interview, he explains how machine learning and neural networks help to make sense of complex biological datasets.
Krumsiek, J. ; Bartel, J. ; Theis, F.J.
Curr. Opin. Biotechnol. 39, 198-206 (2016)
Systems genetics is defined as the simultaneous assessment and analysis of multi-omics datasets. In the past few years, metabolomics has been established as a robust tool describing an important functional layer in this approach. The metabolome of a biological system represents an integrated state of genetic and environmental factors and has been referred to as a 'link between genotype and phenotype'. In this review, we summarize recent progress in statistical analysis methods for metabolomics data in combination with other omics layers. We put a special focus on complex, multivariate statistical approaches as well as pathway-based and network-based analysis methods. Moreover, we outline current challenges and pitfalls of metabolomics-focused multi-omics analyses and discuss future steps for the field.
Scientific Article
Geissen, E.M. ; Hasenauer, J. ; Heinrich, S. ; Hauf, S. ; Theis, F.J. ; Radde, N.
Bioinformatics 32, 2464-2472 (2016)
Motivation: The statistical analysis of single-cell data is a challenge in cell biological studies. Tailored statistical models and computational methods are required to resolve the subpopulation structure, i.e. to correctly identify and characterize subpopulations. These approaches also support the unraveling of sources of cell-to-cell variability. Finite mixture models have shown promise, but the available approaches are ill suited to the simultaneous consideration of data from multiple experimental conditions and to censored data. The prevalence and relevance of single-cell data and the lack of suitable computational analytics make automated methods, that are able to deal with the requirements posed by these data, necessary. Results: We present MEMO, a flexible mixture modeling framework that enables the simultaneous, automated analysis of censored and uncensored data acquired under multiple experimental conditions. MEMO is based on maximum-likelihood inference and allows for testing competing hypotheses. MEMO can be applied to a variety of different single-cell data types. We demonstrate the advantages of MEMO by analyzing right and interval censored single-cell microscopy data. Our results show that an examination of censoring and the simultaneous consideration of different experimental conditions are necessary to reveal biologically meaningful subpopulation structures. MEMO allows for a stringent analysis of single-cell data and enables researchers to avoid misinterpretation of censored data. Therefore, MEMO is a valuable asset for all fields that infer the characteristics of populations by looking at single individuals such as cell biology and medicine. Availability: MEMO is implemented in MATLAB and freely available via github (https://github.com/MEMO-toolbox/MEMO).
Scientific Article
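The MEMO abstract above combines mixture models with censored observations. As a rough illustration of how censoring enters a maximum-likelihood fit, the sketch below evaluates the log-likelihood of a normal mixture in which exactly observed values contribute a density and right-censored values contribute a survival probability. This is not MEMO's MATLAB code or API: the function names, the choice of normal components, and the restriction to right censoring are all assumptions made here.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_sf(x, mu, sigma):
    """Survival function P(X > x) of a normal distribution."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

def mixture_loglik(observed, right_censored, weights, mus, sigmas):
    """Log-likelihood of a normal mixture with right-censored data.

    observed:       values that were measured exactly
    right_censored: censoring points c, meaning the true value exceeds c
    """
    components = list(zip(weights, mus, sigmas))
    total = 0.0
    for x in observed:
        # Exact observation: mixture density at x.
        density = sum(w * math.exp(normal_logpdf(x, m, s)) for w, m, s in components)
        total += math.log(density)
    for c in right_censored:
        # Censored observation: mixture probability of exceeding c.
        survival = sum(w * normal_sf(c, m, s) for w, m, s in components)
        total += math.log(survival)
    return total
```

Dropping the censored values, or treating the censoring points as exact measurements, changes this likelihood and can bias the inferred subpopulations, which is the kind of misinterpretation the abstract warns against.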
Kondofersky, I. ; Theis, F.J. ; Fuchs, C.
IET Syst. Biol. 10, 210-218 (2016)
In systems biology, one is often interested in the communication patterns between several species, such as genes, enzymes or proteins. These patterns become more recognisable when temporal experiments are performed. This temporal communication can be structured by reaction networks such as gene regulatory networks or signalling pathways. Mathematical modelling of data arising from such networks can reveal important details, thus helping to understand the studied system. In many cases, however, corresponding models still deviate from the observed data. This may be due to unknown but present catalytic reactions. From a modelling perspective, the question of whether a certain reaction is catalysed leads to a large increase of model candidates. For large networks the calibration of all possible models becomes computationally infeasible. We propose a method which determines a substantially reduced set of appropriate model candidates and identifies the catalyst of each reaction at the same time. This is incorporated in a multiple-step procedure which first extends the network by additional latent variables and subsequently identifies catalyst candidates using similarity analysis methods. Results from synthetic data examples suggest a good performance even for non-informative data with few observations. Applied to the CD95 apoptotic pathway, our method provides new insights into apoptosis regulation.
Scientific Article
Rolle-Kampczyk, U.E. ; Krumsiek, J. ; Otto, W. ; Röder, S.W. ; Kohajda, T. ; Borte, M. ; Theis, F.J. ; Lehmann, I. ; von Bergen, M.
Metabolomics 12:76 (2016)
Introduction: A general detrimental effect of smoking during pregnancy on the health of newborn children is well-documented, but the detailed mechanisms remain elusive. Objectives: Besides the specific influence of environmental tobacco smoke derived toxicants on developmental regulation, the impact on the metabolism of newborn children is of particular interest, first as a general marker of foetal development and second due to its potential predictive value for the later occurrence of metabolic diseases. Methods: Tobacco smoke exposure information from a questionnaire was confirmed by measuring the smoking related metabolites S-Phenyl mercapturic acid, S-Benzyl mercapturic acid and cotinine in maternal urine by LC–MS/MS. The impact of smoking on maternal endogenous serum metabolome and children’s cord blood metabolome was assessed in a targeted analysis of 163 metabolites by an LC–MS/MS based assay. The anti-oxidative status of maternal serum samples was analysed by a chemiluminescence-based method. Results: Here we present for the first time results of a metabolomic assessment of the cord blood of 40 children and their mothers. Several analytes from the group of phosphatidylcholines, namely PCaaC28:1, PCaaC32:3, PCaeC30:1, PCaeC32:2, PCaeC40:1, and sphingomyelin SM C26:0, differed significantly in mothers and children’s sera depending on smoking status. In serum of smoking mothers the antioxidative capacity of water soluble compounds was not significantly changed while there was a significant decrease in the lipid fraction. Conclusion: Our data give evidence that smoking during pregnancy alters both the maternal and children’s metabolome. Whether the different pattern found in adults compared to newborn children could be related to different disease outcomes should be the focus of future studies.
Scientific Article
Haffner, I. ; Luber, B. ; Maier, D. ; Geier, B. ; Theis, F.J. ; Meyer-Hermann, M. ; Walch, A.K. ; Kretzschmar, A.K. ; von Weikersthal, L.F. ; Knorrenschild, J.R. ; Schierle, K. ; Wittekind, C. ; Lordick, F.
Oncol. Res. Treat. 39, 16-17 (2016)
Meeting abstract
Preusse, M. ; Theis, F.J. ; Müller, N.S.
PLoS ONE 11:e0151771 (2016)
MicroRNAs are involved in almost all biological processes and have emerged as regulators of signaling pathways. We show that miRNA target genes and pathway genes are not uniformly expressed across human tissues. To capture tissue specific effects, we developed a novel methodology for tissue specific pathway analysis of miRNAs. We incorporated the most recent and highest quality miRNA targeting data (TargetScan and StarBase), RNA-seq based gene expression data (EBI Expression Atlas) and multiple new pathway data sources to increase the biological relevance of the predicted miRNA-pathway associations. We identified new potential roles of miR-199a-3p, miR-199b-3p and the miR-200 family in hepatocellular carcinoma, involving the regulation of metastasis through MAPK and Wnt signaling. Also, an association of miR-571 and Notch signaling in liver fibrosis was proposed. To facilitate data update and future extensions of our tool, we developed a flexible database backend using the graph database neo4j. The new backend as well as the novel methodology were included in the updated miTALOS v2, a tool that provides insights into tissue specific miRNA regulation of biological pathways. miTALOS v2 is available at http://mips.helmholtz-muenchen.de/mitalos.
Scientific Article
Huypens, P. ; Sass, S. ; Wu, M. ; Dyckhoff, D. ; Tschöp, M.H. ; Theis, F.J. ; Marschall, S. ; Hrabě de Angelis, M. ; Beckers, J.
Nat. Genet. 48, 497-499 (2016)
There is considerable controversy regarding epigenetic inheritance in mammalian gametes. Using in vitro fertilization to ensure exclusive inheritance via the gametes, we show that a parental high-fat diet renders offspring more susceptible to developing obesity and diabetes in a sex- and parent of origin-specific mode. The epigenetic inheritance of acquired metabolic disorders may contribute to the current obesity and diabetes pandemic.
Scientific Article
Laimighofer, M. ; Krumsiek, J. ; Buettner, F. ; Theis, F.J.
J. Comput. Biol. 23, 279-290 (2016)
With widespread availability of omics profiling techniques, the analysis and interpretation of high-dimensional omics data, for example, for biomarkers, is becoming an increasingly important part of clinical medicine because such datasets constitute a promising resource for predicting survival outcomes. However, early experience has shown that biomarkers often generalize poorly. Thus, it is crucial that models are not overfitted and give accurate results with new data. In addition, reliable detection of multivariate biomarkers with high predictive power (feature selection) is of particular interest in clinical settings. We present an approach that addresses both aspects in high-dimensional survival models. Within a nested cross-validation (CV), we fit a survival model, evaluate a dataset in an unbiased fashion, and select features with the best predictive power by applying a weighted combination of CV runs. We evaluate our approach using simulated toy data, as well as three breast cancer datasets, to predict the survival of breast cancer patients after treatment. In all datasets, we achieve more reliable estimation of predictive power for unseen cases and better predictive performance compared to the standard CoxLasso model. Taken together, we present a comprehensive and flexible framework for survival models, including performance estimation, final feature selection, and final model construction. The proposed algorithm is implemented in an open source R package (SurvRank) available on CRAN.
Scientific Article
Kazeroonian, A. ; Fröhlich, F. ; Raue, A. ; Theis, F.J. ; Hasenauer, J.
PLoS ONE 11:e0146732 (2016)
Gene expression, signal transduction and many other cellular processes are subject to stochastic fluctuations. The analysis of these stochastic chemical kinetics is important for understanding cell-to-cell variability and its functional implications, but it is also challenging. A multitude of exact and approximate descriptions of stochastic chemical kinetics have been developed; however, tools to automatically generate the descriptions and compare their accuracy and computational efficiency are missing. In this manuscript we introduce CERENA, a toolbox for the analysis of stochastic chemical kinetics using Approximations of the Chemical Master Equation solution statistics. CERENA implements stochastic simulation algorithms and the finite state projection for microscopic descriptions of processes, the system size expansion and moment equations for meso- and macroscopic descriptions, as well as the novel conditional moment equations for a hybrid description. This unique collection of descriptions in a single toolbox facilitates the selection of appropriate modeling approaches. Unlike other software packages, the implementation of CERENA is completely general and allows, e.g., for time-dependent propensities and non-mass action kinetics. By providing SBML import, symbolic model generation and simulation using MEX-files, CERENA is user-friendly and computationally efficient. The availability of forward and adjoint sensitivity analyses allows for further studies such as parameter estimation and uncertainty analysis. The MATLAB code implementing CERENA is freely available from http://cerenadevelopers.github.io/CERENA/.
Scientific Article
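The CERENA abstract above lists stochastic simulation algorithms as the microscopic description of a reaction network. A minimal sketch of one such algorithm, Gillespie's direct method, is given below for a birth-death process (production at constant rate, degradation proportional to copy number). It is written in Python for illustration only and does not reflect CERENA's MATLAB interface; the function name, parameter names and example rates are assumptions.

```python
import random

def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Exact stochastic simulation of 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x)."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod            # propensity of the production reaction
        a_deg = k_deg * x          # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                  # no reaction can fire any more
        t += rng.expovariate(a_total)   # exponential waiting time to the next event
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts

# Hypothetical rates: the stationary mean copy number is k_prod / k_deg = 10.
times, counts = gillespie_birth_death(k_prod=10.0, k_deg=1.0, x0=0, t_end=50.0)
```

Each run yields one trajectory; statistics such as means and variances require many runs, which is why approximations like moment equations and the system size expansion are attractive when simulation cost matters.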
Jung, B. ; Padula, D. ; Burtscher, I. ; Landerer, C. ; Lutter, D. ; Theis, F.J. ; Messias, A.C. ; Geerlof, A. ; Sattler, M. ; Kremmer, E. ; Boldt, K. ; Ueffing, M. ; Lickert, H.
PLoS ONE 11:e0149477 (2016)
The seven-transmembrane receptor Smoothened (Smo) activates all Hedgehog (Hh) signaling by translocation into the primary cilia (PC), but how this is regulated is not well understood. Here we show that Pitchfork (Pifo) and the G protein-coupled receptor associated sorting protein 2 (Gprasp2) are essential components of an Hh induced ciliary targeting complex able to regulate Smo translocation to the PC. Depletion of Pifo or Gprasp2 leads to failure of Smo translocation to the PC and lack of Hh target gene activation. Together, our results identify a novel protein complex that is regulated by Hh signaling and required for Smo ciliary trafficking and Hh pathway activation.
Scientific Article
Stojcheva, N. ; Schechtmann, G. ; Sass, S. ; Roth, P. ; Florea, A.M. ; Stefanski, A. ; Stühler, K. ; Wolter, M. ; Müller, N.S. ; Theis, F.J. ; Weller, M.G. ; Reifenberger, G. ; Happold, C.
Oncotarget 7, 12937-12950 (2016)
Glioblastoma is the most aggressive brain tumor in adults with a median survival below 12 months in population-based studies. The main reason for tumor recurrence and progression is constitutive or acquired resistance to the standard of care of surgical resection followed by radiotherapy with concomitant and adjuvant temozolomide (TMZ/RT→TMZ). Here, we investigated the role of microRNA (miRNA) alterations as mediators of alkylator resistance in glioblastoma cells. Using microarray-based miRNA expression profiling of parental and TMZ-resistant cultures of three human glioma cell lines, we identified a set of differentially expressed miRNA candidates. From these, we selected miR-138 for further functional analyses as this miRNA was not only upregulated in TMZ-resistant versus parental cells, but also showed increased expression in vivo in recurrent glioblastoma tissue samples after TMZ/RT→TMZ treatment. Transient transfection of miR-138 mimics in glioma cells with low basal miR-138 expression increased glioma cell proliferation. Moreover, miR-138 overexpression increased TMZ resistance in long-term glioblastoma cell lines and glioma initiating cell cultures. The apoptosis regulator BIM was identified as a direct target of miR-138, and its silencing mediated the induced TMZ resistance phenotype. Altered sensitivity to apoptosis played only a minor role in this resistance mechanism. Instead, we identified the induction of autophagy to be regulated downstream of the miR-138/BIM axis and to promote cell survival following TMZ exposure. Our data thus define miR-138 as a glioblastoma cell survival-promoting miRNA associated with resistance to TMZ therapy in vitro and with tumor progression in vivo.
Scientific Article
Blasi, T. ; Feller, C. ; Feigelman, J. ; Hasenauer, J. ; Imhof, A. ; Theis, F.J. ; Becker, P.B. ; Marr, C.
Cell Syst. 2, 49-58 (2016)
Post-translational modifications (PTMs) are pivotal to cellular information processing, but how combinatorial PTM patterns (“motifs”) are set remains elusive. We develop a computational framework, which we provide as open source code, to investigate the design principles generating the combinatorial acetylation patterns on histone H4 in Drosophila melanogaster. We find that models assuming purely unspecific or lysine site-specific acetylation rates were insufficient to explain the experimentally determined motif abundances. Rather, these abundances were best described by an ensemble of models with acetylation rates that were specific to motifs. The model ensemble converged upon four acetylation pathways; we validated three of these using independent data from a systematic enzyme depletion study. Our findings suggest that histone acetylation patterns originate through specific pathways involving motif-specific acetylation activity.
Scientific Article
Blasi, T. ; Hennig, H. ; Summers, H.D. ; Theis, F.J. ; Cerveira, J. ; Patterson, J.O. ; Davies, D. ; Filby, A. ; Carpenter, A.E. ; Rees, P.
Nat. Commun. 7:10256 (2016)
Imaging flow cytometry combines the high-throughput capabilities of conventional flow cytometry with single-cell imaging. Here we demonstrate label-free prediction of DNA content and quantification of the mitotic cell cycle phases by applying supervised machine learning to morphological features extracted from brightfield and the typically ignored darkfield images of cells from an imaging flow cytometer. This method facilitates non-destructive monitoring of cells avoiding potentially confounding effects of fluorescent stains while maximizing available fluorescence channels. The method is effective in cell cycle analysis for mammalian cells, both fixed and live, and accurately assesses the impact of a cell cycle mitotic phase blocking agent. As the same method is effective in predicting the DNA content of fission yeast, it is likely to have a broad application to other cell types.
Scientific Article
Hug, S. ; Schmidl, D. ; Li, W.B. ; Greiter, M. ; Theis, F.J.
In: Geris, L.* ; Gomez-Cabrero, D.* [Eds.]: Uncertainty in Biology : A Computational Modeling Approach. Berlin; Heidelberg: Springer, 2016. 243-268 (Stud. Mechanobiol. Tiss. Engineering Biomater. ; 17)
In this chapter, we focus on Bayesian model selection for biological dynamical systems. We do not present an overview of existing methods, but showcase their comparison and the application to ordinary differential equation (ODE) systems, as well as the inference of the parameters in the ODE system. For this, our method of choice is the Bayes factor, computed by Thermodynamic Integration. We first present several model selection methods, both alternatives to the Bayes factor and several methods for calculating the Bayes factor, foremost among them said Thermodynamic Integration. As a simple example for the selection problem, we resort to a choice between normal distributions, which is analytically tractable. We apply our chosen method to a medium-sized ODE model selection problem from radiation science and demonstrate how predictions can be drawn from the model selection results.
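The chapter summary above names thermodynamic integration and an analytically tractable choice between normal distributions. The sketch below is one possible toy version of such an example, not the chapter's own: data are modelled as y_i ~ N(theta, 1), model M1 places a prior theta ~ N(0, tau2) on the mean, and model M0 fixes theta = 0. Thermodynamic integration uses log Z = integral over t in [0, 1] of E_t[log L(theta)], where the expectation is taken under the power posterior proportional to L(theta)^t times the prior. Because everything is Gaussian here, that expectation is available in closed form; in a realistic ODE model it would be estimated by MCMC at each temperature. All names, the data values and the temperature schedule are assumptions.

```python
import math

def log_evidence_exact(y, tau2):
    """Closed-form log marginal likelihood of M1 (y_i ~ N(theta, 1), theta ~ N(0, tau2))."""
    n = len(y)
    s1 = sum(y)
    s2 = sum(v * v for v in y)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * math.log(1.0 + n * tau2)
            - 0.5 * (s2 - tau2 * s1 * s1 / (1.0 + n * tau2)))

def expected_loglik(y, tau2, t):
    """E[log L(theta)] under the power posterior at temperature t (closed form)."""
    n = len(y)
    var = 1.0 / (1.0 / tau2 + t * n)   # power-posterior variance of theta
    mean = t * sum(y) * var            # power-posterior mean of theta
    sq = sum((v - mean) ** 2 for v in y)
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * (sq + n * var)

def log_evidence_ti(y, tau2, n_temps=2000):
    """Thermodynamic integration with a trapezoidal rule on a grid dense near t = 0."""
    temps = [(i / (n_temps - 1)) ** 5 for i in range(n_temps)]
    vals = [expected_loglik(y, tau2, t) for t in temps]
    return sum(0.5 * (vals[i] + vals[i + 1]) * (temps[i + 1] - temps[i])
               for i in range(n_temps - 1))

# Hypothetical data set.
y = [0.8, 1.3, -0.2, 0.9, 1.7, 0.4, 1.1, 0.6]
log_z1 = log_evidence_ti(y, tau2=1.0)
# M0 has no free parameter, so its evidence is the likelihood at theta = 0.
log_z0 = -0.5 * len(y) * math.log(2.0 * math.pi) - 0.5 * sum(v * v for v in y)
log_bayes_factor_10 = log_z1 - log_z0
```

The temperature grid is concentrated near t = 0 because the integrand changes fastest there; comparing `log_z1` with `log_evidence_exact(y, 1.0)` gives a direct check on the numerical integration.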
Angerer, P. ; Haghverdi, L. ; Büttner, M. ; Theis, F.J. ; Marr, C. ; Buettner, F.
Bioinformatics 32, 1241-1243 (2016)
Diffusion maps are a spectral method for non-linear dimension reduction and have recently been adapted for the visualization of single cell expression data. Here we present destiny, an efficient R implementation of the diffusion map algorithm. Our package includes a single-cell specific noise model allowing for missing and censored values. In contrast to previous implementations, we further present an efficient nearest-neighbour approximation that allows for the processing of hundreds of thousands of cells and a functionality for projecting new data on existing diffusion maps. As an example, we apply destiny to a recent time-resolved mass cytometry dataset of cellular reprogramming. AVAILABILITY AND IMPLEMENTATION: destiny is an open-source R/Bioconductor package (http://bioconductor.org/packages/destiny), also available at https://www.helmholtz-muenchen.de/icb/destiny. A detailed vignette describing functions and workflows is provided with the package.
Scientific Article
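The destiny abstract above describes diffusion maps as a spectral method. A bare-bones version of the underlying computation can be sketched as follows: build a Gaussian kernel on pairwise distances, normalise it into a Markov transition matrix, and use its leading non-trivial eigenvectors as coordinates. This is a dense textbook variant in Python, not the destiny R package; it omits destiny's noise model and nearest-neighbour approximation, and the function name, `sigma` parameter and toy data are assumptions.

```python
import numpy as np

def diffusion_map(X, sigma, n_components=2):
    """Basic diffusion map of the rows of X using a Gaussian kernel of width sigma."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / (2.0 * sigma ** 2))
    d = K.sum(axis=1)
    # Symmetric matrix with the same eigenvalues as the Markov matrix P = D^-1 K.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Convert to right eigenvectors of P; the first one is constant and is skipped.
    psi = vecs / np.sqrt(d)[:, None]
    keep = slice(1, n_components + 1)
    return vals[keep], psi[:, keep] * vals[keep]

# Hypothetical toy data: 50 points along a line segment.
X = np.column_stack([np.linspace(0.0, 1.0, 50), np.zeros(50)])
eigenvalues, coords = diffusion_map(X, sigma=0.1, n_components=2)
```

For data lying along a one-dimensional trajectory, the first diffusion component orders the points along that trajectory (up to sign), which is the property that makes the method useful for differentiation time courses.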
Willmann, S. ; Müller, N.S. ; Engert, S. ; Sterr, M. ; Burtscher, I. ; Raducanu, A. ; Irmler, M. ; Beckers, J. ; Sass, S. ; Theis, F.J. ; Lickert, H.
Mech. Dev. 139, 51-64 (2016)
Pancreas organogenesis is a highly dynamic process where neighboring tissue interactions lead to dynamic changes in gene regulatory networks that orchestrate endocrine, exocrine, and ductal lineage formation. To understand the spatio-temporal regulatory logic we have used the Forkhead transcription factor Foxa2-Venus fusion (FVF) knock-in reporter mouse to separate the FVF+ pancreatic epithelium from the FVF− surrounding tissue (mesenchyme, neurons, blood, and blood vessels) to perform a genome-wide mRNA expression profiling at embryonic days (E) 12.5–15.5. Annotation of genes and molecular processes suggests that FVF marks endoderm-derived multipotent epithelial progenitors at several lineage restriction steps, when the bulk of endocrine, exocrine and ductal cells are formed during the secondary transition. In the pancreatic epithelial compartment, we identified most known endocrine and exocrine lineage determining factors and diabetes-associated genes, but also unknown genes with spatio-temporal regulated pancreatic expression. In the non-endoderm-derived compartment, we identified many well-described regulatory genes that are not yet functionally annotated in pancreas development, emphasizing that neighboring tissue interactions are still ill defined. Pancreatic expression of over 635 genes was analyzed with the mRNA in situ hybridization Genepaint public database. This validated the quality of the profiling data set and identified hundreds of genes with spatially restricted expression patterns in the pancreas. Some of these genes are also targeted by pancreatic transcription factors and show active chromatin marks in human islets of Langerhans. Thus, with the highest spatio-temporal resolution of a global gene expression profile during the secondary transition, our study sheds light on neighboring tissue interactions, developmental timing and diabetes gene regulation.
Scientific Article
Zissler, U.M. ; Chaker, A. ; Effner, R. ; Ulrich, M. ; Guerth, F. ; Piontek, G. ; Dietz, K. ; Regn, M. ; Knapp, B. ; Theis, F.J. ; Heine, H. ; Suttner, K. ; Schmidt-Weber, C.B.
Mucosal Immunol. 9, 917-926 (2016)
Interferon-γ (IFN-γ) and interleukin-4 (IL-4) are key effector cytokines for the differentiation of T helper type 1 and 2 (Th1 and Th2) cells. Both cytokines induce fate-decisive transcription factors such as GATA3 and TBX21 that antagonize the polarized development of opposite phenotypes by direct regulation of each other's expression along with many other target genes. Although it is well established that mesenchymal cells directly respond to Th1 and Th2 cytokines, the nature of antagonistic differentiation programs in airway epithelial cells is only partially understood. In this study, primary normal human bronchial epithelial cells (NHBEs) were exposed to IL-4, IFN-γ, or both and genome-wide transcriptome analysis was performed. The study uncovers an antagonistic regulation pattern of IL-4 and IFN-γ in NHBEs, translating the Th1/Th2 antagonism directly into epithelial gene regulation. IL-4- and IFN-γ-induced transcription factor hubs form clusters, present in antagonistic and polarized gene regulation networks. Furthermore, the IL-4-dependent induction of IL-24 observed in rhinitis patients was downregulated by IFN-γ, and therefore IL-24 represents a potential biomarker of allergic inflammation and a Th2 polarized condition of the epithelium.
Scientific Article
Conlon, T.M. ; Bartel, J. ; Ballweg, K. ; Günter, S. ; Prehn, C. ; Krumsiek, J. ; Meiners, S. ; Theis, F.J. ; Adamski, J. ; Eickelberg, O. ; Yildirim, A.Ö.
Clin. Sci. 130, 273-287 (2016)
Chronic obstructive pulmonary disease (COPD) is characterized by chronic bronchitis, small airway remodeling and emphysema. Emphysema is the destruction of alveolar structures, leading to enlarged airspaces and reduced surface area impairing the ability for gaseous exchange. To further understand the pathological mechanisms underlying progressive emphysema we used mass spectrometry-based approaches to quantitate the lung, bronchoalveolar-lavage fluid (BALF) and serum metabolome during emphysema progression in the established murine porcine pancreatic elastase (PPE) model on days 28, 56 and 161, compared to PBS controls. Partial Least Square analysis revealed greater changes in the metabolome of lung followed by BALF rather than serum during emphysema progression. Furthermore, we demonstrate for the first time that emphysema progression is associated with a reduction in lung specific L-carnitine, a metabolite critical for transporting long chain fatty acids into the mitochondria for their subsequent β-oxidation. In vitro, stimulation of the ATII-like LA4 cell line with L-carnitine diminished apoptosis induced by both PPE and H2O2. Moreover, PPE-treated mice demonstrated impaired lung function compared to PBS-treated controls (lung compliance; 0.067±0.008 ml/cmH2O vs 0.035±0.005 ml/cmH2O, p<0.0001), which improved following supplementation with L-carnitine (0.051±0.006, p<0.01) and was associated with a reduction in apoptosis. In summary, our results provide a new insight into the role of L-carnitine and, importantly, suggest therapeutic avenues for COPD.
Scientific Article
Miettinen, J. ; Illner, K. ; Nordhausen, K. ; Oja, H. ; Taskinen, S. ; Theis, F.J.
J. Time Ser. Anal. 37, 337-354 (2016)
In blind source separation, one assumes that the observed p time series are linear combinations of p latent uncorrelated weakly stationary time series. To estimate the unmixing matrix, which transforms the observed time series back to uncorrelated latent time series, second-order blind identification (SOBI) uses joint diagonalization of the covariance matrix and autocovariance matrices with several lags. In this article, we find the limiting distribution of the well-known symmetric SOBI estimator under general conditions and compare its asymptotical efficiencies to those of the recently introduced deflation-based SOBI estimator. The theory is illustrated by some finite-sample simulation studies.
Scientific Article
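The second-order idea in the Miettinen et al. abstract above can be illustrated with a small sketch. It is not the SOBI estimator analysed in the article: SOBI jointly diagonalizes autocovariance matrices at several lags, whereas this sketch uses a single lag (the simpler AMUSE procedure) and is restricted to two series so that each diagonalization reduces to one rotation angle. The function names, the mixing matrix, the sinusoidal sources and the lag are all illustrative assumptions.

```python
import math


def center(xs):
    """Subtract the sample mean from a series."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


def lagged_cov(x, y, lag=0):
    """Sample covariance between x[t] and y[t + lag] for centered series."""
    n = len(x) - lag
    return sum(x[t] * y[t + lag] for t in range(n)) / n


def correlation(x, y):
    """Pearson correlation of two series."""
    x, y = center(x), center(y)
    return lagged_cov(x, y) / math.sqrt(lagged_cov(x, x) * lagged_cov(y, y))


def diagonalizing_angle(a, b, d):
    """Rotation angle that diagonalizes the symmetric matrix [[a, b], [b, d]]."""
    return 0.5 * math.atan2(2.0 * b, a - d)


def rotate(x, y, theta):
    """Project a pair of series onto axes rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    u = [c * xi + s * yi for xi, yi in zip(x, y)]
    v = [-s * xi + c * yi for xi, yi in zip(x, y)]
    return u, v


def amuse_2d(x1, x2, lag):
    """Single-lag second-order separation of two observed series (AMUSE)."""
    x1, x2 = center(x1), center(x2)
    # Step 1: whiten by diagonalizing the lag-0 covariance and rescaling.
    theta = diagonalizing_angle(
        lagged_cov(x1, x1), lagged_cov(x1, x2), lagged_cov(x2, x2))
    p1, p2 = rotate(x1, x2, theta)
    z1 = [v / math.sqrt(lagged_cov(p1, p1)) for v in p1]
    z2 = [v / math.sqrt(lagged_cov(p2, p2)) for v in p2]
    # Step 2: diagonalize the symmetrized autocovariance at the chosen lag.
    m12 = 0.5 * (lagged_cov(z1, z2, lag) + lagged_cov(z2, z1, lag))
    phi = diagonalizing_angle(
        lagged_cov(z1, z1, lag), m12, lagged_cov(z2, z2, lag))
    return rotate(z1, z2, phi)


# Hypothetical example: two sinusoidal sources mixed by an arbitrary matrix.
n = 2000
s1 = [math.sin(0.10 * t) for t in range(n)]
s2 = [math.sin(0.37 * t) for t in range(n)]
x1 = [1.0 * a + 0.6 * b for a, b in zip(s1, s2)]
x2 = [0.4 * a + 1.0 * b for a, b in zip(s1, s2)]
y1, y2 = amuse_2d(x1, x2, lag=5)

# Absolute correlation of each recovered series with each true source.
match = [[abs(correlation(y, s)) for s in (s1, s2)] for y in (y1, y2)]
```

Sources are recovered only up to sign, scale and order, which is why the comparison uses absolute correlations. A single lag works here because the two sources have clearly different autocorrelations at lag 5. Using several lags, as SOBI does, makes the estimate less dependent on that choice.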
Kong, B. ; Wu, W. ; Cheng, T. ; Schlitter, A.M. ; Qian, C. ; Bruns, P. ; Jian, Z. ; Jager, C. ; Regel, I. ; Raulefs, S. ; Behler, N. ; Irmler, M. ; Beckers, J. ; Friess, H. ; Erkan, M. ; Siveke, J.T. ; Tannapfel, A. ; Hahn, S.A. ; Theis, F.J. ; Esposito, I. ; Kleeff, J. ; Michalski, C.W.
Gut 65, 647-657 (2016)
OBJECTIVE: Oncogenic Kras-activated robust Mek/Erk signals phosphorylate the tuberous sclerosis complex (Tsc) and deactivate mammalian target of rapamycin (mTOR) suppression in pancreatic ductal adenocarcinoma (PDAC); however, Mek and mTOR inhibitors alone have demonstrated minimal clinical antitumor activity. DESIGN: We generated transgenic mouse models in which mTOR was hyperactivated either through the Kras/Mek/Erk cascade, by loss of Pten or through Tsc1 haploinsufficiency. Primary cancer cells were isolated from mouse tumours. Oncogenic signalling was assessed in vitro and in vivo, with and without single or multiple targeted molecule inhibition. Transcriptional profiling was used to identify biomarkers predictive of the underlying pathway alterations and of therapeutic response. Results from the preclinical models were confirmed on human material. RESULTS: Reduction of Tsc1 function facilitated activation of Kras/Mek/Erk-mediated mTOR signalling, which promoted the development of metastatic PDACs. Single inhibition of mTOR or Mek elicited strong feedback activation of Erk or Akt, respectively. Only dual inhibition of Mek and PI3K reduced mTOR activity and effectively induced cancer cell apoptosis. Analysis of downstream targets demonstrated that oncogenic activity of the Mek/Erk/Tsc/mTOR axis relied on Aldh1a3 function. Moreover, in clinical PDAC samples, ALDH1A3 specifically labelled an aggressive subtype. CONCLUSIONS: These results advance our understanding of Mek/Erk-driven mTOR activation and its downstream targets in PDAC, and provide a mechanistic rationale for effective therapeutic matching for Aldh1a3-positive PDACs.
Scientific Article
2015
Bartel, S. ; Schulz, N. ; Schamberger, A.C. ; Alessandrini, F. ; Pagel, P. ; Theis, F.J. ; Milger, K. ; Nößner, E. ; Stick, S.M. ; Kicic, A. ; Eickelberg, O. ; Freishtat, R.J. ; Krauss-Etschmann, S.
In:. 2015. 69-A24 (Pneumologe)
Background: Asthma is the most common chronic disease in children leading to respiratory dysfunction in adults if not identified correctly. Therefore, there is an unmet clinical need to find novel treatment targets for asthma pathogenesis. Objective: miRNA profiles were used to identify dysregulated signalling pathways in asthma as they are critical regulators of key molecules. Methods: Whole lung microRNAs were identified in mice with ovalbumin (OVA)-induced allergic inflammation by qualitative and quantitative microarrays. Target mRNAs, identified by in silico analysis, were validated using reporter assays. miRNAs and their targets were validated in house-dust-mite (HDM)-induced asthma, primary normal human bronchial epithelial (NHBE) cells, and in nasal brushings from asthmatic children and controls. Results: We identified the transcription factor cAMP-responsive element binding protein (CREB1) and its transcriptional co-activators (CRTC1-3) as targets of miR-17, miR-144, and miR-21 in vitro and ex vivo. A predicted CREB1 downstream target, SEC14-like 3 (SEC14l3), was down-regulated in two asthma models (OVA and HDM), was associated with ciliated cells in NHBE cultures, and its expression was decreased by IL-13. In asthmatic children the three miRNAs were increased in nasal epithelial cells compared to healthy controls, while SEC14l3 expression was reduced. Conclusion: By using altered miRNA profiles in experimental asthma we identified a previously uncharacterized CREB1/CRTC-SEC14l3 axis that might be relevant for experimental and paediatric asthma.
Sass, S. ; Pitea, A. ; Unger, K. ; Hess-Rieger, J. ; Müller, N.S. ; Theis, F.J.
Int. J. Mol. Sci. 16, 30204-30222 (2015)
MicroRNAs represent ~22 nt long endogenous small RNA molecules that have been experimentally shown to regulate gene expression post-transcriptionally. One main interest in miRNA research is the investigation of their functional roles, which can typically be accomplished by identification of mi-/mRNA interactions and functional annotation of target gene sets. We here present a novel method "miRlastic", which infers miRNA-target interactions using transcriptomic data as well as prior knowledge and performs functional annotation of target genes by exploiting the local structure of the inferred network. For the network inference, we applied linear regression modeling with elastic net regularization on matched microRNA and messenger RNA expression profiling data to perform feature selection on prior knowledge from sequence-based target prediction resources. The novelty of miRlastic inference originates in predicting data-driven intra-transcriptome regulatory relationships through feature selection. With synthetic data, we showed that miRlastic outperformed commonly used methods and was suitable even for low sample sizes. To gain insight into the functional role of miRNAs and to determine joint functional properties of miRNA clusters, we introduced a local enrichment analysis procedure. The principle of this procedure lies in identifying regions of high functional similarity by evaluating the shortest paths between genes in the network. We can finally assign functional roles to the miRNAs by taking their regulatory relationships into account. We thoroughly evaluated miRlastic on a cohort of head and neck cancer (HNSCC) patients provided by The Cancer Genome Atlas. We inferred an mi-/mRNA regulatory network for human papilloma virus (HPV)-associated miRNAs in HNSCC. The resulting network best enriched for experimentally validated miRNA-target interaction, when compared to common methods. 
Finally, the local enrichment step identified two functional clusters of miRNAs that were predicted to mediate HPV-associated dysregulation in HNSCC. Our novel approach was able to characterize distinct pathway regulations from matched miRNA and mRNA data. An R package of miRlastic was made available through: http://icb.helmholtz-muenchen.de/mirlastic.
Scientific Article
Wang, L. ; Belagiannis, V. ; Marr, C. ; Theis, F.J. ; Yang, G.Z. ; Navab, N.
In: Proceedings (12th IEEE International Symposium on Biomedical Imaging, ISBI 2015, 16-19 April 2015, Brooklyn, United States). 2015. 1304-1307
Anatomical landmarks in images play an important role in medical practice. This paper presents a graphical model that fully automatically detects such landmarks. The model includes a unary potential using a random forest classifier based on local appearance and binary and ternary potentials encoding geometrical context among different landmarks. The weightings of different potentials are learned in a maximum likelihood manner. The final detection result is formulated as the maximum-a-posteriori estimation jointly over the whole set of landmarks in one image. For validation, the model is applied to detect right-ventricle insert points in cardiac MR images. The result shows that the context modelling is able to substantially improve the overall accuracy.
Loos, C. ; Marr, C. ; Theis, F.J. ; Hasenauer, J.
Lect. Notes Comput. Sc. 9308, 52-63 (2015)
Stochastic dynamics of individual cells are mostly modeled with continuous time Markov chains (CTMCs). The parameters of CTMCs can be inferred using likelihood-based and likelihood-free methods. In this paper, we introduce a likelihood-free approximate Bayesian computation (ABC) approach for single-cell time-lapse data. This method uses multivariate statistics on the distribution of single-cell trajectories. We evaluated our method for samples of a bivariate normal distribution as well as for artificial equilibrium and non-equilibrium single-cell time-series of a one-stage model of gene expression. In addition, we assessed our method for parameter variability and for the case of tree-structured time-series data. A comparison with an existing method using univariate statistics revealed an improved parameter identifiability using multivariate test statistics.
Scientific Article
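The likelihood-free principle behind the ABC approach in the Loos et al. abstract above can be sketched with plain rejection sampling. This is not the method of the paper, which compares distributions of single-cell trajectories with multivariate statistics. The sketch makes simplifying assumptions: it uses one univariate summary statistic (the mean copy number), a uniform prior, and snapshot data from a birth-death model of gene expression at steady state, where copy numbers are Poisson distributed with mean equal to the production/degradation ratio. All names, the prior range and the tolerance are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_mean(ratio, n_cells, rng):
    """Summary statistic: mean copy number over n_cells simulated cells.

    Assumes steady state, so each cell is Poisson with mean `ratio`
    (production rate divided by degradation rate).
    """
    return sum(poisson(ratio, rng) for _ in range(n_cells)) / n_cells


def abc_rejection(observed_mean, n_cells, n_draws, eps, rng):
    """Keep prior draws whose simulated summary lies within eps of the data."""
    accepted = []
    for _ in range(n_draws):
        ratio = rng.uniform(0.0, 20.0)  # draw from the (assumed) uniform prior
        if abs(simulate_mean(ratio, n_cells, rng) - observed_mean) < eps:
            accepted.append(ratio)
    return accepted


rng = random.Random(0)
# Synthetic "observed" data generated with a true ratio of 5.
observed_mean = simulate_mean(5.0, 50, rng)
posterior = abc_rejection(observed_mean, n_cells=50, n_draws=2000, eps=0.5, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
```

The accepted draws approximate the posterior of the ratio given the summary statistic. A smaller `eps` tightens the approximation at the cost of fewer accepted samples, and a richer (for example multivariate) statistic can identify parameters that the mean alone cannot.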
Preusse, M. ; Marr, C. ; Saunders, S. ; Maticzka, D. ; Lickert, H. ; Backofen, R. ; Theis, F.J.
RNA Biol. 12, 998-1009 (2015)
microRNAs and microRNA-independent RNA-binding proteins (RBPs) are two classes of post-transcriptional regulators that have been shown to cooperate in gene-expression regulation. We compared the genome-wide target sets of microRNAs and RBPs identified by recent CLIP-Seq technologies, finding that RBPs have distinct target sets and favor gene interaction network hubs. To identify microRNAs and RBPs with a similar functional context, we developed simiRa, a tool that compares enriched functional categories such as pathways and GO terms. We applied simiRa to the known functional cooperation between Pumilio family proteins and miR-221/222 in the regulation of tumor suppressor gene p27 and show that the cooperation is reflected by similar enriched categories but not by target genes. SimiRa also predicts possible cooperation of microRNAs and RBPs beyond direct interaction on the target mRNA for the nuclear RBP TAF15. To further facilitate research into cooperation of microRNAs and RBPs, we made simiRa available as a web tool that displays the functional neighborhood and similarity of microRNAs and RBPs: http://vsicb-simira.helmholtz-muenchen.de.
Scientific Article
Filipczyk, A. ; Marr, C. ; Hastreiter, S. ; Feigelman, J. ; Schwarzfischer, M. ; Hoppe, P.S. ; Loeffler, D. ; Kokkaliaris, K.D. ; Endele, M. ; Schauberger, B. ; Hilsenbeck, O. ; Skylaki, S. ; Hasenauer, J. ; Anastassiadis, K. ; Theis, F.J. ; Schroeder, T.
Nat. Cell Biol. 17, 1235-1246 (2015)
Transcription factor (TF) networks are thought to regulate embryonic stem cell (ESC) pluripotency. However, TF expression dynamics and regulatory mechanisms are poorly understood. We use reporter mouse ESC lines allowing non-invasive quantification of Nanog or Oct4 protein levels and continuous long-term single-cell tracking and quantification over many generations to reveal diverse TF protein expression dynamics. For cells with low Nanog expression, we identified two distinct colony types: one re-expressed Nanog in a mosaic pattern, and the other did not re-express Nanog over many generations. Although both expressed pluripotency markers, they exhibited differences in their TF protein correlation networks and differentiation propensities. Sister cell analysis revealed that differences in Nanog levels are not necessarily accompanied by differences in the expression of other pluripotency factors. Thus, regulatory interactions of pluripotency TFs are less stringently implemented in individual self-renewing ESCs than assumed at present.
Scientific Article
Strasser, M. ; Feigelman, J. ; Theis, F.J. ; Marr, C.
BMC Syst. Biol. 9:61 (2015)
BACKGROUND: Time-lapse microscopy allows monitoring of cell state transitions in a spatiotemporal context. Combined with single cell tracking and appropriate cell state markers, transition events can be observed within the genealogical relationship of a proliferating population. However, to infer the correlations between the spatiotemporal context and cell state transitions, statistical analysis with an appropriately large number of samples is required. RESULTS: Here, we present a method to infer spatiotemporal features predictive of the state transition events observed in time-lapse microscopy data. We first formulate a generative model, simulate different scenarios, such as time-dependent or local cell density-dependent transitions, and illustrate how to estimate univariate transition rates. Second, we formulate the problem in a machine-learning language using regularized linear models. This allows for a multivariate analysis and for disentangling indirect dependencies via feature selection. We find that our method can accurately recover the relevant features and reconstruct the underlying interaction kernels if a critical number of samples is available. Finally, we explicitly use the tree structure of the data to validate whether the estimated model is sufficient to explain correlated transition events of sister cells. CONCLUSIONS: Using synthetic cellular genealogies, we prove that our method is able to correctly identify features predictive of state transitions and we moreover validate the chosen model. Our approach allows estimating the number of cellular genealogies required for the proposed spatiotemporal statistical analysis, and we thus provide an important tool for the experimental design of challenging single cell time-lapse microscopy assays.
Scientific Article
Ilmonen, P. ; Nordhausen, K. ; Oja, H. ; Theis, F.J.
Lect. Notes Comput. Sc. 9237, 328-335 (2015)
The interest in robust methods for blind source separation has increased recently. In this paper we briefly review what has been suggested so far for robustifying ICA and second order blind source separation. Furthermore, we suggest a new algorithm, eSAM-SOBI, which is an affine equivariant improvement of the (already robust) SAM-SOBI. In a simulation study we illustrate the benefits of using eSAM-SOBI when compared to SOBI and SAM-SOBI. For uncontaminated time series SOBI and eSAM-SOBI perform equally well. However, SOBI suffers considerably when the data are contaminated by outliers, whereas robust eSAM-SOBI does not. Because SAM-SOBI lacks affine equivariance, eSAM-SOBI clearly outperforms it for both contaminated and uncontaminated data.
Scientific Article
Garzorz, N.V. ; Krause, L. ; Lauffer, F. ; Atenhan, A. ; Thomas, J. ; Theis, F.J. ; Biedermann, T. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
J. Invest. Dermatol. 135, S1 (2015)
Meeting abstract
Meyer, S.U. ; Sass, S. ; Müller, N.S. ; Krebs, S. ; Bauersachs, S. ; Kaiser, S. ; Blum, H. ; Thirion, C. ; Krause, S. ; Theis, F.J. ; Pfaffl, M.W.
PLoS ONE 10:e0135284 (2015)
INTRODUCTION: Skeletal muscle cell differentiation is impaired by elevated levels of the inflammatory cytokine tumor necrosis factor-α (TNF-α) with pathological significance in chronic diseases or inherited muscle disorders. Insulin-like growth factor-1 (IGF1) positively regulates muscle cell differentiation. Both TNF-α and IGF1 affect gene and microRNA (miRNA) expression in this process. However, computational prediction of miRNA-mRNA relations is challenged by false positives and targets which might be irrelevant in the respective cellular transcriptome context. Thus, this study is focused on functional information about miRNA-affected target transcripts by integrating miRNA and mRNA expression profiling data. METHODOLOGY/PRINCIPAL FINDINGS: Murine skeletal myocytes PMI28 were differentiated for 24 hours with concomitant TNF-α or IGF1 treatment. Both mRNA and miRNA expression profiling were performed. The data-driven integration of target prediction and paired mRNA/miRNA expression profiling data revealed that i) the quantity of predicted miRNA-mRNA relations was reduced, ii) miRNA targets with a function in cell cycle and axon guidance were enriched, iii) differential regulation of anti-differentiation miR-155-5p and miR-29b-3p as well as pro-differentiation miR-335-3p, miR-335-5p, miR-322-3p, and miR-322-5p seemed to be of primary importance during skeletal myoblast differentiation compared to the other miRNAs, iv) the abundance of targets and affected biological processes was miRNA specific, and v) subsets of miRNAs may collectively regulate gene expression. CONCLUSIONS: Joint analysis of mRNA and miRNA profiling data increased the process-specificity and quality of predicted relations by statistically selecting miRNA-target interactions. Moreover, this study revealed miRNA-specific predominant biological implications in skeletal muscle cell differentiation and in response to TNF-α or IGF1 treatment.
Furthermore, myoblast differentiation-associated miRNAs are suggested to collectively regulate gene clusters and targets associated with enriched specific gene ontology terms or pathways. Predicted miRNA functions of this study provide novel insights into defective regulation at the transcriptomic level during myocyte proliferation and differentiation due to inflammatory stimuli.
Scientific Article
Krumsiek, J. ; Mittelstraß, K. ; Do, K.T. ; Stückler, F. ; Ried, J.S. ; Adamski, J. ; Peters, A. ; Illig, T. ; Kronenberg, F. ; Friedrich, N. ; Nauck, M. ; Pietzner, M. ; Mook-Kanamori, D.O. ; Suhre, K. ; Gieger, C. ; Grallert, H. ; Theis, F.J. ; Kastenmüller, G.
Metabolomics 11, 1815-1833 (2015)
The susceptibility for various diseases as well as the response to treatments differ considerably between men and women. As a basis for a gender-specific personalized healthcare, an extensive characterization of the molecular differences between the two genders is required. In the present study, we conducted a large-scale metabolomics analysis of 507 metabolic markers measured in serum of 1756 participants from the German KORA F4 study (903 females and 853 males). One-third of the metabolites show significant differences between males and females. A pathway analysis revealed strong differences in steroid metabolism, fatty acids and further lipids, a large fraction of amino acids, oxidative phosphorylation, purine metabolism and gamma-glutamyl dipeptides. We then extended this analysis by a network-based clustering approach. Metabolite interactions were estimated using Gaussian graphical models to get an unbiased, fully data-driven metabolic network representation. This approach is not limited to possibly arbitrary pathway boundaries and can even include poorly or uncharacterized metabolites. The network analysis revealed several strongly gender-regulated submodules across different pathways. Finally, a gender-stratified genome-wide association study was performed to determine whether the observed gender differences are caused by dimorphisms in the effects of genetic polymorphisms on the metabolome. With only a single genome-wide significant hit, our results suggest that this scenario is not the case. In summary, we report an extensive characterization and interpretation of gender-specific differences of the human serum metabolome, providing a broad basis for future analyses.
Scientific Article
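The Krumsiek et al. abstract above estimates metabolite interactions with Gaussian graphical models, which connect two variables only if they remain correlated after conditioning on all others. A minimal sketch of that idea for three variables follows. It is an illustration under stated assumptions, not the study's pipeline: the data are synthetic, the variable names are hypothetical, and with only one conditioning variable the partial correlation is computed with the first-order formula rather than from an inverse covariance matrix.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after conditioning on z (first-order formula)."""
    rxy, rxz, ryz = correlation(x, y), correlation(x, z), correlation(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Synthetic chain a -> b -> c: a and c interact only through b.
rng = random.Random(1)
n = 5000
a = [rng.gauss(0, 1) for _ in range(n)]
b = [ai + 0.5 * rng.gauss(0, 1) for ai in a]
c = [bi + 0.5 * rng.gauss(0, 1) for bi in b]

marginal_ac = correlation(a, c)             # large: indirect association
partial_ac = partial_correlation(a, c, b)   # near zero: no direct edge
```

In a correlation network `a` and `c` would be linked, whereas the partial correlation removes the edge because the association is fully explained by `b`. This is the sense in which such networks separate direct from indirect relationships.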
Müller, S. ; Raulefs, S. ; Bruns, P. ; Afonso-Grunz, F. ; Plötner, A. ; Thermann, R. ; Jager, C. ; Schlitter, A.M. ; Kong, B. ; Regel, I. ; Roth, W.K. ; Rotter, B. ; Hoffmeier, K. ; Kahl, G.F. ; Koch, I. ; Theis, F.J. ; Kleeff, J. ; Winter, P. ; Michalski, C.W.
Mol. Cancer 14:144 (2015)
Raue, A. ; Steiert, B. ; Schelker, M. ; Kreutz, C. ; Maiwald, T. ; Hass, H. ; Vanlier, J. ; Tönsing, C. ; Adlung, L. ; Engesser, R. ; Mader, W. ; Heinemann, T. ; Hasenauer, J. ; Schilling, M. ; Höfer, T. ; Klipp, E. ; Theis, F.J. ; Klingmüller, U. ; Schöberl, B. ; Timmer, J.
Bioinformatics 31, 3558-3560 (2015)
Modeling of dynamical systems using ordinary differential equations is a popular approach in the field of Systems Biology. Two of the most critical steps in this approach are to construct dynamical models of biochemical reaction networks for large data sets and complex experimental conditions and to perform efficient and reliable parameter estimation for model fitting. We present a modeling environment for MATLAB that pioneers these challenges. The numerically expensive parts of the calculations such as the solving of the differential equations and of the associated sensitivity system are parallelized and automatically compiled into efficient C code. A variety of parameter estimation algorithms as well as frequentist and Bayesian methods for uncertainty analysis have been implemented and used on a range of applications that lead to publications. AVAILABILITY AND IMPLEMENTATION: The Data2Dynamics modeling environment is MATLAB based, open source and freely available at http://www.data2dynamics.org. CONTACT: andreas.raue@fdm.uni-freiburg.de SUPPLEMENTARY INFORMATION: is provided online and contains detailed description of methodology, a user guide and documentation.
Scientific Article
Hasenauer, J. ; Jagiella, N. ; Hross, S. ; Theis, F.J.
J. Coupled Syst. Multiscale Dyn. 3, 101-121 (2015)
Scientific Article
Masserdotti, G. ; Gillotin, S. ; Sutor, B. ; Drechsel, D. ; Irmler, M. ; Jørgensen, H.F. ; Sass, S. ; Theis, F.J. ; Beckers, J. ; Berninger, B. ; Guillemot, F. ; Götz, M.
Cell Stem Cell 17, 74-88 (2015)
Direct lineage reprogramming induces dramatic shifts in cellular identity, employing poorly understood mechanisms. Recently, we demonstrated that expression of Neurog2 or Ascl1 in postnatal mouse astrocytes generates glutamatergic or GABAergic neurons. Here, we take advantage of this model to study dynamics of neuronal cell fate acquisition at the transcriptional level. We found that Neurog2 and Ascl1 rapidly elicited distinct neurogenic programs with only a small subset of shared target genes. Within this subset, only NeuroD4 could by itself induce neuronal reprogramming in both mouse and human astrocytes, while co-expression with Insm1 was required for glutamatergic maturation. Cultured astrocytes gradually became refractory to reprogramming, in part by the repressor REST preventing Neurog2 from binding to the NeuroD4 promoter. Notably, in astrocytes refractory to Neurog2 activation, the underlying neurogenic program remained amenable to reprogramming by exogenous NeuroD4. Our findings support a model of temporal hierarchy for cell fate change during neuronal reprogramming.
Scientific Article
Bartel, J. ; Krumsiek, J. ; Schramm, K. ; Adamski, J. ; Gieger, C. ; Herder, C. ; Carstensen, M. ; Peters, A. ; Rathmann, W. ; Roden, M. ; Strauch, K. ; Suhre, K. ; Kastenmüller, G. ; Prokisch, H. ; Theis, F.J.
PLoS Genet. 11:e1005274 (2015)
Biological systems consist of multiple organizational levels all densely interacting with each other to ensure function and flexibility of the system. Simultaneous analysis of cross-sectional multi-omics data from large population studies is a powerful tool to comprehensively characterize the underlying molecular mechanisms on a physiological scale. In this study, we systematically analyzed the relationship between fasting serum metabolomics and whole blood transcriptomics data from 712 individuals of the German KORA F4 cohort. Correlation-based analysis identified 1,109 significant associations between 522 transcripts and 114 metabolites summarized in an integrated network, the 'human blood metabolome-transcriptome interface' (BMTI). Bidirectional causality analysis using Mendelian randomization did not yield any statistically significant causal associations between transcripts and metabolites. A knowledge-based interpretation and integration with a genome-scale human metabolic reconstruction revealed systematic signatures of signaling, transport and metabolic processes, i.e. metabolic reactions mainly belonging to lipid, energy and amino acid metabolism. Moreover, the construction of a network based on functional categories illustrated the cross-talk between the biological layers at a pathway level. Using a transcription factor binding site enrichment analysis, this pathway cross-talk was further confirmed at a regulatory level. Finally, we demonstrated how the constructed networks can be used to gain novel insights into molecular mechanisms associated to intermediate clinical traits. Overall, our results demonstrate the utility of a multi-omics integrative approach to understand the molecular mechanisms underlying both normal physiology and disease.
Scientific Article
Ocone, A. ; Haghverdi, L. ; Müller, N.S. ; Theis, F.J.
Bioinformatics 31, i89-i96 (2015)
MOTIVATION: High-dimensional single-cell snapshot data are becoming widespread in the systems biology community, as a means to understand biological processes at the cellular level. However, as temporal information is lost with such data, mathematical models have been limited to capturing only static features of the underlying cellular mechanisms. RESULTS: Here, we present a modular framework which allows recovering the temporal behaviour from single-cell snapshot data and reverse engineering the dynamics of gene expression. The framework combines a dimensionality reduction method with a cell time-ordering algorithm to generate pseudo time-series observations. These are in turn used to learn transcriptional ODE models and do model selection on structural network features. We apply it to synthetic data and then to real hematopoietic stem cell data, to reconstruct gene expression dynamics during differentiation pathways and infer the structure of a key gene regulatory network. AVAILABILITY AND IMPLEMENTATION: C++ and Matlab code available at https://www.helmholtz-muenchen.de/fileadmin/ICB/software/inferenceSnapshot.zip. CONTACT: fabian.theis@helmholtz-muenchen.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Scientific Article
Linnemann, J. ; Miura, H. ; Meixner, L.K. ; Irmler, M. ; Kloos, U. ; Hirschi, B. ; Bartsch, H.S. ; Sass, S. ; Beckers, J. ; Theis, F.J. ; Gabka, C. ; Sotlar, K. ; Scheel, C.
Development 142, 3239-3251 (2015)
We present an organoid regeneration assay in which freshly isolated human mammary epithelial cells are cultured in adherent or floating collagen gels, corresponding to a rigid or compliant matrix environment. In both conditions, luminal progenitors form spheres, whereas basal cells generate branched ductal structures. In compliant but not rigid collagen gels, branching ducts form alveoli at their tips, express basal and luminal markers at correct positions, and display contractility, which is required for alveologenesis. Thereby, branched structures generated in compliant collagen gels resemble terminal ductal-lobular units (TDLUs), the functional units of the mammary gland. Using the membrane metallo-endopeptidase CD10 as a surface marker enriches for TDLU formation and reveals the presence of stromal cells within the CD49f(hi)/EpCAM(-) population. In summary, we describe a defined in vitro assay system to quantify cells with regenerative potential and systematically investigate their interaction with the physical environment at distinct steps of morphogenesis.
Scientific Article
Sadic, D. ; Schmidt, K. ; Groh, S. ; Kondofersky, I. ; Ellwart, J.W. ; Fuchs, C. ; Theis, F.J. ; Schotta, G.
EMBO Rep. 16, 836-850 (2015)
More than 50% of mammalian genomes consist of retrotransposon sequences. Silencing of retrotransposons by heterochromatin is essential to ensure genomic stability and transcriptional integrity. Here, we identified a short sequence element in intracisternal A particle (IAP) retrotransposons that is sufficient to trigger heterochromatin formation. We used this sequence in a genome-wide shRNA screen and identified the chromatin remodeler Atrx as a novel regulator of IAP silencing. Atrx binds to IAP elements and is necessary for efficient heterochromatin formation. In addition, Atrx facilitates a robust and largely inaccessible heterochromatin structure as Atrx knockout cells display increased chromatin accessibility at retrotransposons and non-repetitive heterochromatic loci. In summary, we demonstrate a direct role of Atrx in the establishment and robust maintenance of heterochromatin.
Scientific Article
Wilson, N.K. ; Kent, D.G. ; Buettner, F. ; Shehata, M. ; Macaulay, I.C. ; Calero-Nieto, F.J. ; Sánchez Castillo, M. ; Oedekoven, C.A. ; Diamanti, E. ; Schulte, R. ; Ponting, C.P. ; Voet, T. ; Caldas, C. ; Stingl, J. ; Green, A.R. ; Theis, F.J. ; Göttgens, B.
Cell Stem Cell 16, 712-724 (2015)
Heterogeneity within the self-renewal durability of adult hematopoietic stem cells (HSCs) challenges our understanding of the molecular framework underlying HSC function. Gene expression studies have been hampered by the presence of multiple HSC subtypes and contaminating non-HSCs in bulk HSC populations. To gain deeper insight into the gene expression program of murine HSCs, we combined single-cell functional assays with flow cytometric index sorting and single-cell gene expression assays. Through bioinformatic integration of these datasets, we designed an unbiased sorting strategy that separates non-HSCs away from HSCs, and single-cell transplantation experiments using the enriched population were combined with RNA-seq data to identify key molecules that associate with long-term durable self-renewal, producing a single-cell molecular dataset that is linked to functional stem cell activity. Finally, we demonstrated the broader applicability of this approach for linking key molecules with defined cellular functions in another stem cell system.
Scientific Article
Haghverdi, L. ; Buettner, F. ; Theis, F.J.
Bioinformatics 31, 2989-2998 (2015)
MOTIVATION: Single-cell technologies have recently gained popularity in cellular differentiation studies owing to their ability to resolve potential heterogeneities in cell populations. Analysing such high-dimensional single-cell data has its own statistical and computational challenges. Popular multivariate approaches are based on data normalisation, followed by dimension reduction and clustering to identify subgroups. However, in the case of cellular differentiation, we would not expect clear clusters to be present but instead expect the cells to follow continuous branching lineages. RESULTS: Here we propose the use of diffusion maps to deal with the problem of defining differentiation trajectories. We adapt this method to single-cell data by adequate choice of kernel width and inclusion of uncertainties or missing measurement values, which enables the establishment of a pseudo-temporal ordering of single cells in a high-dimensional gene expression space. We expect this output to reflect cell differentiation trajectories, where the data originates from intrinsic diffusion-like dynamics. Starting from a pluripotent stage, cells move smoothly within the transcriptional landscape towards more differentiated states with some stochasticity along their path. We demonstrate the robustness of our method with respect to extrinsic noise (e.g. measurement noise) and sampling density heterogeneities on simulated toy data as well as two single-cell quantitative polymerase chain reaction (qPCR) data sets (i.e. mouse haematopoietic stem cells and mouse embryonic stem cells) and an RNA-Seq data set of human pre-implantation embryos. We show that diffusion maps perform considerably better than Principal Component Analysis (PCA) and are advantageous over other techniques for non-linear dimension reduction such as t-distributed Stochastic Neighbour Embedding (t-SNE) for preserving the global structures and pseudotemporal ordering of cells.
AVAILABILITY: The Matlab implementation of diffusion maps for single-cell data is available at https://www.helmholtz-muenchen.de/icb/single-cell-diffusion-map. CONTACT: fbuettner.phys@gmail.com, fabian.theis@helmholtz-muenchen.de.
Scientific Article
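The diffusion-map idea summarised in the abstract above can be illustrated with a minimal sketch. It is not the authors' Matlab implementation: it uses a fixed Gaussian kernel width `sigma`, omits the paper's handling of uncertainties and missing values, and finds only the first non-trivial diffusion component by power iteration. The function names, the toy curve and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def first_diffusion_component(points, sigma, n_iter=500, seed=0):
    """Return the first non-trivial right eigenvector of the diffusion
    transition matrix built from a Gaussian kernel (illustrative sketch)."""
    n = len(points)
    # Gaussian kernel on squared Euclidean distances.
    K = [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / (2 * sigma ** 2))
          for q in points] for p in points]
    deg = [sum(row) for row in K]
    # Row-normalise to obtain a Markov transition matrix T = D^-1 K.
    T = [[K[i][j] / deg[i] for j in range(n)] for i in range(n)]
    total = sum(deg)
    pi = [d / total for d in deg]  # stationary distribution of T
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(n_iter):
        v = [sum(T[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Remove the trivial constant eigenvector (eigenvalue 1).
        shift = sum(p * x for p, x in zip(pi, v))
        v = [x - shift for x in v]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Toy data: cells sampled along a curved one-dimensional "trajectory",
# presented in shuffled order so the ordering is not given to the method.
s_vals = [i / 29 for i in range(30)]
random.Random(1).shuffle(s_vals)
toy_points = [(s, s * s) for s in s_vals]

psi1 = first_diffusion_component(toy_points, sigma=0.2)
corr = pearson(psi1, s_vals)
print("correlation with true position along the curve:", round(abs(corr), 3))
```

Sorting the cells by `psi1` gives a pseudo-temporal ordering along the toy curve; the sign of the component is arbitrary, so the ordering may come out reversed.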
Barbosa, J.S. ; Sanchez-Gonzalez, R. ; di Giaimo, R. ; Baumgart, E.V. ; Theis, F.J. ; Götz, M. ; Ninkovic, J.
Science 348, 789-793 (2015)
Adult neural stem cells are the source for restoring injured brain tissue. We used repetitive imaging to follow single stem cells in the intact and injured adult zebrafish telencephalon in vivo and found that neurons are generated by both direct conversions of stem cells into postmitotic neurons and via intermediate progenitors amplifying the neuronal output. We observed an imbalance of direct conversion consuming the stem cells and asymmetric and symmetric self-renewing divisions, leading to depletion of stem cells over time. After brain injury, neuronal progenitors are recruited to the injury site. These progenitors are generated by symmetric divisions that deplete the pool of stem cells, a mode of neurogenesis absent in the intact telencephalon. Our analysis revealed changes in the behavior of stem cells underlying generation of additional neurons during regeneration.
Scientific Article
Kondofersky, I. ; Fuchs, C. ; Theis, F.J.
IET Syst. Biol. 9, 193-203 (2015)
In computational systems biology, the general aim is to derive regulatory models from multivariate readouts, thereby generating predictions for novel experiments. In the past, many such models have been formulated for different biological applications. The authors consider the scenario where a given model fails to predict a set of observations with acceptable accuracy and ask the question whether this is because of the model lacking important external regulations. Real-world examples for such entities range from microRNAs to metabolic fluxes. To improve the prediction, they propose an algorithm to systematically extend the network by an additional latent dynamic variable which has an exogenous effect on the considered network. This variable's time course and influence on the other species is estimated in a two-step procedure involving spline approximation, maximum-likelihood estimation and model selection. Simulation studies show that such a hidden influence can successfully be inferred. The method is also applied to a signalling pathway model where they analyse real data and obtain promising results. Furthermore, the technique can be employed to detect incomplete network structures.
Scientific Article
Müller, S. ; Raulefs, S. ; Bruns, P. ; Afonso-Grunz, F. ; Plötner, A. ; Thermann, R. ; Jager, C. ; Schlitter, A.M. ; Kong, B. ; Regel, I. ; Roth, W.K. ; Rotter, B. ; Hoffmeier, K. ; Kahl, G.F. ; Koch, I. ; Theis, F.J. ; Kleeff, J. ; Winter, P. ; Michalski, C.W.
Mol. Cancer 14:94 (2015)
BACKGROUND: Previous studies identified microRNAs (miRNAs) and messenger RNAs with significantly different expression between normal pancreas and pancreatic cancer (PDAC) tissues. Due to technological limitations of microarrays and real-time PCR systems these studies focused on a fixed set of targets. Expression of other RNA classes such as long intergenic non-coding RNAs or sno-derived RNAs has rarely been examined in pancreatic cancer. Here, we analysed the coding and non-coding transcriptome of six PDAC and five control tissues using next-generation sequencing. RESULTS: Besides the confirmation of several deregulated mRNAs and miRNAs, miRNAs without previous implication in PDAC were detected: miR-802, miR-2114 or miR-561. SnoRNA-derived RNAs (e.g. sno-HBII-296B) and piR-017061, a piwi-interacting RNA, were found to be differentially expressed between PDAC and control tissues. In silico target analysis of miR-802 revealed potential binding sites in the 3' UTR of TCF4, encoding a transcription factor that controls Wnt-signalling genes. Overexpression of miR-802 in MiaPaCa pancreatic cancer cells reduced TCF4 protein levels. Using Massive Analysis of cDNA Ends (MACE) we identified differential expression of 43 lincRNAs, long intergenic non-coding RNAs, e.g. LINC00261 and LINC00152 as well as several natural antisense transcripts like HNF1A-AS1 and AFAP1-AS1. Differential expression was confirmed by qPCR on the mRNA/miRNA/lincRNA level and by immunohistochemistry on the protein level. CONCLUSIONS: Here, we report a novel lncRNA, sncRNA and mRNA signature of PDAC. In silico prediction of ncRNA targets allowed for assigning potential functions to differentially regulated RNAs.
Scientific Article
Helmbrecht, M.S. ; Soellner, H. ; Castiblanco-Urbina, M.A. ; Winzeck, S. ; Sundermeier, J. ; Theis, F.J. ; Fouad, K. ; Huber, A.B.
PLoS ONE 10:e0123643 (2015)
The correct wiring of neuronal circuits is of crucial importance for precise neuromuscular functionality. Therefore, guidance cues provide tight spatiotemporal control of axon growth and guidance. Mice lacking the guidance cue Semaphorin 3F (Sema3F) display very specific axon wiring deficits of motor neurons in the medial aspect of the lateral motor column (LMCm). While these deficits have been investigated extensively during embryonic development, it remained unclear how Sema3F mutant mice cope with these errors postnatally. We therefore investigated whether these animals provide a suitable model for the exploration of adaptive plasticity in a system of miswired neuronal circuitry. We show that the embryonically developed wiring deficits in Sema3F mutants persist until adulthood. As a consequence, these mutants display impairments in motor coordination that improve during normal postnatal development, but never reach wildtype levels. These improvements in motor coordination were boosted to wildtype levels by housing the animals in an enriched environment starting at birth. In contrast, a delayed start of enriched environment housing, at 4 weeks after birth, did not similarly affect motor performance of Sema3F mutants. These results, which are corroborated by neuroanatomical analyses, suggest a critical period for adaptive plasticity in neuromuscular circuitry. Interestingly, the formation of perineuronal nets, which are known to close the critical period for plastic changes in other systems, was not altered between the different housing groups. However, we found significant changes in the number of excitatory synapses on limb innervating motor neurons. 
Thus, we propose that during the early postnatal phase, when perineuronal nets have not yet been formed around spinal motor neurons, housing in enriched environment conditions induces adaptive plasticity in the motor system by the formation of additional synaptic contacts, in order to compensate for coordination deficits.
Scientific Article
Theis, F.J. ; Ugelvig, L.V. ; Marr, C. ; Cremer, S.
Philos. Trans. R. Soc. B - Biol. Sci. 370, DOI: 10.1098/rstb.2014.0108 (2015)
To prevent epidemics, insect societies have evolved collective disease defences that are highly effective at curing exposed individuals and limiting disease transmission to healthy group members. Grooming is an important sanitary behaviour, either performed towards oneself (self-grooming) or towards others (allogrooming), to remove infectious agents from the body surface of exposed individuals, but at the risk of disease contraction by the groomer. We use garden ants (Lasius neglectus) and the fungal pathogen Metarhizium as a model system to study how pathogen presence affects self-grooming and allogrooming between exposed and healthy individuals. We develop an epidemiological SIS model to explore how experimentally observed grooming patterns affect disease spread within the colony, thereby providing a direct link between the expression and direction of sanitary behaviours, and their effects on colony-level epidemiology. We find that fungus-exposed ants increase self-grooming, while simultaneously decreasing allogrooming. This behavioural modulation seems universally adaptive and is predicted to contain disease spread in a great variety of host-pathogen systems. In contrast, allogrooming directed towards pathogen-exposed individuals might both increase and decrease disease risk. Our model reveals that the effect of allogrooming depends on the balance between pathogen infectiousness and efficiency of social host defences, which are likely to vary across host-pathogen systems.
Scientific Article
Wahl, S. ; Vogt, S. ; Stückler, F. ; Krumsiek, J. ; Bartel, J. ; Kacprowski, T. ; Schramm, K. ; Carstensen, M. ; Rathmann, W. ; Roden, M. ; Jourdan, C. ; Kangas, A.J. ; Soininen, P. ; Ala-Korpela, M. ; Nöthlings, U. ; Boeing, H. ; Theis, F.J. ; Meisinger, C. ; Waldenberger, M. ; Suhre, K. ; Homuth, G. ; Gieger, C. ; Kastenmüller, G. ; Illig, T. ; Linseisen, J. ; Peters, A. ; Prokisch, H. ; Herder, C. ; Thorand, B. ; Grallert, H.
BMC Med. 13:48 (2015)
BACKGROUND: Excess body weight is a major risk factor for cardiometabolic diseases. The complex molecular mechanisms of body weight change-induced metabolic perturbations are not fully understood. Specifically, in-depth molecular characterization of long-term body weight change in the general population is lacking. Here, we pursued a multi-omic approach to comprehensively study metabolic consequences of body weight change during a seven-year follow-up in a large prospective study. METHODS: We used data from the population-based Cooperative Health Research in the Region of Augsburg (KORA) S4/F4 cohort. At follow-up (F4), two-platform serum metabolomics and whole blood gene expression measurements were obtained for 1,631 and 689 participants, respectively. Using weighted correlation network analysis, omics data were clustered into modules of closely connected molecules, followed by the formation of a partial correlation network from the modules. Association of the omics modules with previous annual percentage weight change was then determined using linear models. In addition, we performed pathway enrichment analyses, stability analyses, and assessed the relation of the omics modules with clinical traits. RESULTS: Four metabolite and two gene expression modules were significantly and stably associated with body weight change (P-values ranging from 1.9 × 10⁻⁴ to 1.2 × 10⁻²⁴). The four metabolite modules covered major branches of metabolism, with VLDL, LDL and large HDL subclasses, triglycerides, branched-chain amino acids and markers of energy metabolism among the main representative molecules. One gene expression module suggests a role of weight change in red blood cell development. The other gene expression module largely overlaps with the lipid-leukocyte (LL) module previously reported to interact with serum metabolites, for which we identify additional co-expressed genes. The omics modules were interrelated and showed cross-sectional associations with clinical traits. Moreover, weight gain and weight loss showed largely opposing associations with the omics modules. CONCLUSIONS: Long-term weight change in the general population globally associates with serum metabolite concentrations. An integrated metabolomics and transcriptomics approach improved the understanding of molecular mechanisms underlying the association of weight gain with changes in lipid and amino acid metabolism, insulin sensitivity, mitochondrial function as well as blood cell development and function.
Scientific Article
Hug, S. ; Schwarzfischer, M. ; Hasenauer, J. ; Marr, C. ; Theis, F.J.
Stat. Comp. 26, 663-677 (2015)
Bayesian model selection using Bayes factors requires the computation of marginal likelihoods. Nowadays, the marginal likelihoods are often computed using thermodynamic integration for power posteriors, which relies on numerical integration methods. The commonly used integration methods, however, neither control the integration accuracy nor exploit the available function evaluations efficiently. In this manuscript we introduce an adaptive method for calculating marginal likelihoods which relies on Simpson’s rule. The proposed method is evaluated on an analytically tractable academic example as well as two high-dimensional models possessing up to 800 parameters. The high-dimensional models describe the protein degradation in a large population of fibroblast cells. Our analysis reveals that the proposed adaptive method shows improved performance over existing approaches for simple problems and furthermore allows for the efficient study of high-dimensional problems.
Scientific Article
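As a rough illustration of the thermodynamic integration described in the abstract above, the sketch below computes a log marginal likelihood as the integral over the temperature t in [0, 1] of the expected log-likelihood under the power posterior. It is not the authors' algorithm: it uses the textbook recursive adaptive Simpson quadrature, and a conjugate Gaussian toy model (an assumption made here) in which the power-posterior expectation is available in closed form, so no sampling is needed. The data values and all function names are hypothetical.

```python
import math


def expected_log_lik(t, y, sigma, s0):
    """E[log L(theta)] under the power posterior p_t ~ L^t * prior for
    y_i ~ N(theta, sigma^2) with prior theta ~ N(0, s0^2)."""
    n = len(y)
    prec = 1.0 / s0 ** 2 + t * n / sigma ** 2   # power-posterior precision
    var = 1.0 / prec
    mean = (t * sum(y) / sigma ** 2) * var
    sq = sum((yi - mean) ** 2 + var for yi in y)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sq / (2 * sigma ** 2)


def adaptive_simpson(f, a, b, tol=1e-10, max_depth=50):
    """Classic recursive adaptive Simpson quadrature with error control."""
    def simpson(fa, fm, fb, lo, hi):
        return (hi - lo) / 6.0 * (fa + 4.0 * fm + fb)

    def recurse(lo, hi, fa, fm, fb, whole, tol, depth):
        mid = 0.5 * (lo + hi)
        flm, frm = f(0.5 * (lo + mid)), f(0.5 * (mid + hi))
        left = simpson(fa, flm, fm, lo, mid)
        right = simpson(fm, frm, fb, mid, hi)
        err = left + right - whole
        if depth <= 0 or abs(err) <= 15.0 * tol:
            return left + right + err / 15.0
        # Refine each half separately, reusing all function evaluations.
        return (recurse(lo, mid, fa, flm, fm, left, tol / 2.0, depth - 1)
                + recurse(mid, hi, fm, frm, fb, right, tol / 2.0, depth - 1))

    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    return recurse(a, b, fa, fm, fb, simpson(fa, fm, fb, a, b), tol, max_depth)


def exact_log_marginal(y, sigma, s0):
    """Closed-form log marginal likelihood of the same toy model."""
    n = len(y)
    s, ss = sum(y), sum(yi * yi for yi in y)
    tot = sigma ** 2 + n * s0 ** 2
    log_det = (n - 1) * math.log(sigma ** 2) + math.log(tot)
    quad = (ss - s0 ** 2 * s ** 2 / tot) / sigma ** 2
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


# Hypothetical toy data and hyperparameters.
y_obs = [1.2, 0.7, 1.9, 1.4]
sigma_obs, prior_sd = 1.0, 2.0

ti_estimate = adaptive_simpson(
    lambda t: expected_log_lik(t, y_obs, sigma_obs, prior_sd), 0.0, 1.0)
exact = exact_log_marginal(y_obs, sigma_obs, prior_sd)
print("thermodynamic integration:", ti_estimate)
print("closed form:              ", exact)
```

In realistic models the expectation at each temperature would have to be estimated by sampling from the power posterior, which is where controlling the number and placement of temperature points matters most.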
Calzolari, F. ; Michel, J. ; Baumgart, E.V. ; Theis, F.J. ; Götz, M. ; Ninkovic, J.
Nat. Neurosci. 18, 490-492 (2015)
We analyzed the progeny of individual neural stem cells (NSCs) of the mouse adult subependymal zone (SEZ) in vivo and found a markedly fast lineage amplification, as well as limited NSC self-renewal and exhaustion in a few weeks. We further unraveled the mechanisms of neuronal subtype generation, finding that a higher proportion of NSCs were dedicated to generate deep granule cells in the olfactory bulb and that larger clones were produced by these NSCs.
Scientific Article
Moignard, V. ; Woodhouse, S. ; Haghverdi, L. ; Lilly, A.J. ; Tanaka, Y. ; Wilkinson, A.C. ; Buettner, F. ; Macaulay, I.C. ; Jawaid, W. ; Diamanti, E. ; Nishikawa, S.I. ; Piterman, N. ; Kouskoff, V. ; Theis, F.J. ; Fisher, J. ; Gottgens, B.
Nat. Biotechnol. 33, 269-276 (2015)
Reconstruction of the molecular pathways controlling organ development has been hampered by a lack of methods to resolve embryonic progenitor cells. Here we describe a strategy to address this problem that combines gene expression profiling of large numbers of single cells with data analysis based on diffusion maps for dimensionality reduction and network synthesis from state transition graphs. Applying the approach to hematopoietic development in the mouse embryo, we map the progression of mesoderm toward blood using single-cell gene expression analysis of 3,934 cells with blood-forming potential captured at four time points between E7.0 and E8.5. Transitions between individual cellular states are then used as input to develop a single-cell network synthesis toolkit to generate a computationally executable transcriptional regulatory network model of blood development. Several model predictions concerning the roles of Sox and Hox factors are validated experimentally. Our results demonstrate that single-cell analysis of a developing organ coupled with computational approaches can reveal the transcriptional programs that underpin organogenesis.
Scientific Article
Illner, K. ; Miettinen, J. ; Fuchs, C. ; Taskinen, S. ; Nordhausen, K. ; Oja, H. ; Theis, F.J.
Signal Process. 113, 95-103 (2015)
Signals, recorded over time, are often observed as mixtures of multiple source signals. To extract relevant information from such measurements one needs to determine the mixing coefficients. In case of weakly stationary time series with uncorrelated source signals, this separation can be achieved by jointly diagonalizing sample autocovariances at different lags, and several algorithms address this task. Often the mixing estimates contain close-to-zero entries and one wants to decide whether the corresponding source signals have a relevant impact on the observations or not. To address this question of model selection we consider the recently published second-order blind identification procedures SOBIdef and SOBIsym which provide limiting distributions of the mixing estimates. For the first time, such distributions enable informed decisions about the presence of second-order stationary source signals in the data. We consider a family of linear hypothesis tests and information criteria to perform model selection as second step after parameter estimation. In simulations we consider different time series models. We validate the model selection performance and demonstrate a good recovery of the true zero pattern of the mixing matrix.
Scientific Article
Kong, B. ; Bruns, P. ; Raulefs, S. ; Rieder, S. ; Paul, L. ; da Costa, O.P. ; Buch, T. ; Theis, F.J. ; Michalski, C.W. ; Kleeff, J.
Int. J. Surg. 14, 67-74 (2015)
INTRODUCTION: Surgical site infections (SSI) represent a significant cause of morbidity in abdominal surgery. The objective of this study was to determine the gene expression signature in subcutaneous tissues in relation to SSI. METHODS: To determine differences in gene expression, microarray analyses were performed from bulk tissue mRNA of subcutaneous tissues prospectively collected in 92 patients during open abdominal surgery. 10 patients (11%) developed incisional (superficial and deep) SSI. RESULTS: Preoperative risk factors in patients with SSI were not significantly different from those in patients without wound infections. 1025 genes were differentially expressed between the groups, of which the AZGP1 and ALDH1A3 genes were the highest down- and upregulated ones. Hierarchical clustering demonstrated strong similarity within the respective groups (SSI vs. no-SSI) indicating inter-group distinctness. In a functional classification, genes controlling cell metabolism were mostly down-regulated in subcutaneous tissues of patients that subsequently developed SSI. CONCLUSION: Altered expression of metabolism genes in subcutaneous tissues might constitute a risk factor for postoperative abdominal SSI.
Scientific Article
Buettner, F. ; Natarajan, K.N. ; Casale, F.P. ; Proserpio, V. ; Scialdone, A. ; Theis, F.J. ; Teichmann, S.A. ; Marioni, J.C. ; Stegle, O.
Nat. Biotechnol. 33, 155-160 (2015)
Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new subpopulations of cells can be found. However, the effects of potential confounding factors, such as the cell cycle, on the heterogeneity of gene expression and therefore on the ability to robustly identify subpopulations remain unclear. We present and validate a computational approach that uses latent variable models to account for such hidden factors. We show that our single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells. Our approach can be used not only to identify cellular subpopulations but also to tease apart different sources of gene expression heterogeneity in single-cell transcriptomes.
Scientific Article
Schmidt, J. ; Panzilius, E. ; Bartsch, H.S. ; Irmler, M. ; Beckers, J. ; Kari, V. ; Linnemann, J. ; Dragoi, D. ; Hirschi, B. ; Kloos, U. ; Sass, S. ; Theis, F.J. ; Kahlert, S. ; Johnsen, S.A. ; Sotlar, K. ; Scheel, C.
Cell Rep. 10, 131-139 (2015)
Master regulators of the epithelial-mesenchymal transition such as Twist1 and Snail1 have been implicated in invasiveness and the generation of cancer stem cells, but their persistent activity inhibits stem-cell-like properties and the outgrowth of disseminated cancer cells into macroscopic metastases. Here, we show that Twist1 activation primes a subset of mammary epithelial cells for stem-cell-like properties, which only emerge and stably persist following Twist1 deactivation. Consequently, when cells undergo a mesenchymal-epithelial transition (MET), they do not return to their original epithelial cell state, evidenced by acquisition of invasive growth behavior and a distinct gene expression profile. These data provide an explanation for how transient Twist1 activation may promote all steps of the metastatic cascade; i.e., invasion, dissemination, and metastatic outgrowth at distant sites.
Scientific Article
Do, K.T. ; Kastenmüller, G. ; Mook-Kanamori, D.O. ; Yousri, N.A. ; Theis, F.J. ; Suhre, K. ; Krumsiek, J.
J. Proteome Res. 14, 1183-1194 (2015)
Most studies investigating human metabolomics measurements are limited to a single biofluid, most often blood or urine. An organism's biochemical pool, however, comprises complex transboundary relationships, which can only be understood by investigating metabolic interactions and physiological processes spanning multiple parts of the human body. Therefore, we here propose a data-driven network-based approach to generate an integrated picture of metabolomics associations over multiple fluids. We performed an analysis of 2251 metabolites measured in plasma, urine, and saliva, from 374 participants of the Qatar Metabolomics Study on Diabetes (QMDiab). Gaussian graphical models (GGMs) were used to estimate metabolite-metabolite interactions on different subsets of the data set. First, we compared similarities and differences of the metabolome and the association networks between the three fluids. Second, we investigated the cross-talk between the fluids by analyzing correlations occurring between them. Third, we propose a framework for the analysis of medically relevant phenotypes by integrating type 2 diabetes, sex, age, and body mass index into our networks. In conclusion, we present a generic, data-driven network-based approach for structuring and visualizing metabolite correlations within and between multiple body fluids, enabling unbiased interpretation of metabolomics multifluid data.
Scientific Article
Sass, S. ; Buettner, F. ; Müller, N.S. ; Theis, F.J.
Bioinformatics 31, 128-130 (2015)
SUMMARY: Decreasing costs of modern high-throughput experiments allow for the simultaneous analysis of altered gene activity on various molecular levels. However, these multi-omics approaches lead to a large amount of data which is hard to interpret for a non-bioinformatician. Here, we present the remotely accessible multilevel ontology analysis (RAMONA). It offers an easy-to-use interface for the simultaneous gene set analysis of combined omics datasets and is an extension of the previously introduced MONA approach. RAMONA is based on a Bayesian enrichment method for the inference of overrepresented biological processes among given gene sets. Overrepresentation is quantified by interpretable term probabilities. It is able to handle data from various molecular levels, while in parallel coping with redundancies arising from gene set overlaps and related multiple testing problems. The comprehensive output of RAMONA is easy to interpret and thus allows for functional insight into the affected biological processes. With RAMONA, we provide an efficient implementation of the Bayesian inference problem such that ontologies consisting of thousands of terms can be processed in the order of seconds. Availability and Implementation: RAMONA is implemented as ASP.NET web application and publicly available at http://icb.helmholtz-muenchen.de/ramona. CONTACT: steffen.sass@helmholtz-muenchen.de.
Scientific Article
2014
Kazeroonian, A. ; Theis, F.J. ; Hasenauer, J.
IFAC PapersOnline 19, 1729-1735 (2014)
Biological processes exhibiting stochastic fluctuations are mainly modeled using the Chemical Master Equation (CME). As a direct simulation of the CME is often computationally intractable, we recently introduced the Method of Conditional Moments (MCM). The MCM is a hybrid approach to approximate the statistics of the CME solution. In this work, we provide a more comprehensive formulation of the MCM by using non-central conditional moments instead of central conditional moments. The modified formulation allows for additional insight into the model structure and for extensions to higher-order reactions and non-polynomial propensity functions. The properties of the non-central MCM are analyzed using a model for the regulation of pili formation on the surface of bacteria, which possesses rational propensity functions.
Scientific Article
Reversat, A. ; Buggenthin, F. ; Merrin, J. ; Leithner, A. ; de Vries, I. ; Theis, F.J. ; Marr, C. ; Sixt, M.
Mol. Biol. Cell 25:P537 (2014)
Meeting abstract
Quaranta, M. ; Knapp, B. ; Garzoz, N. ; Mattii, M. ; Pullabhatla, V. ; Pennino, D. ; Andres, C. ; Traidl-Hoffmann, C. ; Cavani, A. ; Theis, F.J. ; Ring, J. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
Br. J. Dermatol. 171, E135 (2014)
Meeting abstract
Fröhlich, F. ; Theis, F.J. ; Hasenauer, J.
Lect. Notes Comput. Sc. 8859, 61-72 (2014)
Dynamical systems are widely used to describe the behaviour of biological systems. When estimating parameters of dynamical systems, noise and limited availability of measurements can lead to uncertainties. These uncertainties have to be studied to understand the limitations and the predictive power of a model. Several methods for uncertainty analysis are available. In this paper we analysed and compared bootstrapping, profile likelihood, Fisher information matrix, and multi-start based approaches for uncertainty analysis. The analysis was carried out on two models which contain structurally non-identifiable parameters. We showed that bootstrapping, multi-start optimisation, and Fisher information matrix based approaches yield misleading results for parameters which are structurally non-identifiable. We provide a simple and intuitive explanation for this, using geometric arguments.
Scientific Article
Fröhlich, F. ; Hross, S. ; Theis, F.J. ; Hasenauer, J.
Lect. Notes Comput. Sc. 8859, 73-85 (2014)
Dynamical models are widely used in systems biology to describe biological processes ranging from single cell transcription of genes to the tissue scale formation of gradients for cell guidance. One of the key issues for this class of models is the estimation of kinetic parameters from given measurement data, the so-called parameter estimation. Measurement noise and the limited amount of data give rise to uncertainty in estimates which can be captured in a probability density over the parameter space. Unfortunately, studying this probability density, e.g. using Markov chain Monte Carlo, is often computationally demanding as it requires the repeated simulation of the underlying model. In the case of highly complex models, such as PDE models, this can render the study intractable. In this paper, we will present novel methods for analysis of such probability densities using networks of radial basis functions. We employed lattice generation algorithms, adaptive interacting particle sampling schemes as well as classical sampling schemes for the generation of approximation nodes coupled to the respective weighting scheme and compared their efficiency on different application examples. Our analysis showed that the novel method can yield an expected L2 approximation error in marginals that is several orders of magnitude lower compared to classical approximations. This allows for a drastic reduction of the number of model evaluations. This facilitates the analysis of uncertainty for problems with high computational complexity. Finally, we successfully applied our method to a complex partial differential equation model for guided cell migration of dendritic cells.
Scientific Article
Winkler, C. ; Krumsiek, J. ; Buettner, F. ; Angermüller, C. ; Giannopoulou, E.Z. ; Theis, F.J. ; Ziegler, A.-G. ; Bonifacio, E.
Diabetologia 58, 206 (2014)
Waldera-Lupa, D.M. ; Kalfalah, F. ; Florea, A.M. ; Sass, S. ; Kruse, F.E. ; Rieder, V. ; Tigges, J. ; Fritsche, E. ; Krutmann, J. ; Busch, H. ; Boerries, M. ; Meyer, H.E. ; Boege, F. ; Theis, F.J. ; Reifenberger, G. ; Stühler, K.
Aging 6, 856-878 (2014)
We analyzed an ex vivo model of in situ aged human dermal fibroblasts, obtained from 15 adult healthy donors from three different age groups using an unbiased quantitative proteome-wide approach applying label-free mass spectrometry. Thereby, we identified 2409 proteins, including 43 proteins with an age-associated abundance change. Most of the differentially abundant proteins have not been described in the context of fibroblasts' aging before, but the deduced biological processes confirmed known hallmarks of aging and led to a consistent picture of eight biological categories involved in fibroblast aging, namely proteostasis, cell cycle and proliferation, development and differentiation, cell death, cell organization and cytoskeleton, response to stress, cell communication and signal transduction, as well as RNA metabolism and translation. The exhaustive analysis of protein and mRNA data revealed that 77 % of the age-associated proteins were not linked to expression changes of the corresponding transcripts. This is in line with an associated miRNA study and led us to the conclusion that most of the age-associated alterations detected at the proteome level are likely caused post-transcriptionally rather than by differential gene expression. In summary, our findings led to the characterization of novel proteins potentially associated with fibroblast aging and revealed that primary cultures of in situ aged fibroblasts are characterized by moderate age-related proteomic changes comprising the multifactorial process of aging.
Scientific Article
Khakhutskyy, V. ; Schwarzfischer, M. ; Hubig, N. ; Plant, C. ; Marr, C. ; Rieger, M.A. ; Schröder, T. ; Theis, F.J.
Lect. Notes Comput. Sc. 8649, 15-29 (2014)
Trees representing hierarchical knowledge are prevalent in biology and medicine. Some examples are phylogenetic trees, the hierarchical structure of biological tissues and cell lines. The increasing throughput of techniques generating such trees poses new challenges to the analysis of tree ensembles. Some typical tasks include the determination of common patterns of lineage decisions in cellular differentiation trees. Partitioning the dataset is crucial for further analysis of the cellular genealogies. In this work, we develop a method to cluster labeled binary tree structures. Furthermore, for every cluster our method selects a centroid tree that captures the characteristic mitosis patterns of the group. We evaluate this technique on synthetic data and apply it to experimental trees that embody the lineages of differentiating cells under specific conditions over time. The results of the cell lineage trees are thoroughly interpreted with expert domain knowledge.
Scientific Article
Illner, K. ; Fuchs, C. ; Theis, F.J.
In: Gilli, M.* ; Gonzalez-Rodriguez, G.* ; Nieto-Reyes, A.* [Eds.]: Proceedings (21st International Conference on Computational Statistics (COMPSTAT2014), 19-22 August 2014, Geneva, Switzerland). 2014. 625-632
Illner, K. ; Fuchs, C. ; Theis, F.J.
Sep. Sci. Appl. 34, 24-26 (2014)
Scientific Article
Illner, K. ; Fuchs, C. ; Theis, F.J.
GIT Fachz. Lab. 9, 28-30 (2014)
Scientific Article
Illner, K. ; Fuchs, C. ; Theis, F.J.
J. Comput. Biol. 21, 855-865 (2014)
In biology, more and more information about the interactions in regulatory systems becomes accessible, and this often leads to prior knowledge for recent data interpretations. In this work we focus on multivariate signaling data, where the structure of the data is induced by a known regulatory network. To extract signals of interest we assume a blind source separation (BSS) model, and we capture the structure of the source signals in terms of a Bayesian network. To keep the parameter space small, we consider stationary signals, and we introduce the new algorithm emGrade, where model parameters and source signals are estimated using expectation maximization. For network data, we find an improved estimation performance compared to other BSS algorithms, and the flexible Bayesian modeling enables us to deal with repeated and missing observation values. The main advantage of our method is the statistically interpretable likelihood, and we can use model selection criteria to determine the (in general unknown) number of source signals or decide between different given networks. In simulations we demonstrate the recovery of the source signals dependent on the graph structure and the dimensionality of the data.
Scientific Article
Röck, K. ; Tigges, J. ; Sass, S. ; Schütze, A. ; Florea, A.M. ; Fender, A.C. ; Theis, F.J. ; Krutmann, J. ; Boege, F. ; Fritsche, E. ; Reifenberger, G. ; Fischer, J.W.
J. Invest. Dermatol. 135, 369-377 (2014)
Even though aging and cellular senescence appear to be linked, the biological mechanisms interconnecting these two processes remain to be unravelled. Therefore, miRNA profiles were analyzed ex vivo by gene array in fibroblasts isolated from young and old human donors. Expression of several miRNAs was positively correlated with donor age. Among those, miR-23a-3p was shown to target hyaluronan synthase 2 (HAS2). Hyaluronan (HA) is a polysaccharide of the extracellular matrix that critically regulates the phenotype of fibroblasts. Indeed, both aged and senescent fibroblasts showed increased miR-23a-3p expression and secreted significantly lower amounts of HA compared to young and non-senescent fibroblasts. Ectopic overexpression of miR-23a-3p in non-senescent fibroblasts led to decreased HAS2-mediated HA synthesis, upregulation of senescence-associated markers and decreased proliferation. In addition, siRNA-mediated downregulation of HAS2 and pharmacologic inhibition of HA synthesis by 4-methylumbelliferone mimicked the effects of miR-23a-3p. In vivo, miR-23a-3p was upregulated and HAS2 was downregulated in the skin of old mice versus young mice. Inhibition of HA synthesis by 4-methylumbelliferone in mice reduced dermal hydration and viscoelasticity, thereby mimicking an aged skin phenotype. Taken together, these findings appear to link miR-23a-3p and the HA microenvironment as effector mechanisms in both dermal aging and senescence.
Scientific Article
Bartel, S. ; Schulz, N. ; Alessandrini, F. ; Theis, F.J. ; Eickelberg, O. ; Kicic, A. ; Freishtat, R.J. ; Krauss-Etschmann, S.
Allergy 69, 40-41 (2014)
Meeting abstract
Winkler, C. ; Krumsiek, J. ; Buettner, F. ; Angermüller, C. ; Giannopoulou, E.Z. ; Theis, F.J. ; Ziegler, A.-G. ; Bonifacio, E.
Diabetologia 57, 2521-2529 (2014)
AIMS/HYPOTHESIS: More than 40 regions of the human genome confer susceptibility for type 1 diabetes and could be used to establish population screening strategies. The aim of our study was to identify weighted sets of SNP combinations for type 1 diabetes prediction. METHODS: We applied multivariable logistic regression and Bayesian feature selection to the Type 1 Diabetes Genetics Consortium (T1DGC) dataset with genotyping of HLA plus 40 SNPs within other type 1 diabetes-associated gene regions in 4,574 cases and 1,207 controls. We tested the weighted models in an independent validation set (765 cases, 423 controls), and assessed their performance in 1,772 prospectively followed children. RESULTS: The inclusion of 40 non-HLA gene SNPs significantly improved the prediction of type 1 diabetes over that provided by HLA alone (p = 3.1 × 10⁻²⁵), with a receiver operating characteristic AUC of 0.87 in the T1DGC set, and 0.84 in the validation set. Feature selection identified HLA plus nine SNPs from the PTPN22, INS, IL2RA, ERBB3, ORMDL3, BACH2, IL27, GLIS3 and RNLS genes that could achieve similar prediction accuracy as the total SNP set. Application of this ten-SNP model to prospectively followed children was able to improve risk stratification over that achieved by HLA genotype alone. CONCLUSIONS: We provided a weighted risk model with selected SNPs that could be considered for recruitment of infants into studies of early type 1 diabetes natural history or appropriately safe prevention.
Scientific Article
Matthes, M. ; Preusse, M. ; Zhang, J. ; Schechter, J. ; Mayer, D. ; Lentes, B. ; Theis, F.J. ; Prakash, N. ; Wurst, W. ; Trümbach, D.
Database 2014:bau083 (2014)
The study of developmental processes in the mouse and other vertebrates includes the understanding of patterning along the anterior-posterior, dorsal-ventral and medial-lateral axes. Specifically, neural development is also of great clinical relevance because several human neuropsychiatric disorders such as schizophrenia, autism disorders or drug addiction and also brain malformations are thought to have neurodevelopmental origins, i.e. pathogenesis initiates during childhood and adolescence. Impacts during early neurodevelopment might also predispose to late-onset neurodegenerative disorders, such as Parkinson's disease. The neural tube develops from its precursor tissue, the neural plate, in a patterning process that is determined by compartmentalization into morphogenetic units, the action of local signaling centers and a well-defined and locally restricted expression of genes and their interactions. While public databases provide gene expression data with spatio-temporal resolution, they usually neglect the genetic interactions that govern neural development. Here, we introduce Mouse IDGenes, a reference database for genetic interactions in the developing mouse brain. The database is highly curated and offers detailed information about gene expressions and the genetic interactions at the developing mid-/hindbrain boundary. To showcase the predictive power of interaction data, we infer new Wnt/β-catenin target genes by machine learning and validate one of them experimentally. The database is updated regularly. Moreover, it can easily be extended by the research community. Mouse IDGenes will contribute as an important resource to the research on mouse brain development, not exclusively by offering data retrieval, but also by allowing data input. Database URL: http://mouseidgenes.helmholtz-muenchen.de.
Scientific Article
Thalheimer, F.B. ; Wingert, S. ; de Giacomo, P. ; Haetscher, N. ; Brill, B. ; Theis, F.J. ; Hennighausen, L. ; Schroeder, T. ; Rieger, M.
Exp. Hematol. 42, S57 (2014)
Meeting abstract
Moignard, V. ; Woodhouse, S. ; Haghverdi, L. ; Lilly, J. ; Tanaka, Y. ; Wilkinson, A. ; Buettner, F. ; Nishikawa, S. ; Piterman, N. ; Kouskoff, V. ; Theis, F.J. ; Fisher, J. ; Göttgens, B.
Exp. Hematol. 42, S52 (2014)
Meeting abstract
Vinnikov, I.A. ; Hajdukiewicz, K. ; Reymann, J. ; Beneke, J. ; Czajkowski, R. ; Roth, L.C. ; Novak, M. ; Roller, A. ; Dörner, N. ; Starkuviene, V. ; Theis, F.J. ; Erfle, H. ; Schütz, G. ; Grinevich, V. ; Konopka, W.
J. Neurosci. 34, 10659-10674 (2014)
The role of neuronal noncoding RNAs in energy control of the body is not fully understood. The arcuate nucleus (ARC) of the hypothalamus comprises neurons regulating food intake and body weight. Here we show that Dicer-dependent loss of microRNAs in these neurons of adult (DicerCKO) mice causes chronic overactivation of the signaling pathways involving phosphatidylinositol-3-kinase (PI3K), Akt, and mammalian target of rapamycin (mTOR) and an imbalance in the levels of neuropeptides, resulting in severe hyperphagic obesity. Similarly, the activation of PI3K-Akt-mTOR pathway due to Pten deletion in the adult forebrain leads to comparable weight increase. Conversely, the mTORC1 inhibitor rapamycin normalizes obesity in mice with an inactivated Dicer1 or Pten gene. Importantly, the continuous delivery of oligonucleotides mimicking microRNAs, which are predicted to target PI3K-Akt-mTOR pathway components, to the hypothalamus attenuates adiposity in DicerCKO mice. Furthermore, loss of miR-103 causes strong upregulation of the PI3K-Akt-mTOR pathway in vitro and its application into the ARC of the Dicer-deficient mice both reverses upregulation of Pik3cg, the mRNA encoding the catalytic subunit p110γ of the PI3K complex, and attenuates the hyperphagic obesity. Our data demonstrate in vivo the crucial role of neuronal microRNAs in the control of energy homeostasis.
Scientific Article
Garzorz-Stark, N. ; Knapp, B. ; Quaranta, M. ; Mattii, M. ; Theis, F.J. ; Ring, J. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
Br. J. Dermatol. 170, E19 (2014)
Meeting abstract
Quaranta, M. ; Eyerich, S. ; Knapp, B. ; Nasorri, F. ; Scarponi, C. ; Mattii, M. ; Garzorz-Stark, N. ; Harlfinger, A.T. ; Jaeger, T. ; Grosber, M. ; Pennino, D. ; Mempel, M. ; Schnopp, C. ; Theis, F.J. ; Albanesi, C. ; Cavani, A. ; Schmidt-Weber, C.B. ; Ring, J. ; Eyerich, K.
PLoS ONE 9:e101814 (2014)
Psoriasis is characterized by an apoptosis-resistant and metabolically active epidermis, while a hallmark for allergic contact dermatitis (ACD) is T cell-induced keratinocyte apoptosis. Here, we induced ACD reactions in psoriasis patients sensitized to nickel (n = 14) to investigate underlying mechanisms of psoriasis and ACD simultaneously. All patients developed a clinically and histologically typical dermatitis upon nickel challenge even in close proximity to pre-existing psoriasis plaques. However, the ACD reaction was delayed as compared to non-psoriatic patients, with a maximum intensity after 7 days. Whole genome expression analysis revealed alterations in numerous pathways related to metabolism and proliferation in non-involved skin of psoriasis patients as compared to non-psoriatic individuals, indicating that even in clinically non-involved skin of psoriasis patients molecular events opposing contact dermatitis may occur. Immunohistochemical comparison of ACD reactions as well as in vitro secretion analysis of lesional T cells showed a higher Th17 and neutrophilic migration as well as epidermal proliferation in psoriasis, while ACD reactions were dominated by cytotoxic CD8+ T cells and a Th2 signature. Based on these findings, we hypothesized an ACD reaction directly on top of a pre-existing psoriasis plaque might influence the clinical course of psoriasis. We observed a strong clinical inflammation with a mixed psoriasis and eczema phenotype in histology. Surprisingly, the initial psoriasis plaque was unaltered after self-limitation of the ACD reaction. We conclude that sensitized psoriasis patients develop a typical, but delayed ACD reaction which might be relevant for patch test evaluation in clinical practice. Psoriasis and ACD are driven by distinct and independent immune mechanisms.
Scientific Article
Feigelman, J. ; Theis, F.J. ; Marr, C.
BMC Bioinformatics 15:240 (2014)
BACKGROUND: Biological data often originate from samples containing mixtures of subpopulations, corresponding e.g. to distinct cellular phenotypes. However, identification of distinct subpopulations may be difficult if biological measurements yield distributions that are not easily separable. RESULTS: We present Multiresolution Correlation Analysis (MCA), a method for visually identifying subpopulations based on the local pairwise correlation between covariates, without needing to define an a priori interaction scale. We demonstrate that MCA facilitates the identification of differentially regulated subpopulations in simulated data from a small gene regulatory network, followed by application to previously published single-cell qPCR data from mouse embryonic stem cells. We show that MCA recovers previously identified subpopulations, provides additional insight into the underlying correlation structure, reveals potentially spurious compartmentalizations, and provides insight into novel subpopulations. CONCLUSIONS: MCA is a useful method for the identification of subpopulations in low-dimensional expression data, as emerging from qPCR or FACS measurements. With MCA it is possible to investigate the robustness of covariate correlations with respect to subpopulations, graphically identify outliers, and identify factors contributing to differential regulation between pairs of covariates. MCA thus provides a framework for investigation of expression correlations for genes of interest and biological hypothesis generation.
Scientific Article
Quaranta, M. ; Knapp, B. ; Garzorz-Stark, N. ; Mattii, M. ; Pullabhatla, V. ; Pennino, D. ; Andres, C. ; Traidl-Hoffmann, C. ; Cavani, A. ; Theis, F.J. ; Ring, J. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
Sci. Transl. Med. 6:244ra90 (2014)
Previous attempts to gain insight into the pathogenesis of psoriasis and eczema by comparing their molecular signatures were hampered by the high interindividual variability of those complex diseases. In patients affected by both psoriasis and nonatopic or atopic eczema simultaneously (n = 24), an intraindividual comparison of the molecular signatures of psoriasis and eczema identified genes and signaling pathways regulated in common and exclusive for each disease across all patients. Psoriasis-specific genes were important regulators of glucose and lipid metabolism, epidermal differentiation, as well as immune mediators of T helper 17 (TH17) responses, interleukin-10 (IL-10) family cytokines, and IL-36. Genes in eczema related to epidermal barrier, reduced innate immunity, increased IL-6, and a TH2 signature. Within eczema subtypes, a mutually exclusive regulation of epidermal differentiation genes was observed. Furthermore, only contact eczema was driven by inflammasome activation, apoptosis, and cellular adhesion. On the basis of this comprehensive picture of the pathogenesis of psoriasis and eczema, a disease classifier consisting of NOS2 and CCL27 was created. In an independent cohort of eczema (n = 28) and psoriasis patients (n = 25), respectively, this classifier diagnosed all patients correctly and also identified initially misdiagnosed or clinically undifferentiated patients.
Scientific Article
Garzorz, N.V. ; Quaranta, M. ; Knapp, B. ; Mattii, M. ; Andres, C. ; Theis, F.J. ; Ring, J. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
J. Invest. Dermatol. 134, S1 (2014)
Meeting abstract
Hasenauer, J. ; Hasenauer, C. ; Hucho, T. ; Theis, F.J.
PLoS Comput. Biol. 10:e1003686 (2014)
Functional cell-to-cell variability is ubiquitous in multicellular organisms as well as bacterial populations. Even genetically identical cells of the same cell type can respond differently to identical stimuli. Methods have been developed to analyse heterogeneous populations, e.g., mixture models and stochastic population models. The available methods are, however, either incapable of simultaneously analysing different experimental conditions or are computationally demanding and difficult to apply. Furthermore, they do not account for biological information available in the literature. To overcome disadvantages of existing methods, we combine mixture models and ordinary differential equation (ODE) models. The ODE models provide a mechanistic description of the underlying processes while mixture models provide an easy way to capture variability. In a simulation study, we show that the class of ODE constrained mixture models can unravel the subpopulation structure and determine the sources of cell-to-cell variability. In addition, the method provides reliable estimates for kinetic rates and subpopulation characteristics. We use ODE constrained mixture modelling to study NGF-induced Erk1/2 phosphorylation in primary sensory neurones, a process relevant in inflammatory and neuropathic pain. We propose a mechanistic pathway model for this process and reconstruct static and dynamical subpopulation characteristics across experimental conditions. We validate the model predictions experimentally, which verifies the capabilities of ODE constrained mixture models. These results illustrate that ODE constrained mixture models can reveal novel mechanistic insights and possess a high sensitivity.
Scientific Article
Thalheimer, F.B. ; Wingert, S. ; de Giacomo, P. ; Haetscher, N. ; Rehage, M. ; Brill, B. ; Theis, F.J. ; Hennighausen, L. ; Schroeder, T. ; Rieger, M.A.
Stem Cell Rep. 3, 34-43 (2014)
The balance of self-renewal and differentiation in long-term repopulating hematopoietic stem cells (LT-HSC) must be strictly controlled to maintain blood homeostasis and to prevent leukemogenesis. Hematopoietic cytokines can induce differentiation in LT-HSCs; however, the molecular mechanism orchestrating this delicate balance requires further elucidation. We identified the tumor suppressor GADD45G as an instructor of LT-HSC differentiation under the control of differentiation-promoting cytokine receptor signaling. GADD45G immediately induces and accelerates differentiation in LT-HSCs and overrides the self-renewal program by specifically activating MAP3K4-mediated MAPK p38. Conversely, the absence of GADD45G enhances the self-renewal potential of LT-HSCs. Videomicroscopy-based tracking of single LT-HSCs revealed that, once GADD45G is expressed, the development of LT-HSCs into lineage-committed progeny occurred within 36 hr and uncovered a selective lineage choice with a severe reduction in megakaryocytic-erythroid cells. Here, we report an unrecognized role of GADD45G as a central molecular linker of extrinsic cytokine differentiation and lineage choice control in hematopoiesis.
Scientific Article
Ried, J.S. ; Shin, S.Y. ; Krumsiek, J. ; Illig, T. ; Theis, F.J. ; Spector, T.D. ; Adamski, J. ; Wichmann, H.-E. ; Strauch, K. ; Soranzo, N. ; Suhre, K. ; Gieger, C.
Hum. Mol. Genet. 23, 5847-5857 (2014)
Availability of standardized metabolite panels and genome-wide single nucleotide polymorphism (SNP) data endorse the comprehensive analysis of gene-metabolite association. Currently, many studies use genome-wide association analysis to investigate the genetic effects on single metabolites (mGWAS) separately. Such studies have identified several loci that are associated not only with one but with multiple metabolites, facilitated by the fact that metabolite panels often include metabolites of the same or related pathways. Strategies that analyse several phenotypes in a combined way were shown to be able to detect additional genetic loci. One of those methods is the phenotype set enrichment analysis (PSEA) that tests sets of metabolites for enrichment at genes. Here we applied PSEA on two different panels of serum metabolites together with genome-wide data. All analyses were performed as a two-step identification-validation approach, using data from the population-based KORA cohort and the TwinsUK study. In addition to confirming genes that were already known from mGWAS, we were able to identify and validate twelve new genes. Knowledge about gene function was supported by the enriched metabolite sets. For loci with unknown gene functions, the results suggest a function that is interrelated with the metabolites, and hint at the underlying pathways.
Scientific Article
Meyer, S.U. ; Stoecker, K. ; Sass, S. ; Theis, F.J. ; Pfaffl, M.W.
Methods Mol. Biol. 1160, 165-188 (2014)
Scientific Article
Shin, S.Y. ; Fauman, E.B. ; Petersen, A.-K. ; Krumsiek, J. ; Santos, R. ; Huang, J. ; Arnold, M. ; Erte, I. ; Forgetta, V. ; Yang, T.P. ; Walter, K. ; Menni, C. ; Chen, L. ; Vasquez, L. ; Valdes, A.M. ; Hyde, C.L. ; Wang, V. ; Ziemek, D. ; Roberts, P. ; Xi, L. ; Grundberg, E. ; Waldenberger, M. ; Richards, J.B. ; Mohney, R.P. ; Milburn, M.V. ; John, S.L. ; Trimmer, J. ; Theis, F.J. ; Overington, J.P. ; Suhre, K. ; Brosnan, M.J. ; Gieger, C. ; Kastenmüller, G. ; Spector, T.D. ; Soranzo, N.
Nat. Genet. 46, 543-550 (2014)
Genome-wide association scans with high-throughput metabolic profiling provide unprecedented insights into how genetic variation influences metabolism and complex disease. Here we report the most comprehensive exploration of genetic loci influencing human metabolism thus far, comprising 7,824 adult individuals from 2 European population studies. We report genome-wide significant associations at 145 metabolic loci and their biochemical connectivity with more than 400 metabolites in human blood. We extensively characterize the resulting in vivo blueprint of metabolism in human blood by integrating it with information on gene expression, heritability and overlap with known loci for complex disorders, inborn errors of metabolism and pharmacological targets. We further developed a database and web-based resources for data mining and results visualization. Our findings provide new insights into the role of inherited variation in blood metabolic diversity and identify potential new opportunities for drug development and for understanding disease.
Scientific Article
Quaranta, M. ; Knapp, B. ; Theis, F.J. ; Ring, J. ; Schmidt-Weber, C.B. ; Eyerich, S. ; Eyerich, K.
Exp. Dermatol. 23, E20 (2014)
Meeting abstract
Buettner, F. ; Moignard, V. ; Göttgens, B. ; Theis, F.J.
Bioinformatics 30, 1867-1875 (2014)
MOTIVATION: High-throughput single-cell qPCR is a promising technique allowing for new insights into complex cellular processes. However, the PCR reaction can only be detected up to a certain detection limit, while failed reactions could be due to low or absent expression and the true expression level is unknown. As this censoring can occur for high proportions of the data, it is one of the main challenges when dealing with single-cell qPCR data. PCA is an important tool for visualising the structure of high-dimensional data as well as for identifying sub-populations of cells. However, to date it is not clear how to perform a PCA of censored data. We present a probabilistic approach which accounts for the censoring and evaluate it for two typical data-sets containing single-cell qPCR data. RESULTS: We use the Gaussian Process Latent Variable Model (GPLVM) framework to account for censoring by introducing an appropriate noise model and allowing a different kernel for each dimension. We evaluate this new approach for two typical qPCR data-sets (of mouse embryonic stem cells and blood stem/progenitor cells respectively) by performing linear and non-linear probabilistic PCA. Taking the censoring into account results in a 2D representation of the data which better reflects its known structure: in both data-sets our new approach results in a better separation of known cell types and is able to reveal subpopulations in one data-set which could not be resolved using standard PCA. AVAILABILITY: The implementation was based on the existing GPLVM toolbox; extensions for noise models and kernels accounting for censoring are available from http://icb.helmholtz-muenchen.de/censgplvm.
Scientific Article
Baumann, S. ; Rockstroh, M. ; Bartel, J. ; Krumsiek, J. ; Otto, W. ; Jungnickel, H. ; Potratz, S. ; Luch, A. ; Wilscher, E. ; Theis, F.J. ; von Bergen, M. ; Tomm, J.M.
J. Integr. OMICS, DOI: 10.5584/jiomics.v4i1.157 (2014)
Although the activation of immune cells is the first and thereby pivotal step in the immunological cascade, the current knowledge about the details of this process is quite limited. Recent studies have shown that aromatic compounds, such as B[a]P, influence the immune system even at low concentrations. We investigated the effect of a subtoxic B[a]P concentration (50 nM) on the proteome and the metabolome of non-activated and activated Jurkat T cells. The GeLC-MS/MS analysis yielded 2624 unambiguously identified proteins. In addition to typical regulatory pathways associated with T cell activation, pathway analysis by Ingenuity IPA revealed several metabolic processes, for instance purine and pyruvate metabolism. The activation process seems to be influenced by B[a]P suggesting an important role of the mTOR pathway in the cellular adaptation. B[a]P exposure of non-activated Jurkat cells induced signaling pathways such as protein ubiquitination and NRF2 mediated oxidative stress response as well as metabolic adaptation involving pyruvate, purine and fatty acid metabolism. Thus, we validated the proteome results by determining the concentrations of 183 metabolites with FIA-MS/MS and IC-MS/MS. Furthermore, we were able to show that Jurkat cells metabolize B[a]P to B[a]P-1,6-dione. The combined evaluation of proteome and metabolome data with an integrated, genome-scale metabolic model provided novel systems biological insights into the complex relation between metabolic and proteomic processes in Jurkat T cells during activation and subtoxic chemical exposure.
Scientific Article
Yang, D. ; Lutter, D. ; Burtscher, I. ; Uetzmann, L. ; Theis, F.J. ; Lickert, H.
Development 141, 514-525 (2014)
Transcription factors (TFs) pattern developing tissues and determine cell fates; however, how spatio-temporal TF gradients are generated is ill defined. Here we show that miR-335 fine-tunes TF gradients in the endoderm and promotes mesendodermal lineage segregation. Initially, we identified miR-335 as a regulated intronic miRNA in differentiating embryonic stem cells (ESCs). miR-335 is encoded in the mesoderm-specific transcript (Mest) and targets the 3'-UTRs of the endoderm-determining TFs Foxa2 and Sox17. Mest and miR-335 are co-expressed and highly accumulate in the mesoderm, but are transiently expressed in endoderm progenitors. Overexpression of miR-335 does not affect initial mesendoderm induction, but blocks Foxa2- and Sox17-mediated endoderm differentiation in ESCs and ESC-derived embryos. Conversely, inhibition of miR-335 activity leads to increased Foxa2 and Sox17 protein accumulation and endoderm formation. Mathematical modeling predicts that transient miR-335 expression in endoderm progenitors shapes a TF gradient in the endoderm, which we confirm by functional studies in vivo. Taken together, our results suggest that miR-335 targets endoderm TFs for spatio-temporal gradient formation in the endoderm and to stabilize lineage decisions during mesendoderm formation.
Scientific Article
Bajikar, S.S. ; Fuchs, C. ; Roller, A. ; Theis, F.J. ; Janes, K.A.
Proc. Natl. Acad. Sci. U.S.A. 111, E626-E635 (2014)
Cell-to-cell variations in gene regulation occur in a number of biological contexts, such as development and cancer. Discovering regulatory heterogeneities in an unbiased manner is difficult owing to the population averaging that is required for most global molecular methods. Here, we show that we can infer single-cell regulatory states by mathematically deconvolving global measurements taken as averages from small groups of cells. This averaging-and-deconvolution approach allows us to quantify single-cell regulatory heterogeneities while avoiding the measurement noise of global single-cell techniques. Our method is particularly relevant to solid tissues, where single-cell dissociation and molecular profiling are especially problematic.
Scientific Article
Bonifacio, E. ; Krumsiek, J. ; Winkler, C. ; Theis, F.J. ; Ziegler, A.-G.
Acta Diabetol. 51, 403-411 (2014)
We recently developed a novel approach capable of identifying gene combinations to obtain maximal disease risk stratification. Type 1 diabetes has a preclinical phase including seroconversion to autoimmunity and subsequent progression to diabetes. Here, we applied our gene combination approach to identify combinations that contribute either to islet autoimmunity or to the progression from islet autoantibodies to diabetes onset. We examined 12 type 1 diabetes susceptibility genes (INS, ERBB3, PTPN2, IFIH1, PTPN22, KIAA0350, CD25, CTLA4, SH2B3, IL2, IL18RAP, IL10) in a cohort of children of parents with type 1 diabetes and prospectively followed from birth. The most predictive combination was subsequently applied to a smaller validation cohort. The combinations of genes only marginally contributed to the risk of developing islet autoimmunity, but could substantially modify risk of progression to diabetes in islet autoantibody-positive children. The greatest discrimination was provided by risk allele scores of five genes (INS, IFIH1, IL18RAP, CD25, and IL2), which could identify 80 % of islet autoantibody-positive children who progressed to diabetes within 6 years of seroconversion and discriminate high risk (63 % within 6 years; 95 % CI 45-81 %) and low risk (11 % within 6 years; 95 % CI 0.1-22 %; p = 4 × 10⁻⁵) antibody-positive children. Risk stratification by these five genes was confirmed in a second cohort of islet autoantibody-positive children. These findings highlight genes that may affect the rate of the beta-cell destruction process once autoimmunity has initiated and may help to identify islet autoantibody-positive subjects with rapid progression to diabetes.
Scientific Article
Hasenauer, J. ; Wolf, V. ; Kazeroonian, A. ; Theis, F.J.
J. Math. Biol. 69, 687-735 (2014)
The time-evolution of continuous-time discrete-state biochemical processes is governed by the Chemical Master Equation (CME), which describes the probability of the molecular counts of each chemical species. As the corresponding number of discrete states is, for most processes, large, a direct numerical simulation of the CME is in general infeasible. In this paper we introduce the method of conditional moments (MCM), a novel approximation method for the solution of the CME. The MCM employs a discrete stochastic description for low-copy number species and a moment-based description for medium/high-copy number species. The moments of the medium/high-copy number species are conditioned on the state of the low abundance species, which allows us to capture complex correlation structures arising, e.g., for multi-attractor and oscillatory systems. We prove that the MCM provides a generalization of previous approximations of the CME based on hybrid modeling and moment-based methods. Furthermore, it improves upon these existing methods, as we illustrate using a model for the dynamics of stochastic single-gene expression. This application example shows that due to the more general structure, the MCM allows for the approximation of multi-modal distributions.
Scientific Article
Albrecht, E. ; Waldenberger, M. ; Krumsiek, J. ; Evans, A.M. ; Jeratsch, U. ; Breier, M. ; Adamski, J. ; Koenig, W. ; Zeilinger, S. ; Fuchs, C. ; Klopp, N. ; Theis, F.J. ; Wichmann, H.-E. ; Suhre, K. ; Illig, T. ; Strauch, K. ; Peters, A. ; Gieger, C. ; Kastenmüller, G. ; Döring, A. ; Meisinger, C.
Metabolomics 10, 141-151 (2014)
Serum urate, the final breakdown product of purine metabolism, is causally involved in the pathogenesis of gout, and implicated in cardiovascular disease and type 2 diabetes. Serum urate levels highly differ between men and women; however the underlying biological processes in its regulation are still not completely understood and are assumed to result from a complex interplay between genetic, environmental and lifestyle factors. In order to describe the metabolic vicinity of serum urate, we analyzed 355 metabolites in 1,764 individuals of the population-based KORA F4 study and constructed a metabolite network around serum urate using Gaussian Graphical Modeling in a hypothesis-free approach. We subsequently investigated the effect of sex and urate lowering medication on all 38 metabolites assigned to the network. Within the resulting network three main clusters could be detected around urate, including the well-known pathway of purine metabolism, as well as several dipeptides, a group of essential amino acids, and a group of steroids. Of the 38 assigned metabolites, 25 showed strong differences between sexes. Association with uricostatic medication intake was not only confined to purine metabolism but seen for seven metabolites within the network. Our findings highlight pathways that are important in the regulation of serum urate and suggest that dipeptides, amino acids, and steroid hormones are playing a role in its regulation. The findings might have an impact on the development of specific targets in the treatment and prevention of hyperuricemia.
Scientific Article
2013
Kong, B. ; Wu, W. ; Cheng, T. ; Schlitter, A.M. ; Bruns, P. ; Jaeger, C. ; Regel, I. ; Raulefs, S. ; Behler, N. ; Irmler, M. ; Beckers, J. ; Erkan, M. ; Theis, F.J. ; Esposito, I. ; Kleeff, J. ; Michalski, C.W.
Pancreas 42, 1360-1361 (2013)
Meeting abstract
Kong, B. ; Behler, N. ; Bruns, P. ; Schlitter, A.M. ; Valkovska, N. ; Fritzsche, S. ; Raulefs, S. ; Regel, I. ; Erkan, M. ; Theis, F.J. ; Esposito, I. ; Kleeff, J. ; Michalski, C.W.
Pancreas 42, 1360 (2013)
Meeting abstract
Kuruoglu, E.E. ; Theis, F.J.
EURASIP J. Adv. Signal Process. 2013:185 (2013)
Editorial
Vehlow, C. ; Hasenauer, J. ; Krämer, A. ; Raue, A. ; Hug, S. ; Timmer, J. ; Radde, N. ; Theis, F.J. ; Weiskopf, D.
BMC Bioinformatics 14:S2 (2013)
Background: Mathematical models are nowadays widely used to describe biochemical reaction networks. One of the main reasons for this is that models facilitate the integration of a multitude of different data and data types using parameter estimation. Thereby, models allow for a holistic understanding of biological processes. However, due to measurement noise and the limited amount of data, uncertainties in the model parameters should be considered when conclusions are drawn from estimated model attributes, such as reaction fluxes or transient dynamics of biological species. Methods and results: We developed the visual analytics system iVUN that supports uncertainty-aware analysis of static and dynamic attributes of biochemical reaction networks modeled by ordinary differential equations. The multivariate graph of the network is visualized as a node-link diagram, and statistics of the attributes are mapped to the color of nodes and links of the graph. In addition, the graph view is linked with several views, such as line plots, scatter plots, and correlation matrices, to support locating uncertainties and the analysis of their time dependencies. As demonstration, we use iVUN to quantitatively analyze the dynamics of a model for Epo-induced JAK2/STAT5 signaling. Conclusion: Our case study showed that iVUN can be used to perform an in-depth study of biochemical reaction networks, including attribute uncertainties, correlations between these attributes and their uncertainties as well as the attribute dynamics. In particular, the linking of different visualization options turned out to be highly beneficial for the complex analysis tasks that come with the biological systems as presented here.
Scientific Article
Bartel, S. ; Schulz, N. ; Theis, F.J. ; Alessandrini, F. ; Takenaka, S. ; Eickelberg, O. ; Krauss-Etschmann, S.
Allergy 68, 233-233 (2013)
Meeting abstract
Bartel, J. ; Krumsiek, J. ; Theis, F.J.
Comp. Struc. Biotech. J. 4:e201301009 (2013)
Metabolomics is a relatively new high-throughput technology that aims at measuring all endogenous metabolites within a biological sample in an unbiased fashion. The resulting metabolic profiles may be regarded as functional signatures of the physiological state, and have been shown to comprise effects of genetic regulation as well as environmental factors. This potential to connect genotypic to phenotypic information promises new insights and biomarkers for different research fields, including biomedical and pharmaceutical research. In the statistical analysis of metabolomics data, many techniques from other omics fields can be reused. However recently, a number of tools specific for metabolomics data have been developed as well. The focus of this mini review will be on recent advancements in the analysis of metabolomics data especially by utilizing Gaussian graphical models and independent component analysis.
Review
Kazeroonian, A. ; Hasenauer, J. ; Theis, F.J.
In: Autio, R.* ; Shmulevich, I.* ; Strimmer, K.* ; Wiuf, C.* ; Sarbu, S.* ; Yli-Harja, O.* [Eds.]: Proceedings of the WCSB 2013 (10th International Workshop on Computational Systems Biology, 10 - 12 June 2013, Tampere, Finland). Tampere: Tampere International Center for Signal Processing, 2013. 66-73 (TICSP Series ; 63)
Sass, S. ; Buettner, F. ; Müller, N.S. ; Theis, F.J.
Nucleic Acids Res. 41, 9622-9633 (2013)
Modern high-throughput methods allow the investigation of biological functions across multiple 'omics' levels. Levels include mRNA and protein expression profiling as well as additional knowledge on, for example, DNA methylation and microRNA regulation. The reason for this interest in multi-omics is that actual cellular responses to different conditions are best explained mechanistically when taking all omics levels into account. To map gene products to their biological functions, public ontologies like Gene Ontology are commonly used. Many methods have been developed to identify terms in an ontology, overrepresented within a set of genes. However, these methods are not able to appropriately deal with any combination of several data types. Here, we propose a new method to analyse integrated data across multiple omics-levels to simultaneously assess their biological meaning. We developed a model-based Bayesian method for inferring interpretable term probabilities in a modular framework. Our Multi-level ONtology Analysis (MONA) algorithm performed significantly better than conventional analyses of individual levels and yields best results even for sophisticated models including mRNA fine-tuning by microRNAs. The MONA framework is flexible enough to allow for different underlying regulatory motifs or ontologies. It is ready-to-use for applied researchers and is available as a standalone application from http://icb.helmholtz-muenchen.de/mona.
Scientific Article
Buggenthin, F. ; Marr, C. ; Schwarzfischer, M. ; Hoppe, P.S. ; Hilsenbeck, O. ; Schroeder, T. ; Theis, F.J.
BMC Bioinformatics 14:297 (2013)
Background: In recent years, high-throughput microscopy has emerged as a powerful tool to analyze cellular dynamics in an unprecedentedly high resolved manner. The amount of data that is generated, for example in long-term time-lapse microscopy experiments, requires automated methods for processing and analysis. Available software frameworks are well suited for high-throughput processing of fluorescence images, but they often do not perform well on bright field image data that varies considerably between laboratories, setups, and even single experiments. Results: In this contribution, we present a fully automated image processing pipeline that is able to robustly segment and analyze cells with ellipsoid morphology from bright field microscopy in a high-throughput, yet time efficient manner. The pipeline comprises two steps: (i) Image acquisition is adjusted to obtain optimal bright field image quality for automatic processing. (ii) A concatenation of fast performing image processing algorithms robustly identifies single cells in each image. We applied the method to a time-lapse movie consisting of ~315,000 images of differentiating hematopoietic stem cells over 6 days. We evaluated the accuracy of our method by comparing the number of identified cells with manual counts. Our method is able to segment images with varying cell density and different cell types without parameter adjustment and clearly outperforms a standard approach. By computing population doubling times, we were able to identify three growth phases in the stem cell population throughout the whole movie, and validated our result with cell cycle times from single cell tracking. Conclusions: Our method allows fully automated processing and analysis of high-throughput bright field microscopy data. The robustness of cell detection and fast computation time will support the analysis of high-content screening experiments, on-line analysis of time-lapse experiments as well as development of methods to automatically track single-cell genealogies.
Scientific Article
Raue, A. ; Schilling, M. ; Bachmann, J. ; Matteson, A. ; Schelker, M. ; Kaschek, D. ; Hug, S. ; Kreutz, C. ; Harms, B.D. ; Theis, F.J. ; Klingmüller, U. ; Timmer, J.
PLoS ONE 8:e74335 (2013)
Due to the high complexity of biological data it is difficult to disentangle cellular processes relying only on intuitive interpretation of measurements. A Systems Biology approach that combines quantitative experimental data with dynamic mathematical modeling promises to yield deeper insights into these processes. Nevertheless, with growing complexity and increasing amount of quantitative experimental data, building realistic and reliable mathematical models can become a challenging task: the quality of experimental data has to be assessed objectively, unknown model parameters need to be estimated from the experimental data, and numerical calculations need to be precise and efficient. Here, we discuss, compare and characterize the performance of computational methods throughout the process of quantitative dynamic modeling using two previously established examples, for which quantitative, dose- and time-resolved experimental data are available. In particular, we present an approach that allows to determine the quality of experimental data in an efficient, objective and automated manner. Using this approach data generated by different measurement techniques and even in single replicates can be reliably used for mathematical modeling. For the estimation of unknown model parameters, the performance of different optimization algorithms was compared systematically. Our results show that deterministic derivative-based optimization employing the sensitivity equations in combination with a multi-start strategy based on latin hypercube sampling outperforms the other methods by orders of magnitude in accuracy and speed. Finally, we investigated transformations that yield a more efficient parameterization of the model and therefore lead to a further enhancement in optimization performance. We provide a freely available open source software package that implements the algorithms and examples compared here.
Scientific Article
Bartel, S. ; Schulz, N. ; Theis, F.J. ; Alessandrini, F. ; Takenaka, S. ; Eickelberg, O. ; Krauss-Etschmann, S.
Eur. Respir. J. 42 (2013)
Meeting abstract
Hock, S. ; Hasenauer, J. ; Theis, F.J.
BMC Bioinformatics 14:S7 (2013)
Background: Diffusion is a key component of many biological processes such as chemotaxis, developmental differentiation and tissue morphogenesis. The spatial gradients caused by diffusion can now be assessed in vitro and in vivo using microscopy-based imaging techniques. The resulting time series of two-dimensional, high-resolution images in combination with mechanistic models enable the quantitative analysis of the underlying mechanisms. However, such a model-based analysis is still challenging due to measurement noise and sparse observations, which result in uncertainties of the model parameters. Methods: We introduce a likelihood function for image-based measurements with log-normal distributed noise. Based upon this likelihood function we formulate the maximum likelihood estimation problem, which is solved using PDE-constrained optimization methods. To assess the uncertainty and practical identifiability of the parameters we introduce profile likelihoods for diffusion processes. Results and conclusion: As proof of concept, we model certain aspects of the guidance of dendritic cells towards lymphatic vessels, an example for haptotaxis. Using a realistic set of artificial measurement data, we estimate the five kinetic parameters of this model and compute profile likelihoods. Our novel approach for the estimation of model parameters from image data as well as the proposed identifiability analysis approach is widely applicable to diffusion processes. The profile likelihood based method provides more rigorous uncertainty bounds in contrast to local approximation methods.
Scientific Article
Andres, C. ; Hasenauer, J. ; Ahn, H.S. ; Joseph, E.K. ; Isensee, J. ; Theis, F.J. ; Allgöwer, F. ; Levine, J.D. ; Dib-Hajj, S.D. ; Waxman, S.G. ; Hucho, T.
Pain 154, 2216-2226 (2013)
Growth factors such as nerve growth factor and glial cell line-derived neurotrophic factor are known to induce pain sensitization. However, a plethora of other growth factors is released during inflammation and tissue regeneration, and many of them are essential for wound healing. Which wound-healing factors also alter the sensitivity of nociceptive neurons is not well known. We studied the wound-healing factor, basic fibroblast growth factor (bFGF), for its role in pain sensitization. Reverse transcription polymerase chain reaction showed that the receptor of bFGF, FGFR1, is expressed in lumbar rat dorsal root ganglia (DRG). We demonstrated presence of FGFR1 protein in DRG neurons by a recently introduced quantitative automated immunofluorescent microscopic technique. FGFR1 was expressed in all lumbar DRG neurons as quantified by mixture modeling. Corroborating the mRNA and protein expression data, bFGF induced Erk1/2 phosphorylation in nociceptive neurons, which could be blocked by inhibition of FGF receptors. Furthermore, bFGF activated Erk1/2 in a dose- and time-dependent manner. Using single-cell electrophysiological recordings, we found that bFGF treatment of DRG neurons increased the current-density of NaV1.8 channels. Erk1/2 inhibitors abrogated this increase. Importantly, intradermal injection of bFGF in rats induced Erk1/2-dependent mechanical hyperalgesia. Perspective: Analyzing intracellular signaling dynamics in nociceptive neurons has proven to be a powerful approach to identify novel modulators of pain. In addition to describing a new sensitizing factor, our findings indicate the potential to investigate wound-healing factors for their role in nociception.
Scientific Article
Rinck, A. ; Preusse, M. ; Laggerbauer, B. ; Lickert, H. ; Engelhardt, S. ; Theis, F.J.
RNA Biol. 10, 1125-1135 (2013)
MiRNAs are short, non-coding RNAs that regulate gene expression post-transcriptionally through specific binding to mRNA. Deregulation of miRNAs is associated with various diseases and interference with miRNA function has proven therapeutic potential. Most mRNAs are thought to be regulated by multiple miRNAs and there is some evidence that such joint activity is enhanced if a short distance between sites allows for cooperative binding. Until now, however, the concept of cooperativity among miRNAs has not been addressed in a transcriptome-wide approach. Here, we computationally screened human mRNAs for distances between miRNA binding sites that are expected to promote cooperativity. We find that sites with a maximal spacing of 26 nucleotides are enriched for naturally occurring miRNAs compared with control sequences. Furthermore, miRNAs with similar characteristics as indicated by either co-expression within a specific tissue or co-regulation in a disease context are predicted to target a higher number of mRNAs cooperatively than unrelated miRNAs. These bioinformatic data were compared with genome-wide sets of biochemically validated miRNA targets derived by Argonaute crosslinking and immunoprecipitation (HITS-CLIP and PAR-CLIP). To ease further research into combined and cooperative miRNA function, we developed miRco, a database connecting miRNAs and respective targets involved in distance-defined cooperative regulation (mips.helmholtz-muenchen.de/mirco). In conclusion, our findings suggest that cooperativity of miRNA-target interaction is a widespread phenomenon that may play an important role in miRNA-mediated gene regulation.
Scientific Article
Hock, S. ; Ng, Y.K. ; Hasenauer, J. ; Wittmann, D.M. ; Lutter, D. ; Trümbach, D. ; Wurst, W. ; Prakash, N. ; Theis, F.J.
BMC Syst. Biol. 7:48 (2013)
BACKGROUND: The establishment of the mid-hindbrain region in vertebrates is mediated by the isthmic organizer, an embryonic secondary organizer characterized by a well-defined pattern of locally restricted gene expression domains with sharply delimited boundaries. While the function of the isthmic organizer at the mid-hindbrain boundary has been subject to extensive experimental studies, it remains unclear how this well-defined spatial gene expression pattern, which is essential for proper isthmic organizer function, is established during vertebrate development. Because the secreted Wnt1 protein plays a prominent role in isthmic organizer function, we focused in particular on the refinement of Wnt1 gene expression in this context. RESULTS: We analyzed the dynamics of the corresponding murine gene regulatory network and the related, diffusive signaling proteins using a macroscopic model for the biological two-scale signaling process. Despite the discontinuity arising from the sharp gene expression domain boundaries, we proved the existence of unique, positive solutions for the partial differential equation system. This enabled the numerical and analytical analysis of the formation and stability of the expression pattern. Notably, the calculated expression domain of Wnt1 has no sharp boundary in contrast to experimental evidence. We subsequently propose a post-transcriptional regulatory mechanism for Wnt1 miRNAs which yields the observed sharp expression domain boundaries. We established a list of candidate miRNAs and confirmed their expression pattern by radioactive in situ hybridization. The miRNA miR-709 was identified as a potential regulator of Wnt1 mRNA, which was validated by luciferase sensor assays. CONCLUSION: In summary, our theoretical analysis of the gene expression pattern induction at the mid-hindbrain boundary revealed the need to extend the model by an additional Wnt1 regulation. The developed macroscopic model of a two-scale process facilitates the stringent analysis of other morphogen-based patterning processes.
Scientific Article
Hug, S. ; Raue, A. ; Hasenauer, J. ; Bachmann, J. ; Klingmüller, U. ; Timmer, J. ; Theis, F.J.
Math. Biosci. 246, 293-304 (2013)
In this work we present results of a detailed Bayesian parameter estimation for an analysis of ordinary differential equation models. These depend on many unknown parameters that have to be inferred from experimental data. The statistical inference in a high-dimensional parameter space is however conceptually and computationally challenging. To ensure rigorous assessment of model and prediction uncertainties we take advantage of both a profile posterior approach and Markov chain Monte Carlo sampling. We analyzed a dynamical model of the JAK2/STAT5 signal transduction pathway that contains more than one hundred parameters. Using the profile posterior we found that the corresponding posterior distribution is bimodal. To guarantee efficient mixing in the presence of multimodal posterior distributions we applied a multi-chain sampling approach. The Bayesian parameter estimation enables the assessment of prediction uncertainties and the design of additional experiments that enhance the explanatory power of the model. This study represents a proof of principle that detailed statistical analysis for quantitative dynamical modeling used in systems biology is feasible also in high-dimensional parameter spaces.
Scientific Article
Schmidl, D. ; Czado, C. ; Hug, S. ; Theis, F.J.
Bayesian Anal. 8, 1-22 (2013)
Statistical inference in high dimensional dynamical systems is often hindered by the unknown dependency structure of model parameters. In particular, the inference of parameterized differential equations (DEs) via Markov chain Monte Carlo (MCMC) samplers often suffers from high proposal rejection rates and is exacerbated by strong autocorrelation structures within the Markov chains leading to poor mixing properties. In this paper, we develop a novel vine-copula based adaptive MCMC approach for efficient parameter inference in dynamical systems with strong parameter interdependence. We exploit the concept of a vine-copula decomposition of distribution densities in order to generate problem-specific proposals for a hybrid independence/random walk Metropolis-Hastings (MH) sampler. The key advantage of this approach is that the corresponding MH proposals generate independent samples from the posterior distribution more efficiently than common competitors. All copula densities can be updated during the sampling procedure for fine-tuning. The performance of our method is assessed on two small-scale examples and finally evaluated on a delay DE model for the JAK2-STAT5 signaling pathway fitted to time-resolved western blot data. We compare our copula-based approach to an independence sampler, a second-order moment-based random walk MH algorithm, and an adaptive MH sampler.
Scientific Article
Schmidl, D. ; Czado, C. ; Hug, S. ; Theis, F.J.
Bayesian Anal. 8, 33-42 (2013)
First of all, we would like to thank both D. Woodard and M. Girolami & A. Mira for their excellent and detailed comments on our paper. Their remarks and questions have given us quite a number of new ideas for improving our sampling procedure. As suggested by both, we have conducted new sampling runs for two additional examples, which illustrate the usefulness of CIMH and ACIMH and answer some of the questions brought forward. Since Woodard’s comments focus on one particular aspect of CIMH/ACIMH, while Girolami & Mira point out several different considerations, we will first reply to Woodard’s comments.
Scientific Article
Wittmann, D.M. ; Hock, S. ; Theis, F.J.
SIAM J. Appl. Dyn. Syst. 12, 315-351 (2013)
One distinguishes between qualitative and Boolean models of gene regulatory networks. Qualitative models are directed graphs with signed edges according to whether an interaction is activating or inhibiting. In Boolean models each node of the network is a Boolean variable whose value depends on the values of the node's regulators according to a specific update rule. Qualitative models can be systematically converted into Boolean models via generic logics, which allow combination of activating and inhibiting influences into an update rule. Here, we investigate random Boolean networks whose update rules are generated using generic logics. We begin by studying their truth-content, which is a mean field approximation of the fraction of ones (true nodes). We prove that the asymptotic behavior of this quantity is essentially independent of the initial conditions. In numeric analyses, the truth-content exhibits a rich dynamical behavior including period-doublings leading to chaos. We define truth-stable networks as networks whose truth-contents exhibit non-chaotic dynamics. Random Boolean networks are known to exist in two phases: a frozen phase with stable short periodic dynamics and a chaotic phase characterized by long-periodic attractors sensitive to perturbations. Our results about the truth-content of random Boolean networks with generic logics allow us to derive a criterion for phase transitions in these networks. The relation between phase transitions and the concept of truth-stability is studied. In numeric analyses we find multiple, intricately shaped critical boundaries. Simulations further strengthen the significance of our mean field results. Our results nicely fit into the theory of "living at the edge of chaos," which aims at elucidating the crucial properties of evolvable biological systems.
Scientific Article
Rickert, D. ; Fricker, N. ; Lavrik, I.N. ; Theis, F.J.
In: Lavrik, I.N.* [Eds.]: Systems Biology of Apoptosis. New York: Springer, 2013. 57-84
A major problem when designing mathematical models of biochemical processes to analyze and explain experimental data is choosing the correct degree of model complexity. A common approach to solve this problem is top-down: Initially, complete models including all possible reactions are generated; they are then iteratively reduced to a more manageable size. The reactions to be simplified at each step are often chosen manually since exploration of the full search space seems unfeasible. While such a strategy is sufficient to identify a single, clearly structured reduction of the model, it discards additional information such as whether some model features are essential. In this chapter, we introduce alternate set-based strategies to model reduction that can be employed to exhaustively analyze the complete reduction space of a biochemical model instead of only identifying a single valid reduction.
Moignard, V. ; Macaulay, I.C. ; Swiers, G. ; Buettner, F. ; Schütte, J. ; Calero-Nieto, F.J. ; Kinston, S. ; Joshi, A. ; Hannah, R. ; Theis, F.J. ; Jacobsen, S.E. ; de Bruijn, M.F. ; Göttgens, B.
Nat. Cell Biol. 15, 363-372 (2013)
Cellular decision-making is mediated by a complex interplay of external stimuli with the intracellular environment, in particular transcription factor regulatory networks. Here we have determined the expression of a network of 18 key haematopoietic transcription factors in 597 single primary blood stem and progenitor cells isolated from mouse bone marrow. We demonstrate that different stem/progenitor populations are characterized by distinctive transcription factor expression states, and through comprehensive bioinformatic analysis reveal positively and negatively correlated transcription factor pairings, including previously unrecognized relationships between Gata2, Gfi1 and Gfi1b. Validation using transcriptional and transgenic assays confirmed direct regulatory interactions consistent with a regulatory triad in immature blood stem cells, where Gata2 may function to modulate cross-inhibition between Gfi1 and Gfi1b. Single-cell expression profiling therefore identifies network states and allows reconstruction of network hierarchies involved in controlling stem cell fate choices, and provides a blueprint for studying both normal development and human disease.
Scientific Article
Bardehle, S. ; Krüger, M. ; Buggenthin, F. ; Schwausch, J. ; Ninkovic, J. ; Clevers, H. ; Snippert, H.J. ; Theis, F.J. ; Meyer-Luehmann, M. ; Bechmann, I. ; Dimou, L. ; Götz, M.
Nat. Neurosci. 16, 580-586 (2013)
Astrocytes are thought to have important roles after brain injury, but their behavior has largely been inferred from postmortem analysis. To examine the mechanisms that recruit astrocytes to sites of injury, we used in vivo two-photon laser-scanning microscopy to follow the response of GFP-labeled astrocytes in the adult mouse cerebral cortex over several weeks after acute injury. Live imaging revealed a marked heterogeneity in the reaction of individual astrocytes, with one subset retaining their initial morphology, another directing their processes toward the lesion, and a distinct subset located at juxtavascular sites proliferating. Although no astrocytes actively migrated toward the injury site, selective proliferation of juxtavascular astrocytes was observed after the introduction of a lesion and was still the case, even though the extent was reduced, after astrocyte-specific deletion of the RhoGTPase Cdc42. Thus, astrocyte recruitment after injury relies solely on proliferation in a specific niche.
Scientific Article
Vehlow, C. ; Hasenauer, J. ; Theis, F.J. ; Weiskopf, D.
In: Proceedings of the 6th IEEE Pacific Visualization Symposium (IEEE PacificVis 2013 : 27.02.-01.03.2013, Sydney, Australia). Sydney, Australia: Univ. Sydney, 2013. 201-208
Graphs are used to model relations between sets of objects. Objects are represented by vertices and relations by edges of the graph. Besides vertex-vertex relations, in some application domains also relations between edges exist. Our new visualization approach supports the investigation of both relation types in one diagram. Edge-edge relations are visualized as curves that are directly integrated into the node-link diagram that represents the object-relation structure. In contrast, vertex-vertex relations are illustrated distinguishably from edge-edge relations using straight links as representations. While the shape of links is used to differentiate between the relation types, the weights of the edge-edge relations are mapped to the width and color of the curves. To facilitate an extensive analysis of interrelations, our approach incorporates several interaction techniques that can be used for filtering and highlighting. The usability of our visualization is demonstrated with two case studies in the application domains of bioinformatics and financial services.
Ried, J.S. ; Baurecht, H. ; Stückler, F. ; Krumsiek, J. ; Gieger, C. ; Heinrich, J. ; Kabesch, M. ; Prehn, C. ; Peters, A. ; Rodriguez, E. ; Schulz, H. ; Strauch, K. ; Suhre, K. ; Wang-Sattler, R. ; Wichmann, H.-E. ; Theis, F.J. ; Illig, T. ; Adamski, J. ; Weidinger, S.
Allergy 68, 629-636 (2013)
BACKGROUND: Genome-wide association studies (GWAS) have identified many risk loci for asthma, but effect sizes are small, and in most cases, the biological mechanisms are unclear. Targeted metabolite quantification that provides information about a whole range of pathways of intermediary metabolism can help to identify biomarkers and investigate disease mechanisms. Combining genetic and metabolic information can aid in characterizing genetic association signals with high resolution. This work aimed to investigate the interrelation of current asthma, candidate asthma risk alleles and a panel of metabolites. METHODS: We investigated 151 metabolites, quantified by targeted mass spectrometry, in fasting serum of asthmatic and nonasthmatic individuals from the population-based KORA F4 study (N = 2925). In addition, we analysed effects of single-nucleotide polymorphisms (SNPs) at 24 asthma risk loci on these metabolites. RESULTS: Increased levels of various phosphatidylcholines and decreased levels of various lyso-phosphatidylcholines were associated with asthma. Likewise, asthma risk alleles from the PSMD3 and MED24 genes at the asthma susceptibility locus 17q21 were associated with increased concentrations of various phosphatidylcholines with consistent effect directions. CONCLUSIONS: Our study demonstrated the potential of metabolomics to infer asthma-related biomarkers by the identification of potentially deregulated phospholipids that associate with asthma and asthma risk alleles.
Scientific Article
Xu, T. ; Holzapfel, C. ; Dong, X. ; Bader, E. ; Yu, Z. ; Prehn, C. ; Perstorfer, K. ; Jaremek, M. ; Römisch-Margl, W. ; Rathmann, W. ; Li, Y. ; Wichmann, H.-E. ; Wallaschofski, H. ; Ladwig, K.-H. ; Theis, F.J. ; Suhre, K. ; Adamski, J. ; Illig, T. ; Peters, A. ; Wang-Sattler, R.
BMC Med. 11:60 (2013)
BACKGROUND: Metabolomics helps to identify links between environmental exposures and intermediate biomarkers of disturbed pathways. We previously reported variations in phosphatidylcholines in male smokers compared with non-smokers in a cross-sectional pilot study with a small sample size, but knowledge of the reversibility of smoking effects on metabolite profiles is limited. Here, we extend our metabolomics study with a large prospective study including female smokers and quitters. METHODS: Using a targeted metabolomics approach, we quantified 140 metabolite concentrations for 1,241 fasting serum samples in the population-based Cooperative Health Research in the Region of Augsburg (KORA) human cohort at two time points: baseline survey conducted between 1999 and 2001 and follow-up after seven years. Metabolite profiles were compared among groups of current smokers, former smokers and never smokers, and were further assessed for their reversibility after smoking cessation. Changes in metabolite concentrations from baseline to the follow-up were investigated in a longitudinal analysis comparing current smokers, never smokers and smoking quitters, who were current smokers at baseline but former smokers by the time of follow-up. In addition, we constructed protein-metabolite networks with smoking-related genes and metabolites. RESULTS: We identified 21 smoking-related metabolites in the baseline investigation (18 in men and six in women, with three overlaps) enriched in amino acid and lipid pathways, which were significantly different between current smokers and never smokers. Moreover, 19 out of the 21 metabolites were found to be reversible in former smokers. In the follow-up study, 13 reversible metabolites in men were measured, of which 10 were confirmed to be reversible in male quitters. Protein-metabolite networks are proposed to explain the consistent reversibility of smoking effects on metabolites. CONCLUSIONS: We showed that smoking-related changes in human serum metabolites are reversible after smoking cessation, consistent with the known cardiovascular risk reduction. The metabolites identified may serve as potential biomarkers to evaluate the status of smoking cessation and characterize smoking-related diseases.
Scientific Article
Köttgen, A. ; Albrecht, E. ; Teumer, A. ; Vitart, V. ; Krumsiek, J. ; Hundertmark, C. ; Pistis, G. ; Ruggiero, D. ; O'Seaghdha, C.M. ; Haller, T. ; Yang, Q. ; Tanaka, T. ; Johnson, A.D. ; Kutalik, Z. ; Smith, A.V. ; Shi, J. ; Struchalin, M. ; Middelberg, R.P. ; Brown, M.J. ; Gaffo, A.L. ; Pirastu, N. ; Li, G. ; Hayward, C. ; Zemunik, T. ; Huffman, J. ; Yengo, L. ; Zhao, J.H. ; Demirkan, A. ; Feitosa, M.F. ; Liu, X. ; Malerba, G. ; Lopez, L.M. ; van der Harst, P. ; Li, X. ; Kleber, M.E. ; Hicks, A.A. ; Nolte, I.M. ; Johansson, A. ; Murgia, F. ; Wild, S.H. ; Bakker, S.J. ; Peden, J.F. ; Dehghan, A. ; Steri, M. ; Tenesa, A. ; Lagou, V. ; Salo, P. ; Mangino, M. ; Rose, L.M. ; Lehtimäki, T. ; Woodward, O.M. ; Okada, Y. ; Tin, A. ; Müller, C. ; Oldmeadow, C. ; Putku, M. ; Czamara, D. ; Kraft, P. ; Frogheri, L. ; Thun, G.A. ; Grotevendt, A. ; Gislason, G.K. ; Harris, T.B. ; Launer, L.J. ; McArdle, P. ; Shuldiner, A.R. ; Boerwinkle, E. ; Coresh, J. ; Schmidt, H. ; Schallert, M. ; Martin, N.G. ; Montgomery, G.W. ; Kubo, M. ; Nakamura, Y. ; Munroe, P.B. ; Samani, N.J. ; Jacobs, D.R. J.r. ; Liu, K. ; D'Adamo, P. ; Ulivi, S. ; Rotter, J.I. ; Psaty, B.M. ; Vollenweider, P. ; Waeber, G. ; Campbell, S. ; Devuyst, O. ; Navarro, P. ; Kolcic, I. ; Hastie, N. ; Balkau, B. ; Froguel, P. ; Esko, T. ; Salumets, A. ; Khaw, K.T. ; Langenberg, C. ; Wareham, N.J. ; Isaacs, A. ; Kraja, A. ; Zhang, Q. ; Wild, P.S. ; Scott, R.J. ; Holliday, E.G. ; Org, E. ; Viigimaa, M. ; Bandinelli, S. ; Metter, J.E. ; Lupo, A. ; Trabetti, E. ; Sorice, R. ; Döring, A. ; Lattka, E. ; Strauch, K. ; Theis, F.J. ; Waldenberger, M. ; Wichmann, H.-E. ; Davies, G. ; Gow, A.J. ; Bruinenberg, M. ; LifeLines Cohort Study () ; Stolk, R.P. ; Kooner, J.S. ; Zhang, W. ; Winkelmann, B.R. ; Boehm, B.O. ; Lucae, S. ; Penninx, B.W. ; Smit, J.H. ; Curhan, G. ; Mudgal, P. ; Plenge, R.M. ; Portas, L. ; Persico, I. ; Kirin, M. ; Wilson, J.F. ; Leach, I.M. ; van Gilst, W.H. ; Goel, A. ; Ongen, H. ; Hofman, A. ; Rivadeneira, F. 
; Uitterlinden, A.G. ; Imboden, M. ; von Eckardstein, A. ; Cucca, F. ; Nagaraja, R. ; Piras, M.G. ; Nauck, M. ; Schurmann, C. ; Budde, K. ; Ernst, F. ; Farrington, S.M. ; Theodoratou, E. ; Prokopenko, I. ; Stumvoll, M. ; Jula, A. ; Perola, M. ; Salomaa, V. ; Shin, S.Y. ; Spector, T.D. ; Sala, C. ; Ridker, P.M. ; Kähönen, M. ; Viikari, J. ; Hengstenberg, C. ; Nelson, C.P. ; Gieger, C. ; CARDIoGRAM Consortium (Wichmann, H.-E. ; Illig, T. ; Döring, A. ; Meisinger, C. ; Klopp, N. ; Peters, A. ; Meitinger, T.) ; DIAGRAM Consortium (Huth, C. ; Thorand, B. ; Meitinger, T. ; Gieger, C. ; Klopp, N. ; Grallert, H. ; Wichmann, H.-E. ; Illig, T. ; Petersen, A.-K.) ; ICBP Consortium () ; MAGIC Investigators (Wichmann, H.-E. ; Illig, T. ; Meisinger, C. ; Gieger, C. ; Thorand, B. ; Grallert, H.) ; Meschia, J.F. ; Nalls, M.A. ; Sharma, P. ; Singleton, A.B. ; Kamatani, N. ; Zeller, T. ; Burnier, M. ; Attia, J. ; Laan, M. ; Klopp, N. ; Hillege, H.L. ; Kloiber, S. ; Choi, H. ; Pirastu, M. ; Tore, S. ; Probst-Hensch, N.M. ; Völzke, H. ; Gudnason, V. ; Parsa, A. ; Schmidt, R. ; Whitfield, J.B. ; Fornage, M. ; Gasparini, P. ; Siscovick, D.S. ; Polasek, O. ; Campbell, H. ; Rudan, I. ; Bouatia-Naji, N. ; Metspalu, A. ; Loos, R.J. ; van Duijn, C.M. ; Borecki, I.B. ; Ferrucci, L. ; Gambaro, G. ; Deary, I.J. ; Wolffenbuttel, B.H. ; Chambers, J.C. ; Marz, W. ; Pramstaller, P.P. ; Snieder, H. ; Gyllensten, U. ; Wright, A.F. ; Navis, G. ; Watkins, H. ; Witteman, J.C. ; Sanna, S. ; Schipf, S. ; Dunlop, M.G. ; Tönjes, A. ; Ripatti, S. ; Soranzo, N. ; Toniolo, D. ; Chasman, D.I. ; Raitakari, O. ; Kao, W.H. ; Ciullo, M. ; Fox, C.S. ; Caulfield, M. ; Bochud, M.
Nat. Genet. 45, 145-154 (2013)
Elevated serum urate concentrations can cause gout, a prevalent and painful inflammatory arthritis. By combining data from >140,000 individuals of European ancestry within the Global Urate Genetics Consortium (GUGC), we identified and replicated 28 genome-wide significant loci in association with serum urate concentrations (18 new regions in or near TRIM46, INHBB, SFMBT1, TMEM171, VEGFA, BAZ1B, PRKAG2, STC1, HNF4G, A1CF, ATXN2, UBE2Q2, IGF1R, NFAT5, MAF, HLF, ACVR1B-ACVRL1 and B3GNT4). Associations for many of the loci were of similar magnitude in individuals of non-European ancestry. We further characterized these loci for associations with gout, transcript expression and the fractional excretion of urate. Network analyses implicate the inhibins-activins signaling pathways and glucose metabolism in systemic urate control. New candidate genes for serum urate concentration highlight the importance of metabolic control of urate production and excretion, which may have implications for the treatment and prevention of gout.
Scientific Article
Altmaier, E. ; Emeny, R.T. ; Krumsiek, J. ; Lacruz, M.E. ; Lukaschek, K. ; Häfner, S. ; Kastenmüller, G. ; Römisch-Margl, W. ; Prehn, C. ; Mohney, R.P. ; Evans, A.M. ; Milburn, M.V. ; Illig, T. ; Adamski, J. ; Theis, F.J. ; Suhre, K. ; Ladwig, K.-H.
Psychoneuroendocrinology 38, 1299-1309 (2013)
Background Individuals with negative affectivity who are inhibited in social situations are characterized as distressed, or Type D, and have an increased risk of cardiovascular disease (CVD). The underlying biomechanisms that link this psychological affect to a pathological state are not well understood. This study applied a metabolomic approach to explore biochemical pathways that may contribute to the Type D personality. Methods Type D personality was determined by the Type D Scale-14. Small molecule biochemicals were measured using two complementary mass-spectrometry based metabolomics platforms. Metabolic profiles of Type D and non-Type D participants within a population-based study in Southern Germany were compared in cross-sectional regression analyses. The PHQ-9 and GAD-7 instruments were also used to assess symptoms of depression and anxiety, respectively, within this metabolomic study. Results 668 metabolites were identified in the serum of 1502 participants (age 32–77); 386 of these individuals were classified as Type D. While demographic and biomedical characteristics were equally distributed between the groups, a higher level of depression and anxiety was observed in Type D individuals. Significantly lower levels of the tryptophan metabolite kynurenine were associated with Type D (p-value corrected for multiple testing = 0.042), while no significant associations could be found for depression and anxiety. A Gaussian graphical model analysis enabled the identification of four potentially interesting metabolite networks that are enriched in metabolites (androsterone sulfate, tyrosine, indoxyl sulfate or caffeine) that associate nominally with Type D personality. Conclusions This study identified novel biochemical pathways associated with Type D personality and demonstrates that the application of metabolomic approaches in population studies can reveal mechanisms that may contribute to psychological health and disease.
Scientific Article
Raue, A. ; Kreutz, C. ; Theis, F.J. ; Timmer, J.
Philos. Trans. R. Soc. A - Math. Phys. Eng. Sci. 371:20110544 (2013)
Increasingly complex applications involve large datasets in combination with nonlinear and high dimensional mathematical models. In this context, statistical inference is a challenging issue that calls for pragmatic approaches that take advantage of both Bayesian and frequentist methods. The elegance of Bayesian methodology is founded in the propagation of information content provided by experimental data and prior assumptions to the posterior probability distribution of model predictions. However, for complex applications experimental data and prior assumptions potentially constrain the posterior probability distribution insufficiently. In these situations Bayesian Markov chain Monte Carlo sampling can be infeasible. From a frequentist point of view insufficient experimental data and prior assumptions can be interpreted as non-identifiability. The profile likelihood approach offers to detect and to resolve non-identifiability by experimental design iteratively. Therefore, it allows one to better constrain the posterior probability distribution until Markov chain Monte Carlo sampling can be used securely. Using an application from cell biology we compare both methods and show that a successive application of both methods facilitates a realistic assessment of uncertainty in model predictions.
Scientific Article
2012
Cichocki, A. ; Theis, F.J. ; Yeredor, A. ; Zibulevsky, M.
Lect. Notes Comput. Sc. 7191, V-VII (2012)
Editorial
Gutch, H.W. ; Gruber, P. ; Yeredor, A. ; Theis, F.J.
Signal Process. 92, 1796-1808 (2012)
We transfer the ICA model to the case where the underlying field is not the set of reals but an arbitrary finite field. We give conditions for separability of the model, pointing out existing parallels to the real case. Three algorithms capable of solving the task are suggested and we demonstrate their viability through simulations and a possible application of the model.
Scientific Article
Ruepp, A. ; Kowarsch, A. ; Theis, F.J.
Methods Mol. Biol. 822, 249-260 (2012)
The association of dysregulated microRNAs (miRNAs) and diseases has been shown in a variety of studies. Here, we review a resource denoted as PhenomiR, providing systematic and comprehensive access to such studies. It allows machine-readable access to miRNA and target relations from these studies to study the impact of miRNAs on multifactorial diseases across many samples and biological replicates. We summarize the PhenomiR data structure and its content and show how to access the database and use it in everyday miRNA profile analysis using the R language.
Scientific Article
Nordhausen, K. ; Gutch, H.W. ; Oja, H. ; Theis, F.J.
In: Theis, F.J. ; Cichocki, A.* ; Yeredor, A.* ; Zibulevsky, M.* [Eds.]: Proceedings (10th international conference on Latent Variable Analysis and Signal Separation). Heidelberg: Springer, 2012. 172-179 (Lecture Notes in Computer Science ; 7191)
Procedures such as FOBI that jointly diagonalize two matrices with the independence property have a long tradition in ICA. These procedures have well-known statistical properties, for example they are prone to failure if the sources have multiple identical values on the diagonal. In this paper we suggest jointly diagonalizing k ≥ 2 scatter matrices having the independence property. For the joint diagonalization we suggest a novel algorithm which finds the correct directions in a deflation-based manner, one after another. The method is demonstrated in a small simulation study.
Chursov, A. ; Kopetzky, S.J. ; Leshchiner, I. ; Kondofersky, I. ; Theis, F.J. ; Frishman, D. ; Shneider, A.
RNA Biol. 9, 1266-1274 (2012)
For decades, cold-adapted, temperature-sensitive (ca/ts) strains of influenza A virus have been used as live attenuated vaccines. Due to their great public health importance it is crucial to understand the molecular mechanism(s) of cold adaptation and temperature sensitivity that are currently unknown. For instance, secondary RNA structures play important roles in influenza biology. Thus, we hypothesized that a relatively minor change in temperature (32-39 °C) can lead to perturbations in influenza RNA structures and that these structural perturbations may be different for mRNAs of the wild type (wt) and ca/ts strains. To test this hypothesis, we developed a novel in silico method that enables assessing whether two related RNA molecules would undergo (dis)similar structural perturbations upon temperature change. The proposed method allows identifying those areas within an RNA chain where dissimilarities of RNA secondary structures at two different temperatures are particularly pronounced, without knowing particular RNA shapes at either temperature. We identified such areas in the NS2, PA, PB2 and NP mRNAs. However, these areas are not identical for the wt and ca/ts mutants. Differences in temperature-induced structural changes of wt and ca/ts mRNA structures may constitute a yet unappreciated molecular mechanism of the cold adaptation/temperature sensitivity phenomena.
Scientific Article
Gutch, H.W. ; Theis, F.J.
In: Theis, F.J. ; Cichocki, A.* ; Yeredor, A.* ; Zibulevsky, M.* [Eds.]: Proceedings (10th International Conference, LVA/ICA 2012, 12-15 March 2012, Tel Aviv, Israel). Heidelberg: Springer, 2012. 180-187 (Lect. Notes Comput. Sc. ; 7191)
The original Independent Component Analysis (ICA) problem of blindly separating a mixture of a finite number of real-valued statistically independent one-dimensional sources has been extended in a number of ways in recent years. These include dropping the assumption that all sources are one-dimensional and some extensions to the case where the sources are not real-valued. We introduce an extension in a further direction, no longer assuming only a finite number of sources, but instead allowing infinitely many. We define a notion of independent sources for this case and show separability of ICA in this framework.
Plant, C.C. ; Mai Thai, S. ; Shao, J. ; Theis, F.J. ; Meyer-Bäse, A. ; Böhm, C.
Adv. Artif. Neural Syst. 2012:962105 (2012)
Independent component analysis (ICA) is an essential building block for data analysis in many applications. Selecting the truly meaningful components from the result of an ICA algorithm, or comparing the results of different algorithms, however, are nontrivial problems. We introduce a very general technique for evaluating ICA results rooted in information-theoretic model selection. The basic idea is to exploit the natural link between non-Gaussianity and data compression: the more the data transformation represented by one or several ICs improves the effectiveness of data compression, the higher the relevance of the ICs. We propose two different methods which allow an efficient data compression of non-Gaussian signals: Phi-transformed histograms and fuzzy histograms. In an extensive experimental evaluation, we demonstrate that our novel information-theoretic measures robustly select non-Gaussian components from data in a fully automatic way, that is, without requiring any restrictive assumptions or thresholds.
Scientific Article
Krumsiek, J. ; Stückler, F. ; Kastenmüller, G. ; Theis, F.J.
In: Suhre, K.* [Eds.]: Genetics Meets Metabolomics: from Experiment to Systems Biology. New York: Springer, 2012. 281-313
In the preceding chapters many aspects of metabolite quantification and relation to trait and disease phenotypes have been described, in particular the linkage of intermediate metabolic traits to genetic heterogeneities. Although many analyses start on the genome-wide level, they end up picking out single polymorphisms or other variations and study these in detail. This reductionist approach is very common in molecular biology and has proven hugely successful over the past decades. In recent years however, a second paradigm has become increasingly popular, namely that of integrating multiple such analyses into larger ones commonly called models. This paradigm, nowadays, is known as systems biology and is expected to penetrate many classical molecular analyses.
Winkler, C. ; Krumsiek, J. ; Lempainen, J. ; Achenbach, P. ; Grallert, H. ; Giannopoulou, E.Z. ; Bunk, M. ; Theis, F.J. ; Bonifacio, E. ; Ziegler, A.-G.
Genes Immun. 13, 549-555 (2012)
Genome-wide association studies have identified gene regions associated with type 1 diabetes. The aim of this study was to determine how the combined allele frequency of multiple susceptibility genes can stratify islet autoimmunity and/or type 1 diabetes risk. Children of parents with type 1 diabetes and prospectively followed from birth for the development of islet autoantibodies and diabetes were genotyped for single-nucleotide polymorphisms at 12 type 1 diabetes susceptibility genes (ERBB3, PTPN2, IFIH1, PTPN22, KIAA0350, CD25, CTLA4, SH2B3, IL2, IL18RAP, IL10 and COBL). Non-human leukocyte antigen (HLA) risk score was defined by the total number of risk alleles at these genes. Receiver operator curve analysis showed that the non-HLA gene combinations were highly effective in discriminating diabetes and most effective in children with a high-risk HLA genotype. The greatest diabetes discrimination was obtained by the sum of risk alleles for eight genes (IFIH1, CTLA4, PTPN22, IL18RAP, SH2B3, KIAA0350, COBL and ERBB3) in the HLA-risk children. Non-HLA-risk allele scores stratified risk for developing islet autoantibodies and diabetes, and progression from islet autoimmunity to diabetes. Genotyping at multiple susceptibility loci in children from affected families can identify neonates with sufficient genetic risk of type 1 diabetes to be considered for early intervention.
Scientific Article
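The non-HLA risk score described in the abstract above is a plain count of risk alleles over a chosen gene set. A minimal sketch, assuming each genotype is encoded as the number of risk alleles (0, 1 or 2) per gene; the function name, the encoding and the example child are illustrative assumptions and not taken from the study:

```python
# Eight-gene set named in the abstract as giving the greatest discrimination.
EIGHT_GENE_SET = [
    "IFIH1", "CTLA4", "PTPN22", "IL18RAP",
    "SH2B3", "KIAA0350", "COBL", "ERBB3",
]


def risk_allele_score(genotype, genes=EIGHT_GENE_SET):
    """Sum the risk-allele counts (0, 1 or 2 per gene) over the given genes.

    `genotype` maps gene name -> number of risk alleles carried.
    Genes missing from the mapping raise KeyError so that gaps in the
    genotyping are not silently counted as zero.
    """
    for gene in genes:
        if genotype[gene] not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {gene}")
    return sum(genotype[gene] for gene in genes)


# Hypothetical child: made-up counts, for illustration only.
example_child = {
    "IFIH1": 2, "CTLA4": 1, "PTPN22": 0, "IL18RAP": 1,
    "SH2B3": 2, "KIAA0350": 1, "COBL": 0, "ERBB3": 2,
}
score = risk_allele_score(example_child)
```

With eight genes the score ranges from 0 to 16. How scores map to risk strata is specific to the study and is not reproduced here.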
Krumsiek, J. ; Suhre, K. ; Evans, A.M. ; Mitchell, M.W. ; Mohney, R.P. ; Milburn, M.V. ; Wägele, B. ; Römisch-Margl, W. ; Illig, T. ; Adamski, J. ; Gieger, C. ; Theis, F.J. ; Kastenmüller, G.
PLoS Genet. 8:e1003005 (2012)
Recent genome-wide association studies (GWAS) with metabolomics data linked genetic variation in the human genome to differences in individual metabolite levels. A strong relevance of this metabolic individuality for biomedical and pharmaceutical research has been reported. However, a considerable amount of the molecules currently quantified by modern metabolomics techniques are chemically unidentified. The identification of these "unknown metabolites" is still a demanding and intricate task, limiting their usability as functional markers of metabolic processes. As a consequence, previous GWAS largely ignored unknown metabolites as metabolic traits for the analysis. Here we present a systems-level approach that combines genome-wide association analysis and Gaussian graphical modeling with metabolomics to predict the identity of the unknown metabolites. We apply our method to original data of 517 metabolic traits, of which 225 are unknowns, and genotyping information on 655,658 genetic variants, measured in 1,768 human blood samples. We report previously undescribed genotype-metabotype associations for six distinct gene loci (SLC22A2, COMT, CYP3A5, CYP2C18, GBA3, UGT3A1) and one locus not related to any known gene (rs12413935). Overlaying the inferred genetic associations, metabolic networks, and knowledge-based pathway information, we derive testable hypotheses on the biochemical identities of 106 unknown metabolites. As a proof of principle, we experimentally confirm nine concrete predictions. We demonstrate the benefit of our method for the functional interpretation of previous metabolomics biomarker studies on liver detoxification, hypertension, and insulin resistance. Our approach is generic in nature and can be directly transferred to metabolomics data from different experimental platforms.
Scientific Article
Theis, F.J.
In: Theis, F.J. ; Cichocki, A.* ; Yeredor, A.* ; Zibulevsky, M.* [Eds.]: Proceedings (10th international conference on Latent Variable Analysis and Signal Separation). Heidelberg: Springer, 2012. 528-535 (Lect. Notes Comput. Sc. ; 7191)
With the increasing availability of interaction data stemming from fields as diverse as systems biology, telecommunication or social sciences, the task of mining and understanding the underlying graph structures becomes more and more important. Here we focus on data with different types of nodes; we subsume this meta information in the color of a node. An important first step is the unsupervised clustering of nodes into communities, which are of the same color and highly connected within but sparsely connected to the rest of the graph. Recently we have proposed a fuzzy extension of this clustering concept, which allows a node to have membership in multiple clusters. The resulting gradient descent algorithm shared many similarities with the multiplicative update rules from nonnegative matrix factorization. Two issues left open were the determination of the number of clusters of each color, as well as the non-defined integration of additional prior information. In this contribution we resolve these issues by reinterpreting the factorization in a Bayesian framework, which allows the ready inclusion of priors. We integrate automatic relevance determination to automatically estimate group sizes. We derive a maximum-a-posteriori estimator, and illustrate the feasibility of the approach on a toy as well as a protein-complex hypergraph, where the resulting fuzzy clusters show significant enrichment of distinct gene ontology categories.
Hug, S. ; Theis, F.J.
In: Theis, F.J. ; Cichocki, A.* ; Yeredor, A.* ; Zibulevsky, M.* [Eds.]: Proceedings (10th international conference on Latent Variable Analysis and Signal Separation). Heidelberg: Springer, 2012. 520-527 (Lect. Notes Comput. Sc. ; 7191)
In the study of gene regulatory networks, more and more quantitative data becomes available. However, few of the players in such networks are observed, others are latent. Focusing on the inference of multiple such latent causes, we arrive at a blind source separation problem. Under the assumptions of independent sources and Gaussian noise, this condenses to a Bayesian independent component analysis problem with a natural dynamic structure. We here present a method for the inference in networks with linear dynamics, with a straightforward extension to the nonlinear case. The proposed method uses a maximum a posteriori estimate of the latent causes, with additional prior information guaranteeing independence. We illustrate the feasibility of our method on a toy example and compare the results with standard approaches.
Illner, K. ; Fuchs, C. ; Theis, F.J.
In: Larjo, A.* ; Schober, S.* ; Farhan, M.* ; Bossert, M.* ; Yli-Harja, O.* [Eds.]: Proceedings (Ninth International Workshop on Computational Systems Biology, WCSB 2012, June 4-6, 2012, Ulm, Germany). Tampere, Finland: Tampere International Center for Signal Processing, 2012. 43-46 (Proc. WCSB ; 61)
Dealing with data of a specific temporal or spatial structure is well established in blind source separation. However, in biology one often faces more complex network structures. The recently published GraDe-algorithm addresses such structures; it separates sources with respect to a given network in an analytical manner. We formulate corresponding assumptions and assign them to a very flexible Bayesian model. This allows us to include for instance missing observations and use prior parameter knowledge. Technically, we propose a Gaussian graphical model with latent variables to include all structural information from the data. The parameters and latent variables are estimated using expectation maximization, where we exploit the restrictions given by the separation assumptions. In a large scale application we consider gene expression data, where the dependence structure is given by a gene regulatory network. We demonstrate how the model indeed identifies relevant biological processes.
Gutch, H.W. ; Theis, F.J.
J. Multivar. Anal. 112, 48-62 (2012)
Given a random vector X, we address the question of linear separability of X, that is, the task of finding a linear operator W such that we have (S_1, ..., S_M) = WX with statistically independent random vectors S_i. As this requirement alone is already fulfilled trivially by X being independent of the empty rest, we require that the components be not further decomposable. We show that if X has finite covariance, such a representation is unique up to trivial indeterminacies. We propose an algorithm based on this proof and demonstrate its applicability. Related algorithms, however with fixed dimensionality of the subspaces, have already been successfully employed in biomedical applications, such as separation of fMRI recorded data. Based on the presented uniqueness result, it is now clear that also subspace dimensions can be determined in a unique and therefore meaningful fashion, which shows the advantages of independent subspace analysis in contrast to methods like principal component analysis.
Scientific Article
Marr, C. ; Strasser, M. ; Schwarzfischer, M. ; Schroeder, T. ; Theis, F.J.
FEBS J. 279, 3488-3500 (2012)
Hematopoiesis is often pictured as a hierarchy of branching decisions, giving rise to all mature blood cell types from stepwise differentiation of a single cell, the hematopoietic stem cell. Various aspects of this process have been modeled using various experimental and theoretical techniques on different scales. Here we integrate the more common population-based approach with a single-cell resolved molecular differentiation model to study the possibility of inferring mechanistic knowledge of the differentiation process. We focus on a sub-module of hematopoiesis: differentiation of granulocyte-monocyte progenitors (GMPs) to granulocytes or monocytes. Within a branching process model, we infer the differentiation probability of GMPs from the experimentally quantified heterogeneity of colony assays under permissive conditions where both granulocytes and monocytes can emerge. We compare the predictions with the differentiation probability in genealogies determined from single-cell time-lapse microscopy. In contrast to the branching process model, we found that the differentiation probability as determined by differentiation marker onset increases with the generation of the cell within the genealogy. To study this feature from a molecular perspective, we established a stochastic toggle switch model, in which the intrinsic lineage decision is executed using two antagonistic transcription factors. We identified parameter regimes that allow for both time-dependent and time-independent differentiation probabilities. Finally, we infer parameters for which the model matches experimentally observed differentiation probabilities via approximate Bayesian computing. These parameters suggest different timescales in the dynamics of granulocyte and monocyte differentiation. Thus we provide a multi-scale picture of cell differentiation in murine GMPs, and illustrate the need for single-cell time-resolved observations of cellular decisions.
Scientific Article
Petersen, A.-K. ; Krumsiek, J. ; Wägele, B. ; Theis, F.J. ; Wichmann, H.-E. ; Gieger, C. ; Suhre, K.
BMC Bioinformatics 13:120 (2012)
ABSTRACT: BACKGROUND: Genome-wide association studies (GWAS) with metabolic traits and metabolome-wide association studies (MWAS) with traits of biomedical relevance are powerful tools to identify the contribution of genetic, environmental and lifestyle factors to the etiology of complex diseases. Hypothesis-free testing of ratios between all possible metabolite pairs in GWAS and MWAS has proven to be an innovative approach in the discovery of new biologically meaningful associations. The p-gain statistic was introduced as an ad-hoc measure to determine whether a ratio between two metabolite concentrations carries more information than the two corresponding metabolite concentrations alone. So far, only a rule of thumb was applied to determine the significance of the p-gain. RESULTS: Here we explore the statistical properties of the p-gain through simulation of its density and by sampling of experimental data. We derive critical values of the p-gain for different levels of correlation between metabolite pairs and show that B/(2*alpha) is a conservative critical value for the p-gain, where alpha is the level of significance and B the number of tested metabolite pairs. CONCLUSIONS: We show that the p-gain is a well defined measure that can be used to identify statistically significant metabolite ratios in association studies and provide a conservative significance cut-off for the p-gain for use in future association studies with metabolic traits.
Scientific Article
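The conservative cut-off B/(2*alpha) from the abstract above can be illustrated with a short sketch. The abstract does not spell out the p-gain formula; the code assumes the commonly used definition, namely the smaller of the two single-metabolite p-values divided by the p-value of the ratio. The function names and example values are illustrative assumptions, not part of the paper:

```python
def p_gain(p_m1, p_m2, p_ratio):
    """Assumed definition: min of the single-metabolite p-values over the ratio's p-value."""
    return min(p_m1, p_m2) / p_ratio


def critical_p_gain(n_pairs, alpha=0.05):
    """Conservative critical value B / (2 * alpha) stated in the abstract.

    n_pairs is B, the number of tested metabolite pairs;
    alpha is the significance level.
    """
    return n_pairs / (2 * alpha)


def ratio_is_significant(p_m1, p_m2, p_ratio, n_pairs, alpha=0.05):
    """True if the p-gain of the ratio exceeds the conservative cut-off."""
    return p_gain(p_m1, p_m2, p_ratio) > critical_p_gain(n_pairs, alpha)


# Hypothetical numbers: 100 tested pairs at alpha = 0.05.
cutoff = critical_p_gain(100, alpha=0.05)
strong = ratio_is_significant(1e-4, 1e-3, 1e-9, n_pairs=100)
weak = ratio_is_significant(0.01, 0.02, 0.005, n_pairs=100)
```

In this made-up setting the cut-off is about 1000, so a ratio whose p-value is five orders of magnitude smaller than the better single-metabolite p-value passes, while a p-gain of 2 does not.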
Schmidl, D. ; Hug, S. ; Li, W.B. ; Greiter, M. ; Theis, F.J.
BMC Syst. Biol. 6:95 (2012)
ABSTRACT: BACKGROUND: In radiation protection, biokinetic models for zirconium processing are of crucial importance in dose estimation and further risk analysis for humans exposed to this radioactive substance. They provide limiting values of detrimental effects and build the basis for applications in internal dosimetry, the prediction for radioactive zirconium retention in various organs as well as retrospective dosimetry. Multi-compartmental models are the tool of choice for simulating the processing of zirconium. Although easily interpretable, determining the exact compartment structure and interaction mechanisms is generally daunting. In the context of observing the dynamics of multiple compartments, Bayesian methods provide efficient tools for model inference and selection. RESULTS: We are the first to apply a Markov chain Monte Carlo approach to compute Bayes factors for the evaluation of two competing models for zirconium processing in the human body after ingestion. Based on in vivo measurements of human plasma and urine levels we were able to show that a recently published model is superior to the standard model of the International Commission on Radiological Protection. The Bayes factors were estimated by means of the numerically stable thermodynamic integration in combination with a recently developed copula based Metropolis-Hastings sampler. CONCLUSIONS: In contrast to the standard model the novel model predicts lower accretion of zirconium in bones. This results in lower levels of noxious doses for exposed individuals. Moreover, the Bayesian approach allows for retrospective dose assessment, including credible intervals for the initially ingested zirconium, in a significantly more reliable fashion than previously possible. All methods presented here are readily applicable to many modeling tasks in systems biology.
Scientific Article
Buettner, F. ; Theis, F.J.
Bioinformatics 28, i626-i632 (2012)
Motivation: Single-cell experiments of cells from the early mouse embryo yield gene expression data for different developmental stages from zygote to blastocyst. To better understand cell fate decisions during differentiation, it is desirable to analyse the high-dimensional gene expression data and assess differences in gene expression patterns between different developmental stages as well as within developmental stages. Conventional methods include univariate analyses of distributions of genes at different stages or multivariate linear methods such as principal component analysis (PCA). However, these approaches often fail to resolve important differences as each lineage has a unique gene expression pattern which changes gradually over time yielding different gene expressions both between different developmental stages as well as heterogeneous distributions at a specific stage. Furthermore, to date, no approach taking the temporal structure of the data into account has been presented. Results: We present a novel framework based on Gaussian process latent variable models (GPLVMs) to analyse single-cell qPCR expression data of 48 genes from mouse zygote to blastocyst as presented by (Guo et al., 2010). We extend GPLVMs by introducing gene relevance maps and gradient plots to provide interpretability as in the linear case. Furthermore, we take the temporal group structure of the data into account and introduce a new factor in the GPLVM likelihood which ensures that small distances are preserved for cells from the same developmental stage. Using our novel framework, it is possible to resolve differences in gene expressions for all developmental stages. Furthermore, a new subpopulation of cells within the 16-cell stage is identified which is significantly more trophectoderm-like than the rest of the population. The trophectoderm-like subpopulation was characterized by considerable differences in the expression of Id2, Gata4 and, to a smaller extent, Klf4 and Hand1. 
The relevance of Id2 as an early marker for TE cells is consistent with previously published results.
Scientific Article
Sterz, K. ; Scherer, G. ; Krumsiek, J. ; Theis, F.J. ; Ecker, J.
Chem. Res. Toxicol. 25, 1565-1567 (2012)
1,3-Butadiene (BD) is a Class 1 carcinogen present at workplaces, in polluted air, in automobile exhaust, and in tobacco smoke. 2-Hydroxybutene-1-yl mercapturic acid (2-MHBMA) is a urinary metabolite often measured as a biomarker for exposure to BD. Here, we show for the first time that an additional MHBMA isomer is present at significant amounts in human urine, 1-hydroxybutene-2-yl mercapturic acid (1-MHBMA). For its quantification, a highly sensitive UPLC-HILIC-MS/MS method was developed and validated. Analyzing urinary samples of 183 volunteers, we demonstrate that 1-MHBMA is a novel and potentially more reliable biomarker for BD exposure than the commonly analyzed 2-MHBMA.
Scientific Article
Peng, C. ; Li, N. ; Ng, Y.K. ; Zhang, J. ; Meier, F. ; Theis, F.J. ; Merkenschlager, M. ; Chen, W. ; Wurst, W. ; Prakash, N.
J. Neurosci. 32, 13292-13308 (2012)
MicroRNAs have emerged as key posttranscriptional regulators of gene expression during vertebrate development. We show that the miR-200 family plays a crucial role for the proper generation and survival of ventral neuronal populations in the murine midbrain/hindbrain region, including midbrain dopaminergic neurons, by directly targeting the pluripotency factor Sox2 and the cell-cycle regulator E2F3 in neural stem/progenitor cells. The lack of a negative regulation of Sox2 and E2F3 by miR-200 in conditional Dicer1 mutants (En1(+/Cre); Dicer1(flox/flox) mice) and after miR-200 knockdown in vitro leads to a strongly reduced cell-cycle exit and neuronal differentiation of ventral midbrain/hindbrain (vMH) neural progenitors, whereas the opposite effect is seen after miR-200 overexpression in primary vMH cells. Expression of miR-200 is in turn directly regulated by Sox2 and E2F3, thereby establishing a unilateral negative feedback loop required for the cell-cycle exit and neuronal differentiation of neural stem/progenitor cells. Our findings suggest that the posttranscriptional regulation of Sox2 and E2F3 by miR-200 family members might be a general mechanism to control the transition from a pluripotent/multipotent stem/progenitor cell to a postmitotic and more differentiated cell.
Scientific Article
Krumsiek, J. ; Suhre, K. ; Illig, T. ; Adamski, J. ; Theis, F.J.
J. Proteome Res. 11, 4120-4131 (2012)
Interpreting the complex interplay of metabolites in heterogeneous biosamples still poses a challenging task. In this study, we propose independent component analysis (ICA) as a multivariate analysis tool for the interpretation of large-scale metabolomics data. In particular, we employ a Bayesian ICA method based on a mean-field approach, which allows us to statistically infer the number of independent components to be reconstructed. The advantage of ICA over correlation-based methods like principal component analysis (PCA) is the utilization of higher order statistical dependencies, which not only yield additional information but also allow a more meaningful representation of the data with fewer components. We performed the described ICA approach on a large-scale metabolomics data set of human serum samples, comprising a total of 1764 study probands with 218 measured metabolites. Inspecting the source matrix of statistically independent metabolite profiles using a weighted enrichment algorithm, we observe strong enrichment of specific metabolic pathways in all components. This includes signatures from amino acid metabolism, energy-related processes, carbohydrate metabolism, and lipid metabolism. Our results imply that the human blood metabolome is composed of a distinct set of overlaying, statistically independent signals. ICA furthermore produces a mixing matrix, describing the strength of each independent component for each of the study probands. Correlating these values with plasma high-density lipoprotein (HDL) levels, we establish a novel association between HDL plasma levels and the branched-chain amino acid pathway. We conclude that the Bayesian ICA methodology has the power and flexibility to replace many of the PCA and clustering-based analyses that are now common in the research field.
Scientific Article
Jourdan, C. ; Petersen, A.-K. ; Gieger, C. ; Döring, A. ; Illig, T. ; Wang-Sattler, R. ; Meisinger, C. ; Peters, A. ; Adamski, J. ; Prehn, C. ; Suhre, K. ; Altmaier, E. ; Kastenmüller, G. ; Römisch-Margl, W. ; Theis, F.J. ; Krumsiek, J. ; Wichmann, H.-E. ; Linseisen, J.
PLoS ONE 7:e40009 (2012)
Objective: To characterise the influence of the fat free mass on the metabolite profile in serum samples from participants of the population-based KORA (Cooperative Health Research in the Region of Augsburg) S4 study. Subjects and Methods: Analyses were based on metabolite profiles from 965 participants of the S4 and 890 weight-stable subjects of its seven-year follow-up study (KORA F4). 190 different serum metabolites were quantified in a targeted approach including amino acids, acylcarnitines, phosphatidylcholines (PCs), sphingomyelins and hexose. Associations between metabolite concentrations and the fat free mass index (FFMI) were analysed using adjusted linear regression models. To draw conclusions on enzymatic reactions, intra-metabolite class ratios were explored. Pairwise relationships among metabolites were investigated and illustrated by means of Gaussian graphical models (GGMs). Results: We found 339 significant associations between FFMI and various metabolites in KORA S4. Among the most prominent associations (p-values from 4.75 × 10^-16 to 8.95 × 10^-6) with higher FFMI were increasing concentrations of the branched-chain amino acids (BCAAs), ratios of BCAAs to glucogenic amino acids, and carnitine concentrations. For various PCs, a decrease in chain length or in saturation of the fatty acid moieties could be observed with increasing FFMI, as well as an overall shift from acyl-alkyl PCs to diacyl PCs. These findings were reproduced in KORA F4. The established GGMs supported the regression results and provided a comprehensive picture of the relationships between metabolites. In a sub-analysis, most of the discovered associations did not exist in obese subjects in contrast to non-obese subjects, possibly indicating derangements in skeletal muscle metabolism. Conclusion: A set of serum metabolites strongly associated with FFMI was identified and a network explaining the relationships among metabolites was established.
These results offer a novel and more complete picture of the FFMI effects on serum metabolites in a data-driven network.
Scientific Article
Konrad, M. ; Vyleta, M.L. ; Theis, F.J. ; Stock, M. ; Tragust, S. ; Klatt, M. ; Drescher, V. ; Marr, C. ; Ugelvig, L.V. ; Cremer, S.
PLoS Biol. 10:e1001300 (2012)
Due to the omnipresent risk of epidemics, insect societies have evolved sophisticated disease defences at the individual and colony level. An intriguing yet little understood phenomenon is that social contact to pathogen-exposed individuals reduces susceptibility of previously naive nestmates to this pathogen. We tested whether such social immunisation in Lasius ants against the entomopathogenic fungus Metarhizium anisopliae is based on active upregulation of the immune system of nestmates following contact to an infectious individual or passive protection via transfer of immune effectors among group members--that is, active versus passive immunisation. We found no evidence for involvement of passive immunisation via transfer of antimicrobials among colony members. Instead, intensive allogrooming behaviour between naive and pathogen-exposed ants before fungal conidia firmly attached to their cuticle suggested passage of the pathogen from the exposed individuals to their nestmates. By tracing fluorescence-labelled conidia we indeed detected frequent pathogen transfer to the nestmates, where they caused low-level infections as revealed by growth of small numbers of fungal colony forming units from their dissected body content. These infections rarely led to death, but instead promoted an enhanced ability to inhibit fungal growth and an active upregulation of immune genes involved in antifungal defences (defensin and prophenoloxidase, PPO). Contrarily, there was no upregulation of the gene cathepsin L, which is associated with antibacterial and antiviral defences, and we found no increased antibacterial activity of nestmates of fungus-exposed ants. This indicates that social immunisation after fungal exposure is specific, similar to recent findings for individual-level immune priming in invertebrates. 
Epidemiological modeling further suggests that active social immunisation is adaptive, as it leads to faster elimination of the disease and lower death rates than passive immunisation. Interestingly, humans have also utilised the protective effect of low-level infections to fight smallpox by intentional transfer of low pathogen doses ("variolation" or "inoculation").
Scientific Article
Strasser, M. ; Theis, F.J. ; Marr, C.
Biophys. J. 102, 19-29 (2012)
A toggle switch consists of two genes that mutually repress each other. This regulatory motif is active during cell differentiation and is thought to act as a memory device, being able to choose and maintain cell fate decisions. Commonly, this switch has been modeled in a deterministic framework where transcription and translation are lumped together. In this description, bistability occurs for transcription factor cooperativity, whereas autoactivation leads to a tristable system with an additional undecided state. In this contribution, we study the stability and dynamics of a two-stage gene expression switch within a probabilistic framework inspired by the properties of the Pu/Gata toggle switch in myeloid progenitor cells. We focus on low mRNA numbers, high protein abundance, and monomeric transcription-factor binding. Contrary to the expectation from a deterministic description, this switch shows complex multiattractor dynamics without autoactivation and cooperativity. Most importantly, the four attractors of the system, which only emerge in a probabilistic two-stage description, can be identified with committed and primed states in cell differentiation. To begin, we study the dynamics of the system and infer the mechanisms that move the system between attractors using both the quasipotential and the probability flux of the system. Next, we show that the residence times of the system in one of the committed attractors are geometrically distributed. We derive an analytical expression for the parameter of the geometric distribution, therefore completely describing the statistics of the switching process and elucidate the influence of the system parameters on the residence time. Moreover, we find that the mean residence time increases linearly with the mean protein level. This scaling also holds for a one-stage scenario and for autoactivation. 
Finally, we study the implications of this distribution for the stability of a switch and discuss the influence of the stability on a specific cell differentiation mechanism. Our model explains lineage priming and proposes the need of either high protein numbers or long-term modifications such as chromatin remodeling to achieve stable cell fate decisions. Notably, we present a system with high protein abundance that nevertheless requires a probabilistic description to exhibit multistability, complex switching dynamics, and lineage priming.
Scientific Article
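The probabilistic view of a mutually repressing gene pair described in the abstract above can be illustrated with a small stochastic simulation. The following is only a minimal sketch, not the authors' model: it uses a simplified one-stage (protein-only) description with non-cooperative repression, simulated with Gillespie's direct method. The function name `simulate_toggle` and the parameters `alpha`, `k`, and `delta` are hypothetical and chosen purely for illustration.

```python
import random


def simulate_toggle(t_end=200.0, alpha=20.0, k=10.0, delta=1.0, seed=0):
    """Gillespie simulation of a simplified one-stage toggle switch.

    Hypothetical parameters (assumptions, not taken from the paper):
      alpha: maximal production rate of each protein
      k:     repression constant (non-cooperative, Hill coefficient 1)
      delta: per-molecule degradation rate
    Returns a list of (time, count_a, count_b) tuples.
    """
    rng = random.Random(seed)
    t, a, b = 0.0, 0, 0
    trajectory = [(t, a, b)]
    while t < t_end:
        # Reaction propensities in the current state.
        rates = [
            alpha / (1.0 + b / k),  # production of A, repressed by B
            alpha / (1.0 + a / k),  # production of B, repressed by A
            delta * a,              # degradation of A
            delta * b,              # degradation of B
        ]
        total = sum(rates)
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(total)
        # Choose which reaction fires, proportional to its propensity.
        r = rng.random() * total
        if r < rates[0]:
            a += 1
        elif r < rates[0] + rates[1]:
            b += 1
        elif r < rates[0] + rates[1] + rates[2]:
            a -= 1
        else:
            b -= 1
        trajectory.append((t, a, b))
    return trajectory
```

Recording how long the trajectory stays in a state with `a > b` before flipping to `a < b` would give an empirical residence-time distribution. Reproducing the multi-attractor behaviour reported in the abstract would additionally require the two-stage (mRNA and protein) description, which this sketch deliberately omits.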
Krug, S. ; Kastenmüller, G. ; Stückler, F. ; Rist, M.J. ; Skurk, T. ; Sailer, M. ; Raffler, J. ; Römisch-Margl, W. ; Adamski, J. ; Prehn, C. ; Frank, T. ; Engel, K.-H. ; Hofmann, T. ; Luy, B. ; Zimmermann, R. ; Moritz, F. ; Schmitt-Kopplin, P. ; Krumsiek, J. ; Kremer, W. ; Huber, F. ; Oeh, U. ; Theis, F.J. ; Szymczak, W. ; Hauner, H. ; Suhre, K. ; Daniel, H.
FASEB J. 26, 2607-2619 (2012)
Metabolic challenge protocols, such as the oral glucose tolerance test, can uncover early alterations in metabolism preceding chronic diseases. Nevertheless, most metabolomics data accessible today reflect the fasting state. To analyze the dynamics of the human metabolome in response to environmental stimuli, we submitted 15 young healthy male volunteers to a highly controlled 4 d challenge protocol, including 36 h fasting, oral glucose and lipid tests, liquid test meals, physical exercise, and cold stress. Blood, urine, exhaled air, and breath condensate samples were analyzed on up to 56 time points by MS- and NMR-based methods, yielding 275 metabolic traits with a focus on lipids and amino acids. Here, we show that physiological challenges increased interindividual variation even in phenotypically similar volunteers, revealing metabotypes not observable in baseline metabolite profiles; volunteer-specific metabolite concentrations were consistently reflected in various biofluids; and readouts from a systematic model of beta-oxidation (e.g., acetylcarnitine/palmitylcarnitine ratio) showed significant and stronger associations with physiological parameters (e.g., fat mass) than absolute metabolite concentrations, indicating that systematic models may aid in understanding individual challenge responses. Due to the multitude of analytical methods, challenges and sample types, our freely available metabolomics data set provides a unique reference for future metabolomics studies and for verification of systems biology models.
Scientific Article
Meyer-Bäse, A. ; Cappendijk, S. ; Theis, F.J.
In: Arabnia, H.R.* ; Tran, Q.-N.* [Eds.]: Proceedings (2011 International Conference on Bioinformatics & Computational Biology (BIOCOMP '11), 18-21 July 2011, Las Vegas, USA). Athens, GA, USA: CSREA Press, 2012. 751-759
The complexity of gene regulatory networks described by coupled nonlinear differential equations is often an obstacle to analysis. These networks are prone to internal parametric fluctuations, which makes robustness a crucial property for attenuating the effects of such fluctuations. Therefore, the development of effective model reduction techniques for uncertain biological systems is of paramount importance in the field of systems biology. In this paper, we apply a Gramian-based approach to model reduction for gene regulatory networks that relies only on finding generalized Gramians and on standard matrix transformations. The method finds a generalized controllability and observability Gramian of the uncertain system and then uses a state transformation matrix to obtain a reduced-order representation. Under the assumption that the structured uncertainties are norm-bounded, we can prove that the reduced-order balanced system is also stable.
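For orientation, the Gramian-based reduction mentioned in the abstract above can be related to the textbook case of a nominal, stable linear system without uncertainty. The block below is only a sketch of that standard setting, not the paper's derivation: the symbols A, B, C, W_c, W_o, T, and Σ are the usual ones from balanced truncation and are assumptions here. Generalized Gramians for uncertain systems are typically obtained by relaxing the two Lyapunov equalities to matrix inequalities.

```latex
\begin{aligned}
\dot{x} &= A x + B u, \qquad y = C x, \\
A W_c + W_c A^{\top} + B B^{\top} &= 0
  && \text{(controllability Gramian } W_c\text{)}, \\
A^{\top} W_o + W_o A + C^{\top} C &= 0
  && \text{(observability Gramian } W_o\text{)}, \\
T W_c T^{\top} = T^{-\top} W_o T^{-1} &= \Sigma
  = \operatorname{diag}(\sigma_1, \dots, \sigma_n),
  && \sigma_1 \ge \dots \ge \sigma_n > 0 .
\end{aligned}
```

In the balanced coordinates z = T x, the states belonging to the smallest values σ_i are both hard to reach and hard to observe, so truncating them yields the reduced-order model.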
Lutter, D. ; Bruns, P. ; Theis, F.J.
In: Goryanin, I.I.* ; Goryachev, A.B.* [Eds.]: Advances in Systems Biology. New York: Springer, 2012. 247-260 (Adv. Exp. Med. Biol. ; 736)
The process of differentiation of embryonic stem cells (ESCs) is currently becoming the focus of many systems biologists, not only due to mechanistic interest but also since it is expected to play an increasingly important role in regenerative medicine, in particular with the advent of induced pluripotent stem cells. These ESCs give rise to the formation of the three germ layers and therefore to the formation of all tissues and organs. Here, we present a computational method for inferring regulatory interactions between the genes involved in ESC differentiation based on time-resolved microarray profiles. Fully quantitative methods are commonly unavailable on such large-scale data; on the other hand, purely qualitative methods may fail to capture some of the more detailed regulations. Our method combines the beneficial aspects of qualitative and quantitative (ODE-based) modeling approaches, searching for quantitative interaction coefficients in a discrete and qualitative state space. We further optimize on an ensemble of networks to detect essential properties and compare networks with respect to robustness. Applied to a toy model, our method is able to reconstruct the original network and outperforms an entirely discrete Boolean approach. In particular, we show that including prior knowledge leads to more accurate results. Applied to data from differentiating mouse ESCs, the method reveals new regulatory interactions; in particular, we confirm the activation of Foxh1 through Oct4, mediating Nodal signaling.
Grady, D. ; Brune, R. ; Thiemann, C. ; Theis, F.J. ; Brockmann, D.
In: Thai, M.T.* ; Pardalos, P.M.* [Eds.]: Handbook of Optimization in Complex Networks. Theory and Applications. New York, NY: Springer, 2012. 169-208 (Optimization and Its Applications ; 57)
Territorial subdivisions and geographic borders are essential for understanding phenomena in sociology, political science, history, and economics. They influence the interregional flow of information and cross-border trade and affect the diffusion of innovation and technology. However, most existing administrative borders were determined by a variety of historic and political circumstances along with some degree of arbitrariness. Societies have changed drastically, and it is doubtful that currently existing borders reflect the most logical divisions. Fortunately, at this point in history we are in a position to actually measure some aspects of the geographic structure of society through human mobility. Large-scale transportation systems such as trains and airlines provide data about the number of people traveling between geographic locations, and many promising human mobility proxies are being discovered, such as cell phones, bank notes, and various online social networks. In this chapter we apply two optimization techniques to a human mobility proxy (bank note circulation) to investigate the effective geographic borders that emerge from a direct analysis of human mobility.
Burtscher, I. ; Barkey, W. ; Schwarzfischer, M. ; Theis, F.J. ; Lickert, H.
Genesis 50, 496-505 (2012)
Sox17 is a HMG-box transcription factor that has been shown to play important roles in both cardio-vascular development and endoderm formation. To analyze these processes in greater detail, we have generated a Sox17-mCherry fusion (SCF) protein by gene targeting in ES cells. SCF reporter mice are homozygous viable and faithfully reflect the endogenous Sox17 protein localization. We report that SCF positive cells constitute a subpopulation in the visceral endoderm before gastrulation and time-lapse imaging reveals that SCF monitors the nascent definitive endoderm during epithelialisation. After gastrulation, SCF marks the mid- and hindgut endoderm and vascular endothelial network, which can be imaged during establishment in allantois explant cultures. The SCF reporter is downregulated in the endoderm epithelium and upregulated in endothelial cells of the intestine, lung and pancreas during organogenesis. In summary, the generation of the Sox17-mCherry reporter mouse line allows direct visualization of endoderm and vascular development in culture and the mouse embryo.
Scientific Article
2011
Vincent, E. ; Araki, S. ; Theis, F.J. ; Nolte, G. ; Bofill, P. ; Sawada, H. ; Ozerov, A. ; Gowreesunker, V. ; Lutter, D. ; Duong, N.Q.K.
Signal Process. 92, 1928-1936 (2011)
We present the outcomes of three recent evaluation campaigns in the field of audio and biomedical source separation. These campaigns have witnessed a boom in the range of applications of source separation systems in the last few years, as shown by the increasing number of datasets from 1 to 9 and the increasing number of submissions from 15 to 34. We first discuss their impact on the definition of a reference evaluation methodology, together with shared datasets and software. We then present the key results obtained over almost all datasets. We conclude by proposing directions for future research and evaluation, based in particular on the ideas raised during the related panel discussion at the Ninth International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA 2010).
Scientific Article
Meyer-Bäse, A. ; Plant, C. ; Krumsiek, J. ; Theis, F.J. ; Emmett, M. ; Conrad, C.A.
In: Arabnia, H.R.* ; Tran, Q.-N.* [Eds.]: Proceedings (The 2011 International Conference on Bioinformatics and Computational Biology). Elsevier, 2011. (Proceedings of BIOCOMP'11)
Lang, E.W. ; Schachtner, R. ; Lutter, D. ; Herold, D. ; Kodewitz, A. ; Blöchl, F. ; Theis, F.J. ; Keck, I.R. ; Górriz Sáez, J.M. ; Gomez, P. ; Gómez Vilda, P. ; Tomé, A.M.
In: Górriz, J.M.* ; Lang, E.W.* ; Ramirez, J.* [Eds.]: Recent Advances in Biomedical Signal Processing. Sharjah: Bentham Science, 2011. 26-47
Exploratory matrix factorization (EMF) techniques applied to two-way or multi-way biomedical data arrays provide new and efficient analysis tools which are currently explored to analyze large scale data sets like gene expression profiles (GEP) measured on microarrays, lipidomic or metabolomic profiles acquired by mass spectrometry (MS) and/or high performance liquid chromatography (HPLC) as well as biomedical images acquired with functional imaging techniques like functional magnetic resonance imaging (fMRI) or positron emission tomography (PET). Exploratory feature extraction techniques like, for example, Principal Component Analysis (PCA), Independent Component Analysis (ICA) or sparse Nonnegative Matrix Factorization (NMF) yield uncorrelated, statistically independent or sparsely encoded and strictly non-negative features which in case of GEPs are called eigenarrays (PCA), expression modes (ICA) or metagenes (NMF). They represent features which characterize the data sets under study and are generally considered indicative of underlying regulatory processes or functional networks and also serve as discriminative features for classification purposes. In the latter case, EMF techniques, when combined with diagnostic a priori knowledge, can directly be applied to the classification of biomedical data sets by grouping samples into different categories for diagnostic purposes or group genes, lipids, metabolic species or activity patches into functional categories for further investigation of related metabolic pathways and regulatory or functional networks. Although these techniques can be applied to large scale data sets in general, the following discussion will primarily focus on applications to microarray data sets and PET images.
Theis, F.J. ; Kawanabe, M. ; Müller, K.-R.
IEEE Trans. Signal Process. 59, 4478-4482 (2011)
Dimension reduction is a key step in preprocessing large-scale data sets. A recently proposed method named non-Gaussian component analysis searches for a projection onto the non-Gaussian part of a given multivariate recording, which is a generalization of the deflationary projection pursuit model. In this contribution, we discuss the uniqueness of the subspaces of such a projection. We prove that a necessary and sufficient condition for uniqueness is that the non-Gaussian signal subspace is of minimal dimension. Furthermore, we propose a measure for estimating this minimal dimension and illustrate it by numerical simulations. Our result guarantees that projection algorithms uniquely recover the underlying lower dimensional data signals.
Scientific Article
Breindl, C. ; Waldherr, S. ; Wittmann, D.M. ; Theis, F.J. ; Allgoewer, F.
Int. J. Robust Nonlin. 21, 1742-1758 (2011)
In this paper, we define a robustness measure for gene regulation networks, which allows one to quantify how well a given model structure can reproduce a desired steady-state pattern in the absence of detailed knowledge about the kinetic mechanisms and parameters. To develop this measure, a modeling framework is introduced, which is able to represent the qualitative knowledge typically available for gene regulation networks. With this framework, the robustness measure as well as tools for its efficient computation are developed. The benefit of our method is twofold: On the one hand, it allows the robustness properties of different model structures to be compared and thus may help modelers to decide which model is biologically more plausible. On the other hand, the most fragile interconnections within a network can be detected. To demonstrate its use, the new method is applied to various models of a gene regulation network, which is responsible for the maintenance of the mid-hindbrain boundary. We find that for this example system, more weakly connected networks are more robust.
Scientific Article
Müller, N.S. ; Krumsiek, J. ; Theis, F.J. ; Böhm, C. ; Meyer-Bäse, A.
Proc. SPIE 8058:805819 (2011)
Advances in high-throughput measurements of biological specimens necessitate the development of biologically driven computational techniques. To understand the molecular level of many human diseases, such as cancer, lipid quantifications have been shown to offer an excellent opportunity to reveal disease-specific regulations. The data analysis of the cell lipidome, however, remains a challenging task and cannot be accomplished solely based on intuitive reasoning. We have developed a method to identify a lipid correlation network which is entirely disease-specific. A powerful method to correlate experimentally measured lipid levels across the various samples is a Gaussian Graphical Model (GGM), which is based on partial correlation coefficients. In contrast to regular Pearson correlations, partial correlations aim to identify only direct correlations while eliminating indirect associations. Conventional GGM calculations on the entire dataset cannot, however, provide information on whether a correlation is truly specific to the disease samples rather than a correlation of control samples. Thus, we implemented a novel differential GGM approach unraveling only the disease-specific correlations, and applied it to the lipidome of immortal Glioblastoma tumor cells. A large set of lipid species were measured by mass spectrometry in order to evaluate lipid remodeling as a result of a combination of perturbations inducing programmed cell death, while the other perturbations served solely as biological controls. With the differential GGM, we were able to reveal Glioblastoma-specific lipid correlations to advance biomedical research on novel gene therapies.
Scientific Article
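The distinction between Pearson and partial correlations drawn in the abstract above can be made concrete with a three-variable toy example. The sketch below is not the authors' differential GGM. It only applies the standard first-order partial correlation formula to simulated data from a chain x -> z -> y, where x and y are correlated solely through z. All function names and the simulated data are hypothetical.

```python
import math
import random


def pearson(u, v):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_chain(n=2000, seed=1):
    """Simulated toy data (an assumption): x drives z, and z drives y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate_chain()
    print("Pearson r(x, y):     ", round(pearson(x, y), 3))
    print("Partial r(x, y | z): ", round(partial_corr(x, y, z), 3))
```

In this construction the Pearson correlation between x and y is clearly positive, while the partial correlation given z is close to zero, which is why a GGM would draw no direct edge between x and y. For more than three variables, partial correlations are usually obtained from the inverse of the covariance matrix rather than from this pairwise formula.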
Schwarzfischer, M. ; Marr, C. ; Krumsiek, J. ; Hoppe, P.S. ; Schroeder, T. ; Theis, F.J.
In: Proceedings (Microscopic Image Analysis with Applications in Biology, 2nd September 2011, Heidelberg, Germany). Heidelberg: MIAAB, 2011. http://www.miaab.org/miaab-2011-
In the last few years, single-cell time-lapse fluorescence microscopy has emerged as a key technology in the toolbox of experimental life science. Imaging fluorescently tagged proteins allows one to combine future information of cellular progeny with time-resolved protein dynamics. Whenever quantitative data on the intensity of the fluorescent signal is required, a careful image processing pipeline has to be applied to account for uneven illumination, background signal, varying illumination strength or photobleaching. Previous approaches commonly used an additional calibration step to infer such image characteristics by imaging fluorescent dilutions like fluorescein. Here, we describe a method to infer a time-dependent background signal and the image gain without the use of additional fluorescent substances; instead, we use the information contained in the bleaching background of the fluorescence time-lapse movie itself. First, we tile the full image into small sub-images and determine background tiles by clustering the statistical moments of the individual intensity distributions. For each image, we interpolate the full background from the identified tiles and thus reconstitute the time-dependent background image. Second, we estimate the time-independent image gain from the background tiles of all pixels and all timepoints. We are thus able to correct for a bleaching background and an uneven illumination of the experimental setup. We show the applicability of our method by comparing the intensities of fluorescent beads derived from time-lapse microscopy with intensities inferred from FACS analysis. In summary, our normalization method accurately corrects for fluorescence image issues and decreases the necessary experimental work.
Blöchl, F. ; Theis, F.J. ; Vega-Redondo, F. ; Fisher, E.
Phys. Rev. E 83:046127 (2011)
Input-output tables describe the flows of goods and services between the sectors of an economy. These tables can be interpreted as weighted directed networks. At the usual level of aggregation, they contain nodes with strong self-loops and are almost completely connected. We derive two measures of node centrality that are well suited for such networks. Both are based on random walks and have interpretations as the propagation of supply shocks through the economy. Random walk centrality reveals the vertices most immediately affected by a shock. Counting betweenness identifies the nodes where a shock lingers longest. The two measures differ in how they treat self-loops. We apply both to data from a wide set of countries and uncover salient characteristics of the structures of these national economies. We further validate our indices by clustering according to sectors’ centralities. This analysis reveals geographical proximity and similar developmental status.
Scientific Article
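The random-walk idea behind the centrality measures in the abstract above can be sketched on a tiny weighted directed network with self-loops. The code below is an illustrative assumption rather than the paper's exact definition: it scores a node by the inverse of the mean first-passage time a random walker needs to reach it, averaged over all other starting nodes, and it keeps self-loops as ordinary transitions. The function names and the example weights are hypothetical, and every node is assumed to have a positive total outflow.

```python
def transition_matrix(weights):
    """Row-normalise a weight matrix (row = source, column = destination)."""
    return [[w / sum(row) for w in row] for row in weights]


def mean_first_passage_times(P, target, tol=1e-12, max_iter=100000):
    """Expected number of steps to first reach `target` from every node.

    Solves h_i = 1 + sum_{k != target} P[i][k] * h_k by fixed-point
    iteration, with h_target = 0.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(max_iter):
        new_h = [
            0.0 if i == target
            else 1.0 + sum(P[i][k] * h[k] for k in range(n) if k != target)
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_h, h)) < tol:
            return new_h
        h = new_h
    raise RuntimeError("first-passage iteration did not converge")


def random_walk_centrality(weights):
    """Inverse of the average first-passage time to each node."""
    P = transition_matrix(weights)
    n = len(P)
    scores = []
    for target in range(n):
        h = mean_first_passage_times(P, target)
        scores.append((n - 1) / sum(h))  # h[target] is 0, so it adds nothing
    return scores


if __name__ == "__main__":
    # Hypothetical 3-sector flow table: sectors 1 and 2 ship mostly to sector 0.
    flows = [[1, 1, 1],
             [2, 1, 0],
             [2, 0, 1]]
    print(random_walk_centrality(flows))
```

In the example, sector 0 is reached quickly from both other sectors and therefore receives the highest score. For larger tables, the fixed-point iteration would normally be replaced by a direct linear solve.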
Krumsiek, J. ; Wittmann, D.M. ; Theis, F.J.
Applications of MATLAB in Science and Engineering. Rijeka: InTech, 2011. 35-60
no abstract
Kowarsch, A. ; Preusse, M. ; Marr, C. ; Theis, F.J.
RNA 17, 809-819 (2011)
MicroRNAs (miRNAs) are an important class of post-transcriptional regulators of gene expression that are involved in various cellular and phenotypic processes. A number of studies have shown that miRNA expression is induced by signaling pathways. Moreover, miRNAs emerge as regulators of signaling pathways. Here, we present the miTALOS web resource, which provides insight into miRNA-mediated regulation of signaling pathways. As a novel feature, miTALOS considers the tissue-specific expression signatures of miRNAs and target transcripts to improve the analysis of miRNA regulation in biological pathways. MiTALOS identifies potential pathway regulation by (i) an enrichment analysis of miRNA target genes and (ii) a proximity score that evaluates the functional role of miRNAs in biological pathways by their network proximity. Moreover, miTALOS integrates five different miRNA target prediction tools and two different signaling pathway resources (KEGG and NCI). A graphical visualization of miRNA targets in both KEGG and NCI PID signaling pathways is provided to illustrate their respective pathway context. We perform a functional analysis on prostate cancer-related miRNAs and are able to infer a model of miRNA-mediated regulation on tumor proliferation, mobility and anti-apoptotic behavior. miTALOS provides novel features that substantially support the systematic inference of miRNA-mediated regulation of signaling pathways.
Wissenschaftlicher Artikel
Scientific Article
Theis, F.J. ; Latif, N. ; Wong, P. ; Frishman, D.
Mol. Biol. Evol. 28, 2501-2512 (2011)
A quickly growing number of characteristics reflecting various aspects of gene function and evolution can be either measured experimentally or computed from DNA and protein sequences. The study of pairwise correlations between such quantitative genomic variables as well as collective analysis of their interrelations by multidimensional methods have delivered crucial insights into the processes of molecular evolution. Here, we present a principal component analysis (PCA) of 16 genomic variables from Saccharomyces cerevisiae, the largest data set analyzed so far. Because many missing values and potential outliers hinder the direct calculation of principal components, we introduce the application of Bayesian PCA. We confirm some of the previously established correlations, such as evolutionary rate versus protein expression, and reveal new correlations such as those between translational efficiency, phosphorylation density, and protein age. Although the first principal component primarily contrasts genomic change and protein expression, the second component separates variables related to gene existence and expressed protein functions. Enrichment analysis on genes affecting variable correlations unveils classes of influential genes. For example, although ribosomal and nuclear transport genes make important contributions to the correlation between protein isoelectric point and molecular weight, protein synthesis and amino acid metabolism genes help cause the lack of significant correlation between propensity for gene loss and protein age. We present the novel Quagmire database (Quantitative Genomics Resource), which allows exploring relationships between more genomic variables in three model organisms: Escherichia coli, S. cerevisiae, and Homo sapiens.
Scientific Article
Wittmann, D.M. ; Theis, F.J.
New J. Phys. 13:013041 (2011)
Random multistate networks, generalizations of the Boolean Kauffman networks, are generic models for complex systems of interacting agents. Depending on their mean connectivity, these networks exhibit ordered as well as chaotic behavior with a critical boundary separating both regimes. Typically, the nodes of these networks are assigned single discrete states. Here, we describe nodes by fuzzy numbers, i.e. vectors of degree-of-membership (DOM) functions specifying the degree to which the nodes are in each of their discrete states. This allows our models to deal with imprecision and uncertainties. Compatible update rules are constructed by expressing the update rules of the multistate network in terms of Boolean operators and generalizing them to fuzzy logic (FL) operators. The standard choice for these generalizations is the Gödel FL, where AND and OR are replaced by the minimum and maximum of two DOMs, respectively. In mean-field approximations we are able to analytically describe the percolation and asymptotic distribution of DOMs in random Gödel FL networks. This allows us to characterize the different dynamic regimes of random multistate networks in terms of FL. In a low-dimensional example, we provide explicit computations and validate our mean-field results by showing that they agree well with network simulations.
Scientific Article
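The Gödel fuzzy-logic operators named in the abstract above can be illustrated with a minimal Python sketch. It is a simplification made for illustration only: each node carries a single degree of membership (DOM) in [0, 1] rather than a vector of DOMs, and the update rule `update_c` is a hypothetical example, not one taken from the paper.

```python
def godel_and(a: float, b: float) -> float:
    """Goedel fuzzy AND: the minimum of two degrees of membership."""
    return min(a, b)


def godel_or(a: float, b: float) -> float:
    """Goedel fuzzy OR: the maximum of two degrees of membership."""
    return max(a, b)


def update_c(a: float, b: float, d: float) -> float:
    """Hypothetical update rule C = (A AND B) OR D, generalized to DOMs."""
    return godel_or(godel_and(a, b), d)


# Fuzzy inputs: min(0.7, 0.4) = 0.4, then max(0.4, 0.2) = 0.4
print(update_c(0.7, 0.4, 0.2))
# Crisp inputs reproduce the Boolean rule: (1 AND 0) OR 1 = 1
print(update_c(1.0, 0.0, 1.0))
```

On crisp inputs (0 or 1) the fuzzy rule coincides with its Boolean counterpart, which is what makes the generalization compatible with the original update rules.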
Gutch, H.W. ; Krumsiek, J. ; Theis, F.J.
In: Eur. Signal Process. Conf (19th European Signal Processing Conference, 29 August - 2 September 2011, Barcelona). Barcelona: EUSIPCO, 2011. 1733-1737
Independent Subspace Analysis (ISA) denotes the task of linearly separating multivariate observations into statistically independent multi-dimensional sources, where dependencies only exist within these subspaces but not between them. So far ISA algorithms have mostly been described in the context of known group sizes. Here, we extend a previously proposed ISA algorithm based on joint block diagonalization of fourth-order cumulant matrices to separate subspaces of unknown sizes. Further automated interpretation of the demixed sources then requires a means of recovering the subspace structure within them, and we propose two distinct methods for this. We then apply the method to a novel application field, namely clustering of metabolites, which seems to be well suited to the ISA model. We are able to successfully identify dependencies between metabolites that could not be recovered using conventional methods.
Sass, S. ; Dietmann, S. ; Burk, U.C. ; Brabletz, S. ; Lutter, D. ; Kowarsch, A. ; Mayer, K.F.X. ; Brabletz, T. ; Ruepp, A. ; Theis, F.J. ; Wang, Y.
BMC Syst. Biol. 5:136 (2011)
BACKGROUND: In animals, microRNAs (miRNAs) regulate the protein synthesis of their target messenger RNAs (mRNAs) by either translational repression or deadenylation. miRNAs are frequently found to be co-expressed in different tissues and cell types, while some form polycistronic clusters on genomes. Interactions between targets of co-expressed miRNAs (including miRNA clusters) have not yet been systematically investigated. RESULTS: Here we integrated information from predicted and experimentally verified miRNA targets to characterize protein complex networks regulated by human miRNAs. We found striking evidence that individual miRNAs or co-expressed miRNAs frequently target several components of protein complexes. We experimentally verified that the miR-141-200c cluster targets different components of the CtBP/ZEB complex, suggesting a potential orchestrated regulation in epithelial to mesenchymal transition. CONCLUSIONS: Our findings indicate a coordinate posttranscriptional regulation of protein complexes by miRNAs. These provide a sound basis for designing experiments to study miRNA function at a systems level.
Scientific Article
Hennig, H. ; Fleischmann, R. ; Fredebohm, A. ; Hagmayer, Y. ; Nagler, J. ; Witt, A. ; Theis, F.J. ; Geisel, T.
PLoS ONE 6:e26457 (2011)
Although human musical performances represent one of the most valuable achievements of mankind, the best musicians perform imperfectly. Musical rhythms are not entirely accurate and thus inevitably deviate from the ideal beat pattern. Nevertheless, computer generated perfect beat patterns are frequently devalued by listeners due to a perceived lack of human touch. Professional audio editing software therefore offers a humanizing feature which artificially generates rhythmic fluctuations. However, the built-in humanizing units are essentially random number generators producing only simple uncorrelated fluctuations. Here, for the first time, we establish long-range fluctuations as an inevitable natural companion of both simple and complex human rhythmic performances. Moreover, we demonstrate that listeners strongly prefer long-range correlated fluctuations in musical rhythms. Thus, the favorable fluctuation type for humanizing interbeat intervals coincides with the one generically inherent in human musical performances.
Scientific Article
Krumsiek, J. ; Marr, C. ; Schroeder, T. ; Theis, F.J.
PLoS ONE 6:e22649 (2011)
Hematopoiesis is an ideal model system for stem cell biology with advanced experimental access. A systems view on the interactions of core transcription factors is important for understanding differentiation mechanisms and dynamics. In this manuscript, we construct a Boolean network to model myeloid differentiation, specifically from common myeloid progenitors to megakaryocytes, erythrocytes, granulocytes and monocytes. By interpreting the hematopoietic literature and translating experimental evidence into Boolean rules, we implement binary dynamics on the resulting 11-factor regulatory network. Our network contains interesting functional modules and a concatenation of mutual antagonistic pairs. The state space of our model is a hierarchical, acyclic graph, typifying the principles of myeloid differentiation. We observe excellent agreement between the steady states of our model and microarray expression profiles of two different studies. Moreover, perturbations of the network topology correctly reproduce reported knockout phenotypes in silico. We predict previously uncharacterized regulatory interactions and alterations of the differentiation process, and outline reprogramming strategies.
Scientific Article
Mittelstraß, K. ; Ried, J.S. ; Yu, Z. ; Krumsiek, J. ; Gieger, C. ; Prehn, C. ; Römisch-Margl, W. ; Polonikov, A. ; Peters, A. ; Theis, F.J. ; Meitinger, T. ; Kronenberg, F. ; Weidinger, S. ; Wichmann, H.-E. ; Suhre, K. ; Wang-Sattler, R. ; Adamski, J. ; Illig, T.
PLoS Genet. 7:e1002215 (2011)
Metabolomic profiling and the integration of whole-genome genetic association data have proven to be a powerful tool to comprehensively explore gene regulatory networks and to investigate the effects of genetic variation at the molecular level. Serum metabolite concentrations allow a direct readout of biological processes, and association of specific metabolomic signatures with complex diseases such as Alzheimer's disease and cardiovascular and metabolic disorders has been shown. There are well-known correlations between sex and the incidence, prevalence, age of onset, symptoms, and severity of a disease, as well as the reaction to drugs. However, most of the studies published so far did not consider the role of sexual dimorphism and did not analyse their data stratified by gender. This study investigated sex-specific differences of serum metabolite concentrations and their underlying genetic determination. For discovery and replication we used more than 3,300 independent individuals from KORA F3 and F4 with metabolite measurements of 131 metabolites, including amino acids, phosphatidylcholines, sphingomyelins, acylcarnitines, and C6-sugars. A linear regression approach revealed significant concentration differences between males and females for 102 out of 131 metabolites (p-values < 3.8×10⁻⁴; Bonferroni-corrected threshold). Sex-specific genome-wide association studies (GWAS) showed genome-wide significant differences in beta-estimates for SNPs in the CPS1 locus (carbamoyl-phosphate synthase 1, significance level: p < 3.8×10⁻¹⁰; Bonferroni-corrected threshold) for glycine. We showed that the metabolite profiles of males and females are significantly different and, furthermore, that specific genetic variants in metabolism-related genes depict sexual dimorphism. Our study provides new important insights into sex-specific differences of cell regulatory processes and underscores that studies should consider sex-specific effects in design and interpretation.
Scientific Article
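The two Bonferroni-corrected thresholds quoted in the abstract above are consistent with dividing a base significance level by the 131 metabolites tested. The short Python sketch below checks this arithmetic. The base levels (0.05 for the regression analysis and 5×10⁻⁸ for genome-wide significance) are conventional values assumed here; the abstract states only the corrected thresholds and the number of metabolites.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


N_METABOLITES = 131  # number of metabolites, as stated in the abstract

# Assumed base levels (not stated in the abstract): 0.05 and 5e-8.
regression = bonferroni_threshold(0.05, N_METABOLITES)
gwas = bonferroni_threshold(5e-8, N_METABOLITES)

print(f"{regression:.1e}")  # 3.8e-04
print(f"{gwas:.1e}")        # 3.8e-10
```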
Laaser, I. ; Theis, F.J. ; Hrabě de Angelis, M. ; Kolb, H.-J. ; Adamski, J.
OMICS 15, 141-154 (2011)
Over 90% of human genes produce more than one mRNA by alternative splicing (AS). Human UTY (ubiquitously transcribed tetratricopeptide repeat protein on the chromosome Y) has six mRNA transcripts. UTY is subject to interdisciplinary approaches such as Y chromosomal genetics or development of leukemia immunotherapy based on UTY-specific peptides. Investigating UTY expression in a normal and leukemic setting, we discovered an exceptional splicing phenomenon fostering huge transcript diversity. Transcript sequencing identified 90 novel AS events that are almost randomly combined in 284 new transcripts. We uncovered a novel system of transcript architecture and genomic organization in UTY. On the basis of a new UTY-splicing multigraph including a mathematical model, we calculated the theoretical yield to exceed 1.3 billion distinct transcripts. To our knowledge, this is the greatest estimated transcript diversity by AS. At the protein level, we demonstrated interaction of AS-derived proteins with new interactors by yeast two-hybrid assay. For translational research we predicted new UTY-peptide candidates for leukemia therapy development. Our study provides new insights into the complexity of human alternative splicing and its potential contribution to the transcript diversity of the transcriptome.
Scientific Article
Krumsiek, J. ; Suhre, K. ; Illig, T. ; Adamski, J. ; Theis, F.J.
BMC Syst. Biol. 5:21 (2011)
BACKGROUND: With the advent of high-throughput targeted metabolic profiling techniques, the question of how to interpret and analyze the resulting vast amount of data becomes more and more important. In this work we address the reconstruction of metabolic reactions from cross-sectional metabolomics data, that is without the requirement for time-resolved measurements or specific system perturbations. Previous studies in this area mainly focused on Pearson correlation coefficients, which however are generally incapable of distinguishing between direct and indirect metabolic interactions. RESULTS: In our new approach we propose the application of a Gaussian graphical model (GGM), an undirected probabilistic graphical model estimating the conditional dependence between variables. GGMs are based on partial correlation coefficients, that is pairwise Pearson correlation coefficients conditioned against the correlation with all other metabolites. We first demonstrate the general validity of the method and its advantages over regular correlation networks with computer-simulated reaction systems. Then we estimate a GGM on data from a large human population cohort, covering 1020 fasting blood serum samples with 151 quantified metabolites. The GGM is much sparser than the correlation network, shows a modular structure with respect to metabolite classes, and is stable to the choice of samples in the data set. On the example of human fatty acid metabolism, we demonstrate for the first time that high partial correlation coefficients generally correspond to known metabolic reactions. This feature is evaluated both manually by investigating specific pairs of high-scoring metabolites, and then systematically on a literature-curated model of fatty acid synthesis and degradation. Our method detects many known reactions along with possibly novel pathway interactions, representing candidates for further experimental examination. 
CONCLUSIONS: In summary, we demonstrate strong signatures of intracellular pathways in blood serum data, and provide a valuable tool for the unbiased reconstruction of metabolic reactions from large-scale metabolomics data sets.
Scientific Article
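The distinction the abstract above draws between Pearson correlations and the partial correlations underlying a Gaussian graphical model can be sketched in Python with NumPy. This is a toy illustration on simulated data, not the authors' pipeline: the three-variable chain x → y → z and all variable names are assumptions made for the example. Partial correlations are obtained from the inverse covariance (precision) matrix P as −P_ij / sqrt(P_ii · P_jj).

```python
import numpy as np


def partial_correlations(data: np.ndarray) -> np.ndarray:
    """Partial correlation matrix for data of shape (samples, variables).

    Each off-diagonal entry is the correlation between two variables
    conditioned on all remaining variables.
    """
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


# Toy chain x -> y -> z: x and z are related only indirectly, via y.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y = x + 0.5 * rng.normal(size=n)
z = y + 0.5 * rng.normal(size=n)
data = np.column_stack([x, y, z])

corr = np.corrcoef(data, rowvar=False)
pcorr = partial_correlations(data)

print("Pearson  x-z:", round(float(corr[0, 2]), 2))   # strongly positive
print("Partial  x-z:", round(float(pcorr[0, 2]), 2))  # close to zero
print("Partial  x-y:", round(float(pcorr[0, 1]), 2))  # remains clearly positive
```

In this toy system the Pearson correlation links x and z even though they interact only through y, whereas the partial correlation keeps the direct edges and suppresses the indirect one.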
Lang, E.W. ; Schachtner, R. ; Lutter, D. ; Herold, D. ; Kodewitz, A. ; Blöchl, F. ; Theis, F.J. ; Keck, I.R. ; Górriz Sáez, J.M. ; Gómez Vilda, P. ; Tomé, A.M.
In: Górriz, J.M.* ; Lang, E.W.* ; Ramirez, J.* [Eds.]: Recent Advances in Biomedical Signal Processing. Oak Park, IL: Bentham Science Publishers, 2011. 26-47
Raia, V. ; Schilling, M. ; Böhm, M. ; Hahn, B. ; Kowarsch, A. ; Raue, A. ; Sticht, C. ; Bohl, S. ; Saile, M. ; Möller, P. ; Gretz, N. ; Timmer, J. ; Theis, F.J. ; Lehmann, W.D. ; Lichtner, P. ; Klingmüller, U.
Cancer Res. 71, 693-704 (2011)
Primary mediastinal B-cell lymphoma (PMBL) and classical Hodgkin lymphoma (cHL) share a frequent constitutive activation of JAK (Janus kinase)/STAT signaling pathway. Because of complex, nonlinear relations within the pathway, key dynamic properties remained to be identified to predict possible strategies for intervention. We report the development of dynamic pathway models based on quantitative data collected on signaling components of JAK/STAT pathway in two lymphoma-derived cell lines, MedB-1 and L1236, representative of PMBL and cHL, respectively. We show that the amounts of STAT5 and STAT6 are higher whereas those of SHP1 are lower in the two lymphoma cell lines than in normal B cells. Distinctively, L1236 cells harbor more JAK2 and fewer SHP1 molecules per cell than MedB-1 or control cells. In both lymphoma cell lines, we observe interleukin-13 (IL13)-induced activation of IL4 receptor α, JAK2, and STAT5, but not of STAT6. Genome-wide, 11 early and 16 sustained genes are upregulated by IL13 in both lymphoma cell lines. Specifically, the known STAT-inducible negative regulators CISH and SOCS3 are upregulated within 2 hours in MedB-1 but not in L1236 cells. On the basis of this detailed quantitative information, we established two mathematical models, MedB-1 and L1236 model, able to describe the respective experimental data. Most of the model parameters are identifiable and therefore the models are predictive. Sensitivity analysis of the model identifies six possible therapeutic targets able to reduce gene expression levels in L1236 cells and three in MedB-1. We experimentally confirm reduction in target gene expression in response to inhibition of STAT5 phosphorylation, thereby validating one of the predicted targets.
Scientific Article
Theis, F.J. ; Bohl, S. ; Klingmüller, U.
Bull. Math. Biol. 73, 978-1003 (2011)
Processing of information by signaling networks is characterized by properties of the induced kinetics of the activated pathway components. The maximal extent of pathway activation (maximum amplitude) and the time-to-peak-response (position) are key determinants of biological responses that have been linked to specific outcomes. We investigate how the maximum amplitude of pathway activation and its position depend on the input and wiring of a signaling network. For this purpose, we consider a simple reaction A → B that is regulated by a transient input and extend it to include back-reaction and additional partners. In particular, we show that a unique maximum of B(t) exists. Moreover, we prove that the position of the maximum is independent of the applied input but regulated by degradation reactions of B. Indeed, the time-to-peak-response decreases with increasing degradation rate, which we prove for small models and show in simulations for more complex ones. The identified dependencies provide insights into design principles that facilitate the realization of dynamical characteristics like constant position of maximal pathway activation and thereby guide the characterization of unknown kinetics within larger protein networks.
Scientific Article
Baskaran, T. ; Blöchl, F. ; Brück, T. ; Theis, F.J.
Int. Rev. Econom. Finance 20, 135-145 (2011)
This paper estimates for 28 product groups a characteristic parameter that reflects the topological structure of its trading network. Using these estimates, it describes how the structure of international trade has evolved during the 1980–2000 period. Thereafter, it demonstrates the importance of networks in international trade by explicitly accounting for their scaling properties when testing the prediction of the “Heckscher–Ohlin” model that factor endowment differentials determine bilateral trade flows. The results suggest that factor endowment differentials increase bilateral trade in goods that are traded in “dispersed” networks. For goods traded in “concentrated” networks, factor endowment differentials are less important.
Scientific Article
Blöchl, F. ; Wittmann, D.M. ; Theis, F.J.
Bull. Math. Biol. 73, 706-725 (2011)
Signaling networks are abundant in higher organisms. They play pivotal roles, e.g., during embryonic development or within the immune system. In this contribution, we study the combined effect of the various kinetic parameters on the dynamics of signal transduction. To this end, we consider hierarchical complex systems as prototypes of signaling networks. For given topology, the output of these networks is determined by an interplay of the single parameters. For different kinetics, we describe this by algebraic expressions, the so-called effective parameters. When modeling switch-like interactions by Heaviside step functions, we obtain these effective parameters recursively from the interaction graph. They can be visualized as directed trees, which allows us to easily determine the global effect of single kinetic parameters on the system's behavior. We provide evidence that these results generalize to sigmoidal Hill kinetics. In the case of linear activation functions, we again show that the algebraic expressions can be immediately inferred from the topology of the interaction network. This allows us to transform time-consuming analytic solutions of differential equations into a simple graph-theoretic problem. In this context, we also discuss the impact of our work on parameter estimation problems. An issue is that even the fitting of identifiable effective parameters often turns out to be numerically ill-conditioned. We demonstrate that this fitting problem can be reformulated as the problem of fitting exponential sums, for which robust algorithms exist.
Scientific Article
Blöchl, F. ; Rascle, A. ; Kastner, J. ; Witzgall, R. ; Lang, E.W. ; Theis, F.J.
In: Górriz, J.M.* ; Lang, E.W.* ; Ramirez, J.* [Eds.]: Recent Advances in Biomedical Signal Processing. Hilversum: Bentham Science Publishers, 2011. 157-170
A general question in the analysis of biological experiments is how to maximize statistical information present in the data while at the same time keeping bias at a minimal level. This can be reformulated as the question whether to perform differential analysis or only explorative screens. In this contribution we discuss this old paradigm in the context of a differential microarray experiment. The transcription factor Lmx1b is knocked out in a mouse model in order to gain further insight into gene regulation taking place in Nail-patella syndrome, a disease caused by mutations of this gene. We review several statistical methods and contrast them with supervised learning on the two differential modes and unsupervised, explorative analysis. Moreover we propose a novel method for analyzing single clusters by projecting them back on specific experiments. Our reference is the identification of three well-known targets. We find that by integrating all results we are able to confirm these target genes. Furthermore, hypotheses on further potential target genes are formulated.
2010
Theis, F.J. ; Meyer-Bäse, A.
In: Biomedical Signal Analysis - Contemporary Methods and Applications. Cambridge, MA: MIT Press, 2010. 1-28
Computer processing and analysis of medical images, as well as experimental data analysis of physiological signals, have evolved since the late 1980s from a variety of directions, ranging from signal and imaging acquisition equipment to areas such as digital signal and image processing, computer vision, and pattern recognition. The most important physiological signals, such as electrocardiograms (ECG), electromyograms (EMG), electroencephalograms (EEG), and magnetoencephalograms (MEG), represent analog signals that are digitized for the purposes of storage and data analysis. The nature of medical images is very broad; it is as simple as a chest X-ray or as sophisticated as noninvasive brain imaging, such as functional magnetic resonance imaging (fMRI). While medical imaging is concerned with the interaction of all forms of radiation with tissue and the clinical extraction of relevant information, its analysis encompasses the measurement of anatomical and physiological parameters from images, image processing, and motion and change detection from image sequences. This chapter gives an overview of biological signal and image analysis, and describes the basic model for computer-aided systems as a common basis enabling the study of several problems of medical-imaging-based diagnostics.
Meyer-Bäse, A. ; Theis, F.J. ; Emmett, M.R.
Adv. Exp. Med. Biol. 680, 189-197 (2010)
The tryptophan system present in Escherichia coli represents an important regulatory unit described by multiple feedback loops. The role of these feedback loops is crucial for the analysis of the dynamical behavior of the tryptophan synthesis. We analyze the robust stability of this system which models the dynamics of both fast state, such as transcription and synthesis of free operator, and slow state, such as translation and tryptophan synthesis under consideration of nonlinear uncertainties. In addition, we analyze the role of these feedback loops as key design components of this regulatory unit responsible for its physiological performance. The range of allowed parameter perturbations and the conditions that ensure the existence of asymptotically stable equilibria of the perturbed system are determined. We also analyze two important alternate regulatory designs for the tryptophan synthesis pathway and derive the stability conditions.
Scientific Article
Kowarsch, A. ; Blöchl, F. ; Bohl, S. ; Saile, M. ; Gretz, N. ; Klingmüller, U. ; Theis, F.J.
BMC Bioinformatics 11:585 (2010)
External stimulations of cells by hormones, cytokines or growth factors activate signal transduction pathways that subsequently induce a re-arrangement of cellular gene expression. The analysis of such changes is complicated, as they consist of multi-layered temporal responses. While classical analyses based on clustering or gene set enrichment only partly reveal this information, matrix factorization techniques are well suited for a detailed temporal analysis. In signal processing, factorization techniques incorporating data properties like spatial and temporal correlation structure have been shown to be robust and computationally efficient. However, such correlation-based methods have so far not been applied in bioinformatics, because large scale biological data rarely imply a natural order that allows the definition of a delayed correlation function. We therefore develop the concept of graph-decorrelation. We encode prior knowledge like transcriptional regulation, protein interactions or metabolic pathways in a weighted directed graph. By linking features along this underlying graph, we introduce a partial ordering of the features (e.g. genes) and are thus able to define a graph-delayed correlation function. Using this framework as constraint to the matrix factorization task allows us to set up the fast and robust graph-decorrelation algorithm (GraDe). To analyze alterations in the gene response in IL-6 stimulated primary mouse hepatocytes, we performed a time-course microarray experiment and applied GraDe. In contrast to standard techniques, the extracted time-resolved gene expression profiles showed that IL-6 activates genes involved in cell cycle progression and cell division. Genes linked to metabolic and apoptotic processes are down-regulated, indicating that IL-6 mediated priming renders hepatocytes more responsive towards cell proliferation and reduces expenditures for the energy metabolism.
GraDe provides a novel framework for the decomposition of large-scale 'omics' data. We were able to show that including prior knowledge into the separation task leads to a much more structured and detailed separation of the time-dependent responses upon IL-6 stimulation compared to standard methods. A Matlab implementation of the GraDe algorithm is freely available at http://cmb.helmholtz-muenchen.de/grade.
Scientific Article
Wittmann, D.M. ; Marr, C. ; Theis, F.J.
J. Theor. Biol. 266, 436-448 (2010)
We generalize random Boolean networks by softening the hard binary discretization into multiple discrete states. These multistate networks are generic models of gene regulatory networks, where each gene is known to assume a finite number of functionally different expression levels. We analytically determine the critical connectivity that separates the biologically unfavorable frozen and chaotic regimes. This connectivity is inversely proportional to a parameter which measures the heterogeneity of the update rules. Interestingly, the latter does not necessarily increase with the mean number of discrete states per node. Still, allowing for multiple states decreases the critical connectivity as compared to random Boolean networks, and thus leads to biologically unrealistic situations. Therefore, we study two approaches to increase the critical connectivity. First, we demonstrate that each network can be kept in its frozen regime by sufficiently biasing the update rules. Second, we restrict the randomly chosen update rules to a subclass of biologically more meaningful functions. These functions are characterized based on a thermodynamic model of gene regulation. We analytically show that their usage indeed increases the critical connectivity. From a general point of view, our thermodynamic considerations link discrete and continuous models of gene regulatory networks.
Scientific Article
Krumsiek, J. ; Pölsterl, S. ; Wittmann, D.M. ; Theis, F.J.
BMC Bioinformatics 11, 1-10:233 (2010)
Phenomenological information about regulatory interactions is frequently available and can be readily converted to Boolean models. Fully quantitative models, on the other hand, provide detailed insights into the precise dynamics of the underlying system. In order to connect discrete and continuous modeling approaches, methods for the conversion of Boolean systems into systems of ordinary differential equations have been developed recently. As biological interaction networks have steadily grown in size and complexity, a fully automated framework for the conversion process is desirable. We present Odefy, a MATLAB- and Octave-compatible toolbox for the automated transformation of Boolean models into systems of ordinary differential equations. Models can be created from sets of Boolean equations or graph representations of Boolean networks. Alternatively, the user can import Boolean models from the CellNetAnalyzer toolbox, GINSim and the PBN toolbox. The Boolean models are transformed to systems of ordinary differential equations by multivariate polynomial interpolation and optional application of sigmoidal Hill functions. Our toolbox contains basic simulation and visualization functionalities for both the Boolean and the continuous models. For further analyses, models can be exported to SQUAD, GNA, MATLAB script files, the SB toolbox, SBML and R script files. Odefy contains a user-friendly graphical user interface for convenient access to the simulation and exporting functionalities. We illustrate the validity of our transformation approach as well as the usage and benefit of the Odefy toolbox for two biological systems: a mutual inhibitory switch known from stem cell differentiation and a regulatory network giving rise to a specific spatial expression pattern at the mid-hindbrain boundary. Odefy provides an easy-to-use toolbox for the automatic conversion of Boolean models to systems of ordinary differential equations.
It can be efficiently connected to a variety of input and output formats for further analysis and investigations. The toolbox is open-source and can be downloaded at http://cmb.helmholtz-muenchen.de/odefy.
Scientific Article
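The conversion idea described in the abstract above (polynomial interpolation of Boolean rules, with optional Hill functions) can be sketched for the mutual inhibitory switch it mentions. The Python code below is a hand-written illustration and does not use or reproduce the Odefy toolbox, which is a MATLAB/Octave package. The ODE form dx/dt = (f(x) − x)/τ, the parameters `n`, `k`, `tau`, and the Euler integrator are all assumptions chosen for the example. For the rule A = NOT B, the interpolating polynomial is 1 − b.

```python
def hill(x: float, n: float = 3.0, k: float = 0.5) -> float:
    """Sigmoidal Hill function with assumed exponent n and threshold k."""
    return x**n / (x**n + k**n)


def simulate_switch(a0: float, b0: float, tau: float = 1.0,
                    dt: float = 0.01, steps: int = 2000) -> tuple[float, float]:
    """Integrate a continuous version of the Boolean switch
    A = NOT B, B = NOT A with the explicit Euler method.

    NOT x is interpolated as 1 - x and applied to Hill-transformed inputs.
    """
    a, b = a0, b0
    for _ in range(steps):
        da = ((1.0 - hill(b)) - a) / tau
        db = ((1.0 - hill(a)) - b) / tau
        a, b = a + dt * da, b + dt * db
    return a, b


# A small initial advantage for A is amplified into an "A on, B off" state.
a_end, b_end = simulate_switch(0.8, 0.2)
print(round(a_end, 2), round(b_end, 2))
```

Swapping the initial values gives the mirrored steady state, which corresponds to the two stable states of the Boolean switch.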
Gutch, H.W. ; Maehara, T. ; Theis, F.J.
In: Vigneron, V.* ; Zarzoso, V.* ; Moreau, E.* [Eds.]: Proceedings (Latent variable analysis and signal separation : 9th international conference, 27-30 September 2010, St. Malo, France). Berlin: Springer, 2010. 370-377 ( ; 6365)
The recovery of the mixture of an N-dimensional signal generated by N independent processes is a well studied problem (see e.g. [1,10]) and robust algorithms that solve this problem by Joint Diagonalization exist. While there is a lot of empirical evidence suggesting that these algorithms are also capable of solving the case where the source signals have block structure (apart from a final permutation recovery step), this claim could not be shown yet. Moreover, it was previously not known whether this model is separable at all. We present a precise definition of the subspace model, introducing the notion of simple components, show that the decomposition into simple components is unique and present an algorithm handling the decomposition task.
Thiemann, C. ; Theis, F.J. ; Grady, D. ; Brune, R. ; Brockmann, D.
PLoS ONE 5:e15422 (2010)
Territorial subdivisions and geographic borders are essential for understanding phenomena in sociology, political science, history, and economics. They influence the interregional flow of information and cross-border trade and affect the diffusion of innovation and technology. However, it is unclear if existing administrative subdivisions that typically evolved decades ago still reflect the most plausible organizational structure of today. The complexity of modern human communication, the ease of long-distance movement, and increased interaction across political borders complicate the operational definition and assessment of geographic borders that optimally reflect the multi-scale nature of today's human connectivity patterns. What border structures emerge directly from the interplay of scales in human interactions is an open question. Based on a massive proxy dataset, we analyze a multi-scale human mobility network and compute effective geographic borders inherent to human mobility patterns in the United States. We propose two computational techniques for extracting these borders and for quantifying their strength. We find that effective borders only partially overlap with existing administrative borders, and show that some of the strongest mobility borders exist in unexpected regions. We show that the observed structures cannot be generated by gravity models for human traffic. Finally, we introduce the concept of link significance that clarifies the observed structure of effective borders. Our approach represents a novel type of quantitative, comparative analysis framework for spatially embedded multi-scale interaction networks in general and may yield important insight into a multitude of spatiotemporal phenomena generated by human activity.
Scientific Article
Araki, S. ; Ozerov, A. ; Gowreesunker, V. ; Sawada, H. ; Theis, F.J. ; Nolte, G. ; Lutter, D. ; Duong, N.Q.K.
In: Vigneron, V.* ; Zarzoso, V.* ; Moreau, E.* [Eds.]: Latent variable analysis and signal separation (Latent variable analysis and signal separation 9th international conference, 27-30 September 2010, St. Malo, France). Berlin: Springer, 2010. 114-122 (Lect. Notes Comput. Sc. ; 6365)
This paper introduces the audio part of the 2010 community-based Signal Separation Evaluation Campaign (SiSEC2010). Seven speech and music datasets were contributed, which include datasets recorded in noisy or dynamic environments, in addition to the SiSEC2008 datasets. The source separation problems were split into five tasks, and the results for each task were evaluated using different objective performance criteria. We provide an overview of the audio datasets, tasks and criteria. We also report the results achieved with the submitted systems, and discuss organization strategies for future campaigns.
Araki, S. ; Theis, F.J. ; Nolte, G. ; Lutter, D. ; Ozerov, A. ; Gowreesunker, V. ; Sawada, H. ; Duong, N.Q.K.
In: Vigneron, V.* ; Zarzoso, V.* ; Moreau, E.* [Eds.]: Proceedings (Latent variable analysis and signal separation : 9th international conference, 27-30 September 2010, St. Malo, France). Berlin: Springer, 2010. 123-130 (Lect. Notes Comput. Sc. ; 6365)
We present an overview of the biomedical part of the 2010 community-based Signal Separation Evaluation Campaign (SiSEC2010), coordinated by the authors. In addition to the audio tasks which have been evaluated in the previous SiSEC, SiSEC2010 considered several biomedical tasks. Here, three biomedical datasets from molecular biology (gene expression profiles) and neuroscience (EEG) were contributed. This paper describes the biomedical datasets, tasks and evaluation criteria. This paper also reports the results of the biomedical part of SiSEC2010 achieved by participants.
Gutch, H.W. ; Gruber, P. ; Theis, F.J.
In: Vigneron, V.* ; Zarzoso, V.* ; Moreau, E.* [Eds.]: Proceedings (Latent variable analysis and signal separation : 9th international conference, 27-30 September 2010, St. Malo, France). Berlin: Springer, 2010. 645-652 (Lect. Notes Comput. Sc. ; 6365)
Independent Component Analysis is usually performed over the fields of reals or complex numbers and the only other field where some insight has been gained so far is GF(2), the finite field with two elements. We extend this to arbitrary finite fields, proving separability of the model if the sources are non-uniform and non-degenerate and present algorithms performing this task.
Blöchl, F. ; Kowarsch, A. ; Theis, F.J.
In: Vigneron, V.* ; Zarzoso, V.* ; Moreau, E.* [Eds.]: Proceedings (Latent variable analysis and signal separation : 9th international conference, 27-30 September 2010, St. Malo, France). Berlin: Springer, 2010. 434-441 (Lect. Notes Comput. Sc. ; 6365)
Matrix factorization techniques provide efficient tools for the detailed analysis of large-scale biological and biomedical data. While underlying algorithms usually work fully blindly, we propose to incorporate prior knowledge encoded in a graph model. This graph introduces a partial ordering in data without intrinsic (e.g. temporal or spatial) structure, which allows the definition of a graph-autocorrelation function. Using this framework as constraint to the matrix factorization task we develop a second-order source separation algorithm called graph-decorrelation algorithm (GraDe). We demonstrate its applicability and robustness by analyzing microarray data from a stem cell differentiation experiment.
Mewes, H.-W. ; Ruepp, A. ; Theis, F.J. ; Rattei, T. ; Walter, M.C. ; Frishman, D. ; Suhre, K. ; Spannagl, M. ; Mayer, K.F.X. ; Stuempflen, V. ; Antonov, A.
Nucleic Acids Res. 39, 1, D220-D224 (2010)
The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38,000,000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de).
Scientific Article
Theis, F.J. ; Müller, N.S. ; Plant, C. ; Böhm, C.
In: Vigneron, V.* [Eds.]: Proceedings (Latent variable analysis and signal separation : 9th international conference, 27-30 September 2010, St. Malo, France). Heidelberg: Springer, 2010. 466-473 (LNCS; 6365)
Theis, F.J.
IEEE Trans. Circuits Syst. I-Regul. Pap. 57, 1463-1474 (2010)
Identifying relevant signals within high-dimensional observations is an important preprocessing step for efficient data analysis. However, many classical dimension reduction techniques such as principal component analysis do not take the often rich statistics of real-world data into account, and thereby fail if for example the signal space is of low power but meaningful in terms of some other statistics. With "colored subspace analysis," we propose a method for linear dimension reduction that evaluates the time structure of the multivariate observations. We differentiate the signal subspace from noise by searching for a subspace of non-trivially autocorrelated data. We prove that the resulting signal subspace is uniquely determined by the data, given that all white components have been removed. Algorithmically we propose three efficient methods to perform this search, based on joint diagonalization, using a component clustering scheme, and via joint low-rank approximation. In contrast to temporal mixture approaches from blind signal processing we do not need a generative model, i.e., we do not require the existence of sources, so the model is applicable to any wide-sense stationary time series without restrictions. Moreover, since the method is based on second-order time structure, it can be efficiently implemented and applied even in large dimensions. Numerical examples together with an application to dimension reduction of functional MRI recordings demonstrate the usefulness of the proposed method. The implementation is publicly available as a Matlab package at http://cmb.helmholtz-muenchen.de/CSA.
Scientific Article
Hartsperger, M.L. ; Blöchl, F. ; Stuempflen, V. ; Theis, F.J.
BMC Bioinformatics 11:522 (2010)
BACKGROUND: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. RESULTS: Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. CONCLUSIONS: In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. 
To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem.
Scientific Article
Kowarsch, A. ; Marr, C. ; Schmidl, D. ; Ruepp, A. ; Theis, F.J.
PLoS ONE 5:e11154 (2010)
MicroRNAs are a large class of post-transcriptional regulators that bind to the 3' untranslated region of messenger RNAs. They play a critical role in many cellular processes and have been linked to the control of signal transduction pathways. Recent studies indicate that microRNAs can function as tumor suppressors or even as oncogenes when aberrantly expressed. For more general insights of disease-associated microRNAs, we analyzed their impact on human signaling pathways from two perspectives. On a global scale, we found a core set of signaling pathways with enriched tissue-specific microRNA targets across diseases. The function of these pathways reflects the affinity of microRNAs to regulate cellular processes associated with apoptosis, proliferation or development. Comparing cancer and non-cancer related microRNAs, we found no significant differences between both groups. To unveil the interaction and regulation of microRNAs on signaling pathways locally, we analyzed the cellular location and process type of disease-associated microRNA targets and proteins. While disease-associated proteins are highly enriched in extracellular components of the pathway, microRNA targets are preferentially located in the nucleus. Moreover, targets of disease-associated microRNAs preferentially exhibit an inhibitory effect within the pathways in contrast to disease proteins. Our analysis provides systematic insights into the interaction of disease-associated microRNAs and signaling pathways and uncovers differences in cellular locations and process types of microRNA targets and disease-associated proteins.
Scientific Article
Blöchl, F. ; Hartsperger, M.L. ; Stuempflen, V. ; Theis, F.J.
In: Schomburg, D.* ; Grote, A.* [Eds.]: Proceedings (German Conference on Bioinformatics 2010). Bonn: Ges. f. Inform., 2010. 31-40 ( ; P-173)
With the increasing availability of large-scale interaction networks derived either from experimental data or from text mining, we face the challenge of interpreting and analyzing these data sets in a comprehensive fashion. A particularity of these networks, which sets them apart from other examples in various scientific fields, lies in their k-partiteness. Whereas graph partitioning has received considerable attention, only few researchers have focused on this generalized situation. Recently, Long et al. have proposed a method for jointly clustering such a network and at the same time estimating a weighted graph connecting the clusters, thereby allowing simple interpretation of the resulting clustering structure. In this contribution, we extend this work by allowing fuzzy clusters for each node type. We propose an extended cost function for partitioning that allows for overlapping clusters. Our main contribution lies in a minimization procedure mimicking the multiplicative update rules employed in algorithms for non-negative matrix factorization. Results on clustering a manually annotated bipartite gene-complex graph show significantly higher homogeneity between gene and corresponding complex clusters than expected by chance.
Ruepp, A. ; Kowarsch, A. ; Schmidl, D. ; Buggenthin, F. ; Brauner, B. ; Dunger, I. ; Fobo, G. ; Frishman, G. ; Montrone, C. ; Theis, F.J.
Genome Biol. 11:R6 (2010)
In recent years, microRNAs have been shown to play important roles in physiological as well as malignant processes. The PhenomiR database http://mips.helmholtz-muenchen.de/phenomir provides data from 542 studies that investigate deregulation of microRNA expression in diseases and biological processes as a systematic, manually curated resource. Using the PhenomiR dataset, we could demonstrate that, depending on disease type, independent information from cell culture studies contrasts with conclusions drawn from patient studies.
Scientific Article
Konopka, W. ; Kiryk, A. ; Novak, M. ; Herwerth, M. ; Parkitna, J.R. ; Wawrzyniak, M. ; Kowarsch, A. ; Michaluk, P. ; Dzwonek, J. ; Arnsperger, T. ; Wilczynski, G. ; Merkenschlager, M. ; Theis, F.J. ; Köhr, G. ; Kaczmarek, L. ; Schütz, G.
J. Neurosci. 30, 14835-14842 (2010)
Dicer-dependent noncoding RNAs, including microRNAs (miRNAs), play an important role in the modulation of translation of mRNA transcripts necessary for differentiation in many cell types. In vivo experiments using cell type-specific Dicer1 gene inactivation in neurons showed its essential role for neuronal development and survival. However, little is known about the consequences of a loss of miRNAs in adult, fully differentiated neurons. To address this question, we used an inducible variant of the Cre recombinase (tamoxifen-inducible CreERT2) under control of Camk2a gene regulatory elements. After induction of Dicer1 gene deletion in adult mouse forebrain, we observed a progressive loss of a whole set of brain-specific miRNAs. Animals were tested in a battery of both aversively and appetitively motivated cognitive tasks, such as Morris water maze, IntelliCage system, or trace fear conditioning. Compatible with the rather long half-life of miRNAs in hippocampal neurons, we observed an enhancement of memory strength of mutant mice 12 weeks after the Dicer1 gene mutation, before the onset of the neurodegenerative process. In acute brain slices, immediately after high-frequency stimulation of the Schaffer collaterals, the efficacy at CA3-to-CA1 synapses was higher in mutant than in control mice, whereas long-term potentiation was comparable between genotypes. This phenotype was reflected at the subcellular and molecular level by the elongated filopodia-like shaped dendritic spines and an increased translation of synaptic plasticity-related proteins, such as BDNF and MMP-9 in mutant animals. The presented work shows miRNAs as key players in the learning and memory process of mammals.
Scientific Article
Plant, C. ; Theis, F.J. ; Meyer-Bäse, A. ; Böhm, C.
In: Vigneron, V.* ; Zarzoso, V.* ; Moreau, E.* [Eds.]: Proceedings (Latent variable analysis and signal separation : 9th international conference, 27-30 September 2010, St. Malo, France). Berlin: Springer, 2010. 254-262 (Lect. Notes Comput. Sc. ; 6365)
Independent Component Analysis (ICA) is an essential building block for data analysis in many applications. Selecting the truly meaningful components from the result of an ICA algorithm, or comparing the results of different algorithms, however, are non-trivial problems. We introduce a very general technique for evaluating ICA results rooted in information-theoretic model selection. The basic idea is to exploit the natural link between non-Gaussianity and data compression: The better the data transformation represented by one or several ICs improves the effectiveness of data compression, the higher is the relevance of the ICs. In an extensive experimental evaluation we demonstrate that our novel information-theoretic measure robustly selects the most interesting components from data without requiring any assumptions or thresholds.
Blöchl, F. ; Theis, F.J. ; Vega-Redondo, F. ; Fisher, E.O'N.
In: Empirical and theoretical methods. München: ifo Institut für Wirtschaftsforschung e.V., 2010. 1-20 (CESifo working paper series ; 3175)
We analyze input-output matrices for a wide set of countries as weighted directed networks. These graphs contain only 47 nodes, but they are almost fully connected and many have nodes with strong self-loops. We apply two measures: random walk centrality and one based on count-betweenness. Our findings are intuitive. For example, in Luxembourg the most central sector is "Finance and Insurance" and the analog in Germany is "Wholesale and Retail Trade" or "Motor Vehicles", according to the measure. Rankings of sectoral centrality vary by country. Some sectors are often highly central, while others never are. Hierarchical clustering reveals geographical proximity and similar development status.
Meyer-Bäse, A. ; Plant, C. ; Cappendijk, S. ; Theis, F.J.
In: Proceedings (BIOCOMP'10 - The 2010 International Conference on Bioinformatics & Computational Biology). Las Vegas (Nevada): CSREA Press, 2010.
Genetic regulatory networks are prone to internal parametrical fluctuations as well as external noises and are modeled as multi-time scale systems with time-delay. Robustness represents a crucial property of these networks to attenuate the effects of internal fluctuations and external noise. In this study, we formulate biological networks as coupled nonlinear differential systems operating at different time-scales under consideration of time delay and vanishing perturbations. We determine conditions for the existence of a global uniform attractor of the perturbed biological system. By using a Lyapunov function for the coupled system, we derive a maximal upper bound for the fast time scale associated with the fast state.
Lutter, D. ; Marr, C. ; Krumsiek, J. ; Lang, E.W. ; Theis, F.J.
BMC Genomics 11:224 (2010)
MicroRNA-mediated control of gene expression via translational inhibition has substantial impact on cellular regulatory mechanisms. About 37% of mammalian microRNAs appear to be located within introns of protein coding genes, linking their expression to the promoter-driven regulation of the host gene. In our study we investigate this linkage towards a relationship beyond transcriptional co-regulation. Using measures based on both annotation and experimental data, we show that intronic microRNAs tend to support their host genes by regulation of target gene expression with significantly correlated expression patterns. We used expression data of three differentiating cell types and compared gene expression profiles of host and target genes. Many microRNA target genes show expression patterns significantly correlated with the expressions of the microRNA host genes. By calculating functional similarities between host and predicted microRNA target genes based on GO annotations, we confirm that many microRNAs link host and target gene activity in either a synergistic or an antagonistic manner. These two regulatory effects may result from fine tuning of target gene expression functionally related to the host or knock-down of remaining opponent target gene expression. This finding allows extending the common practice of mapping large-scale gene expression data to protein-associated genes with the functionality of co-expressed intronic microRNAs.
Scientific Article
Kreuzpointner, L. ; Simon, P. ; Theis, F.J.
Br. J. Math. Stat. Psychol. 63, 341-360 (2010)
The a(d) coefficient was developed to measure the within-group agreement of ratings. The underlying theory as well as the construction of the coefficient are explained. The a(d) coefficient ranges from 0 to 1, regardless of the number of scale points, raters, or items. With some limitations the measure of the within-group agreement of different groups and groups from different studies is directly comparable. For statistical significance testing, the binomial distribution is introduced as a model of the ratings' random distribution given the true score of a group construct. This method enables a decision about essential agreement and not only about a significant difference from 0 or a chosen critical value. The a(d) coefficient identifies a single true score within a group. It is not provided for multiple true score settings. The comparison of the a(d) coefficient with other agreement indices shows that the new coefficient is in line with their outcomes, but does not result in infinite or inappropriate values.
Scientific Article
Theis, F.J. ; Meyer-Bäse, A.
Cambridge, Mass.: MIT Press, 2010. 432 S.
Biomedical signal analysis has become one of the most important visualization and interpretation methods in biology and medicine. Many new and powerful instruments for detecting, storing, transmitting, analyzing, and displaying images have been developed in recent years, allowing scientists and physicians to obtain quantitative measurements to support scientific hypotheses and medical diagnoses. This book offers an overview of a range of proven and new methods, discussing both theoretical and practical aspects of biomedical signal analysis and interpretation. After an introduction to the topic and a survey of several processing and imaging techniques, the book describes a broad range of methods, including continuous and discrete Fourier transforms, independent component analysis (ICA), dependent component analysis, neural networks, and fuzzy logic methods. The book then discusses applications of these theoretical tools to practical problems in everyday biosignal processing, considering such subjects as exploratory data analysis and low-frequency connectivity analysis in fMRI, MRI signal processing including lesion detection in breast MRI, dynamic cerebral contrast-enhanced perfusion MRI, skin lesion classification, and microscopic slice image processing and automatic labeling. Biomedical Signal Analysis can be used as a text or professional reference. Part I, on methods, forms a self-contained text, with exercises and other learning aids, for upper-level undergraduate or graduate-level students. Researchers or graduate students in systems biology, genomic signal processing, and computer-assisted radiology will find both parts I and II (on applications) a valuable handbook.
Marr, C. ; Theis, F.J. ; Liebovitch, L.S. ; Hütt, M.-T.
PLoS Comput. Biol. 6, 31:e1000836 (2010)
The set of regulatory interactions between genes, mediated by transcription factors, forms a species' transcriptional regulatory network (TRN). By comparing this network with measured gene expression data, one can identify functional properties of the TRN and gain general insight into transcriptional control. We define the subnet of a node as the subgraph consisting of all nodes topologically downstream of the node, including itself. Using a large set of microarray expression data of the bacterium Escherichia coli, we find that the gene expression in different subnets exhibits a structured pattern in response to environmental changes and genotypic mutation. Subnets with fewer changes in their expression pattern have a higher fraction of feed-forward loop motifs and a lower fraction of small RNA targets within them. Our study implies that the TRN consists of several scales of regulatory organization: (1) subnets with more varying gene expression controlled by both transcription factors and post-transcriptional RNA regulation and (2) subnets with less varying gene expression having more feed-forward loops and less post-transcriptional RNA regulation.
Scientific Article
Ansorg, M. ; Blöchl, F. ; zu Castell, W. ; Theis, F.J. ; Wittmann, D.M.
Int. J. Biomath. Biostat. 1, 9-21 (2010)
We consider the stationary limit of a mathematical model that describes gene regulation at the so-called mid-hindbrain boundary during neural development in vertebrates. The study of an appropriately modified system provides us with solutions that resemble the situation of the original system and can be used as references for a subsequent numerical analysis in terms of a pseudo-spectral collocation point method. We discuss the resulting numerical solutions in terms of their biological relevance.
Scientific Article
Franke, R. ; Theis, F.J. ; Klamt, S.
J. Integr. Bioinform. 7:151 (2010)
Using the lac operon as a paradigmatic example for a gene regulatory system in prokaryotes, we demonstrate how qualitative knowledge can be initially captured using simple discrete (Boolean) models and then stepwise refined to multivalued logical models and finally to continuous (ODE) models. At all stages, signal transduction and transcriptional regulation is integrated in the model description. We first show the potential benefit of a discrete binary approach and discuss then problems and limitations due to indeterminacy arising in cyclic networks. These limitations can be partially circumvented by using multilevel logic as generalization of the Boolean framework enabling one to formulate a more realistic model of the lac operon. Ultimately a dynamic description is needed to fully appreciate the potential dynamic behavior that can be induced by regulatory feedback loops. As a very promising method we show how the use of multivariate polynomial interpolation allows transformation of the logical network into a system of ordinary differential equations (ODEs), which then enables the analysis of key features of the dynamic behavior.
Scientific Article
2009
Blöchl, F. ; Theis, F.J.
In: Adali, T.* [Eds.]: Independent Component Analysis and Signal Separation. Berlin: Springer, 2009. 387-394 (Lect. Notes Comput. Sc. ; 5441)
We address the applicability of blind source separation (BSS) methods for the estimation of hidden influences in biological dynamic systems such as metabolic or gene regulatory networks. In simple processes obeying mass action kinetics, we find the emergence of linear mixture models. More complex situations as well as hidden influences in regulatory systems with sigmoidal input functions however lead to new classes of BSS problems.
Theis, F.J. ; Cason, T.P. ; Absil, P.A.
In: Adali, T.* [Eds.]: Independent Component Analysis and Signal Separation. Berlin: Springer, 2009. 354-361 (Lect. Notes Comput. Sc. ; 5441)
Joint diagonalization for ICA is often performed on the orthogonal group after a pre-whitening step. Here we assume that we only want to extract a few sources after pre-whitening, and hence work on the Stiefel manifold of $p$-frames in $R^n$. The resulting method does not only use second-order statistics to estimate the dimension reduction and is therefore denoted as soft dimension reduction. We employ a trust-region method for minimizing the cost function on the Stiefel manifold. Applications to a toy example and functional MRI data show a higher numerical efficiency, especially when $p$ is much smaller than $n$, and more robust performance in the presence of strong noise than methods based on pre-whitening.
Theis, F.J. ; Neher, R. ; Zeug, A.
In: Adali, T.* [Eds.]: Proceedings (Independent Component Analysis and Signal Separation). Berlin: Springer, 2009. 548-556 (Lecture Notes in Computer Science ; 5441)
Recently, we have proposed a blind source separation algorithm to separate dyes in multiply labeled fluorescence microscopy images. Applying the algorithm, we are able to successfully extract the dye distributions from the images. It thereby solves an often challenging problem since the recorded emission spectra of fluorescent dyes are environment and instrument specific. The separation algorithm is based on nonnegative matrix factorization in a Poisson noise model and works well on many samples. In some cases, however, additional cost function terms such as sparseness enhancement are necessary to arrive at a satisfactory decomposition. In this contribution we analyze the algorithm on two very well controlled real data sets. In the first case, known sources are artificially mixed in varying mixing conditions. In the second case, fluorescent beads are used to generate well-behaved mixing situations. In both cases we can successfully extract the original sources. We discuss how the separation is influenced by the weight of the additional cost function terms, thereby illustrating that BSS can be vastly improved by invoking qualitative knowledge about the nature of the sources.
Wittmann, D.M. ; Blöchl, F. ; Trümbach, D. ; Wurst, W. ; Prakash, N. ; Theis, F.J.
PLoS Comput. Biol. 5:e1000569 (2009)
An important aspect of the functional annotation of enzymes is not only the type of reaction catalysed by an enzyme, but also the substrate specificity, which can vary widely within the same family. In many cases, prediction of family membership and even substrate specificity is possible from enzyme sequence alone, using a nearest neighbour classification rule. However, the combination of structural information and sequence information can improve the interpretability and accuracy of predictive models. The method presented here, Active Site Classification (ASC), automatically extracts the residues lining the active site from one representative three-dimensional structure and the corresponding residues from sequences of other members of the family. From a set of representatives with known substrate specificity, a Support Vector Machine (SVM) can then learn a model of substrate specificity. Applied to a sequence of unknown specificity, the SVM can then predict the most likely substrate. The models can also be analysed to reveal the underlying structural reasons determining substrate specificities and thus yield valuable insights into mechanisms of enzyme specificity. We illustrate the high prediction accuracy achieved on two benchmark data sets and the structural insights gained from ASC by a detailed analysis of the family of decarboxylating dehydrogenases. The ASC web service is available at http://asc.informatik.uni-tuebingen.de/.
Scientific Article
Wittmann, D.M. ; Schmidl, D. ; Blöchl, F. ; Theis, F.J.
Theor. Comput. Sci. 410, 3826-3838 (2009)
The analysis of complex networks is of major interest in various fields of science. In many applications we face the challenge that the exact topology of a network is unknown but we are instead given information about distances within this network. The theoretical approaches to this problem have so far been focusing on the reconstruction of graphs from shortest path distance matrices. Often, however, movements in networks do not follow shortest paths but occur in a random fashion. In these cases an appropriate distance measure can be defined as the mean length of a random walk between two nodes - a quantity known as the mean first hitting time. In this contribution we investigate whether a graph can be reconstructed from its mean first hitting time matrix and put forward an algorithm for solving this problem. A heuristic method to reduce the computational effort is described and analyzed. In the case of trees we can even give an algorithm for reconstructing graphs from incomplete random walk distance matrices.
Scientific Article
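To make the distance measure in the preceding abstract concrete, the following sketch computes the mean first hitting time matrix of a simple random walk on a small connected, undirected graph. It covers only this forward direction, not the reconstruction algorithm the paper proposes; the function name and the adjacency-matrix input format are assumptions.

```python
import numpy as np


def mean_first_hitting_times(adjacency):
    """Return H where H[i, j] is the expected number of steps for a simple
    random walk started at node i to first reach node j (H[j, j] = 0).

    Assumes a connected graph given as a symmetric 0/1 adjacency matrix.
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    # Transition probabilities: move to a uniformly chosen neighbour.
    P = A / A.sum(axis=1, keepdims=True)
    H = np.zeros((n, n))
    for target in range(n):
        keep = [i for i in range(n) if i != target]
        # For i != target: h(i) = 1 + sum_k P[i, k] h(k), with h(target) = 0.
        Q = P[np.ix_(keep, keep)]
        h = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
        H[keep, target] = h
    return H


if __name__ == "__main__":
    # Path graph 0 - 1 - 2.
    path = [[0, 1, 0],
            [1, 0, 1],
            [0, 1, 0]]
    print(mean_first_hitting_times(path))
```

In general this matrix is not symmetric, which distinguishes it from a shortest-path distance matrix.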
Klamt, S. ; Haus, U.-U. ; Theis, F.J.
PLoS Comput. Biol. 5:e1000385 (2009)
Editorial
Neher, R.A. ; Mitkovski, M. ; Kirchhoff, F. ; Neher, E. ; Theis, F.J. ; Zeug, A.
Biophys. J. 96, 3791-3800 (2009)
Methods of blind source separation are used in many contexts to separate composite data sets according to their sources. Multiply labeled fluorescence microscopy images represent such sets, in which the sources are the individual labels. Their distributions are the quantities of interest and have to be extracted from the images. This is often challenging, since the recorded emission spectra of fluorescent dyes are environment- and instrument-specific. We have developed a nonnegative matrix factorization (NMF) algorithm to detect and separate spectrally distinct components of multiply labeled fluorescence images. It operates on spectrally resolved images and delivers both the emission spectra of the identified components and images of their abundance. We tested the proposed method using biological samples labeled with up to four spectrally overlapping fluorescent labels. In most cases, NMF accurately decomposed the images into contributions of individual dyes. However, the solutions are not unique when spectra overlap strongly or when images are diffuse in their structure. To arrive at satisfactory results in such cases, we extended NMF to incorporate preexisting qualitative knowledge about spectra and label distributions. We show how data acquired through excitations at two or three different wavelengths can be integrated and that multiple excitations greatly facilitate the decomposition. By allowing reliable decomposition in cases where the spectra of the individual labels are not known or are known only inaccurately, the proposed algorithms greatly extend the range of questions that can be addressed with quantitative microscopy.
Scientific Article
Wittmann, D.M. ; Krumsiek, J. ; Saez-Rodriguez, J. ; Lauffenburger, D.A. ; Klamt, S. ; Theis, F.J.
BMC Syst. Biol. 3:98 (2009)
The understanding of regulatory and signaling networks has long been a core objective in Systems Biology. Knowledge about these networks is mainly of qualitative nature, which allows the construction of Boolean models, where the state of a component is either 'off' or 'on'. While often able to capture the essential behavior of a network, these models can never reproduce detailed time courses of concentration levels. Nowadays however, experiments yield more and more quantitative data. An obvious question therefore is how qualitative models can be used to explain and predict the outcome of these experiments. Results: In this contribution we present a canonical way of transforming Boolean into continuous models, where the use of multivariate polynomial interpolation allows transformation of logic operations into a system of ordinary differential equations (ODE). The method is standardized and can readily be applied to large networks. Other, more limited approaches to this task are briefly reviewed and compared. Moreover, we discuss and generalize existing theoretical results on the relation between Boolean and continuous models. As a test case a logical model is transformed into an extensive continuous ODE model describing the activation of T-cells. We discuss how parameters for this model can be determined such that quantitative experimental results are explained and predicted, including time-courses for multiple ligand concentrations and binding affinities of different ligands. This shows that from the continuous model we may obtain biological insights not evident from the discrete one. Conclusion: The presented approach will facilitate the interaction between modeling and experiments. Moreover, it provides a straightforward way to apply quantitative analysis methods to qualitatively described systems.
Scientific Article
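The transformation described in the preceding abstract can be sketched as follows: a Boolean update function is extended to the unit cube by multilinear interpolation over its truth table, and each component then relaxes toward its interpolated value. This is a minimal sketch only; the first-order relaxation with a single time constant `tau`, the explicit Euler integration, and all function names are assumptions of this example rather than details taken from the paper.

```python
import itertools

import numpy as np


def multilinear_interpolation(bool_fn, n_inputs):
    """Extend a Boolean function on {0,1}^n to a continuous one on [0,1]^n.

    The result agrees with bool_fn at every corner of the unit cube.
    """
    vertices = list(itertools.product((0, 1), repeat=n_inputs))
    values = [float(bool_fn(*v)) for v in vertices]

    def continuous_fn(x):
        x = np.asarray(x, dtype=float)
        total = 0.0
        for v, val in zip(vertices, values):
            v_arr = np.array(v, dtype=float)
            # Weight of this corner: x_i where v_i = 1, (1 - x_i) where v_i = 0.
            total += val * np.prod(x * v_arr + (1.0 - x) * (1.0 - v_arr))
        return total

    return continuous_fn


def simulate(update_fns, x0, tau=1.0, dt=0.01, steps=2000):
    """Integrate dx_i/dt = (f_i(x) - x_i) / tau with explicit Euler steps."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        target = np.array([f(x) for f in update_fns])
        x = x + dt * (target - x) / tau
    return x


if __name__ == "__main__":
    # Toy network (a, b, c): a and b hold their values, c = a AND (NOT b).
    rules = [
        lambda a, b, c: a,
        lambda a, b, c: b,
        lambda a, b, c: a and not b,
    ]
    fns = [multilinear_interpolation(rule, 3) for rule in rules]
    print(simulate(fns, x0=[1.0, 0.0, 0.0]))
```

Because the interpolated functions coincide with the Boolean rules at the corners of the cube, steady states of the Boolean model remain steady states of this continuous system.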
Gruber, P. ; Gutch, H.W. ; Theis, F.J.
In: Adali, T.* [Eds.]: Independent Component Analysis and Signal Separation. Berlin: Springer, 2009. 259-266 (Lect. Notes Comput. Sc. ; 5441)
Independent Subspace Analysis (ISA) is an extension of Independent Component Analysis (ICA) that aims to linearly transform a random vector so as to render groups of its components mutually independent. A recently proposed fixed-point algorithm is able to locally perform ISA if the sizes of the subspaces are known; however, global convergence is a serious problem, as the proposed cost function has additional local minima. We introduce an extension to this algorithm, based on the idea that the algorithm converges to a solution in which subspaces that are members of the global minimum occur with a higher frequency. We show that this overcomes the algorithm’s limitations. Moreover, this idea allows a blind approach, where no a priori knowledge of subspace sizes is required.
Georgiev, P.G. ; Theis, F.J.
In: Romeijn, H.E.* ; Pardalos, P.M.* [Eds.]: Handbook of Optimization in Medicine. New York: Springer, 2009. 1-38 (Springer Optimization and Its Applications ; 26)
Georgiev, P.G. ; Theis, F.J.
In: Pardalos, P.M.* ; Romeijn, H.E.* [Eds.]: Handbook of Optimization in Medicine. New York: Springer Science + Business Media B.V., 2009. 253-290
Handbook of Optimization in Medicine is devoted to examining the dramatic increase in the application of effective optimization techniques to the delivery of health care. The articles, written by experts from the areas of optimization (operations research), computer science, and medicine, focus on models and algorithms that have led to more efficient and sophisticated treatments of patients.
Webb, K.J. ; Norton, W.H.J. ; Trümbach, D. ; Meijer, A.H. ; Ninkovic, J. ; Topp, S. ; Heck, D. ; Marr, C. ; Wurst, W. ; Theis, F.J. ; Spaink, H.P. ; Bally-Cuif, L.
Genome Biol. 10:R81 (2009)
Addiction is a pathological dysregulation of the brain's reward systems, determined by several complex genetic pathways. The conditioned place preference test provides an evaluation of the effects of drugs in animal models, allowing the investigation of substances at a biologically relevant level with respect to reward. Our lab has previously reported the development of a reliable conditioned place preference paradigm for zebrafish. Here, this test was used to isolate a dominant N-ethyl-N-nitrosourea (ENU)-induced mutant, no addiction (nad(dne3256)), which fails to respond to amphetamine, and which we used as an entry point towards identifying the behaviorally relevant transcriptional response to amphetamine. Results: Through the combination of microarray experiments comparing the adult brain transcriptome of mutant and wild-type siblings under normal conditions, as well as their response to amphetamine, we identified genes that correlate with the mutants' altered conditioned place preference behavior. In addition to pathways classically involved in reward, this gene set shows a striking enrichment in transcription factor-encoding genes classically involved in brain development, which later appear to be reused within the adult brain. We selected a subset of them for validation by quantitative PCR and in situ hybridization, revealing that specific brain areas responding to the drug through these transcription factors include domains of ongoing adult neurogenesis. Finally, network construction revealed functional connections between several of these genes. Conclusions: Together, our results identify a new network of coordinated gene regulation that influences or accompanies amphetamine-triggered conditioned place preference behavior and that may underlie the susceptibility to addiction.
Review
Gruber, P. ; Meyer-Bäse, A. ; Foo, S. ; Theis, F.J.
Eng. Appl. Artif. Intel. 22, 497-504 (2009)
In the last decades, functional magnetic resonance imaging (fMRI) has been introduced into clinical practice. As a consequence of this advanced noninvasive medical imaging technique, the analysis and visualization of medical image time-series data poses a new challenge to both research and medical application. Often, however, the model data for a regression or generalized linear model-based analysis are not available. Hence exploratory data-driven techniques, i.e. blind source separation (BSS) methods, are very popular in fMRI data analysis since they are based neither on explicit signal models nor on a priori knowledge of the underlying physiological process. Independent component analysis (ICA) is a main BSS method, which searches for stochastically independent signals in the multivariate observations. In this paper, we introduce a new kernel-based nonlinear ICA method and compare it to standard BSS techniques. This kernel nonlinear ICA (kICA) overcomes the restrictions of linearity of the mixing process usually encountered with ICA. Dimension reduction is an important preprocessing step for this nonlinear technique and is performed in a novel way: a genetic algorithm is designed which determines the optimal number of basis vectors for a reduced-order feature space representation as an optimization problem of the condition number of the resulting basis. For the fMRI data, a comparative quantitative evaluation is performed between kICA with different kernels, nonnegative matrix factorization (NMF) and other BSS algorithms. The comparative results are evaluated by task-related activation maps, associated time courses and ROC study. The comparison is performed on fMRI data from experiments with 10 subjects. The external stimulus was a visual pattern presentation in a block design. The main results of this paper are that kICA and sparse NMF (sNMF) are able to identify signal components with high correlation to the fMRI stimulus, and that kICA with a Gaussian kernel is comparable to standard ICA algorithms and, moreover, yields spatially focused results.
Scientific Article
2008
Brockmann, D. ; Theis, F.J.
IEEE Pervasive Comput. 7, 28-35 (2008)
no Abstract
Scientific Article
Wong, P. ; Althammer, S. ; Hildebrand, A. ; Kirschner, A. ; Pagel, P. ; Geissler, B. ; Smialowski, P. ; Blöchl, F. ; Oesterheld, M. ; Schmidt, T. ; Strack, N. ; Theis, F.J. ; Ruepp, A. ; Frishman, D.
BMC Genomics 9:629 (2008)
Background: We have recently released a comprehensive, manually curated database of mammalian protein complexes called CORUM. Combining CORUM with other resources, we assembled a dataset of over 2700 mammalian complexes. The availability of a rich information resource allows us to search for organizational properties concerning these complexes. Results: As the complexity of a protein complex in terms of the number of unique subunits increases, we observed that the number of such complexes and the mean non-synonymous to synonymous substitution ratio of associated genes tend to decrease. Similarly, as the number of different complexes a given protein participates in increases, the number of such proteins and the substitution ratio of the associated gene also tend to decrease. These observations provide evidence relating natural selection and the organization of mammalian complexes. We also observed greater homogeneity in terms of predicted protein isoelectric points, secondary structure and substitution ratio in annotated versus randomly generated complexes. A large proportion of the protein content and interactions in the complexes could be predicted from known binary protein-protein and domain-domain interactions. In particular, we found that large proteins interact preferentially with much smaller proteins. Conclusions: We observed similar trends in yeast and other data. Our results support the existence of conserved relations associated with the mammalian protein complexes.
Scientific Article
Begemann, M. ; Sargin, D. ; Rossner, M.J. ; Bartels, C. ; Theis, F.J. ; Wichert, S.P. ; Stender, N. ; Fischer, B. ; Sperling, S. ; Stawicki, S. ; Wiedl, A. ; Falkai, P. ; Nave, K.A. ; Ehrenreich, H.
Mol. Med. 14, 546-552 (2008)
Molecular mechanisms underlying bipolar affective disorders are unknown. Difficulties arise from genetic and phenotypic heterogeneity of patients and the lack of animal models. Thus, we focused on only one patient (n = 1) with an extreme form of rapid cycling. Ribonucleic acid (RNA) from peripheral blood mononuclear cells (PBMC) was analyzed in a three-tiered approach under widely standardized conditions. Firstly, RNA was extracted from PBMC of eight blood samples, obtained on two consecutive days within one particular episode, including two different consecutive depressive and two different consecutive manic episodes, and submitted to (1) screening by microarray hybridizations, followed by (2) detailed bioinformatic analysis, and (3) confirmation of episode-specific regulation of genes by quantitative real-time polymerase chain reaction (qRT-PCR). Secondly, results were validated in additional blood samples obtained one to two years later. Among gene transcripts elevated in depressed episodes were prostaglandin D synthetase (PTGDS) and prostaglandin D2 11-ketoreductase (AKR1C3), both involved in hibernation. We hypothesized them to account for some of the rapid cycling symptoms. A subsequent treatment approach over 5 months applying the cyclooxygenase inhibitor celecoxib (2 x 200 mg daily) resulted in reduced severity rating of both depressed and manic episodes. This case suggests that rapid cycling is a systemic disease, resembling hibernation, with prostaglandins playing a mediator role.
Scientific Article
Adamcio, B. ; Sargin, D. ; Stradomska, A. ; Medrihan, L. ; Gertler, C. ; Theis, F.J. ; Zhang, M. ; Müller, M. ; Hassouna, I. ; Hannke, K. ; Sperling, S. ; Radyushkin, K. ; El-Kordi, A. ; Schulze, L. ; Ronnenberg, A. ; Wolf, F. ; Brose, N. ; Rhee, J.S. ; Zhang, W. ; Ehrenreich, H.
BMC Biol. 6:37 (2008)
BACKGROUND: Erythropoietin (EPO) improves cognition of human subjects in the clinical setting by as yet unknown mechanisms. We developed a mouse model of robust cognitive improvement by EPO to obtain the first clues of how EPO influences cognition, and how it may act on hippocampal neurons to modulate plasticity. RESULTS: We show here that a 3-week treatment of young mice with EPO enhances long-term potentiation (LTP), a cellular correlate of learning processes in the CA1 region of the hippocampus. This treatment concomitantly alters short-term synaptic plasticity and synaptic transmission, shifting the balance of excitatory and inhibitory activity. These effects are accompanied by an improvement of hippocampus dependent memory, persisting for 3 weeks after termination of EPO injections, and are independent of changes in hematocrit. Networks of EPO-treated primary hippocampal neurons develop lower overall spiking activity but enhanced bursting in discrete neuronal assemblies. At the level of developing single neurons, EPO treatment reduces the typical increase in excitatory synaptic transmission without changing the number of synaptic boutons, consistent with prolonged functional silencing of synapses. CONCLUSION: We conclude that EPO improves hippocampus dependent memory by modulating plasticity, synaptic connectivity and activity of memory-related neuronal networks. These mechanisms of action of EPO have to be further exploited for treating neuropsychiatric diseases.
Scientific Article
Lutter, D. ; Ugocsai, P. ; Grandl, M. ; Orso, E. ; Theis, F.J. ; Lang, E.W. ; Schmitz, G.
BMC Bioinformatics 9:100 (2008)
BACKGROUND: The analysis of high-throughput gene expression data sets derived from microarray experiments still is a field of extensive investigation. Although new approaches and algorithms are published continuously, mostly conventional methods like hierarchical clustering algorithms or variance analysis tools are used. Here we take a closer look at independent component analysis (ICA) which is already discussed widely as a new analysis approach. However, deep exploration of its applicability and relevance to concrete biological problems is still missing. In this study, we investigate the relevance of ICA in gaining new insights into well characterized regulatory mechanisms of M-CSF dependent macrophage differentiation. RESULTS: Statistically independent gene expression modes (GEM) were extracted from observed gene expression signatures (GES) through ICA of different microarray experiments. From each GEM we deduced a group of genes, henceforth called sub-mode. These sub-modes were further analyzed with different database query and literature mining tools and then combined to form so called meta-modes. With them we performed a knowledge-based pathway analysis and reconstructed a well known signal cascade. CONCLUSION: We show that ICA is an appropriate tool to uncover underlying biological mechanisms from microarray data. Most of the well known pathways of M-CSF dependent monocyte to macrophage differentiation can be identified by this unsupervised microarray data analysis. Moreover, recent research results like the involvement of proliferation associated cellular mechanisms during macrophage differentiation can be corroborated.
Scientific Article
Theis, F.J. ; Gruber, P. ; Keck, I.R. ; Lang, E.W.
Neurocomp. 71, 2209-2216 (2008)
Real-world data sets such as recordings from functional magnetic resonance imaging (fMRI) often possess both spatial and temporal structures. Here, we propose an algorithm including such spatiotemporal information into the analysis, and reduce the problem to the joint approximate diagonalization of a set of autocorrelation matrices. We demonstrate the feasibility of the algorithm by applying it to fMRI analysis, where previous approaches are outperformed considerably.
Scientific Article
Schachtner, R. ; Lutter, D. ; Knollmüller, P. ; Tomé, A.M. ; Theis, F.J. ; Schmitz, G. ; Stetter, M. ; Vilda, P.G. ; Lang, E.W.
Bioinformatics 24, 1688-1697 (2008)
Modern machine learning methods based on matrix decomposition techniques, like independent component analysis (ICA) or non-negative matrix factorization (NMF), provide new and efficient analysis tools which are currently explored to analyze gene expression profiles. These exploratory feature extraction techniques yield expression modes (ICA) or metagenes (NMF). These extracted features are considered indicative of underlying regulatory processes. They can also be applied to the classification of gene expression datasets by grouping samples into different categories for diagnostic purposes or by grouping genes into functional categories for further investigation of related metabolic pathways and regulatory networks. RESULTS: In this study we focus on unsupervised matrix factorization techniques and apply ICA and sparse NMF to microarray datasets. The latter monitor the gene expression levels of human peripheral blood cells during differentiation from monocytes to macrophages. We show that these tools are able to identify relevant signatures in the deduced component matrices and extract informative sets of marker genes from these gene expression profiles. The methods rely on the joint discriminative power of a set of marker genes rather than on single marker genes. With these sets of marker genes, corroborated by leave-one-out or random forest cross-validation, the datasets could easily be classified into related diagnostic categories. The latter correspond to either monocytes versus macrophages or healthy versus Niemann Pick C disease patients.
Scientific Article
2007
Gutch, H.W. ; Theis, F.J.
Lect. Notes Comput. Sc. 4666, 49-56 (2007)
Independent Subspace Analysis (ISA) is a generalization of ICA. It tries to find a basis in which a given random vector can be decomposed into groups of mutually independent random vectors. Since the first introduction of ISA, various algorithms to solve this problem have been introduced; however, a general proof of the uniqueness of ISA decompositions has remained an open question. In this contribution we address this question and sketch a proof for the separability of ISA. The key condition for separability is to require the subspaces to be not further decomposable (irreducible). Based on a decomposition into irreducible components, we formulate a general model for ISA without restrictions on the group sizes. The validity of the uniqueness result is illustrated on a toy example. Moreover, an extension of ISA to subspace extraction is introduced and its indeterminacies are discussed.
Scientific Article
Kawanabe, M. ; Theis, F.J.
Signal Process. 87, 1890-1903 (2007)
In this article, we consider high-dimensional data which contains a low-dimensional non-Gaussian structure contaminated with Gaussian noise. Motivated by the joint diagonalization algorithms, we propose a linear dimension reduction procedure called joint low-dimensional approximation (JLA) to identify the non-Gaussian subspace. The method uses matrices whose non-zero eigen spaces coincide with the non-Gaussian subspace. We also prove its global consistency, that is the true mapping to the non-Gaussian subspace is achieved by maximizing the contrast function defined by such matrices. As examples, we will present two implementations of JLA, one with the fourth-order cumulant tensors and the other with Hessian of the characteristic functions. A numerical study demonstrates validity of our method. In particular, the second algorithm works more robustly and efficiently in most cases.
Scientific Article
Theis, F.J. ; Kawanabe, M.
Lect. Notes Comput. Sc. 4666, 121-128 (2007)
With the advent of high-throughput data recording methods in biology and medicine, the efficient identification of meaningful subspaces within these data sets becomes an increasingly important challenge. Classical dimension reduction techniques such as principal component analysis often do not take the large statistics of the data set into account, and thereby fail if the signal space is for example of low power but meaningful in terms of some other statistics. With ‘colored subspace analysis’, we propose a method for linear dimension reduction that evaluates the time structure of the multivariate observations. We differentiate the signal subspace from noise by searching for a subspace of non-trivially autocorrelated data; algorithmically we perform this search by joint low-rank approximation. In contrast to blind source separation approaches we however do not require the existence of sources, so the model is applicable to any wide-sense stationary time series without restrictions. Moreover, since the method is based on second-order time structure, it can be efficiently implemented even for large dimensions. We conclude with an application to dimension reduction of functional MRI recordings.
Scientific Article
Theis, F.J. ; Georgiev, P. ; Cichocki, A.
EURASIP J. Adv. Signal Process. 2007:052105 (2007)
An algorithm called Hough SCA is presented for recovering the matrix A in x(t) = A s(t), where x(t) is a multivariate observed signal, possibly of lower dimension than the unknown sources s(t). The sources are assumed to be sparse in the sense that at every time instant t, s(t) has fewer nonzero elements than the dimension of x(t). The presented algorithm performs a global search for hyperplane clusters within the mixture space by gathering possible hyperplane parameters within a Hough accumulator tensor. This renders the algorithm immune to the many local minima typically exhibited by the corresponding cost function. In contrast to previous approaches, Hough SCA is linear in the sample number and independent of the source dimension as well as robust against noise and outliers. Experiments demonstrate the flexibility of the proposed algorithm.
Scientific Article

* external authors, # joint first authors, ° joint corresponding authors