Carsten Marr Project - DETAILS -
SUPERVISORS
Dr. Carsten Marr, Nasrine Bekhedda, Daniele Scarcella
REQUIREMENTS
- Working knowledge of Python (mandatory)
- Familiarity with R (recommended)
- Experience with single-cell data analysis (recommended)
OBJECTIVES
The internship aims to provide hands-on experience with state-of-the-art single-cell and multi-modal data analysis pipelines. We will focus on the following objectives:
Single-cell RNA-seq (scRNA-seq) analysis pipeline:
- Quality Control, Normalisation, Feature Selection, Dimensionality Reduction
- Clustering, Data Integration
- Differential Gene Expression Analysis (DEA)
- Gene Set Enrichment Analysis (GSEA)
- Cell-to-Cell Communication (CCC) Analysis
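As a rough illustration of the first steps listed above, the sketch below applies a simple quality-control filter followed by library-size normalisation and a log(1 + x) transform. It uses only the Python standard library and a made-up count matrix. The cell names, count values, and the `MIN_COUNTS` threshold are assumptions chosen for illustration, not values from the project. In practice these steps would more likely be run with Scanpy (for example `sc.pp.filter_cells`, `sc.pp.normalize_total`, and `sc.pp.log1p`) on real data.

```python
import math

# Toy count matrix: one row per cell, one column per gene (hypothetical values).
counts = {
    "cell_a": [10, 0, 5, 85],
    "cell_b": [1, 0, 0, 1],    # very few counts: expected to fail QC
    "cell_c": [20, 30, 0, 50],
}

MIN_COUNTS = 10      # assumed QC threshold, chosen for this toy example only
TARGET_SUM = 10_000  # each cell is scaled to this total count


def quality_control(counts, min_counts):
    """Keep only cells whose total count reaches min_counts."""
    return {cell: row for cell, row in counts.items() if sum(row) >= min_counts}


def normalise_log1p(counts, target_sum):
    """Scale each cell to target_sum total counts, then apply log(1 + x)."""
    normalised = {}
    for cell, row in counts.items():
        total = sum(row)  # non-zero for every cell that passed QC
        normalised[cell] = [math.log1p(x / total * target_sum) for x in row]
    return normalised


filtered = quality_control(counts, MIN_COUNTS)
log_norm = normalise_log1p(filtered, TARGET_SUM)

for cell, row in log_norm.items():
    print(cell, [round(value, 2) for value in row])
```

Scaling every cell to the same total before the log transform removes differences in sequencing depth, so that downstream steps such as feature selection and clustering compare relative expression rather than raw library size.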
Single-cell ATAC-seq (scATAC-seq) analysis pipeline:
- Quality Control, Normalisation, Feature Selection, Dimensionality Reduction
- Differential Gene Score Analysis
- Motif Accessibility Analysis
- Gene Regulatory Network Inference
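For the scATAC-seq normalisation step, one widely used idea is TF-IDF weighting of the cell-by-peak matrix before dimensionality reduction. The sketch below shows one common textbook formulation (term frequency multiplied by a log inverse document frequency) on a made-up binary matrix. It is a simplification under stated assumptions: the matrix values and the `tf_idf` helper are hypothetical, and tools such as ArchR implement their own variants of this weighting.

```python
import math

# Toy binarised accessibility matrix: one row per cell, one column per peak
# (hypothetical values; 1 means the peak is accessible in that cell).
accessibility = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
]


def tf_idf(matrix):
    """Weight each entry by term frequency times log inverse document frequency.

    TF  = entry divided by the cell's total accessible peaks
    IDF = log(1 + n_cells / number of cells in which the peak is accessible)
    This is one common variant, used here purely for illustration.
    """
    n_cells = len(matrix)
    n_peaks = len(matrix[0])
    # Number of cells in which each peak is accessible.
    peak_freq = [sum(row[j] for row in matrix) for j in range(n_peaks)]
    # Peaks never observed get weight 0 to avoid division by zero.
    idf = [math.log(1 + n_cells / f) if f > 0 else 0.0 for f in peak_freq]

    weighted = []
    for row in matrix:
        total = sum(row)
        tf = [x / total if total > 0 else 0.0 for x in row]
        weighted.append([t * w for t, w in zip(tf, idf)])
    return weighted


weights = tf_idf(accessibility)
for row in weights:
    print([round(value, 3) for value in row])
```

The weighting gives more influence to peaks that are accessible in few cells than to peaks that are open almost everywhere, which tends to make the subsequent dimensionality reduction more informative about cell identity.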
TIMELINE
Week 1: Familiarisation with the working environment, including the Helmholtz Campus, your workspace, and lab members.
Weeks 2-3: Review of the literature and best practices in single-cell analysis.
Weeks 4-6: Setting up the pipelines and obtaining preliminary results.
Weeks 7-8: Structuring results, writing the final report, and preparing an oral presentation.
PROPOSAL
Recent advances in single-cell technologies and computational approaches have made it possible to dissect the molecular mechanisms underlying leukemia [1,2] and pre-leukemic conditions [3] at unprecedented resolution. Our project applies state-of-the-art computational pipelines to multi-modal single-cell data of human blood from both diseased and healthy donors. As our summer intern, you will perform scRNA-seq and scATAC-seq data analysis, covering data preprocessing, cell cluster annotation, differential expression analysis, gene set enrichment analysis, and cell-to-cell communication analysis. You will also explore chromatin accessibility, perform motif accessibility analysis, and infer gene regulatory networks.
You will learn and implement the best practices for single-cell analysis described by Heumos et al. [4], with an emphasis on reproducible workflows. Tools such as Scanpy [5], scvi-tools [6], ArchR [7], and CellChat [8] will be used to process and integrate the datasets.
REFERENCES
1. Liu, J., Jiang, P., Lu, Z. et al. Decoding leukemia at the single-cell level: clonal architecture, classification, microenvironment, and drug resistance. Exp Hematol Oncol 13, 12 (2024). https://doi.org/10.1186/s40164-024-00479-6
2. Hu, T., Cheng, B., Matsunaga, A. et al. Single-cell analysis defines highly specific leukemia-induced neutrophils and links MMP8 expression to recruitment of tumor associated neutrophils during FGFR1 driven leukemogenesis. Exp Hematol Oncol 13, 49 (2024). https://doi.org/10.1186/s40164-024-00514-6
3. Jakobsen, N. A. et al. Selective advantage of mutant stem cells in human clonal hematopoiesis is associated with attenuated response to inflammation and aging. Cell Stem Cell 31, 1127–1144.e17 (2024). https://doi.org/10.1016/j.stem.2024.05.010
4. Heumos, L., Schaar, A.C., Lance, C. et al. Best practices for single-cell analysis across modalities. Nat Rev Genet 24, 550–572 (2023). https://doi.org/10.1038/s41576-023-00586-w
5. Wolf, F., Angerer, P. & Theis, F. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol 19, 15 (2018). https://doi.org/10.1186/s13059-017-1382-0
6. Gayoso, A., Lopez, R., Xing, G. et al. A Python library for probabilistic analysis of single-cell omics data. Nat Biotechnol 40, 163–166 (2022). https://doi.org/10.1038/s41587-021-01206-w
7. Granja, J.M., Corces, M.R., Pierce, S.E. et al. ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat Genet 53, 403–411 (2021). https://doi.org/10.1038/s41588-021-00790-6
8. Jin, S., Plikus, M.V. & Nie, Q. CellChat for systematic analysis of cell–cell communication from single-cell transcriptomics. Nat Protoc (2024). https://doi.org/10.1038/s41596-024-01045-4