Principal component analysis and clustering algorithms
Multivariate Statistics 1
Course description
This course introduces essential unsupervised learning methods for exploring and uncovering hidden structures in complex, high-dimensional datasets. You will learn the theory and application of dimension reduction with Principal Component Analysis (PCA) and advanced tools like t-SNE and UMAP. In addition, the course introduces pattern discovery with a range of clustering algorithms. The focus is on practical implementation in R, equipping you to translate complex data into meaningful insights.
Target Audience
Researchers who want to learn foundational methods for dimension reduction and pattern discovery.
Topics
The course covers two main areas of unsupervised analysis:
- Dimension Reduction:
  - Core principles of PCA for simplifying complex data.
  - How to select and interpret principal components.
  - Advanced tools for dimension reduction like t-SNE and UMAP and their usage.
- Cluster Analysis for Pattern Discovery:
  - Understanding and choosing appropriate dissimilarity measures.
  - Applying core methods like k-means and hierarchical clustering.
  - Advanced clustering techniques including the Louvain method.
Methods
Each module introduces a statistical concept, followed immediately by practical exercises with best-practice solutions. We use R for the practical exercises.
Learning Goals
At the end of this course, you will be able to:
- Apply PCA to reduce data dimensionality and interpret the results.
- Select and implement appropriate clustering algorithms (k-means, hierarchical).
- Identify and characterize distinct groups within your data.
- Use advanced methods like Louvain for more complex data.
- Understand and interpret the outputs of the different approaches.
Prerequisites
Programming skills in R (e.g., from the course Introduction to R) and basic knowledge of statistics (e.g., from the course Introduction to Statistics).
Format
- Duration: either 2 full days or 4 half days
- Language: English
- This course will be offered either on campus (in person) or online.
- For online courses, we use Zoom.
Dates and Application
- Courses provided for Helmholtz Munich:
  - You can check the current dates and whether the courses are already fully booked here*.
  - Please read the corresponding FAQ* before applying via the forms of the HR Development department*.
- Courses provided for HIDA:
* Links marked with * are only available for Helmholtz Munich staff.