Multivariate Statistics 2

Advanced dimensionality reduction techniques

Course description

Requirements:

Programming skills in R (e.g. from the course Introduction to R), basic knowledge of statistics (e.g. from the course Introduction to Statistics) and knowledge of basic dimension reduction techniques such as PCA (e.g. from the course Multivariate Statistics 1). Some practice with ggplot2 is also welcome but not mandatory; it can be acquired in the course Graphics with R.

Course overview:

The participants will learn when and how to apply unsupervised and supervised dimension reduction techniques such as MDS, MFA, t-SNE, UMAP, PCR and PLSR. A short introduction to PCA will be given at the beginning of the lecture (more details on PCA are covered in the course Multivariate Statistics 1). The course content will help participants understand the theoretical basis of a multivariate analysis. All topics are accompanied by hands-on exercises using the statistical software R. The participants are invited to ask as many questions as they want about the analysis of their own datasets.

Topics:

This course on multivariate statistics covers two different topics:

  • Unsupervised dimension reduction methods. This first chapter starts with a short recap of the basic principles of principal component analysis (PCA). After this introduction and a short overview of other unsupervised multivariate methods (e.g. for categorical variables), more advanced dimension reduction techniques are explained, namely multidimensional scaling (MDS) and multiple factor analysis (MFA) for data structured into groups. This chapter also covers techniques developed for high-dimensional data sets (e.g. omics data), namely t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP).
  • Supervised dimension reduction methods. This second chapter covers two supervised learning methods: principal component regression (PCR) and partial least squares regression (PLSR).
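
As a rough illustration of the two topic areas, the following minimal sketch uses only base R and the built-in mtcars data. It is not taken from the course materials: the choice of data set, the response variable (mpg), the number of components (k = 2) and all variable names are assumptions made for this example, and PCR is built by hand from prcomp() and lm() rather than with a dedicated package.

```r
# Illustrative sketch only (assumed data set and settings, not course material).
X <- scale(mtcars[, -1])   # standardized predictors (all columns except mpg)
y <- mtcars$mpg            # response used for the supervised part

# Unsupervised: principal component analysis (PCA)
pca <- prcomp(X)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)  # share of variance per component

# Unsupervised: classical multidimensional scaling (MDS) on Euclidean distances
mds <- cmdscale(dist(X), k = 2)

# Supervised: principal component regression (PCR) by hand,
# i.e. a linear regression of y on the first k principal component scores
k <- 2
scores <- pca$x[, 1:k]
pcr_fit <- lm(y ~ scores)
r_squared <- summary(pcr_fit)$r.squared
```

With Euclidean distances, classical MDS reproduces the PCA scores up to the sign of each axis, so comparing mds with pca$x[, 1:2] is a quick sanity check. PLSR differs from this PCR sketch in that its components are constructed using the response as well as the predictors.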

Methods:

Each day consists of blocks that first cover the theory behind the methods and then their application in R. Theoretical lessons are followed by hands-on examples with best-practice solutions.

Format:

  • Duration: 2 days
  • Language: English
  • This course will be offered either on campus (in person) or online.
  • Online courses are held via Zoom.

Materials:

  • Material for the course can be found here*.
  • Please install the necessary R packages prior to the course. The packages are listed in "Materials_Multivariate_Statistics_2.html", which is part of the linked ZIP folder.
  • Please be aware that the materials will be updated shortly before the next course.

Dates and Application:

  • Courses provided for Helmholtz Munich:
    • You can check the current dates and whether the courses are already fully booked here*. Course registration usually opens 8 weeks prior to the course.
    • Please read the corresponding FAQ* before applying via the forms of the HR Development department*.
  • Courses provided for HIDA:
    • You can check the current dates and whether the courses are already fully booked here.
    • Registration for these courses is possible exclusively via the provided homepage.

 * Links marked with * are only available to Helmholtz Munich staff.