
Basic Methods in Machine Learning

Course description

This course provides a practical introduction to the basic concepts and techniques of machine learning using Python. It is designed for researchers and focuses on building predictive models to classify data and predict outcomes, which distinguishes it from traditional statistical inference. You will learn the complete workflow, from preparing data to building, evaluating, and interpreting models. The course is highly interactive and features live coding in Jupyter Notebooks so that you can apply these methods to your own work.

This course is equivalent to the former course “Introduction to Machine Learning”.

 

Target Audience

Researchers interested in building a practical foundation in predictive modeling using Python, with no prior machine learning experience required.
Note for Helmholtz Munich doctoral researchers: this course cannot replace the mandatory course “Introduction to Statistics”.

 

Topics

  • The Machine Learning Workflow: 
    • Key paradigms: supervised and unsupervised learning
    • Distinguishing machine learning from traditional statistical analysis
    • Standard machine learning workflow and the importance of train-test splits
  • Predictive Modeling with Regression: 
    • Linear regression for predictive tasks
    • Shrinkage methods (Ridge, Lasso) for regularization, with Lasso additionally performing automatic feature selection, to handle high-dimensional data and prevent overfitting
  • Foundational Classification Algorithms: 
    • Logistic regression, K-Nearest Neighbors, and decision trees for classification problems
    • Interpreting outputs and decision boundaries of these models
  • Robust Model Evaluation and Selection:
    • Cross-validation for estimating model performance
    • Choosing and justifying appropriate performance metrics (e.g., confusion matrix, ROC/AUC) to identify the best model for a specific research question
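
To give a flavor of the regression topics listed above, here is a minimal sketch using scikit-learn. It is an illustration under stated assumptions, not course material: the synthetic dataset, the `alpha` values, and all variable names are chosen arbitrarily for this example. It holds out a test set, fits plain, Ridge, and Lasso regression, and counts how many coefficients each model sets exactly to zero.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration): 100 samples, 50 features,
# of which only 5 actually influence the target.
X, y = make_regression(
    n_samples=100, n_features=50, n_informative=5, noise=10.0, random_state=0
)

# Hold out a test set so performance is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The alpha values are arbitrary; in practice they would be tuned,
# for example with cross-validation.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=5.0),
}

results = {}
for name, estimator in models.items():
    # The scaler sits inside the pipeline, so it is fitted on training data only.
    pipeline = make_pipeline(StandardScaler(), estimator)
    pipeline.fit(X_train, y_train)
    coefficients = pipeline[-1].coef_
    results[name] = {
        "test_r2": pipeline.score(X_test, y_test),
        "n_zero_coefficients": int(np.sum(coefficients == 0.0)),
    }

for name, summary in results.items():
    print(
        f"{name}: test R^2 = {summary['test_r2']:.3f}, "
        f"zero coefficients = {summary['n_zero_coefficients']}"
    )
```

In a run of this sketch, Lasso typically drives many coefficients exactly to zero, whereas Ridge only shrinks them towards zero. This is why Lasso, but not Ridge, is commonly described as performing feature selection.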

Differences from the “Introduction to Statistics” course:

  • Emphasis on predictive modeling rather than statistical inference.
  • Focus on understanding key ML terminology and practical applications.

 

Methods

The course integrates foundational theory with intensive, hands-on coding sessions in Python. Each theoretical concept is immediately followed by practical examples and exercises with best-practice solutions.

 

Learning Goals

At the end of this course, you will be able to:

  1. Understand the ML framework
  2. Build and interpret regression models
  3. Apply core classification algorithms
  4. Rigorously evaluate model performance
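
As a rough illustration of goals 3 and 4, the following sketch compares three classifiers with cross-validation and then evaluates the selected one on a held-out test set. It is a hedged example, not the course's own solution: the dataset (scikit-learn's bundled breast cancer data), the hyperparameters, and the variable names are assumptions made for this sketch.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Example dataset bundled with scikit-learn (an assumption for illustration).
X, y = load_breast_cancer(return_X_y=True)

# Keep a stratified test set aside; it is used only once, at the very end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters are arbitrary illustrative choices.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k_nearest_neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# 5-fold cross-validated ROC AUC, computed on the training data only.
cv_auc = {
    name: cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc").mean()
    for name, model in candidates.items()
}

# Select the model with the best cross-validated score and refit it.
best_name = max(cv_auc, key=cv_auc.get)
best_model = candidates[best_name].fit(X_train, y_train)

# Final, one-off evaluation on the untouched test set.
test_auc = roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1])
test_confusion = confusion_matrix(y_test, best_model.predict(X_test))

print("Cross-validated AUC per model:", cv_auc)
print("Selected model:", best_name)
print("Test AUC:", round(test_auc, 3))
print("Test confusion matrix:")
print(test_confusion)
```

Model selection here relies on cross-validation scores alone, so the test set still provides an unbiased estimate for the chosen model. Whether ROC AUC is the right metric depends on the research question, for example on how costly false positives are relative to false negatives.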

 

Prerequisites

Programming skills in Python (e.g., Introduction to Python course). Basic understanding of data analysis and fundamental statistical approaches is recommended.

 

Format

  • Duration: either 2 full days or 4 half days
  • Language: English
  • This course is offered either on campus (in person) or online
  • Online courses are held via Zoom

 

Dates and Application

  • Courses provided for Helmholtz Munich:
    • You can check the current dates and whether the courses are already fully booked here*.
    • Please read the corresponding FAQ* before applying via the forms of the HR Development department*.
  • Courses provided for HIDA:
    • You can check the current dates and whether the courses are already fully booked here.
    • Registration for these courses is possible only via the provided homepage.

 * Links marked with * are only available for Helmholtz Munich staff.