# Introduction to Statistics using Python

## Course description

#### Requirements:

Basic programming skills in Python. Specifically, we require basic familiarity with the packages pandas, matplotlib, seaborn, and numpy. We will use the Anaconda (Spyder) IDE and provide support only for this setup.

#### Course overview:

“Introduction to Statistics using Python” is a foundational course in statistical analysis for scientists and practitioners. This introductory course combines an overview of basic statistical methods with their application in Python. This course covers descriptive statistics, classical statistical tests, and linear regression and its extensions. All methods are explained in an applied setting. By the end of the course, you will be able to identify appropriate statistical methods, apply them, and interpret your results. This course does not require any previous knowledge of statistics.

#### Topics:

The course covers the following basic statistical methods:

• Descriptive statistics
  • Levels of variables
  • Measures of central tendency and variability (mean, median, variance, …)
  • Classical statistical graphics and when to apply them (histograms, boxplots, violin plots, …)
• Random variables
  • Distribution of random variables
  • Characteristics of distributions
  • Confidence intervals
• Hypothesis testing
  • How to apply tests
  • Classical statistical tests (t-test, ANOVA, …)
  • When to apply which test
  • Multiple testing and corrections
• Linear regression
  • Idea of linear regression
  • How to apply and interpret linear models
  • Limits of linear regression
• The focus in all chapters is on understanding when to apply which method, how to run it in Python, and how to interpret the output. Limitations and extensions of the methods are also discussed.
• The content is the same as in our Introduction to Statistics course, which is taught using R.
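
To give a rough feel for the kind of workflow the topics above describe (summarise the data, compare two groups, fit a line), here is a minimal sketch. It is not taken from the course materials: the data are made up, the helper names `describe`, `welch_t_statistic`, and `fit_line` are hypothetical, and it uses only the Python standard library to stay dependency-free. Which libraries the course itself uses for these steps is not stated here.

```python
import math
import statistics

# Made-up illustrative measurements (assumption: not course data).
group_a = [5.1, 4.9, 5.6, 5.8, 5.0, 5.4]
group_b = [6.0, 6.3, 5.7, 6.1, 6.4, 5.9]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
    }


def welch_t_statistic(x, y):
    """Welch's t statistic for two independent samples (statistic only, no p-value)."""
    standard_error = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    return (statistics.mean(x) - statistics.mean(y)) / standard_error


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


summary_a = describe(group_a)
t_stat = welch_t_statistic(group_a, group_b)
slope, intercept = fit_line([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])

print("Group A summary:", summary_a)
print("Welch t statistic (A vs. B):", round(t_stat, 2))
print("Fitted line: y =", round(intercept, 2), "+", round(slope, 2), "* x")
```

A negative t statistic here only says that the mean of group A is below that of group B. Judging whether the difference is statistically meaningful additionally requires a reference distribution and a p-value, which this sketch deliberately leaves out.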

This is not an introductory programming course. Basic programming skills in Python are a prerequisite for this course and can be acquired in the course Introduction to Python.

#### Methods:

• Each day consists of blocks covering first the statistical theory behind the methods and their application in Python, and then hands-on examples with best-practice solutions.
• The trainers are happy to answer questions during the talks and exercises.

#### Format:

• Duration: 4 days
• Language: English
• This course will be offered either on campus (in person) or online.
• For online courses, we use Zoom.

#### Materials:

• Please be aware that the materials will be updated shortly before the next course.

#### Dates and Application:

• This course is currently not part of the HR Development program at Helmholtz Munich. Employees of Helmholtz Munich can participate either in the R version of this course (Introduction to Statistics) or in the courses we provide via HIDA (see below).
• Courses will be provided for HIDA in 2023:
• You can check the current dates and whether the courses are already fully booked here.
• Registration for these courses is possible only via the provided homepage.