# Python Basics for Statistics (March 19-20, 2020)

## The course is fully booked. Filling out the registration puts you on the waiting list.

Introduction to the Python environment, basic commands and data structures, data management and statistical programming (numpy, scipy, pandas), descriptive statistics, inferential statistics (tests and linear regression), graphics, and applied exercises.

Instructor | Felix Skarke / Flavio Morelli |
---|---|
Number of Places | about 25 participants (fu:stat reserves the right to reschedule or cancel the course if the number of participants falls below 15) |
There are vacancies | No |
Registration Mode | General information about our courses (participant groups, registration and cancellation, payment, organization, certificates of participation, etc.) can be found here. |
Participation Fee | 100 € for students, 200 € for employees; for members of Potsdam Graduate School: 40 € for PhD students, 60 € for postdocs. Financial support from Dahlem Research School: Unfortunately, free participation for doctoral candidates of the Berlin University Alliance is temporarily suspended; more information can be found on the main page for courses. |
Room | FB Wirtschaftswissenschaft, Garystr. 21, 14195 Berlin, PC-Pool 2 (basement) |
Time | Thursday, March 19, 2020 and Friday, March 20, 2020, 9:00 a.m. to 5:00 p.m. |

## Student Profile

Students, PhD candidates and academic staff from **all** universities.

## Requirements

Basic knowledge of descriptive and inferential statistics at the level of our courses “Statistik-Kompakt” or “Statistik-Grundlagen”. No prior knowledge of Python is required.

## Description

Python is one of the world’s most widely used programming languages. Its simple syntax makes it accessible to beginners, and it is used in a large variety of applications, from microcontrollers to web programming. Python is open-source and freely available, unlike other common statistical solutions such as SAS, SPSS, MATLAB and Stata. Further advantages are its comprehensive standard library, which includes many common functions, and the availability of many high-quality libraries for different use cases.

In the last couple of years, Python has been adopted increasingly for scientific programming in fields like economics, mathematics, physics, statistics, psychology, and data science. The scientific programming ecosystem comprises packages like numpy and scipy for numerical computing, pandas for data transformation, scikit-learn for machine learning, TensorFlow for deep learning, and OpenCV for computer vision.

This course provides the basics of the scientific programming environment in Python. It introduces programming concepts before explaining how to work with scientific packages like numpy, scipy and pandas. We learn the basic procedures of an empirical analysis, such as descriptive statistics, statistical tests, linear regression, and data visualization. After the course, participants should be able to use the Python documentation independently and apply the tools to answer research questions.

Topics:

- Using Python with Conda and Jupyter
- Python programming basics (data structures, functions, importing libraries)
- Numerical computing with numpy, scipy
- Data editing with pandas
- Descriptive statistics
- Data visualization in Python
- Statistical tests
- Linear regression
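
As a rough illustration of the kind of workflow these topics point to, the following minimal sketch computes descriptive statistics with pandas, runs a two-sample test, and fits a simple linear regression with scipy. It is not taken from the course materials: the data are synthetic, and the column names (`group`, `x`, `y`) and the choice of Welch's t-test are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data for illustration only (hypothetical, not course material).
rng = np.random.default_rng(seed=0)
n = 50
df = pd.DataFrame({
    "group": np.repeat(["a", "b"], n),
    "x": rng.normal(loc=0.0, scale=1.0, size=2 * n),
})
# y depends linearly on x (slope 2, intercept 1) plus random noise.
df["y"] = 2.0 * df["x"] + 1.0 + rng.normal(scale=0.5, size=2 * n)

# Descriptive statistics of y for each group.
summary = df.groupby("group")["y"].describe()

# Two-sample t-test comparing y between the groups
# (equal_var=False selects Welch's variant).
y_a = df.loc[df["group"] == "a", "y"]
y_b = df.loc[df["group"] == "b", "y"]
t_result = stats.ttest_ind(y_a, y_b, equal_var=False)

# Simple linear regression of y on x.
fit = stats.linregress(df["x"], df["y"])

print(summary)
print(f"t = {t_result.statistic:.3f}, p = {t_result.pvalue:.3f}")
print(f"slope = {fit.slope:.3f}, intercept = {fit.intercept:.3f}")
```

With real data, the `DataFrame` would typically be loaded from a file (for example with `pd.read_csv`) rather than generated.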