Introduction to the Python environment, basic commands and data structures, data management and statistical programming (numpy, scipy, pandas), descriptive statistics, inferential statistics (tests and linear regression), graphics, and applied exercises.

Instructor | Felix Skarke / Flavio Morelli |
---|---|
Number of Places | about 25 participants (fu:stat reserves the right to reschedule or cancel the course if the number of participants falls below 15) |
Registration | → Register online |
Registration Mode | General information about our courses (target group, registration and cancellation, payment, organisation, certificates of participation, etc.) can be found here. |
Participation Fee | 100 € for students, 200 € for employees; members of Potsdam Graduate School pay 40 € (PhD students) or 60 € (postdocs). PhD students of a member institution of the Berlin University Alliance can participate free of charge. |
Room | FB Wirtschaftswissenschaft, Garystr. 21, 14195 Berlin, PC-Pool 2 (basement) |
Time | Thursday, March 19, 2020 and Friday, March 20, 2020, 9:00 a.m. to 5:00 p.m. |

Students, PhD candidates and academic staff from **all** universities.

Basic knowledge of descriptive and inferential statistics at the level of our courses “Statistik-Kompakt” or “Statistik-Grundlagen”. No prior knowledge of Python is required.

Python is one of the world’s most widely used programming languages. Its simple syntax makes it accessible to beginners, and it is used in a large variety of applications, from microcontrollers to web programming. Python is open-source and freely available, unlike other common statistical solutions such as SAS, SPSS, MATLAB and Stata. A further advantage is its comprehensive standard library, which includes many common functions, together with the availability of many high-quality third-party libraries for different use cases.
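As a small illustration of what the standard library alone covers, the following sketch computes a few summary statistics without installing any additional package. The values in `scores` are made up for this example and are not course material.

```python
import statistics
from collections import Counter

# Made-up example values (an assumption for illustration only)
scores = [2.0, 3.5, 3.5, 4.0, 5.0]

mean_score = statistics.mean(scores)      # arithmetic mean
median_score = statistics.median(scores)  # middle value of the sorted data
stdev_score = statistics.stdev(scores)    # sample standard deviation

# Most frequent value together with its count
most_common = Counter(scores).most_common(1)[0]

print(f"mean={mean_score:.2f}, median={median_score}, sd={stdev_score:.2f}")
print(f"most common value and count: {most_common}")
```

For larger data sets and more advanced methods, the scientific packages covered in the course are usually the better choice.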

In the last couple of years, Python has been adopted increasingly for scientific programming in fields like economics, mathematics, physics, statistics, psychology, and data science. The scientific programming ecosystem comprises packages like numpy and scipy for numerical computing, pandas for data transformation, scikit-learn for machine learning, TensorFlow for deep learning, and OpenCV for computer vision.

This course provides the basics of the scientific programming environment in Python. It introduces programming concepts before explaining how to work with scientific packages like numpy, scipy and pandas. We learn the basic procedures of an empirical analysis, such as descriptive statistics, statistical tests, linear regression, and data visualization. After the course, participants should be able to use the Python documentation independently and apply the tools to answer research questions.

Topics:

- Using Python with Conda and Jupyter
- Python programming basics (data structures, functions, importing libraries)
- Numerical computing with numpy, scipy
- Data editing with pandas
- Descriptive statistics
- Data visualization in Python
- Statistical tests
- Linear regression
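
To give an impression of how several of these topics fit together, here is a minimal sketch of a small analysis with numpy, pandas and scipy. It is not taken from the course material: the data are simulated, and the column names `group`, `x` and `y` as well as the chosen test (Welch's t-test) are assumptions made for illustration. Plotting is left out to keep the example short.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Simulated data (an assumption for illustration): y depends linearly on x
rng = np.random.default_rng(seed=0)
n = 50
group = np.repeat(["a", "b"], n // 2)
x = rng.normal(loc=0.0, scale=1.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

# Data management with pandas
df = pd.DataFrame({"group": group, "x": x, "y": y})

# Descriptive statistics of y per group
summary = df.groupby("group")["y"].describe()
print(summary)

# Statistical test: Welch's two-sample t-test comparing y between groups
y_a = df.loc[df["group"] == "a", "y"]
y_b = df.loc[df["group"] == "b", "y"]
t_result = stats.ttest_ind(y_a, y_b, equal_var=False)
print(f"t = {t_result.statistic:.3f}, p = {t_result.pvalue:.3f}")

# Simple linear regression of y on x
fit = stats.linregress(df["x"], df["y"])
print(f"slope = {fit.slope:.3f}, intercept = {fit.intercept:.3f}")
```

Because the data are simulated with a slope of 2 and an intercept of 1, the regression estimates should land close to those values.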