E401: Machine Learning for Economic Data

This course is an introduction to popular tools from the field of statistical and machine learning. After reviewing basic concepts, such as the bias-variance trade-off, linear regression, and cross-validation, we will cover a broad range of machine learning methods, including shrinkage estimators (ridge regression and LASSO), splines, and random forests. Throughout the course, we will use the software package R and economic data to illustrate the concepts and methods discussed. Empirical projects with real-world data and student presentations will be an integral part of the class. We will also use the projects to discuss the full data science workflow, from obtaining, importing, and cleaning data, through visualizing it, to communicating the results of empirical analyses.

Until Fall 2021, this class was offered as E392: Topics in Big Data. Since Fall 2022, this class has also been offered as an M.Sc.-level class cross-listed as M518: Big Data in Economics.

All lecture material is available on Canvas.