MA 363: Probability in High Dimensions (Spring 2021)

Instructor: Anirban Basak
Office hours: Online. By email appointment.
Class time and location: TuTh 10.00-11.30 AM. Lectures will be held over Zoom. Interested students are requested to email the instructor for the meeting link. First class is on March 02.
Prerequisite: This is a graduate-level topics course in probability theory. Graduate-level measure-theoretic probability will be useful but is not required. Students are expected to be familiar with basic probability theory and linear algebra. The course will be accessible to advanced undergraduates who have had sufficient exposure to probability and linear algebra.
Course outline: This course aims at understanding the behavior of random geometric objects in high-dimensional spaces, such as random vectors, random graphs, random matrices, and random subspaces. Topics will include the concentration of measure phenomenon, non-asymptotic random matrix theory, empirical processes, and related topics from geometric functional analysis and convex geometry. In the latter half of the course, depending on students' interests, a few applications of the topics covered in the first half will be considered, such as community detection, covariance estimation, randomized dimension reduction, and sparse recovery problems.
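As a taste of the concentration of measure phenomenon mentioned above, the following minimal numerical sketch (not part of the course materials; it assumes NumPy is available) checks that the Euclidean norm of a standard Gaussian vector in R^n concentrates sharply around sqrt(n) as the dimension grows:

```python
# Sketch: concentration of the norm of a high-dimensional Gaussian vector.
# For X ~ N(0, I_n), ||X||_2 / sqrt(n) is close to 1 with high probability,
# and its fluctuations shrink as n grows (at rate roughly 1/sqrt(n)).
import numpy as np

rng = np.random.default_rng(0)

def norm_ratio_stats(n, trials=2000):
    """Mean and std. dev. of ||X||_2 / sqrt(n) over many draws of X ~ N(0, I_n)."""
    samples = rng.standard_normal((trials, n))
    ratios = np.linalg.norm(samples, axis=1) / np.sqrt(n)
    return ratios.mean(), ratios.std()

for n in (10, 100, 10000):
    mean, std = norm_ratio_stats(n)
    print(f"n = {n:>6}: mean ratio = {mean:.3f}, fluctuation = {std:.4f}")
```

The mean ratio stays near 1 while the fluctuation column shrinks with n, which is the high-dimensional concentration effect the course makes quantitative.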

Suggested books and references:
1. Roman Vershynin, High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge Series in Statistical and Probabilistic Mathematics (Series Number 47), Cambridge University Press, 2018.
2. Roman Vershynin, Introduction to the non-asymptotic analysis of random matrices, in Compressed Sensing: Theory and Applications, pp. 210-268, Cambridge University Press, 2012.
3. Ramon van Handel, Probability in High Dimension, lecture notes.
4. Stéphane Boucheron, Gábor Lugosi, and Pascal Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence, Oxford University Press, 2013.
5. Michel Ledoux and Michel Talagrand, Probability in Banach Spaces, Springer Science & Business Media, 2013.
6. Avrim Blum, John Hopcroft, and Ravindran Kannan, Foundations of Data Science, Cambridge University Press, 2020.
7. Joel Tropp, An Introduction to Matrix Concentration Inequalities, Foundations and Trends in Machine Learning, Vol. 8, No. 1-2, pp. 1-230, 2015.

Schedule of topics covered and lecture notes:
Mar 02: Introduction and a brief overview of the topics to be considered.
Mar 04 and 09: Sub-Gaussian and sub-exponential random variables, their properties, and some concentration bounds.
Mar 11 and 16: Examples and properties of random vectors in high dimension.
Mar 18, 23, and 25: MAX-CUT, Cut norm, Community detection, semidefinite relaxations, and Grothendieck's inequality.
Mar 30 and Apr 01: Hanson-Wright inequality and its applications.
Apr 06, 08, and 20: Matrix concentration inequalities.
Apr 13 and 15: No lectures.
Apr 22 and 27: Applications of matrix concentration inequalities in community detection and covariance estimation.
Apr 29, May 03, and 05: Symmetrization trick, bounds on operator norms of random matrices, and an application to the matrix completion problem.
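The concentration bounds from the early lectures (Mar 04 and 09) can be tried out numerically. The following sketch (not from the course notes; it assumes NumPy is available) compares the empirical tail of a Rademacher sum against Hoeffding's inequality, a basic sub-Gaussian tail bound: if eps_1, ..., eps_n are independent uniform signs, then P(|eps_1 + ... + eps_n| >= t*sqrt(n)) <= 2*exp(-t^2/2).

```python
# Sketch: empirical tail of a Rademacher sum vs. Hoeffding's bound.
# The bound 2*exp(-t^2/2) holds for every n; the simulation shows the
# empirical tail probability sitting below it.
import numpy as np

rng = np.random.default_rng(1)

def tail_vs_bound(n=1000, t=2.0, trials=20000):
    """Empirical P(|sum of n signs| >= t*sqrt(n)) and the Hoeffding bound."""
    signs = rng.choice([-1.0, 1.0], size=(trials, n))
    sums = signs.sum(axis=1)
    empirical = np.mean(np.abs(sums) >= t * np.sqrt(n))
    bound = 2.0 * np.exp(-t**2 / 2.0)
    return empirical, bound

emp, bnd = tail_vs_bound()
print(f"empirical tail = {emp:.4f}, Hoeffding bound = {bnd:.4f}")
```

For t = 2 the bound is about 0.27, while the empirical tail is close to the Gaussian value P(|Z| >= 2), illustrating that Hoeffding's inequality is valid but not tight for this example.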

Problem Set (to be updated regularly): See here

Grading: Students taking this course for credit are required to do one of the following: 1) solve at least 60% of the problems assigned during the lectures, or 2) complete a (reading) project, submit a report, and give a presentation on it at the end of the semester. The final grade will be based primarily on effort.