STAT 200C: High-dimensional Statistics

Spring 2022


The course surveys modern techniques for analyzing high-dimensional and nonparametric estimation problems. The emphasis is on non-asymptotic bounds via concentration inequalities. A recurring theme is understanding the effective complexity and dimension of the models, together with a theoretical understanding of the role regularizers play in managing that complexity. Students are expected to develop an in-depth knowledge of these tools by working through challenging problem sets and surveying techniques used in the current literature.

Logistics

  • Lectures: TR 2pm-3:15pm. ROYCE 160 (also online via Zoom)
  • Instructor: Arash A. Amini
    • Office Hours: Thursdays 11am-1pm (starting 1/18/24).
  • TA: Samuel Onyambu.
  • Announcements will be posted on Campuswire. Please sign up using code 0123.
  • Slides, homework, etc. will be posted on the Box folder.
  • Homework should be submitted on Gradescope. Please sign up using code 86VXPV.
  • Project sign-up sheet.

Syllabus

  • Concentration inequalities:
    • Basic concentration: sub-Gaussian and sub-exponential.
    • Hoeffding and Bernstein inequalities
    • Azuma–Hoeffding and bounded difference
    • Gaussian concentration
    • Concentration of random matrices.
  • Metric entropy
  • High-dimensional sparse regression.
  • High-dimensional covariance matrix estimation.
  • Nonparametric regression or high-dimensional PCA.
  • Information-theoretic lower bounds.

Current lectures

Old Lectures

Notes

Textbook

The following is a list of other closely related sources:

Prerequisites

  • STATS 200A/B or graduate level probability theory and theoretical statistics.
  • Real analysis.
  • Linear algebra.

Grading

  • Problem sets will be assigned bi-weekly and constitute 60% of the grade. Students are encouraged to discuss the problems together but must produce and submit their solutions independently. Use of LaTeX in preparing the solutions is strongly encouraged.

  • A short paper and a presentation of its results constitute 40% of the grade. The paper should be on a research-level topic related to the course.

    • A common approach is to read some papers on a particular problem and write an expository paper summarizing and clarifying the results. The focus should be on understanding and illustrating the theoretical arguments used in the literature. Where possible, students are encouraged to attempt to extend published theory to similar problems.
    • An alternative is to analyze a promising method, reported in the literature to have empirical success but without theoretical guarantees, and try to provide such guarantees using the tools discussed in the course.

    Students are encouraged to work in groups of two and co-author a paper. The contribution of each author should be noted in this case. Students will present their work in the final week of the class. Students are encouraged to practice their communication skills and will be graded on the effectiveness of their presentation. Of the 40%, 10% is for the quality of the presentation.