STAT 200C: High-dimensional Statistics

Spring 2021

The course surveys modern techniques for analyzing high-dimensional and nonparametric estimation problems. The emphasis is on non-asymptotic bounds obtained via concentration inequalities. A theme of the course is understanding the effective complexity and dimension of statistical models, and the theoretical role played by regularizers in managing that complexity. Techniques from empirical process theory will be discussed in some depth, providing a unifying framework for analyzing many of the optimization-based estimators used in statistical machine learning. As examples, the theory of sparse linear models, sparse covariance matrix estimation, and smooth function estimation will be covered. Students are expected to develop an in-depth knowledge of these tools by working through challenging problem sets and surveying techniques used in the current literature.
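For a taste of the non-asymptotic bounds the course develops, a classical example is Hoeffding's inequality (stated here informally, as an illustration rather than as part of the syllabus): for independent random variables X_1, ..., X_n with X_i taking values in [a_i, b_i],

\[
\mathbb{P}\!\left( \left| \frac{1}{n}\sum_{i=1}^{n} \bigl(X_i - \mathbb{E}[X_i]\bigr) \right| \ge t \right)
\;\le\; 2\exp\!\left( - \frac{2 n^2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right),
\qquad t > 0 .
\]

Note that the bound holds for every sample size n, with no limiting arguments; this "finite-sample" character is what distinguishes the techniques studied here from classical asymptotic theory.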

General info

TA session and OH

  • Section: Tuesday, 4-5pm via Zoom.
  • OH: Monday, 12:30-2pm, sign up on CCLE, same Zoom link.

Campuswire and Gradescope

  • Announcements will be posted on Campuswire (Code: 0856).
  • Please sign up on Campuswire as soon as possible by following the link.
  • Use Gradescope for homework submission. (Entry Code: P5KVG7)
  • Use this Google Sheet to sign up for the project.

Slides, Notes, Homework

Lectures

Textbook

The following is a list of other closely related sources:

Syllabus

  • Concentration inequalities: Gaussian, sub-Gaussian and sub-exponential concentration, Hoeffding and Bernstein inequalities, the bounded differences inequality, concentration of random matrices, etc.
  • Analysis of empirical risk minimization procedures: Excess risk bounds, oracle inequalities, etc.
  • Empirical process theory: Uniform laws of large numbers, symmetrization, contraction inequality, chaining, VC classes, metric entropy, Gaussian and Rademacher complexities, etc.
  • High-dimensional sparse regression (a representative bound is sketched after this list).
  • Covariance matrix estimation and PCA under sparsity.
  • Reproducing kernel Hilbert spaces for function estimation.
  • Nonparametric regression and kernel density estimation.
  • Information-theoretic lower bounds and minimax rates.
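As a representative example of the sparse regression results covered (stated informally here, with the precise conditions and constants deferred to the lectures): for the Lasso estimator

\[
\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^{d}} \;
\frac{1}{2n}\, \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 ,
\]

a suitable choice \(\lambda \asymp \sigma \sqrt{\log d / n}\), together with a restricted eigenvalue condition on the design X, yields with high probability

\[
\lVert \hat{\beta} - \beta^{*} \rVert_2 \;\lesssim\; \sigma \sqrt{\frac{s \log d}{n}}
\]

for an s-sparse target \(\beta^{*}\) and noise level \(\sigma\), so accurate estimation is possible even when the number of parameters d far exceeds the sample size n.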

Time permitting, a sample of the following topics will be covered:

  • General theory of decomposable regularizers.
  • Low-rank matrix estimation.
  • Community detection and spectral clustering.
  • Sparse graphical models.

Prerequisites

STATS 200A and 200B (or graduate-level probability theory and theoretical statistics), real analysis, and linear algebra.

Grading

  • Problem sets will be assigned biweekly and constitute 60% of the grade. Students are encouraged to discuss the problems together, but must independently produce and submit solutions. Use of LaTeX in preparing the solutions is strongly encouraged.

  • Writing a short paper and presenting the results constitute 40% of the grade. The paper should be on a research-level topic related to the course. A common approach is to read some papers on a particular problem and write an expository paper summarizing and clarifying the results. The focus should be on understanding and illustrating the theoretical arguments used in the literature. Where possible, students are encouraged to attempt to extend published theory to similar problems. An alternative is to analyze a promising method that is reported in the literature to have empirical success but lacks theoretical guarantees, and to try to provide such guarantees using the tools discussed in the course. Students are encouraged to work in groups of two and co-author a paper; in that case, the contribution of each author should be noted. Students will present their work in the final week of the class. Students are encouraged to practice their communication skills and will be graded on the effectiveness of their presentation: 10% of the 40% will be based on the quality of the presentation.