Stat 202C: Monte Carlo Methods for Optimization 

 MW 4-5:15 pm, Spring 2016, Boelter Hall 5273

  [syllabus.pdf] 

Course Description

This graduate-level course introduces Monte Carlo methods for optimization, estimation and learning, including: importance sampling; sequential importance sampling; Markov chain Monte Carlo (MCMC) sampling techniques, including Gibbs samplers, Metropolis-Hastings and various improvements; simulated annealing; exact sampling techniques; convergence analysis; data augmentation; cluster sampling, such as Swendsen-Wang and SW-cuts; equi-energy and multi-domain samplers; and mapping energy landscapes.
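
As a small illustration of the first topic listed above, importance sampling, here is a minimal sketch in Python. It is not taken from the course materials: the target (a standard normal tail probability), the shifted normal proposal, and the names `normal_pdf` and `importance_sampling_tail` are all assumptions chosen for the example.

```python
import math
import random


def normal_pdf(x, mean=0.0, std=1.0):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_sampling_tail(threshold=3.0, n_samples=100_000, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1).

    Samples are drawn from the proposal q = N(threshold, 1) (an assumed,
    illustrative choice) and reweighted by p(x) / q(x).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal q
        if x > threshold:  # indicator of the rare event
            total += normal_pdf(x) / normal_pdf(x, mean=threshold)
    return total / n_samples


estimate = importance_sampling_tail()
# Closed-form tail probability of the standard normal, for comparison.
exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))
print(f"importance sampling: {estimate:.6f}, exact: {exact:.6f}")
```

Sampling directly from N(0, 1) would hit the event X > 3 only about once in every 740 draws; shifting the proposal to the threshold puts about half of the samples in the region of interest, and the weights correct for the change of distribution.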

Prerequisites
Textbooks
   The lectures will be based on the following book draft.
Instructors
Grading Plan: 4 units, letter grades
    The grade will be based on four parts
        2 homework assignments                           20%
        3 small projects                                  45%
             Project 1:  Importance sampling for counting the number of SAWs in a lattice (15%) 
             Project 2:  Exact sampling of Potts model with Gibbs sampler    (15%)
             Project 3:  Cluster sampling for Potts model using Swendsen-Wang method (15%)
        Final exam                                         35%

Tentative List of Topics

   Chapter 1,   Introduction to Monte Carlo Methods                                                                                                     
   1. Monte Carlo methods in science and engineering
      -- Simulation, estimation, sampling, optimization and learning.
   2. Topics and issues in Monte Carlo methods

  Chapter 2,   Sequential Monte Carlo                                     
   1. Importance sampling and weighted samples 
   2. Advanced importance sampling techniques 
   3. Framework for sequential Monte Carlo 
         (selection, pruning, resampling, ...)
   4. Application to learning log-linear/Gibbs models
   5. Application: particle filtering in object tracking         

  Chapter 3,  Background on Markov Chains
   1. The transition matrix 
   2. Topology of transition matrix: communication and period 
   3. Positive recurrence and invariant measures 
   4. Ergodicity theorem                 
                                                         
   Chapter 4, The Metropolis method and its variants
   1. Metropolis algorithm and the Hastings generalization
   2. Special case: Metropolized independence sampler    
   3. Reversible jumps and trans-dimensional MCMC           
 
   Chapter 5 Gibbs sampler and its variants                               
   1. Gibbs sampler                           
   2. Generalizations:
       Hit-and-run, Multi-grid, generalized Gibbs, Metropolized Gibbs
   3. Data association and data augmentation
   4. Slice sampling 

   Chapter 6  Cluster sampling
   1. Ising/Potts models
   2. Swendsen-Wang and cluster sampling
   3. Original papers by Swendsen-Wang and Edwards-Sokal
      in Physical Review Letters and Physical Review D.
   4. Three interpretations of the SW method

  Chapter 7 Monte Carlo for Bayesian statistics 
   1. Bayesian hierarchical modeling
   2. Missing data imputation

  Chapter 8 Convergence analysis                                       
   1. Monitoring and diagnosing convergence 
   2*. Contraction coefficient 
   3. Peskun's ordering
   4*. Eigen-structures of the transition matrix 
         (Perron-Frobenius theorem, spectral theorem)
   5. Geometric bounds 
   6*. Exact analysis of the independence Metropolized sampler (IMS)
   7*. First hitting time analysis and bounds for IMS (paper) 
   8. Path coupling techniques.
        Bounds for Gibbs sampler and Swendsen-Wang algorithm (paper).
   * Discussed in previous chapters.
          
   Chapter 9  Exact sampling                                        
   1. Coupling from the past (CFTP)
   2. Bounding chains

  Chapter 10 Advanced topics                                           
   1. Equi-energy and multi-domain samplers
   2. Wang-Landau algorithm
   3. Stochastic gradient
   4. Mapping the energy landscape and case studies
   5. Comparing the clustering algorithms
   6. Landscapes for curriculum learning