Project II: Detecting Faces in Images by Boosting Technique

 
 1. Objectives.
                     Boosting is a general method for improving the accuracy of any given learning algorithm. One can use it to combine simple 
       or weak classifiers, each performing only slightly better than random guessing, to form an arbitrarily good hypothesis. Viola and Jones employed 
       AdaBoost (an adaptive boosting method) for object detection and achieved good performance when applying it to human face detection [1].
        

                     In this project, you are required to implement the AdaBoost and RealBoost algorithms for frontal human face detection.
        Implementing the cascade is not required. 

2. The project includes the following steps.
     
  (2.1) Construction of weak learners
       Compute the value of each rectangle feature for each sample. Each feature corresponds to a weak learner (also called a decision stump). Determine the threshold that separates 
        face samples from non-face samples for every weak learner. Calculate the classification error of each weak learner and draw the 10 best features 
        (before boosting) in the style of Figure 3 in [1].
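
       The feature computation above can be sketched with an integral image (summed-area table), which makes every
       rectangle sum a four-lookup operation. This is only an illustration: the function names are hypothetical, the
       image is assumed to be a list of rows of grayscale values, and the sign convention (left half minus right half)
       is an assumption, since the handout does not fix one.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle feature over a (2w x h) window: left half minus right half."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)
```

       The other feature types in [1] (vertical two-rectangle, three-rectangle, four-rectangle) can be built from
       rect_sum in the same way.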

  (2.2) AdaBoosting
       Implement the AdaBoost algorithm following Table 1 in [1] to boost the weak classifiers you obtained in (2.1). Vary the number of weak learners the algorithm
       selects (T in Table 1) from 2 to 200. Display the ten best features (after boosting) in the style of Figure 3 in [1] and compare them with the features from (2.1).
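
       A minimal sketch of the boosting loop, assuming the feature values have already been computed into a table
       features[j][i] (feature j, sample i) and that labels are 1 for faces and 0 for non-faces, as in Table 1 of [1].
       All names are hypothetical. The stump convention used here (polarity +1 predicts "face" below the threshold,
       polarity -1 predicts "face" at or above it) is a simplification chosen for the sketch, and the clamping of the
       error is an added numerical guard, not part of Table 1.

```python
import math


def best_stump(values, labels, weights):
    """Return (weighted error, threshold, polarity) of the best stump for one feature."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    total_pos = sum(w for w, y in zip(weights, labels) if y == 1)
    total_neg = sum(w for w, y in zip(weights, labels) if y == 0)
    pos_below = neg_below = 0.0  # weight of samples strictly below the candidate threshold
    best = (float("inf"), 0.0, 1)
    for k, i in enumerate(order):
        if k == 0 or values[i] != values[order[k - 1]]:
            # polarity +1 errs on negatives below and positives at/above the threshold;
            # polarity -1 errs on positives below and negatives at/above it.
            for err, pol in ((neg_below + total_pos - pos_below, 1),
                             (pos_below + total_neg - neg_below, -1)):
                if err < best[0]:
                    best = (err, values[i], pol)
        if labels[i] == 1:
            pos_below += weights[i]
        else:
            neg_below += weights[i]
    return best


def stump_predict(value, threshold, polarity):
    """Return 1 (face) or 0 (non-face)."""
    if polarity == 1:
        return 1 if value < threshold else 0
    return 1 if value >= threshold else 0


def adaboost(features, labels, T):
    """Return a list of (alpha, feature_index, threshold, polarity) with T entries."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    weights = [1.0 / (2 * n_pos) if y == 1 else 1.0 / (2 * n_neg) for y in labels]
    strong = []
    for _ in range(T):
        total = sum(weights)
        weights = [w / total for w in weights]
        candidates = ((best_stump(f, labels, weights), j) for j, f in enumerate(features))
        (err, theta, pol), j = min(candidates, key=lambda c: c[0][0])
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0) and division by zero
        beta = err / (1.0 - err)
        # Correctly classified samples are down-weighted by beta; the others keep their weight.
        weights = [w * (beta if stump_predict(v, theta, pol) == y else 1.0)
                   for w, v, y in zip(weights, features[j], labels)]
        strong.append((math.log(1.0 / beta), j, theta, pol))
    return strong


def strong_score(strong, sample_features):
    """Weighted vote minus half the total weight; a score >= 0 means 'face'."""
    votes = sum(a * stump_predict(sample_features[j], th, p) for a, j, th, p in strong)
    return votes - 0.5 * sum(a for a, _, _, _ in strong)
```

       Calling best_stump with uniform weights gives the unboosted error of each weak learner, which is what step (2.1)
       asks for. The real-valued strong_score is also what you would threshold at different levels to trace an ROC curve.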

  (2.3) Cross validation
       The two steps above use the whole dataset. Now randomly divide the dataset into 5 equal-size, non-overlapping subsets. Use one subset as the testing
       set and the other four as the training set, and run the algorithm with T = 50, 100, and 200. Plot the three ROC curves, one for each value of T,
       in a single figure.
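
       The split and the ROC computation can be sketched as follows. This assumes each test sample has a real-valued
       classifier score (higher means more face-like) and a 0/1 label; the function names and the fixed seed are
       hypothetical choices made for the illustration.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Shuffle range(n) and split it into k non-overlapping folds of (nearly) equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs as the threshold is lowered."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for k, (score, y) in enumerate(pairs):
        if y == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample in a group of tied scores.
        if k + 1 == len(pairs) or pairs[k + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points
```

       With folds = k_fold_indices(n), fold 0 can serve as the testing set and the union of folds 1 to 4 as the
       training set. The list returned by roc_points can be passed to any plotting tool, one curve per value of T.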

  (2.4) Testing
       Apply the algorithm (with T = 200) to the images that I took in class. Present the results by overlaying the detected windows on the images.
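
       One way to scan an image is to slide a fixed-size window over each downscaled copy and map the accepted
       windows back to the original resolution. The sketch below assumes a hypothetical callable score_window(image, x, y, win)
       that returns the classifier score for one window; the stride, the acceptance threshold, and all names are
       illustrative assumptions rather than values prescribed by the project.

```python
def sliding_windows(width, height, win=24, stride=2):
    """Yield the top-left corner (x, y) of every win x win window inside the image."""
    for y in range(0, height - win + 1, stride):
        for x in range(0, width - win + 1, stride):
            yield x, y


def detect_on_level(level_img, scale, score_window, win=24, stride=2, threshold=0.0):
    """Score every window of one downscaled image.

    scale is the shrink factor that produced level_img from the original image.
    Returns (x, y, size, score) tuples in original-image coordinates.
    """
    h, w = len(level_img), len(level_img[0])
    boxes = []
    for x, y in sliding_windows(w, h, win, stride):
        s = score_window(level_img, x, y, win)
        if s >= threshold:
            boxes.append((round(x / scale), round(y / scale), round(win / scale), s))
    return boxes
```

       The returned boxes are what you would draw on the original image. Overlapping detections of the same face
       are not merged here.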

  (2.5) RealBoosting
          Implement the RealBoost algorithm using the top T = 50, 100, 200 features you chose in the AdaBoost step (2.2). Run the cross validation from step (2.3),
          plot the three ROC curves for RealBoost, and compare them with the ROC curves you obtained with AdaBoost.
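
          The handout does not spell out the RealBoost variant, so the following is only a sketch under stated assumptions:
          it uses the confidence-rated form in which each weak learner splits its feature range into equal-width bins and
          outputs 0.5 * ln(p_b / q_b) in bin b, where p_b and q_b are the weighted masses of faces and non-faces in that bin.
          Labels are assumed to be +1 / -1, features[j][i] holds only the candidate features kept from AdaBoost, and the
          names, the number of bins, and the smoothing constant eps are hypothetical.

```python
import math


def bin_index(value, lo, hi, n_bins):
    """Map a feature value to one of n_bins equal-width bins over [lo, hi]."""
    if hi <= lo:
        return 0
    b = int((value - lo) / (hi - lo) * n_bins)
    return min(max(b, 0), n_bins - 1)


def fit_bins(values, labels, weights, n_bins, eps=1e-6):
    """Return (Z, lo, hi, h), where h[b] is the real-valued response in bin b."""
    lo, hi = min(values), max(values)
    p = [0.0] * n_bins  # weighted mass of faces (y = +1) per bin
    q = [0.0] * n_bins  # weighted mass of non-faces (y = -1) per bin
    for v, y, w in zip(values, labels, weights):
        b = bin_index(v, lo, hi, n_bins)
        if y == 1:
            p[b] += w
        else:
            q[b] += w
    z = 2.0 * sum(math.sqrt(pb * qb) for pb, qb in zip(p, q))
    h = [0.5 * math.log((pb + eps) / (qb + eps)) for pb, qb in zip(p, q)]
    return z, lo, hi, h


def realboost(features, labels, T, n_bins=8):
    """Return a list of (feature_index, lo, hi, h) weak learners with T entries."""
    n = len(labels)
    weights = [1.0 / n] * n
    strong = []
    for _ in range(T):
        fits = [(fit_bins(f, labels, weights, n_bins), j) for j, f in enumerate(features)]
        (z, lo, hi, h), j = min(fits, key=lambda c: c[0][0])  # smallest Z wins
        strong.append((j, lo, hi, h))
        # Confidence-rated update: w_i <- w_i * exp(-y_i * h(x_i)), then renormalize.
        weights = [w * math.exp(-y * h[bin_index(v, lo, hi, n_bins)])
                   for w, v, y in zip(weights, features[j], labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return strong


def realboost_score(strong, sample_features):
    """Sum of the real-valued responses; a positive score means 'face'."""
    return sum(h[bin_index(sample_features[j], lo, hi, len(h))] for j, lo, hi, h in strong)
```

          Because realboost_score is real-valued, it can be fed to the same ROC procedure used for AdaBoost, which
          makes the two sets of curves directly comparable.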

3. Datasets

    The dataset includes a total of 11,800 frontal faces in two sizes, 16x16 pixels and 24x24 pixels,
    and 45,000 non-faces, which also come in the two sizes. The non-faces were collected through a "negative mining"
    procedure: running the detection code and adding the false alarms (hard examples) to the non-face set.

           Faces:  face16x16.zip,     face24x24.zip    
    Non-faces:  nonface16x16.zip,    nonface24x24.zip

   Note: training the boosting code takes a long time, so use a small number of examples while you
test your code, and run it on the full dataset (you can use only the 16x16 set, only the 24x24 set, or both) after you have verified
that your code is correct.

4. Test images
     
   These are images taken in class [download]. Use them for testing only, not for training.
   Note that these images are too large: you need to downscale each one into a sequence of images (a pyramid)
   so that both the largest and the smallest faces appear as 24x24 or 16x16 pixels at least once in the pyramid.
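
   The pyramid requirement can be turned into a list of shrink factors. In this sketch, min_face and max_face are
   your own estimates (in pixels) of the smallest and largest face in the original image; they, the step ratio, and
   the function name are assumptions introduced for the example.

```python
def pyramid_scales(min_face, max_face, win=24, step=1.25):
    """Return shrink factors s (new size = s * old size), largest first.

    A face of f pixels appears as s * f pixels at scale s, so the factors run
    from win / min_face (smallest faces) down to win / max_face (largest faces).
    """
    scales = []
    s = win / min_face
    s_last = win / max_face
    while s > s_last:
        scales.append(s)
        s /= step
    scales.append(s_last)  # make sure the largest faces reach the window size
    return scales
```

   For example, pyramid_scales(48, 192, win=24, step=2.0) gives [0.5, 0.25, 0.125]. A smaller step produces more
   levels and a finer match between face size and window size, at the cost of more windows to score.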

5. References

       [1] P. Viola and M. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", CVPR 2001. [pdf]
  
      [2] C. Huang, H. Ai, Y. Li, and S. Lao, "High-performance Rotation Invariant Multi-View Face Detection", 
                       IEEE Trans. on PAMI, 29(4), 2007. [pdf]
               (This paper uses other types of features, uses RealBoost, and handles multiple views.)