Project III: Object classification by SVM

 

1, Objective

This project is an exercise in detecting pedestrians and vehicles in common video surveillance scenes. See examples in the figure below. As with AdaBoost for face detection, we slide a window over the image, and at each position the window extracts a patch to be classified. Windows classified as positive are then displayed in different colors. These windows overlap with each other and come in various sizes and aspect ratios to account for pedestrians and for cars at front and side views.
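The sliding-window scan described above can be sketched as follows. This is a minimal illustration, not the project's required settings: the image size, window size, stride, and scale factors are all assumed values.

```matlab
% Enumerate sliding-window positions over an image (illustrative sketch).
% All sizes and strides below are assumed values, not project settings.
imgH = 240; imgW = 320;          % image height and width in pixels
winH = 128; winW = 64;           % base window size (tall, pedestrian-like)
stride = 16;                     % step between neighboring windows
scales = [1.0 1.5];              % the window is enlarged by each factor

boxes = zeros(0, 4);             % rows of [x y w h], top-left corner, 1-based
for s = scales
    h = round(winH * s);
    w = round(winW * s);
    for y = 1:stride:(imgH - h + 1)
        for x = 1:stride:(imgW - w + 1)
            boxes(end+1, :) = [x y w h];
        end
    end
end
fprintf('%d candidate windows\n', size(boxes, 1));
% A patch would then be cut out as img(y:y+h-1, x:x+w-1) and resized to
% winH-by-winW before feature extraction.
```

A smaller stride or more scales gives denser coverage at the cost of more windows to classify. A different aspect ratio (for example, a wide window for side-view cars) only changes winH and winW.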

In this project, you will do the following steps.

1, Use the training examples (positives only) to train 4 SVM classifiers: 1 for pedestrians and 3 for cars at three viewing angles (front, 45-degree and 90-degree side views). The feature extraction code and the linear SVM code can be downloaded below. You will design the negative examples yourself: for example, use Google image search to select images that contain no pedestrians or cars, and extract windows from these images. Each image can thus produce a large number of negative examples. You decide how many images you need.

2, On a set of 13 real test images, slide the four classifier windows over each image and classify the windows with the SVMs that you trained. Print out the results and count the number of false alarms and missed detections.

3, Do "negative mining" to improve the performance. Download the large set of background images below, which contain no pedestrians or vehicles, and apply each of your SVM classifiers to them. Any positive window on these images is therefore a false alarm. These are considered "harder" examples because they confused your classifier. Add the false alarms of each classifier to its own negative training set, and re-train the classifiers.

4, Test the improved classifiers on the 13 test images again. Print out the results and count the number of false alarms and missed detections.
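Counting false alarms and missed detections requires a rule for when a detected window "hits" an object. The project text does not fix one, so the sketch below assumes a common choice: a detection counts as correct if its intersection-over-union with a not-yet-matched hand-marked box is at least 0.5. Extra detections on an already matched object are counted as false alarms here, which is also an assumption. All box coordinates are made up for illustration.

```matlab
% Boxes are rows of [x y w h]. Values below are made up for illustration.
gt  = [ 50  40  64 128;          % ground-truth objects, marked by hand
       200  60  64 128];
det = [ 54  44  64 128;          % overlaps the first ground-truth box
       120 150  64 128;          % lies on background: a false alarm
        60  36  64 128];         % second detection of the first object
thr = 0.5;                       % assumed overlap threshold

% Intersection-over-union of two boxes
area  = @(b) b(3) * b(4);
inter = @(a, b) max(0, min(a(1)+a(3), b(1)+b(3)) - max(a(1), b(1))) * ...
                max(0, min(a(2)+a(4), b(2)+b(4)) - max(a(2), b(2)));
iou   = @(a, b) inter(a, b) / (area(a) + area(b) - inter(a, b));

matched = false(size(gt, 1), 1);
falseAlarms = 0;
for i = 1:size(det, 1)
    best = 0; bestJ = 0;
    for j = 1:size(gt, 1)
        o = iou(det(i, :), gt(j, :));
        if o > best && ~matched(j)
            best = o; bestJ = j;
        end
    end
    if best >= thr
        matched(bestJ) = true;       % detection explains object bestJ
    else
        falseAlarms = falseAlarms + 1;
    end
end
misses = sum(~matched);              % objects no detection explained
fprintf('false alarms: %d, missed detections: %d\n', falseAlarms, misses);
```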

In practice, people do many iterations of the negative mining process so as to reduce false alarms.
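The iteration can be sketched as a loop: train, scan the background pool, add the false alarms to the negative set, and repeat. To keep the sketch runnable without LibSVM, a nearest-class-mean linear classifier stands in for the SVM training step; in the project this step would be the svmtrain call. The 2-D points are made-up stand-ins for HoG feature vectors.

```matlab
% Hard-negative mining loop (sketch). A nearest-class-mean linear
% classifier is a stand-in for SVM training; all data are made up.
pos  = [4 4; 5 4; 4 5; 5 5];                 % positive features
neg  = [0 0; 1 0; 0 1; 1 1];                 % initial negative features
pool = [0 0; 2 2; 2.8 2.6; 2.6 2.6; 1 3];    % windows from background images

nFalse = [];                                 % false alarms found per round
for it = 1:5
    % "Train" (stand-in for svmtrain): boundary halfway between class means
    mp = mean(pos, 1);  mn = mean(neg, 1);
    w = (mp - mn)';
    b = -((mp + mn) / 2) * w;

    % Apply the classifier to the background pool; every positive
    % response there is a false alarm by construction
    scores = pool * w + b;
    hard = pool(scores > 0, :);
    nFalse(end+1) = size(hard, 1);
    if isempty(hard), break; end

    % Add the hard examples to the negative set (without duplicates)
    neg = unique([neg; hard], 'rows');
end
disp(nFalse)
```

On this toy data the first round finds two false alarms and the second round finds none, so the loop stops. With real images, each round would also rescan the background set with the retrained classifier.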

2, Data

The data include the following images: Download here
Pedestrians: 2416 images [Pedestrian.zip]
Cars (frontal): 265 images [Car00.zip]
Cars (45 degree): 418 images [Car45.zip]
Cars (90 degree): 387 images [Car90.zip]
Background: large images with no pedestrians or vehicles, on which you can do the negative mining. [negative_mining]

3, Tools and codes

For the project, you will use the HoG (Histogram of Oriented Gradients) feature to extract a feature vector X=(x1, ..., xn) from each image patch, and then train the SVM on these vectors for classification.
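To give a feel for what such a feature vector contains, here is a heavily simplified sketch of the histogram for a single cell. It is not the downloadable HoG code: it omits the block normalization and vote interpolation of the full descriptor, and the 9 unsigned-orientation bins and the toy patch are assumptions made for illustration.

```matlab
% One-cell orientation histogram (simplified illustration of the HoG idea).
P = repmat(1:8, 8, 1);        % 8x8 toy patch: brightness ramps left to right
nBins = 9;                    % assumed number of bins over 0..180 degrees

% Centered differences on the interior pixels
gx = P(2:end-1, 3:end) - P(2:end-1, 1:end-2);
gy = P(3:end, 2:end-1) - P(1:end-2, 2:end-1);

mag = sqrt(gx.^2 + gy.^2);                    % gradient magnitude
ang = mod(atan2(gy, gx) * 180 / pi, 180);     % unsigned orientation, degrees
bin = min(floor(ang / (180 / nBins)) + 1, nBins);

h = accumarray(bin(:), mag(:), [nBins 1]);    % magnitude-weighted votes
h = h / sqrt(sum(h.^2) + eps);                % L2 normalization
disp(h')
```

Because the toy patch varies only horizontally, all the gradient energy lands in the first bin. A full descriptor concatenates such histograms over a grid of cells to form X.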

LibSVM for Matlab and Octave:
Download LibSVM here

The Matlab and Octave code for HoG:
Download HoG code here

Binaries for both Matlab and Octave are included for 32-bit Windows; just run svmtrain and svmpredict from within the directory where you expanded the files. If you are using a different architecture or platform, you need to rebuild the binaries. In Matlab, this is done by first running mex -setup and then make. In Octave, simply run make_octave.

This package trains a model from a matrix of feature vectors, and a vector of class labels. 
A simple example is provided:
matlab> load heart_scale.mat
matlab> model = svmtrain(heart_scale_label, heart_scale_inst, '-c 1 -g 0.07'); % train the model    
matlab> [predict_label, accuracy, dec_values] = svmpredict(heart_scale_label, heart_scale_inst, model); % test the training data
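The heart_scale example uses the default radial basis function kernel (hence the -g option). Since this project calls for a linear SVM, a call along the following lines may be closer to what you need. It is a sketch that assumes LibSVM's svmtrain and svmpredict are on the path; the toy 2-D points stand in for HoG feature vectors, and the cost value is an arbitrary choice.

```matlab
% Linear C-SVC with LibSVM on toy data (requires svmtrain/svmpredict).
Xpos = [4 4; 5 4; 4 5; 5 5];      % stand-ins for "object" windows
Xneg = [0 0; 1 0; 0 1; 1 1];      % stand-ins for "background" windows
X = [Xpos; Xneg];                 % one row per example, type double
y = [ones(4, 1); -ones(4, 1)];    % labels: +1 object, -1 background

model = svmtrain(y, X, '-s 0 -t 0 -c 1');   % -t 0 selects the linear kernel

% Classify new windows. The labels passed to svmpredict only affect the
% reported accuracy, so any values will do when the truth is unknown.
Xtest = [4.5 4.5; 0.5 0.5];
[pred, acc, dec] = svmpredict(zeros(2, 1), Xtest, model);
disp(pred')
```

The third output, dec, holds the decision values, which can be useful if you want to threshold detections more strictly than the default sign test.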

The third parameter of svmtrain contains a string with the following options:

options:
-s svm_type : set type of SVM (default 0)
	0 -- C-SVC
	1 -- nu-SVC
	2 -- one-class SVM
	3 -- epsilon-SVR
	4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
	0 -- linear: u'*v
	1 -- polynomial: (gamma*u'*v + coef0)^degree
	2 -- radial basis function: exp(-gamma*|u-v|^2)
	3 -- sigmoid: tanh(gamma*u'*v + coef0)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking: whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight: set the parameter C of class i to weight*C, for C-SVC (default 1)

The num_features in the -g option means the number of attributes in the input data.
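Because the negative set in this project can be much larger than the positive set, the -wi option may be worth trying. The sketch below builds an option string that up-weights the positive class by the ratio of negatives to positives. The negative count and the cost value are assumed for illustration, and whether this weighting helps is something to check on your own data.

```matlab
% Build a LibSVM option string with per-class weights (sketch).
nPos = 2416;                 % pedestrian positives listed in the Data section
nNeg = 20000;                % assumed number of negative windows
C    = 1;                    % assumed cost parameter
wPos = nNeg / nPos;          % up-weight the rarer positive class

% -w1 applies to label +1 and -w-1 to label -1
opts = sprintf('-s 0 -t 0 -c %g -w1 %.2f -w-1 1', C, wPos);
disp(opts)
% The string would then be passed as the third argument of svmtrain.
```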

4, Testing images

We have 13 test images [download images]. Some are from the UCLA campus and some are from a surveillance video.

5, Reference for HoG feature

N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection", IEEE Conf. on Computer Vision and Pattern Recognition, 2005. [pdf]