Project I: Principal Component Analysis for Human Faces

Due on Wednesday, Oct. 19.

1. Objectives.

The human face is a very important pattern and has been extensively studied over the past 30 years in vision, graphics, and human-computer interaction, for tasks such as face recognition, human identification, expression, and animation. An important step in these tasks is to extract effective representations from the face images. Principal component analysis (PCA) has been found to give a good representation.

We provide a dataset of 178 faces in this project, each with 256 x 256 pixels. The face images are pre-processed so that the background and hair are removed and the faces have similar lighting conditions. Each face has a number of landmarks, which were identified manually for your convenience and which correspond across the dataset. In the second dataset, there are 87 points per image. As we know, these landmarks must first be aligned before we apply PCA to the intensity.

   
face with 87 landmarks

Part 1: ASM and AAM models for face reconstruction.

The experiment includes the following steps.

We divide the 178 faces into two parts: the first 150 faces are put in a training set, and the remaining 28 faces are put in a testing set.

(1). Compute the mean and the first k eigen-faces for the training images with no landmark alignment. Display the first k = 20 eigen-faces and use them to reconstruct the remaining 28 testing faces. Plot the total reconstruction error (squared intensity difference between the reconstructed images and the original ones) per pixel (i.e. normalize the error by the number of pixels, and average over the testing images) against the number of eigen-faces k.
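Step (1) can be sketched as follows. This is only an illustration in Python/NumPy rather than MATLAB, and it runs on random stand-in data of reduced size instead of the course dataset; the function names are made up for this example. It assumes the "trick" is the usual one of taking eigenvectors of the small image-by-image matrix instead of the huge pixel-by-pixel covariance.

```python
import numpy as np

def eigenfaces(train, k):
    """Return the mean face, the top-k eigen-faces (one per row) and eigenvalues.

    train has shape (n_images, n_pixels); each image is flattened to a row.
    """
    mean = train.mean(axis=0)
    A = train - mean                        # centered data, shape (n, d)
    gram = A @ A.T / A.shape[0]             # small (n, n) matrix instead of (d, d)
    vals, vecs = np.linalg.eigh(gram)       # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]      # indices of the k largest
    vals, vecs = vals[order], vecs[:, order]
    basis = (A.T @ vecs).T                  # map back to pixel space, shape (k, d)
    basis /= np.linalg.norm(basis, axis=1, keepdims=True)
    return mean, basis, vals

def error_per_pixel(test, mean, basis):
    """Squared reconstruction error, averaged over pixels and test images."""
    coeffs = (test - mean) @ basis.T        # project onto the eigen-faces
    recon = mean + coeffs @ basis           # reconstruct from the coefficients
    return np.mean((test - recon) ** 2)

# Stand-in data (assumption): 178 random "images" of 32 x 32 pixels.
rng = np.random.default_rng(0)
faces = rng.random((178, 32 * 32))
train, test = faces[:150], faces[150:]      # first 150 for training, last 28 for testing

mean, basis, vals = eigenfaces(train, 20)
errors = [error_per_pixel(test, mean, basis[:k]) for k in range(1, 21)]
```

The list `errors` holds one value per k and is what would be plotted against k; each row of `basis`, reshaped to the image size, is one eigen-face to display.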

(2). Compute the mean and the first k eigen-warpings of the landmarks for the training faces. Here warping means the displacement of points on the face images. Display the first 5 eigen-warpings (you need to add the mean to make them meaningful), and use them to reconstruct the landmarks of the testing faces. Plot the reconstruction error (in terms of distance) against the number of eigen-warpings k (again, the error is averaged over all the testing images).
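A possible sketch of step (2), again in Python/NumPy on random stand-in landmarks. The function names, and the choice of the mean Euclidean distance per landmark as the "distance" error, are assumptions of this example. With 87 points the landmark vector has only 174 entries, so the covariance matrix can be decomposed directly.

```python
import numpy as np

def landmark_pca(shapes, k):
    """shapes: (n_faces, n_points, 2). Returns mean, top-k eigen-warpings, eigenvalues."""
    X = shapes.reshape(len(shapes), -1)          # flatten to (n, 2 * n_points)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False, bias=True)     # (174, 174) for 87 points
    vals, vecs = np.linalg.eigh(cov)             # ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]
    return mean, vecs[:, order].T, vals[order]

def landmark_error(shapes, mean, basis):
    """Mean Euclidean distance between original and reconstructed landmarks."""
    X = shapes.reshape(len(shapes), -1)
    recon = mean + ((X - mean) @ basis.T) @ basis
    diff = (X - recon).reshape(len(shapes), -1, 2)
    return np.linalg.norm(diff, axis=2).mean()

# Stand-in data (assumption): random landmark positions inside a 256 x 256 image.
rng = np.random.default_rng(0)
landmarks = rng.random((178, 87, 2)) * 256
train, test = landmarks[:150], landmarks[150:]

mean, basis, vals = landmark_pca(train, 10)
errors = [landmark_error(test, mean, basis[:k]) for k in range(1, 11)]

# One way to display the first eigen-warping: the mean shape displaced along it.
# The factor 3 is an arbitrary display scale chosen for this sketch.
first_mode = (mean + 3.0 * np.sqrt(vals[0]) * basis[0]).reshape(-1, 2)
```

Plotting `first_mode` next to `mean.reshape(-1, 2)` shows how the landmarks move along that axis.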

(3). Combine the two steps above. Our objective is to reconstruct images from the top 10 eigen-vectors for the warping and then the top k (say 10) eigen-vectors for the appearance, in the context of compressing the face images and communicating them through a network with a small number of bits. For the training images, we first align the images by warping their landmarks to the mean position (interpolation between landmarks is needed), and then compute the eigen-faces from these aligned images. For each testing face:

(i) Project its landmarks onto the top 10 eigen-warpings to get the reconstructed landmarks (here you lose a bit of geometric precision).

(ii) Warp the face image to the mean position and then project it onto the top k (say k = 10) eigen-faces to get the reconstructed image at the mean position (here you further lose a bit of appearance accuracy).

(iii) Warp the reconstructed face from step (ii) to the positions reconstructed in step (i). Then compare the reconstructed faces against the original testing images (here you have loss in both geometry and appearance).

(iv) Plot the reconstruction errors per pixel against the number of eigen-faces k.
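The data flow of step (3) might look like the sketch below (Python/NumPy, random stand-in data). The warp itself is left as a caller-supplied function with a hypothetical signature `warp(image, src_landmarks, dst_landmarks)`; the demo plugs in an identity stand-in that does no warping at all, so it only exercises the pipeline. It does not reproduce what the provided WarpImage.m does.

```python
import numpy as np

def pca(X, k):
    """Mean and top-k principal directions (rows) of the rows of X, via SVD."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(x, mean, basis):
    """Reconstruct x from its coefficients on the given basis."""
    return mean + ((x - mean) @ basis.T) @ basis

def reconstruct_face(image, landmarks, shape_model, app_model, warp):
    s_mean, s_basis = shape_model
    a_mean, a_basis = app_model
    mean_shape = s_mean.reshape(-1, 2)
    # (i) landmarks reconstructed from the top eigen-warpings
    rec_lm = project(landmarks.ravel(), s_mean, s_basis).reshape(-1, 2)
    # (ii) warp to the mean position, then project onto the eigen-faces
    aligned = warp(image, landmarks, mean_shape)
    rec_aligned = project(aligned.ravel(), a_mean, a_basis).reshape(image.shape)
    # (iii) warp the reconstruction back to the reconstructed landmarks
    return warp(rec_aligned, mean_shape, rec_lm)

def identity_warp(image, src, dst):
    """Stand-in only (assumption): returns the image unchanged."""
    return image

# Stand-in data (assumption): random 16 x 16 images with 87 random landmarks.
rng = np.random.default_rng(0)
images = rng.random((178, 16, 16))
lms = rng.random((178, 87, 2)) * 16
tr_img, te_img, tr_lm, te_lm = images[:150], images[150:], lms[:150], lms[150:]

shape_model = pca(tr_lm.reshape(150, -1), 10)
mean_shape = shape_model[0].reshape(-1, 2)
aligned = np.stack([identity_warp(im, lm, mean_shape) for im, lm in zip(tr_img, tr_lm)])
a_mean, a_basis = pca(aligned.reshape(150, -1), 20)

errors = []
for k in range(1, 21):
    recs = [reconstruct_face(im, lm, shape_model, (a_mean, a_basis[:k]), identity_warp)
            for im, lm in zip(te_img, te_lm)]
    errors.append(np.mean((np.stack(recs) - te_img) ** 2))   # error per pixel
```

Passing a real image-warping routine in place of `identity_warp` leaves the rest of the code unchanged. With a real warp the error curve also includes the geometric loss from step (i), so it need not fall to zero as k grows.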

(4). Synthesize random faces by randomly sampling the landmarks (based on the top 10 eigen-values and eigen-vectors in the warping analysis) and randomly sampling the intensity (based on the top 10 eigen-values and eigen-vectors in the intensity analysis). Display 20 synthesized face images. (As we discussed in class, each axis has its own unit, that is, the square root of its eigen-value.)
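The sampling in step (4) could be sketched as below. The example assumes that each coefficient is drawn from a zero-mean Gaussian whose standard deviation is the square root of the corresponding eigen-value; the Gaussian choice is an assumption of this sketch, and the data are random stand-ins.

```python
import numpy as np

def sample_from_pca(mean, basis, eigvals, n_samples, rng):
    """Draw random vectors in the span of the basis, scaled per axis by sqrt(eigenvalue)."""
    z = rng.standard_normal((n_samples, len(eigvals)))   # unit-variance coefficients
    coeffs = z * np.sqrt(eigvals)                        # each axis in its own unit
    return mean + coeffs @ basis

# Stand-in data (assumption): 150 random training vectors of length 64.
rng = np.random.default_rng(0)
data = rng.random((150, 64))
mean = data.mean(axis=0)
_, s, Vt = np.linalg.svd(data - mean, full_matrices=False)
basis = Vt[:10]                          # top 10 eigen-vectors
eigvals = s[:10] ** 2 / len(data)        # matching eigen-values of the covariance

samples = sample_from_pca(mean, basis, eigvals, 20, rng)
```

For the assignment the same function would be called twice, once with the landmark model and once with the aligned-appearance model, and each sampled appearance would then be warped from the mean position to the sampled landmarks.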

Part 2: Fisher faces for gender discrimination.

We have divided the 178 faces into male (89) and female (85) faces, plus 4 unknown (whose gender is hard even for human eyes to tell from the faces alone). Then choose 10 male and 10 female faces as the testing set. The remaining faces (except the 4 unknown) will be used as the training set.

(5) Find the Fisher face that distinguishes male from female using the training set, then test it on the 20 testing faces and report the percentage that are classified correctly. This Fisher face mixes both the geometry and the appearance differences between male and female.
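A minimal sketch of a two-class Fisher linear discriminant for step (5), in Python/NumPy. It works on low-dimensional stand-in feature vectors drawn at random rather than on real faces. The small ridge term added to the within-class scatter and the midpoint threshold are choices made for this example, not requirements of the assignment.

```python
import numpy as np

def fisher_face(X0, X1, reg=1e-6):
    """Fisher direction w proportional to Sw^{-1} (m1 - m0), plus a midpoint threshold."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)   # within-class scatter
    Sw += reg * np.eye(Sw.shape[0])      # small ridge (assumption) keeps Sw invertible
    w = np.linalg.solve(Sw, m1 - m0)
    w /= np.linalg.norm(w)
    threshold = 0.5 * (m0 @ w + m1 @ w)  # midpoint between the projected class means
    return w, threshold

# Stand-in data (assumption): 20-D features, classes separated along the first axis.
rng = np.random.default_rng(0)
shift = np.zeros(20)
shift[0] = 8.0
male = rng.standard_normal((89, 20))
female = rng.standard_normal((85, 20)) + shift

male_train, male_test = male[:-10], male[-10:]          # hold out 10 of each class
female_train, female_test = female[:-10], female[-10:]

w, threshold = fisher_face(male_train, female_train)
X_test = np.vstack([male_test, female_test])
y_test = np.array([0] * 10 + [1] * 10)
pred = (X_test @ w > threshold).astype(int)             # 1 means "female" here
accuracy = (pred == y_test).mean()
```

Here `accuracy` is the fraction of the 20 test samples classified correctly, which is the number asked for in this step.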

(6) Compute the Fisher face for the key points (geometric shape) and the Fisher face for the appearance separately, so that each face is projected to a 2D feature space, and visualize how separable these points are.

[The within-class scatter matrix is again very high dimensional. You can either use the trick that we used in (1) for computing its eigen-values and eigen-vectors, or compute the Fisher faces over the reduced dimensions from steps (2) and (3), i.e. each face is now reduced to a 10-dimensional geometric vector plus a 10-dimensional appearance vector. After the Fisher linear discriminant analysis, we represent each face by as few as 2 dimensions for discriminative purposes, and yet it can tell male from female!

Read the pdf file "compute the Fisher face.pdf" for exactly how to do this in MATLAB].
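The 2D feature space of step (6) could be built over the reduced dimensions as sketched here (Python/NumPy). The 10-D geometric and appearance coefficient vectors are random stand-ins, and the helper name is invented for this example; the method in the referenced pdf may differ in its details.

```python
import numpy as np

def fisher_axis(X, y, reg=1e-6):
    """Unit-length Fisher direction separating class 0 from class 1."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    Sw += reg * np.eye(Sw.shape[0])      # small ridge (assumption) for stability
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

# Stand-in data (assumption): 79 male and 75 female training faces, each already
# reduced to 10 geometric and 10 appearance PCA coefficients.
rng = np.random.default_rng(0)
y = np.array([0] * 79 + [1] * 75)
geom = rng.standard_normal((154, 10)) + 1.5 * y[:, None]
app = rng.standard_normal((154, 10)) + 1.0 * y[:, None]

w_geom = fisher_axis(geom, y)            # Fisher face for the key points
w_app = fisher_axis(app, y)              # Fisher face for the appearance

# Each face becomes one point: (geometric score, appearance score).
points = np.column_stack([geom @ w_geom, app @ w_app])
```

A scatter plot of `points`, colored by `y`, then shows how separable the two classes are in this 2D space.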

What to submit: Your submission will be an experimental report. You need to visualize the results (such as the mean, eigen-vectors, and plots) in good figures; your grade will be based on the quality of the results and analysis. Print out the results for submission. Don't print the code. Zip all your code and email it to the reader.

2. Datasets

The dataset is packed in a zip file. You should download it as soon as possible. Here is the readme file.

Here is a piece of the matlab code for warping the images: WarpImage.m

A student in the class has made a video animation for some of the eigen-vectors which represent interesting axes of changes. Click here to see.

3. Further References

L. Sirovich and M. Kirby, "Low-dimensional procedure for the characterization of human faces", Journal of the Optical Society of America A, 4(3):519-524, 1987.

M. Turk and A. Pentland, "Eigenfaces for recognition", Journal of Cognitive Neuroscience, 3(1):71-86, 1991.

T.F. Cootes, C.J. Taylor, D. Cooper, and J. Graham, "Active shape models--their training and application", Computer Vision and Image Understanding, 61(1):38-59, 1995.