E27/CS27 Lab 4: Face Recognition

Dan Crosta
Ben Mitchell
Zach Pezzementi

Abstract

We compared two methods of recognizing faces in images: one appearance-based and one image-based. The appearance-based method used singular value decomposition to identify the principal variations in the image set, and an artificial neural net to label images based on these variations. The image-based method simply calculated a pixel-wise sum of squared differences to compare a novel image to each image in the training set. Additionally, we attempted to eliminate non-facial characteristics with a fuzzy mask. The appearance-based method achieved better accuracy than the sum-of-squared-differences method, and the unmasked versions performed better in both cases.

Image Processing

Before building the matrix of face-image vectors, we normalized each image by the total length of its face vector, which compensates for variation in brightness between images. The average face image was constructed from the normalized faces and then used to construct the column vectors of the matrix: each column is the component-wise difference between a normalized face image and the normalized average face image.
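The normalization and mean-subtraction steps can be sketched as follows. This is an illustrative sketch using NumPy, not the lab's original code. The function name build_difference_matrix and the tiny synthetic images are hypothetical, and the sketch assumes that "normalized average face" means the plain average of the normalized faces and that no image is entirely zero.

```python
import numpy as np

def build_difference_matrix(images):
    """Return (A, mean_face) for a list of equally sized grayscale images.

    Each image is flattened and scaled to unit Euclidean length, then the
    average of the normalized faces is subtracted. Columns of A are the
    resulting difference vectors.
    """
    # One flattened image per row, as floats.
    faces = np.array([np.asarray(img, dtype=float).ravel() for img in images])
    # Normalize each face vector by its total length (assumes nonzero images).
    faces = faces / np.linalg.norm(faces, axis=1, keepdims=True)
    # Average of the normalized faces.
    mean_face = faces.mean(axis=0)
    # Each column: normalized face minus average face.
    A = (faces - mean_face).T
    return A, mean_face

# Synthetic 2x2 "images"; the second is a brighter copy of the first.
imgs = [[[1, 2], [3, 4]], [[2, 4], [6, 8]], [[4, 3], [2, 1]]]
A, mean_face = build_difference_matrix(imgs)
```

In this example the first two columns of A come out identical, which is the point of the length normalization: a uniformly brighter copy of an image maps to the same face vector.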

Next, we ran this matrix through the SVD function, which returns its singular value decomposition. Since the singular values returned by SVD are the square roots of the eigenvalues of the matrix multiplied by its own transpose (the covariance matrix of the face set, up to a constant factor), we could use them to determine the amount of variation that each corresponding vector accounted for in our face space. We kept enough vectors to account for 70% of the variation, which turned out to be 11 for unmasked images and 14 for masked images (see the section on Image Masking). The singular vectors corresponding to these singular values are, up to scale, the eigenvectors of that same product matrix, and we used them as the basis vectors for the eigenspace into which we projected our database images and test images. Projection into the eigenspace is a simple dot product of the difference image vector with each of the eigenvectors. This gives a vector with the same dimensionality as our eigenspace, which we can then use to try to recognize a new face image.
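A minimal sketch of choosing the basis and projecting into it, assuming NumPy's numpy.linalg.svd stands in for the SVD function mentioned above. The names eigenspace_basis and project are hypothetical, and the 3x3 diagonal matrix is a toy input chosen so the singular values are easy to check by hand.

```python
import numpy as np

def eigenspace_basis(A, fraction=0.70):
    """Return the leading left singular vectors covering `fraction` of the variation."""
    # Thin SVD: columns of U are orthonormal, s is sorted in descending order.
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    # Squared singular values are proportional to the variation along each vector.
    variance = s ** 2
    cumulative = np.cumsum(variance) / variance.sum()
    # Smallest k whose cumulative share reaches the requested fraction.
    k = int(np.searchsorted(cumulative, fraction)) + 1
    k = min(k, len(s))
    return U[:, :k]

def project(basis, difference_vector):
    """Eigenspace coordinates: one dot product per basis vector."""
    return basis.T @ difference_vector

# Toy matrix with singular values 3, 2, 1 (squared: 9, 4, 1; total 14).
A_toy = np.diag([3.0, 2.0, 1.0])
basis = eigenspace_basis(A_toy, fraction=0.70)
coords = project(basis, A_toy[:, 0])
```

Here the first vector alone covers 9/14 of the variation, which is below 70%, so two vectors are kept. The sign of each singular vector is arbitrary, so coordinates are only defined up to a sign per axis.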

Face Recognition

We tackled the recognition problem in two ways. Primarily, and most successfully, we used an artificial neural net to learn a classification function. The net had as many input nodes as the eigenspace had dimensions, a unique output node for each recognizable face, and 12 hidden nodes (a number determined empirically). The net was a fully connected, feed-forward connectionist network. Weights were initialized to random numbers between -1 and 1. The activation function was a standard sigmoid squashing function, and the network was trained using the standard backpropagation gradient descent algorithm. In each epoch the net was trained on every example in the first, third, and fourth image sets, and every twentieth epoch it was tested on the second image set. The training set contained one input-output pair for each image in the database, and the testing set was similarly constructed from images not in the database (i.e., from image set two). The network was trained for 1500 epochs, by which time the error seemed to have plateaued (see Graphs). If none of the net's output nodes produced a strong enough signal, or if multiple strong signals were detected, the test image was assumed not to be in the database. After running the test image through the net, we reprojected the image from its eigenspace coordinates and ran a sum-of-squared-differences check between the original and the reprojected image. If the result was very high, we classified the image as not a face; if it was low, the image was likely to be a face, though not one in the database.
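The forward pass and the decision rule described above can be sketched in plain Python. This is an illustration only: the weights, the names person_a and person_b, and both threshold values are made up, bias terms are omitted for brevity, and the sketch assumes the reconstruction check is applied only to images the net does not assign to a single person.

```python
import math

def sigmoid(x):
    """Standard logistic squashing function."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(coords, w_hidden, w_out):
    """Feed eigenspace coordinates through one hidden layer to the outputs."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, coords))) for row in w_hidden]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]

def decide(outputs, names, reconstruction_ssd,
           signal_threshold=0.5, face_threshold=1.0):
    """Map net outputs to a name, 'unknown face', or 'not a face'.

    Both thresholds are placeholders, not the values used in the lab.
    """
    strong = [name for name, out in zip(names, outputs) if out >= signal_threshold]
    if len(strong) == 1:
        return strong[0]
    # No strong signal, or several: treat the image as not in the database,
    # then use the reconstruction error to separate faces from non-faces.
    if reconstruction_ssd > face_threshold:
        return "not a face"
    return "unknown face"

# Hand-picked weights so that the first output fires for this input.
names = ["person_a", "person_b"]
w_hidden = [[10.0, 0.0], [-10.0, 0.0]]
w_out = [[10.0, -10.0], [-10.0, 10.0]]
outputs = forward([1.0, 0.0], w_hidden, w_out)
label = decide(outputs, names, reconstruction_ssd=0.1)
```

Requiring exactly one strong output is what makes the net produce false negatives rather than wrong names when it is unsure.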

The other recognition method we used was a brute-force, pixel-wise sum-of-squared-differences (SSD) approach. We computed the SSD between the test image and every image in the database; the database face with the smallest SSD was taken as the match, provided that SSD fell below an empirically determined threshold. If the minimum SSD was above the threshold, the image was rejected as not being a face.
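A minimal sketch of the brute-force SSD matcher, assuming NumPy and a database stored as (name, image) pairs. The function name ssd_match, the names ann and bob, and the threshold value are hypothetical; the lab's actual threshold is not given here.

```python
import numpy as np

def ssd_match(test_image, database, threshold):
    """Return the name of the closest database image, or None if rejected."""
    test = np.asarray(test_image, dtype=float)
    best_name, best_ssd = None, float("inf")
    for name, image in database:
        # Pixel-wise sum of squared differences.
        ssd = float(np.sum((test - np.asarray(image, dtype=float)) ** 2))
        if ssd < best_ssd:
            best_name, best_ssd = name, ssd
    # Accept the best match only if it is close enough.
    return best_name if best_ssd < threshold else None

database = [
    ("ann", [[0, 0], [0, 0]]),
    ("bob", [[10, 10], [10, 10]]),
]
match = ssd_match([[1, 0], [0, 0]], database, threshold=5.0)
rejected = ssd_match([[5, 5], [5, 5]], database, threshold=5.0)
```

Every query touches every pixel of every database image, which is consistent with this method being the slower of the two.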

Image Masking

In order to improve the performance of the recognition system, we created a Gaussian image mask and applied it, by multiplication, to the face images before normalizing them and adding them to the database. We expected this both to reduce the number of eigenvectors required to represent the face space (because much of the variation around the outside of the images, such as hair or shirt collars, would be suppressed) and to improve the accuracy of the system. However, given the images we had access to, we found that the system both required more eigenvectors and performed significantly worse (see Results). After considering the problem, we concluded that the neural network was using much of the variation around the edges of the images to identify people, since each person wore the same clothing and, with the exception of jbulnes.02.pgm, had roughly the same hairstyle in each of their images. Without this information, the system had a harder time classifying the images.

The original image.

The mask image.

The masked image.
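A Gaussian mask of this kind can be sketched as below, assuming NumPy. The width parameter sigma_frac (the standard deviation as a fraction of each image dimension) is a hypothetical choice; the width of the mask actually used is not stated in this report.

```python
import numpy as np

def gaussian_mask(height, width, sigma_frac=0.25):
    """2-D Gaussian centered on the image: 1.0 at the center, smaller toward the edges."""
    # Pixel offsets from the image center along each axis.
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    sy, sx = sigma_frac * height, sigma_frac * width
    # Separable Gaussian built by broadcasting a column against a row.
    return np.exp(-(ys[:, None] ** 2) / (2 * sy ** 2)
                  - (xs[None, :] ** 2) / (2 * sx ** 2))

def apply_mask(image, mask):
    """Multiply the image by the mask, pixel by pixel."""
    return np.asarray(image, dtype=float) * mask

mask = gaussian_mask(5, 5)
masked = apply_mask(np.ones((5, 5)), mask)
```

Because the mask is applied before normalization, it changes the length of each face vector as well as the relative weight of edge pixels.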

Image Reprojection

One feature of using the eigenspace to represent the images is that we can use an image's coordinates in the eigenspace to project it back into a difference image vector. An image can then be reconstructed by adding the average image to this vector, scaling the result to the range 0 to 255, and saving it as a PGM. Reprojection is accomplished by multiplying each coordinate in the eigenspace by its corresponding eigenvector and summing the resulting vectors component-wise. Since we use only a portion of the eigenvectors returned from SVD, the reconstructed image is often similar, but not identical, to the original image. Using simple image comparison methods, such as SSD, we can determine whether or not the image in question seems to be a face in the database.

For each of two example images: the original image, the unmasked reprojection, and the masked reprojection.
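Reprojection and the reconstruction check can be sketched as follows, assuming NumPy and a basis matrix whose columns are orthonormal eigenvectors. The function names and the toy basis are hypothetical, and writing the PGM file itself is left out.

```python
import numpy as np

def reproject(basis, coords, mean_face):
    """Rebuild an image vector: sum of coordinate * eigenvector, plus the average face."""
    return basis @ coords + mean_face

def to_pixels(vector, shape):
    """Linearly rescale to 0..255 and reshape, ready to be saved as a PGM."""
    lo, hi = vector.min(), vector.max()
    if hi > lo:
        scaled = (vector - lo) / (hi - lo) * 255.0
    else:
        scaled = np.zeros_like(vector)  # flat image: nothing to rescale
    return np.rint(scaled).astype(np.uint8).reshape(shape)

def reconstruction_ssd(basis, diff_vector):
    """SSD between a difference vector and its reprojection from the eigenspace."""
    coords = basis.T @ diff_vector
    return float(np.sum((diff_vector - basis @ coords) ** 2))

# Toy eigenspace: the first two axes of a 4-pixel image space.
basis = np.eye(4)[:, :2]
mean_face = np.ones(4)
diff = np.array([3.0, 0.0, 0.0, 0.0])
rebuilt = reproject(basis, basis.T @ diff, mean_face)
pixels = to_pixels(rebuilt, (2, 2))
```

A vector that lies inside the span of the basis is reconstructed exactly (SSD of zero), while anything outside the span leaves a residual, which is what the face versus non-face check relies on.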

Results

The confusion matrices below show performance on the test set (set 2). All methods returned perfect performance on all training data (sets 1, 3, and 4).

Unmasked SSD

[Confusion matrix: cell values not available.]

Column labels: andrew, ben, brandon, charlie, cshetland, dave, eli, eric, evan, jane, jbulnes, jesse, jordan, kuan, laura, luis, matt, maxwell, nik, oshu, paul, rberger, stephanie, suor, tgillette, xzhuo

Row labels: andrew, ben, brandon, charlie, cshetland, csmith, dave, eli, eric, evan, jane, jbulnes, jesse, jordan, kuan, laura, luis, matt, maxwell, nik, oshu, paul, rberger, stephanie, suor, tgillette, xzhuo

Masked SSD

[Confusion matrix: cell values not available.]

Column labels: andrew, ben, brandon, charlie, cshetland, dave, eli, eric, evan, jane, jbulnes, jesse, jordan, kuan, laura, luis, matt, maxwell, nik, oshu, paul, rberger, stephanie, suor, tgillette, xzhuo

Row labels: andrew, ben, brandon, charlie, cshetland, csmith, dave, eli, eric, evan, jane, jbulnes, jesse, jordan, kuan, laura, luis, matt, maxwell, nik, oshu, paul, rberger, stephanie, suor, tgillette, xzhuo

Unmasked Appearance-Based

[Confusion matrix: cell values not available.]

Column labels: andrew, ben, brandon, charlie, cshetland, dave, eli, eric, evan, jane, jbulnes, jesse, jordan, kuan, laura, luis, matt, maxwell, nik, oshu, paul, rberger, stephanie, suor, tgillette, xzhuo

Row labels: andrew, ben, brandon, charlie, cshetland, csmith, dave, eli, eric, evan, jane, jbulnes, jesse, jordan, kuan, laura, luis, matt, maxwell, nik, oshu, paul, rberger, stephanie, suor, tgillette, xzhuo

Masked Appearance-Based

[Confusion matrix: cell values not available.]

Column labels: andrew, ben, brandon, charlie, cshetland, dave, eli, eric, evan, jane, jbulnes, jesse, jordan, kuan, laura, luis, matt, maxwell, nik, oshu, paul, rberger, stephanie, suor, tgillette, xzhuo

Row labels: andrew, ben, brandon, charlie, cshetland, csmith, dave, eli, eric, evan, jane, jbulnes, jesse, jordan, kuan, laura, luis, matt, maxwell, nik, oshu, paul, rberger, stephanie, suor, tgillette, xzhuo

Only one method misidentifies any image, and it misidentifies only a single image (evan identified as jordan). All other errors are false negatives, in which an image is reported as not being in the database when it actually is. One image is consistently rejected by all methods: the second image of Matt Fowles (jbulnes.02.pgm) shows him with his hair pulled back, while every other image of him shows him with his hair down.

Graphs

SNNS Training Graph, Unmasked Images

Red: testing set error (image set 2)
Black: training set error (image sets 1, 3, 4)

SNNS Training Graph, Masked Images

Red: testing set error (image set 2)
Black: training set error (image sets 1, 3, 4)

Conclusions

All methods except masked appearance-based modeling had perfect precision. The neural net based on the masked eigenspace mislabeled one image out of 27 (see the Masked Appearance-Based confusion matrix). The accuracies for each method were 21/27 for unmasked SSD, 18/27 for masked SSD, 26/27 for unmasked appearance-based, and 22/27 for masked appearance-based. Additionally, after the one-time cost of building the eigenspace and training the network, the appearance-based method performed comparisons almost 5 times as fast as the SSD method (2.9s for the script neteval.pl versus 13.7s for ssdeval.pl). The difference between the accuracy of the masked and unmasked appearance-based methods suggests that, for unmasked images, the SVD analysis gets a significant amount of information from the non-face regions of the images (shirt collar, hair). If a larger database of images were available, particularly of the same people wearing different clothes and different hairstyles, we suspect that the masked method would provide more meaningful results with fewer basis vectors. With the current database, however, recognizing shirts and hair is an effective method of identifying faces, since each face is associated with only a single shirt and hairstyle.
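For reference, the fractions and timings quoted above work out as follows. This is a small arithmetic sketch in Python; the variable names are arbitrary and the figures are taken directly from the paragraph above.

```python
# Correct classifications out of 27 test images, per method.
correct = {
    "Unmasked SSD": 21,
    "Masked SSD": 18,
    "Unmasked appearance-based": 26,
    "Masked appearance-based": 22,
}
total = 27

accuracy = {method: n / total for method, n in correct.items()}
for method, value in accuracy.items():
    print(f"{method}: {correct[method]}/{total} = {value:.1%}")

# Evaluation times in seconds: ssdeval.pl versus neteval.pl.
speedup = 13.7 / 2.9
print(f"Speedup of appearance-based over SSD: {speedup:.2f}x")
```

This gives roughly 77.8%, 66.7%, 96.3%, and 81.5% accuracy respectively, and a speedup of about 4.72.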

Extensions

We used a mask to isolate faces in the images.
We compared the eigensystem to brute-force SSD.
We reconstructed images to differentiate non-face images from novel faces.
We used artificial neural networks to classify images (with the eigensystem).