Facial Feature Recognition using Neural Networks

In the Fall of 1992, for a class project in Artificial Intelligence, I designed a neural network to locate facial features in images. The one hundred images I used came from the underclassmen section of the 1987 University High School yearbook. They were scanned in at 96 by 128 resolution. I set four of the images aside to comprise the testing set, and for the remaining ninety-six I manually specified the coordinates of the left eye, right eye, nose, and mouth.

My first thought was to train the neural network by showing it an 8 × 8 pixel window around the left eye of some training image, and using back-propagation to teach it to recognize this small image patch as a left eye. Then I would show it a right eye, then a nose, then a mouth, and continue through the whole training set until the weights in the network converged to stable values. I implemented this approach with 64 input units (for the 8 × 8 patch of grey values), nine units in the hidden layer, and four output units -- one for each feature.
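
This 64–9–4 architecture amounts to a small feed-forward network trained by gradient descent on squared error. The sketch below is a minimal modern reconstruction in NumPy, not the original 1992 code; the random stand-in data, sigmoid activations, learning rate, and epoch count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 96 training patches, each a flattened
# 8 x 8 window of grey values (64 inputs); targets are one-hot over
# {left eye, right eye, nose, mouth}.
X = rng.random((96, 64))
T = np.eye(4)[rng.integers(0, 4, size=96)]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 64 -> 9 -> 4 network, as described in the text (biases omitted
# for brevity).
W1 = rng.normal(scale=0.1, size=(64, 9))
W2 = rng.normal(scale=0.1, size=(9, 4))

def forward(X):
    H = sigmoid(X @ W1)          # hidden activations
    return H, sigmoid(H @ W2)    # output activations

_, Y0 = forward(X)
loss_before = np.mean((Y0 - T) ** 2)

lr = 0.5
for epoch in range(500):
    H, Y = forward(X)
    # Back-propagate the squared-error gradient through the sigmoids.
    dY = (Y - T) * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 -= lr * (H.T @ dY) / len(X)
    W1 -= lr * (X.T @ dH) / len(X)

_, Y = forward(X)
loss_after = np.mean((Y - T) ** 2)
```

Each of the four outputs can be read as the network's confidence that the input patch is the corresponding feature.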

I soon realized that the neural network would probably have a hard time telling the difference between a left eye and a right eye if it only got to see an 8 × 8 pixel region of the image. Thus, instead of sending in an 8 × 8 window around the feature, I sent in an 8 × 8 subsampled version of the log-polar map of the image centered on the feature. This had the effect of including detailed information from the local area about the feature, as well as coarser information about the rest of the image around the feature. In this way, the neural network had a chance of telling the difference between a left eye and right eye by noticing their location relative to the rest of the face.
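
A log-polar map samples the image densely near a center point and exponentially more coarsely with distance, which is what gives the input both fine local detail and coarse surrounding context. The function below is a rough sketch of such a sampler; for brevity it samples directly at 8 × 8 resolution rather than building a full log-polar map and then subsampling it as the report did, and the radius bounds are illustrative choices, not the original values.

```python
import numpy as np

def log_polar_patch(image, cx, cy, size=8, r_min=1.0, r_max=64.0):
    """Sample a size x size log-polar map of `image` centred on (cx, cy).

    Rows index log-radius (dense near the centre, coarse far out);
    columns index angle. Nearest-neighbour sampling, clipped at the
    image borders.
    """
    h, w = image.shape
    rows = np.arange(size)
    cols = np.arange(size)
    # Radii grow exponentially from r_min out to r_max.
    radii = r_min * (r_max / r_min) ** (rows / (size - 1))
    angles = 2 * np.pi * cols / size
    r, th = np.meshgrid(radii, angles, indexing="ij")
    xs = np.clip(np.round(cx + r * np.cos(th)).astype(int), 0, w - 1)
    ys = np.clip(np.round(cy + r * np.sin(th)).astype(int), 0, h - 1)
    return image[ys, xs]
```

Centered on a left eye, the coarse outer rings pick up the rest of the face, so the same local patch looks different depending on which side of the face it came from.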

As it turned out, I was pleasantly surprised with the network's ability to detect the facial features in the images in the testing set. I believe that the log-polar mapping was a large part of its success.

One of the original face images in the training set, shown with log-polar maps centered about the eye, nose, and mouth. At the bottom are the 8 × 8 subsampled versions that were used to form the 64 inputs to the neural network.

The results of the neural network's attempt to locate the facial features in an image not in the training set. Each point of the 96 × 128 image was log-polar mapped, subsampled, and then fed into the neural network. The highlighted areas of the output images indicate the pixels in the image for which there was a large positive response from the neural network.
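
That per-pixel scanning procedure can be sketched as follows, assuming a `respond(image, x, y)` callback that stands in for the log-polar sampling plus the network's forward pass; the step size and response threshold here are illustrative, not values from the original report.

```python
import numpy as np

def scan_image(image, respond, threshold=0.8, step=1):
    """Query the network at every sampled pixel and build one binary
    response map per feature (left eye, right eye, nose, mouth).

    `respond(image, x, y)` should return four activations in [0, 1];
    a map pixel is set wherever its activation exceeds `threshold`.
    """
    h, w = image.shape
    maps = np.zeros((4, h, w), dtype=bool)
    for y in range(0, h, step):
        for x in range(0, w, step):
            out = respond(image, x, y)
            maps[:, y, x] = out > threshold
    return maps
```

The highlighted regions in the output images above correspond to the `True` pixels of such maps; a coarser `step` trades localization accuracy for speed.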

Richard Fateman, who taught CS283 that semester, was impressed with the success of the system. Unfortunately, my research advisor at the time discouraged me from either publishing the results or continuing the work. I finally created the first version of this web page for the project in 1994.

Project Report:

  • Paul Debevec. A Neural Network for Facial Feature Location. UC Berkeley CS283 Project Report, December 1992. http://www.debevec.org/FaceRecognition/
    Available as HTML or PDF

Other face recognition pages:

The Face Recognition Home Page

Facial Analysis Home Page


Paul E. Debevec / paul@debevec.org