May 21st 2016 16:00

Eye Tracking System: appearance-based model and manifold learning

This article summarizes my Ph.D. work on implementing image processing algorithms in a real-time eye tracking system. Here are my thesis paper (PDF) and a presentation of my thesis from 2015 (HTML5, presented in French).

The eye tracking system works with a single remote camera under natural illumination. Because the hardware setup is simple and flexible, the work focuses on efficient algorithms that locate the eye regions, extract features from the eye images, and relate the eye images to the gaze direction or to specific eye movements.

1. Location of eye regions

I propose a hybrid method that localizes both eye regions in real time in the frames captured by the webcam. Once the eye regions have been detected in the first frame, they are tracked through the remaining frames with a stochastic method, Particle Filtering (PF), which speeds up processing.
Particle filtering is a stochastic approximation technique based on importance sampling for solving state-space models, and it is especially suited to highly non-linear and non-Gaussian problems. Here the PF method estimates the likely location of the object (the eye) in frame t from its location in frame t-1; in a Bayesian context, this amounts to approximating the probability distribution P(x) over the object state x. P(x) is approximated by a set of particles (samples), each associated with a weight (ω). Each particle carries knowledge of what the object looks like (the observation model) and of how it moves (the transition model). The set of particles evolves through three processing phases: correction, propagation and prediction.
The experimental results show that tracking the eye regions with the PF method is faster than with the deterministic methods it was compared against, and that it adapts well to different conditions, such as changes in illumination.
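As an illustration of the particle set evolving from frame to frame, here is a minimal Python sketch of one tracking step for a 2-D eye position. It is not the thesis implementation. The random-walk transition model, the Gaussian likelihood on the distance to a measured position (standing in for a similarity score between eye descriptors), the resampling step, and all names and parameter values are assumptions made for the example.

```python
import math
import random


def particle_filter_step(particles, weights, measurement, rng,
                         motion_std=3.0, obs_std=5.0):
    """One tracking step for a list of (x, y) particles; all models are assumed."""
    n = len(particles)
    # Propagation: move every particle with a random-walk transition model.
    moved = [(x + rng.gauss(0.0, motion_std), y + rng.gauss(0.0, motion_std))
             for x, y in particles]
    # Correction: reweight each particle with the observation model, here a
    # Gaussian likelihood of the measured position given the particle.
    raw = []
    for (x, y), w in zip(moved, weights):
        d2 = (x - measurement[0]) ** 2 + (y - measurement[1]) ** 2
        raw.append(w * math.exp(-d2 / (2.0 * obs_std ** 2)))
    total = sum(raw)
    norm = [w / total for w in raw] if total > 0.0 else [1.0 / n] * n
    # Prediction: the weighted mean serves as the estimated eye position.
    estimate = (sum(w * x for w, (x, _) in zip(norm, moved)),
                sum(w * y for w, (_, y) in zip(norm, moved)))
    # Resampling (an addition of this sketch): draw particles in proportion to
    # their weights so the set does not degenerate, then reset the weights.
    resampled = rng.choices(moved, weights=norm, k=n)
    return resampled, [1.0 / n] * n, estimate


def track(path, n_particles=500, seed=0):
    """Track a synthetic eye position along `path`, starting near path[0]."""
    rng = random.Random(seed)
    x0, y0 = path[0]
    particles = [(x0 + rng.gauss(0.0, 10.0), y0 + rng.gauss(0.0, 10.0))
                 for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    estimates = []
    for position in path[1:]:
        particles, weights, estimate = particle_filter_step(
            particles, weights, position, rng)
        estimates.append(estimate)
    return estimates


# Synthetic trajectory: the eye drifts 2 px right and 1 px down per frame.
path = [(100 + 2 * t, 80 + t) for t in range(20)]
estimates = track(path)
```

In a real tracker the likelihood would come from comparing the image patch at each particle with a reference eye appearance rather than from a known position. The estimate lags slightly behind a moving target because the random-walk model has no velocity term.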

2. Extraction of eye features

Locating the eye regions yields eye images of 60x40 pixels. The features are obtained by analyzing the appearance of the whole image. The Center-Symmetric Local Binary Pattern (CS-LBP) operator is applied to each of the blocks into which the eye image is divided (as shown in the following image).
[Image: the eye image divided into blocks]
The per-block results are combined into a single CS-LBP descriptor that represents the features of the eye image. The CS-LBP descriptor (Heikkilä et al., 2009) combines the strengths of the SIFT descriptor and the LBP texture operator, and it is computationally simpler than the SIFT descriptor. The descriptor is used as the eye feature in the observation model of the PF method (see section 1) to track the eye regions. The experimental results show that the descriptor is tolerant to changes in illumination.
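To make the block-wise CS-LBP descriptor concrete, here is a minimal pure-Python sketch. It compares the four centre-symmetric pairs among the 8 neighbours of each pixel, which gives a 4-bit code, and concatenates one normalized 16-bin histogram per block. The 2x3 block grid, the threshold value, the synthetic test image and all function names are assumptions for illustration; they are not taken from the thesis.

```python
def cs_lbp_code(img, r, c, threshold=3):
    """4-bit CS-LBP code of pixel (r, c) from its 8-neighbourhood."""
    # Neighbours in circular order; entries i and i + 4 lie opposite each other.
    offsets = [(0, 1), (1, 1), (1, 0), (1, -1),
               (0, -1), (-1, -1), (-1, 0), (-1, 1)]
    code = 0
    for i in range(4):
        dr1, dc1 = offsets[i]
        dr2, dc2 = offsets[i + 4]
        if img[r + dr1][c + dc1] - img[r + dr2][c + dc2] > threshold:
            code |= 1 << i
    return code


def cs_lbp_descriptor(img, blocks_y=2, blocks_x=3, threshold=3):
    """Concatenate one normalized 16-bin CS-LBP histogram per block."""
    h, w = len(img), len(img[0])
    hists = [[0] * 16 for _ in range(blocks_y * blocks_x)]
    # Border pixels are skipped because they lack a full neighbourhood.
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            block = (r * blocks_y // h) * blocks_x + (c * blocks_x // w)
            hists[block][cs_lbp_code(img, r, c, threshold)] += 1
    descriptor = []
    for hist in hists:
        total = sum(hist) or 1
        descriptor.extend(count / total for count in hist)
    return descriptor


def synthetic_eye(height=40, width=60):
    """Hypothetical test image: an intensity ramp with a dark disc as 'pupil'."""
    img = []
    for r in range(height):
        row = []
        for c in range(width):
            in_pupil = (r - height // 2) ** 2 + (c - width // 2) ** 2 < 64
            row.append(20 if in_pupil else 60 + 2 * c + r)
        img.append(row)
    return img


image = synthetic_eye()
descriptor = cs_lbp_descriptor(image)

# A uniform brightness offset leaves every pixel difference unchanged,
# so the descriptor of the brighter image is identical.
brighter = [[value + 30 for value in row] for row in image]
brighter_descriptor = cs_lbp_descriptor(brighter)
```

A uniform offset is only the simplest kind of illumination change. The sketch shows why the operator ignores it, but says nothing about how the descriptor behaves under non-uniform lighting.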

3. Manifold learning on eye movements

As mentioned before, the CS-LBP descriptor is used as the eye feature while tracking the eye regions in video. We also need to recognize different eye movements from the descriptors, such as blinks and gazes in different directions.
Manifold learning, also called non-linear dimensionality reduction, aims to find the "intrinsic variables" of high-dimensional data that lie on an embedded manifold. If the manifold is of low enough dimension (2 or 3 dimensions, for example), the high-dimensional data can be visualized in the low-dimensional space. Laplacian Eigenmaps, which use spectral techniques to perform dimensionality reduction, are applied to a set of eye images collected in real time by the eye tracking system. Here is an example of a 3-D distribution of the eye images.
[Image: 3-D distribution of the eye images]
In the eye control experiments, I built a Laplacian Eigenmaps engine to estimate specific eye movements, such as the four gaze directions and the blink. The engine collects 32 eye images during the calibration phase. Each new eye image is then processed by the engine and compared with the 32 calibration images in order to assign it to one specific movement (as shown below).
[Image: classification of a new eye image]
Another type of gaze estimation model applies a semi-supervised Gaussian Process Regression to predict the coordinates (x, y) of the gaze point on the screen. A 5-point calibration collects eye images in real time to fit a function that maps eye images to screen coordinates. Here is the demo video of the eye tracking system.
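The Laplacian Eigenmaps step described earlier in this section can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the engine used in the experiments: the graph is fully connected with heat-kernel weights, the kernel width defaults to the median squared distance, and a new sample is classified by embedding it jointly with the calibration set and taking the label of its nearest calibration neighbour. The synthetic 10-dimensional vectors stand in for CS-LBP descriptors, and the class names and function names are hypothetical.

```python
import numpy as np


def laplacian_eigenmaps(X, n_components=3, t=None):
    """Embed the rows of X using a fully connected heat-kernel graph."""
    n = len(X)
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    if t is None:
        # Assumed heuristic: kernel width = median pairwise squared distance.
        t = np.median(d2[np.triu_indices(n, k=1)])
    W = np.exp(-d2 / t)
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    # Solve L f = lambda D f (with L = D - W) through the symmetric
    # normalized Laplacian I - D^{-1/2} W D^{-1/2}.
    inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(n) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues come back in ascending order
    # Drop the trivial eigenvector (eigenvalue 0) and map back to f.
    return inv_sqrt[:, None] * vecs[:, 1:n_components + 1]


def classify_movement(calib_X, calib_labels, new_x, n_components=3):
    """Label a new sample by its nearest calibration sample in the embedding."""
    # Assumption: the new sample is embedded together with the calibration set.
    Y = laplacian_eigenmaps(np.vstack([calib_X, new_x[None, :]]), n_components)
    distances = np.linalg.norm(Y[:-1] - Y[-1], axis=1)
    return calib_labels[int(np.argmin(distances))]


# Four synthetic clusters of eight vectors stand in for 32 calibration images.
rng = np.random.default_rng(0)
classes = ["left", "right", "up", "down"]
centres = 4.0 * rng.normal(size=(4, 10))
calib_X = np.vstack([c + 0.3 * rng.normal(size=(8, 10)) for c in centres])
calib_labels = [name for name in classes for _ in range(8)]

embedding = laplacian_eigenmaps(calib_X, n_components=3)
predicted = [classify_movement(calib_X, calib_labels,
                               c + 0.3 * rng.normal(size=10))
             for c in centres]
```

Re-embedding all 33 points for every new image is cheap at this size and avoids a separate out-of-sample extension, which is why the sketch takes that shortcut.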
