Facial Pose and Attribute Estimation in Real-World Videos

Researchers: M. Demirkus, T. Arbel. Collaborators: D. Precup, J. J. Clark

Facial attribute classification has received considerable attention in the computer vision community due to its wide range of possible applications. Most approaches in the literature have focused on trait classification in controlled environments, due to the challenges presented by real-world settings: arbitrary facial expressions, partial occlusions, non-uniform illumination and background clutter. In recent years, trait classification has started to be applied to real-world environments, with some success. However, the focus has been on estimation from single images or video frames, without leveraging the temporal information available in the entire video sequence. In addition, a fixed set of features is usually used for trait classification, without any consideration of how facial features change with head pose. In this project, a temporal, probabilistic framework is first employed to robustly estimate continuous head pose angles from real-world videos; this pose estimate is then used to select the appropriate set of frames and features for a temporal fusion scheme for soft biometric trait classification.
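The pose-conditioned fusion idea can be sketched as follows: per-frame trait posteriors are down-weighted or discarded according to the estimated head pose, then combined over the sequence. This is only an illustrative stand-in for the paper's scheme; the function name, the yaw cut-off and the linear weighting are all assumptions.

```python
import math

def fuse_trait_posteriors(frame_probs, pose_angles, max_angle=45.0):
    """Fuse per-frame trait posteriors over a video, weighting each frame
    by how close its estimated head pose is to frontal.
    frame_probs: P(trait | frame) from a per-frame classifier.
    pose_angles: estimated yaw angle (degrees) for each frame.
    Frames beyond max_angle are discarded; the rest are fused as a
    weighted average of log-odds (a simple stand-in for temporal fusion)."""
    log_odds = 0.0
    total_w = 0.0
    for p, yaw in zip(frame_probs, pose_angles):
        if abs(yaw) > max_angle:
            continue                      # pose too extreme: features unreliable
        w = 1.0 - abs(yaw) / max_angle    # weight frontal frames more heavily
        p = min(max(p, 1e-6), 1 - 1e-6)   # clamp to avoid log(0)
        log_odds += w * math.log(p / (1 - p))
        total_w += w
    if total_w == 0.0:
        return 0.5                        # no usable frames: stay uncommitted
    avg = log_odds / total_w
    return 1.0 / (1.0 + math.exp(-avg))   # map back to a probability
```

For example, fusing frames with yaws of 0, 10 and 80 degrees effectively drops the 80-degree frame before combining the remaining evidence.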

M. Demirkus, D. Precup, J. Clark, T. Arbel, “Soft Biometric Trait Classification from Real-world Face Videos Conditioned on Head Pose Estimation", IEEE Computer Society Workshop on Biometrics in association with the IEEE CVPR, pp.130-137, June 2012. (oral presentation).


Facial Pose Labelling in Real-World Videos

Researchers: M. Demirkus, T. Arbel. Collaborators: D. Precup, J. J. Clark

Automatic head pose estimation from real-world video sequences is of great interest to the computer vision community, since pose provides prior knowledge for tasks such as face detection and classification. However, developing pose estimation algorithms requires large, labelled real-world video databases on which computer vision systems can be trained and tested. Manual labelling of each frame is tedious, time consuming, and often difficult due to the high uncertainty in head pose angle estimates, particularly in unconstrained environments with arbitrary facial expression, occlusion and illumination. To overcome these difficulties, a semi-automatic framework is proposed for labelling temporal head pose in real-world video sequences. The proposed multi-stage framework first detects a subset of frames with distinct head poses over a video sequence, which is then manually labelled by an expert to obtain the ground truth for those frames. The framework provides a continuous head pose label and a corresponding confidence value over the pose angles. Next, an interpolation scheme over the video sequence estimates (i) labels for the frames without manual labels and (ii) corresponding confidence values for the interpolated labels. This confidence value permits an automatic head pose estimation framework to determine the subset of frames to use for further processing, depending on the labelling accuracy required.
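The interpolation step might look like this in outline: pose labels at expert-annotated key frames are linearly interpolated to the remaining frames, with a confidence value that decays with distance to the nearest labelled frame. The linear decay model and all names here are assumed for illustration only.

```python
def interpolate_pose_labels(key_frames, key_poses, n_frames, decay=0.05):
    """Given expert pose labels at a sparse set of key frames, linearly
    interpolate a pose for every frame and attach a confidence that decays
    with distance to the nearest labelled frame (an assumed decay model).
    Returns (poses, confidences), one entry per frame."""
    poses, confs = [], []
    for f in range(n_frames):
        # nearest labelled frames on each side (clamped at the ends)
        left = max((k for k in key_frames if k <= f), default=key_frames[0])
        right = min((k for k in key_frames if k >= f), default=key_frames[-1])
        if left == right:
            pose = key_poses[key_frames.index(left)]
        else:
            t = (f - left) / (right - left)
            pose = ((1 - t) * key_poses[key_frames.index(left)]
                    + t * key_poses[key_frames.index(right)])
        dist = min(abs(f - k) for k in key_frames)
        conf = max(0.0, 1.0 - decay * dist)   # 1.0 at manually labelled frames
        poses.append(pose)
        confs.append(conf)
    return poses, confs
```

A downstream pose estimator could then keep only the frames whose confidence exceeds the accuracy threshold it requires.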

M. Demirkus, J. J. Clark and T. Arbel, "Robust Semi-automatic Head Pose Labelling for Real-World Face Video Sequences", Multimedia Tools and Applications, January 2013.


Small Enhancing Pathology Segmentation

Researchers: Zahra Karimaghaloo, Tal Arbel. Collaborators: D. Louis Collins, Dr. D.L. Arnold

In Magnetic Resonance Imaging (MRI) brain images of Multiple Sclerosis (MS) patients, the smallest gadolinium-enhancing lesions can be the most difficult to find, especially when other anatomy, such as blood vessels, enhances similarly. This project employs a Temporal Hierarchical Adaptive Texture Conditional Random Field classifier to segment gadolinium-enhancing lesions. In addition to voxel-wise features, the framework exploits multiple higher-order textures as well as temporal information to discriminate true lesional enhancements from other enhancing anatomy.
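At its core, a conditional random field of this kind scores a candidate labelling by combining per-voxel (unary) classifier costs with smoothness (pairwise) terms; the paper's model adds adaptive texture and temporal potentials on top. A minimal sketch of the basic energy over one image row, with an assumed Potts smoothness weight:

```python
def crf_energy(labels, unary, beta=1.0):
    """Energy of a labelling under a simple pairwise CRF along one image
    row: unary costs from per-voxel classifiers plus a Potts term that
    charges beta whenever neighbouring voxels disagree. Lower is better.
    unary[i][l] is the cost of assigning label l to voxel i.
    This is only the voxel-wise skeleton of the model described above."""
    e = sum(unary[i][l] for i, l in enumerate(labels))             # data term
    e += sum(beta for a, b in zip(labels, labels[1:]) if a != b)   # smoothness
    return e
```

With unary costs favouring "lesion" at the last voxel only, the labelling [0, 0, 1] pays one smoothness penalty but beats the all-background labelling, which pays a large data cost instead.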

Karimaghaloo, Zahra, et al. "Adaptive Voxel, Texture and Temporal Conditional Random Fields for Detection of Gad-Enhancing Multiple Sclerosis Lesions in Brain MRI." Medical Image Computing and Computer-Assisted Intervention–MICCAI 2013. Springer Berlin Heidelberg, 2013. 543-550.


Fast and Robust Multi-Modal Registration

Researchers: Dante De Nigris and Tal Arbel. Collaborators: D. Louis Collins

This project employs robust similarity metrics for multi-modal image registration in which features present in one image may be missing in the other, with particular interest in significantly improving computational efficiency for time-sensitive clinical interventions. Near real-time performance is achieved through smart pixel selection strategies and a GPU-accelerated implementation.
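A common pixel selection strategy is to evaluate the similarity metric only at pixels with strong gradients, since flat regions contribute little alignment signal. A pure-Python sketch of this idea (the project's actual selection strategy may differ):

```python
def select_informative_pixels(image, fraction=0.1):
    """Pick the given fraction of pixels with the strongest gradient
    magnitude; evaluating the similarity metric only on these is what
    drives most of the speed-up. image is a 2-D list of intensities.
    Gradients are clamped central differences (an illustrative choice)."""
    h, w = len(image), len(image[0])
    grads = []
    for y in range(h):
        for x in range(w):
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            grads.append((gx * gx + gy * gy, (y, x)))
    grads.sort(reverse=True)                  # strongest gradients first
    k = max(1, int(fraction * h * w))
    return [pos for _, pos in grads[:k]]
```

On an image with a single vertical edge, the selected pixels cluster along that edge, which is exactly where the metric is informative about alignment.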


Registration Validation

Researchers: Dante De Nigris and Tal Arbel

Medical image registration is often performed too coarsely, and some implementations can yield suboptimal solutions. This project seeks to characterize the uncertainty of a solution to an image registration problem and provides an algorithm for detecting incorrect registrations.
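One simple way to characterize registration uncertainty is to probe the similarity metric around the converged solution: a sharply peaked metric suggests a trustworthy optimum, while a flat one flags a possibly incorrect registration. A one-dimensional sketch (the function name and probing scheme are illustrative assumptions, not the project's algorithm):

```python
def local_uncertainty(metric, theta_opt, step=1.0):
    """Probe the similarity metric around a converged transform parameter
    and report how sharply peaked it is: the worst-case drop in the
    metric under a small perturbation. A value near zero means the
    optimum is flat (ambiguous) and the registration may be unreliable."""
    m0 = metric(theta_opt)
    drops = [m0 - metric(theta_opt + d) for d in (-step, step)]
    return min(drops)   # worst-case drop around the optimum
```

A sharply peaked metric such as -theta**2 probed at its maximum yields a large drop, while a constant metric yields zero, flagging ambiguity.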


MRI-ultrasound registration for the correction of intra-operative brain shift

Researchers: Tal Arbel, Xavier Morandi and D. Louis Collins

In this project, we explore the use of intra-operative ultrasound (US) images to quantify and correct for non-linear brain deformations. We develop a multi-modal image registration strategy that involves (i) building "predicted-ultrasound" images prior to surgery based on segmented MRI, and (ii) automatically matching the predicted US images to real intra-operative US images. By providing the surgeon with a set of updated MRI images for surgical guidance, procedures can be performed with higher precision, improving surgical outcomes and overall patient care.

For more information, see the project website.


Bayesian MS lesion classification modeling regional and local spatial information

Researchers: Rola Harmounche, Tal Arbel. Collaborators: D. Louis Collins, Dr. D.L. Arnold, Simon Francis.

A fully automatic Bayesian method for multiple sclerosis (MS) lesion classification is presented. The posterior probability distribution is used to determine voxel labels for regular tissue as well as T1-hypointense lesions and T2-hyperintense lesions, and to provide experts with a confidence level in the classification. Spatial variability in intensity distributions over the brain is explicitly modeled by segmenting the brain into distinct anatomical regions and building the likelihood distributions of each tissue class in each region based on multimodal magnetic resonance image (MRI) intensities. Local smoothness is ensured by incorporating neighboring voxel information in the prior probability via Markov random fields. Validation is done for both lesion types on real data from ten patients with MS. Lesion classification results are compared to five expert raters and two other automatic classification techniques, using volume count and overlap. The classification results obtained with the presented method are comparable to manual classifications in both the cerebral hemispheres and posterior fossa.
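The region-specific Bayesian step can be illustrated as follows: each anatomical region carries its own intensity likelihood per tissue class, and the voxel posterior is the normalised product of class prior and regional likelihood. All parameters below are made up for illustration, and the MRF smoothness prior over neighbouring voxels is omitted:

```python
import math

def gaussian(x, mu, sigma):
    """Univariate Gaussian density (stand-in for the learned likelihoods)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def classify_voxel(intensity, region, likelihoods, priors):
    """Posterior over tissue classes for one voxel, with the intensity
    likelihood chosen according to the voxel's anatomical region.
    likelihoods[region][cls] = (mean, std); priors[cls] = prior prob.
    Returns a normalised posterior distribution over classes."""
    post = {}
    for cls, prior in priors.items():
        mu, sigma = likelihoods[region][cls]
        post[cls] = prior * gaussian(intensity, mu, sigma)
    z = sum(post.values())
    return {cls: p / z for cls, p in post.items()}
```

With hypothetical parameters, a bright voxel is assigned to the lesion class even under a strong prior for normal tissue, because the regional likelihood dominates.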

For more information, see the project website.


Entropy based gaze planning

Researchers: Tal Arbel and Frank P. Ferrie

This work introduces a probabilistic active object recognition system capable of identifying an object from a known database based on information gathered sequentially from different points of view, by moving the camera in curvilinear arcs around a viewsphere centered about the object. The notion of an entropy map is introduced as a means of encoding prior knowledge about the discriminability of objects as a function of viewing position. Empirical results show how entropy maps can be used to guide a sensor towards informative viewpoints, leading to confident assertions about the identity of the unknown object in a small number of steps.
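In outline, gaze planning with an entropy map reduces to looking up, among the viewpoints the sensor can reach next, the one promising the lowest expected entropy over object hypotheses. A minimal sketch (the map values and helper names are hypothetical):

```python
import math

def entropy(dist):
    """Shannon entropy of a discrete distribution (in nats)."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def best_next_view(entropy_map, reachable):
    """Given an entropy map (expected posterior entropy over object
    hypotheses at each viewpoint, precomputed offline), pick the
    reachable viewpoint promising the most discriminative observation,
    i.e. the lowest expected entropy."""
    return min(reachable, key=lambda v: entropy_map[v])
```

The sensor moves to the chosen viewpoint, fuses the new observation into its belief, and repeats until the posterior over object identities is sufficiently peaked.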

For more information, see the project website.


Interactive visual dialog

Researchers: Tal Arbel and Frank P. Ferrie

In this work, an interactive engine was built that recognizes objects waved in front of a television camera by a human. The context is that of a supermarket checkout scenario in which objects presented by the operator to a camera are automatically tallied. In order to improve the recognition, the system provides feedback about which objects it currently believes the operator might be holding, with an indication of how each of these objects should next be presented to the system to minimize ambiguity. Experiments show that this human-machine dialog mechanism leads to accurate recognition results in a small number of iterations.

For more information, see the project website.


Efficient viewpoint selection for active object recognition and pose estimation

Researchers: Catherine Laporte, Rupert Brooks and Tal Arbel

In this work, a new criterion for viewpoint selection in the context of active Bayesian object recognition and pose estimation was developed. Recognition and pose estimation are performed jointly by probabilistically fusing successive observations, taking into account the dependencies between the observed scene, the data and the observation parameters, thereby acquiring knowledge about the structure of the observed objects. Based on the system's current belief state, the new observation selection criterion assigns high utility to observations whose outcome predictably facilitates distinguishing between pairs of competing hypotheses. The algorithm has relatively low complexity and lends itself to various simplifications. Experiments show that this approach achieves recognition performance comparable to the widely used mutual information maximization approach, at a much lower computational cost.
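The two ingredients can be sketched generically: a sequential Bayesian fusion update, and a utility that scores a candidate viewpoint by how well its predicted observation separates the two currently most probable hypotheses. This is a drastic simplification of the paper's pairwise criterion; all names and numbers are illustrative:

```python
def fuse_observation(belief, likelihoods):
    """One step of sequential Bayesian fusion: multiply the current belief
    over hypotheses by the observation likelihood and renormalise."""
    post = {h: belief[h] * likelihoods[h] for h in belief}
    z = sum(post.values())
    return {h: p / z for h, p in post.items()}

def pairwise_utility(belief, predicted_likelihoods):
    """Utility of a candidate viewpoint: how strongly its predicted
    observation separates the two currently most probable hypotheses.
    Scoring only the top pair keeps the criterion cheap to evaluate,
    unlike a full mutual-information computation over all hypotheses."""
    a, b = sorted(belief, key=belief.get, reverse=True)[:2]
    return abs(predicted_likelihoods[a] - predicted_likelihoods[b])
```

A viewpoint whose predicted observation looks very different under the two leading hypotheses scores high; one that looks the same under both scores zero and is skipped.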

For more information, see the project website.


Entropy-of-likelihood feature selection for image correspondence

Researchers: Matthew Toews and Tal Arbel

We have developed a means of evaluating which image points can be matched with the least ambiguity, given a particular image domain and matching process. Based on this development, we envision improving the reliability, generality and speed of matching, as well as increasing our understanding of the correspondence task.
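The entropy-of-likelihood idea can be sketched as follows: for each candidate image point, normalise its matching scores over candidate correspondences and compute the entropy; low entropy marks points that can be matched with little ambiguity. All names and scores below are illustrative:

```python
import math

def match_ambiguity(scores):
    """Entropy of the normalised matching likelihood over candidate
    correspondences for one image point: a sharply peaked likelihood
    (low entropy) marks a point that matches unambiguously."""
    z = sum(scores)
    probs = [s / z for s in scores]
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_reliable_points(candidate_scores, k):
    """Keep the k points whose match likelihoods are least ambiguous.
    candidate_scores: list of (point_name, list_of_match_scores)."""
    ranked = sorted(candidate_scores, key=lambda item: match_ambiguity(item[1]))
    return [name for name, _ in ranked[:k]]
```

A corner-like point with one dominant match score ranks ahead of a point in a flat region whose scores are spread evenly across candidates.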


MAP local histogram estimation for image registration

Researchers: Matthew Toews and Tal Arbel
