Results

3D Morphable Model Project (Sep 2002 - March 2005)

Used OpenGL to synthesize a 3D model from a single 2D face image by matching the rendered image against the 2D input image. See the technical report for details.

The following GIFs show one face being morphed into another, followed by different views of the latter face.

   

 

Object Class Recognition (April 2005 - April 2006)

Employed the Locally Linear Embedding algorithm to construct an object model that represents multiple object classes and the background in a low-dimensional space. This model supports view-invariant recognition of multiple object classes. Note that in this model each image is represented by global features. For more details, please see the BMVC 2006 paper in publications.
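To make the embedding step concrete, here is a minimal sketch of standard (unsupervised) Locally Linear Embedding in NumPy. It is not the code used in the project: the function name `lle`, its parameters, and the random vectors standing in for global image features are all assumptions made for illustration.

```python
import numpy as np

def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Minimal standard LLE of the rows of X (illustrative sketch only)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared distances; a point is never its own neighbor.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    neighbors = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 1: weights that best reconstruct each point from its neighbors.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]               # neighbors centered on point i
        G = Z @ Z.T                              # local Gram matrix
        G += (reg * np.trace(G) + 1e-9) * np.eye(n_neighbors)  # regularize
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()         # weights sum to one

    # Step 2: low-dimensional coordinates that preserve those weights.
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    _, eigvecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    # Skip the first (near-constant) eigenvector and keep the next ones.
    return eigvecs[:, 1:n_components + 1]

# Hypothetical stand-in for 40 images described by 10-dimensional global features.
rng = np.random.default_rng(0)
features = rng.random((40, 10))
embedding = lle(features, n_neighbors=5, n_components=2)
```

Each row of `embedding` is the low-dimensional coordinate of one input image. In a recognition setting, a test image would be assigned to the class of its nearest neighbors in this space.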

 

3D Object Detection (April 2006 - Feb 2009)

Constructed an object model that is similar to the one above but that captures the variation in the appearance of parts of an object class:

  1. The object is divided into cells. Each cell represents an object part and the spatial layout of the cells encodes the relationship between various object parts.
  2. Each object part is represented by dense overlapping SIFT features. In the figure below, each dot marks a location around which a fixed-size window is described by a 128-dimensional SIFT descriptor. The dots (the features) are color-coded according to their cell location.
  3. The distribution of these features across the training images is modeled in a lower-dimensional space using supervised Locally Linear Embedding, where the supervised information is given in the form of the cell location (that is, the unique color code).
  4. This produces distinct spatial clusters in the embedding space, each representing the appearance of the corresponding object part. The embedding space also contains a cluster representing the background class (the samples for which are obtained by densely sampling images that do not contain the object of interest).
  5. For multi-view detection, the view-sphere is divided into multiple view-segments and spatial clusters are built for each object view. These can be represented either in a single embedding space or, alternatively, spatial clusters in each view can be represented in a distinct embedding space.
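Steps 1 and 2 above can be sketched as follows: sample a regular grid of feature locations inside the object window and label each location with the index of the cell it falls in. This is a hedged illustration only. The function name, the grid step, and the window size are assumptions, and the SIFT descriptor computation itself is left out.

```python
def dense_grid_with_cell_labels(width, height, step=8, cells_x=2, cells_y=2):
    """Return (x, y, cell_label) for grid points inside a width x height window.

    Cells are numbered row by row, so a 2 x 2 layout gives labels 0..3.
    A descriptor window larger than `step` around each point would overlap
    its neighbors, which is what makes the sampling "dense overlapping".
    """
    samples = []
    for y in range(step // 2, height, step):
        for x in range(step // 2, width, step):
            col = min(x * cells_x // width, cells_x - 1)
            row = min(y * cells_y // height, cells_y - 1)
            samples.append((x, y, row * cells_x + col))
    return samples

# Hypothetical 64 x 64 object window sampled every 8 pixels.
samples = dense_grid_with_cell_labels(64, 64, step=8)
print(len(samples), samples[0], samples[-1])
```

The cell label attached to each location plays the role of the supervised information (the color code) used when building the embedding.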

 

  

2 x 2 cell structure and densely sampled features labeled according to their cell location.

 

 

Spatial clusters and the background cluster in a three-dimensional embedding space.

 

The resulting object model is applied to detect instances of generic object classes:

  1. Densely sampled features in a test image are projected to the embedding space and labeled as background or as belonging to a particular object part.
  2. A group of neighboring features with the same label forms a hypothesis for the existence of an object part.
  3. Groups that are spatially consistent are found based on a spatial consistency test.
  4. These vote for the center of the object and provide an estimate of the location of the object.
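Steps 2 to 4 above can be illustrated with a small pure-Python sketch. The page does not specify the grouping rule or the spatial consistency test, so both are assumptions here: groups are 4-connected grid neighbors sharing a label, and votes count as consistent when they agree within a tolerance. The `offsets` table (part centroid to object center) is hypothetical and would come from training.

```python
from collections import deque

BACKGROUND = -1

def group_features(labels):
    """labels maps grid position (col, row) -> part label.
    Returns (label, [positions]) for each 4-connected same-label group."""
    seen, groups = set(), []
    for start, label in labels.items():
        if label == BACKGROUND or start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            c, r = queue.popleft()
            members.append((c, r))
            for nb in ((c + 1, r), (c - 1, r), (c, r + 1), (c, r - 1)):
                if nb not in seen and labels.get(nb) == label:
                    seen.add(nb)
                    queue.append(nb)
        groups.append((label, members))
    return groups

def vote_for_center(groups, offsets, tol=1.0):
    """Each group votes for centroid + offsets[label]; return the mean of the
    largest set of votes that agree within tol, and the size of that set."""
    votes = []
    for label, members in groups:
        cx = sum(c for c, _ in members) / len(members)
        cy = sum(r for _, r in members) / len(members)
        dx, dy = offsets[label]
        votes.append((cx + dx, cy + dy))
    best = []
    for vx, vy in votes:
        agreeing = [(ux, uy) for ux, uy in votes
                    if abs(ux - vx) <= tol and abs(uy - vy) <= tol]
        if len(agreeing) > len(best):
            best = agreeing
    if not best:
        return None, 0
    n = len(best)
    return (sum(x for x, _ in best) / n, sum(y for _, y in best) / n), n

# Toy test image: a 4 x 4 block of features covering a 2 x 2 cell layout,
# plus one background feature and one stray feature mislabeled as part 3.
labels = {(c, r): (r // 2) * 2 + (c // 2) for r in range(4) for c in range(4)}
labels[(5, 0)] = BACKGROUND
labels[(6, 0)] = 3
offsets = {0: (1, 1), 1: (-1, 1), 2: (1, -1), 3: (-1, -1)}  # hypothetical

groups = group_features(labels)
center, support = vote_for_center(groups, offsets)
print(center, support)
```

In this toy example the four true part groups agree on one center, while the stray group votes elsewhere and is discarded by the consistency check.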

   

Labeled features in a test image. Note that the features labeled as background are not shown.

Groups of neighboring features represent an object part

 

This work was published at DAGM 2009 and CAIP 2009 (see publications). Some detection results for different object classes are shown below.

Object classes containing instances in a single view

Faces (MIT CMU Rowley Dataset)    Cars (UIUC Single-scale and Multi-scale Dataset)   Airplanes (Caltech Dataset)

Object classes containing instances across many different views (uniformly sampled from the view-sphere).

All classes are from the 3D Object Category dataset.

Cars   Bicycle   Cell Phone    Iron    Shoe   Stapler   Toaster