Combining Bottom-Up and Top-Down Vision Processes


Theoretical arguments and biological evidence indicate that top-down information should be used to guide low- and mid-level vision tasks. A mathematical formulation of this idea is the Bayesian analysis-by-synthesis approach, which views scene analysis as interpreting the observations in terms of object models. My work in this direction has focused on learning models that facilitate this approach and on applying it to problems that have both low- and high-level aspects: image segmentation and point-of-interest-based object detection.
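As a rough illustration of the Bayesian view, the interpretation of an image can be written as a posterior maximization. The symbols are notation assumed for this sketch and are not taken from the publications: I stands for the observed image and H for a hypothesis (an object model together with its parameters).

```latex
\hat{H} = \arg\max_{H} P(H \mid I), \qquad P(H \mid I) \propto P(I \mid H)\, P(H)
```

Here P(I | H) is the "synthesis" term, which scores how well the hypothesis explains the observations, and P(H) is the prior over hypotheses.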

Unsupervised Learning of Object Deformation Models [slides]

A basic prerequisite for learning top-down object models is the modelling of deformations. This is necessary in order to disentangle shape variation from appearance and structural variation, and to model each of them separately.

For this we work with two types of deformation representations: Active Appearance Models (AAMs) and part-based deformation models. We formulate the learning task as an Expectation Maximization (EM) procedure and then provide the tools needed to make the EM approach feasible. We rely on the primal sketch representation of the image, which helps develop intuitive solutions to problems such as finding the parts of objects and initializing the eigenvectors of the AAM.

Joint Image Segmentation and Object Recognition [slides]

For image segmentation, the key idea is to formulate the task as assigning observations to hypotheses and to treat object models as providing one of the hypotheses considered. In this light the EM algorithm can be applied, with the assignment of observations interpreted as the E-step and model fitting as the M-step.

This results in an iterative model fitting and image segmentation algorithm that can segment images in a top-down manner. Beyond top-down segmentation, this approach also allows bottom-up object detection results to be validated, pruning false positives.
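The alternation between assignment and fitting can be sketched on a toy problem. The following is a minimal illustration only, not the method of the papers: it assumes just two hypotheses (background and object), each modelled as a 1-D Gaussian over pixel intensity, whereas the actual work uses learned object models. The function names `em_segment` and `make_synthetic_pixels` and all parameter values are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def make_synthetic_pixels(n_per_class=100, seed=0):
    """Toy data (assumption): dark background pixels followed by bright object pixels."""
    rng = random.Random(seed)
    background = [rng.gauss(0.2, 0.05) for _ in range(n_per_class)]
    obj = [rng.gauss(0.8, 0.05) for _ in range(n_per_class)]
    return background + obj


def em_segment(pixels, n_iter=50, init_means=(0.25, 0.75)):
    """Two-hypothesis EM: index 0 = background, index 1 = object."""
    means = list(init_means)
    variances = [0.05, 0.05]
    priors = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: softly assign each observation to the competing hypotheses.
        resp = []
        for x in pixels:
            lik = [priors[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(lik) + 1e-300  # guard against division by zero
            resp.append([l / total for l in lik])
        # M-step: refit each hypothesis to the observations it was assigned.
        for k in range(2):
            weight = sum(r[k] for r in resp) + 1e-12
            means[k] = sum(r[k] * x for r, x in zip(resp, pixels)) / weight
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, pixels)) / weight
            variances[k] = max(var, 1e-6)  # keep the variance strictly positive
            priors[k] = weight / len(pixels)
    # Hard segmentation: pick the hypothesis with the larger responsibility.
    labels = [1 if r[1] > r[0] else 0 for r in resp]
    return labels, means


if __name__ == "__main__":
    labels, means = em_segment(make_synthetic_pixels())
    print("object pixels:", sum(labels), "estimated means:", means)
```

In this toy setting the object hypothesis is just another Gaussian; replacing it with a richer model whose fit is updated in the M-step is what would give the segmentation its top-down character.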

Primal Sketch Features and Bottom-Up / Top-Down Object Detection [slides]

My work on object detection addressed the problem of incorporating information related to the boundaries and symmetry axes of objects in a detection task. During model construction these features are extracted based on T. Lindeberg's scale-invariant primal sketch and broken into simple tokens using line segmentation.

Object detection proceeds in two stages. In the bottom-up stage, such features are extracted from a new image and matched to the model learned during training. In the top-down stage, the misses of the feature extraction stage are recovered by fitting a simple parametric model to the image locally. The main contribution here is an efficient algorithm for estimating the likelihood of the observations given the model, which allows graphical model algorithms, such as particle filtering, to be used for computer vision tasks.
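To show where an observation likelihood enters particle filtering, here is a generic bootstrap particle filter on a toy 1-D state. It is a sketch under stated assumptions, not the detection algorithm of the paper: the state is a single scalar position, the motion and observation models are Gaussian, and the function name `particle_filter` and its parameters are hypothetical.

```python
import math
import random


def particle_filter(observations, n_particles=500, motion_std=0.5, obs_std=1.0, seed=0):
    """Bootstrap particle filter for a scalar state observed with Gaussian noise."""
    rng = random.Random(seed)
    # Initial particles: a broad uniform prior over the state (assumption).
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: propagate each particle through a random-walk motion model.
        particles = [p + rng.gauss(0.0, motion_std) for p in particles]
        # Weight: evaluate the likelihood of the observation given each particle.
        weights = [math.exp(-(z - p) ** 2 / (2.0 * obs_std ** 2)) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Posterior mean estimate for this step.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    print(particle_filter([3.0] * 20)[-1])
```

The weighting step is the only place the model-specific likelihood is needed, which is why an efficient likelihood computation is what makes this family of algorithms practical.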

Related Publications

 

 I. Kokkinos and P. Maragos,

Synergy Between Image Segmentation and Object Recognition Using the Expectation Maximization Algorithm,

IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), accepted for publication.

I. Kokkinos and A. Yuille,

Unsupervised Learning of Object Deformation Models, supplement,

Proc. IEEE Int'l. Conf. on Computer Vision (ICCV), 2007.

 

I. Kokkinos, P. Maragos and A. Yuille,

Bottom-Up and Top-Down Object Detection Using Primal Sketch Features and Graphical Models,

Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2006.

 

I. Kokkinos and P. Maragos,

An Expectation Maximization Approach to the Synergy Between Image Segmentation and Object Categorization,

Proc. IEEE Int'l. Conf. on Computer Vision (ICCV), 2005.

 
