Active Basis Model, Shared Sketch Algorithm, and Sum-Max Maps

Ying Nian Wu, Zhangzhang Si, Haifeng Gong, and Song-Chun Zhu







The model is generative: it is a linear composition of active wavelet elements localized in both spatial and frequency domains.

New Additions

(1) C++ codes were released on June 11, 2008.
(2) A bug in gabor.m, concerning the symbolic representation of Gabor wavelets for display purposes, was fixed on July 3, 2008. The bug does not affect the display of existing results; it manifests itself only when displaying much bigger Gabors. Revised file: gaborfilter.m
(3) Experiment 10 was added on July 11, 2008.
(4) Experiment 8 was added on July 27, 2008.
(5) Experiment 7.2 was added on August 2, 2008. To be compared with Experiment 8.2.


Notes


Paper and Presentation

Journal paper pdf Version March 28, 2008
latex and eps files Version March 28, 2008
ppt presentation May 2008, UC Irvine
ppt in Chinese March 2008, Kaohsiung

Source Codes for Experiments

Experiments 1 and 2 Supervised learning and detection
Experiment 3 Supervised learning and classification
Experiment 4 Clustering by EM and K-means
Experiment 5 Learning from non-aligned images
Experiment 6 Learning moving template
Experiment 7 Composing multiple templates
Experiment 8 Geometric transformation of template
Experiment 10 Synthesis by multi-scale Gabors and DoGs
C++ implementation by Haifeng Gong and Zhangzhang Si, Version June 11, 2008

TITLE: Active Basis for Modeling, Learning and Recognizing Deformable Templates

ABSTRACT: This article proposes an active basis model, a shared sketch algorithm, and a computational architecture of sum-max maps for representing, learning, and recognizing deformable templates. In our generative model, a deformable template is in the form of an active basis, which consists of a small number of Gabor wavelet elements at selected locations and orientations. These elements are allowed to slightly perturb their locations and orientations before they are linearly combined to generate the observed image. The active basis model, in particular, the locations and the orientations of the basis elements, can be learned from training images by the shared sketch algorithm. The algorithm selects the elements of the active basis sequentially from a dictionary of Gabor wavelets at a dense collection of locations and orientations. When an element is selected at each step, the element is shared by all the training images, and the element is perturbed to encode or sketch a nearby edge segment in each training image. The recognition of the deformable template from an image can be accomplished by a computational architecture that alternates the sum maps and the max maps. The computation of the max maps deforms the active basis to match the image data, and the computation of the sum maps scores the template matching by the log-likelihood of the deformed active basis.

Active Basis Model


Active basis. Each basis element is illustrated by a thin ellipsoid at a certain location and orientation. The upper half shows the perturbation of one basis element. By shifting its location or orientation or both within a limited range, the basis element (illustrated by a black ellipsoid) can change to other Gabor wavelet elements (illustrated by the blue ellipsoids). Because of the perturbations of the basis elements, the active basis represents a deformable template.
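The caption above and the abstract can be summarized in one formula. The following is a sketch only: the symbols (I_m for the m-th image, B for a Gabor wavelet element, c for coefficients, U for the residual, and the bounds b_1, b_2) are notation assumed here for illustration, not taken from this page.

```latex
% Assumed notation: n basis elements with nominal locations x_i and
% orientations \alpha_i; each image m uses a perturbed copy of each element.
I_m \;=\; \sum_{i=1}^{n} c_{m,i}\, B_{x_i + \Delta x_{m,i},\; \alpha_i + \Delta\alpha_{m,i}} \;+\; U_m,
\qquad
|\Delta x_{m,i}| \le b_1, \quad |\Delta\alpha_{m,i}| \le b_2 .
```

The nominal elements (x_i, alpha_i) form the template shared by all images, while the perturbations are specific to each image; the bounds b_1 and b_2 express the "limited range" mentioned in the caption.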

Active basis formed by 60 Gabor wavelet elements. The first block displays the 60 elements, where each element is represented by a bar. For each of the other 7 blocks, the left plot is the observed image, and the right plot displays the 60 Gabor wavelet elements resulting from locally shifting the 60 elements in the first block to fit the corresponding observed image.

Animation of the perturbations of the elements of active basis.

Shared Sketch Algorithm



Shared sketch algorithm. A selected element (colored ellipsoid) is shared by all the training images. For each image, a perturbed version of the element sketches a local edge segment near the element. The elements of the active basis are selected sequentially according to the Kullback-Leibler divergence of the pooled distribution (colored solid curve) of filter responses from the background distribution (black dotted curve).
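The sequential selection described above can be sketched in a few lines of Python. This is a simplified illustration, not the released code. Several things are assumptions made here: filter responses are precomputed and stored as `responses[m][x][o]` (image, 1-D position, orientation index); the selection score is the plain sum of locally maximized responses rather than the Kullback-Leibler divergence used in the paper; and a simple local inhibition step zeroes responses near each sketched edge so that the same edge is not selected twice. The names `local_max` and `shared_sketch` are hypothetical.

```python
def local_max(resp, x, o, shift, rot):
    """Return (value, x', o'): the largest response within the
    allowed perturbation range of the element at (x, o)."""
    n_x, n_o = len(resp), len(resp[0])
    best = (float("-inf"), x, o)
    for dx in range(-shift, shift + 1):
        xx = x + dx
        if not 0 <= xx < n_x:
            continue  # positions outside the image are skipped
        for do in range(-rot, rot + 1):
            oo = (o + do) % n_o  # orientation index wraps around
            if resp[xx][oo] > best[0]:
                best = (resp[xx][oo], xx, oo)
    return best


def shared_sketch(responses, n_elements, shift=1, rot=1):
    """Greedy selection of shared elements (simplified sketch).

    Returns a list of (element, perturbed) pairs, where `element` is the
    shared (x, o) and `perturbed` holds one (x', o') per training image.
    """
    resp = [[row[:] for row in r] for r in responses]  # working copies
    n_x, n_o = len(resp[0]), len(resp[0][0])
    selected = []
    for _ in range(n_elements):
        best_score, element, perturbed = float("-inf"), None, None
        for x in range(n_x):
            for o in range(n_o):
                hits = [local_max(r, x, o, shift, rot) for r in resp]
                # Assumed score: summed local maxima (stand-in for the KL criterion).
                score = sum(h[0] for h in hits)
                if score > best_score:
                    best_score, element = score, (x, o)
                    perturbed = [(h[1], h[2]) for h in hits]
        selected.append((element, perturbed))
        # Assumed inhibition: zero the responses around each sketched edge.
        for r, (px, po) in zip(resp, perturbed):
            for dx in range(-shift, shift + 1):
                for do in range(-rot, rot + 1):
                    xx = px + dx
                    if 0 <= xx < n_x:
                        r[xx][(po + do) % n_o] = 0.0
    return selected
```

Each returned pair holds the shared element together with its per-image perturbed versions, which mirrors the caption: one element is shared, and each image sketches its own nearby edge segment.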

Animation of the shared sketch algorithm.

Sum-Max Maps


Sum-max maps. The SUM1 maps are obtained by convolving the input image with Gabor filters at all the locations and orientations. The ellipsoids in the SUM1 maps illustrate the local filtering or summation operation. The MAX1 maps are obtained by applying a local maximization operator to the SUM1 maps. The arrows in the MAX1 maps illustrate the perturbations over which the local maximization is taken. The SUM2 maps are computed by applying a local summation operator to the MAX1 maps, where the summation is over the elements of the active basis. This operation computes the log-likelihood of the deformed active basis, and can be interpreted as a shape filter.
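The MAX1 and SUM2 steps can be illustrated with a minimal Python sketch. The assumptions are as follows: the SUM1 map is taken as given in the form `sum1[x][o]` (1-D position, orientation index) instead of being computed by Gabor filtering; each template element is a hypothetical tuple `(dx, o, lam, log_z)` with an offset from the template origin, an orientation, a weight, and a normalizing term; and an element that falls outside the image contributes a response of zero. The function names `max1` and `sum2` are chosen here for illustration.

```python
def max1(sum1, shift=1, rot=1):
    """MAX1: local maximum of SUM1 over small shifts in position
    and orientation (orientation index wraps around)."""
    n_x, n_o = len(sum1), len(sum1[0])
    out = [[0.0] * n_o for _ in range(n_x)]
    for x in range(n_x):
        lo, hi = max(0, x - shift), min(n_x, x + shift + 1)
        for o in range(n_o):
            out[x][o] = max(
                sum1[xx][(o + do) % n_o]
                for xx in range(lo, hi)
                for do in range(-rot, rot + 1)
            )
    return out


def sum2(max1_map, template):
    """SUM2: for every candidate template position x, sum the MAX1
    responses of all elements, weighted as lam * r - log_z.

    `template` is a list of (dx, o, lam, log_z) tuples (assumed format).
    """
    n_x = len(max1_map)
    scores = []
    for x in range(n_x):
        total = 0.0
        for dx, o, lam, log_z in template:
            xx = x + dx
            # Assumed boundary rule: elements outside the image respond with 0.
            r = max1_map[xx][o] if 0 <= xx < n_x else 0.0
            total += lam * r - log_z
        scores.append(total)
    return scores
```

The position with the largest SUM2 score is the best match of the template. Because the maximization is done once in MAX1, the same MAX1 maps can be reused to score every template position, which is what makes the alternation of sum and max maps efficient.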

Recursive Active Basis and Sum-Max Maps


Sum-max maps. A SUM2 map is computed for each sub-template. For each SUM2 map, a MAX2 map is computed by applying a local maximization operator to the SUM2 map. Then a SUM3 map is computed by summing over the two MAX2 maps. The SUM3 map scores the template matching, where the template consists of two sub-templates that are allowed to locally shift their locations.
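The same pattern, one level up, can be sketched as follows. The SUM2 maps are taken as given 1-D lists of scores, and each sub-template has a hypothetical nominal offset from the origin of the composite template. The function names `max2` and `sum3`, and the choice to score a position as negative infinity when a sub-template falls outside the image, are assumptions made for this illustration.

```python
def max2(sum2_map, shift=1):
    """MAX2: local maximum of a SUM2 map over small location shifts."""
    n = len(sum2_map)
    return [max(sum2_map[max(0, x - shift):min(n, x + shift + 1)])
            for x in range(n)]


def sum3(max2_maps, offsets):
    """SUM3: for every position x of the composite template, add the
    MAX2 score of each sub-template at its nominal offset from x."""
    n = len(max2_maps[0])
    out = []
    for x in range(n):
        total = 0.0
        for m, d in zip(max2_maps, offsets):
            xx = x + d
            # Assumed boundary rule: the composite template does not fit here.
            total += m[xx] if 0 <= xx < n else float("-inf")
        out.append(total)
    return out
```

With two sub-templates, `sum3([max2(a), max2(b)], offsets)` scores the composite template while letting each sub-template shift independently by up to `shift` positions around its nominal offset.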

Key References

Form of representation: Scheme of learning: Architecture of computation:

Past Work

Our work is a continuation of our long-term search for generative models and model-based algorithms, as well as our attempt to understand these models within a common information-theoretical framework. The active basis model can be considered a revision of our previous model on textons. It can also be viewed as an inhomogeneous version of the Markov random field model that we previously developed for textures. More importantly, the active basis model is the simplest instance of the and-or graph in the compositional framework that we have been studying. The and-or grammar naturally suggests further composing multiple active bases to represent more articulated shapes. The architecture of the sum-max maps is a natural computational tool for parsing the observed image according to the and-or grammar.
  1. Wu, Y. N., Guo, C., and Zhu, S. C. (2008) From information scaling to regimes of statistical models. Quarterly of Applied Mathematics, 66, 81-122. pdf latex ppt
  2. Wu, Y. N., Si, Z., Fleming, C., and Zhu, S. C. (2007) Deformable template as active basis. Proceedings of International Conference of Computer Vision. pdf latex ppt
  3. Wu, Y. N., Li, J., Liu, Z., and Zhu, S. C. (2007) Statistical principles in image modeling. Technometrics, 49, 249-261. pdf latex
  4. Guo, C., Zhu, S. C. and Wu, Y. N. (2007) Primal sketch: integrating structure and texture. Computer Vision and Image Understanding, 106, 5-19. pdf latex ppt
  5. Zhu, S. C. and Mumford, D. (2006) A stochastic grammar of images. Foundations and Trends in Computer Graphics and Vision, 2, 259-362. pdf
  6. Zhu, S. C., Guo, C., Wang, Y., and Xu, Z. (2005) What are Textons? International Journal of Computer Vision, 62, 121-143. pdf
  7. Zhu, S. C. (2003) Statistical modeling and conceptualization of visual patterns. IEEE Pattern Analysis and Machine Intelligence, 25, 691-712. pdf
  8. Guo, C., Zhu, S. C., and Wu, Y. N. (2003) Towards a mathematical theory of primal sketch and sketchability. Proceedings of International Conference of Computer Vision. 1228-1235. pdf latex ppt
  9. Guo, C., Zhu, S. C., and Wu, Y. N. (2003) Visual learning by integrating descriptive and generative models. International Journal of Computer Vision, 53, 5-29. pdf
  10. Zhu, S. C., Guo, C., Wu, Y. N. and Wang, Y. (2002) What are textons? Proceedings of European Conference of Computer Vision, 793-807. pdf
  11. Wu, Y. N., Zhu, S. C., and Guo, C. (2002) Statistical modeling of texture sketch. Proceedings of European Conference of Computer Vision, 240-254. pdf latex ppt
  12. Wu, Y. N., Zhu, S. C., and Liu, X. (2000) Equivalence of Julesz ensembles and FRAME models. International Journal of Computer Vision, 38, 245-261. pdf
  13. Zhu, S. C., Liu, X., and Wu, Y. N. (2000) Exploring texture ensembles by efficient Markov chain Monte Carlo - towards a `trichromacy' theory of texture. IEEE Pattern Analysis and Machine Intelligence, 22, 554-569. pdf
  14. Zhu, S. C., Wu, Y. N., and Mumford, D. B. (1998) Minimax entropy principle and its application to texture modeling. Neural Computation, 9, 1627-1660. pdf
  15. Zhu, S. C. and Mumford, D. (1997) Prior Learning and Gibbs Reaction-Diffusion. IEEE Pattern Analysis and Machine Intelligence, 19, 1236-1250. pdf
  16. Zhu, S. C., Wu, Y. N., and Mumford, D. B. (1997) Filter, Random field, And Maximum Entropy (FRAME): towards a unified theory for texture modeling. International Journal of Computer Vision, 27, 107-126. pdf
  17. Zhu, S. C. and Yuille, A. L. (1996) A Flexible object recognition and modeling system. International Journal of Computer Vision, 20, 187-212. pdf
