Active Basis Model, Shared Sketch Algorithm, and Sum-Max Maps
Ying Nian Wu, Zhangzhang Si, Haifeng Gong, and Song-Chun Zhu
New|
Notes |
Paper and talks |
Source codes and experiments |
Abstract |
Model |
Algorithm |
Architecture |
Recursive |
References |
Past work



The model is generative: it is a linear composition of active wavelet elements
localized in both spatial and frequency domains.
(1) C++ codes were released on June 11, 2008.
(2) A bug in gabor.m, concerning the symbolic representation of Gabor wavelets
for display purposes, was fixed on July 3, 2008. The bug has no effect on the
display of existing results. It manifests itself only when displaying much
bigger Gabors.
revised gaborfilter.m
(3) Experiment 10 was added on July 11, 2008.
(4) Experiment 8 was added on July 27, 2008.
(5) Experiment 7.2 was added on August 2, 2008. To be compared with Experiment 8.2.
- This is a reproducibility page for the results presented in the journal
version of the paper.
- Three purposes of this page:
(1) Reproduce the results in the paper.
(2) Share the codes with the community.
(3) Version control for the codes and the paper.
- The journal version has more experimental results and better theoretical
understanding than the ICCV paper. This page contains more efficient computer
codes than the ICCV07 reproducibility page.
- Matlab, C and C++ codes are copyrighted by the authors.
- Last updated on August 2, 2008. We shall continue to post reports, results
and codes on this page. We have stopped updating the ICCV07 reproducibility page.
- Contact ywu@stat.ucla.edu
Journal paper pdf
Version March 28, 2008
latex and eps files
Version March 28, 2008
ppt presentation
May 2008, UC Irvine
ppt in Chinese
March 2008, Kaohsiung
Experiments 1 and 2
Supervised learning and detection
Experiment 3
Supervised learning and classification
Experiment 4
Clustering by EM and K-means
Experiment 5
Learning from non-aligned images
Experiment 6
Learning moving template
Experiment 7
Composing multiple templates
Experiment 8
Geometric transformation of template
Experiment 10
Synthesis by multi-scale Gabors and DoGs
C++ implementation by Haifeng Gong and Zhangzhang Si, Version June 11, 2008
ABSTRACT: This article proposes an active basis model, a shared sketch algorithm,
and a computational architecture of sum-max maps for representing, learning, and
recognizing deformable templates. In our generative model, a deformable template
is in the form of an active basis, which consists of a small number of Gabor wavelet
elements at selected locations and orientations. These elements are allowed to
slightly perturb their locations and orientations before they are linearly combined
to generate the observed image. The active basis model, in particular, the locations
and the orientations of the basis elements, can be learned from training images
by the shared sketch algorithm. The algorithm selects the elements of the active
basis sequentially from a dictionary of Gabor wavelets at a dense collection of
locations and orientations. When an element is selected at each step, the element
is shared by all the training images, and the element is perturbed to encode or
sketch a nearby edge segment in each training image. The recognition of the
deformable template from an image can be accomplished by a computational
architecture that alternates the sum maps and the max maps. The computation of
the max maps deforms the active basis to match the image data, and the computation
of the sum maps scores the template matching by the log-likelihood of the deformed
active basis.
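
In symbols, the generative model described in the abstract can be sketched as follows. The notation is an assumption made here for illustration and does not appear on this page: I_m is the m-th observed image, B_{x, \alpha} is a Gabor wavelet element at location x and orientation \alpha, c_{m,i} are the linear coefficients, U_m is the unexplained residual image, and b_1, b_2 bound the allowed perturbations.

```latex
\[
I_m \;=\; \sum_{i=1}^{n} c_{m,i}\,
      B_{x_i + \Delta x_{m,i},\; \alpha_i + \Delta\alpha_{m,i}} \;+\; U_m,
\qquad
|\Delta x_{m,i}| \le b_1, \quad |\Delta\alpha_{m,i}| \le b_2 .
\]
```

The nominal elements (x_i, \alpha_i), i = 1, ..., n, are shared by all images, while the perturbations (\Delta x_{m,i}, \Delta\alpha_{m,i}) are specific to each image m.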

Active basis. Each basis element is illustrated by a thin ellipsoid at a certain
location and orientation. The upper half shows the perturbation of one basis element.
By shifting its location or orientation or both within a limited range, the basis
element (illustrated by a black ellipsoid) can change to other Gabor wavelet
elements (illustrated by the blue ellipsoids). Because of the perturbations of the
basis elements, the active basis represents a deformable template.
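
The set of perturbed versions of one basis element can be enumerated with a few lines of code. The following is a minimal sketch, not the released Matlab or C++ code. It assumes that orientations are indexed 0..n_orient-1 over half a circle, that orientation index 0 is the horizontal direction, and that the location shift is taken along the normal of the element. The function name and all parameter names are hypothetical.

```python
import math

def perturbations(x, y, o, n_orient=16, max_shift=3, max_rot=1):
    """Return the set of (x, y, orientation) triples reachable from one element.

    Assumptions (not from the source page): the shift is along the normal
    direction of the element, and orientation indices wrap around modulo n_orient.
    """
    theta = math.pi * o / n_orient              # orientation angle of the element
    nx, ny = -math.sin(theta), math.cos(theta)  # unit normal to the element
    reachable = set()                           # a set, since rounding may repeat pixels
    for d in range(-max_shift, max_shift + 1):  # shift in location
        for do in range(-max_rot, max_rot + 1): # shift in orientation
            reachable.add((x + round(d * nx),
                           y + round(d * ny),
                           (o + do) % n_orient))
    return reachable

elements = perturbations(10, 10, 0)
print(len(elements))
```

With the default ranges, a horizontal element at (10, 10) can move to 7 locations and 3 orientations, which gives 21 candidate Gabor wavelet elements.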

Active basis formed by 60 Gabor wavelet elements. The first block displays the 60
elements, where each element is represented by a bar. For each of the other 7
blocks, the left plot is the observed image, and the right plot displays the
60 Gabor wavelet elements resulting from locally shifting the 60 elements
in the first block to fit the corresponding observed image.

Animation of the perturbations of the elements of the active basis.
Shared sketch algorithm. A selected element (colored ellipsoid) is shared by
all the training images. For each image, a perturbed version of the element sketches
a local edge segment near the element. The elements of the active basis are selected
sequentially according to the Kullback-Leibler divergence of the pooled distribution
(colored solid curve) of filter responses from the background distribution (black
dotted curve).

Animation of the shared sketch algorithm.
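
The sequential selection described in the caption above can be illustrated with a toy sketch. It is a simplification under stated assumptions, not the authors' implementation: each candidate element comes with one locally maximized filter response per training image, the pooled and background distributions are approximated by histograms on fixed bins, and inhibition simply discards candidates that overlap a selected element. The names `shared_sketch`, `inhibits`, and `edges` are hypothetical.

```python
import bisect
import math

def histogram(values, edges):
    """Normalized histogram; values outside the range fall into the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect.bisect_right(edges, v) - 1
        counts[min(max(i, 0), len(counts) - 1)] += 1
    return [c / len(values) for c in counts]

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete distributions on the same bins."""
    return sum(pi * math.log(pi / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def shared_sketch(max_resp, background, edges, n_select, inhibits):
    """Greedily select elements whose pooled responses differ most from background.

    max_resp:   {candidate: [perturbed (max) response in each training image]}
    background: responses collected from background images
    inhibits:   function (selected, candidate) -> True if candidate overlaps selected
    """
    q = histogram(background, edges)
    remaining = dict(max_resp)
    selected = []
    while remaining and len(selected) < n_select:
        # Score every remaining candidate by KL(pooled || background).
        best = max(remaining,
                   key=lambda c: kl_divergence(histogram(remaining[c], edges), q))
        selected.append(best)
        # Simplified inhibition: drop the winner and everything it overlaps.
        remaining = {c: r for c, r in remaining.items()
                     if c != best and not inhibits(best, c)}
    return selected

responses = {"a": [9, 8, 9, 10], "b": [1, 2, 1, 0], "c": [5, 9, 6, 8]}
background = [0, 1, 1, 2, 0, 1, 3, 2]
print(shared_sketch(responses, background, [0, 4, 8, 12], 2,
                    lambda s, c: {s, c} == {"a", "c"}))
```

In this toy example, candidate "a" responds strongly in every image and is selected first. Candidate "c" is then suppressed because it is declared to overlap "a", so "b" is selected next.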

Sum-max maps. The SUM1 maps are obtained by convolving the input image with Gabor
filters at all the locations and orientations. The ellipsoids in the SUM1 maps
illustrate the local filtering or summation operation. The MAX1 maps are obtained
by applying a local maximization operator to the SUM1 maps. The arrows in the
MAX1 maps illustrate the perturbations over which the local maximization is taken.
The SUM2 maps are computed by applying a local summation operator to the MAX1 maps,
where the summation is over the elements of the active basis. This operation
computes the log-likelihood of the deformed active basis, and can be interpreted
as a shape filter.
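
The MAX1 and SUM2 operations can be sketched on a toy example. To keep the code short, the sketch assumes a one-dimensional image, so that `sum1[o][x]` is the filter response at orientation index `o` and position `x`. The Gabor filtering that produces SUM1 is not shown. The per-element weight and the constant `log_z` in the template are assumed placeholders for whatever form the log-likelihood takes, and all function names are hypothetical.

```python
def max1_map(sum1, max_shift=1, max_rot=1):
    """MAX1[o][x]: maximum of SUM1 over nearby positions and orientations."""
    n_orient, width = len(sum1), len(sum1[0])
    max1 = [[0.0] * width for _ in range(n_orient)]
    for o in range(n_orient):
        for x in range(width):
            max1[o][x] = max(
                sum1[(o + do) % n_orient][x + dx]
                for do in range(-max_rot, max_rot + 1)
                for dx in range(-max_shift, max_shift + 1)
                if 0 <= x + dx < width          # stay inside the image
            )
    return max1

def sum2_map(max1, template):
    """SUM2[x]: score of the template placed at position x.

    template: list of (offset, orientation, weight, log_z), one per basis element.
    Elements that fall outside the image contribute a MAX1 value of 0 (assumption).
    """
    width = len(max1[0])
    sum2 = []
    for x in range(width):
        score = 0.0
        for offset, o, weight, log_z in template:
            pos = x + offset
            m = max1[o][pos] if 0 <= pos < width else 0.0
            score += weight * m - log_z
        sum2.append(score)
    return sum2

# Toy SUM1 maps: 4 orientations, 6 positions, two strong responses.
sum1 = [[0.0] * 6 for _ in range(4)]
sum1[0][2] = 5.0
sum1[2][4] = 3.0
template = [(0, 0, 1.0, 0.0), (2, 2, 1.0, 0.0)]  # two elements, two positions apart
print(sum2_map(max1_map(sum1), template))
```

Because MAX1 absorbs small shifts in location and orientation, the template obtains the same high SUM2 score at several neighboring positions, which is the sense in which the active basis deforms to match the image.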

Sum-max maps. A SUM2 map is computed for each sub-template. For each SUM2 map,
a MAX2 map is computed by applying a local maximization operator to the SUM2 map.
Then a SUM3 map is computed by summing over the two MAX2 maps. The SUM3 map
scores the template matching, where the template consists of two sub-templates
that are allowed to locally shift their locations.
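
The second layer of sum-max maps follows the same pattern. The sketch below again assumes one-dimensional maps for brevity. It takes one SUM2 map per sub-template, computes the MAX2 maps by a local maximum, and adds them at the nominal offsets of the sub-templates to obtain SUM3. The function names, the `offsets` argument, and the treatment of out-of-range positions are assumptions for illustration.

```python
def local_max(values, radius):
    """MAX2: maximum of a SUM2 map within `radius` positions of each location."""
    n = len(values)
    return [max(values[max(0, x - radius):min(n, x + radius + 1)])
            for x in range(n)]

def sum3_map(sum2_maps, offsets, radius=1):
    """SUM3[x]: sum over sub-templates of MAX2 at the sub-template's nominal offset."""
    max2_maps = [local_max(m, radius) for m in sum2_maps]
    width = len(sum2_maps[0])
    sum3 = []
    for x in range(width):
        total = 0.0
        for max2, offset in zip(max2_maps, offsets):
            if 0 <= x + offset < width:   # sub-templates outside the image add nothing
                total += max2[x + offset]
        sum3.append(total)
    return sum3

sum2_a = [0, 0, 4, 0, 0, 0, 0]   # first sub-template matches best at position 2
sum2_b = [0, 0, 0, 0, 0, 6, 0]   # second sub-template matches best at position 5
print(sum3_map([sum2_a, sum2_b], offsets=[0, 2], radius=1))
```

The two sub-templates are nominally two positions apart but are found three positions apart in the toy data. The local maximization lets each of them shift by one position, so SUM3 still reaches the full score of 10 at positions 2 and 3.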
Form of representation:
- B. A. Olshausen and D. J. Field, Emergence of simple-cell receptive field properties
by learning a sparse code for natural images, Nature, 381, 607-609, 1996.
- S. C. Zhu, C. E. Guo, Y. Z. Wang, and Z. J. Xu, What are textons, International
Journal of Computer Vision, 62, 121-143, 2005.
- S. C. Zhu and D. B. Mumford, A stochastic grammar of images, Foundations and
Trends in Computer Graphics and Vision, 2, 259-362, 2006.
Scheme of learning:
- S. Mallat and Z. Zhang, Matching pursuit in a time-frequency dictionary, IEEE
Transactions on Signal Processing, 41, 3397-3415, 1993.
- J. H. Friedman, Exploratory projection pursuit, Journal of the American Statistical
Association, 82, 249-266, 1987.
- P. A. Viola and M. J. Jones, Robust real-time face detection, International
Journal of Computer Vision, 57, 137-154, 2004.
- S. C. Zhu, Y. N. Wu, and D. B. Mumford, Minimax entropy principle and its application
to texture modeling, Neural Computation, 9, 1627-1660, 1997.
- S. Della Pietra, V. Della Pietra, and J. Lafferty, Inducing features of random fields,
IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 380-393, 1997.
Architecture of computation:
- M. Riesenhuber and T. Poggio, Hierarchical models of object recognition in cortex,
Nature Neuroscience, 2, 1019-1025, 1999.
Our work is a continuation of our long-term search for generative models and
model-based algorithms, as well as our attempt to understand these models within
a common information-theoretical framework. The active basis model can be considered
a revision of our previous model on textons. It can also be viewed as an inhomogeneous
version of the Markov random field model that we previously developed for textures.
More importantly, the active basis model is the simplest instance of the and-or graph
in the compositional framework that we have been studying. The and-or grammar naturally
suggests further composing multiple active bases to represent more articulated shapes.
The architecture of the sum-max maps is a natural computational tool for parsing the
observed image according to the and-or grammar.
- Wu, Y. N., Guo, C., and Zhu, S. C. (2008) From information scaling to regimes of
statistical models. Quarterly of Applied Mathematics, 66, 81-122.
- Wu, Y. N., Si, Z., Fleming, C., and Zhu, S. C. (2007) Deformable template as
active basis. Proceedings of the International Conference on Computer Vision.
- Wu, Y. N., Li, J., Liu, Z., and Zhu, S. C. (2007) Statistical principles in image
modeling. Technometrics, 49, 249-261.
- Guo, C., Zhu, S. C. and Wu, Y. N. (2007) Primal sketch: integrating structure
and texture. Computer Vision and Image Understanding, 106, 5-19.
- Zhu, S. C. and Mumford, D. (2006) A stochastic grammar of images. Foundations
and Trends in Computer Graphics and Vision, 2, 259-362.
- Zhu, S. C., Guo, C., Wang, Y., and Xu, Z. (2005) What are textons? International
Journal of Computer Vision, 62, 121-143.
- Zhu, S. C. (2003) Statistical modeling and conceptualization of visual patterns.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, 691-712.
- Guo, C., Zhu, S. C., and Wu, Y. N. (2003) Towards a mathematical theory of primal
sketch and sketchability. Proceedings of the International Conference on Computer Vision,
1228-1235.
- Guo, C., Zhu, S. C., and Wu, Y. N. (2003) Visual learning by integrating descriptive
and generative models. International Journal of Computer Vision, 53, 5-29.
- Zhu, S. C., Guo, C., Wu, Y. N., and Wang, Y. (2002) What are textons? Proceedings of
the European Conference on Computer Vision, 793-807.
- Wu, Y. N., Zhu, S. C., and Guo, C. (2002) Statistical modeling of texture sketch.
Proceedings of the European Conference on Computer Vision, 240-254.
- Wu, Y. N., Zhu, S. C., and Liu, X. (2000) Equivalence of Julesz ensembles and
FRAME models. International Journal of Computer Vision, 38, 245-261.
- Zhu, S. C., Liu, X., and Wu, Y. N. (2000) Exploring texture ensembles by efficient
Markov chain Monte Carlo - towards a `trichromacy' theory of texture. IEEE Transactions
on Pattern Analysis and Machine Intelligence, 22, 554-569.
- Zhu, S. C., Wu, Y. N., and Mumford, D. B. (1997) Minimax entropy principle and its
application to texture modeling. Neural Computation, 9, 1627-1660.
- Zhu, S. C. and Mumford, D. (1997) Prior Learning and Gibbs Reaction-Diffusion.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 1236-1250.
- Zhu, S. C., Wu, Y. N., and Mumford, D. B. (1998) Filters, Random fields And Maximum
Entropy (FRAME): towards a unified theory for texture modeling. International Journal of
Computer Vision, 27, 107-126.
- Zhu, S. C. and Yuille, A. L. (1996) A Flexible object recognition and modeling system.
International Journal of Computer Vision, 20, 187-212.