Learning

Code and data with README (ZIP) for learning mixed templates.

Running the Demo

The demo (ZIP) is written in MATLAB/MEX-C. We include both mex32 and mex64 files compiled from the MEX-C source code. To run the demo, unzip it to a local folder and set that folder as your MATLAB working directory. Then type at the MATLAB command line:

>> run_demo

Windows will pop up one by one. Each window shows four example images from an object category together with the mixed template learned from them. Press any key to continue to the next category. Below are several screen captures taken while the demo runs. Bold black strokes represent sketch features, while red dots represent orientation histograms. Each red dot is rendered as a weighted superposition of oriented bars, where the weights are the bin values of the orientation histogram.
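The rendering of a red dot described above can be illustrated with a minimal sketch. This is not the demo's MATLAB drawing code: the function name render_histogram_glyph, the glyph size, and the bar width are all assumptions made for illustration. It draws one bar per orientation bin through the glyph center and adds each bar with its histogram bin value as the weight.

```python
import math

def render_histogram_glyph(hist, size=15, bar_half_width=0.5):
    """Superpose one oriented bar per histogram bin, weighted by the bin value.

    hist: list of non-negative bin values, bin k covering angle pi * k / len(hist).
    Returns a size x size grid (list of rows) of accumulated intensities.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    n = len(hist)
    c = size // 2              # index of the center pixel
    half_len = c               # each bar spans the whole glyph
    img = [[0.0] * size for _ in range(size)]
    for k, weight in enumerate(hist):
        theta = math.pi * k / n
        ct, st = math.cos(theta), math.sin(theta)
        for row in range(size):
            for col in range(size):
                x, y = col - c, c - row          # coordinates relative to center
                along = x * ct + y * st          # position along the bar
                across = abs(x * st - y * ct)    # distance from the bar axis
                if across <= bar_half_width and abs(along) <= half_len:
                    img[row][col] += weight
    return img
```

A histogram dominated by one bin yields a glyph that looks like a single oriented stroke, while a flat histogram yields a star-like, direction-less blob.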

[Screen captures: bonsai, hedgehog, pig head]

To save time, we have included the model files so that the mixed templates are loaded directly. Sketch features in the mixed templates are stored in rawmodel_Basis*.mat, while texture features are stored in rawmodel_tex*.mat. To re-learn these templates, the user may simply remove these "rawmodel*" files, and then start "run_demo" again.


Files in the Package

run_demo.m

The main function, which displays example images, performs mixed template learning, and illustrates the symbolic templates.

get_examples.m

Load example images from a specified directory.

learn_mixed.m

Learn a mixed image template from example images of a given category and display it in the pop-up window. For display purposes, we choose among several template sizes (100*100 ~ 150*150) depending on the category.

learnBasis.m, ABlearn2.m

Select sketch features into the mixed template. To estimate the model parameter (lambda) and normalizing constant, we store a table for E[r] and a table for Z, both indexed by candidate values of lambda. The tables are precomputed from ~100 natural images, and stored in nlf.mat. Please refer to the submitted paper for more details.
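The table look-up described above can be sketched as follows. This is an illustration under stated assumptions, not the code in learnBasis.m: it assumes an exponential-family model in which a feature response r is reweighted by exp(lambda * r) / Z(lambda), so that lambda is estimated by matching the table value of E[r] to the mean response observed on the training examples. The toy reference responses stand in for the statistics pooled from natural images in nlf.mat, and the function names build_tables and estimate_lambda are hypothetical.

```python
import math

def build_tables(reference_responses, lambdas):
    """Precompute (lambda, E[r], log Z) rows from reference responses.

    reference_responses is a toy stand-in for responses pooled from
    natural images; the real tables are stored in nlf.mat.
    """
    tables = []
    for lam in lambdas:
        weights = [math.exp(lam * r) for r in reference_responses]
        z = sum(weights) / len(weights)                      # Z(lambda)
        mean_r = sum(r * w for r, w in zip(reference_responses, weights)) / sum(weights)
        tables.append((lam, mean_r, math.log(z)))
    return tables

def estimate_lambda(observed_mean, tables):
    """Pick the candidate lambda whose E[r] is closest to the observed mean.

    Returns (lambda, gain), where gain = lambda * observed_mean - log Z(lambda)
    is the average log-likelihood ratio against the reference distribution.
    """
    lam, _, log_z = min(tables, key=lambda row: abs(row[1] - observed_mean))
    return lam, lam * observed_mean - log_z
```

With tables built once, estimating lambda for any candidate feature is a nearest-value search rather than an optimization, which is presumably why the tables are precomputed.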

learnTex.m, learnOriHist_minVar.m, learnOriHist_table.m

Select texture (orientation histogram) features into the mixed template. As stated in the paper, there are two options for computing the information gain of texture features: (1) assume a one-dimensional Gaussian, which leads to minimum-variance features (learnOriHist_minVar.m); (2) use a table look-up similar to the one used for sketch features (learnOriHist_table.m). We used learnOriHist_minVar.m for all results reported in the paper, and also for this demo. To use the table look-up instead, replace "learnOriHist_minVar" with "learnOriHist_table" in learnTex.m.
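The minimum-variance option can be illustrated with a short sketch. It assumes that each candidate texture feature has been reduced to one scalar response per training example, and that candidates are ranked by how little that response varies across examples. The internals of learnOriHist_minVar.m are not described here, so the function name select_min_variance and the input layout are hypothetical.

```python
from statistics import pvariance

def select_min_variance(responses, k):
    """Return the k candidate features whose responses vary least across examples.

    responses: dict mapping a feature id to a list of scalar responses,
    one per training example (a hypothetical layout for illustration).
    """
    ranked = sorted(responses, key=lambda feature: pvariance(responses[feature]))
    return ranked[:k]
```

A feature whose response is nearly constant across the examples of a category is a stable descriptor of that category, which is the intuition behind preferring low variance.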

config1.m

Basic configuration: specify global parameters and generate Gabor filters.

Learning Result: (Explanation about the learning algorithm)

The following pictures are symbolized templates automatically learned from example images. In each row, the leftmost picture is the learned template, followed by the information gains of the selected features, and then by the first few training examples. Black strokes represent structural edges (sketches), and red blobs represent orientation histograms (note that some are direction-less). It is interesting to see how the two types of features complement each other in explaining different regions of the image lattice, though they do overlap along parts of the object boundary.



Columns (left to right): learned mixed template | feature competition plot | first few training examples
Rows (one per category):
hedgehog
rabbit head
bear head
bonsai
palm tree
zebra
wolf head
cow head
pig head
cat head
fish
elephant head
windsor chair
hand fan
eagle head
lion head
sheep head
teapot
dog head
motorcycle
soccer ball
pizza
panda head
clock
pigeon head
tiger head
monkey head
rabbit head


* For presentation purposes, we display sketches and textures at only one scale.

The learning method follows directly from the active basis model. It is formulated as a step-wise variable selection and parameter estimation, which we may call feature pursuit.

We learn an image model which best reconstructs positive example images. The learning method resembles matching pursuit, but it replaces the L2 error term with a negative log-likelihood, which is more meaningful for discriminative tasks. Formulated as a minimization of the KL divergence between the model and the true image distribution, the algorithm extracts informative image components so that the residual image looks as much like a natural image as possible, rather than as much like small i.i.d. Gaussian noise as possible. Although it has so far been tested only on categorization tasks, the simplicity of the learning and matching algorithms suggests that this tool could be put to use in many computer vision applications.
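The step-wise selection described above can be sketched as a greedy loop. This is a simplified illustration, not the package's implementation: it assumes each candidate feature already carries a fixed information gain, and it borrows a local inhibition step (suppressing candidates near a selected one) as an assumption about how overlapping features are kept from being selected twice. The function name feature_pursuit, the tuple layout, and the radius parameter are hypothetical.

```python
def feature_pursuit(candidates, max_features, min_gain=0.0, inhibit_radius=2.0):
    """Greedy feature selection by information gain with local inhibition.

    candidates: list of (name, x, y, gain) tuples (hypothetical layout).
    Repeatedly selects the remaining candidate with the highest gain,
    then removes every candidate within inhibit_radius of it.
    Stops when the best gain is not above min_gain.
    """
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < max_features:
        best = max(remaining, key=lambda c: c[3])
        if best[3] <= min_gain:
            break                      # no remaining feature is informative enough
        selected.append(best)
        r2 = inhibit_radius ** 2
        remaining = [
            c for c in remaining
            if (c[1] - best[1]) ** 2 + (c[2] - best[2]) ** 2 > r2
        ]
    return selected
```

Because sketch and texture candidates compete in the same pool, whichever type explains a region with the larger gain wins that region, which matches the complementary coverage seen in the learned templates.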

Both sketch and texture features are defined on a dictionary of Gabor elements at 16 orientations at different scales.
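A dictionary of this kind can be sketched as below. The filters in the package are generated in config1.m; this standalone version is only an illustration, and the kernel sizes, wavelength, envelope width, and aspect ratio are assumed values rather than the ones used by the authors. Only the count of 16 orientations comes from the text.

```python
import math

def gabor_kernel(half_size, theta, wavelength, sigma, aspect=0.5):
    """Cosine-phase Gabor kernel on a (2 * half_size + 1) square grid."""
    ct, st = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            xr = x * ct + y * st           # coordinate along the wave direction
            yr = -x * st + y * ct          # coordinate along the elongated axis
            envelope = math.exp(-(xr ** 2 + (aspect * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def gabor_bank(n_orientations=16, half_sizes=(4, 6)):
    """Build kernels keyed by (half_size, orientation index).

    half_sizes plays the role of the scales; the tie between size,
    wavelength, and sigma is an assumption made for illustration.
    """
    bank = {}
    for s in half_sizes:
        for k in range(n_orientations):
            theta = math.pi * k / n_orientations
            bank[(s, k)] = gabor_kernel(s, theta, wavelength=float(s), sigma=s / 2.0)
    return bank
```

Orientations span half a turn because a cosine-phase Gabor element at angle theta is identical to the one at theta plus pi.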