Unsupervised learning, which uses no category labels, is most suitable for discovering visual structures and clusters in the training data. The following example shows the learned part dictionary and the AND-OR grammar that generates the observed images.
Code and data: (ZIP).
A simple AND/OR graph learned from 320 animal face images. For a legible display, we use only four of the nine subsampled window positions inside the whole object, and OR branches with probability smaller than 0.05 are not shown. (a) The animal categories bear, cat, cow and wolf share similar parts, such as a sharp ear and a round nose. These parts also vary widely, even within the same animal category. (b) First, shape templates are learned using active basis with EM clustering. For each position, we select the top 5 clusters and their templates, ranked by prior probability. The numbers correspond to terminal nodes in the bottom figure, where we visualize the learned AND/OR graph. A bold edge coming out of an OR node means its conditional probability is larger than 0.4, and a dotted edge means its probability is smaller than 0.1. The bottom figure shows the AND/OR graph learned from the labeled sequences using the shape dictionary in (b). The AND/OR graph has 30 prefix nodes, compared with 71 prefix nodes in the recorded trie structure. Around 50% of the free parameters are eliminated.
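The display conventions and the trie baseline described in the caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the code in the ZIP archive: the function names (`edge_style`, `top_clusters`, `count_trie_nodes`), the toy data, and the choice not to count the trie root as a prefix node are all assumptions made here. The thresholds 0.05, 0.1 and 0.4 and the top-5 selection are taken from the caption.

```python
from typing import Dict, List, Sequence, Tuple

# Thresholds taken from the figure caption.
HIDE_BELOW = 0.05    # OR branches below this probability are not shown
DOTTED_BELOW = 0.1   # edges below this probability are drawn dotted
BOLD_ABOVE = 0.4     # edges above this probability are drawn bold


def edge_style(prob: float) -> str:
    """Map an OR-branch probability to a drawing style (hypothetical helper)."""
    if prob < HIDE_BELOW:
        return "hidden"
    if prob > BOLD_ABOVE:
        return "bold"
    if prob < DOTTED_BELOW:
        return "dotted"
    return "normal"


def top_clusters(priors: Dict[int, float], k: int = 5) -> List[Tuple[int, float]]:
    """Return the k (cluster id, prior) pairs with the largest prior probability."""
    return sorted(priors.items(), key=lambda kv: kv[1], reverse=True)[:k]


def count_trie_nodes(sequences: Sequence[Sequence[int]]) -> int:
    """Count distinct non-empty prefixes, i.e. trie nodes excluding the root.

    Assumption: each image is encoded as a sequence of terminal-node labels,
    one per window position, and the root is not counted as a prefix node.
    """
    prefixes = set()
    for seq in sequences:
        for end in range(1, len(seq) + 1):
            prefixes.add(tuple(seq[:end]))
    return len(prefixes)


if __name__ == "__main__":
    # Toy cluster priors for one window position (made-up numbers).
    priors = {1: 0.30, 2: 0.25, 3: 0.20, 4: 0.12, 5: 0.08, 6: 0.05}
    print(top_clusters(priors))

    # Toy labeled sequences over four window positions (made-up labels).
    sequences = [[1, 7, 12, 18], [1, 7, 13, 18], [2, 7, 12, 19]]
    print(count_trie_nodes(sequences))

    for p in (0.03, 0.07, 0.25, 0.50):
        print(p, edge_style(p))
```

A trie stores every distinct prefix of the labeled sequences as its own node, which is why it serves as the uncompressed baseline. The AND/OR graph shares substructure across those prefixes, giving the reduction from 71 to 30 prefix nodes reported above. The merging step itself is not shown here because the caption does not specify it.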