Research Interests

Learning Representational Visual Knowledge

Objective: Learning hierarchical and compositional models, with a unified deep And-Or graph (AOG) structure (a directed acyclic graph) and appearance features, directly from weakly labeled visual big data in a principled way.

Overview: Our method unfolds the space of latent scene/object structures by exploiting compositionality and reconfigurability in a grammar-like manner, as illustrated by the well-known Chinese Tangram puzzle. We define a dictionary of part types by tiling the image lattice. For each part type, we enumerate all valid instances by placing it at each valid position in the image lattice. This yields an overcomplete set of part instances, which are organized into a quantization AOG. We then learn both the AOG structure of the scene/object model (i.e., the optimal sub-AOG of the quantization AOG) and the appearance features directly from raw data.

[Figure: Tangram]

Theoretical Justification: In the long run, we seek statistical theories of performance-guaranteed learning of AOGs in the context of lifelong learning.
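The enumeration of part types and part instances described in the overview can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: it assumes part types are axis-aligned rectangles measured in lattice cells, and that a part instance decomposes by a single horizontal or vertical cut into two smaller instances. The names `PartInstance`, `part_types`, `part_instances`, and `binary_splits` are hypothetical.

```python
from itertools import product
from typing import List, NamedTuple, Tuple


class PartInstance(NamedTuple):
    """A rectangular part placed on the lattice (hypothetical representation)."""
    x: int  # left column, in cells
    y: int  # top row, in cells
    w: int  # width, in cells
    h: int  # height, in cells


def part_types(grid_w: int, grid_h: int) -> List[Tuple[int, int]]:
    """Dictionary of part types: every (width, height) that fits in the lattice."""
    return list(product(range(1, grid_w + 1), range(1, grid_h + 1)))


def part_instances(grid_w: int, grid_h: int) -> List[PartInstance]:
    """Overcomplete set of instances: each part type at every valid position."""
    instances = []
    for w, h in part_types(grid_w, grid_h):
        for x in range(grid_w - w + 1):
            for y in range(grid_h - h + 1):
                instances.append(PartInstance(x, y, w, h))
    return instances


def binary_splits(inst: PartInstance) -> List[Tuple[PartInstance, PartInstance]]:
    """Candidate decompositions of one instance into two child instances.

    Assumption: each split plays the role of an And-node (both children are
    required), and choosing among the splits plays the role of an Or-node.
    """
    splits = []
    for cut in range(1, inst.w):  # vertical cuts: left | right
        left = PartInstance(inst.x, inst.y, cut, inst.h)
        right = PartInstance(inst.x + cut, inst.y, inst.w - cut, inst.h)
        splits.append((left, right))
    for cut in range(1, inst.h):  # horizontal cuts: top / bottom
        top = PartInstance(inst.x, inst.y, inst.w, cut)
        bottom = PartInstance(inst.x, inst.y + cut, inst.w, inst.h - cut)
        splits.append((top, bottom))
    return splits
```

Under these assumptions, every child produced by `binary_splits` is itself a member of `part_instances`, so the instances and their splits form a directed acyclic graph in which sub-parts are shared across many compositions. Selecting one split per instance would correspond to picking a sub-graph of this overcomplete structure.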

Learning Computational Visual Knowledge

Objective: Learning to compute faster by scheduling bottom-up/top-down computing processes in hierarchical models for (near-)optimal inference, i.e., maximizing the "gain" (accuracy) and minimizing the "pain" (computational costs).

Overview: Our strategy for scheduling bottom-up/top-down computing processes is similar in spirit to the one commonly adopted when playing Minesweeper, that is, goal-guided cost-sensitive computing. We factorize the computation of inference in a principled way so that its components can be scheduled to obtain the optimal or near-optimal interpretation of a given input at minimal computational cost (e.g., using early rejection/acceptance). The scheduling balances the loss of misclassification (i.e., the respective penalties for a false negative and a false positive) against the cost of computation (controlled by a parameter λ).

[Figure: ThreeProcesses]

Theoretical Justification: In the long run, we seek a multi-armed bandit framework for optimally scheduled inferential computing.
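The trade-off between misclassification loss and computational cost can be illustrated with a small cascade that stops as soon as an early rejection or acceptance is possible. This is a hedged sketch, not the scheduling method itself: it assumes a fixed order of stages with additive scores and hand-set thresholds, whereas the overview describes learning the schedule. The names `cascade_infer` and `risk`, the penalties `c_fn` and `c_fp`, and the parameter `lam` (standing in for λ) are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# A stage is (score function, computational cost); an assumed representation.
Stage = Tuple[Callable[[float], float], float]


def cascade_infer(
    x: float,
    stages: Sequence[Stage],
    reject_below: Sequence[float],
    accept_above: Sequence[float],
) -> Tuple[bool, float]:
    """Run stages in order, stopping early when the running score is decisive.

    Returns (decision, total cost actually spent).
    """
    score = 0.0
    cost = 0.0
    for (score_fn, stage_cost), lo, hi in zip(stages, reject_below, accept_above):
        score += score_fn(x)
        cost += stage_cost
        if score < lo:   # early rejection: later stages are skipped
            return False, cost
        if score > hi:   # early acceptance: later stages are skipped
            return True, cost
    # No threshold fired: fall back to the sign of the accumulated score.
    return score > 0.0, cost


def risk(decision: bool, truth: bool, cost: float, lam: float,
         c_fn: float = 1.0, c_fp: float = 1.0) -> float:
    """Misclassification penalty plus lam times the computation spent."""
    if truth and not decision:
        loss = c_fn      # false negative
    elif decision and not truth:
        loss = c_fp      # false positive
    else:
        loss = 0.0
    return loss + lam * cost
```

For example, with a cheap first stage and an expensive second stage, inputs that are clearly negative or clearly positive pay only the first stage's cost, while ambiguous inputs pay for both. Raising `lam` makes computation more expensive relative to errors, which favors looser thresholds and earlier stopping.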