Scene Understanding: Parsing, Learning, Modeling and Categorization

1. Case study I: Image Parsing by Data-Driven Markov Chain Monte Carlo (DDMCMC)
2. Case study II: A Hierarchical and Contextual Model for Aerial Image Parsing
3. Case study III: Top-down/Bottom-up Parsing of Rectangular World by Image Grammar
4. Case study IV: Scene Parsing and Labeling by Cluster Sampling
5. Case study V: Unsupervised Scene Categorization

The goal is to construct the parse graph (i.e., the full generative representation) of an input image on the fly, using a data-driven Markov Chain Monte Carlo (DDMCMC) paradigm.

Top-down/Bottom-up Parsing of Rectangle Scene by Image Grammar [Project page]

We present a simple attribute graph grammar as a generative representation for man-made scenes, such as buildings, hallways, kitchens, and living rooms, and study an effective top-down/bottom-up inference algorithm that parses images by maximizing a Bayesian posterior probability or, equivalently, minimizing a description length (MDL). This simple grammar has one class of primitives as its terminal nodes: the projections of planar rectangles in 3-space onto the image plane. It has six production rules for the spatial layout of the rectangular surfaces.
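The equivalence between maximizing the posterior and minimizing a description length can be written out as follows. The symbols are assumed notation, not taken from the project page: W denotes the parse graph and I the input image.

```latex
\begin{aligned}
W^{*} &= \arg\max_{W} \; p(W \mid I) \\
      &= \arg\max_{W} \; p(I \mid W)\, p(W) \\
      &= \arg\min_{W} \; \bigl[ -\log p(I \mid W) - \log p(W) \bigr]
\end{aligned}
```

Under Shannon coding, -log p(W) can be read as the code length of the parse graph and -log p(I | W) as the code length of the image given that parse graph, so the last line is the total description length.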

A Hierarchical and Contextual Model for Aerial Image Parsing [Project page]

Aerial image understanding is an important field of research for tackling the problems of automated navigation, large-scale 3D scene construction, and object tracking for use in event detection. Most tasks that use aerial images need, or would benefit from, a full explanation of the scene, consisting of the locations and scales of detected objects and their relationships to one another. Being able to identify objects of many different types and understand their relationships to one another gives a deeper understanding of the data and allows subsequent algorithms to make smarter decisions faster.

Scene Parsing and Labeling by Cluster Sampling [Project page]

Our objective is to parse scene images into objects (such as cows, cars, and human bodies) and generic regions (e.g., sky, water, and grass).

We adopt generative models for both objects and generic regions. Both kinds of model are learned under the same information projection principle and are therefore comparable. The inference algorithm follows the data-driven Markov Chain Monte Carlo (DDMCMC) paradigm, in which the object and generic region models cooperate and compete for an optimal interpretation of the scene in a Bayesian framework.
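The core idea of a data-driven proposal can be illustrated with a deliberately small sketch. It is not the project's algorithm: it is a plain Metropolis-Hastings independence sampler over three competing labels for a single region, where a fixed proposal distribution stands in for bottom-up cues. The function name, the labels, and all numbers are hypothetical.

```python
import random


def independence_mh(posterior, proposal, n_steps, seed=0):
    """Metropolis-Hastings with a fixed proposal over discrete hypotheses.

    posterior: dict mapping hypothesis -> unnormalized posterior weight (> 0)
    proposal:  dict mapping hypothesis -> proposal probability (> 0),
               standing in for bottom-up, data-driven cues (an assumption)
    Returns the fraction of steps spent in each hypothesis.
    """
    rng = random.Random(seed)
    states = list(posterior)
    weights = [proposal[s] for s in states]

    current = rng.choices(states, weights=weights)[0]
    counts = {s: 0 for s in states}

    for _ in range(n_steps):
        # Propose a hypothesis from the data-driven distribution.
        candidate = rng.choices(states, weights=weights)[0]
        # Acceptance ratio for an independence proposal q:
        # p(candidate) q(current) / (p(current) q(candidate)).
        ratio = (posterior[candidate] * proposal[current]) / (
            posterior[current] * proposal[candidate]
        )
        if rng.random() < min(1.0, ratio):
            current = candidate
        counts[current] += 1

    return {s: c / n_steps for s, c in counts.items()}


# Hypothetical toy numbers: three labels competing to explain one region.
toy_posterior = {"sky": 6.0, "water": 3.0, "grass": 1.0}  # unnormalized
toy_proposal = {"sky": 0.5, "water": 0.3, "grass": 0.2}   # bottom-up guess
frequencies = independence_mh(toy_posterior, toy_proposal, n_steps=20000)
```

The visit frequencies approximate the normalized posterior however rough the proposal is, provided the proposal gives every hypothesis nonzero probability. A proposal closer to the posterior mainly raises the acceptance rate, which is the motivation for driving proposals from the data.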

Unsupervised Learning for Discovering Scene Categories [Project page]

We pose unsupervised scene categorization as a graph partition problem under a Bayesian framework by treating each image as a vertex.

Automatic feature selection (i.e., what is what) is addressed by the information projection principle: both informative and discriminative features are discovered and used to learn the likelihood model for each category.

Automatic cluster number selection (i.e., what goes with what) is addressed by the cluster sampling strategy: the cluster number that maximizes the Bayesian posterior probability is selected.
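Selecting the cluster number by maximizing a posterior can be sketched on a toy problem. Several things here are assumptions made for illustration: each image is reduced to a single scalar feature, the likelihood is a fixed-variance Gaussian around each cluster mean, the prior is a constant penalty per cluster, and the search is a brute-force enumeration of contiguous partitions rather than cluster sampling. All names and parameter values are hypothetical.

```python
from itertools import combinations


def log_posterior(clusters, sigma=0.5, penalty=3.0):
    """Unnormalized log posterior of a partition (toy model, assumed)."""
    log_lik = 0.0
    for cluster in clusters:
        mean = sum(cluster) / len(cluster)
        # Gaussian log-likelihood, dropping terms constant across partitions.
        log_lik += sum(-0.5 * ((x - mean) / sigma) ** 2 for x in cluster)
    log_prior = -penalty * len(clusters)  # favors fewer clusters
    return log_lik + log_prior


def best_partition(features, sigma=0.5, penalty=3.0):
    """Enumerate contiguous partitions of the sorted features and
    return the one with the highest log posterior, with its score."""
    xs = sorted(features)
    n = len(xs)
    best, best_score = None, float("-inf")
    for k in range(1, n + 1):
        # Choose k - 1 cut positions between consecutive sorted points.
        for cuts in combinations(range(1, n), k - 1):
            bounds = (0, *cuts, n)
            clusters = [xs[a:b] for a, b in zip(bounds, bounds[1:])]
            score = log_posterior(clusters, sigma, penalty)
            if score > best_score:
                best, best_score = clusters, score
    return best, best_score


# Hypothetical scalar features for eight images.
toy_features = [0.0, 0.1, 0.2, 5.0, 5.1, 9.9, 10.0, 10.2]
partition, score = best_partition(toy_features)
```

The number of clusters is not passed in. It falls out of the partition with the highest score, because the likelihood term rewards tight clusters while the prior term penalizes each additional cluster. Exhaustive enumeration is only feasible at this toy scale.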