Image Parsing via Stochastic Scene Grammar

Yibiao Zhao and Song-Chun Zhu




Abstract

This paper proposes a parsing algorithm for scene understanding that covers four tasks: computing the 3D scene layout, detecting 3D objects (e.g., furniture), detecting 2D faces (windows, doors, etc.), and segmenting the background. In contrast to previous scene labeling work that applied discriminative classifiers to pixels (or super-pixels), we use a generative Stochastic Scene Grammar (SSG). This grammar represents the compositional structure of visual entities, from scene categories through 3D foreground/background and 2D faces down to 1D lines.

The grammar includes three types of production rules and two types of contextual relations. Production rules: (i) AND rules represent the decomposition of an entity into sub-parts; (ii) OR rules represent the switching among sub-types of an entity; (iii) SET rules represent an ensemble of visual entities. Contextual relations: (i) Cooperative "+" relations represent positive links between binding entities, such as hinged faces of an object or aligned boxes; (ii) Competitive "-" relations represent negative links between competing entities, such as mutually exclusive boxes.
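
As an illustration only, the three rule types and two relation types described above could be held in a small data structure like the one below. This is a minimal sketch, not the authors' implementation: the names `Rule`, `Node`, `make_face`, `count_terminals`, and `violates`, the four-lines-per-face decomposition, and the example entities are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set, Tuple


class Rule(Enum):
    AND = "and"            # decomposition of an entity into sub-parts
    OR = "or"              # switch among sub-types of an entity
    SET = "set"            # ensemble of visual entities
    TERMINAL = "terminal"  # hypothetical marker for a 1D line segment


@dataclass
class Node:
    name: str
    rule: Rule
    children: List["Node"] = field(default_factory=list)


def make_face(name: str) -> Node:
    """Assumed decomposition: a 2D face is an AND of four 1D lines."""
    lines = [Node(f"{name}_line{i}", Rule.TERMINAL) for i in range(4)]
    return Node(name, Rule.AND, lines)


def count_terminals(node: Node) -> int:
    """Count the 1D line segments at the leaves of a parse tree."""
    if node.rule is Rule.TERMINAL:
        return 1
    return sum(count_terminals(child) for child in node.children)


def violates(active: Set[str], competitive: List[Tuple[str, str]]) -> bool:
    """True if two entities joined by a "-" relation are both active."""
    return any(a in active and b in active for a, b in competitive)


# A toy parse tree: scene -> foreground (SET of boxes) + background.
box = Node("box_1", Rule.AND, [make_face("top"), make_face("side")])
foreground = Node("foreground", Rule.SET, [box])
# In a parse tree, an OR node keeps only the alternative that was chosen.
opening = Node("opening", Rule.OR, [make_face("window")])
background = Node("background", Rule.AND, [make_face("wall"), opening])
scene = Node("scene", Rule.AND, [foreground, background])

cooperative = [("top", "side")]     # "+": hinged faces of one object
competitive = [("box_1", "box_2")]  # "-": mutually exclusive boxes
```

Storing relations as plain pairs of entity names keeps them separate from the tree, which mirrors the abstract's distinction between vertical production rules and horizontal contextual links.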

We design an efficient MCMC inference algorithm, hierarchical cluster sampling, to search the large solution space of scene configurations. The algorithm has two stages: (i) Clustering: it forms all possible higher-level structures (clusters) from lower-level entities using the production rules and contextual relations. (ii) Sampling: it jumps between alternative structures (clusters) in each layer of the hierarchy to find the most probable configuration (represented by a parse tree).
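
The two stages above can be sketched in a heavily simplified, single-layer form. This is an illustrative toy, not the paper's algorithm: connected components over "+" links stand in for the clustering stage, a plain Metropolis sampler that switches one candidate on or off stands in for the cluster-jumping stage, and the function names, scores, penalty, and temperature are all assumed values.

```python
import math
import random


def form_clusters(entities, cooperative):
    """Stage 1 (simplified): group entities linked by "+" relations."""
    parent = {e: e for e in entities}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in cooperative:
        parent[find(a)] = find(b)
    groups = {}
    for e in entities:
        groups.setdefault(find(e), []).append(e)
    return sorted(sorted(g) for g in groups.values())


def energy(active, scores, competitive, penalty=10.0):
    """Lower is better: reward active clusters, penalize "-" conflicts."""
    e = -sum(scores[c] for c in active)
    e += penalty * sum(1 for a, b in competitive if a in active and b in active)
    return e


def sample(scores, competitive, steps=2000, temperature=1.0, seed=0):
    """Stage 2 (simplified): Metropolis moves that toggle one cluster."""
    rng = random.Random(seed)
    names = sorted(scores)
    active = set()
    cur = energy(active, scores, competitive)
    best, best_e = set(active), cur
    for _ in range(steps):
        proposal = active ^ {rng.choice(names)}  # switch one cluster on/off
        new = energy(proposal, scores, competitive)
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-(new - cur) / temperature).
        if new <= cur or rng.random() < math.exp((cur - new) / temperature):
            active, cur = proposal, new
            if cur < best_e:
                best, best_e = set(active), cur
    return best, best_e


clusters = form_clusters(["f1", "f2", "f3"], [("f1", "f2")])
scores = {"box_a": 2.0, "box_b": 1.5, "window": 1.0}  # assumed scores
best, best_e = sample(scores, competitive=[("box_a", "box_b")])
```

In this toy the sampler keeps the lowest-energy configuration it has visited; because `box_a` and `box_b` compete, a good configuration contains at most one of them.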

Demo



Results



Publication

Yibiao Zhao, Song-Chun Zhu, Image Parsing with Stochastic Scene Grammar, the Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS 2011), Granada, Spain.

Bibtex

	@inproceedings{ZhaoZhu2011,
   		author="Yibiao Zhao and Song-Chun Zhu", 
   		title="Image Parsing via Stochastic Scene Grammar", 
   		booktitle="Advances in Neural Information Processing Systems", 
   		year="2011"
	}

Downloads

Coming soon...



The work is supported by grants from NSF IIS-1018751, NSF CNS-1028381 and ONR MURI N00014-10-1-093