Sparse coding in practice

Chakra Chennubhotla and Allan Jepson

Department of Computer Science, Univ. of Toronto



The goal of sparse coding is to find a linear basis representation in which each image is represented by a small number of active coefficients. The learning algorithm adapts a set of basis vectors while imposing a {\em low-entropy}, or sparse, prior on the output coefficients. Sparse coding applied to natural images has been shown to extract wavelet-like structure \cite{OlsFie,Harpur}. However, our experience in using sparse coding to extract multi-scale structure from object-specific ensembles, such as face images or images of a gesturing hand, has been negative. In this paper we highlight three points about the reliability of sparse coding for extracting the desired structure: $(1)$ using an {\em overcomplete} representation, $(2)$ projecting the data into a low-dimensional subspace before attempting to resolve the sparse structure, and $(3)$ applying a sparsity constraint to the basis elements, as opposed to the output coefficients.
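
For reference, the objective commonly minimized in sparse coding can be sketched as below. The notation is assumed here for illustration and is not taken from the paper: $x_i$ denotes the $i$-th image as a vector, $B$ the matrix whose columns are the basis vectors, $a_i$ the coefficient vector for image $i$, $S(\cdot)$ a sparsity-inducing penalty (for example $S(a) = |a|$), and $\lambda > 0$ a weight that trades reconstruction error against sparsity.

\begin{equation}
E(B, \{a_i\}) = \sum_i \left\| x_i - B a_i \right\|^2 + \lambda \sum_i \sum_j S(a_{ij})
\end{equation}

Under the same assumed notation, placing the sparsity constraint on the basis elements rather than the output coefficients would roughly correspond to penalizing the entries of $B$, e.g. replacing the second term with $\lambda \sum_{k} \sum_{j} S(B_{kj})$. This is only one possible reading of that idea, not the paper's stated formulation.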
