University of California, Los Angeles (UCLA), USA
This paper proposes an alternating back-propagation algorithm for learning the generator network model. The model is a non-linear generalization of factor analysis. In this model, the mapping from the latent factors to the observed vector is parametrized by a convolutional neural network. The alternating back-propagation algorithm iterates between the following two steps: (1) inferential back-propagation, which infers the latent factors by Langevin dynamics or gradient descent, and (2) learning back-propagation, which updates the parameters given the inferred latent factors by gradient descent. The gradient computations in both steps are powered by back-propagation, and the two steps share most of their code. We show that the alternating back-propagation algorithm can learn realistic generator models of natural images, video sequences, and sounds. Moreover, it can also be used to learn from incomplete or indirect training data.
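The two alternating steps can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the released MATLAB code: it assumes a Gaussian model with a standard normal prior on the latent factors and noise level `sigma`, and it replaces the convolutional network with a toy generator `tanh(Z W1) W2`. The function names, the step sizes, and the initialization are all assumptions made for this example. A single `backward` routine returns the gradients used by both steps, which mirrors the remark that the two steps share most of their code.

```python
import numpy as np

def forward(Z, W1, W2):
    """Toy generator f(Z; W) = tanh(Z W1) W2 (stand-in for the ConvNet)."""
    H = np.tanh(Z @ W1)
    return H, H @ W2

def backward(Y, Z, W1, W2, sigma):
    """Shared back-propagation: gradients of
    log p(Y, Z; W) = -||Y - f(Z; W)||^2 / (2 sigma^2) - ||Z||^2 / 2
    with respect to Z, W1 and W2."""
    H, X = forward(Z, W1, W2)
    R = (Y - X) / sigma**2                # gradient w.r.t. generator output
    dA = (R @ W2.T) * (1.0 - H**2)        # back through tanh
    grad_Z = dA @ W1.T - Z                # likelihood term plus N(0, I) prior
    grad_W1 = Z.T @ dA
    grad_W2 = H.T @ R
    return grad_Z, grad_W1, grad_W2

def alternating_backprop(Y, d=2, hidden=16, sigma=0.5, n_iters=300,
                         n_langevin=10, step=0.05, lr=0.01, seed=0):
    """Alternate Langevin inference of Z with gradient updates of W."""
    rng = np.random.default_rng(seed)
    n, D = Y.shape
    Z = rng.standard_normal((n, d))                    # latent factors
    W1 = 0.5 * rng.standard_normal((d, hidden))        # assumed initialization
    W2 = rng.standard_normal((hidden, D)) / np.sqrt(hidden)
    errors = []
    for _ in range(n_iters):
        # (1) Inferential back-propagation: Langevin dynamics on Z,
        #     warm-started from the Z of the previous iteration.
        for _ in range(n_langevin):
            grad_Z, _, _ = backward(Y, Z, W1, W2, sigma)
            Z = Z + 0.5 * step**2 * grad_Z + step * rng.standard_normal(Z.shape)
        # (2) Learning back-propagation: gradient ascent on W given Z.
        _, grad_W1, grad_W2 = backward(Y, Z, W1, W2, sigma)
        W1 = W1 + lr * grad_W1 / n
        W2 = W2 + lr * grad_W2 / n
        errors.append(float(np.mean((Y - forward(Z, W1, W2)[1]) ** 2)))
    return Z, W1, W2, errors

if __name__ == "__main__":
    # Synthetic data from a hypothetical 2-factor non-linear model.
    rng = np.random.default_rng(1)
    Z_true = rng.standard_normal((200, 2))
    Y = 0.5 * np.tanh(Z_true @ rng.standard_normal((2, 8))) @ rng.standard_normal((8, 10))
    Z, W1, W2, errors = alternating_backprop(Y)
    print("reconstruction error: first %.3f, last %.3f" % (errors[0], errors[-1]))
```

Dropping the noise term `step * rng.standard_normal(Z.shape)` turns the Langevin update into plain gradient ascent on the log posterior of Z, which corresponds to the gradient descent alternative mentioned in the abstract.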
The paper can be downloaded here.
The code for the qualitative experiment can be downloaded here.
The code for the learning-from-incomplete-data experiment can be downloaded here.
The code for the experiment on learning from compressed (indirect) data can be downloaded here.
The code for the comparison-with-PCA experiment can be downloaded here.
For more details, please refer to Readme.txt in each zip file.
Figure 1. Generating texture patterns. For each category, the first image displays one of the training images, and the second displays one of the generated images. Please click on the category names or images for more details.
Figure 2. Generating sound patterns. For each category, the first sound track is the training data, and the second one is the generated sound. Please click on the category names for more details.
Figure 3. Generating object patterns. For each category, the first image is one of the training images, and the second is a generated image. Please click on the category names or images for more details.
Figure 4. Generating dynamic texture patterns. For each category, the first animation is the training sequence, the middle animation is generated by a linear dynamic system, and the right animation is generated by our method.
Figure 5. Learning from incomplete data. The 10 columns correspond to the following experiments: salt-and-pepper occlusion of 50%, 70%, 90%, 90%, 90%, 90%, and 90%, and single-region occlusion of 20 × 20, 30 × 30, and 30 × 30, respectively. Row 1: original images, not observed in learning. Row 2: training images. Row 3: recovered images during learning. Please click on the image for more details.
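One way to handle occluded training images in this setting is to compute the reconstruction residual over observed pixels only, so that occluded pixels contribute nothing to either the inference or the learning gradient. The sketch below illustrates that idea under stated assumptions; the helper names, the boolean-mask convention (True means observed), and the error measure are hypothetical and are not taken from the released code.

```python
import numpy as np

def salt_and_pepper_mask(shape, occlusion, rng):
    """Boolean mask: True = observed pixel, False = occluded pixel.
    Each pixel is occluded independently with probability `occlusion`."""
    return rng.random(shape) >= occlusion

def masked_residual(Y, X, mask, sigma=1.0):
    """Residual (Y - X) / sigma^2 on observed pixels, zero elsewhere.
    Feeding this into back-propagation ignores the occluded pixels."""
    return np.where(mask, (Y - X) / sigma**2, 0.0)

def recovery_error(Y_true, X, mask):
    """Mean absolute error of the recovered image X, measured on the
    occluded pixels only (Y_true is never used during learning)."""
    return float(np.abs(Y_true - X)[~mask].mean())
```

For example, `salt_and_pepper_mask((64, 64, 3), 0.9, np.random.default_rng(0))` would hide roughly 90% of the pixels of a 64 × 64 × 3 image.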
Figure 6. Learning from indirect data. The first row displays the original 64 × 64 × 3 images, which are projected onto 1,000 white noise images. The second row displays the recovered images during learning. Please click on the image for more details.
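Projecting an image onto white noise images amounts to multiplying the flattened image by a random Gaussian matrix. A minimal sketch follows, assuming the learner sees only the projections `S` and the projection matrix `P`, and that the reconstruction error is measured in the projected space. The function names are hypothetical, and the demo uses small dimensions rather than the 1,000 projections of 64 × 64 × 3 images to keep memory use low.

```python
import numpy as np

def project(Y, P):
    """Indirect observations S = Y P^T; rows of Y are flattened images,
    rows of P are flattened white noise images."""
    return Y @ P.T

def projected_log_lik(S, X, P, sigma=1.0):
    """-||S - X P^T||^2 / (2 sigma^2) for generated images X."""
    return -0.5 * np.sum((S - X @ P.T) ** 2) / sigma**2

def projected_residual(S, X, P, sigma=1.0):
    """Gradient of projected_log_lik with respect to X. It can replace
    the usual residual (Y - X) / sigma^2 at the top of back-propagation."""
    return (S - X @ P.T) @ P / sigma**2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, D, m = 4, 48, 10                     # small sizes for illustration
    Y = rng.standard_normal((n, D))         # stand-in for flattened images
    P = rng.standard_normal((m, D))         # m white noise "images"
    S = project(Y, P)
    print(S.shape)                          # one m-vector per image
```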
Figure 7. Comparison between our method and PCA. Row 1: original testing images. Row 2: reconstructions by PCA eigenvectors learned from training images. Row 3: reconstructions by the generator learned from training images. d = 20 for both methods. Please click on the image for more details.
The code in our work is based on the MatConvNet MATLAB code of Vedaldi & Lenc (2015). We thank the authors for sharing their code with the community.
We thank Yifei (Jerry) Xu for his help with the experiments. We thank Jianwen Xie for helpful discussions.
The work is supported by NSF DMS 1310391, DARPA SIMPLEX N66001-15-C-4035, ONR MURI N00014-16-1-2007, and DARPA ARO W911NF-16-1-0579.