1 University of California, Los Angeles, USA
2 Hikvision Research Institute, Santa Clara, USA
This paper studies the cooperative training of two generative models for image modeling and synthesis. Both models are parametrized by convolutional neural networks (ConvNets). The first model is a deep energy-based model, whose energy function is defined by a bottom-up ConvNet, which maps the observed image to the energy. We call it the descriptor network. The second model is a generator network, which is a non-linear version of factor analysis. It is defined by a top-down ConvNet, which maps the latent factors to the observed image. The maximum likelihood learning algorithms of both models involve MCMC sampling such as Langevin dynamics. We observe that the two learning algorithms can be seamlessly interwoven into a cooperative learning algorithm that can train both models simultaneously. Specifically, within each iteration of the cooperative learning algorithm, the generator model generates initial synthesized examples to initialize a finite-step MCMC that samples and trains the energy-based descriptor model. After that, the generator model learns from how the MCMC changes its synthesized examples. That is, the descriptor model teaches the generator model by MCMC, so that the generator model accumulates the MCMC transitions and reproduces them by direct ancestral sampling. We call this scheme MCMC teaching. We show that the cooperative algorithm can learn highly realistic generative models.
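The revision step described above, in which generator samples initialize a finite-step MCMC under the descriptor's energy, can be sketched on a one-dimensional toy problem. This is only an illustrative sketch, not the released implementation: the quadratic energy `U(x; theta) = 0.5 * (x - theta)**2`, the linear stand-in for the generator, and the names `energy_grad`, `langevin_revise`, `theta`, `n_steps`, and `step_size` are all assumptions made for this example, whereas the paper uses ConvNets for both models.

```python
import numpy as np

def energy_grad(x, theta):
    """Gradient dU/dx of the toy energy U(x; theta) = 0.5 * (x - theta)**2."""
    return x - theta

def langevin_revise(x_init, theta, n_steps=30, step_size=0.3, rng=None):
    """Run a finite number of Langevin steps targeting p(x) proportional to exp(-U(x; theta))."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x_init, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Gradient step that lowers the energy, plus Gaussian noise.
        x = x - 0.5 * step_size**2 * energy_grad(x, theta) + step_size * noise
    return x

rng = np.random.default_rng(0)
theta = 3.0                       # toy descriptor parameter (hypothetical)
z = rng.standard_normal(5000)     # latent factors
x_hat = 1.0 * z + 0.0             # toy linear stand-in for the generator (hypothetical)
x_rev = langevin_revise(x_hat, theta, rng=rng)

# How the MCMC changed the generator's examples: the signal the generator learns from.
teaching_signal = x_rev - x_hat
print(x_hat.mean(), x_rev.mean())
```

Because the chain runs for only a finite number of steps, the revised samples move toward the low-energy region around `theta` without fully reaching the stationary distribution. In the cooperative scheme, the descriptor would then be updated by comparing observed and revised examples, and the generator would be updated to map the same `z` closer to `x_rev`.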
The code for Spatial-temporal CoopNets is available for download in Python with TensorFlow.
If you wish to use our code, please cite the following papers:
The TeX file is also available for download.
Contents
Exp 1: Experiment on generating texture patterns
Exp 2: Experiment on generating object patterns
Exp 3: Experiment on generating scene patterns
Exp 4: Experiment on generating handwritten digits
Exp 5: Experiment on large-scale benchmark datasets
Exp 6: Experiment on pattern completion
Exp 7: Experiment on generating dynamic textures
| Generator in CoopNets (ours) | 64.30 | 16.98 | 35.25 |
| Descriptor in CoopNets (ours) | 35.42 | 16.65 | 33.61 |
We thank Hansheng Jiang, Zilong Zheng, Erik Nijkamp, Tengyu Liu, Yaxuan Zhu, Zhaozhuo Xu and Xiaolin Fang for their assistance with coding and experiments. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research. The work is supported by Hikvision gift fund, NSF DMS 1310391, DARPA SIMPLEX N66001-15-C-4035, ONR MURI N00014-16-1-2007, and DARPA ARO W911NF-16-1-0579.