2016

Synthesizing Dynamic Textures and Sounds by Spatial-Temporal Generative ConvNet

Jianwen Xie    Song-Chun Zhu    Ying Nian Wu

University of California, Los Angeles (UCLA), USA

Paper

Abstract

Dynamic textures are spatial-temporal processes that exhibit statistical stationarity or stochastic repetitiveness in the temporal dimension. In this paper, we study the problem of modeling and synthesizing dynamic textures using a generative version of the convolutional neural network (ConvNet or CNN) that consists of multiple layers of spatial-temporal filters to capture the spatial-temporal patterns in the dynamic textures. We show that such a spatial-temporal generative ConvNet can synthesize realistic dynamic textures. We also apply the temporal generative ConvNet to one-dimensional sound data, and show that the model can synthesize realistic natural and man-made sounds.
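The abstract describes an energy-based generative ConvNet: spatial-temporal filters score a video volume, and synthesis samples from the resulting density. The following is a minimal NumPy sketch of that idea, not the authors' implementation; the single-layer architecture, filter sizes, and the finite-difference gradient in the Langevin step are illustrative assumptions (a real implementation would use a deep network and backpropagation).

```python
import numpy as np

def conv3d_valid(x, w):
    """'Valid' 3D convolution of a video volume x (T, H, W) with a
    spatial-temporal filter w (t, h, wd); returns the response map."""
    T, H, W = x.shape
    t, h, wd = w.shape
    out = np.zeros((T - t + 1, H - h + 1, W - wd + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                out[i, j, k] = np.sum(x[i:i + t, j:j + h, k:k + wd] * w)
    return out

def score(x, filters):
    """Scoring function f(x): sum of rectified (ReLU) spatial-temporal
    filter responses. The model density is p(x) proportional to
    exp(f(x)) times a Gaussian reference distribution."""
    return sum(np.maximum(conv3d_valid(x, w), 0.0).sum() for w in filters)

def langevin_step(x, filters, rng, step=1e-3, eps=1e-4):
    """One Langevin-dynamics synthesis update. The gradient of f is
    taken by finite differences here purely for illustration."""
    g = np.zeros_like(x)
    base = score(x, filters)
    for idx in np.ndindex(x.shape):
        xp = x.copy()
        xp[idx] += eps
        g[idx] = (score(xp, filters) - base) / eps
    # grad log p(x) = grad f(x) - x, for a unit-variance Gaussian reference
    return x + 0.5 * step * (g - x) + np.sqrt(step) * rng.standard_normal(x.shape)
```

Synthesis would iterate `langevin_step` from white noise; sound synthesis is the same construction with one-dimensional (temporal) filters in place of the 3D ones.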



Synthesizing Dynamic Textures

(In each example, the first one (or two) is the observed video sequence; the others are the synthesized sequences.)


Synthesizing Sounds

(In each example, the first is the observed sound, the other two are the synthesized sounds.)


Canary

Chicken Coop

cougar_snarl_growl

hard_sheep

Forest

jazz

Running stream

magkas

unpoco