Synthesizing Dynamic Textures and Sounds by Spatial-Temporal Generative ConvNet
University of California, Los Angeles (UCLA), USA
Dynamic textures are spatial-temporal processes that exhibit statistical stationarity or stochastic repetitiveness in the temporal dimension. In this paper, we study the problem of modeling and synthesizing dynamic textures using a generative version of the convolutional neural network (ConvNet or CNN) that consists of multiple layers of spatial-temporal filters to capture the spatial-temporal patterns in the dynamic textures. We show that such a spatial-temporal generative ConvNet can synthesize realistic dynamic textures. We also apply the temporal generative ConvNet to one-dimensional sound data, and show that the model can synthesize realistic natural and man-made sounds.
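To make the notion of a spatial-temporal filter concrete, here is a minimal sketch of what a single filter of this kind might compute on a short grayscale video. It is an illustration only, not the architecture used in the paper: the function name `spatiotemporal_filter`, the video and kernel sizes, the random data, and the choice of a ReLU non-linearity are all assumptions made for the example.

```python
import numpy as np


def spatiotemporal_filter(video, kernel):
    """Valid 3D cross-correlation of a (T, H, W) video with a (t, h, w)
    kernel, followed by a ReLU. Hypothetical helper for illustration."""
    # All (t, h, w) windows of the video; the result has shape
    # (T-t+1, H-h+1, W-w+1, t, h, w).
    windows = np.lib.stride_tricks.sliding_window_view(video, kernel.shape)
    # Inner product of every window with the kernel.
    response = np.einsum("abcijk,ijk->abc", windows, kernel)
    # Rectification (an assumed choice of non-linearity).
    return np.maximum(response, 0.0)


# Assumed toy data: 8 frames of 16x16 pixels, one 3x5x5 filter.
rng = np.random.default_rng(0)
video = rng.standard_normal((8, 16, 16))
kernel = rng.standard_normal((3, 5, 5))

feature_map = spatiotemporal_filter(video, kernel)
print(feature_map.shape)  # (6, 12, 12)
```

Because the kernel spans several frames as well as a spatial patch, each response depends on how a local pattern changes over time, not only on its appearance in a single frame. For one-dimensional sound data, the analogous filter would slide along the time axis only.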
Synthesizing Dynamic Textures
(In each example, the first one (or two) is the observed dynamic texture, the other is the synthesized dynamic texture.)
Synthesizing Sounds
(In each example, the first is the observed sound, the other two are the synthesized sounds.)