Tian Han*, Yang Lu*, Song-Chun Zhu, and Ying Nian Wu
* Equal contributions
University of California, Los Angeles (UCLA), USA
The training image is 224 × 224 pixels, and the synthesized images are 448 × 448 pixels.