Learning Descriptor Networks for 3D Shape Synthesis and Analysis

Jianwen Xie 1*, Zilong Zheng 2*, Ruiqi Gao 2, Wenguan Wang 2,3, Song-Chun Zhu 2, and Ying Nian Wu 2

(* Equal contributions)
1 Hikvision Research Institute, Santa Clara, USA
2 University of California, Los Angeles (UCLA), USA
3 Beijing Institute of Technology, China


This paper proposes a 3D shape descriptor network, which is a deep convolutional energy-based model, for modeling volumetric shape patterns. The maximum likelihood training of the model follows an “analysis by synthesis” scheme and can be interpreted as a mode seeking and mode shifting process. The model can synthesize 3D shape patterns by sampling from the probability distribution via MCMC such as Langevin dynamics. The model can be used to train a 3D generator network via MCMC teaching. The conditional version of the 3D shape descriptor net can be used for 3D object recovery and 3D object super-resolution. Experiments demonstrate that the proposed model can generate realistic 3D shape patterns and can be useful for 3D shape analysis.


The paper can be downloaded here.

The tex file can be downloaded here.

The poster can be downloaded here.


The oral presentation can be downloaded here.

Code and Data

The Python code, implemented in TensorFlow, is coming soon.

If you wish to use our code, please cite the following paper: 

Learning Descriptor Networks for 3D Shape Synthesis and Analysis
Jianwen Xie*, Zilong Zheng*, Ruiqi Gao, Wenguan Wang, Song-Chun Zhu, Ying Nian Wu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018 (Oral)



Experiment 1: Generating 3D Objects

We conduct experiments on synthesizing 3D objects from categories of the ModelNet dataset.

Figure 1. Generating 3D objects. For each category, the first three columns show observed examples, and columns 4–9 show six synthesized 3D objects sampled from the learned model by Langevin dynamics. For the last four synthesized objects (columns 6–9), the nearest neighbors retrieved from the training set are shown in columns 10–13.
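The sampler behind these syntheses is Langevin dynamics, which descends the gradient of the learned energy while injecting Gaussian noise. Below is a minimal NumPy sketch; in the actual model the energy is parametrized by a ConvNet, so the `energy_grad` callable here (a toy quadratic energy in the example) is only a stand-in placeholder.

```python
import numpy as np

def langevin_sample(energy_grad, x0, n_steps=64, step_size=0.1, rng=None):
    """Langevin dynamics: x <- x - (step^2 / 2) * grad E(x) + step * noise."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.astype(float).copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size ** 2 * energy_grad(x) + step_size * noise
    return x
```

For the toy quadratic energy E(x) = ||x||^2 / 2 (so grad E(x) = x), the chain converges to a standard normal distribution, which makes the sketch easy to sanity-check.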

Experiment 2: 3D Object Recovery

We test the conditional 3D DescriptorNet on the 3D object recovery task.

(a) chair
(b) nightstand
(c) toilet
(d) sofa
Figure 2. 3D object recovery by sampling from the conditional 3D DescriptorNet models. For each category, the first row displays the original 3D objects, the second row shows the corrupted 3D objects, and the third row displays the 3D objects recovered by running Langevin dynamics starting from the corrupted objects. (a) chair, (b) nightstand, (c) toilet, (d) sofa.
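Recovery runs the same Langevin dynamics, but initialized at the corrupted object and with the observed voxels clamped at every step. A hedged NumPy sketch; the binary `mask` and the stand-in `energy_grad` are illustrative assumptions, whereas the actual model conditions a ConvNet energy on the observed part.

```python
import numpy as np

def recover(energy_grad, corrupted, mask, n_steps=100, step_size=0.1, rng=None):
    """Recover corrupted voxels with Langevin dynamics.
    mask == 1 marks observed voxels (kept fixed); mask == 0 marks the
    corrupted region, which is resampled starting from its corrupted values."""
    rng = np.random.default_rng() if rng is None else rng
    x = corrupted.astype(float).copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        proposal = x - 0.5 * step_size ** 2 * energy_grad(x) + step_size * noise
        x = mask * corrupted + (1.0 - mask) * proposal  # clamp observed voxels
    return x
```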

Experiment 3: 3D Object Super-Resolution

We test the conditional 3D DescriptorNet on the 3D object super-resolution task.

Figure 3. 3D object super-resolution by conditional 3D DescriptorNet. The first row displays original 3D objects (64 × 64 × 64 voxels). The second row shows the corresponding low-resolution 3D objects (16 × 16 × 16 voxels). The last row displays the corresponding super-resolution results, obtained by sampling from the conditional 3D DescriptorNet with 10 steps of Langevin dynamics initialized with the objects shown in the second row.
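Super-resolution can be sketched the same way: upsample the low-resolution grid to the target size, then refine it with a few Langevin steps under the conditional energy. The nearest-neighbor upsampling and the `energy_grad` placeholder below are illustrative assumptions, not the released implementation.

```python
import numpy as np

def upsample_nn(low, factor=4):
    """Nearest-neighbor upsampling of a voxel grid (e.g. 16^3 -> 64^3)."""
    return low.repeat(factor, axis=0).repeat(factor, axis=1).repeat(factor, axis=2)

def super_resolve(energy_grad, low, factor=4, n_steps=10, step_size=0.1, rng=None):
    """Initialize with the upsampled low-resolution object, then refine it
    with a few Langevin steps under the (conditional) energy."""
    rng = np.random.default_rng() if rng is None else rng
    x = upsample_nn(low, factor).astype(float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size ** 2 * energy_grad(x) + step_size * noise
    return x
```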

Experiment 4: Cooperative Training of a 3D Generator

We evaluate a 3D generator trained by a 3D DescriptorNet via MCMC teaching. We show results of interpolating between two latent vectors and of shape arithmetic in the latent space.
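In MCMC teaching, the generator proposes initial samples, the descriptor revises them by Langevin dynamics, and the generator then learns to reproduce the revised samples from the same latent codes. The following toy sketch uses a linear generator and a quadratic stand-in energy purely for illustration; the actual models are deep networks.

```python
import numpy as np

def cooperative_step(gen_W, energy_grad, batch=32, n_langevin=15,
                     step_size=0.1, lr=0.01, rng=None):
    """One MCMC-teaching step with a toy linear generator x = z @ W.
    1. The generator proposes initial samples from latent codes z.
    2. The descriptor revises them by Langevin dynamics.
    3. The generator chases the revised samples (one gradient step on
       the reconstruction loss ||z @ W - x_revised||^2)."""
    rng = np.random.default_rng() if rng is None else rng
    latent_dim, data_dim = gen_W.shape
    z = rng.standard_normal((batch, latent_dim))
    x = z @ gen_W                                  # generator's proposal
    for _ in range(n_langevin):                    # descriptor's revision
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size ** 2 * energy_grad(x) + step_size * noise
    grad_W = z.T @ (z @ gen_W - x) / batch         # generator chases x
    return gen_W - lr * grad_W
```

With a quadratic stand-in energy pulling samples toward the origin, repeated cooperative steps shrink the generator's weights toward the energy's mode, which mirrors the mode-seeking behavior described above.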


Figure 4. Interpolation between latent vectors of the 3D objects on the two ends.

3D Object Arithmetic

Figure 5. 3D shape arithmetic in the latent space.
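The operations in Figures 4 and 5 act on the generator's latent codes: interpolation sweeps between two codes, and arithmetic adds and subtracts them, with each resulting code decoded into a shape by the learned generator. A small sketch (the generator itself is omitted; the number of steps and the latent dimension are arbitrary choices):

```python
import numpy as np

def interpolate_latents(z1, z2, n=8):
    """Linear interpolation between two latent codes; each interpolated
    code would be decoded by the learned generator into a 3D shape."""
    alphas = np.linspace(0.0, 1.0, n)
    return np.stack([(1.0 - a) * z1 + a * z2 for a in alphas])

def shape_arithmetic(z_a, z_b, z_c):
    """Latent arithmetic z_a - z_b + z_c; the result is decoded by the
    generator (e.g. removing one attribute and adding another)."""
    return z_a - z_b + z_c
```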

Experiment 5: 3D Object Classification

We evaluate the feature maps learned by our 3D DescriptorNet in a classification experiment on the ModelNet10 dataset. We first train a single model on all categories of the training set in an unsupervised manner, and then train a multinomial logistic regression classifier on the extracted feature vectors using the labeled data.
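The classifier on top of the fixed features is standard multinomial logistic regression. A self-contained NumPy sketch trained by full-batch gradient descent (the feature extractor itself is the learned 3D DescriptorNet and is not reproduced here):

```python
import numpy as np

def train_softmax(features, labels, n_classes, lr=0.1, n_iters=500):
    """Multinomial logistic regression on fixed feature vectors,
    trained by full-batch gradient descent on the cross-entropy loss."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(n_iters):
        logits = features @ W + b
        logits = logits - logits.max(axis=1, keepdims=True)  # stability
        p = np.exp(logits)
        p = p / p.sum(axis=1, keepdims=True)                 # softmax
        grad = p - onehot                  # gradient of cross-entropy
        W = W - lr * features.T @ grad / n
        b = b - lr * grad.mean(axis=0)
    return W, b

def predict(W, b, features):
    return (features @ W + b).argmax(axis=1)
```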

Table 1. 3D object classification on the ModelNet10 dataset

Method                     Classification accuracy
Geometry Image             88.4%
ECC                        90.0%
3D ShapeNets               83.5%
DeepPano                   85.5%
3D DescriptorNet (ours)    92.4%

Related Works

This project builds upon the following ideas, which we encourage you to check out.

[1] Jianwen Xie*, Yang Lu*, Song-Chun Zhu, Ying Nian Wu. "A Theory of Generative ConvNet." ICML 2016.

[2] Tian Han*, Yang Lu*, Song-Chun Zhu, Ying Nian Wu. "Alternating Back-Propagation for Generator Network." AAAI 2017.

[3] Jianwen Xie, Song-Chun Zhu, Ying Nian Wu. "Synthesizing Dynamic Patterns by Spatial-Temporal Generative ConvNet." CVPR 2017.

[4] Jianwen Xie, Yang Lu, Ruiqi Gao, Ying Nian Wu. "Cooperative Learning of Energy-Based Model and Latent Variable Model via MCMC Teaching." AAAI 2018.