Learning Descriptor Networks for 3D Shape Synthesis and Analysis

Jianwen Xie 1*, Zilong Zheng 2*, Ruiqi Gao 2, Wenguan Wang 2,3, Song-Chun Zhu 2, and Ying Nian Wu 2

(* Equal contributions)
1 Hikvision Research Institute, Santa Clara, USA
2 University of California, Los Angeles (UCLA), USA
3 Beijing Institute of Technology, China


This paper proposes a 3D shape descriptor network, which is a deep convolutional energy-based model, for modeling volumetric shape patterns. The maximum likelihood training of the model follows an “analysis by synthesis” scheme and can be interpreted as a mode seeking and mode shifting process. The model can synthesize 3D shape patterns by sampling from the probability distribution via MCMC such as Langevin dynamics. The model can be used to train a 3D generator network via MCMC teaching. The conditional version of the 3D shape descriptor net can be used for 3D object recovery and 3D object super-resolution. Experiments demonstrate that the proposed model can generate realistic 3D shape patterns and can be useful for 3D shape analysis.


The paper can be downloaded here.

The tex file can be downloaded here.

The poster can be downloaded here.


The oral presentation can be downloaded here.

Code and Data

The Python code, implemented in TensorFlow, is coming soon.

If you wish to use our code, please cite the following paper: 

Learning Descriptor Networks for 3D Shape Synthesis and Analysis
Jianwen Xie*, Zilong Zheng*, Ruiqi Gao, Wenguan Wang, Song-Chun Zhu, Ying Nian Wu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018 (Oral)



Experiment 1: Generating 3D Objects

We conduct experiments on synthesizing 3D objects from categories of the ModelNet dataset.

Figure 1. Generating 3D objects. For each category, the first three columns show observed examples, and columns 4–9 show six synthesized 3D objects sampled from the learned model by Langevin dynamics. For the last four synthesized objects (columns 6–9), the nearest neighbors retrieved from the training set are shown in columns 10–13.
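The sampler behind these syntheses is Langevin dynamics, which descends the gradient of the learned energy while injecting Gaussian noise. Below is a minimal NumPy sketch; in the actual model the energy is parametrized by a ConvNet, so the `energy_grad` callable here (a toy quadratic energy in the example) is only a stand-in placeholder.

```python
import numpy as np

def langevin_sample(energy_grad, x0, n_steps=64, step_size=0.1, rng=None):
    """Langevin dynamics: x <- x - (step^2 / 2) * grad E(x) + step * noise."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.astype(float).copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size ** 2 * energy_grad(x) + step_size * noise
    return x
```

For the toy quadratic energy E(x) = ||x||^2 / 2 (so grad E(x) = x), the chain converges to a standard normal distribution, which makes the sketch easy to sanity-check.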

Experiment 2: 3D Object Recovery

We test the conditional 3D DescriptorNet on the 3D object recovery task.

(a) chair
(b) nightstand
(c) toilet
(d) sofa
Figure 2. 3D object recovery by sampling from the conditional 3D DescriptorNet models. For each category, the first row displays the original 3D objects, the second row shows the corrupted 3D objects, and the third row displays the 3D objects recovered by running Langevin dynamics starting from the corrupted objects. (a) chair, (b) nightstand, (c) toilet, (d) sofa.
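Recovery runs the same Langevin dynamics, but initialized at the corrupted object and with the observed voxels clamped at every step. A hedged NumPy sketch; the binary `mask` and the stand-in `energy_grad` are illustrative assumptions, whereas the actual model conditions a ConvNet energy on the observed part.

```python
import numpy as np

def recover(energy_grad, corrupted, mask, n_steps=100, step_size=0.1, rng=None):
    """Recover corrupted voxels with Langevin dynamics.
    mask == 1 marks observed voxels (kept fixed); mask == 0 marks the
    corrupted region, which is resampled starting from its corrupted values."""
    rng = np.random.default_rng() if rng is None else rng
    x = corrupted.astype(float).copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        proposal = x - 0.5 * step_size ** 2 * energy_grad(x) + step_size * noise
        x = mask * corrupted + (1.0 - mask) * proposal  # clamp observed voxels
    return x
```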

Experiment 3: 3D Object Super-Resolution

We test the conditional 3D DescriptorNet on the 3D object super-resolution task.

Figure 3. 3D object super-resolution by conditional 3D DescriptorNet. The first row displays original 3D objects (64 × 64 × 64 voxels). The second row shows the corresponding low-resolution 3D objects (16 × 16 × 16 voxels). The last row displays the corresponding super-resolution results, obtained by sampling from the conditional 3D DescriptorNet with 10 steps of Langevin dynamics initialized with the objects shown in the second row.
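Super-resolution can be sketched the same way: upsample the low-resolution grid to the target size, then refine it with a few Langevin steps under the conditional energy. The nearest-neighbor upsampling and the `energy_grad` placeholder below are illustrative assumptions, not the released implementation.

```python
import numpy as np

def upsample_nn(low, factor=4):
    """Nearest-neighbor upsampling of a voxel grid (e.g. 16^3 -> 64^3)."""
    return low.repeat(factor, axis=0).repeat(factor, axis=1).repeat(factor, axis=2)

def super_resolve(energy_grad, low, factor=4, n_steps=10, step_size=0.1, rng=None):
    """Initialize with the upsampled low-resolution object, then refine it
    with a few Langevin steps under the (conditional) energy."""
    rng = np.random.default_rng() if rng is None else rng
    x = upsample_nn(low, factor).astype(float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size ** 2 * energy_grad(x) + step_size * noise
    return x
```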

Experiment 4: Cooperative Training of a 3D Generator

We evaluate a 3D generator trained by a 3D DescriptorNet via MCMC teaching. We show results of interpolating between two latent vectors and of shape arithmetic in the latent space.
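In MCMC teaching, the generator proposes initial samples, the descriptor revises them by Langevin dynamics, and the generator then learns to reproduce the revised samples from the same latent codes. The following toy sketch uses a linear generator and a quadratic stand-in energy purely for illustration; the actual models are deep networks.

```python
import numpy as np

def cooperative_step(gen_W, energy_grad, batch=32, n_langevin=15,
                     step_size=0.1, lr=0.01, rng=None):
    """One MCMC-teaching step with a toy linear generator x = z @ W.
    1. The generator proposes initial samples from latent codes z.
    2. The descriptor revises them by Langevin dynamics.
    3. The generator chases the revised samples (one gradient step on
       the reconstruction loss ||z @ W - x_revised||^2)."""
    rng = np.random.default_rng() if rng is None else rng
    latent_dim, data_dim = gen_W.shape
    z = rng.standard_normal((batch, latent_dim))
    x = z @ gen_W                                  # generator's proposal
    for _ in range(n_langevin):                    # descriptor's revision
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size ** 2 * energy_grad(x) + step_size * noise
    grad_W = z.T @ (z @ gen_W - x) / batch         # generator chases x
    return gen_W - lr * grad_W
```

With a quadratic stand-in energy pulling samples toward the origin, repeated cooperative steps shrink the generator's weights toward the energy's mode, which mirrors the mode-seeking behavior described above.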


Figure 4. Interpolation between latent vectors of the 3D objects on the two ends.

3D Object Arithmetic

Figure 5. 3D shape arithmetic in the latent space.
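The operations in Figures 4 and 5 act on the generator's latent codes: interpolation sweeps between two codes, and arithmetic adds and subtracts them, with each resulting code decoded into a shape by the learned generator. A small sketch (the generator itself is omitted; the number of steps and the latent dimension are arbitrary choices):

```python
import numpy as np

def interpolate_latents(z1, z2, n=8):
    """Linear interpolation between two latent codes; each interpolated
    code would be decoded by the learned generator into a 3D shape."""
    alphas = np.linspace(0.0, 1.0, n)
    return np.stack([(1.0 - a) * z1 + a * z2 for a in alphas])

def shape_arithmetic(z_a, z_b, z_c):
    """Latent arithmetic z_a - z_b + z_c; the result is decoded by the
    generator (e.g. removing one attribute and adding another)."""
    return z_a - z_b + z_c
```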

Experiment 5: 3D Object Classification

We evaluate the feature maps learned by our 3D DescriptorNet in a classification experiment on the ModelNet10 dataset. We first train a single model on all categories of the training set in an unsupervised manner, and then train a multinomial logistic regression classifier on the extracted feature vectors using the labeled data.
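The classifier on top of the fixed features is standard multinomial logistic regression. A self-contained NumPy sketch trained by full-batch gradient descent (the feature extractor itself is the learned 3D DescriptorNet and is not reproduced here):

```python
import numpy as np

def train_softmax(features, labels, n_classes, lr=0.1, n_iters=500):
    """Multinomial logistic regression on fixed feature vectors,
    trained by full-batch gradient descent on the cross-entropy loss."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(n_iters):
        logits = features @ W + b
        logits = logits - logits.max(axis=1, keepdims=True)  # stability
        p = np.exp(logits)
        p = p / p.sum(axis=1, keepdims=True)                 # softmax
        grad = p - onehot                  # gradient of cross-entropy
        W = W - lr * features.T @ grad / n
        b = b - lr * grad.mean(axis=0)
    return W, b

def predict(W, b, features):
    return (features @ W + b).argmax(axis=1)
```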

Table 1. 3D object classification on the ModelNet10 dataset

Method                     Classification accuracy
Geometry Image             88.4%
ECC                        90.0%
3D ShapeNets               83.5%
DeepPano                   85.5%
3D DescriptorNet (ours)    92.4%

Related Works

This project builds upon the following ideas, which we encourage you to check out.

[1] Jianwen Xie*, Yang Lu*, Song-Chun Zhu, Ying Nian Wu. "A Theory of Generative ConvNet." ICML 2016.

[2] Tian Han*, Yang Lu*, Song-Chun Zhu, Ying Nian Wu. "Alternating Back-Propagation for Generator Network." AAAI 2017.

[3] Jianwen Xie, Song-Chun Zhu, Ying Nian Wu. "Synthesizing Dynamic Patterns by Spatial-Temporal Generative ConvNet." CVPR 2017.

[4] Jianwen Xie, Yang Lu, Ruiqi Gao, Ying Nian Wu. "Cooperative Learning of Energy-Based Model and Latent Variable Model via MCMC Teaching." AAAI 2018.