
Statistical learning:
We develop statistical learning methods and theory in the context of big data. In particular, we are interested in structure learning of Bayesian networks from large-scale and high-dimensional data. Bayesian networks are a popular class of graphical models, widely used for modeling conditional independence structures in a joint distribution and causal relations among a set of variables. The structure of a Bayesian network is represented by a directed acyclic graph (DAG). We have developed penalized likelihood methods and software packages for structure learning of DAGs from both experimental and observational data. We have also introduced data-driven concave regularization into unsupervised learning.

High-dimensional inference:
We are interested in uncertainty quantification for regularized sparse estimators. We have developed the technique of estimator augmentation to characterize the sampling distribution of a lasso-type estimator, which allows us to integrate Markov chain Monte Carlo and importance sampling into statistical inference for high-dimensional models.
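
To illustrate why the sampling distribution of a lasso-type estimator needs special treatment, the sketch below draws from it by brute-force simulation in the simplest setting. This is not estimator augmentation: it assumes a single coefficient under an orthonormal design, where the lasso solution reduces to soft-thresholding of a Gaussian observation. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def soft_threshold(z, lam):
    """Minimizer of 0.5 * (z - b)**2 + lam * abs(b) over b."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def simulate_lasso_draws(beta, lam, sigma, n_draws, seed=0):
    """Draw lasso estimates when the observation is z ~ N(beta, sigma^2)."""
    rng = random.Random(seed)
    return [soft_threshold(beta + rng.gauss(0.0, sigma), lam)
            for _ in range(n_draws)]


# Illustrative (assumed) values: true coefficient 0.5, penalty 1.0, unit noise.
draws = simulate_lasso_draws(beta=0.5, lam=1.0, sigma=1.0, n_draws=10000)

# A large share of the draws is exactly zero, so the sampling distribution
# mixes a point mass at zero with a continuous part and is not Gaussian.
frac_zero = sum(d == 0.0 for d in draws) / len(draws)
print(f"fraction of exact zeros: {frac_zero:.3f}")
```

In this toy case the estimator is available in closed form, so direct simulation suffices. In high-dimensional models with unknown parameters that shortcut is not available, which is the kind of setting the paragraph above addresses.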

Monte Carlo methods:
We develop Monte Carlo methods to estimate statistical and topological structures
of a probability distribution, with applications in Bayesian inference and statistical physics.
We are interested in exploring and reconstructing energy landscapes
by estimating the density of states, the tree of sublevel sets, and the domain of attraction.
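
As a minimal sketch of what estimating a density of states involves, the code below applies the Wang-Landau algorithm, a standard Monte Carlo method for this task, to a toy system. It is offered only as an illustration and is not necessarily the method used in our work. The system (independent binary spins whose energy is the number of "up" spins), the function name, and all tuning values are assumptions; the toy system is chosen because its exact density of states is the binomial coefficient, which makes the estimate easy to check.

```python
import math
import random


def wang_landau(n_spins=8, f_final=1e-4, flatness=0.8,
                check_every=1000, max_steps=2_000_000, seed=0):
    """Estimate log g(E) for E = number of up spins among n_spins spins."""
    rng = random.Random(seed)
    spins = [0] * n_spins
    energy = 0
    log_g = [0.0] * (n_spins + 1)   # running estimate of log density of states
    hist = [0] * (n_spins + 1)      # visit counts within the current stage
    f = 1.0                         # log modification factor
    for step in range(1, max_steps + 1):
        # Propose flipping one uniformly chosen spin (a symmetric proposal).
        i = rng.randrange(n_spins)
        new_energy = energy + (1 - 2 * spins[i])
        # Accept with probability min(1, g(E_old) / g(E_new)).
        delta = log_g[energy] - log_g[new_energy]
        if delta >= 0 or rng.random() < math.exp(delta):
            spins[i] = 1 - spins[i]
            energy = new_energy
        # Penalize the current energy level and record the visit.
        log_g[energy] += f
        hist[energy] += 1
        # When the histogram is roughly flat, refine f and start a new stage.
        if step % check_every == 0:
            mean = sum(hist) / len(hist)
            if min(hist) >= flatness * mean:
                f /= 2.0
                hist = [0] * len(hist)
                if f < f_final:
                    break
    # Normalize so that log g(0) = 0, since exactly one state has energy 0.
    return [v - log_g[0] for v in log_g]


estimate = wang_landau()
exact = [math.log(math.comb(8, k)) for k in range(9)]
for k, (est, ex) in enumerate(zip(estimate, exact)):
    print(f"E={k}: estimated log g = {est:.2f}, exact = {ex:.2f}")
```

The estimate is determined only up to an additive constant in log g, which is why it is anchored at the energy level with a known number of states.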

Bioinformatics:
We develop statistical methodologies
for efficient analysis of large-scale, high-throughput genomic data.
We employ model-based and sparse regularization methods to
draw statistical inferences from these data. Our goal is to understand
gene regulation and decode regulatory
circuits by integrating gene expression data, protein binding data,
chromatin interaction data, and DNA sequence data.
We have constructed gene regulatory networks and identified combinatorial binding patterns in mouse embryonic stem cells. We also pursue
biological applications in alternative splicing and complex diseases through collaborations with experimental groups.