
Causal inference and graphical models:
We develop methods and theory for statistical learning of graphical models, in particular, causal Bayesian networks, from large-scale and high-dimensional data. Bayesian networks are a class of popular graphical models widely used for modeling conditional independence structures in a joint distribution and causal relations among a set of variables. The structure of a Bayesian network is represented by a directed acyclic graph (DAG). We have developed penalized likelihood methods, divide-and-conquer strategies, and associated software packages for structure learning of causal DAGs and causal inference on both experimental and observational data.

High-dimensional statistics:
We are interested in uncertainty quantification for regularized sparse estimators with applications to statistical inference for high-dimensional models. We have developed the technique of estimator augmentation to characterize the sampling distribution of a lasso-type estimator, which allows us to integrate Markov chain Monte Carlo and importance sampling into statistical inference for high-dimensional models. Recently, we developed a projection and shrinkage method for constructing confidence sets in high-dimensional regression that combines inferential advantages of sparse regularization and Stein estimation.

Monte Carlo methods:
We develop Monte Carlo methods to estimate statistical and topological structures
of a probability distribution, with applications in Bayesian inference and statistical physics.
We are interested in exploring and reconstructing energy landscapes
by estimating the density of states, the tree of sublevel sets, and the domain of attraction.

Bioinformatics:
We develop statistical methodologies
for efficient analysis of large-scale, high-throughput genomic data.
We employ model-based and sparse regularization methods to
make statistical inference on these data. Our goal is to understand
gene regulation and decode regulatory
circuits by integrating gene expression data, protein binding data,
chromatin interaction data, and DNA sequence data.
We have constructed gene regulatory networks and identified combinatorial binding patterns in mouse embryonic stem cells. We also have
biological applications in alternative splicing and complex diseases through collaborations with experimental groups.