High-Dimensional Statistical Inference in the Brain

Talk and Poster Abstracts

Jonathan Pillow: Local and low-rank priors for efficient high-dimensional receptive field estimation
An important problem in systems neuroscience is to characterize how a
neuron integrates a sensory stimulus across space and time.
Mathematically, this is formalized as the problem of estimating a neuron's
linear receptive field (RF) under a probabilistic encoding model. In
typical experiments, RFs are high-dimensional due to the large number of
spatiotemporal stimulus elements within the neuron's integration window.
RF estimation therefore poses a variety of difficult computational and
statistical challenges.

In this talk, I will discuss several recent advances on the problem of
estimating high-dimensional neural RFs. First, I will describe a
hierarchical prior that favors "local" structure in space-time and
frequency, yielding RF estimates that are smooth and local (i.e., have
non-zero coefficients only within some restricted region in space-time
and Fourier space). Second, I will discuss reduced-rank RF inference,
which drastically reduces dimensionality by parametrizing RFs with a
small number of temporal and spatial filters. Our method relies on a
novel prior defined by the restriction of a matrix normal distribution
to the space of low-rank matrices. Finally, I will describe exact and
approximate Bayesian inference methods under a hierarchical model that
combines local and low-rank priors. In applications to neural data, the
resulting method achieves error rates several times lower than standard
estimators while maintaining low computational cost, which allows for
tractable scaling to high-dimensional settings.
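
As a rough illustration of why a reduced-rank parametrization shrinks the problem, the sketch below builds a space-time RF as a sum of outer products of temporal and spatial filters and compares parameter counts. The sizes, the rank, and the filter shapes are hypothetical choices made only for this example; it does not implement the prior or the inference methods described above.

```python
import math

# Hypothetical sizes: 30 time lags, 400 spatial elements, rank-2 RF.
n_time, n_space, rank = 30, 400, 2

def temporal_filter(i):
    # Made-up damped oscillation over time lags.
    return [math.exp(-t / 8.0) * math.cos((i + 1) * t / 4.0) for t in range(n_time)]

def spatial_filter(i):
    # Made-up Gaussian bump, centred at a different location per component.
    centre = 150 + 100 * i
    return [math.exp(-((x - centre) ** 2) / (2 * 20.0 ** 2)) for x in range(n_space)]

temporal = [temporal_filter(i) for i in range(rank)]
spatial = [spatial_filter(i) for i in range(rank)]

# Full RF as a sum of outer products:
# K[t][x] = sum_i temporal[i][t] * spatial[i][x]
K = [[sum(temporal[i][t] * spatial[i][x] for i in range(rank))
      for x in range(n_space)]
     for t in range(n_time)]

full_params = n_time * n_space               # one coefficient per space-time element
low_rank_params = rank * (n_time + n_space)  # only the filters are free parameters
print(full_params, low_rank_params)
```

With these assumed sizes the full RF has 12000 coefficients, while the rank-2 form is described by 860 numbers.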

Wolfgang Maass: Distributions of high-dimensional network states as knowledge base for networks of spiking neurons in the brain
Even biologically quite realistic complex nonlinear models for
cortical networks of neurons can be proven to have (under some mild
conditions) a unique stationary distribution of network states, to which
their stochastic dynamics converges exponentially fast from any initial
state. This holds for any stationary external input x, and also if x is
generated by an environment that can be approximated by a Markov model
with a stationary distribution. In the case of periodic external input,
such stationary distributions exist for every phase of this underlying
oscillation. It will also be shown that information can be extracted
from this probabilistic knowledge base of networks of spiking neurons
with noise in real-time through probabilistic inference via MCMC
sampling (called "neural sampling" in this context). These results
provide a theoretical foundation for the analysis of distributions of
high-dimensional network states in experimental data from neuroscience,
and also a model for probabilistic inference and stochastic computation
in such networks. Many of the results that are discussed in this talk
have very recently been published in
S. Habenschuss, Z. Jonke, W. Maass: Stochastic computations in cortical
microcircuit models. PLoS Computational Biology, Nov. 14, 2013.
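
The convergence property can be illustrated on a much simpler object than a cortical network model. The sketch below is a toy three-state Markov chain with a made-up transition matrix, not a spiking-network model. Two different initial distributions are propagated through the chain, and their total variation distance is recorded at each step.

```python
# Toy finite-state Markov chain; the transition matrix is a hypothetical
# example with all entries positive, so the chain is ergodic.
P = [
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
]

def step(p, P):
    """Advance a distribution one step: p_next[j] = sum_i p[i] * P[i][j]."""
    n = len(P)
    return [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]

def total_variation(p, q):
    """Total variation distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Two different initial states, each known with certainty.
p = [1.0, 0.0, 0.0]
q = [0.0, 0.0, 1.0]

distances = []
for _ in range(30):
    distances.append(total_variation(p, q))
    p, q = step(p, P), step(q, P)

print(distances[0], total_variation(p, q))
print(p)
```

For this particular matrix both runs approach the same distribution (6/11, 3/11, 2/11), and the distance between them shrinks by at least a constant factor per step, which is the finite-state analogue of exponentially fast convergence from any initial state.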

Ryan Adams: Implementing Graphical Models with Chemical Reaction Networks
Recent work on molecular programming has explored new possibilities
for computational abstractions with biomolecules, including logic
gates, neural networks, and linear systems. In the future, such
abstractions might enable nanoscale devices that can sense and control
the world at a molecular scale. Just as in macroscale robotics, it is
critical that such devices can learn about their environment and
reason under uncertainty. At this small scale, systems are typically
modeled as chemical reaction networks. I will describe a procedure by
which arbitrary probabilistic graphical models, represented as factor
graphs over discrete random variables, can be compiled into chemical
reaction networks that implement inference. I will show how
marginalization based on sum-product message passing can be
implemented in terms of reactions between chemical species whose
concentrations represent probabilities. The steady-state
concentrations of these species correspond to the marginal
distributions of the random variables in the graph. As with standard
sum-product inference, this procedure yields exact results for
tree-structured graphs, and approximate solutions for loopy graphs.

This is joint work with Nils Napp.
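
For readers unfamiliar with sum-product message passing, the sketch below computes a marginal on a small tree-structured factor graph in ordinary software and checks it against brute-force enumeration. The three-variable chain and all factor values are hypothetical, and the code says nothing about how the computation would be compiled into chemical reactions.

```python
from itertools import product

# Hypothetical factor graph: a chain A - B - C of binary variables.
unary = {
    "A": [0.7, 0.3],
    "B": [0.5, 0.5],
    "C": [0.2, 0.8],
}
# Pairwise factors, indexed as f[x_left][x_right].
f_ab = [[0.9, 0.1], [0.2, 0.8]]
f_bc = [[0.6, 0.4], [0.3, 0.7]]

def normalize(v):
    total = sum(v)
    return [x / total for x in v]

# Messages into B from each neighbouring factor: sum out the other variable.
msg_ab_to_b = [sum(unary["A"][a] * f_ab[a][b] for a in range(2)) for b in range(2)]
msg_bc_to_b = [sum(unary["C"][c] * f_bc[b][c] for c in range(2)) for b in range(2)]

# Belief at B: product of its unary factor and the incoming messages, normalised.
marginal_b = normalize(
    [unary["B"][b] * msg_ab_to_b[b] * msg_bc_to_b[b] for b in range(2)]
)

# Brute-force check: enumerate the full joint and sum out A and C.
joint_b = [0.0, 0.0]
for a, b, c in product(range(2), repeat=3):
    joint_b[b] += (unary["A"][a] * unary["B"][b] * unary["C"][c]
                   * f_ab[a][b] * f_bc[b][c])
brute_b = normalize(joint_b)

print(marginal_b, brute_b)
```

Because the graph is a tree, the message-passing result agrees with the enumerated marginal up to floating-point rounding.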

Mitya Chklovskii: How the brain handles big data:
Online algorithms in neurons
A neuron is a basic physiological and computational unit of the brain.
While much is known about the physiological properties of a neuron, its
computational role is poorly understood. Here we propose to view a
neuron as a signal processing device that represents the incoming
streaming data matrix as a sparse vector of synaptic weights scaled by
an outgoing sparse activity vector. Formally, a neuron minimizes a cost
function comprising a cumulative squared representation error and
regularization terms. We derive an online algorithm that minimizes this
cost function by alternating between minimization with respect to
activity and with respect to synaptic weights. The steps of this
algorithm reproduce well-known physiological properties of a neuron,
such as weighted summation and leaky integration of synaptic inputs, as
well as an Oja-like, but parameter-free, synaptic learning rule. Our
theoretical framework makes several predictions, some of which can be
verified by existing data, while others require further experiments. Such a
framework should allow modeling the function of neuronal circuits
without necessarily measuring all the microscopic biophysical
parameters, as well as facilitate the design of neuromorphic electronics.
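
To make the reference to an Oja-like rule concrete, the sketch below runs the classic Oja update on a streaming two-dimensional input. It is only a point of comparison: it uses an assumed fixed learning rate, whereas the rule described above is parameter-free, and it omits the sparsity and leaky-integration aspects. The input statistics are hypothetical.

```python
import math
import random

random.seed(0)
eta = 0.005  # assumed fixed learning rate (not part of the abstract's rule)

# Hypothetical input stream: std 3 along (1, 1)/sqrt(2), std 1 orthogonally.
u = (1 / math.sqrt(2), 1 / math.sqrt(2))
v = (-1 / math.sqrt(2), 1 / math.sqrt(2))

def sample():
    a, b = random.gauss(0, 3), random.gauss(0, 1)
    return (a * u[0] + b * v[0], a * u[1] + b * v[1])

w = [1.0, 0.0]  # initial synaptic weights
for _ in range(5000):
    x = sample()
    y = w[0] * x[0] + w[1] * x[1]  # weighted summation of synaptic inputs
    # Oja's rule: Hebbian term y*x plus a decay y^2*w that keeps |w| near 1.
    w = [w[i] + eta * y * (x[i] - y * w[i]) for i in range(2)]

norm = math.hypot(w[0], w[1])
alignment = abs(w[0] * u[0] + w[1] * u[1]) / norm
print(norm, alignment)
```

Processing one sample at a time, the weight vector settles near unit length and lines up with the direction of largest input variance, without the data stream ever being stored.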

T. Furlanello, M. Cristoforetti, C. Furlanello and G. Jurman:
Sparse predictive structure of deconvolved functional brain networks
The functional and structural representation of the brain as a complex network is marked by the fact that the comparison of noisy and intrinsically correlated high-dimensional structures between experimental conditions or groups is poorly served by typical mass-univariate methods. Furthermore, most network estimation methods cannot distinguish between real correlations and spurious ones arising from the convolution due to nodes' interactions, which introduces additional noise into the data. We propose a machine learning pipeline aimed at identifying multivariate differences between brain networks associated with different experimental conditions. The pipeline (1) leverages the deconvolved individual contribution of each edge and (2) maps the task into a sparse classification problem in order to construct the associated "sparse deconvolved predictive network", i.e., a graph with the same nodes as those compared but whose edge weights are defined by their relevance for out-of-sample predictions in classification. We present an application of the proposed method by decoding the covert attention direction (left or right) from the single-trial functional connectivity matrix extracted from high-frequency magnetoencephalography (MEG) data. Our results demonstrate how network deconvolution combined with sparse classification methods outperforms typical approaches for MEG decoding.

S. Zayd Enam, Michael R. DeWeese:
Spectro-Temporal Models of Inferior Colliculus Neuron
Receptive Fields
Sparse codes for speech spectrograms qualitatively match properties of receptive fields
of Inferior Colliculus (ICC) neurons.
We find that sparse codes of speech spectrograms are well described by one of four
models, and that these models also fit ICC spectro-temporal receptive fields (STRFs) well. Further,
our models are able to express time-frequency inseparable receptive fields (e.g. frequency sweeps) that
previous models were unable to satisfactorily describe. Our models allow the accurate characterization
of high-dimensional STRFs with more natural parameterizations of the neuron's behavior.

Mengchen Zhu, Ian Stevenson, Urs Köster, Charles M. Gray,
Bruno A. Olshausen, and Christopher Rozell:
Modeling single-trial V1 population response to
dynamic natural scenes
Single-trial correlations between neurons in the primary visual cortex
(V1) may arise from two sources: similar tuning properties (correlation
between receptive fields) and shared inputs from internal processes
(correlated "noise"). It is still unclear how each component
contributes individually to single trial population response encoding
dynamic natural scenes. To investigate, we analyzed population response
to natural movies and found that: (1) On a short time scale, the
receptive field overlap does not predict well the distribution of
response correlation on its own, but could partially explain the
distribution when coupled with an active decorrelation mechanism (sparse
coding); (2) Ongoing activity contributes to the response correlation
on a longer time scale. We demonstrate that latent variable models can
learn the shared variability between neurons and can predict
single-trial spikes better than PSTH (the ideal receptive field model)
alone. Taken together, our results suggest that the cortical population
response likely reflects an efficient encoding strategy coupled with
global modulations from internal dynamics.
