Gleb Basalyga - Academia.edu
Papers by Gleb Basalyga
Neural populations across cortical layers perform different computational tasks. However, it is not known whether information in different layers is encoded using a common neural code or whether it depends on the specific layer. Here we studied the laminar distribution of information in a large-scale computational model of cat primary visual cortex. We analyzed the amount of information about the input stimulus conveyed by the different representations of the cortical responses. In particular, we compared the information encoded in four possible neural codes: (1) the information carried by the firing rate of individual neurons; (2) the information carried by spike patterns within a time window; (3) the rate-and-phase information carried by the firing rate labelled by the phase of the Local Field Potentials (LFP); (4) the pattern-and-phase information carried by the spike patterns tagged with the LFP phase. We found that there is substantially more information in the rate-and-phase code compared with the firing rate alone for low LFP frequency bands (less than 30 Hz). When comparing how information is encoded across layers, we found that the extra information contained in a rate-and-phase code may reach 90 % in Layer 4, while in other layers it reaches only 60 %, compared to the information carried by the firing rate alone. These results suggest that information processing in primary sensory cortices could rely on different coding strategies across different layers.
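As a rough, self-contained illustration of the comparison described above, the following Python sketch labels synthetic spike counts with a discretized LFP phase and compares a plug-in mutual-information estimate for the rate code with that for the rate-and-phase code. The data are synthetic and the estimator is a naive histogram estimate without bias correction, unlike the information-theoretic methods actually used in the paper.

```python
# Rough illustration only: synthetic data and a naive plug-in MI estimator,
# not the paper's analysis pipeline.
import numpy as np

def mutual_information(x, y):
    """Plug-in mutual information (bits) between two discrete label arrays."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log2(pxy / (px * np.mean(y == yv)))
    return mi

rng = np.random.default_rng(0)
n_trials = 20_000
stim = rng.integers(0, 4, n_trials)                      # four hypothetical stimuli
rate = rng.poisson(2 + stim)                             # spike count modulated by the stimulus
phase = (stim * np.pi / 2 + 0.5 * rng.standard_normal(n_trials)) % (2 * np.pi)
phase_bin = np.digitize(phase, [np.pi / 2, np.pi, 3 * np.pi / 2])   # four LFP phase quadrants

mi_rate = mutual_information(stim, rate)                         # firing-rate code
mi_rate_phase = mutual_information(stim, rate * 4 + phase_bin)   # rate labelled by phase
print(f"rate code: {mi_rate:.3f} bits; rate-and-phase code: {mi_rate_phase:.3f} bits "
      f"(+{100 * (mi_rate_phase / mi_rate - 1):.0f} %)")
```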
We applied recently developed information theory methods [1,2] to the analysis of cortical responses in a large-scale computational model of cat primary visual cortex [3]. These methods quantify the information conveyed by spikes and by local field potentials (LFPs) in a very general way, without ad hoc assumptions about precisely which stimulus features (orientation, direction, etc.) drive the neuronal response. The phase-of-firing information is the extra information obtained by labeling spikes with the value of the LFP phase [2]. In order to gain insight into the information-processing properties of laminar cortical microcircuits, we calculated the spike count information conveyed by firing rates and the phase-of-firing information conveyed by LFPs for each layer of primary visual cortex.
We found that there is substantially more information in the phase code compared with the spike rate alone for low LFP frequencies (< 30 Hz). Figure 1 shows that the information gain for the phase code may reach 80 % in Layer 2/3, while in Layer 4 it reaches only 40 %, compared to the spike count code. These data support the hypothesis that the thalamo-cortical layers, which receive direct sensory input, may rely more on spikes to convey the information, while the cortico-cortical layers with strong recurrent connectivity may use the phase code and LFP signals for information coding.
In this work, we use a complex network approach to investigate how a neural network structure changes under synaptic plasticity. In particular, we consider a network of conductance-based, single-compartment integrate-and-fire excitatory and inhibitory neurons. Initially the neurons are connected randomly with uniformly distributed synaptic weights. The weights of excitatory connections can be strengthened or weakened during spiking activity by the mechanism known as spike-timing-dependent plasticity (STDP). We extract a binary directed connection matrix by thresholding the weights of the excitatory connections at every simulation step and calculate its major topological characteristics such as the network clustering coefficient, characteristic path length and small-world index. We numerically demonstrate that, under certain conditions, a nontrivial small-world structure can emerge from a random initial network subject to STDP learning.
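A hedged sketch of the graph-analysis step described above, using networkx on a stand-in random weight matrix: the excitatory weights are thresholded into a binary directed adjacency matrix, and the clustering coefficient, characteristic path length and a small-world index (relative to a density-matched random graph) are computed. The threshold, weight matrix and random-graph baseline are illustrative assumptions, not the paper's settings.

```python
# Requires networkx; assumes the thresholded graph is strongly connected
# (true here because the connection density is high).
import numpy as np
import networkx as nx

def small_world_stats(weights, threshold):
    adj = (weights > threshold).astype(int)
    np.fill_diagonal(adj, 0)                            # no self-connections
    g = nx.from_numpy_array(adj, create_using=nx.DiGraph)
    c = nx.average_clustering(g)                        # clustering coefficient
    l = nx.average_shortest_path_length(g)              # characteristic path length
    n, p = g.number_of_nodes(), nx.density(g)
    g_rand = nx.gnp_random_graph(n, p, directed=True, seed=1)   # density-matched baseline
    c_rand = nx.average_clustering(g_rand)
    l_rand = nx.average_shortest_path_length(g_rand)
    sigma = (c / c_rand) / (l / l_rand)                 # small-world index
    return c, l, sigma

rng = np.random.default_rng(0)
w = rng.random((200, 200))                   # stand-in excitatory weight matrix in [0, 1]
c, l, sigma = small_world_stats(w, threshold=0.8)
print(f"C = {c:.3f}, L = {l:.3f}, small-world index = {sigma:.2f}")
```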
BMC Neuroscience
A large-scale computational model of the thalamocortical system containing specific thalamic structures and four layers of cat primary visual cortex (V1) has been developed and simulated in the parallel NEURON environment. The model represents a simplified version of the X visual pathway for one eye to the primary visual cortex of an adult cat and consists of three connected subsystems (see Figure 1): Retina, Thalamus and V1. Cells are modeled as conductance-based single-compartment Hodgkin-Huxley neurons [1]. Inputs to the LGN in response to visual stimuli are generated by a reimplementation of the retina model described in [2]. The modeled LGN consists of thalamocortical relay cells and local inhibitory neurons. Additional inhibition to the LGN is provided by the reticular thalamic nucleus (RTN) neurons, which, together with the thalamic relay cells, receive strong feedback from Layer 6. The model is scaled to span approximately 1.9 × 1.9 mm² of striate cortical surface. It is composed of ~10,000 neurons with ~1,000,000 connections. Connections from the LGN to V1 produce an orientation selectivity map in Layer 4. The obtained orientation map is further used to infer the lateral connectivity inside Layer 2/3 in accordance with known principles governing horizontal local and long-range connections in V1 [3]. The model is constructed from currently available anatomical and physiological data [4,5]. The main focus of this work is to reproduce the response properties of cortical cells such as orientation preference and direction selectivity [6]. Using large-scale numerical simulations, we also study the spatial distribution of neuronal responses across the cortical layers and examine the mechanisms and principles that underlie the efficiency of the primary visual cortex. This work [7] was supported by an EPSRC research grant (Ref. EP/C010841/1).
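For readers unfamiliar with the simulation environment, the following minimal sketch shows a single conductance-based, single-compartment Hodgkin-Huxley cell in NEURON's Python interface, the basic building block the abstract refers to. It is a toy single-cell demo with arbitrary stimulus parameters, not the ~10,000-neuron model itself.

```python
# One HH compartment driven by a current step; illustrative values only.
from neuron import h
h.load_file("stdrun.hoc")                  # loads the standard run system (h.continuerun)

soma = h.Section(name="soma")
soma.L = soma.diam = 20                    # a single ~20 um compartment
soma.insert("hh")                          # built-in Hodgkin-Huxley channels

stim = h.IClamp(soma(0.5))                 # current injection stands in for synaptic input
stim.delay, stim.dur, stim.amp = 10, 80, 0.1   # ms, ms, nA (arbitrary demo values)

t = h.Vector().record(h._ref_t)            # record time
v = h.Vector().record(soma(0.5)._ref_v)    # record somatic membrane potential

h.finitialize(-65)                         # initialize to -65 mV
h.continuerun(100)                         # simulate 100 ms

print(f"peak membrane potential: {v.max():.1f} mV")
```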
The learning dynamics of on-line independent component analysis is analysed in the limit of large data dimension. We study a simple Hebbian learning algorithm that can be used to separate out a small number of non-Gaussian components from a high-dimensional data set. The de-mixing matrix parameters are confined to a Stiefel manifold of tall, orthogonal matrices and we introduce a natural gradient variant of the algorithm which is appropriate to learning on this manifold. For large input dimension the parameter trajectory of both algorithms passes through a sequence of unstable fixed points, each described by a diffusion process in a polynomial potential. Choosing the learning rate too large increases the escape time from each of these fixed points, effectively trapping the learning in a sub-optimal state. In order to avoid these trapping states a very low learning rate must be chosen during the learning transient, resulting in learning time-scales of O(N^2) or O(N^3) iterations where N is the data dimension. Escape from each sub-optimal state results in a sequence of symmetry breaking events as the algorithm learns each source in turn. This is in marked contrast to the learning dynamics displayed by related on-line learning algorithms for multilayer neural networks and principal component analysis. Although the natural gradient variant of the algorithm has nice asymptotic convergence properties, its transient dynamics is equivalent to that of the standard Hebbian algorithm.
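The sketch below illustrates the flavour of the algorithm under discussion, assuming a cubic Hebbian nonlinearity, skewed sources and a QR decomposition as the retraction back onto the Stiefel manifold; the paper's exact update rule, nonlinearity and natural-gradient formulation may differ. Whether, and how quickly, the algorithm escapes the trapping state near the initial conditions depends on the learning rate and the data dimension, which is precisely the scaling the paper analyses.

```python
# Schematic on-line Hebbian ICA on the Stiefel manifold (illustrative assumptions).
import numpy as np

rng = np.random.default_rng(0)
N, K, T = 50, 2, 100_000          # data dimension, number of sources, number of examples
eta = 0.05 / N                    # small learning rate during the transient

A = np.linalg.qr(rng.standard_normal((N, K)))[0]   # orthonormal mixing directions (ground truth)
W = np.linalg.qr(rng.standard_normal((N, K)))[0]   # de-mixing matrix, W^T W = I (Stiefel manifold)

def hebbian_step(W, x, eta):
    y = W.T @ x                        # projections of the current example
    W = W + eta * np.outer(x, y**2)    # cubic Hebbian update, sensitive to skewness
    return np.linalg.qr(W)[0]          # retraction back onto the Stiefel manifold

for _ in range(T):
    s = rng.exponential(1.0, K) - 1.0            # skewed, zero-mean sources
    x = A @ s + rng.standard_normal(N)           # high-dimensional Gaussian background
    W = hebbian_step(W, x, eta)

# Overlaps between learned and true directions; values near 1 indicate a recovered source.
print(np.round(np.abs(W.T @ A), 2))
```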
Cortical sensory neurons are known to be highly variable, in the sense that responses evoked by identical stimuli often change dramatically from trial to trial. The origin of this variability is uncertain, but it is usually interpreted as detrimental noise that reduces the computational accuracy of neural circuits. Here we investigate the possibility that such response variability might, in fact, be beneficial, because it may partially compensate for a decrease in accuracy due to stochastic changes in the synaptic strengths of a network. We study the interplay between two kinds of noise, response (or neuronal) noise and synaptic noise, by analyzing their joint influence on the accuracy of neural networks trained to perform various tasks. We find an interesting, generic interaction: when fluctuations in the synaptic connections are proportional to their strengths (multiplicative noise), a certain amount of response noise in the input neurons can significantly improve network performance, compared to the same network without response noise. Performance is enhanced because response noise and multiplicative synaptic noise are in some ways equivalent. These results are demonstrated analytically for the most basic network consisting of two input neurons and one output neuron performing a simple classification task, but computer simulations show that the phenomenon persists in a wide range of architectures, including recurrent (attractor) networks and sensory-motor networks that perform coordinate transformations. The results suggest that response variability could play an important dynamic role in networks that continuously learn.
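A toy reconstruction of the interaction described above (my own stand-in task and training rule, not the paper's analytical setup): a two-input, one-output classifier is trained with different levels of additive response noise on its inputs and then tested while its weights fluctuate multiplicatively on every trial. In this kind of setting, accuracy under multiplicative synaptic noise need not be best at zero response noise.

```python
# Stand-in experiment: response noise during training, multiplicative weight noise at test.
import numpy as np

rng = np.random.default_rng(0)

def make_task(n):
    """Two input 'neurons' whose mean responses differ between two stimulus classes."""
    labels = rng.integers(0, 2, n)
    means = np.where(labels[:, None] == 1, [1.0, 0.2], [0.2, 1.0])
    return means, labels

def train(sigma_response, n_train=2000, eta=0.05, epochs=50):
    """Perceptron-style training with additive response noise on the inputs."""
    x_mean, y = make_task(n_train)
    w = 0.1 * rng.standard_normal(2)
    for _ in range(epochs):
        x = x_mean + sigma_response * rng.standard_normal(x_mean.shape)
        pred = (x @ w > 0).astype(int)
        w += eta * ((y - pred) @ x) / n_train
    return w

def test(w, sigma_synaptic, n_test=5000):
    """Accuracy when each weight is corrupted by multiplicative noise on every trial."""
    x_mean, y = make_task(n_test)
    w_noisy = w * (1 + sigma_synaptic * rng.standard_normal((n_test, 2)))
    pred = (np.sum(x_mean * w_noisy, axis=1) > 0).astype(int)
    return np.mean(pred == y)

for sigma_r in [0.0, 0.2, 0.5, 1.0]:
    acc = np.mean([test(train(sigma_r), sigma_synaptic=0.8) for _ in range(20)])
    print(f"training response noise {sigma_r:.1f} -> accuracy {acc:.3f} under synaptic noise")
```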
This thesis provides a theoretical description of on-line unsupervised learning from high-dimensional data. In particular, the learning dynamics of the on-line Hebbian algorithm is studied for the following two popular statistical models: principal component analysis (PCA) and independent component analysis (ICA). The methods of statistical mechanics are used to elucidate the critical transient stages of the learning process.
In the case of on-line PCA, the statistical mechanics approach allows the derivation of a set of deterministic differential equations for a small number of macroscopic self-averaging order parameters which enables an exact calculation of the evolution of the error function in the limit of large data dimension. In the case of ICA learning, it is found that the interesting macroscopic order parameters are not self-averaging in general and the transient dynamics is relatively slow with large fluctuations. The parameter trajectory of the on-line Hebbian ICA algorithm studied here passes through a sequence of metastable states, each of which can be described by a diffusion process in a polynomial potential. The proper treatment of the fluctuations of the order parameters for this case is given by the Fokker-Planck equation.
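As an illustration of the kind of description referred to above (a generic one-dimensional form, not the thesis's specific equations), the probability density P(R, t) of an order parameter R drifting in a polynomial potential Phi(R) with diffusion coefficient D obeys a Fokker-Planck equation of the form:

```latex
% Generic 1D Fokker-Planck equation for an order parameter R in a potential \Phi(R).
% The polynomial form of \Phi and the dependence of D on the learning rate and the
% data dimension are what the statistical-mechanics analysis has to determine.
\[
  \frac{\partial P(R,t)}{\partial t}
    = \frac{\partial}{\partial R}\left[\frac{d\Phi(R)}{dR}\, P(R,t)\right]
    + D\,\frac{\partial^{2} P(R,t)}{\partial R^{2}}
\]
```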
For both models a natural gradient version of the studied algorithms is constructed using the fact that the parameter space of our models is constrained to a Stiefel manifold of orthogonal rectangular matrices. In the case of on-line ICA, the natural gradient variant exhibits the same stochastic trapping in sub-optimal metastable states during transient learning as in the case of the standard Hebbian ICA algorithm and there is only an advantage in using the natural gradient variant asymptotically.
The problem of the sensitivity of on-line learning to the choice of learning algorithms is considered. Recommendations for proper adjustment of learning parameters in order to improve the performance of studied algorithms are given. Numerical simulations for finite-sized systems are in good agreement with the theoretical results.
Artificial Neural Networks - ICANN 2002, Jan 1, 2002
The learning dynamics close to the initial conditions of an on-line Hebbian ICA algorithm has been studied. For large input dimension the dynamics can be described by a diffusion equation. A surprisingly large number of examples and an unusually low initial learning rate are required to avoid a stochastic trapping state near the initial conditions. Escape from this state results in symmetry breaking and the algorithm therefore avoids trapping in plateau-like fixed points which have been observed in other learning algorithms.
Advances in Neural Information Processing …, Jan 1, 2002
We study the dynamics of a Hebbian ICA algorithm extracting a single non-Gaussian component from a high-dimensional Gaussian background. For both on-line and batch learning we find that a surprisingly large number of examples are required to avoid trapping in a sub-optimal state close to the initial conditions. To extract a skewed signal at least O(N^2) examples are required for N-dimensional data and O(N^3) examples are required to extract a symmetrical signal with non-zero kurtosis.
In this paper, we use the complex network approach to investigate how a neural network structure changes under spike-timing-dependent plasticity (STDP). We numerically demonstrate that, under certain conditions, a nontrivial small-world-like structure can emerge from a random initial network subject to STDP learning.