Gert Cauwenberghs - Profile on Academia.edu

Papers by Gert Cauwenberghs

Research paper thumbnail of Event-driven contrastive divergence for spiking neuromorphic systems

Frontiers in Neuroscience, 2014

Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been demonstrated to perform efficiently in a variety of applications, such as dimensionality reduction, feature learning, and classification. Their implementation on neuromorphic hardware platforms emulating large-scale networks of spiking neurons can have significant advantages from the perspectives of scalability, power dissipation and real-time interfacing with the environment. However, the traditional RBM architecture and the commonly used training algorithm known as Contrastive Divergence (CD) are based on discrete updates and exact arithmetic, which do not directly map onto a dynamical neural substrate. Here, we present an event-driven variation of CD to train an RBM constructed with Integrate & Fire (I&F) neurons, constrained by the limitations of existing and near-future neuromorphic hardware platforms. Our strategy is based on neural sampling, which allows us to synthesize a spiking neural network that samples from a target Boltzmann distribution. The recurrent activity of the network replaces the discrete steps of the CD algorithm, while Spike Time Dependent Plasticity (STDP) carries out the weight updates in an online, asynchronous fashion. We demonstrate our approach by training an RBM composed of leaky I&F neurons with STDP synapses to learn a generative model of the MNIST hand-written digit dataset, and by testing it in recognition, generation and cue integration tasks. Our results contribute to a machine learning-driven approach for synthesizing networks of spiking neurons capable of carrying out practical, high-level functionality.
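
The event-driven CD scheme can be caricatured in a few lines: STDP-style updates are applied online as spikes arrive, with the update sign gated by a global modulation signal that alternates between a data-driven phase and a free-running reconstruction phase. The sketch below is a minimal illustration under assumed parameters (time constant, learning rate, spike sequence), not the paper's implementation.

```python
import numpy as np

# Minimal sketch of event-driven CD (illustrative, not the paper's code):
# STDP-like weight updates applied online, with the sign gated by a global
# signal that is +1 in the data-driven phase and -1 in the reconstruction phase.

rng = np.random.default_rng(0)
n_vis, n_hid = 6, 4
W = rng.normal(0, 0.1, (n_vis, n_hid))    # visible-to-hidden weights
tau = 20.0                                # STDP time constant (ms), assumed
lr = 1e-3                                 # learning rate, assumed

last_vis = np.full(n_vis, -np.inf)        # last spike time per visible unit
last_hid = np.full(n_hid, -np.inf)        # last spike time per hidden unit

def on_hidden_spike(j, t, sign):
    """Causal update on W[:, j] for recent presynaptic (visible) spikes."""
    dt = t - last_vis
    W[:, j] += sign * lr * np.exp(-dt / tau) * (dt >= 0)
    last_hid[j] = t

def on_visible_spike(i, t, sign):
    """Acausal update on W[i, :] for recent postsynaptic (hidden) spikes."""
    dt = t - last_hid
    W[i, :] -= sign * lr * np.exp(-dt / tau) * (dt >= 0)
    last_vis[i] = t

# Data phase (sign = +1) followed by reconstruction phase (sign = -1), as in CD.
for t, (unit, kind) in enumerate([(0, 'v'), (2, 'h'), (1, 'v'), (3, 'h')]):
    phase = +1 if t < 2 else -1
    (on_visible_spike if kind == 'v' else on_hidden_spike)(unit, float(t), phase)
```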

Research paper thumbnail of Event-driven contrastive divergence: neural sampling foundations

Frontiers in Neuroscience, Mar 31, 2015

Research paper thumbnail of Forward table-based presynaptic event-triggered spike-timing-dependent plasticity

Spike-timing-dependent plasticity (STDP) incurs both causal and acausal synaptic weight updates, for negative and positive time differences between presynaptic and postsynaptic spike events. For realizing such updates in neuromorphic hardware, current implementations either require forward and reverse lookup access to the synaptic connectivity table, or rely on memory-intensive architectures such as crossbar arrays. We present a novel method for realizing both causal and acausal weight updates using only forward lookup access of the synaptic connectivity table, permitting memory-efficient implementation. A simplified FPGA implementation, using a single timer variable for each neuron, closely approximates exact cumulative STDP weight updates for neuron refractory periods greater than 10 ms, and reduces to exact STDP for refractory periods greater than the STDP time window. Compared to a conventional crossbar implementation, the forward table-based implementation yields substantial memory savings for sparsely connected networks, supporting scalable neuromorphic systems with fully reconfigurable synaptic connectivity and plasticity.
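
As a rough illustration of the forward-only lookup idea, the sketch below (hypothetical names and parameters, not the FPGA design) applies the acausal update immediately at each presynaptic spike, and defers the causal update until the neuron's next presynaptic spike, using only each neuron's last spike time:

```python
import math
from collections import defaultdict

# Sketch of presynaptic event-triggered STDP with forward-only lookup.
A_PLUS, A_MINUS, TAU = 0.01, 0.012, 20.0     # STDP amplitudes / time constant (ms)

synapses = defaultdict(list)                 # pre id -> list of [post id, weight]
synapses[0] = [[1, 0.5], [2, 0.5]]

last_spike = defaultdict(lambda: -math.inf)  # last spike time of every neuron
last_pre = defaultdict(lambda: -math.inf)    # previous spike of each presynaptic neuron

def on_pre_spike(pre, t):
    for syn in synapses[pre]:                # forward lookup only
        post, w = syn
        t_post = last_spike[post]
        # Deferred causal update: post fired after the *previous* pre spike.
        if last_pre[pre] < t_post <= t:
            syn[1] = w + A_PLUS * math.exp(-(t_post - last_pre[pre]) / TAU)
        # Acausal update: post fired before this pre spike.
        if t_post <= t:
            syn[1] -= A_MINUS * math.exp(-(t - t_post) / TAU)
    last_pre[pre] = t
    last_spike[pre] = t

def on_post_spike(post, t):
    last_spike[post] = t                     # no reverse synapse lookup needed

on_pre_spike(0, 0.0)
on_post_spike(1, 5.0)
on_pre_spike(0, 12.0)   # pairs post@5 causally with pre@0, acausally with pre@12
print(synapses[0])
```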

Research paper thumbnail of Gibbs sampling with low-power spiking digital neurons

Restricted Boltzmann Machines and Deep Belief Networks have been successfully used in a wide variety of applications including image classification and speech recognition. Inference and learning in these algorithms use a Markov Chain Monte Carlo procedure called Gibbs sampling. A sigmoidal function forms the kernel of this sampler, which can be realized from the firing statistics of noisy integrate-and-fire neurons on a neuromorphic VLSI substrate. This paper demonstrates such an implementation on an array of digital spiking neurons with stochastic leak and threshold properties for inference tasks, and presents some key performance metrics for such a hardware-based sampler in both the generative and discriminative contexts.
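
The correspondence between noisy threshold units and the logistic kernel is easy to demonstrate in simulation. In the sketch below (illustrative, not the chip's circuit), the membrane drive plus logistic-distributed noise is compared to a threshold, which makes the spike probability exactly sigmoid(drive); alternating updates then implement the Gibbs chain of an RBM:

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_spike(drive):
    """Noisy threshold unit: P(spike) = sigmoid(drive) under logistic noise."""
    noise = rng.logistic(0.0, 1.0, size=drive.shape)
    return (drive + noise > 0.0).astype(float)

n_vis, n_hid = 8, 5
W = rng.normal(0, 0.5, (n_vis, n_hid))
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)

v = rng.integers(0, 2, n_vis).astype(float)
for _ in range(100):                 # alternating Gibbs updates over v and h
    h = noisy_spike(v @ W + b_h)
    v = noisy_spike(h @ W.T + b_v)
print(v, h)
```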

Research paper thumbnail of A 1.52 pJ/Spike Reconfigurable Multimodal Integrate-and-Fire Neuron Array Transceiver

An ultra-low power integrate-and-fire neuron array transceiver with a multi-modal neuron architecture is presented. The design features an array of 16×16 charge-mode mixed-signal neurons that can be configured to implement a variety of activation functions, including step, sigmoid and Rectified Linear Unit (ReLU), through reconfiguration of the clocking waveforms, with partial reset in charge accumulation and additive stochastic noise by Linear Feedback Shift Register (LFSR) coupling. The neurons output spike-based sparse synchronous events, which are either binary (event/no event) or ternary (positive/negative/no event). The reconfigurable, energy-efficient design makes this architecture suitable for deep learning and neuromorphic applications such as Restricted Boltzmann Machines, Convolutional Neural Networks and general event-driven computing. The 1.796 mm² chip, fabricated in 130 nm CMOS technology, consumes 140.6 µW from a 1.8 V supply at 92.5 MSpikes/s, achieving an energy-efficiency Figure-of-Merit (FoM) of 1.52 pJ/Spike. A CNN architecture implemented on the chip using sigmoid and ReLU activations achieves MNIST prediction accuracies of 94.8% and 96.9%, respectively.
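
A behavioral toy model helps see how the reset scheme and LFSR noise shape the activation (all values below are assumptions for illustration, not the chip's): full reset gives a step-like response, partial reset preserves residual charge so the spike count grows roughly linearly with the input (ReLU-like), and pseudo-random LFSR noise smooths the threshold into a sigmoid.

```python
def lfsr16(state):
    """One step of a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

def neuron(inputs, theta=1.0, partial_reset=True, noise_scale=0.0, seed=0xACE1):
    """Charge-accumulating neuron; returns spike count over the input sequence."""
    v, spikes, state = 0.0, 0, seed
    for x in inputs:
        state = lfsr16(state)
        noise = noise_scale * ((state / 65535.0) - 0.5)   # LFSR noise in [-0.5, 0.5]
        v += x + noise
        if v >= theta:
            spikes += 1
            v = v - theta if partial_reset else 0.0       # partial vs. full reset
    return spikes

print(neuron([0.4] * 10))                                  # ReLU-like rate coding
print(neuron([0.4] * 10, partial_reset=False, noise_scale=1.0))  # stochastic, step-like
```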

Research paper thumbnail of Mapping Generative Models onto Networks of Digital Spiking Neurons

arXiv (Cornell University), Sep 24, 2015

Stochastic neural networks such as Restricted Boltzmann Machines (RBMs) have been successfully used in applications ranging from speech recognition to image classification. Inference and learning in these algorithms use a Markov Chain Monte Carlo procedure called Gibbs sampling, where a logistic function forms the kernel of the sampler. On the other side of the spectrum, neuromorphic systems have shown great promise for low-power and parallelized cognitive computing, but lack well-suited applications and automation procedures. In this work, we propose a systematic method for bridging the RBM algorithm and digital neuromorphic systems, with a generative pattern completion task as proof of concept. For this, we first propose a method of producing the Gibbs sampler using bio-inspired digital noisy integrate-and-fire neurons. Next, we describe the process of mapping generative RBMs trained offline onto the IBM TrueNorth neurosynaptic processor, a low-power digital neuromorphic VLSI substrate. Mapping these algorithms onto neuromorphic hardware presents unique challenges in network connectivity and in weight and bias quantization, which, in turn, require architectural and design strategies for the physical realization. Generative performance metrics are analyzed to validate the neuromorphic requirements and to best select the neuron parameters for the model. Lastly, we describe a design automation procedure which achieves optimal resource usage, accounting for the novel hardware adaptations. This work represents the first implementation of generative RBM inference on a neuromorphic VLSI substrate.
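
Of the mapping challenges mentioned, weight and bias quantization is the easiest to illustrate in isolation. The sketch below shows generic uniform quantization to a small set of signed integer levels (level count and range are illustrative; TrueNorth's actual weight constraints are more restrictive):

```python
import numpy as np

def quantize(W, n_levels=16):
    """Uniformly quantize weights to signed integers in [-n_levels/2, n_levels/2)."""
    scale = np.max(np.abs(W)) / (n_levels // 2)
    Wq = np.clip(np.round(W / scale), -(n_levels // 2), n_levels // 2 - 1)
    return Wq.astype(int), scale

rng = np.random.default_rng(2)
W = rng.normal(0, 0.8, (16, 8))            # offline-trained real-valued weights
Wq, scale = quantize(W)
print("mean reconstruction error:", np.abs(W - Wq * scale).mean())
```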

Research paper thumbnail of Unsupervised Learning in Synaptic Sampling Machines

arXiv (Cornell University), Nov 14, 2015

Recent studies have shown that synaptic unreliability is a robust and sufficient mechanism for inducing the stochasticity observed in cortex. Here, we introduce the Synaptic Sampling Machine (SSM), a stochastic neural network model that uses synaptic unreliability as a means to stochasticity for sampling. Synaptic unreliability plays the dual role of an efficient mechanism for sampling in neuromorphic hardware, and of a regularizer during learning akin to DropConnect. Similar to the original formulation of Boltzmann machines, the SSM can be viewed as a stochastic counterpart of Hopfield networks, but where stochasticity is induced by a random mask over the connections. The SSM is trained to learn generative models with a synaptic plasticity rule implementing an event-driven form of contrastive divergence. We demonstrate this by learning a model of the MNIST hand-written digit dataset, and by testing it in recognition and inference tasks. We find that SSMs outperform restricted Boltzmann machines (4.4% error rate vs. 5%), are more robust to overfitting, and tend to learn sparser representations. SSMs are remarkably robust to weight pruning: removal of more than 80% of the weakest connections followed by cursory re-learning causes only a negligible performance loss on the MNIST task (4.8% error rate). These results show that SSMs offer substantial improvements in terms of performance, power and complexity over existing methods for unsupervised learning in spiking neural networks, and are thus promising models for machine learning in neuromorphic execution platforms.
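
The core mechanism, synaptic unreliability as multiplicative Bernoulli noise on the connections, can be sketched in a few lines (blank-out probability and layer sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
p_blank = 0.5                              # probability a synapse fails to transmit
W = rng.normal(0, 0.5, (784, 500))

def stochastic_drive(v, W):
    """Each synapse transmits with probability 1 - p_blank, as in DropConnect."""
    mask = rng.random(W.shape) >= p_blank      # fresh mask at every use
    return v @ (W * mask) / (1.0 - p_blank)    # rescale to preserve the mean drive

v = rng.integers(0, 2, 784).astype(float)
print(stochastic_drive(v, W)[:5])
```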

Research paper thumbnail of A 7.86 mW +12.5 dBm in-band IIP3 8-to-320 MHz capacitive harmonic rejection mixer in 65nm CMOS

We present a low-power, high-linearity capacitive harmonic rejection mixer for cognitive radio applications. A passive mixer-first receiver with capacitive 16-phase sinusoidal weighting implements harmonic rejection down-conversion, and an AC-coupled, fully differential capacitor-feedback transimpedance amplifier provides baseband linear voltage gain and band-pass filtering, achieving an in-band IIP3 of +12.5 dBm at 320 MHz LO over a 3 MHz baseband. The 1.62 mm² mixer in 65 nm CMOS consumes 40 μW per I/Q complex output channel, and 7.82 mW for 16-phase PLL clock generation and distribution.
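
The harmonic rejection property of sinusoidal phase weighting follows from DFT orthogonality: weighting the 16 LO phase paths by samples of a cosine nulls the mixing products at the 3rd, 5th and 7th harmonics (the 15th aliases back, as in any 16-phase system). A quick numerical check:

```python
import numpy as np

N = 16
k = np.arange(N)
w = np.cos(2 * np.pi * k / N)            # capacitor ratios ~ sinusoidal weights

for h in (1, 3, 5, 7):                   # response of the weighted phase set
    resp = np.sum(w * np.cos(2 * np.pi * h * k / N))
    print(f"harmonic {h}: {resp:+.2e}")  # nonzero at h = 1, ~0 at h = 3, 5, 7
```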

Research paper thumbnail of Dropout and DropConnect for Reliable Neuromorphic Inference Under Communication Constraints in Network Connectivity

IEEE Journal on Emerging and Selected Topics in Circuits and Systems, Dec 1, 2019

Dropout and DropConnect are known as effective methods to improve the generalization performance of neural networks, by randomly dropping states of neural units or weights of synaptic connections at each time instance throughout the training process. In this paper, we extend the use of these methods to the design of neuromorphic spiking neural network (SNN) hardware, to further improve the reliability of inference as impacted by resource-constrained errors in network connectivity. Such energy and bandwidth constraints arise for low-power operation in the communication between neural units, causing dropped spike events due to timeout errors in the transmission. The Dropout and DropConnect processes during training of the network are aligned with a statistical model of the network during inference that accounts for these random errors in the transmission of neural states and synaptic connections. The use of Dropout and DropConnect during training hence allows two design objectives to be met simultaneously: improving robustness of inference to dropped spike events due to timeout communication constraints in network connectivity, while maximizing time-to-decision bandwidth and hence minimizing inference energy in the neuromorphic hardware. Simulations with a 5-layer fully connected 784-500-500-500-10 SNN on the MNIST task show a 3.42-fold and a 7.06-fold decrease in inference energy at 90% test accuracy, using Dropout and DropConnect respectively during backpropagation training. Simulations with convolutional neural networks on the CIFAR-10 task likewise show a 1.24-fold decrease in inference energy at 60% test accuracy using Dropout during backpropagation training.
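
The alignment between training-time regularization and inference-time transmission errors can be sketched as follows (drop rate and layer sizes are illustrative): if the communication fabric drops a fraction p of spikes on timeout, training with Dropout or DropConnect at the same rate p puts inference in the regime the network was optimized for.

```python
import numpy as np

rng = np.random.default_rng(4)
p_drop = 0.2                              # assumed timeout drop rate

def forward(x, W, mode):
    """One layer with training-time masks matching inference drop statistics."""
    if mode == "dropout":                 # drop unit outputs (dropped spike events)
        x = x * (rng.random(x.shape) >= p_drop)
    elif mode == "dropconnect":           # drop individual connections
        W = W * (rng.random(W.shape) >= p_drop)
    return np.maximum(0.0, x @ W)         # ReLU layer as a stand-in

W = rng.normal(0, 0.1, (784, 500))
x = rng.random(784)
print(forward(x, W, "dropout")[:3], forward(x, W, "dropconnect")[:3])
```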

Research paper thumbnail of Neuromorphic Dynamical Synapses with Reconfigurable Voltage-Gated Kinetics

IEEE Transactions on Biomedical Engineering, 2019

Although biological synapses express a large variety of receptors in neuronal membranes, current hardware implementations of neuromorphic synapses often rely on simple models that ignore the heterogeneity of synaptic transmission. Our objective is to emulate different types of synapses with distinct properties. Methods: Conductance-based chemical and electrical synapses were implemented between silicon neurons on a fully programmable and reconfigurable, biophysically realistic neuromorphic VLSI chip. Different synaptic properties were achieved by configuring on-chip digital parameters for the conductances, reversal potentials, and voltage dependence of the channel kinetics. The measured I-V characteristics of the artificial synapses were compared with biological data. Results: We reproduced the response properties of five different types of chemical synapses, including both excitatory (AMPA, NMDA) and inhibitory (GABA-A, GABA-C, glycine) ionotropic receptors. In addition, electrical synapses were implemented in a small network of four silicon neurons. Conclusion: Our work extends the repertoire of synapse types between silicon neurons, providing greater flexibility for the design and implementation of biologically realistic neural networks on neuromorphic chips. Significance: Higher synaptic heterogeneity in neuromorphic chips is relevant for the hardware implementation of energy-efficient population codes as well as for dynamic clamp applications where neural models are implemented in neuromorphic VLSI hardware.
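
A minimal model of the synapse family being configured (parameters illustrative, not the chip's register values) captures how reversal potential, kinetics, and voltage gating distinguish the receptor types; for example, the NMDA receptor adds a voltage-dependent magnesium-block term:

```python
import numpy as np

# Conductance-based synaptic current: I_syn = g_max * s(t) * B(V) * (E_rev - V),
# where E_rev separates excitatory from inhibitory receptors and B(V) adds
# voltage dependence (Jahr-Stevens magnesium block for NMDA).

receptors = {                 # (E_rev [mV], tau_decay [ms], voltage-gated?)
    "AMPA":   (0.0,    5.0, False),
    "NMDA":   (0.0,  100.0, True),
    "GABA-A": (-70.0, 10.0, False),
}

def syn_current(name, V, s, g_max=1.0, mg=1.0):
    E_rev, _, gated = receptors[name]
    B = 1.0 / (1.0 + mg / 3.57 * np.exp(-0.062 * V)) if gated else 1.0
    return g_max * s * B * (E_rev - V)

for name in receptors:
    print(name, syn_current(name, V=-60.0, s=0.5))
```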

Research paper thumbnail of Neuromorphic synapses with reconfigurable voltage-gated dynamics for biohybrid neural circuits

Chemical synapses are the main link between neurons in biological neural networks, giving neural cells their special ability to communicate action potentials with great temporal precision. As such, neuromorphic synapses could serve as critical units for implementing biohybrid circuits interfacing biological and silicon neurons. Here we describe a biophysical model of a chemical synapse with reconfigurable pre-synaptic and post-synaptic voltage-gated dynamics implemented on a neuromorphic VLSI chip, and evaluate its versatility with measurements from the chip reproducing response characteristics of ionotropic inhibitory GABA-A and excitatory NMDA synapses. We also discuss applications of the reconfigurable dynamic clamp capabilities of the neuromorphic synapse for interacting with neuronal populations and biological neurons to establish biohybrid circuits ranging from the single-cell to the population level, with intra- and extra-cellular interfaces, respectively.

Research paper thumbnail of A 6.5-μW/MHz Charge Buffer With 7-fF Input Capacitance in 65-nm CMOS for Noncontact Electropotential Sensing

IEEE Transactions on Circuits and Systems Ii-express Briefs, Dec 1, 2016

This brief presents a CMOS charge buffer with femtofarad-range input capacitance for applications in capacitive electropotential sensing. We analyze and verify a feedback mechanism to negate parasitic capacitances seen at the input of a CMOS amplifier. Measurements are presented from a prototype fabricated in 65-nm CMOS, occupying an active area of 193 µm² with an efficiency of 6.5 µW/MHz. Over-the-air measurements validate its applicability to electropotential sensing.
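
The negation principle can be illustrated with a one-line calculation (the raw parasitic value below is assumed for illustration): bootstrapping a parasitic capacitance C_p with a feedback gain A reduces the charge drawn from the source to Q = C_p(1 - A)V_in, i.e. an effective input capacitance C_eff = C_p(1 - A) that approaches zero as A approaches 1.

```python
# Effective input capacitance under capacitance-negating feedback (sketch).
C_p = 200e-15                      # raw parasitic capacitance (200 fF, assumed)
for A in (0.0, 0.9, 0.965, 0.999):
    C_eff = C_p * (1.0 - A)        # bootstrapped charge: Q = C_p * (1 - A) * V_in
    print(f"A = {A:.3f}: C_eff = {C_eff * 1e15:7.2f} fF")
```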

Research paper thumbnail of NeuroBench: Advancing Neuromorphic Computing through Collaborative, Fair and Representative Benchmarking

arXiv (Cornell University), Apr 10, 2023

Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neuromorphic computing benchmark efforts have not seen widespread adoption due to a lack of inclusive, actionable, and iterative benchmark design and guidelines. To address these shortcomings, we present NeuroBench: a benchmark framework for neuromorphic computing algorithms and systems. NeuroBench is a collaboratively designed effort from an open community of nearly 100 co-authors across over 50 institutions in industry and academia, aiming to provide a representative structure for standardizing the evaluation of neuromorphic approaches. The NeuroBench framework introduces a common set of tools and a systematic methodology for inclusive benchmark measurement, delivering an objective reference framework for quantifying neuromorphic approaches in both hardware-independent (algorithm track) and hardware-dependent (system track) settings. In this article, we present initial performance baselines across various model architectures on the algorithm track and outline the system track benchmark tasks and guidelines. NeuroBench is intended to continually expand its benchmarks and features to foster and track the progress made by the research community.

Research paper thumbnail of Neuromorphic Neural Interfaces

Handbook of Neuroengineering, 2023

Research paper thumbnail of A compute-in-memory chip based on resistive random-access memory

Nature, 2022

Realizing increasingly complex artificial intelligence (AI) functionalities directly on edge devices calls for unprecedented energy efficiency of edge hardware. Compute-in-memory (CIM) based on resistive random-access memory (RRAM) [1] promises to meet such demand by storing AI model weights in dense, analogue and non-volatile RRAM devices, and by performing AI computation directly within RRAM, thus eliminating power-hungry data movement between separate compute and memory [2-5]. Although recent studies have demonstrated in-memory matrix-vector multiplication on fully integrated RRAM-CIM hardware [6-17], it remains a goal for a RRAM-CIM chip to simultaneously deliver high energy efficiency, versatility to support diverse models and software-comparable accuracy. Although efficiency, versatility and accuracy are all indispensable for broad adoption of the technology, the inter-related trade-offs among them cannot be addressed by isolated improvements on any single abstraction level of the design…

Research paper thumbnail of Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

arXiv (Cornell University), Aug 17, 2021

Realizing today's cloud-level artificial intelligence (AI) functionalities directly on devices distributed at the edge of the internet calls for edge hardware capable of processing multiple modalities of sensory data (e.g. video, audio) at unprecedented energy efficiency. AI hardware architectures today cannot meet the demand due to a fundamental "memory wall": data movement between separate compute and memory units consumes large energy and incurs long latency [1]. Resistive random-access memory (RRAM) based compute-in-memory (CIM) architectures promise to bring orders-of-magnitude energy-efficiency improvement by performing computation directly within memory, using intrinsic physical properties of RRAM devices [2-7]. However, conventional approaches to CIM hardware design limit the functional flexibility necessary for processing diverse AI workloads, and must overcome hardware imperfections that degrade inference accuracy. Such trade-offs between efficiency, versatility and accuracy cannot be addressed by isolated improvements on any single level of the design. By co-optimizing across all hierarchies of the design, from algorithms and architecture to circuits and devices, we present NeuRRAM, the first multimodal edge AI chip using RRAM CIM to simultaneously deliver a high degree of versatility in reconfiguring a single chip for diverse model architectures, record energy efficiency 5-8× better than prior art across various computational bit-precisions, and inference accuracy comparable to software models with 4-bit weights on all measured standard AI benchmarks, including accuracy of 99.0% on MNIST and 85.7% on CIFAR-10 image classification, 84.7% accuracy on Google speech command recognition, and a 70% reduction in image reconstruction error on a Bayesian image recovery task. This work paves a way towards building highly efficient and reconfigurable edge AI hardware platforms for the more demanding and heterogeneous AI applications of the future. Compute-in-memory (CIM) architecture offers a pathway towards achieving brain-level information processing efficiency by eliminating expensive data movement between isolated compute and memory units in a conventional von Neumann architecture. Resistive random-access memory (RRAM) [8] is an emerging non-volatile memory that offers higher density, lower leakage and better analog programmability than conventional on-chip static random-access memory (SRAM), making it an ideal candidate to implement large-scale and low-power CIM systems. Research in this area has demonstrated various AI applications by using fabricated resistive memory arrays as electronic synapses while using off-chip software/hardware to implement essential functionalities such as analog-to-digital conversion and neuron activations for a complete system. More recent studies have demonstrated fully integrated RRAM-CMOS chips and focused on techniques to improve energy efficiency. However, to date, there has not been a fully integrated RRAM CIM chip that simultaneously demonstrates a broad cross-section of…
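
The CIM primitive underlying this line of work is straightforward to model (device values and noise levels below are assumptions, not NeuRRAM's measured characteristics): weights map to differential conductance pairs, inputs are applied as row voltages, and Kirchhoff's current law performs the multiply-accumulate on each column.

```python
import numpy as np

rng = np.random.default_rng(5)
W = rng.normal(0, 0.3, (4, 8))             # target weights
g_max = 40e-6                              # max device conductance (40 uS, assumed)

scale = g_max / np.abs(W).max()
G_pos = np.clip(W, 0, None) * scale        # positive weights -> G+ devices
G_neg = np.clip(-W, 0, None) * scale       # negative weights -> G- devices
sigma = 0.02 * g_max                       # programming noise (assumed)
G_pos += rng.normal(0, sigma, G_pos.shape)
G_neg += rng.normal(0, sigma, G_neg.shape)

v_in = rng.random(8) * 0.3                 # row voltages (V)
i_out = (G_pos - G_neg) @ v_in             # column currents = analog MVM
print(i_out / scale)                       # noisy analog result
print(W @ v_in)                            # ideal digital result, for comparison
```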

Research paper thumbnail of Neuromorphic Neural Interfaces

Handbook of Neuroengineering, 2022

Research paper thumbnail of Mixed-mode VLSI implementation of fuzzy ART

ISCAS '98. Proceedings of the 1998 IEEE International Symposium on Circuits and Systems (Cat. No.98CH36187)

We present an asynchronous mixed analog-digital VLSI architecture which implements the Fuzzy Adaptive Resonance Theory (Fuzzy-ART) algorithm. Both classification and learning are performed on-chip in real time. Unique features of our implementation include an embedded refresh mechanism to overcome memory drift due to charge leakage from volatile capacitive storage, and a recoding mechanism to eliminate and reassign inactive categories. A small-scale CMOS prototype with 4 inputs and 8 output categories has been designed and fabricated. The unit cell, which performs the fuzzy min and learning operations, measures 100 µm by 45 µm. Experimental results are included to illustrate performance of the unit cell.
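
For reference, the Fuzzy-ART operations the unit cell implements in analog hardware — fuzzy min (component-wise AND), category choice, vigilance test, and learning — look as follows in an algorithmic sketch (parameters illustrative):

```python
import numpy as np

alpha, rho, beta = 0.001, 0.75, 1.0        # choice, vigilance, learning rate

def fuzzy_art_step(x, W):
    """One Fuzzy-ART presentation: pick the resonating category and learn."""
    T = [np.minimum(x, w).sum() / (alpha + w.sum()) for w in W]     # choice
    for j in np.argsort(T)[::-1]:                                   # best first
        match = np.minimum(x, W[j]).sum() / x.sum()
        if match >= rho:                                            # vigilance test
            W[j] = beta * np.minimum(x, W[j]) + (1 - beta) * W[j]   # learn
            return j
    return None                                                     # no resonance

W = [np.ones(4), np.ones(4)]               # 2 uncommitted categories, 4 inputs
print(fuzzy_art_step(np.array([0.2, 0.8, 0.5, 0.1]), W))
```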

Research paper thumbnail of Neuromorphic Instantiation of Spiking Half-Centered Oscillator Models for Central Pattern Generation

2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 2021

In both invertebrate and vertebrate animals, small networks called central pattern generators (CPGs) form the building blocks of the neuronal circuits involved in locomotion. Most CPGs contain a simple half-center oscillator (HCO) motif, which consists of two neurons, or populations of neurons, connected by reciprocal inhibition. CPGs and HCOs are well-characterized neuronal networks and have been extensively modeled at different levels of abstraction. In the past two decades, implementation of spiking CPG and HCO models in neuromorphic hardware has opened up new applications in mobile robotics, computational neuroscience, and neuroprosthetics. Despite their relative simplicity, the parameter space of CPG and HCO models can become extensive when considering various neuron models and network topologies. Motivated by computational work in neuroscience that used a brute-force approach to generate a database of millions of simulations of the leech heartbeat HCO, we have started to build a database of spiking chains of multiple HCOs for different neuron model types and network topologies. Here we present preliminary results using the Izhikevich and Morris-Lecar neuron models for single and paired HCOs with different inter-HCO coupling schemes.
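
A single entry of such a database might look like the sketch below: two Izhikevich neurons with crude reciprocal inhibition forming one half-center oscillator (coupling and drive values are illustrative guesses, not the study's parameters, and the simplistic threshold coupling only roughly captures inhibitory synapses).

```python
import numpy as np

a, b, c, d = 0.02, 0.2, -65.0, 8.0         # Izhikevich regular-spiking parameters
dt, g_inh, I_drive = 0.5, 10.0, 10.0       # step (ms), inhibition strength, tonic drive

v = np.array([-65.0, -60.0])               # slight asymmetry breaks the tie
u = b * v
spikes = [[], []]
for step in range(4000):                   # 2 s of simulated time
    # Crude mutual inhibition: each cell is suppressed while the other is depolarized.
    syn = g_inh * np.array([v[1] > -50.0, v[0] > -50.0])
    dv = 0.04 * v**2 + 5 * v + 140 - u + I_drive - syn
    v += dt * dv
    u += dt * a * (b * v - u)
    for i in range(2):
        if v[i] >= 30.0:                   # spike: record and reset
            spikes[i].append(step * dt)
            v[i], u[i] = c, u[i] + d
print(len(spikes[0]), len(spikes[1]))      # spike counts of the two half-centers
```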

Research paper thumbnail of Memristor for computing: Myth or reality?

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017, 2017

CMOS technology and its sustainable scaling have been the enablers for the design and manufacturing of computer architectures that have been fuelling a wider range of applications. Today, however, both the technology and the computer architectures face serious challenges ("walls") that make them incapable of delivering the required computing power under predefined constraints. This motivates the need to explore new architectures and new technologies, not only to maintain the economic benefit of scaling, but also to enable solutions for emerging compute- and data-storage-hungry applications such as big-data and data-intensive applications. This paper discusses the emerging memristor device as a complement (or alternative) to the CMOS device, and shows how this device can enable new ways of computing that will at least solve the challenges of today's architectures for some applications. The paper shows not only the potential of memristor devices in enabling new memory technologies and new logic design styles, but also their potential in enabling memory-intensive architectures as well as neuromorphic computing, owing to their unique properties such as tight integration with CMOS and the ability to learn and adapt.
