Hadas Benisty | Technion Israel Institute of Technology
Papers by Hadas Benisty
The primary motor cortex (M1) is crucial for motor skill learning. Previous studies demonstrated that skill acquisition requires dopaminergic VTA (ventral tegmental area) signaling in M1; however, little is known regarding the effect of these inputs at the neuronal and network levels. Using a dexterity task, calcium imaging, chemogenetic silencing, and geometric data analysis, we demonstrate VTA-dependent reorganization of M1 layer 2-3 during motor learning. While average activity and average functional connectivity of the layer 2-3 network remain stable during learning, the activity kinetics, the correlational configuration of functional connectivity, and average connectivity strength of layer 2-3 neurons gradually transform towards an expert configuration. In addition, task success-failure outcome signaling gradually emerges. Silencing VTA dopaminergic inputs to M1 during learning prevents all these changes. Our findings demonstrate dopaminergic VTA-dependent formation of outcome...
Two-photon microscopy can resolve fluorescence dynamics deep in scattering tissue, but applying this technique in vivo is limited by short-working-distance water-immersion objectives. Here we present an ultra-long-working-distance (20 mm) air objective called the Cousa objective. It is optimized for performance across multiphoton imaging wavelengths, offers a >4 mm² FOV with submicron lateral resolution, and is compatible with commonly used multiphoton imaging systems. We share the full optical prescription, along with data on real-world performance, including in vivo calcium imaging in a range of species and approaches.
Journal of Psychopathology and Clinical Science
Science
Tuft dendrites of layer 5 pyramidal neurons form specialized compartments important for motor learning and performance, yet their computational capabilities remain unclear. Structural-functional mapping of the tuft tree from the motor cortex during motor tasks revealed two morphologically distinct populations of layer 5 pyramidal tract neurons (PTNs) that exhibit specific tuft computational properties. Early bifurcating and large nexus PTNs showed marked tuft functional compartmentalization, representing different motor variable combinations within and between their two tuft hemi-trees. By contrast, late bifurcating and smaller nexus PTNs showed synchronous tuft activation. Dendritic structure and dynamic recruitment of the N-methyl-D-aspartate (NMDA)-spiking mechanism explained the differential compartmentalization patterns. Our findings support a morphologically dependent framework for motor computations, in which independent amplification units can be combinatorically recruite...
These files contain the raw data used to extract ND-PFM signals in Figures 2, 3, and 4 of the manuscript. The MATLAB code used to perform the virtual lock-in operation is also included.
Acta Neurobiologiae Experimentalis, 2019
EURASIP Journal on Audio, Speech, and Music Processing, 2016
The goal of voice conversion is to modify a source speaker's speech to sound as if spoken by a target speaker. Common conversion methods are based on Gaussian mixture modeling (GMM). They aim to statistically model the spectral structure of the source and target signals and require relatively large training sets (typically dozens of sentences) to avoid over-fitting. Moreover, they often lead to muffled synthesized output signals, due to excessive smoothing of the spectral envelopes. Mobile applications are characterized by low resources in terms of training data, memory footprint, and computational complexity. As technology advances, computational and memory requirements become less limiting; however, the amount of available training data still presents a great challenge, as a typical mobile user is willing to record himself saying just a few sentences. In this paper, we propose the grid-based (GB) conversion method for such low-resource environments, which is successfully trained using very few sentences (5-10). The GB approach is based on sequential Bayesian tracking, by which the conversion process is expressed as a sequential estimation problem of tracking the target spectrum based on the observed source spectrum. The converted Mel frequency cepstrum coefficient (MFCC) vectors are sequentially evaluated using a weighted sum of the target training vectors used as grid points. The training process includes simple computations of Euclidean distances between the training vectors and is easily performed even in cases of very small training sets. We use global variance (GV) enhancement to improve the perceived quality of the synthesized signals obtained by the proposed and the GMM-based methods.
Using just 10 training sentences, our enhanced GB method leads to converted sentences having GV values closer to those of the target and to lower spectral distances at the same time, compared to the enhanced version of the GMM-based conversion method. Furthermore, subjective evaluations show that signals produced by the enhanced GB method are perceived as more similar to the target speaker than the enhanced GMM signals, at the expense of a small degradation in the perceived quality.
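As a rough illustration of the "weighted sum of the target training vectors used as grid points" idea, the sketch below converts a single frame with Gaussian weights computed from Euclidean distances. It is a simplification and not the paper's method: it assumes time-aligned source/target training pairs, treats each frame independently (the sequential Bayesian tracking step is omitted), and the function name convert_frame and the bandwidth sigma are hypothetical.

```python
import math


def convert_frame(x, src_grid, tgt_grid, sigma=1.0):
    """Map one source feature vector to a weighted sum of target grid vectors.

    src_grid[i] and tgt_grid[i] are assumed to be an aligned training pair.
    """
    # Squared Euclidean distance from x to every source grid point.
    d2 = [sum((a - b) ** 2 for a, b in zip(x, s)) for s in src_grid]
    # Gaussian weights; subtracting the smallest distance keeps exp() stable.
    d2_min = min(d2)
    w = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    total = sum(w)
    w = [wi / total for wi in w]
    # Converted vector: convex combination of the target grid vectors.
    dim = len(tgt_grid[0])
    return [sum(wi * t[k] for wi, t in zip(w, tgt_grid)) for k in range(dim)]
```

A source frame that coincides with a grid point is mapped almost exactly to its paired target vector, while a frame equidistant from two grid points is mapped to the average of their targets.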
2014 IEEE 28th Convention of Electrical & Electronics Engineers in Israel (IEEEI), 2014
2014 IEEE 28th Convention of Electrical & Electronics Engineers in Israel (IEEEI), 2014
2009 Second International Conference on the Applications of Digital Information and Web Technologies, 2009
In this paper, we introduce a time-varying short-time Fourier transform (TV-STFT) for representing discrete signals. We derive an explicit condition for perfect reconstruction using time-varying analysis and synthesis windows. Based on the derived representation, we propose an adaptive algorithm that controls the length of the analysis window to achieve a lower mean-square error (MSE) at each iteration. When compared to the conventional multiplicative transfer function approach with a fixed-length analysis window, the resulting algorithm achieves faster convergence without incurring a higher steady-state MSE. Experimental results demonstrate the effectiveness of the proposed approach.
Nanoscale, 2017
There has been tremendous interest in piezoelectricity at the nanoscale, for example in nanowires and nanofibers where piezoelectric properties may be enhanced or controllably tuned, thus necessitating robust characterization techniques of piezoelectric response in nanomaterials. Piezo-response force microscopy (PFM) is a well-established scanning probe technique routinely used to image piezoelectric/ferroelectric domains in thin films; however, its applicability to nanoscale objects is limited due to the requirement for physical contact with an atomic force microscope (AFM) tip that may cause dislocation or damage, particularly to soft materials, during scanning. Here we report a non-destructive PFM (ND-PFM) technique wherein the tip is oscillated into "discontinuous" contact during scanning, while applying an AC bias between tip and sample and extracting the piezoelectric response for each contact point by monitoring the resulting localized deformation at the AC frequenc...
Functional optical imaging in neuroscience is rapidly growing with the development of new optical systems and fluorescence indicators. To realize the potential of these massive spatiotemporal datasets for relating neuronal activity to behavior and stimuli and uncovering local circuits in the brain, accurate automated processing is increasingly essential. In this review, we cover recent computational developments in the full data processing pipeline of functional optical microscopy for neuroscience data and discuss ongoing and emerging challenges.
Clinical Psychology & Psychotherapy
The goal of voice conversion is to transform a sentence said by one speaker to sound as if another speaker had said it. The classical conversion based on a Gaussian Mixture Model, and several other schemes suggested since, produce muffled-sounding outputs due to excessive smoothing of the spectral envelopes. To reduce the muffling effect, enhancement of the Global Variance (GV) of the spectral features was recently suggested. We propose a different approach for GV enhancement, based on the classical conversion formalized as a GV-constrained minimization. Listening tests show that an improvement in quality is achieved by the proposed approach.
Voice conversion systems aim to transform sentences said by one speaker to sound as if another speaker had said them. Many statistically trained conversion methods produce muffled synthesized outputs due to over-smoothing of the converted spectra. To deal with the muffling effect, conversion methods integrated with Global Variance (GV) enhancement have been proposed. In order to gain the benefits of GV enhancement, the user is restricted to applying one of these methods as the conversion method. We propose a new GV enhancement method designed independently of any specific conversion scheme and applied as a post-processing block. The extent of GV enhancement is controlled through the allowed spectral distance between the enhanced and the originally converted output, as specified by the user. Listening tests showed that the proposed method improves both quality and similarity to the target of the examined converted sentences, outperforming other enhancement approaches that we evaluated.
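To make the notion of GV enhancement as a post-processing block more concrete, the sketch below computes the global variance of a converted utterance (the per-dimension variance across frames) and rescales each dimension around its mean so that its variance moves toward a target GV. This is generic variance scaling, not the method proposed in the abstract: the blend factor alpha is a hypothetical stand-in for the user-specified limit on spectral distance, and both function names are illustrative.

```python
import math


def global_variance(frames):
    """Return per-dimension means and variances over all frames of an utterance."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / n for k in range(dim)]
    gv = [sum((f[k] - means[k]) ** 2 for f in frames) / n for k in range(dim)]
    return means, gv


def enhance_gv(frames, target_gv, alpha=1.0):
    """Scale each dimension around its mean so its variance approaches target_gv.

    alpha = 0 leaves the frames unchanged; alpha = 1 matches target_gv exactly.
    """
    means, gv = global_variance(frames)
    scales = []
    for k in range(len(gv)):
        # Scale that would match the target variance exactly (skip flat dimensions).
        full = math.sqrt(target_gv[k] / gv[k]) if gv[k] > 0 else 1.0
        scales.append(1.0 + alpha * (full - 1.0))
    return [
        [means[k] + scales[k] * (f[k] - means[k]) for k in range(len(f))]
        for f in frames
    ]
```

Because the scaling is applied to already converted features, a block of this kind can follow any conversion scheme, which is the design point the abstract emphasizes.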
2018 IEEE International Conference on the Science of Electrical Engineering in Israel (ICSEE), 2018
Existing sound retrieval systems are mostly based on a textual query. Using text to describe a sound signal is not intuitive and is often inaccurate due to the subjective impression of the user; different people may use different words to describe the same sound, which makes these systems complex to design and unintuitive to use. Vocal imitation, however, is the most natural human way to describe a sound. In this paper we consider an emerging approach for sound retrieval based on vocal imitations, where the user records himself imitating the desired sound, and the system retrieves a ranked list of the most similar sounds in the dataset. In this work we represent sound signals using histograms, obtained with respect to a Gaussian Mixture Model (GMM), representing the spectral domain. This recently proposed approach was successfully applied for word representation in a keyword spotting task. Having a fixed-length representation for vocal imitation signals allows us to train a robust cla...
Many voice conversion systems require parallel training sets of the source and target speakers. Non-parallel training is more complicated as it involves evaluation of source-target correspondence along with the conversion function itself. INCA is a recently proposed method for non-parallel training, based on iterative estimation of alignment and conversion function. The alignment is evaluated using a simple nearest-neighbor search, which often leads to phonetically mismatched source-target pairs. We propose here a generalized approach, denoted as Temporal-Context INCA (TC-INCA), based on matching temporal context vectors. We formulate the training stage as a minimization problem of a joint cost, considering both context-based alignment and conversion function. We show that TC-INCA reduces the joint cost and prove its convergence. Experimental results indicate that TC-INCA significantly improves the alignment accuracy, compared to INCA. Moreover, subjective evaluations show that TC-IN...
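The difference between frame-wise nearest-neighbor alignment and alignment on temporal context vectors can be sketched as follows. This covers only the alignment step under simplifying assumptions: the iterative re-estimation of the conversion function is left out, context is built by stacking neighboring frames with edge frames repeated, and the names context_vectors and nearest_neighbor_alignment are hypothetical rather than taken from the paper.

```python
def context_vectors(frames, c=1):
    """Stack each frame with its c neighbors on both sides (edges are repeated)."""
    n = len(frames)
    out = []
    for t in range(n):
        stacked = []
        for dt in range(-c, c + 1):
            idx = min(max(t + dt, 0), n - 1)  # clamp to the valid frame range
            stacked.extend(frames[idx])
        out.append(stacked)
    return out


def nearest_neighbor_alignment(src, tgt):
    """Pair every source vector with the index of its closest target vector."""
    pairs = []
    for i, s in enumerate(src):
        best_j = min(
            range(len(tgt)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(s, tgt[j])),
        )
        pairs.append((i, best_j))
    return pairs
```

Calling nearest_neighbor_alignment on raw frames matches each frame in isolation, whereas calling it on context_vectors(src) and context_vectors(tgt) also takes the neighboring frames into account, so two acoustically similar frames with different surroundings are less likely to be paired.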
Clinical Brain-Computer Interface (BCI) systems seek to enable paralyzed individuals to operate devices with their brain activity. Non-invasive systems based on electroencephalographic (EEG) signals are popular since they avoid risks associated with invasive procedures, but unfortunately EEG signals are inherently noisy, making effective classifiers challenging to develop. Commonly, new classifiers are benchmarked on signals from healthy subjects executing physical movements, under the assumption that the performance will transfer to clinical cases where only imagined movements are possible. Here, we show in contrast that classifiers trained on signals associated with actual movements perform erratically when applied to signals associated with imagined movements. We suggest that this is because the signals lie in different domains. Then, to exploit the different statistical distributions, we apply a domain adaptation technique, Frustratingly Easy Domain Adaptation (FEDA), improving...
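For readers unfamiliar with FEDA, its core is a feature augmentation: every feature vector is copied into a shared block plus a domain-specific block, and a standard classifier is then trained on the augmented vectors from both domains. The minimal sketch below shows only that augmentation step; treating actual-movement signals as the "source" domain and imagined-movement signals as the "target" domain is an assumption made here for illustration, and the function name feda_augment is hypothetical.

```python
def feda_augment(x, domain):
    """Return the FEDA-augmented feature vector [shared, source-only, target-only]."""
    zeros = [0.0] * len(x)
    if domain == "source":
        # Source samples fill the shared and source-specific blocks.
        return list(x) + list(x) + zeros
    if domain == "target":
        # Target samples fill the shared and target-specific blocks.
        return list(x) + zeros + list(x)
    raise ValueError("domain must be 'source' or 'target'")
```

The shared block lets the classifier learn weights common to both domains, while the two domain-specific blocks absorb the differences between them.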