John Hogden - Academia.edu
Papers by John Hogden
8th European Conference on Speech Communication and Technology (Eurospeech 2003)
We discuss speech production in terms of a mapping from a low-dimensional articulator space to a low-dimensional manifold embedded in a high-dimensional acoustic space. Our discussion highlights the advantages of using an articulatory representation of speech. We then summarize mathematical results showing that, because articulator motions are bandlimited, a large class of mappings from articulation to acoustics can be blindly inverted. Simulation results showing the power of the inversion technique are also presented. One of the most interesting simulation results is that some many-to-one mappings can also be inverted. These results explain earlier experimental results showing that the studied technique can recover articulator positions. We conclude that our technique has many advantages for speech processing, including invariance with respect to various nonlinearities and the ability to exploit context more easily.
This dataset contains features (combined, ionic radii, and electronegativity) and band gap (Eg) values for apatites in their lowest energy structure.
Bulletin of the American Physical Society, 2016
2 AX family of compounds for which the elastic properties [bulk (B), shear (G), and Young's (E) modulus] have been computed using density functional theory. The strategy is decomposed into two steps: a regressor is trained to predict elastic properties in terms of elementary orbital radii of the individual components of the materials; and a selector uses these predictions to choose the next material to investigate. The ultimate goal is to obtain a material with desired elastic properties. We examine how the choice of data set size, regressor and selector impact the results.
Materials Discovery and Design, 2018
In materials informatics, features (or descriptors) that capture trends in the structure, chemistry and/or bonding for a given chemical composition are crucial. Here, we explore their role in the accelerated search for new materials using machine learning adaptive design. We focus on a specific class of materials referred to as apatites [A\(_{10}\)(BO\(_4\))\(_6\)X\(_2\)] and our objective is to identify an apatite compound with the largest band gap (E\(_g\)) without performing density functional theory calculations over the entire composition space. We construct three datasets that use three sets of features of the A, B, and X-ions (ionic radii, electronegativities, and the combination of both) and independently track which of these sets most rapidly finds the composition with the largest E\(_g\). We find that the combined feature set performs best, followed by the ionic radii feature set. The reason for this ranking is the B-site ionic radius, which is the key E\(_g\)-governing feature and appears in both the ionic radii and combined feature sets. Our results show that a relatively poor ML model with large error but one that contains key features can be more efficient in accelerating the search than a low-error model that lacks such features.
Current Opinion in Solid State and Materials Science, 2017
A key aspect of the developing field of materials informatics is optimally guiding experiments or calculations towards parts of the relatively vast feature space where a material with a desired property may be discovered. We discuss our approach to adaptive experimental design and the methods developed in decision theory and global optimization which can be used in materials science. We show that the use of uncertainties to trade off exploration versus exploitation to guide new experiments or calculations generally leads to enhanced performance, highlighting the need to evaluate and incorporate errors in predictive materials design. We illustrate our ideas on a computed data set of M\(_2\)AX phases generated using ab initio calculations to find the sample with the optimal elastic properties, and discuss how our approach leads to the discovery of new NiTi-based alloys with the smallest thermal dissipation.
Nature Communications, 2016
Finding new materials with targeted properties has traditionally been guided by intuition, and trial and error. With increasing chemical complexity, the combinatorial possibilities are too large for an Edisonian approach to be practical. Here we show how an adaptive design strategy, tightly coupled with experiments, can accelerate the discovery process by sequentially identifying the next experiments or calculations, to effectively navigate the complex search space. Our strategy uses inference and global optimization to balance the trade-off between exploitation and exploration of the search space. We demonstrate this by finding very low thermal hysteresis (ΔT) NiTi-based shape memory alloys, with Ti\(_{50.0}\)Ni\(_{46.7}\)Cu\(_{0.8}\)Fe\(_{2.3}\)Pd\(_{0.2}\) possessing the smallest ΔT (1.84 K). We synthesize and characterize 36 predicted compositions (9 feedback loops) from a potential space of ~800,000 compositions. Of these, 14 had smaller ΔT than any of the 22 in the original data set.
Information Science for Materials Discovery and Design, 2015
We review how classification and regression methods have been used on materials problems and outline a design loop that serves as a basis for adaptively finding materials with targeted properties.
Scientific Reports, 2016
We compare several adaptive design strategies using a data set of 223 compounds from the M\(_2\)AX family for which the elastic properties [bulk (B), shear (G), and Young's (E) modulus] have been computed using density functional theory. The design strategies are decomposed into an iterative loop with two main steps: machine learning is used to train a regressor that predicts elastic properties in terms of elementary orbital radii of the individual components of the materials; and a selector uses these predictions and their uncertainties to choose the next material to investigate. The ultimate goal is to obtain a material with desired elastic properties in as few iterations as possible. We examine how the choice of data set size, regressor and selector impact the design. We find that selectors that use information about the prediction uncertainty outperform those that don't. Our work is a step in illustrating how adaptive design tools can guide the search for new materials with desired properties.
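The regressor-plus-selector loop described in this abstract can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the regressor is a bootstrap ensemble of one-feature least-squares lines, the selector is a simple "mean plus kappa times standard deviation" rule, and the names `fit_line`, `bootstrap_predict`, `adaptive_design`, `kappa`, and the toy data are all hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit y = a + b*x; falls back to a constant if x has no spread."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bootstrap_predict(xs, ys, x_new, rng, n_models=30):
    """Mean and standard deviation of predictions from bootstrap-resampled fits."""
    preds = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(a + b * x_new)
    return statistics.fmean(preds), statistics.pstdev(preds)


def adaptive_design(features, oracle, initial, n_iter, kappa=1.0, seed=0):
    """Iteratively measure the candidate with the highest mean + kappa * std score."""
    rng = random.Random(seed)
    measured = {i: oracle(i) for i in initial}
    for _ in range(n_iter):
        unmeasured = [i for i in range(len(features)) if i not in measured]
        if not unmeasured:
            break
        xs = [features[i] for i in measured]
        ys = [measured[i] for i in measured]

        def score(i):
            # Selector (assumed form): trade off predicted value against uncertainty.
            mean, std = bootstrap_predict(xs, ys, features[i], rng)
            return mean + kappa * std

        chosen = max(unmeasured, key=score)
        measured[chosen] = oracle(chosen)  # stands in for a DFT calculation or experiment
    return measured


# Toy, made-up data: one feature per candidate and a hidden property to maximise.
features = [i / 10 for i in range(20)]
props = [-(x - 1.3) ** 2 for x in features]
result = adaptive_design(features, lambda i: props[i], initial=[0, 5, 10], n_iter=4)
best = max(result, key=result.get)
print("measured candidates:", sorted(result), "best so far:", best)
```

Setting `kappa=0` turns the selector into pure exploitation of the predicted mean, which gives a simple way to compare selectors with and without uncertainty information on the same toy data.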
Thesis (Ph. D.)--Stanford University, 1991. Submitted to the Department of Psychology. Copyright by the author.
This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
A new approach to the compression of vector quantized (VQ) speech sequences is evaluated. The technique uses a method called maximum likelihood continuity mapping to learn a mapping between articulation and speech acoustics. Smooth articulator paths are then derived from VQ code sequences. The paths are subsequently sampled, quantized, and transmitted along with additional information that allows perfect recovery of the VQ code sequences. A decoder takes the transmitted articulator paths and recovers the correct VQ code sequence for resynthesis of the speech waveform. The algorithm has not achieved compression yet, requiring an average of 6.04 bits/frame to transmit a 6 bit VQ code sequence, and 9.96 bits/frame to transmit a 10 bit VQ code sequence. However, modifications to the algorithm are currently under investigation, and we expect to implement improvements to help us compress VQ code sequences. Results of the improved algorithm will be presented at the conference.
Psychology of Learning and Motivation, 1988
Publisher summary: This chapter discusses perceptual processes and presents a model of the relation between the perceptual processing of some stimulus and the quality of the stimulus's eventual memory representation. Perceptual processing occurs in conjunction with the conscious awareness of the to-be-encoded stimulus. It is the existence of such awareness that underlies the perceptual/conceptual dichotomy; there must be a raw-information extraction process that can only occur if the stimulus is phenomenologically present. The perceptual processing model encompasses the phenomenological awareness of a visual stimulus. The model is divided into two forms: general and quantitative. The general form is composed of five qualitative assumptions that may correspond to psychological reality. In the quantitative form of the model, two of these qualitative assumptions are replaced with corresponding quantitative forms. The quantitative assumptions are stronger than the qualitative counterparts.
Although stochastic models of speech signals (e.g. hidden Markov models, trigrams, etc.) have led to impressive improvements in speech recognition accuracy, it has been noted that these models have little relationship to speech production (Lee, 1989), and their recognition performance on some important tasks is far from perfect. However, there have been recent attempts to bridge the gap between speech
Speech Communication, 2007
Motor theories, which postulate that speech perception is related to linguistically significant movements of the vocal tract, have guided speech perception research for nearly four decades but have had little impact on automatic speech recognition. In this paper, we describe a signal processing technique named MIMICRI that may help link motor theory with automatic speech recognition by providing a practical
The Journal of the Acoustical Society of America, 1996
Vocal tract models are often used to study the problem of mapping from the acoustic transfer function to the vocal tract area function (inverse mapping). Unfortunately, results based on vocal tract models are strongly affected by the assumptions underlying the models. In this study, the mapping from acoustics (digitized speech samples) to articulation (measurements of the positions of receiver coils placed on the tongue, jaw, and lips) is examined using human data from a single speaker: simultaneous acoustic and articulator measurements made for vowel-to-vowel transitions, /,/ closures, and transitions into and out of /,/ closures. Articulator positions were measured using an EMMA system to track coils placed on the lips, jaw, and tongue. Using these data, look-up tables were created that allow articulator positions to be estimated from acoustic signals. On a data set not used for making look-up tables, correlations between estimated and actual coil positions of around 94% and root-mean-squared errors around 2 mm are common for coils on the tongue. An error source evaluation shows that estimating articulator positions from quantized acoustics gives root-mean-squared errors that are typically less than 1 mm greater than the errors that would be obtained from quantizing the articulator positions themselves. This study agrees with and extends previous studies of human data by showing that for the data studied, speech acoustics can be used to accurately recover articulator positions.
The Journal of the Acoustical Society of America, 1993
The Journal of the Acoustical Society of America, 1996
An algorithm called maximum likelihood continuity mapping (MALCOM) will be presented. MALCOM recovers the positions of the tongue, jaw, and lips from measurements of the sound-pressure waveform of speech. Unlike other techniques for recovering articulator positions from speech, MALCOM does not require training on measured or modeled articulator positions, and MALCOM does not rely on any particular model of sound propagation through the vocal tract. The algorithm categorizes short-time windows of speech into a finite number of sound types, and assumes the probability of using any articulator position to produce a given sound type can be described by a parametrized probability density function. MALCOM uses maximum likelihood estimation techniques to: (1) find the most likely smooth articulator path given a speech sample and a set of probability density functions (one density function for each sound type); and (2) change the parameters of the probability density functions to better account for the data. The ...
The Journal of the Acoustical Society of America, 1992
The Journal of the Acoustical Society of America, 1992
A procedure is demonstrated for learning to recover the relative positions of simulated articulators from speech signals generated by articulatory synthesis. The algorithm learns without supervision, that is, it does not require information about which articulator configurations created the acoustic information in the training set. The procedure consists of vector quantizing short time windows of a speech signal, then using multidimensional scaling to represent quantization codes that were temporally close in the encoded speech signal by nearby points in a continuity map. Since temporally close sounds must have been produced by similar articulator configurations, sounds which were produced by similar articulator positions should be represented close to each other in the continuity map. Continuity maps were made from parameters (the first three formant center frequencies) derived from acoustic signals produced by an articulatory synthesizer that could vary the height and degree of fronting of the tongue body. The procedure was evaluated by comparing estimated articulator positions with those used during synthesis. High rank-order correlations (0.95 to 0.99) were found between the estimated and actual articulator positions. Reasonable estimates of relative articulator positions were made using 32 categories of sound and the accuracy improved when more sound categories were used. Footnotes: Appears in Bulletin Communication Parlee (1994). Author affiliation: Institute for Mathematical Behavioral Sciences, University of California at Irvine.
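The continuity-map idea in this abstract (codes that occur close together in time are placed close together in the map) can be illustrated with a small sketch. The abstract does not say which dissimilarity measure or scaling variant was used, so everything specific here is an assumption: the dissimilarity is the fewest frame-to-frame transitions linking two codes, the scaling is classical (metric) multidimensional scaling reduced to one dimension via power iteration, and the function names and the toy code sequence are hypothetical.

```python
import math
from collections import deque


def transition_distances(codes, n_codes):
    """Assumed dissimilarity: fewest frame-to-frame transitions linking two VQ codes."""
    neighbors = [set() for _ in range(n_codes)]
    for a, b in zip(codes, codes[1:]):
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    # Pairs never linked by any chain of transitions keep a value larger than any path.
    dist = [[float(n_codes)] * n_codes for _ in range(n_codes)]
    for start in range(n_codes):
        seen = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search over the code-transition graph
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for v, d in seen.items():
            dist[start][v] = float(d)
    return dist


def classical_mds_1d(dist, n_steps=200):
    """One-dimensional classical MDS: leading eigenvector of the double-centred matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    gram = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]
    vec = [float(i + 1) for i in range(n)]  # arbitrary starting vector
    eigval = 0.0
    for _ in range(n_steps):  # power iteration
        nxt = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return [0.0] * n
        vec = [x / norm for x in nxt]
        eigval = norm
    return [math.sqrt(eigval) * v for v in vec]


# Toy code sequence from a hypothetical articulator sweeping back and forth.
codes = [0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2]
coords = classical_mds_1d(transition_distances(codes, n_codes=5))
print([round(c, 3) for c in coords])
```

In this toy case the codes come out ordered along a line in the same order as the sweep that produced them, which is the sense in which relative (not absolute) positions are recovered; the sign and origin of the map are arbitrary.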