Howard Rothman - Academia.edu

Papers by Howard Rothman

Abstract: Data are presented on relative speech intelligibility as processed via unscramblers under two conditions: on-line and off-line. The report provides a discussion of the strengths and weaknesses of current approaches to HeO2 speech processing and makes recommendations for the improvement of these devices.

IEEE Transactions on Audio and Electroacoustics, 1973

The development of saturation diving has enabled man to work in the sea at great depths and for long periods of time. This advance has resulted, in part, from the substitution of helium for nitrogen in breathing gas mixtures. However, the utilization of HeO2 breathing mixtures at high ambient pressures has caused problems in speech communication; in turn, electronic aids have been developed to improve diver communication. These helium speech unscramblers attempt to process variously the grossly unintelligible speech resulting from the effects of helium-oxygen breathing mixtures and ambient pressure, and to reconstruct such signals in order to provide adequate voice communication. This paper presents a discussion of the effects of HeO2/P on speech and then describes some of the techniques used to "unscramble" the distorted speech. Included among the techniques are: 1) frequency subtraction; 2) tape recorder playback; 3) vocoder approaches; 4) digital coding; and 5) convolution processing. In addition, a generalized evaluation of these approaches is included.

1970 IEEE International Conference on Engineering in the Ocean Environment - Digest of Technical Papers, 1970


Revista Ingenieria Uc, 2005


The detection of fundamental frequency (Fo) in speech has often been shown to be a particularly difficult signal processing problem. This parameter is a necessary one for documenting vocal fold vibration and alterations to these vibratory patterns in the presence of pathology. There exist a variety of algorithms for extracting and analyzing Fo. It has been reported that these techniques do not work well for different types of talkers and decrease in performance as the noise level increases. The main objective of this research was to develop a robust algorithm for the extraction and analysis of Fo from normal and pathological voices. This algorithm is based on the spectrogram and makes use of artificial intelligence techniques to extract Fo. An algorithm was developed and tested with 6 normal and 6 abnormal samples, each containing a sustained vowel. These 12 samples were also analyzed by two commercial software packages, which make use of other techniques, and the results were compare...

The main objective of this paper is to contribute to the classification of voice quality. This contribution is achieved by analysing a set of proposed parameters (PMR, SNR, SNRL, SNRM, SNRH), obtained in the frequency domain after transforming the signal using the DFT. In this work, twenty-four voice samples from pathological and healthy voices were analysed, several parameters were proposed, and the results were statistically tested, which yielded three statistically significant parameters: PMR, SNRM and SNRH. These parameters have not been found in the reports of investigations performed by other researchers, in spite of the great number of articles in this field. Therefore this work can be considered a novel contribution.

Undersea biomedical research, 1980

Word-list intelligibility scores of unprocessed speech (mean of 4 subjects) were recorded in helium-oxygen atmospheres at stable pressures equivalent to 1600, 1400, 1200, 1000, 860, 690, 560, 392, and 200 fsw during Predictive Studies IV-1975 by wide-bandwidth condenser microphones (frequency responses not degraded by increased gas density). Intelligibility scores were substantially lower in helium-oxygen at 200 fsw than in air at 1 ATA, but there was little difference between 200 fsw and 1600 fsw. A previously documented prominent decrease in intelligibility of speech between 200 and 600 fsw because of helium and pressure was probably due to degradation of microphone frequency response by high gas density.

IEEE Transactions on Communications, 1971

This investigation was conducted as part of a program of research designed 1) to develop several methodologies for the evaluation of diver communication systems and 2) to carry out these evaluations on available units. The major focus of this report is on a diver-to-diver procedure and data resulting from the evaluation of seven diver communication systems, viz: a) hard-line Aquaphone; b)

The Laryngoscope, 1985

Postlaryngectomy speech rehabilitation more frequently includes surgical-prosthetic methods since the introduction of a low morbidity tracheoesophageal puncture technique and a one-way airflow valve. This study compares speech using an artificial larynx and, in one case, esophageal speech with speech using a tracheoesophageal puncture and valve in the same speaker. Using nonprofessional listeners, speech was rated for intelligibility and preference. Voice spectrograms were employed for measurement of rate, fundamental frequency, and intensity. While no statistically significant differences were found in mean fundamental frequency or intensity, the rate of post-tracheoesophageal speech was considerably faster. In addition, when individual speakers are compared with themselves, post-tracheoesophageal speech is significantly more intelligible and preferred by naive listeners. We conclude that using the tracheoesophageal puncture with valve should be strongly considered in total laryngectomy patients whose present mode of communication is unsatisfactory.

Revista Ingenieria UC, 2004

Journal of Voice, 2003

Vocal training (VT) has, in part, been associated with the distinctions in the physiological, acoustic, and perceptual parameters found in singers' voices versus the voices of nonsingers. This study provides information on the changes in the singing voice as a function of VT over time. Fourteen college voice majors (12 females and 2 males; age range, 17-20 years) were recorded while singing, once a semester, for four consecutive semesters. Acoustic measures included fundamental frequency (F0) and sound pressure level (SPL) of the 10% and 90% levels of the maximum phonational frequency range (MPFR), vibrato pulses per second, vibrato amplitude variation, and the presence of the singer's formant. Results indicated that VT had a significant effect on the MPFR. F0 and SPL of the 90% level of the MPFR and the 90-10% range increased significantly as VT progressed. However, no vibrato or singer's formant differences were detected as a function of training. This longitudinal study not only validates previous cross-sectional research, i.e., that VT has a significant effect on the singing voice, but also demonstrates that these effects can be acoustically detected by the fourth semester of college vocal training.

Journal of Voice, 2000

Acoustic and perceptual analyses were completed to determine the effect of vocal training on professional singers when speaking and singing. Twenty professional singers and 20 nonsingers, acting as the control, were recorded while sustaining a vowel, reading a modified Rainbow Passage, and singing "America the Beautiful." Acoustic measures included fundamental frequency, duration, percent jitter, percent shimmer, noise-to-harmonic ratio, and determination of the presence or absence of both vibrato and the singer's formant. Results indicated that, whereas certain acoustic parameters differentiated singers from nonsingers within sex, no consistently significant trends were found across males and females for either speaking or singing. The most consistent differences were the presence or absence of the singer's vibrato and formant in the singers versus the nonsingers, respectively. Perceptual analysis indicated that singers could be correctly identified with greater frequency than by chance alone from their singing, but not their speaking utterances.
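
As an illustration of the perturbation measures named in this abstract, the following minimal sketch computes percent jitter and percent shimmer using one common "local" definition: the mean absolute difference between consecutive cycles, divided by the mean value. The abstract does not say which definition or software the study used, so the formula choice, the function name `percent_perturbation`, and the sample values are all assumptions made for illustration.

```python
def percent_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Applied to glottal period lengths this gives a local percent jitter;
    applied to per-cycle peak amplitudes it gives a local percent shimmer.
    This is one common definition, assumed here for illustration only.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_abs_diff / mean_value


# Hypothetical measurements from a sustained vowel (not data from the study).
periods_ms = [5.00, 5.05, 4.95, 5.00]       # successive glottal periods
peak_amplitudes = [1.00, 0.98, 1.02, 1.00]  # successive cycle peak amplitudes

jitter_percent = percent_perturbation(periods_ms)
shimmer_percent = percent_perturbation(peak_amplitudes)
```

The same function serves both measures because, under this definition, jitter and shimmer differ only in the quantity being tracked from cycle to cycle.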

Journal of Voice, 2002

From postrecording interviews of professional singers, it was hypothesized that recording environments, i.e., sound-treated environment versus an auditorium, may induce different vocal behaviors. To test this hypothesis, three groups consisting of nonsingers, singers, and actors were recorded in two different recording environments: a sound-treated booth (IAC) and an auditorium (AUD). Three recordings were obtained from each participant: recording one (IAC) and two (AUD1) required the participants to read in a normal voice; recording three (AUD2) required participants to pretend that they were "performing" before a full house. Results indicated that only the singers and the actors exhibited significant spectral and/or frequency/duration differences from one recording environment to another, with the most dramatic differences exhibited by the singers. It was concluded that the environment in which we record experimental samples from professional voice users, especially singers, should be considered as a variable that can affect results.

Journal of Voice, 2004

This longitudinal study gathered data with regard to the question: Does singing training have an effect on the speaking voice? Fourteen voice majors (12 females and two males; age range 17 to 20 years) were recorded once a semester for four consecutive semesters, while sustaining vowels and reading the "Rainbow Passage." Acoustic measures included speaking fundamental frequency (SFF) and sound pressure level (SPL). Perturbation measures included jitter, shimmer, and harmonic-to-noise ratio. Temporal measures included sentence, consonant, and diphthong durations. Results revealed that, as the number of semesters increased, the SFF increased while jitter and shimmer slightly decreased. Repeated measure analysis, however, indicated that none of the acoustic, temporal, or perturbation differences were statistically significant. These results confirm earlier cross-sectional studies that compared singers with nonsingers, in that singing training mostly affects the singing voice and rarely the speaking voice.

Journal of Voice, 1987

Journal of Voice, Volume 1, Issue 2, Pages 168-171, 1987. Authors: Robert F. Coleman; Jean Hakes; Douglas M. Hicks; John F. Michel; Lorraine A. Ramig; Howard B. Rothman.

Journal of Voice, 1987

Historically, studies of vocal vibrato have concentrated on pulse rate as being a primary factor in determining whether a given vocal movement is a good or bad vibrato or a tremolo or wobble. More recently, investigators have been studying the extent of frequency variation ...

Journal of Voice, 1990

Recent papers by Rothman and Timberlake (1), Rothman (2), Rothman and Arroyo (3), and Keidar, Titze, and Timberlake (4) have focused on the pulse rate, frequency extent, and amplitude extent of vibrato. Some of the emphases of these papers were attempts to clarify the ...

Journal of Voice, 2010

To specify a set of acoustic cues for vocal aging and to establish their perceptual relevance. Study Design. Perceptual testing. Methods. To identify the acoustic and perceptual correlates of the aging voice, voice quality [in conjunction with speaking rate and fundamental frequency (F0)] was systematically manipulated using resynthesis to determine its effect on perceived age. Ten young male voices were resynthesized using two levels of noise (random modulation of F0 contour) and two levels of tremor (constant modulation of F0 contour with a low-amplitude wave) under a speaking-rate manipulation (an increase in speaking rate that is common to older male voices). These materials were submitted to 40 naive listeners in an age-estimation task. Two sets of comparison materials were also included for evaluation: unmanipulated samples from a 150-voice database of young, middle-aged, and older voices and disordered voice samples representing natural manifestations of the voice qualities of interest. Results. Speaking rate, highest degree of tremor, and highest degree of noise all shifted, in an additive manner, the mean perceived age of the young male voices by a maximum of 12 years on average; individual voices were observed being shifted by a generation. Fundamental frequency manipulations had no significant effect on perceived age. Conclusions. Voice quality (both tremor and noise) and speaking rate are all perceptually relevant cues of age in male voices.
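
The two F0-contour manipulations described in this abstract (tremor as a constant, low-amplitude periodic modulation and noise as a random modulation) can be sketched roughly as follows. This is only an illustration of the idea, not the study's resynthesis procedure: the function names, the 5 Hz tremor rate, the 2% depths, the uniform noise distribution, and the flat input contour are all assumptions.

```python
import math
import random


def add_tremor(f0_hz, frame_rate_hz, tremor_hz=5.0, depth=0.02):
    """Modulate an F0 contour with a constant low-amplitude sine wave.

    tremor_hz and depth are hypothetical values chosen for illustration.
    """
    return [
        f * (1.0 + depth * math.sin(2.0 * math.pi * tremor_hz * n / frame_rate_hz))
        for n, f in enumerate(f0_hz)
    ]


def add_noise(f0_hz, depth=0.02, seed=0):
    """Modulate an F0 contour with a random per-frame perturbation.

    A uniform distribution is assumed; the abstract does not specify one.
    """
    rng = random.Random(seed)
    return [f * (1.0 + rng.uniform(-depth, depth)) for f in f0_hz]


# Hypothetical flat contour: 120 Hz for 1 second at 100 frames per second.
frame_rate = 100
contour = [120.0] * frame_rate

tremor_contour = add_tremor(contour, frame_rate)
noisy_contour = add_noise(contour)
```

In both cases the contour keeps its length and stays within the chosen depth of the original values, so a "second level" of either manipulation would simply use a larger `depth`.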

Journal of Voice, 2000

This study is an attempt to ascertain if singers from different traditions and milieus follow similar aesthetic trends regardless of training and/or background. Cantors who sang the Jewish synagogue liturgy during the Golden Age of cantorial singing prior to World War II came from Eastern and Central Europe. For the most part, they were not trained in the classical Western opera tradition. They received training from choir leaders and other cantors and the training was primarily in the modes of synagogue chant. Cantors today receive the same kinds of training that opera singers receive, often from the same teachers. Four groups of singers, consisting of four singers in each group, were utilized in this study. The four groups are: historical opera singers, contemporary opera singers, historical cantors, and contemporary cantors. The historical opera singer recordings date from as early as 1909 to as late as 1939. It was not possible to determine the dates of the historical cantor recordings. However, the four cantors chosen for this group were active only to the 1940s. Contemporary samples were taken from CDs and/or live recordings and all the singers from the contemporary groups are either still active or were active in the 1960s through the 1980s and all of them are considered to be premier-level singers in their respective areas. The variables analyzed were: vibrato pulse rate, frequency variation of the vibrato pulse above and below the mean sustained sung frequency in percent, the mean amplitude variation of the amplitude vibrato pulse above and below the mean sustained amplitude in percent and the fast Fourier transform (FFT) power spectrum of the sustained samples. Results indicate that most of the significant differences were found between eras and not between groups within a time period.
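
Two of the variables listed in this abstract, vibrato pulse rate and frequency variation above and below the mean sustained frequency in percent, can be illustrated with a minimal sketch. It assumes an F0 track has already been extracted at evenly spaced frames, and it estimates the rate by counting upward crossings of the mean. The abstract does not describe the measurement procedure, so the method, the function name `vibrato_measures`, and the synthetic input are assumptions.

```python
import math


def vibrato_measures(f0_hz, frame_rate_hz):
    """Return (pulse_rate_hz, percent_above_mean, percent_below_mean).

    f0_hz: F0 samples in Hz from a sustained note (hypothetical input).
    frame_rate_hz: number of F0 samples per second.
    """
    mean_f0 = sum(f0_hz) / len(f0_hz)
    deviations = [f - mean_f0 for f in f0_hz]
    # Each upward crossing of the mean marks one vibrato cycle.
    upward = sum(1 for a, b in zip(deviations, deviations[1:]) if a < 0 <= b)
    duration_s = len(f0_hz) / frame_rate_hz
    pulse_rate = upward / duration_s
    # Peak excursions above and below the mean, as a percentage of the mean.
    above = 100.0 * (max(f0_hz) - mean_f0) / mean_f0
    below = 100.0 * (mean_f0 - min(f0_hz)) / mean_f0
    return pulse_rate, above, below


# Synthetic sustained note: 440 Hz mean, 5.5 Hz vibrato, 3% extent,
# 2 seconds at 100 frames per second (illustrative values only).
frame_rate = 100
track = [
    440.0 * (1.0 + 0.03 * math.cos(2.0 * math.pi * 5.5 * n / frame_rate))
    for n in range(200)
]
rate, above, below = vibrato_measures(track, frame_rate)
```

The amplitude-vibrato variable in the abstract could be computed the same way by passing a per-frame amplitude track instead of an F0 track.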
