Tony Robinson | Cambridge Alumni
Uploads
Papers by Tony Robinson
Two important components of a speech archiving system are the compression scheme and the search facility. We investigate two ways of providing these components. The first is to run the recogniser directly from the compressed speech: we show that even with a 2.4 kbit/s codec it is possible to produce good recognition results, but the search is slow. The second is to preprocess the speech and store the extra data in a compressed form along with the speech. In the case of an RNN-HMM hybrid system, the posterior probabilities provide a suitable intermediate data format. Vector quantizing these at just 625 bit/s enables the search to run at many times real time while still maintaining good recognition accuracy.
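As an illustration of the vector quantization step mentioned in this abstract, here is a minimal sketch. It assumes a squared-Euclidean nearest-codeword rule and a fixed-rate index per frame; the codebook, the frame rate and the helper names (`nearest_codeword`, `quantize`, `bit_rate`) are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def nearest_codeword(vector: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codeword closest to `vector` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((v - c) ** 2 for v, c in zip(vector, codebook[i])),
    )


def quantize(frames: Sequence[Sequence[float]], codebook: Sequence[Sequence[float]]) -> List[int]:
    """Replace each posterior vector by the index of its nearest codeword."""
    return [nearest_codeword(frame, codebook) for frame in frames]


def bit_rate(codebook_size: int, frames_per_second: float) -> float:
    """Bits per second when each frame is stored as one fixed-length codebook index."""
    return frames_per_second * math.log2(codebook_size)


# Hypothetical three-class codebook and two frames of posteriors.
codebook = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
frames = [[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]]
indices = quantize(frames, codebook)  # [0, 2]
```

Only the indices need to be stored alongside the compressed speech, so the storage cost depends on the codebook size and the frame rate rather than on the number of phone classes.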
This paper investigates the scaling properties of Recurrent Neural Network Language Models (RNNLMs). We discuss how to train very large RNNs on GPUs and address the questions of how RNNLMs scale with respect to model size, training-set size, computational costs and memory. Our analysis shows that despite being more costly to train, RNNLMs obtain much lower perplexities on standard benchmarks than n-gram models. We train the largest known RNNs and present relative word error rate gains of 18% on an ASR task. We also present the new lowest perplexities on the recently released billion word language modelling benchmark, a 1 BLEU point gain on machine translation and a 17% relative hit rate gain in word prediction.
Acoustics Speech and Signal Processing 1988 Icassp 88 1988 International Conference on, May 24, 1999
This chapter contains sections titled: Objectives of the operator; Hydraulic ground-crop sprayers; Rotary atomizers; Granule applicators; Band sprayers; Knapsack sprayers; Sprayer faults; Errors in applying herbicides; Herbicide drift; Decontamination of sprayers and disposal of waste material; Storage of herbicides; References and further reading.
... DOLE [SIL] WHO ANNOUNCED HE IS RESIGNING FROM THE SENATE TO DEVOTE FULL TIME TO HIS QUEST FOR THE WHITE HOUSE [SIL ... Gethin Williams and Steve Renals at the University of Sheffield report using acoustic-based confidence measures derived from the ...
Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
ABBOT is the hybrid connectionist-hidden Markov model large-vocabulary speech recognition system developed at Cambridge University. In this system, a recurrent network maps each acoustic vector to an estimate of the posterior probabilities of the phone classes. The maximum likelihood word string is then extracted using Markov models. As in traditional hidden Markov models, the Markov process is used to model the lexical and language model constraints. This paper describes the system which participated in the ...
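The decoding idea in this abstract can be sketched in a few lines. This is a toy illustration, not the ABBOT decoder: it assumes that network posteriors are divided by class priors to obtain scaled likelihoods (a common convention in hybrid systems, not stated in the abstract), and it runs a plain Viterbi search over phone states rather than over a lexicon and language model. The function name and all numbers are hypothetical.

```python
import math
from typing import List, Sequence


def viterbi_from_posteriors(
    posteriors: Sequence[Sequence[float]],  # posteriors[t][s] = P(state s | frame t)
    priors: Sequence[float],                # priors[s] = P(state s)
    log_trans: Sequence[Sequence[float]],   # log_trans[p][s] = log P(s | p)
    log_init: Sequence[float],              # log_init[s] = log P(first state is s)
    floor: float = 1e-10,                   # avoids log(0) for zero posteriors
) -> List[int]:
    """Return the most likely state sequence given per-frame posteriors."""
    n_states = len(priors)

    def emit(t: int, s: int) -> float:
        # Scaled log-likelihood: log(posterior / prior).
        return math.log(max(posteriors[t][s], floor)) - math.log(priors[s])

    score = [log_init[s] + emit(0, s) for s in range(n_states)]
    backpointers: List[List[int]] = []
    for t in range(1, len(posteriors)):
        new_score, pointers = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best] + log_trans[best][s] + emit(t, s))
            pointers.append(best)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-state example with "sticky" transitions.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
priors = [0.5, 0.5]
log_trans = [[math.log(0.8), math.log(0.2)], [math.log(0.2), math.log(0.8)]]
log_init = [math.log(0.5), math.log(0.5)]
best_path = viterbi_from_posteriors(posteriors, priors, log_trans, log_init)  # [0, 0, 1, 1]
```

In a full recogniser the transition structure would encode pronunciations and language model probabilities, so the best path would correspond to a word string rather than a raw state sequence.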
This thesis extends the error propagation network to deal with time varying or dynamic patterns. Examples are given of supervised, reinforcement driven and unsupervised learning.
Chapter 1 presents an overview of connectionist models.
Chapter 2 introduces the error propagation algorithm for general node types.
Chapter 3 discusses the issue of data representation in connectionist models.
Chapter 4 describes the use of several types of networks applied to the problem of the recognition of steady state vowels from multiple speakers.
Chapter 5 extends the error propagation algorithm to deal with time varying input. Three possible architectures are explored which deal with learning sequences of known length and sequences of unknown and possibly indefinite length. Several simple examples are given.
Chapter 6 describes the use of two dynamic nets to form a speech coder. The popular method of Differential Pulse Code Modulation for speech coding employs two linear filters to encode and decode speech. By generalising these to non-linear filters, implemented as dynamic nets, a reduction in the noise imposed by a limited bandwidth channel is achieved.
Chapter 7 describes the application of a dynamic net to the recognition of a large subset of the phonemes of English from continuous speech. The dynamic net is found to give a higher recognition rate both in comparison with a fixed window net and with the established k nearest neighbour technique.
Chapter 8 describes a further development of dynamic nets which allows them to be trained by a reinforcement signal which expresses the correctness of the output of the net. Two possible architectures are given and an example of learning to play the game of noughts and crosses is presented.
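For readers unfamiliar with the Differential Pulse Code Modulation scheme that Chapter 6 takes as its starting point, the following is a minimal sketch of the conventional linear case only. It assumes a first-order linear predictor and an unbounded uniform quantizer; the step size, predictor coefficient and function names are illustrative choices, not values from the thesis, and the non-linear dynamic-net generalisation is not shown.

```python
from typing import List, Sequence


def dpcm_encode(samples: Sequence[float], step: float = 0.1, coeff: float = 0.9) -> List[int]:
    """Quantize the difference between each sample and a linear prediction of it."""
    codes: List[int] = []
    recon = 0.0  # the encoder tracks the decoder's reconstruction
    for x in samples:
        pred = coeff * recon                 # first-order linear prediction
        code = round((x - pred) / step)      # uniform quantization of the residual
        codes.append(code)
        recon = pred + code * step           # what the decoder will reconstruct
    return codes


def dpcm_decode(codes: Sequence[int], step: float = 0.1, coeff: float = 0.9) -> List[float]:
    """Rebuild the signal by adding each dequantized residual to the same prediction."""
    out: List[float] = []
    recon = 0.0
    for code in codes:
        recon = coeff * recon + code * step
        out.append(recon)
    return out


# Hypothetical short signal.
signal = [0.0, 0.3, 0.5, 0.45, 0.1, -0.2, -0.6, -0.4]
codes = dpcm_encode(signal)
decoded = dpcm_decode(codes)
```

Because the encoder predicts from its own reconstruction rather than from the clean input, quantization error does not accumulate: each decoded sample differs from the original by at most half a quantization step.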