Speaker Verification Using Support Vector Machines and High-Level Features (original) (raw)

2007, IEEE Transactions on Audio, Speech and Language Processing

High-level characteristics such as word usage, pronunciation, phonotactics, prosody, etc., have seen a resurgence for automatic speaker recognition over the last several years. With the availability of many conversation sides per speaker in current corpora, high-level systems now have the amount of data needed to sufficiently characterize a speaker. Although a significant amount of work has been done in finding novel high-level features, less work has been done on modeling these features. We describe a method of speaker modeling based upon support vector machines. Current high-level feature extraction produces sequences or lattices of tokens for a given conversation side. These sequences can be converted to counts and then frequencies of-gram for a given conversation side. We use support vector machine modeling of these n-gram frequencies for speaker verification. We derive a new kernel based upon linearizing a log likelihood ratio scoring system. Generalizations of this method are shown to produce excellent results on a variety of high-level features. We demonstrate that our methods produce results significantly better than standard log-likelihood ratio modeling. We also demonstrate that our system can perform well in conjunction with standard cesptral speaker recognition systems.

Sign up for access to the world's latest research.

checkGet notified about relevant papers

checkSave papers to use in your research

checkJoin the discussion with peers

checkTrack your impact