Structure and performance of a dependency language model
We present a maximum entropy language model that incorporates both syntax and semantics via a dependency grammar. Such a grammar expresses the relations between words by a directed graph. Because the edges of this graph may connect words that are arbitrarily far apart in a sentence, this technique can incorporate the predictive power of words that lie outside of bigram or trigram range. We have built several simple dependency models, as we call them, and tested them in a speech recognition experiment. We report experimental results for these models here, including one that has a small but statistically significant advantage (p < .02) over a bigram language model.
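For concreteness, a conditional maximum entropy model of this kind typically takes the following form; the dependency feature shown is an illustrative sketch, not necessarily the exact feature set used in the paper:

\[
  P(w \mid h) = \frac{1}{Z_\lambda(h)} \exp\!\Big( \sum_i \lambda_i f_i(h, w) \Big),
  \qquad
  Z_\lambda(h) = \sum_{w'} \exp\!\Big( \sum_i \lambda_i f_i(h, w') \Big),
\]

where $h$ is the conditioning history, $Z_\lambda(h)$ normalizes over the vocabulary, and each binary feature $f_i$ fires on some property of the pair $(h, w)$. Alongside ordinary $n$-gram features, a hypothetical dependency feature could be

\[
  f_{v}(h, w) =
  \begin{cases}
    1 & \text{if the dependency parse of $h$ links $w$ to the head word $v$,}\\
    0 & \text{otherwise,}
  \end{cases}
\]

so a head word $v$ arbitrarily far back in the sentence can still contribute to $P(w \mid h)$, which is how such a model reaches beyond bigram or trigram range.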