The learnability consequences of Zipfian distributions: Word Segmentation is Facilitated in More Predictable Distributions
One of the striking commonalities between languages is the way word frequencies are distributed. Across languages, word frequencies follow a Zipfian distribution, showing a power law relation between a word's frequency and its rank (Zipf, 1949). Intuitively, this means that languages have relatively few high-frequency words and many low-frequency ones. While these distributions have been studied extensively, little work has explored the learnability consequences of the greater predictability of words in them. Here, we propose that such distributions confer a learnability advantage for word segmentation, a foundational aspect of language acquisition. We capture the greater predictability of words using the information-theoretic notion of efficiency, which tells us how predictable a distribution is relative to a uniform one. We first use corpus analyses to show that child-directed speech is similarly predictable across fifteen different languages. We then experimentally investigate the impact of distribution predictability on children and adults. We show that word segmentation is uniquely facilitated at the predictability levels found in language, compared both with uniform distributions and with skewed distributions that are less predictable than those of natural language. We further show that distribution predictability impacts learning more than distribution shape, and that learning is not improved further in distributions more predictable than natural language. These novel findings illustrate learners' sensitivity to the overall predictability of the linguistic environment; suggest that the predictability levels found in language provide an optimal environment for learning; and point to the possible role of cognitive pressures in the emergence and propensity of such distributions in language.
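To make the notion of efficiency concrete, the following is a minimal sketch and not the authors' analysis code. It assumes efficiency is computed as normalized Shannon entropy, that is, the entropy of the word distribution divided by the entropy of a uniform distribution over the same number of word types, so that a uniform distribution scores 1 and more predictable distributions score lower. The function names (`zipf_probs`, `entropy_bits`, `efficiency`) and the four-word example lexicon are illustrative assumptions.

```python
import math


def zipf_probs(n_types, exponent=1.0):
    """Probabilities proportional to 1 / rank**exponent (an idealized Zipfian shape)."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_types + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability items contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def efficiency(probs):
    """Assumed definition: entropy divided by the maximum entropy log2(N)."""
    n_types = len(probs)
    if n_types < 2:
        raise ValueError("efficiency needs at least two word types")
    return entropy_bits(probs) / math.log2(n_types)


# Hypothetical four-word lexicon, in the spirit of a small artificial language.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = zipf_probs(4)

uniform_eff = efficiency(uniform)  # maximally unpredictable under this measure
skewed_eff = efficiency(skewed)    # lower value: one word dominates, so words are more predictable
```

Under this assumed definition, raising `exponent` makes the distribution more skewed and lowers its efficiency, which is one way to compare distributions of different shapes on a single predictability scale.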