Frank Seifart | Leibniz-Zentrum Allgemeine Sprachwissenschaft (ZAS), Berlin
Papers by Frank Seifart
LIAMES: Línguas Indígenas Americanas, 2015
Bora and Muinane are two neighboring, relatively closely related languages spoken in the North-West Amazon. Whether these two languages can be shown to be genealogically related to other languages is still debated. In this paper, we provide a reconstruction of 416 proto Bora-Muinane word forms and of the proto Bora-Muinane phonological system. Our reconstruction confirms an earlier reconstruction by Aschmann (1993), but also differs from it in some important respects. We firmly establish sound changes in Bora and Muinane based on extensive data and first-hand knowledge of the languages involved. The reconstructed proto Bora-Muinane forms presented here provide a sound basis for re-evaluating the genealogical relationship of Bora-Muinane with Witoto and Ocaina, which was proposed by Aschmann (1993), and open the possibility of searching for genealogical relations with other languages.
PLoS ONE, 2014
This paper provides evidence for the identification of the language of the uncontacted indigenous group called Carabayo, who live in voluntary isolation in the Colombian Amazon region. The only linguistic data available from this group is a set of about 50 words, most of them without reliable translations, that were collected in 1969 during a brief encounter with one Carabayo family. We compare this material with various languages (once) spoken in the region, showing that four attested Carabayo forms (a first person singular prefix and words for 'warm', 'father', and 'boy') display striking similarities with Yurí and at least 13 Carabayo forms display clear correspondences with contemporary Tikuna. Tikuna and Yurí are the only two known members of the Tikuna-Yurí linguistic family. Yurí was documented in the 19th century but has been thought to have become extinct since. We conclude that the Carabayo, directly or indirectly, descend from the Yurí people whose language and customs were described by explorers in the 19th century, before they took up voluntary isolation, escaping atrocities during the rubber boom in the early 20th century.
Language, 2018
This discussion note reviews responses of the linguistics profession to the grave issues of language endangerment identified a quarter of a century ago in the journal Language by Krauss, Hale, England, Craig, and others (Hale et al. 1992). Two and a half decades of worldwide research not only have given us a much more accurate picture of the number, phylogeny, and typological variety of the world's languages, but they have also seen the development of a wide range of new approaches, conceptual and technological, to the problem of documenting them. We review these approaches and the manifold discoveries they have unearthed about the enormous variety of linguistic structures. The reach of our knowledge has increased by about 15% of the world's languages, especially in terms of digitally archived material, with about 500 languages now reasonably documented thanks to such major programs as DoBeS, ELDP, and DEL. But linguists are still falling behind in the race to document the planet's rapidly dwindling linguistic diversity, with around 35-42% of the world's languages still substantially undocumented, and in certain countries (such as the US) the call by Krauss (1992) for a significant professional realignment toward language documentation has only been heeded in a few institutions. Apart from the need for an intensified documentarist push in the face of accelerating language loss, we argue that existing language documentation efforts need to do much more to focus on crosslinguistically comparable data sets, sociolinguistic context, semantics, and interpretation of text material, and on methods for bridging the 'transcription bottleneck', which is creating a huge gap between the amount we can record and the amount in our transcribed corpora.
The Amazonian language Resígaro was heavily influenced by the unrelated, neighboring Bora language. Bora influence involves cultural assimilation, some loanwords, and heavy morphological borrowing. What social circumstances led to this influence? This paper reviews our current knowledge about the cultural and linguistic features that Resígaro borrowed from Bora and interprets these as reflecting a particular social history involving bilingualism and ceremonial exchange.
Borrowing affixes may be rare compared to lexical borrowing, but it is not random. The current study describes regular patterns of affix borrowing in a database containing over 650 borrowed affixes, challenging a number of previous claims about relative borrowability, in particular regarding inflectional categories. It is shown that borrowing affixes of all major nominal and verbal inflectional categories, including case markers and argument indexes, is well attested. Borrowing case markers, for instance, appears to be just as common as borrowing plural markers. By factoring in the "availability" for borrowing (i.e. whether a potential donor language has a relevant affix), it can be shown that nominal categories are far more frequently borrowed than verbal categories. Additionally, it is shown that sets of borrowed affixes often consist of interrelated sets of forms, e.g. forming paradigms, rather than being isolated forms from different morphosyntactic systems, in particular for the more tightly integrated inflectional subsystems. The frequency and systematicity by which inflectional affixes are borrowed calls for a reconsideration of the role of inflection in models of language contact.
1. What constrains affix borrowability?
Models of language contact rely heavily on statements about the relative ease or frequency by which different types of linguistic items are borrowed, as typically expressed in "borrowing hierarchies", e.g. free forms > bound forms, or derivation > inflection. The differential manipulability in language contact expressed in such asymmetries is interpreted as reflecting aspects of the architecture of language: Less easily borrowable forms apparently have inherent structural properties (e.g. being morphologically bound) or semantic/functional properties (e.g. expressing abstract, inflectional categories) that inhibit transfer from one language to another. Borrowing¹ affixes – in the sense of borrowing the morphological material of a bound form, along with its function, and applying it to native stems – plays a prominent role in many of these hierarchies, both in terms of affixes vs. stems and among different types of affixes, e.g. derivation vs. inflection. However, to date no attempt has been made to objectively assess the relative frequency of (different types of) affix borrowing in a worldwide sample of languages. Weinreich's (1953) seminal study on language contact put forward a set of hypotheses, some of which relate to affixes, based loosely on a number of case studies (see the very useful summary by Wilkins 1996), without attempting to quantify empirical data. The work of Yaron Matras and colleagues marks the beginning of more rigorous empirical testing of the borrowability of grammatical categories, including affixes. In particular, Elšík & Matras (2006) analyze a variety of borrowed grammatical categories, including some expressed by affixes, in a sample of Romani languages. A volume edited by Matras & Sakel (2007) presents case studies on grammatical borrowing in 27 languages from around the world,
¹ The term "borrowing" is used here simply because it is the most widely used term with the longest tradition, even though it has obvious shortcomings compared to terms such as "transfer" (e.g. Myers-Scotton 2007) or "copying" (e.g. Johanson 1999).
Many Amazonian systems of nominal classification have been perceived as constituting a descriptive and typological challenge. The proposal presented here is to consider many of them as emerging noun class systems rather than as a-typical systems that defy integration within an overall typology of nominal classification, at the opposite end from the Niger-Congo systems on a continuum of grammaticalization. First the African noun class systems are reviewed, with an emphasis on the sociolinguistic context of their descriptions and on their common deviations from a prototypical image of them projected in the general linguistic literature. Then a recapitulation of various proposals of a-typicality of the Amazonian systems is given, followed by the presentation of a typology of nominal classification systems that integrates the dynamic dimension of grammaticalization. The application of this typological framework is illustrated with a case study from the Miraña language of Colombia.
Can an unknown Amazonian language be identified by statistical procedures based on n-gram frequencies if only a short list of words is available and at the same time, the available data of the potential candidate languages are also limited to relatively short wordlists? In this paper we show that n-gram frequencies (specifically 1-grams and 2-grams) allow us to identify languages reliably based on as few as 20 words, as long as these are transcribed consistently, and as long as characteristic monogram and bigram frequencies for these languages have previously been established based on consistently transcribed data. If no such consistently transcribed data are available, as is the case of our Amazonian case study, such procedures clearly fail for wordlists with 50 or fewer words. Our study thus contributes to exploring the limits of such automated detection procedures, both in terms of corpus size and transcription quality.
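As a rough illustration of the kind of procedure this abstract describes, the sketch below compares character 1-gram and 2-gram frequency profiles of an unknown wordlist against profiles built from reference wordlists. It is not the authors' method: the use of cosine similarity, the pooling of 1-grams and 2-grams into a single profile, the function names (`ngram_profile`, `cosine`, `identify`), and the toy wordlists are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def ngram_profile(words, n_values=(1, 2)):
    """Relative frequencies of character n-grams (1-grams and 2-grams by default)."""
    counts = Counter()
    for word in words:
        for n in n_values:
            for i in range(len(word) - n + 1):
                counts[word[i:i + n]] += 1
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles (assumed metric)."""
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def identify(unknown_words, reference_wordlists):
    """Return the best-matching reference language and all similarity scores."""
    target = ngram_profile(unknown_words)
    scores = {lang: cosine(target, ngram_profile(words))
              for lang, words in reference_wordlists.items()}
    return max(scores, key=scores.get), scores


# Toy, invented wordlists; these are hypothetical and not data from the paper.
references = {
    "lang_a": ["kata", "taka", "kaka", "tata", "akat"],
    "lang_b": ["momo", "nomo", "mono", "onom", "momon"],
}
best, scores = identify(["taka", "kat"], references)
print(best, scores)
```

Because the profiles are built directly from written characters, inconsistent transcription of the same sounds changes the n-gram counts, which is one way to see why the abstract stresses consistently transcribed data.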
A widespread assumption in the language contact literature is that affixes are never borrowed directly, but only indirectly, that is, as part of complex loanwords. From such complex loanwords, affixes may eventually spread to native stems, creating hybrid formations, in a process of language-internal analogical extension. Direct borrowing is the extraction of an affix based on knowledge of the donor language, without the mediation of complex loanwords within the recipient language. This article suggests that direct borrowing can also be the only or primary process leading to productive loan affixes. Criteria are provided to assess instances of direct and indirect borrowing on the basis of the distribution of borrowed affixes across complex loanwords and hybrid formations. These are applied to corpora of various languages. A scale of directness of affix borrowing is proposed, based on the extent to which speakers of the recipient language rely (i) on their knowledge of the donor language (direct borrowing) and (ii) on complex loanwords within their native language (indirect borrowing).
Language and Linguistics Compass, 2010
Many languages have morphosyntactic systems that impose a classification on their nominal lexicon. The nature of these systems varies widely, ranging from large systems of lexico-syntactic numeral classifiers, as known from East and South East Asian languages, to highly grammaticalized gender agreement systems of, e.g. European languages (see Aikhenvald 2000 for a comprehensive overview). These systems are also often internally complex, heterogeneous, and descriptively challenging. The morphology, syntax, and semantics of nominal classification bear ample material for interesting research questions in, e.g. typology and historical linguistics. This article reviews some basic features of nominal classification and discusses a number of selected issues that are particularly relevant for ongoing discussions and future research: How can nominal classification systems be typologized according to morphosyntactic criteria (Section 2)? What are the semantic principles of nominal classification and how do they relate to culture and cognition (Section 3)? How does nominal classification diachronically develop and diffuse across languages (Section 4)?
Linguistics, 2010
The Amazonian language Bora systematically uses in narratives a special, paragraph-initial anaphoric "connector pronoun". This pronoun helps to ensure referential coherence through agreement in noun class and number with an antecedent, whose referent is thematic in the new paragraph. Additional morphology in the connector pronoun specifies temporal, causal, and other relations between events. The connector pronoun is syntactically tightly integrated into the clause, where it may function as an argument of a verb or as the dependent element of a genitive phrase. Certain frequent forms of the connector pronoun are the basis for a number of lexicalized conjunctions. This paragraph-linking strategy parallels in a number of ways tail-head linkage systems, not only in its functionality, but also with respect to its diachronic outcome (discourse conjunctions). The fact that Bora grammaticalized nominal expressions in a paragraph-linking system (whereas verbs are the central components of tail-head linkage) is congruent with the general preference of Bora to use many noun phrases per clause, in contrast to tail-head linkage languages, where noun phrases are rarely used.
International Journal of American Linguistics
Nijmegen: DOBES-MPI, 2009
Language Documentation & Conservation, 2020
This paper explores the application of quantitative methods to study the effect of various factors on phonetic word duration in ten languages. Data on most of these languages were collected in fieldwork aiming at documenting spontaneous speech in mostly endangered languages, to be used for multiple purposes, including the preservation of cultural heritage and community work. Here we show the feasibility of studying processes of online acceleration and deceleration of speech across languages using such data, which have not been considered for this purpose before. Our results show that it is possible to detect a consistent effect of higher frequency of words leading to faster articulation even in the relatively small language documentation corpora used here. We also show that nouns tend to be pronounced more slowly than verbs when other factors are controlled for. Comparison of the effects of these and other factors shows that some of them are difficult to capture with the current data and methods, including potential effects of cross-linguistic differences in morphological complexity. In general, this paper argues for widening the cross-linguistic scope of phonetic and psycholinguistic research by including the wealth of language documentation data that has recently become available.
Classifiers and noun class markers are often semantically general and semantically opaque when compared to open-class nouns, and in this sense they constitute a semantic reduction of the noun universe. These two semantic characteristics also play an important role in the diachronic development of nominal classification systems in various respects. First, the need for semantically general forms for anaphoric reference may be a possible motivation for developing nominal classification in the first place. Second, opaque classification, which may, for example, emerge through coalescence of classes with homophonous markers, may be replaced by transparent classification because of the incompatibility of opaque classification and certain syntactic constructions, such as contrastive focus. Finally, opaque classification, which is typical for grammatical gender systems, is less likely to diffuse through language contact than transparent classification, which is typical for other types of systems, including numeral classifier systems.
Proceedings of the National Academy of Sciences, 2018
By force of nature, every bit of spoken language is produced at a particular speed. However, this speed is not constant: speakers regularly speed up and slow down. Variation in speech rate is influenced by a complex combination of factors, including the frequency and predictability of words, their information status, and their position within an utterance. Here, we use speech rate as an index of word-planning effort and focus on the time window during which speakers prepare the production of words from the two major lexical classes, nouns and verbs. We show that, when naturalistic speech is sampled from languages all over the world, there is a robust cross-linguistic tendency for slower speech before nouns compared with verbs, both in terms of slower articulation and more pauses. We attribute this slowdown effect to the increased amount of planning that nouns require compared with verbs. Unlike verbs, nouns can typically only be used when they represent new or unexpected information; otherwise, they have to be replaced by pronouns or be omitted. These conditions on noun use appear to outweigh potential advantages stemming from differences in internal complexity between nouns and verbs. Our findings suggest that, beneath the staggering diversity of grammatical structures and cultural settings, there are robust universals of language processing that are intimately tied to how speakers manage referential information when they communicate with one another.
speech rate | nouns | language universals | word planning | language processing
Human language in its most widespread form (i.e., in spontaneously spoken interactions) is locked in one-dimensional time. This was recognized by the founding father of modern linguistics, Ferdinand de Saussure, as one of the two fundamental principles of the linguistic sign, the other one being its arbitrary nature (1, 2).
An unresolved question is which aspects of local variation in speech rate are universal (3, 4), which vary across languages and cultures (5), and which vary across individuals (6). For example, marking the end of utterances by slowing down speech is cross-linguistically common, but its implementation is language-specific (7). Good candidates for truly universal temporal features are the relatively fast pronunciations of frequent, and thus predictable, words (8) and second mentions of words (9). This speedup is argued to result from automated articulation (4) and has been suggested to contribute to efficient communication by spreading information more evenly across the speech signal (10, 11). Frequency effects also explain why function words, such as articles, prepositions, and pronouns, are pronounced faster than the less frequently occurring content words, such as nouns and verbs (12).