Sebastian Drude | Goethe-Universität Frankfurt am Main
Papers by Sebastian Drude
Training in filming, recording, and editing for indigenous people in the field and in the lab. Production of documentary CDs and DVDs for the indigenous communities. Informative website on language documentation and portal to the archive (being implemented): http://www.museu-goeldi.br/linguistica
Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 2015
In this paper we present the first results of the application of computational methods, inspired by the ideas in McMahon & McMahon (2005), to a dataset collected from languages of every branch of the Tupian family (including all living non-Tupí-Guaraní languages) in order to produce a classification of the family based on lexical distance. We used both a Swadesh list (with historically more stable terms) and a list of animal and plant names, so that the results could be compared. In addition, we selected more stable (HiHi) and less stable (LoLo) terms from the Swadesh list to form sublists for independent treatment. We compared the resulting NeighborNet networks and Neighbor-Joining cladograms and drew conclusions about their significance for the current understanding of the classification of Tupian languages. One important result is the lack of support for the currently discussed idea of an Eastern-Western division within Tupí.
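As a rough illustration of what a classification "based on lexical distance" starts from, the following Python sketch computes pairwise distances from cognate-coded word lists. It is not the authors' pipeline: the distance measure (the share of compared meanings whose cognate classes differ), the function names, and the toy data with placeholder languages are all assumptions made for illustration.

```python
from itertools import combinations

# Toy data (hypothetical, NOT real Tupian data): for each placeholder language,
# a mapping from meaning to an arbitrary cognate-class ID. None = missing entry.
TOY_WORDLISTS = {
    "LangA": {"hand": 1, "water": 1, "stone": 1, "fire": 1},
    "LangB": {"hand": 1, "water": 1, "stone": 2, "fire": 1},
    "LangC": {"hand": 2, "water": 1, "stone": 3, "fire": None},
}


def lexical_distance(a, b):
    """Share of meanings attested in both lists whose cognate classes differ."""
    compared = [m for m in a if a[m] is not None and b.get(m) is not None]
    if not compared:
        raise ValueError("no meanings attested in both word lists")
    differing = sum(1 for m in compared if a[m] != b[m])
    return differing / len(compared)


def distance_matrix(wordlists):
    """Symmetric distance table keyed by (language, language) pairs."""
    names = sorted(wordlists)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = lexical_distance(wordlists[x], wordlists[y])
        dist[(x, y)] = dist[(y, x)] = d
    return dist


if __name__ == "__main__":
    for pair, d in sorted(distance_matrix(TOY_WORDLISTS).items()):
        print(pair, round(d, 3))
```

A matrix of this kind is the usual input for network or tree-building software such as NeighborNet or neighbor-joining implementations; restricting the meanings to a sublist (for example only the more stable terms) would simply mean filtering the word lists before calling `distance_matrix`.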
Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 2015
This paper presents a succinct reconstruction of the segmental phonology of Proto-Maweti-Guarani, the hypothetical protolanguage from which modern Mawe, Aweti and the Tupi-Guarani branches of the Tupi linguistic family have evolved. Based on about 300 cognate sets from the authors' field data (for Mawe and Aweti) and from Mello's reconstruction (2000) for Proto-Tupi-Guarani (with additional information from other works, and with a few changes concerning certain doubtful features, such as the status of stem-final lenis consonants *r and *ß, and the distinction of *c and *č), the consonants and vowels of Proto-Maweti-Guarani were reconstructed with the help of the traditional historical-comparative method. The development of the reconstructed segments is then traced from the protolanguage to each of the modern branches. A comparison with other claims made about Proto-Maweti-Guarani is given in the conclusion.
Outline: 1. "Mawetí-Guaraní" 2. Identical vowels 3. Non-identical vowels 4. Identical consonants 5. Simple regular non-identical consonants 6. Nasal harmony 7. Complicated consonants 8. "Relational prefixes" 9. Possessive paradigms
LIAMES, 2019
This article describes and substantiates the orthography of the Awetí language (Tupí, Alto Xingu/MT), based on the analysis of the phonological and grammatical structure of Awetí. The orthography is the result of a long collaborative effort of the three authors, started in 1998. It establishes more than an alphabet (the representation of the vowels and consonants of the language): it also deals with internal variation, re-syllabification, lenition, and other (morpho)phonological processes. Special attention was given to the written representation of the glottal stop, as well as to the orthographical consequences of nasal harmony. Although the lexical accent (stress) is not explicitly marked in Awetí, most of its affixes and particles are examined as to their stress and interaction with neighboring morphemes, which also determines the orthographical words. Finally, the alphabetical sort order is established, in which digraphs are treated as regular sequences of letters and the glottal stop ⟨ʼ⟩ is ignored, which helps learners of Awetí. The orthography as described here has now been used for some 10 years in the village school for literacy teaching in Awetí, with good results. We believe that several of the arguments raised here can be fruitfully transferred to other languages with similar phenomena (glottal stop as consonant, nasal harmony, morpho-phonological assimilation, etc.). Keywords: Awetí; Orthography; Writing; Alphabet; Phonology.
Resumo (translated from Portuguese): This work describes and justifies the orthography of the Awetí language (Tupí, Alto Xingu/MT), based on the analysis of the phonological and grammatical structure of Awetí. The orthography is the result of a long collaborative effort among the three authors, begun in 1998. It defines not only an alphabet (the representation of the vowels and consonants of the language) but also addresses internal variation, resyllabification, lenition, palatalization, and other (morpho)phonological processes. Both the written representation of the glottal stop and the orthographic consequences of nasal harmony received special attention. Although lexical stress is not marked orthographically in Awetí, the great majority of affixes and particles are discussed with regard to stress and its interaction with adjacent morphemes, which at the same time determines the orthographic words. Finally, the alphabetical order was established, in which digraphs are treated as sequences of letters, whereas the glottal stop ⟨ʼ⟩ is ignored, making Awetí easier to learn. The orthography as described here has been used for approximately ten years in the school for literacy teaching in Awetí, with good results. We believe that several of the arguments raised here can be productively transferred to other languages with similar phenomena (the glottal stop as a consonant, nasal harmony, morpho-phonological assimilation, etc.).
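The sort-order rule stated in the abstract (digraphs count as ordinary letter sequences, the glottal stop ⟨ʼ⟩ is ignored) can be sketched as a sort key in Python. This is only an illustration under assumptions: the base letter order is taken to be plain lowercase code-point order, which the abstract does not specify, and the sample strings are invented placeholders, not real Awetí words.

```python
GLOTTAL_STOP = "\u02bc"  # MODIFIER LETTER APOSTROPHE, the character written as ⟨ʼ⟩


def sort_key(word: str) -> tuple[str, str]:
    """Key that ignores the glottal stop and compares letter by letter.

    Digraphs need no special handling: each of their letters is compared
    on its own, exactly like any other letter sequence.
    The original spelling is a tie-breaker (an assumption) so that words
    differing only in the glottal stop still sort deterministically.
    """
    stripped = word.replace(GLOTTAL_STOP, "").lower()
    return (stripped, word)


def sort_words(words):
    """Return the words in the sketched alphabetical order."""
    return sorted(words, key=sort_key)


if __name__ == "__main__":
    # Invented placeholder strings, for illustration only.
    sample = ["tab", "ta\u02bca", "ta", "nga", "na"]
    print(sort_words(sample))
```

With this key a learner can look a word up without knowing whether it contains a glottal stop, which appears to be the practical benefit the abstract refers to.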
Among the most frequently described grammatical phenomena of the languages of the Tupí-Guaraní family are the relational prefixes, analyzed as components of a system that marks contiguity or non-contiguity between a dependent term and the term on which it depends. The present article examines the question of the historical origin of these prefixes and proposes the hypothesis (first suggested in Moore and Galucio 1994) that they result from changes undergone by the initial consonant of the root of the dependent term, conditioned by the phonological environment and by the morphosyntactic structure in which the dependent and independent terms originally occurred. This hypothesis casts doubt on the synchronic analysis of these elements as prefixes.
Proceedings from the XXI FEL Conference, 2018
This paper proposes that the digital domains of language use (DDLU) be included in future assessments of language vitality. DDLU, including the consumption of online content and engagement with social media and chat, now make up an important and rapidly growing part of daily language use in many speech communities. This is true even in communities which were until recently fairly isolated from modern technology. Criteria for determining digital language vitality vary according to the general situation of the language but constitute a separate dimension: languages otherwise in a similar situation can differ considerably in terms of digital vitality. We consider, for instance, advanced language technology and digital content for institutional languages, teaching material and dedicated collaborative resources for vigorous and developing languages, and language documentation material for endangered languages. Crucially, what needs to be assessed is not only the quantity, availability, quality, coverage, maturity, sustainability, and adaptability of resources and technology, as proposed in earlier frameworks, but also, principally, their usage by the speakers. We find the concept of digital minoritization useful to advance the assessment of digital language vitality, as illustrated by the language situation in three West Nordic countries: Iceland, Norway and the Faroe Islands. This is work in progress and is an extension of the MoLiCoDiLaCo project being carried out at the University of Iceland.
Lessons from documented endangered languages, 2008
This is a pre-print of a chapter in:
K. David Harrison, David S. Rood, & Arienne Dwyer (eds.), Lessons from documented endangered languages, 67–110. Amsterdam; Philadelphia: Benjamins. DOI: https://doi.org/10.1075/tsl.78.04dru
This is a pre-print of a chapter in SACKMANN, Robin (ed.). Explorations in integrational linguistics: four essays on German, French, and Guaraní. Amsterdam, Philadelphia: Benjamins, p. 153–189. (Current Issues in Linguistic Theory 285). ISBN: 9789027248000
Electronic Grammaticography, 2012
Although intrinsically closely related to the new field of language documentation, grammaticography is still mostly oriented to the book model, usually falling short of making use of related digital resources and hypertext functionalities. In this contribution, we show and discuss possible or easily achievable advances that can be built on top of existing technology such as Language Archiving Technology (LAT) as developed at The Language Archive at the MPI-PL: exemplars and examples can be found in multimedia corpora of natural speech events annotated with ELAN and visualized with ANNEX, words and word forms can be linked to lexical entries in LEXUS online databases, and the precise meaning of theoretical concepts can be given in ISOcat entries or related terminological databases. Independently of LAT, wiki technology provides online collaboration and version control and even opens the possibility of addressing different audiences in related sets of pages, but it also poses challenges for the overall didactic structure of a descriptive work. As one of the formats, at least for export and exchange, the XML-based TEI may provide a suitable framework, although many specialized tags would still have to be introduced, and formatting and functionalities for these tags still have to be implemented. Generally, synchronization between different versions (e.g., on-line and off-line) poses the most intriguing difficulties, but the advantages (also in terms of Nordhoff's maxims) of hypertext grammars as proposed here are overwhelming.
Potentials of language documentation: Methods, analyses, and utilization, 2012
This contribution explores the potentials of combining corpora of language use data with language description in e-grammars (or digital grammars). We present three directions of ongoing research and discuss the advantages of combining these and similar approaches, arguing that the technological possibilities have barely begun to be explored.
A segmentation into intonation units (IU) is useful for studying prosody and other speech phenomena. For grammatical analysis, syntactic units (SU, "sentences"), which also apply in spoken language, are more appropriate. We propose to annotate both in language documentation. IUs stand in a many-to-one relation to SUs and are therefore the better starting point. We present an efficient workflow with recommended settings for ELAN and Toolbox that (a) avoids duplicated work (segmenting and annotating) in ELAN, and (b) includes a solid round-trip configuration for exporting basic annotation from ELAN to Toolbox, where basic glossing is done, and importing the result back from Toolbox into ELAN. The result is a documentation format that provides a good basis for both discourse and grammatical analysis. 5th International Conference on Language Documentation and Conservation (ICLDC), 2-5 March 2017, University of Hawaiʻi at Mānoa.
Workflow:
1. Open the media file(s) in ELAN using the template file ELAN-Template-IU+Sent.etf.
2. In Segmentation mode, use the root tier ref@A for segmentation into intonation units.
3. Create (empty) annotations on the dependent tiers dt@A, tn@A and tl@A.
4. In Transcription mode with dt@A and tl@A, use dt@A for a narrow transcription (with pauses, false starts, etc.), and use tl@A and tn@A for a literal translation (close to word-for-word).
5. In Annotation mode, merge those ref@A segments that belong to the same syntactic unit.
6. Copy dt@A to a tier po@A with linguistic type po under ref (rename dt@A-cp to po@A), and copy tl@A to a tier tf@A with linguistic type tf under ref (rename tl@A-cp to tf@A), merging adjacent transcriptions and translations that belong to the same sentence.
7. In Transcription mode (with po@A and tf@A), clean the transcription (eliminating hesitations, false starts, length markers, etc.), and change the translation in tf@A into a free translation (idiomatic in the target language).
8. Copy po@A to a tier tx@A with linguistic type tx under ref@A (rename po@A-cp to tx@A).
9. Export the tiers \tx and \tf to Toolbox using the Toolbox Database Type sentences.typ, creating the tier \ref (for automatic numbering) and the tiers \ELANbegin, \ELANend, \ELANspeaker.
10. In Toolbox, interlinearize \tx, creating \mb, \ge, and \ps.
11. Import the result back into ELAN, using the same Toolbox Database Type sentences.typ.
12. In ELAN, merge the result with the original file, taking only tx@A and its dependent tier from the Toolbox re-import.
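The many-to-one relation between intonation units and syntactic units can be made concrete with a small Python sketch that merges adjacent intonation-unit segments belonging to the same sentence. It does not use ELAN or Toolbox and does not read their file formats; the `Segment` class, the function name, and the sentence IDs assigned to each segment are hypothetical and stand in for decisions an annotator would make by hand.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Segment:
    """A time-aligned annotation (hypothetical structure, not an ELAN type)."""
    start_ms: int
    end_ms: int
    text: str


def merge_into_syntactic_units(ius, su_ids):
    """Merge adjacent intonation units that carry the same syntactic-unit ID.

    ius    -- intonation-unit segments in temporal order
    su_ids -- one ID per segment, saying which sentence it belongs to
    """
    if len(ius) != len(su_ids):
        raise ValueError("need exactly one syntactic-unit ID per intonation unit")
    merged = []
    # groupby only groups consecutive items, so only adjacent segments merge.
    for _, group in groupby(zip(su_ids, ius), key=lambda pair: pair[0]):
        segments = [segment for _, segment in group]
        merged.append(
            Segment(
                start_ms=segments[0].start_ms,
                end_ms=segments[-1].end_ms,
                text=" ".join(s.text for s in segments),
            )
        )
    return merged


if __name__ == "__main__":
    # Placeholder texts, for illustration only.
    units = [Segment(0, 900, "a"), Segment(950, 1800, "b"), Segment(2000, 2600, "c")]
    print(merge_into_syntactic_units(units, [1, 1, 2]))
```

Because each intonation unit belongs to exactly one sentence, merging never requires splitting a segment, which is one way to see why intonation units are the more convenient starting point.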
Endangered Languages and the Land: Mapping Landscapes of Multilingualism, 2018
Current language maps (maps aiming to show the localization of languages) usually display either mostly non-overlapping areas or single dots for language localizations. Both are unsatisfactory given the much more complex geographical distribution patterns of speakers and language use. Future maps should take diglossia and diverse areas into account, as well as multilingual individuals. The conceptual basis consists of patterns of language domains of use and significance. Each language needs a large set of maps for different domains, which can then be combined (through overlay or more advanced techniques) for one language or several languages. Interactive maps can give access to additional external information or allow the visualization of developments over time.
Reduplication in Indigenous languages of South America, 2014
0 Introduction. This study describes reduplication in Awetí, a Brazilian indigenous language spoken by a small people living in three villages in the region of the headwaters of the Xingu river, Mato Grosso, central Brazil [1]. Awetí does not belong to, but is the closest relative of, the well-known Tupí-Guaraní subfamily, the largest branch of the Tupí family. Mawé, Awetí and Tupí-Guaraní together constitute a major branch of Tupí, "Mawetí-Guaraní" (Rodrigues and Dietrich 1997; Drude 2006; Drude and Meira in prep.). The main interest this study may claim is of a methodological character: the contribution that the study of reduplication (henceforth "RED", for short) can make for the understanding of phonological and morphological units and processes, such as nasalization, abstract morpheme-final phonemes, identification of the stem and of morpheme boundaries, the derivation/inflection distinction, etc. Sections 1 and 2 serve as a basis for the remaining paper. Section 1 summarizes the phonological system of Awetí. Section 2 gives some basic morphological properties and presents some diagnostic affixes used in the remaining sections. Then section 3 gives a short general overview over RED in Awetí. Section 4 describes the basic patterns of RED found with active (transitive and intransitive) verbs. The members of a small particular subclass of these end in an unstressed syllable; these are treated in section 5. Section 6 uses evidence from RED for a closer look at certain (morpho-)phonological alternations at the beginning and end of verbal stems. RED with stative verbs is treated in section 7. Finally, the RED-related behavior of
[1] The general knowledge of Awetí underlying this study is a result of ten years of work on Awetí, including a total of around fourteen months of field work. From 2001 to 2005, the project was supported by the Volkswagen Foundation within the DOBES (Documentation of Endangered Languages) program, focusing on documenting the language and aspects of the culture. From 2008 on, my research has been supported by a Dilthey fellowship which again allows me to do field work and to invite speakers to the Museu Goeldi in Belém. Reduplication has been studied, and the specific data used in this study has been elicited, during the field trip in April/May 2009. I am very grateful for the support by the funding agencies, the Museu Goeldi and my Awetí teachers. I also owe thanks to several reviewers of earlier versions of this paper, in particular to an anonymous reviewer and the editors. Remaining shortcomings are mine alone.
Reflections on Language Documentation on the 20 Year Anniversary of Himmelmann 1998, LD&C Special Publication 15, 2018
This contribution gives a short overview of “language inventorying”: research aiming at creating comprehensive catalogues and atlases of all the languages in the world, which has seen a boost with the renewed interest in linguistic diversity triggered by the awareness of language endangerment in the 1990s. By focusing on the development of the ISO standard 639 and SIL’s Ethnologue, the main advances and issues in this area are discussed.
The overview concludes by presenting the major alternative resources, in particular Glottolog.
Estudos Linguisticos, 2006
This paper presents the methods of language documentation as applied in the Awetí Language Documentation Project, one of the projects in the Documentation of Endangered Languages Programme (DOBES). It describes the steps of how a large digital corpus of annotated multi-media data is built. Special attention is devoted to the format of annotation of linguistic data. The Advanced Glossing format is presented and justified.
Language Documentation and Conservation, 2008
This paper addresses a set of issues related to language documentation that are
not often explicitly dealt with in academic publications, yet are highly important
for the development and success of this new discipline. These issues include
embedding language documentation in the socio-political context not only at
the community level but also at the national level, the ethical and technical
challenges of digital language archives, and the importance of regional and
international cooperation among documentation activities. These issues play a
major role in the initiative to set up a network of regional language archives in
three South American countries, which this paper reports on. Local archives for
data on endangered languages have recently been set up in Iquitos (Peru), Buenos
Aires (Argentina), and in various locations in Brazil. An important feature of
these is that they provide fast and secure access to linguistic and cultural data for
local researchers and the language communities. They also make data safer by
allowing for regular update procedures within the network.
Revista Virtual de Estudos da Linguagem, Ed. esp. 3, 2009
This article describes and analyses nasal harmony (or spreading of nasality) in Awetí. It first shows generally how sounds in prefixes adapt to the nasality or orality of stems, and how nasality in stems also 'extends' to the left. With abstract templates we show which phonetically nasal or oral sequences are possible in Awetí (focusing on stops, pre-nasalized stops and nasals) and which phonological analysis is appropriate to account for these regularities. In Awetí, there are intrinsically nasal and oral vowels and 'neutral' vowels which adapt phonetically to a following vowel or consonant, as is also the case for sonorant consonants. Pre-nasalized stops such as "nt" are nasalized variants of stops, not post-oralized variants of nasals as in Tupí-Guaranian languages. For nasals and stops in syllable coda (end of morphemes), we postulate archiphonemes which adapt to the preceding vowel or a following consonant. Finally, using a declarative approach, the analysis formulates 'rules' (statements) which account for the 'behavior' of nasality in Awetí words, making use of "structured sequences" on both the phonetic and phonological levels. Each unit (syllable, morpheme, word, etc.) on any level thus has three components: a sequence of segments, a constituent structure (where pre-nasalized stops, like diphthongs, correspond to two segments), and an intonation structure. The statements describe which phonetic variants can be combined (concatenated) with which other variants, depending on their nasality or orality.
Amerindia 31, 2008
1. Object: Awetí and the ‘Mawetí-Guaraní’ subfamily
“Mawetí-Guaraní” is a shorter designation of a branch of the large Tupí language family, alongside eight other branches or subfamilies. This branch in turn consists internally of the languages (Sateré-) Mawé and Awetí and the large Tupí-Guaraní subfamily, and so its explicit but longish name could be “Mawé-Awetí-Tupí-Guaraní” (MTAG). This genetic grouping has already been suggested (without any specific designation) by A. D. Rodrigues (e.g., 1984/85; Rodrigues and Dietrich 1997), and, more recently, it has been confirmed by comparative studies (Corrêa da Silva 2007; Drude 2006; Meira and Drude in prep.). These studies also establish more reliably the most probable internal ramification, according to which Mawé separated first, whereas the differentiation between Awetí, on the one hand, and the precursor of the Tupí-Guaraní (TG) subfamily, proto-Tupí-Guaraní (pTG), on the other, would have been more recent. The intermediate branch could be named “Awetí-Tupí-Guaraní” (“Awetí-TG” or “ATG”). Figure 1 shows the internal grouping of the Tupí family according to results of the Tupí Comparative Project under D. Moore at the Museu Goeldi (2000–2006).
AMERINDIA, 2011
This paper describes the distribution and phonetic properties of accentuation of word forms in Awetí, a Tupian language spoken by ca. 150 people in central Brazil in the Upper Xingu area. Awetí does not belong to, but is arguably the closest relative of, the better known Tupí-Guaraní subfamily, the largest branch of the Tupí stock. After a short overview over the word classes and general phonotactics of Awetí (section 2), we briefly discuss the notion ‘word accent’ and show that, in Awetí, it is generally located on the last syllable of the stem in morphologically simple forms (section 3). We then discuss regular and isolated exceptions to this rule (section 4). In section 5, we describe the distribution of the word accent when inflectional or derivational suffixes are present: usually, the word accent of the word form with suffixes continues to be on the last syllable of the stem. After this descriptive part, we present a preliminary study of the acoustic-phonetic details of the manifestation of the word accent, observing word forms in isolation (section 6) and in different syntactic contexts (section 7). The results are briefly summarized in the conclusion (section 8).
Training in filming, recording, edition for indigenous people in the field and in the lab Product... more Training in filming, recording, edition for indigenous people in the field and in the lab Production of documentary CDs and DVDs for the indigenous communities Informative WEB-site on language documentation and portal to the archive (being implemented) http://www.museu-goeldi.br/linguistica
Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 2015
in this paper we present the first results of the application of computational methods, inspired ... more in this paper we present the first results of the application of computational methods, inspired by the ideas in McMahon & McMahon (2005), to a dataset collected from languages of every branch of the tupian family (including all living non-tupí-Guaraní languages) in order to produce a classification of the family based on lexical distance. We used both a swadesh list (with historically stabler terms) and a list of animal and plant names for results comparison. in addition, we also selected more (HiHi) and less (LoLo) stable terms from the swadesh list to form sublists for indepedent treatment. We compared the resulting neighbornet networks and neighbor-joining cladograms and drew conclusions about their significance for the current understanding of the classification of tupian languages. one important result is the lack of support for the currently discussed idea of an Eastern-Western division within tupí.
Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 2015
this paper presents a succinct reconstruction of the segmental phonology of Proto-Maweti-Guarani,... more this paper presents a succinct reconstruction of the segmental phonology of Proto-Maweti-Guarani, the hypothetical protolanguage from which modern Mawe, Aweti and the tupi-Guarani branches of the tupi linguistic family have evolved. Based on about 300 cognate sets from the authors' field data (for Mawe and Aweti) and from Mello's reconstruction (2000) for Proto-tupi-Guarani (with additional information from other works; and with a few changes concerning certain doubtful features, such as the status of stem-final lenis consonants *r and *ß, and the distinction of *c and *č), the consonants and vowels of Proto-Maweti-Guarani were reconstructed with the help of the traditional historical-comparative method. the development of the reconstructed segments is then traced from the protolanguage to each of the modern branches. A comparison with other claims made about Proto-Maweti-Guarani is given in the conclusion.
Guaraní" 2. Vogais idênticas 3. Vogais não-idênticas 4. Consoantes idênticas 5. Consonantes não-i... more Guaraní" 2. Vogais idênticas 3. Vogais não-idênticas 4. Consoantes idênticas 5. Consonantes não-idênticas regulares simples 6. Harmonia nasal 7. Consonantes complicadas 8. "Prefixos relacionais" 9. Paradigmas possessivos 1. "Mawetí-Guaraní"
LIAMES, 2019
This article describes and substantiates the orthography of the Awetí language (Tupí, Alto Xingu/... more This article describes and substantiates the orthography of the Awetí language (Tupí, Alto Xingu/ mt), based on the analysis of the phonological and grammatical structure of Awetí. The orthography is a result of a long collaborative effort of the three authors, started in 1998. It establishes more than an alphabet (the representation of the vowels and consonants of the language): it also deals with internal variation, re-syllabification, lenition, and other (morpho)phonological processes. Special attention was given to the written representation of the glottal stop, as well as the orthographical consequences of nasal harmony. Although the lexical accent (stress) is not explicitly marked in Awetí, most of its affixes and particles are examined as to their stress and interaction with neighboring morphemes, also determining orthographical words. Finally, the alphabetical sort order is established, where digraphs are treated as regular sequences of letters, and the glottal stop ⟨ʼ⟩ is ignored, which helps learners of Awetí. The orthography as described here has now been used for some 10 years in the village school for alphabetizing in Awetí, with good results. We believe that several of the arguments raised here can be fruitfully transferred to other languages with similar phenomena (glottal stop as consonant, nasal harmony, morpho-phonological assimilation, etc.). keywords: Awetí; Orthography; Writing; Alphabet; Phonology. resumo: Este trabalho descreve e fundamenta a ortografia da língua Awetí (Tupí, Alto Xingu/mt), com base na análise da estrutura fonológica e gramatical do Awetí. A ortografia é resultado de um longo trabalho colaborativo entre os três autores, iniciado em 1998. 
Ela não define apenas um alfabeto (a representação das vogais e das consoantes da língua), mas também aborda a variação interna, ressilabificação, lenição, palatalização e outros processos (morfo)fonológicos. Tanto a representação escrita da oclusiva glotal, quanto as consequências ortográficas da harmonia nasal receberam uma atenção especial. Apesar de o acento lexical não ser ortograficamente marcado em Awetí, a grande maioria dos afixos e partículas é abordada considerando o acento e sua interação com morfemas adjacentes, ao mesmo tempo determinando as palavras ortográficas. Finalmente foi estabelecida a ordem alfabética em que dígrafos são tratados como sequências de letras, já a oclusiva glotal ⟨ʼ⟩ é ignorada, facilitando o aprendizado do Awetí. A ortografia tal como descrita aqui tem sido usada por aproximadamente dez anos na escola para a alfabetização em Awetí, com bons resultados obtidos. Acreditamos que vários dos argumentos aqui levantados podem ser produtivamente transferidos para outras línguas com fenômenos semelhantes (a oclusiva glotal como consoante, harmonia nasal, assimilação morfo-fonológica, etc.).
Entre os fenômenos gramaticais mais frequentemente descritos das línguas da família Tupí-Guaraní ... more Entre os fenômenos gramaticais mais frequentemente descritos das línguas da família Tupí-Guaraní estão os prefixos relacionais, analisados como componentes de um sistema que marca a contiguidade ou nãocontiguidade entre um termo dependente e o termo do qual este depende. No presente artigo, examina-se a questão da origem histórica destes prefixos, propondo-se a hipótese (sugerida inicialmente em Moore e Galucio 1994) de que sejam o resultado de alterações sofridas pela consoante inicial da raiz do termo dependente, condicionadas pelo ambiente fonológico e pela estrutura morfossintática em que os termos dependente e independente se encontravam inicialmente. Esta hipótese põe em dúvida a análise sincrônica destes elementos como prefixos.
Proceedings from the XXI FEL Conference, 2018
This paper proposes that the digital domains of language use (DDLU) be included in future assessm... more This paper proposes that the digital domains of language use (DDLU) be included in future assessments of language vitality. DDLU, including the consumption of online content, engagement with social media and chat which now make an important, and rapidly growing, part of the daily language use in many speech communities. This is true even in communities which were until recently fairly isolated from modern technology. Criteria for determining digital language vitality vary according to the general situation of the language but constitute a separate dimension-languages otherwise in a similar situation can differ considerably in terms of digital vitality. We consider, for instance, advanced language technology and digital content for institutional languages, teaching material and dedicated collaborative resources for vigorous and developing languages, and language documentation material for endangered languages. Crucially, what needs to be assessed is not only quantity, availability, quality, coverage, maturity, sustainability, and adaptability of resources and technology as proposed in earlier frameworks, but also, principally, their usage by the speakers. We find the concept of digital minoritization useful to advance the assessment of digital language vitality, as illustrated by the language situation in three West Nordic countries, Iceland, Norway and The Faroe Islands. This is work in progress and is an extension of the MoLiCoDiLaCo project being carried out at the University of Iceland.
Lessons from documented endangered languages, 2008
This is a pre-print of a chapter in:
K. David Harrison, David S. Rood, & Arienne Dwyer (eds.), Lessons from documented endangered languages, 67–110. Amsterdam; Philadelphia: Benjamins. DOI: https://doi.org/10.1075/tsl.78.04dru
This is a pre-print of a chapter in Sackmann, Robin (ed.), Explorations in integrational linguistics: four essays on German, French, and Guaraní. Amsterdam, Philadelphia: Benjamins, p. 153–189. (Current Issues in Linguistic Theory 285). ISBN: 9789027248000
Electronic Grammaticography, 2012
Although intrinsically closely related to the new field of language documentation, grammaticography is still mostly oriented to the book model, usually falling short of making use of related digital resources and hypertext functionalities. In this contribution, we show and discuss possible or easily achievable advances that can be built on top of existing technology such as Language Archiving Technology (LAT) as developed at The Language Archive at the MPI-PL: exemplars and examples can be found in multimedia corpora of natural speech events annotated with ELAN and visualized with ANNEX, words and word forms can be linked to lexical entries in LEXUS online databases, and the precise meaning of theoretical concepts can be given in ISOcat entries or related terminological databases. Independently of LAT, Wiki technology provides online collaboration and version control and even opens up the possibility of addressing different audiences in related sets of pages, but it also poses challenges for the overall didactic structure of a descriptive work. As one of the formats, at least for export and exchange, the XML-based TEI may provide a suitable framework, although many specialized tags would still have to be introduced, and formatting and functionalities for these tags still have to be implemented. Generally, synchronization between different versions (e.g., on-line and off-line) poses the most intriguing difficulties, but the advantages (also in terms of Nordhoff's maxims) of hypertext grammars as proposed here are overwhelming.
Potentials of language documentation: Methods, analyses, and utilization, 2012
This contribution explores the potentials of combining corpora of language use data with language description in e-grammars (or digital grammars). We present three directions of ongoing research and discuss the advantages of combining these and similar approaches, arguing that the technological possibilities have barely begun to be explored.
In ELAN, merge the result with the original file, taking only tx@A and its dependent tier from the Toolbox re-import.
In Segmentation mode, use the root tier ref@A for segmentation into intonation units.
Open the media file(s) in ELAN using the template file ELAN-Template-IU+Sent.etf.
Copy po@A to a tier tx@A with linguistic type tx under ref@A (rename po@A-cp to tx@A).
Export tiers \tx and \tf to Toolbox using Toolbox Database Type sentences.typ, creating tier \ref (for automatic numbering) and the tiers \ELANbegin, \ELANend, \ELANspeaker.
Import the result back to ELAN, using the same Toolbox Database Type sentences.typ.
Create (empty) annotations on dependent tiers dt@A, tn@A and tl@A.
In Transcription mode with dt@A and tl@A, use dt@A for narrow transcription (with pauses, false starts, etc.), and use tl@A and tn@A for a literal translation (close to word-to-word).
In Annotation mode, merge those ref@A-segments that belong to the same syntactic unit.
Copy dt@A to a tier po@A with linguistic type po under ref (rename dt@A-cp to po@A), and copy tl@A to a tier tf@A with linguistic type tf under ref (rename tl@A-cp to tf@A), merging adjacent transcriptions and translations belonging to the same sentences.
In Transcription mode (with po@A and tf@A), clean the transcription (eliminating hesitations, false starts, length markers etc.), and change the translation in tf@A into a free translation (idiomatic in the target language).
In Toolbox, interlinearize \tx, creating \mb, \ge, and \ps.
Workflow: A segmentation into intonation units (IU) is useful for studying prosody and other speech phenomena. For grammatical analysis, syntactic units (SU, "sentences"), which also apply in spoken language, are more appropriate. We propose to annotate both in language documentation. IUs stand in a many-to-one relation to SUs, and are therefore the better starting point.
We present an efficient workflow with recommended settings for ELAN and Toolbox that (a) avoids doubled work (segmenting and annotating) in ELAN, and that (b) includes a solid round-trip-configuration for exporting basic annotation from ELAN to Toolbox, where basic glossing is done, and importing the result back from Toolbox into ELAN. The result is a documentation format that provides a good basis for both discourse and grammatical analysis. 5th International Conference on Language Documentation and Conservation (ICLDC) 2-5 March, 2017 University of Hawaiʻi at Mānoa.
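The many-to-one relation between intonation units and syntactic units described above can be illustrated with a short Python sketch. It is not part of the ELAN or Toolbox workflow itself and does not read ELAN files; the `Segment` class, the `merge_units` function, the `sentence_ids` labels and the placeholder transcriptions are all assumptions made up for illustration. The sketch only shows how adjacent IU annotations assigned to the same sentence could be merged into one SU annotation whose time span covers all of them.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Segment:
    """One time-aligned annotation (hypothetical model, not the ELAN format)."""
    start_ms: int
    end_ms: int
    transcription: str
    translation: str


def merge_units(intonation_units, sentence_ids):
    """Merge adjacent intonation units that share a sentence label.

    `sentence_ids` is an assumed, manually assigned label per intonation
    unit; consecutive units with the same label form one syntactic unit.
    """
    if len(intonation_units) != len(sentence_ids):
        raise ValueError("need exactly one sentence id per intonation unit")

    merged = []
    paired = zip(sentence_ids, intonation_units)
    for _, group in groupby(paired, key=lambda pair: pair[0]):
        units = [unit for _, unit in group]
        merged.append(
            Segment(
                start_ms=units[0].start_ms,   # span starts with the first IU
                end_ms=units[-1].end_ms,      # and ends with the last IU
                transcription=" ".join(u.transcription for u in units),
                translation=" ".join(u.translation for u in units),
            )
        )
    return merged


# Placeholder data (invented, not taken from any corpus).
ius = [
    Segment(0, 900, "aaa bbb", "first part"),
    Segment(900, 2100, "ccc", "second part"),
    Segment(2100, 3000, "ddd", "third part"),
]
sentences = merge_units(ius, sentence_ids=[1, 1, 2])
```

Because several IUs map onto one SU but never the reverse, starting from the finer IU segmentation means the coarser tier can be derived by merging, with no need to segment the recording a second time.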
Endangered Languages and the Land: Mapping Landscapes of Multilingualism, 2018
Current language maps (maps that aim to show where languages are located) usually display either mostly non-overlapping areas or single dots for language localizations. Both are unsatisfactory given the much more complex geographical distribution patterns of speakers and language use. Future maps should take diglossia and diverse areas into account, as well as multilingual individuals. The conceptual basis consists of patterns of language domains of use and significance. Each language needs a large set of maps for different domains, which can then be combined (through overlay or more advanced techniques) for one language or several languages. Interactive maps can give access to additional external information or allow visualization of developments over time.
Reduplication in Indigenous languages of South America, 2014
0 Introduction
This study describes reduplication in Awetí, a Brazilian indigenous language spoken by a small people living in three villages in the region of the headwaters of the Xingu river, Mato Grosso, central Brazil.¹ Awetí does not belong to, but is the closest relative of the well-known Tupí-Guaraní subfamily, the largest branch of the Tupí family. Mawé, Awetí and Tupí-Guaraní together constitute a major branch of Tupí, "Mawetí-Guaraní" (Rodrigues and Dietrich 1997; Drude 2006; Drude and Meira in prep.). The main interest this study may claim is of a methodological character: the contribution that the study of reduplication (henceforth "RED", for short) can make for the understanding of phonological and morphological units and processes, such as nasalization, abstract morpheme-final phonemes, identification of the stem and of morpheme boundaries, the derivation/inflection distinction, etc. Sections 1 and 2 serve as a basis for the remaining paper. Section 1 summarizes the phonological system of Awetí. Section 2 gives some basic morphological properties and presents some diagnostic affixes used in the remaining sections. Then section 3 gives a short general overview over RED in Awetí. Section 4 describes the basic patterns of RED found with active (transitive and intransitive) verbs. The members of a small particular subclass of these end in an unstressed syllable; these are treated in section 5. Section 6 uses evidence from RED for a closer look at certain (morpho-)phonological alternations at the beginning and end of verbal stems. RED with stative verbs is treated in section 7. Finally, the RED-related behavior of
¹ The general knowledge of Awetí underlying this study is a result of ten years of work on Awetí, including a total of around fourteen months of fieldwork. From 2001 to 2005, the project was supported by the Volkswagen Foundation within the DOBES (Documentation of Endangered Languages) program, focusing on documenting the language and aspects of the culture. From 2008 on, my research has been supported by a Dilthey-fellowship which again allows me to do fieldwork and to invite speakers to the Museu Goeldi in Belém. Reduplication has been studied, and the specific data used in this study has been elicited, during the field trip in April/May 2009. I am very grateful for the support by the funding agencies, the Museu Goeldi and my Awetí teachers. I also owe thanks to several reviewers of earlier versions of this paper, in particular to an anonymous reviewer and the editors. Remaining shortcomings are mine alone.
Reflections on Language Documentation on the 20 Year Anniversary of Himmelmann 1998, LD&C Special Publication 15, 2018
This contribution gives a short overview of “language inventorying”: research aiming at creating comprehensive catalogues and atlases of all the languages in the world, which has seen a boost with the renewed interest in linguistic diversity triggered by the awareness of language endangerment in the 1990s. By focusing on the development of the ISO standard 639 and SIL’s Ethnologue, the main advances and issues in this area are discussed.
The overview concludes by presenting the major alternative resources, in particular Glottolog.
Estudos Linguisticos, 2006
This paper presents the methods of language documentation as applied in the Awetí Language Documentation Project, one of the projects in the Documentation of Endangered Languages Programme (DOBES). It describes the steps of how a large digital corpus of annotated multi-media data is built. Special attention is devoted to the format of annotation of linguistic data. The Advanced Glossing format is presented and justified.
Language Documentation and Conservation, 2008
This paper addresses a set of issues related to language documentation that are
not often explicitly dealt with in academic publications, yet are highly important
for the development and success of this new discipline. These issues include
embedding language documentation in the socio-political context not only at
the community level but also at the national level, the ethical and technical
challenges of digital language archives, and the importance of regional and
international cooperation among documentation activities. These issues play a
major role in the initiative to set up a network of regional language archives in
three South American countries, which this paper reports on. Local archives for
data on endangered languages have recently been set up in Iquitos (Peru), Buenos
Aires (Argentina), and in various locations in Brazil. An important feature of
these is that they provide fast and secure access to linguistic and cultural data for
local researchers and the language communities. They also make data safer by
allowing for regular update procedures within the network.
Revista Virtual de Estudos da Linguagem, Ed. esp. 3, 2009
This article describes and analyses nasal harmony (or spreading of nasality) in Awetí. It first shows generally how sounds in prefixes adapt to the nasality or orality of stems, and how nasality in stems also 'extends' to the left. With abstract templates we show which phonetically nasal or oral sequences are possible in Awetí (focusing on stops, pre-nasalized stops and nasals) and which phonological analysis is appropriate to account for these regularities. In Awetí, there are intrinsically nasal and oral vowels and 'neutral' vowels which adapt phonetically to a following vowel or consonant, as is the case with sonorant consonants. Pre-nasalized stops such as "nt" are nasalized variants of stops, not post-oralized variants of nasals as in Tupí-Guaranian languages. For nasals and stops in syllable coda (end of morphemes), we postulate archiphonemes which adapt to the preceding vowel or a following consonant. Finally, using a declarative approach, the analysis formulates 'rules' (statements) which account for the 'behavior' of nasality in Awetí words, making use of "structured sequences" on both the phonetic and phonological levels. Thus, each unit (syllable, morpheme, word etc.) on any level has three components: a sequence of segments, a constituent structure (where pre-nasalized stops, like diphthongs, correspond to two segments), and an intonation structure. The statements describe which phonetic variants can be combined (concatenated) with which other variants, depending on their nasality or orality.
Amerindia 31, 2008
1. Object: Awetí and the ‘Mawetí-Guaraní’ subfamily
“Mawetí-Guaraní” is a shorter designation of a branch of the large Tupí language family, which stands alongside eight other branches or subfamilies. This branch in turn consists internally of the languages (Sateré-) Mawé and Awetí and the large Tupí-Guaraní subfamily, and so its explicit but longish name could be “Mawé-Awetí-Tupí-Guaraní” (MTAG). This genetic grouping has already been suggested (without any specific designation) by A. D. Rodrigues (e.g., 1984/85; Rodrigues and Dietrich 1997), and, more recently, it has been confirmed by comparative studies (Corrêa da Silva 2007; Drude 2006; Meira and Drude in prep.), which also more reliably establish the most probable internal ramification, according to which Mawé separated first, whereas the differentiation between Awetí, on the one hand, and the precursor of the Tupí-Guaraní (TG) subfamily, proto-Tupí-Guaraní (pTG), on the other, would have been more recent. The intermediate branch could be named “Awetí-Tupí-Guaraní” (“Awetí-TG” or “ATG”). Figure 1 shows the internal grouping of the Tupí family according to results of the Tupí Comparative Project under D. Moore at the Museu Goeldi (2000–2006).
AMERINDIA, 2011
This paper describes the distribution and phonetic properties of accentuation of word forms in Awetí, a Tupian language spoken by ca. 150 people in central Brazil in the Upper Xingu area. Awetí does not belong to, but is arguably the closest relative of the better known Tupí-Guaraní subfamily, the largest branch of the Tupí stock. After a short overview over the word classes and general phonotactics of Awetí (section 2), we briefly discuss the notion ‘word accent’ and show that, in Awetí, it is generally located on the last syllable of the stem in morphologically simple forms (section 3). We then discuss regular and isolated exceptions to this rule (section 4). In section 5, we describe the distribution of the word accent when inflectional or derivational suffixes are present: usually, the word accent of the word form with suffixes continues to be on the last syllable of the stem. After this descriptive part, we present a preliminary study of the acoustic-phonetic details of the manifestation of the word accent, observing word forms in isolation (section 6) and in different syntactic contexts (section 7). The results are briefly summarized in the conclusion (section 8).
2ICLDC, 2011
This poster reports on the challenges and solutions in creating a large LAT-based digital language archive of the indigenous languages of the Brazilian Amazon region, focusing on standards and workflow concerning data management, on training and the applied tools and technology.
Multi-verb constructions: a view from the Americas, 2011
First version of
‘Derivational verbs’ and other multiple‐verb constructions in Awetí and Tupí‐Guaraní