DOCUMENTATION OF THE TAI KHAMYANG LANGUAGE: ROLE OF DIGITAL HUMANITIES

Digital preservation of language, cultural knowledge and traditions of the indigenous Semai

SHS Web of Conferences

In this paper we describe two community-based participatory research projects in an Orang Asli community that set out to document their local knowledge and culture. We describe how with the support of technology we are able to document indigenous oral traditions and practices that are on the verge of disappearing. The Semai are the largest Orang Asli community in Peninsular Malaysia and mainly live in the states of Perak and Pahang. Like in many other indigenous contexts, the Semai peoples’ transition from an oral culture to a literate culture is relatively recent. In this paper we discuss how our long-term relationship has facilitated two projects using digital media technology that focus on the documentation of local knowledge and culture of the community members in a Semai-speaking village in Perak. Elders in this community, having local knowledge accumulated over generations through direct experiences and participation, were the main source of information for these documentation...

Documentation and Digitization of Indigenous Languages

Proceedings of the International Mother Language Day 2024, held at Sri Lanka Foundation, Colombo on 21st February 2024 and published by the National Institute of Language Education and Training (NILET) and the Department of Sinhala, University of Ruhuna, Sri Lanka, pp. 5-24, 2024

In this paper, I argue for establishing the importance of documenting linguistic data and ethnographic information from the endangered indigenous language communities. I also address the academic and functional roles of the Corpus of Indigenous Language (CoIL) in the act of protecting, preserving and promoting indigenous languages for their sustainable development. To substantiate my ideas, I first define the concept of ‘indigenous language’, investigate the modern concept of ‘language documentation’, reflect on the transition of the concept of traditional dialectology into language documentation, investigate the relational interface between dialectology and CoIL, define features of a CoIL, sum up the major contributions of a CoIL in the documentation of indigenous languages and address some of the limitations of a CoIL. This is not a research paper; it is a reflection on my understanding of the question of how a CoIL can contribute to empowering vulnerable indigenous languages. This paper carries some academic and referential values for those scholars who are working in the areas of documentation of indigenous endangered languages, revitalization of vulnerable languages and empowerment of the indigenous speech communities who are living mostly in India, Sri Lanka, Bangladesh, Nepal, Bhutan and similar other countries.

Born archival: The ebb and flow of digital documents from the field

History and Anthropology, 2011

Facilitated by an infusion of funding from philanthropic sources, descriptive linguists have been galvanized to document the world’s languages before they disappear without record. Linguists have responded to the “crisis of documentation” (Dobrin, L. M. & Berson, J. (2011), “Speakers and Language Documentation”, in The Cambridge Handbook of Endangered Languages, P. K. Austin & J. Sallabank (eds), Cambridge University Press, Cambridge, pp. 187–211) by entering into increasingly collaborative partnerships with speech communities, producing “documents” that have both local relevance and academic integrity. The growth in access to digital recording technology has meant that contemporary research initiatives on endangered languages are not only born digital, but often birthed straight into an archive. Yet heritage collections of recordings made by ethnographers and linguists in the past are ever more endangered, becoming orphaned when their collectors die or fragmented into their component parts based on the medium of documentation when they are finally archived. Drawing on fieldwork in Nepal with a community speaking an endangered Tibeto–Burman language, and reflecting on the decade I have spent directing a digital humanities research initiative—the Digital Himalaya Project—I discuss how linguists and anthropologists are collecting, protecting and connecting their data, and how technology influences their relationship to documents.

How usable are digital collections for endangered languages? A review

Proceedings of the Linguistic Society of America, 2022

Here, we report on pilot research on the extent to which language collections in digital linguistic archives are discoverable, accessible, and usable for linguistic research. Using a test case of common tasks in phonetic and phonological documentation, we evaluate a small random sample of collections and find substantial, striking problems in all domains. Of the original 20 collections, only six had digitized audio files with associated transcripts (preferably phrase-aligned). That is, only 30% of the collections in our sample were even potentially suitable for any type of phonetic work (regardless of quality of recording). Information about the contents of the collection was usually discoverable, though there was variation in the types of information that could be easily searched for in the collection. Though eventually three collections were aligned, only one collection was successfully force-aligned from the archival materials without substantial intervention. We close with recommendations for archive depositors to facilitate discoverability, accessibility, and functionality of language collections. Consistency and accuracy in file naming practices, data descriptions, and transcription practices are imperative. Providing a collection guide also helps. Including useful search terms about collection contents makes the materials more findable. Researchers need to be aware of the changes to collection structure that may result from archival uploads. Depositors need to consider how their metadata is included in collections and how items in the collection may be matched to each other and to metadata categories. Finally, if our random sample is indicative, linguistic documentation practices for future phonetic work need to change rapidly, if such work from archival collections is to be done in future.

Self documentation of endangered languages

2012 8th International Symposium on Chinese Spoken Language Processing, 2012

Several minority languages are on the verge of extinction in Nepal, especially when they don't have a generally accepted writing system and occur in an area where Nepali (the official language) is predominantly used. One example is Lohorung, which is spoken among the Lohorung Rai communities of Sankhuwasabha, a hilly district of eastern Nepal. Older generations of Lohorung are the only experts in Lohorung but they have limitations in reading and writing English or Nepali. The documentation of Lohorung and other similar endangered languages is important. If the right tools and techniques are used, we believe that self documentation is one of the best ways to document a language. We have developed an online platform through which community members can collaboratively self document their language. The platform is a multimodal dictionary authoring and browsing tool, and it has been developed with a focus on usability, ease of use and productivity.

Internet applications for endangered languages: a talking dictionary of Ainu

2011

There are an estimated 6,900 languages spoken in the world today and at least half of them are under threat of extinction. This is mainly because speakers of smaller languages are switching to other larger languages for economic, social or political reasons, or because they feel ashamed of their ancestral language. A language can thus be lost in one or two generations, often to the great regret of its speakers' descendants. Over the past ten years a new field of study called “language documentation” has developed. Language documentation is concerned with the methods, tools, and theoretical bases for compiling a representative and lasting multipurpose record of languages. It has developed in response to the urgent need to make an enduring record of the world’s many endangered languages and to support speakers of these languages in their desire to maintain them. It is also fueled by developments in information and media technologies which make documentation and the preservation and dissemin...

Language Documentation Accessibility in Indigenous Languages: A Study in the Chittagong Hill Tract

SLL, 2021

Protecting endangered indigenous languages from extinction due to multiple threats is one way to foster cultural diversity across regions. However, documenting these endangered languages with no access barriers has proven difficult. This research explores the language documentation accessibility efforts among indigenous populations living in the Chittagong Hill Tract region. Protecting these vulnerable languages through proper linguistic documentation practices remains essential. Therefore, the study aims to identify the barriers that hinder preservation efforts and to propose practical strategies for increasing access. The findings from this study will reveal technological infrastructure issues alongside socio-political factors that obstruct effective language documentation practices, and will recommend innovative best practices for previously unidentified issues. The investigation also assesses the accessibility of digital technology for effective documentation practice, with a view to improved inclusiveness that serves all groups concerned. A community-driven approach should be encouraged for safeguarding the cultures of vulnerable indigenous populations and for promoting wider dissemination of the recommended inclusive practices within and beyond the Chittagong Hill Tract region.