Toward the creation of WordNets for ancient Indo-European languages

The making of Ancient Greek WordNet

This paper describes the creation and review of a new lexico-semantic resource for classical studies: the Ancient Greek WordNet. The candidate sets of synonyms (synsets) are extracted from Greek-English dictionaries, on the assumption that Greek words translated by the same English word or phrase have a high probability of being synonyms or at least closely related semantically. The validation process and the web interface developed to edit and query the resource are described in detail. The lexical coverage of the Ancient Greek WordNet is illustrated and its accuracy is evaluated. Finally, scenarios for exploiting the resource are discussed.
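The extraction assumption above (Greek words sharing an English translation are candidate synonyms) can be illustrated with a minimal Python sketch. The function name `candidate_synsets`, the input format, and the toy entries are assumptions made for illustration; they are not the authors' actual pipeline or data.

```python
from collections import defaultdict

# Toy (Greek lemma, English gloss) pairs, invented for illustration only.
toy_entries = [
    ("οἶκος", "house"),
    ("δόμος", "house"),
    ("θάλασσα", "sea"),
    ("πόντος", "sea"),
    ("πέλαγος", "sea"),
    ("ἵππος", "horse"),
]

def candidate_synsets(entries):
    """Group Greek lemmas by shared English gloss.

    Returns a dict mapping each normalized gloss to the sorted list of
    lemmas translated by it, keeping only glosses shared by two or more
    lemmas (a gloss with one lemma yields no synonym candidates).
    """
    by_gloss = defaultdict(set)
    for lemma, gloss in entries:
        by_gloss[gloss.strip().lower()].add(lemma)
    return {g: sorted(ls) for g, ls in by_gloss.items() if len(ls) >= 2}

synsets = candidate_synsets(toy_entries)
for gloss, lemmas in synsets.items():
    print(gloss, lemmas)
```

Grouping on the normalized gloss only produces candidates; as the abstract notes, the resulting sets still have to be validated manually.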

Linking the Ancient Greek WordNet to the Homeric Dependency Lexicon

2021

The Ancient Greek WordNet is a new resource that is being developed at the Universities of Pavia and Exeter, based on the Princeton WordNet. The Princeton WordNet provides sentence frames for verb senses, but this type of information is lacking in most WordNets of other languages. In fact, exporting sentence frames from English to other languages is not a trivial task, as sentence frames depend on the syntax of individual languages. In addition, the information provided by the Princeton WordNet is not corpus-based but relies on native speakers' knowledge. This type of information is not available for dead languages, which are by definition corpus languages. In this paper, we show how sentence frames can be extracted from morpho-syntactically parsed corpora by linking an existing dependency lexicon of Homeric verbs (HoDeL) to verbs in the Ancient Greek WordNet. Given its features, HoDeL makes it possible to extract automatically all subcategorization frames available for each verb, along with their frequency and semantic information about the arguments that can occur in specific frames. In the paper, we present our method for automatically linking the two resources and compare some of the resulting sentence frames with the English sentence frames in the Princeton WordNet.
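The idea of deriving frequency-ranked frames from parsed data and attaching them to WordNet verbs can be sketched as follows. This is only an illustration under stated assumptions: the record format, the relation labels (`SBJ`, `OBJ`), the synset identifiers, and the function names are hypothetical and do not reflect the actual HoDeL or Ancient Greek WordNet data models.

```python
from collections import Counter, defaultdict

# Toy data, invented for illustration: one record per parsed occurrence,
# given as (verb lemma, [(dependency relation, case), ...]).
occurrences = [
    ("δίδωμι", [("SBJ", "nom"), ("OBJ", "acc"), ("OBJ", "dat")]),
    ("δίδωμι", [("OBJ", "dat"), ("SBJ", "nom"), ("OBJ", "acc")]),
    ("δίδωμι", [("SBJ", "nom"), ("OBJ", "acc")]),
    ("βαίνω", [("SBJ", "nom")]),
]

# Hypothetical lemma -> synset ID table standing in for the WordNet side.
verb_synsets = {"δίδωμι": ["v-give-01"], "βαίνω": ["v-go-01"]}

def frame_frequencies(occurrences):
    """Count how often each verb occurs with each set of arguments."""
    frames = defaultdict(Counter)
    for lemma, args in occurrences:
        # Sort the arguments so that word order does not split a frame.
        frames[lemma][tuple(sorted(args))] += 1
    return frames

def link_frames(frames, verb_synsets):
    """Attach frequency-ranked frames to every synset listing the lemma."""
    linked = {}
    for lemma, counts in frames.items():
        for synset in verb_synsets.get(lemma, []):
            linked.setdefault(synset, {})[lemma] = counts.most_common()
    return linked

frames = frame_frequencies(occurrences)
linked = link_frames(frames, verb_synsets)
print(linked["v-give-01"]["δίδωμι"])
```

Linking by lemma alone attaches every frame to every synset of the verb; assigning frames to individual senses would need further disambiguation, which this sketch does not attempt.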

Ancient Greek WordNet Meets the Dynamic Lexicon: the Example of the Fragments of the Greek Historians

Eighth Global WordNet Conference. Bucharest, Romania, January 27-30, 2016

The Ancient Greek WordNet (AGWN) and the Dynamic Lexicon (DL) are multilingual resources for studying the lexicon of Ancient Greek texts and their translations. Both AGWN and DL are works in progress whose accuracy still needs to be improved and manually validated. After a detailed description of the current state of each resource, this paper illustrates a methodology for cross-referencing AGWN and DL data, so that the items of each resource are scored according to the evidence provided by the other. The training data is based on the corpus of the Digital Fragmenta Historicorum Graecorum (DFHG), which includes ancient Greek texts with Latin translations.
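The abstract does not specify the scoring function, so the following Python sketch shows only one simple way mutual scoring could work: a WordNet translation pair is scored by how often the aligned corpus attests it, and an aligned pair is flagged by whether the WordNet lists it. The data structures, the function `cross_score`, and the toy Greek-Latin pairs and counts are all assumptions for illustration.

```python
# Toy data, invented for illustration.
# Greek-Latin pairs assumed to be licensed by the WordNet side.
agwn_pairs = {("πόλις", "urbs"), ("πόλις", "civitas"), ("βασιλεύς", "rex")}

# Alignment counts assumed to come from the aligned-translation side.
dl_counts = {("πόλις", "urbs"): 3, ("βασιλεύς", "rex"): 5, ("πόλις", "res"): 1}

def cross_score(agwn_pairs, dl_counts):
    """Score each resource's items by the evidence in the other.

    A WordNet pair receives the number of times it was aligned in the
    corpus (0 if never); an aligned pair receives True if the WordNet
    also lists it and False otherwise.
    """
    agwn_scores = {pair: dl_counts.get(pair, 0) for pair in agwn_pairs}
    dl_scores = {pair: pair in agwn_pairs for pair in dl_counts}
    return agwn_scores, dl_scores

agwn_scores, dl_scores = cross_score(agwn_pairs, dl_counts)
print(agwn_scores)
print(dl_scores)
```

Under this scheme, pairs with a zero score or a False flag are the natural first candidates for manual validation.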

ABHIDHA: An extended WordNet for Indo-Aryan Languages

Research Issues in …, 2003

Linking the Sanskrit WordNet to the Vedic Dependency Treebank: a pilot study

Proceedings of the 12th Global Wordnet Conference, 2023

The Sanskrit WordNet is a resource currently under development, whose core was induced from a Vedic text sample semantically annotated by means of an ontology mapped onto the Princeton WordNet synsets. Building on a previous case study on Ancient Greek (Zanchi et al. 2021), we show how sentence frames can be extracted from morphosyntactically parsed corpora by linking an existing dependency treebank of Vedic Sanskrit to verbal synsets in the Sanskrit WordNet. Our case study focuses on two verbs of asking, yāc- and prach-, which show a high degree of variability in sentence frames. Treebanks enhanced with WordNet-based semantic information proved to be of crucial help in motivating sentence frame alternations.

Building a WordNet for Dravidian Languages

This paper attempts to emphasize the need for a standalone and independent Dravidian WordNet. Since the morphology and lexical concepts of Dravidian languages are closer to each other than to a language from a different family, it is proposed to base the Dravidian WordNet on a Dravidian language. A significant amount of work has already been done on the Tamil language to understand its ontological structure and vocabulary. Based on the findings of these studies, it is proposed to build a Tamil WordNet first and then extend it to complete the Dravidian WordNet. A prototype model for the Tamil WordNet is also proposed in this paper.

Indian Language Wordnets and their Linkages with Princeton WordNet

2018

Wordnets are rich lexico-semantic resources. Linked wordnets are extensions of wordnets that link similar concepts in wordnets of different languages. Such resources are extremely useful in many Natural Language Processing (NLP) applications, primarily those based on knowledge-based approaches. In such approaches, these resources are treated as a gold standard or oracle. It is therefore crucial that these resources hold correct information, and for this reason they are created by human experts. However, human experts in multiple languages are hard to come by, so the community would benefit from the sharing of such manually created resources. In this paper, we release mappings of 18 Indian language wordnets linked with Princeton WordNet. We believe that the availability of such resources will have a direct impact on the progress of NLP for these languages.

Introduction to the special issue: On wordnets and relations

Since its inception a quarter century ago, Princeton WordNet [PWN] has had a profound influence on research and applications in lexical semantics, computational linguistics and natural language processing. The numerous uses of this lexical resource have motivated the building of wordnets in several dozen languages, including even a "dead" language, Latin. This special issue looks at certain aspects of wordnet construction and organisation.