From PropBank to EngValLex: Adapting the PropBank-Lexicon to the Valency Theory of the Functional Generative Description

VerbaLex–New Comprehensive Lexicon of Verb Valencies for Czech

… of the Slovko Conference, Bratislava, Slovakia, 2005

Abstract. The paper presents a new lexicon of verb valencies for the Czech language named VerbaLex. VerbaLex is based on three valuable language resources for Czech, three independent electronic dictionaries of verb valency frames. The first resource, the Czech WordNet valency frames dictionary, was created during the BalkaNet project and contains semantic roles and links to the Czech WordNet semantic network. The second resource, VALLEX 1.0, is a lexicon based on the formalism of the Functional Generative Description (FGD) and was developed during ...

Verb Valency Descriptors for a Syntactic Treebank

Proceedings of LREC 2004, 2004

Verb class descriptors, which provide information about the relations of predicates to their arguments, are an essential component of Language Engineering (LE) tools. The production of computationally tractable language resources necessitates the assignment of types of predicate-argument relations to a great variety of verb-centered structures: it is necessary to define not only the initial, canonical valency frame of a great number of verb lexemes, but also the diathesis alternations, which reflect the real-life usage of verbs. This paper describes the implementation of descriptors of the valency properties of Bulgarian verbs used in the production of a syntactic treebank of Bulgarian. The descriptors are based on available LE resources for Bulgarian: a verb subcategorization model implemented in the lexical database in use, and a chunk grammar that recognizes verb form patterns. Predictive models are built and applied in a grammar that annotates grammatical relations inferred from the combination of morphosyntactic and shallow syntactic processing cues. The real significance of this processing is that it resolves, for the valency properties of many verbs, the discrepancy or contradiction between the verb lexicon specifications and the verb's syntagmatic realization.

A Treebank-driven Creation of an OntoValence Verb lexicon

The paper presents a treebank-driven approach to the construction of a Bulgarian valence lexicon with ontological restrictions over the inner participants of the event. First, the underlying ideas behind the Bulgarian Ontology-based lexicon are outlined. Then, the extraction and manipulation of the valence frames is discussed with respect to the BulTreeBank annotation scheme and the DOLCE ontology. The most frequent types of syntactic frames are also specified, as are the most frequent types of ontological restrictions over the verb arguments. The envisaged applications of such a lexicon are assigning ontological labels to syntactically parsed corpora, and expanding the lexicon and lexical information in the Bulgarian Resource Grammar.

Valency Lexicon of Czech Verbs: Alternation-Based Model

2006

The main objective of this paper is to introduce an alternation-based model of VALLEX, a valency lexicon of Czech verbs. Alternations describe regular changes in the valency structure of verbs: they are seen as transformations that take one lexical unit and return a modified lexical unit as a result. We characterize and exemplify ‘syntactically-based’ and ‘semantically-based’ alternations and their effects on verb argument structure. The alternation-based model makes it possible to distinguish a minimal form of the lexicon, which provides a compact characterization of the valency structure of Czech verbs, and an expanded form of the lexicon, which is useful for some applications.
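The idea of an alternation as a transformation from one lexical unit to another, and of deriving an expanded lexicon from a minimal one, can be sketched as follows. This is only an illustrative sketch, not the VALLEX data model: the `Slot` and `LexicalUnit` classes, the `toy_passive` rule, the `RULES` table, and the `expand` function are all hypothetical names, and the passive-like rule is a deliberately simplified assumption.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Slot:
    functor: str  # role label such as "ACT" or "PAT" (used here illustratively)
    form: str     # surface form such as "nom", "acc", "ins"


@dataclass(frozen=True)
class LexicalUnit:
    lemma: str
    frame: Tuple[Slot, ...]
    alternations: Tuple[str, ...] = ()  # names of rules applicable to this unit


# An alternation takes one lexical unit and returns a modified lexical unit.
Alternation = Callable[[LexicalUnit], LexicalUnit]


def toy_passive(lu: LexicalUnit) -> LexicalUnit:
    """Hypothetical rule: ACT nom -> ins, PAT acc -> nom; other slots unchanged."""
    new_frame = []
    for slot in lu.frame:
        if slot.functor == "ACT" and slot.form == "nom":
            new_frame.append(replace(slot, form="ins"))
        elif slot.functor == "PAT" and slot.form == "acc":
            new_frame.append(replace(slot, form="nom"))
        else:
            new_frame.append(slot)
    # The derived unit carries no further alternations in this sketch.
    return replace(lu, frame=tuple(new_frame), alternations=())


RULES: Dict[str, Alternation] = {"toy_passive": toy_passive}


def expand(minimal: List[LexicalUnit]) -> List[LexicalUnit]:
    """Build the expanded lexicon: each unit plus one derived unit per listed rule."""
    expanded: List[LexicalUnit] = []
    for lu in minimal:
        expanded.append(lu)
        for name in lu.alternations:
            expanded.append(RULES[name](lu))
    return expanded


# Minimal lexicon with a single illustrative entry.
minimal = [
    LexicalUnit(
        lemma="stavět",
        frame=(Slot("ACT", "nom"), Slot("PAT", "acc")),
        alternations=("toy_passive",),
    )
]
expanded = expand(minimal)
```

In this sketch the minimal lexicon stores only the base frame plus the names of applicable alternations, while the expanded lexicon materializes every derived frame, which mirrors the compact versus application-oriented distinction the abstract draws.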

Abstraction and Generalisation in Semantic Role Labels: PropBank, VerbNet or both

2009

Semantic role labels are the representation of the grammatically relevant aspects of a sentence meaning. Capturing the nature and the number of semantic roles in a sentence is therefore fundamental to correctly describing the interface between grammar and meaning. In this paper, we compare two annotation schemes, PropBank and VerbNet, in a task-independent, general way, analysing how well they fare in capturing the linguistic generalisations that are known to hold for semantic role labels, and consequently how well they grammaticalise aspects of meaning. We show that VerbNet is more verb-specific and better able to generalise to new semantic role instances, while PropBank better captures some of the structural constraints among roles. We conclude that these two resources should be used together, as they are complementary.

Formal representation of the syntactical environment and the semantic features of verbs

The Semantic-Syntactical Dictionary of the Bulgarian Language contains information concerning the syntactical environments of lexical units, their semantic combinability, as well as the possible formation of diatheses. The first part of the dictionary will consist of the 3,000 most frequent Bulgarian verbs with all their meanings and respective formal semantic and syntactic descriptions. The development of the Semantic-Syntactical Dictionary of the Bulgarian Language is carried out with the help of a web-based system called SYNText (SYNtactic dictionary Tool). This allows the developers of the dictionary to work independently of each other and on different operating systems (Windows or Linux), while using one and the same database.

Meaning and Semantic Roles in CzEngClass Lexicon

Journal of Linguistics/Jazykovedný časopis

This paper focuses on Semantic Roles, an important component of studies in lexical semantics, as they are captured as part of a bilingual (Czech-English) synonym lexicon called CzEngClass. This lexicon builds upon the existing valency lexicons included within the framework of the annotation of the various Prague Dependency Treebanks. The present analysis of Semantic Roles is being approached from the Functional Generative Description point of view and supported by the textual evidence taken specifically from the Prague Czech-English Dependency Treebank.

Inherently Pronominal Verbs in Czech: Description and Conversion Based on Treebank Annotation

Proceedings of the 12th Workshop on Multiword Expressions, 2016

This paper describes the results of a study related to the PARSEME Shared Task on automatic detection of verbal Multi-Word Expressions (MWEs), which focuses on their identification in running texts in many languages. The Shared Task's organizers have provided basic annotation guidelines where four basic types of verbal MWEs are defined, including some specific subtypes. Czech is among the twenty languages selected for the task. We will contribute to the Shared Task dataset, a multilingual open resource, by converting data from the Prague Dependency Treebank (PDT) to the Shared Task format. The question to answer is to what extent this can be done automatically. In this paper, we concentrate on one of the relevant MWE categories, namely the quasi-universal category called "Inherently Pronominal Verbs" (IPronV), and describe its annotation in the Prague Dependency Treebank. After comparing it to the Shared Task guidelines, we can conclude that the PDT and the associated valency lexicon, PDT-Vallex, contain sufficient information for the conversion, even if some specific instances will have to be checked. As a side effect, we have identified certain errors in PDT annotation which can now be automatically corrected.

THE INTERACTION OF PARSING RULES AND ARGUMENT-PREDICATE CONSTRUCTIONS: IMPLICATIONS FOR THE STRUCTURE OF THE GRAMMATICON IN FUNGRAMKB

The Functional Grammar Knowledge Base (FunGramKB) (Periñán-Pascual and Arcas-Túnez 2010) is a multipurpose lexico-conceptual knowledge base designed to be used in different Natural Language Processing (NLP) tasks. It is complemented with the ARTEMIS (Automatically Representing Text Meaning via an Interlingua-based System) application, a parsing device linguistically grounded on Role and Reference Grammar (RRG) that transduces natural language fragments into their corresponding grammatical and semantic structures. This paper unveils the different phases involved in its parsing routine, paying special attention to the treatment of argumental constructions. As an illustrative case, we will follow all the steps necessary to effectively parse a For-Benefactive structure within ARTEMIS. This methodology will reveal the necessity to distinguish between Kernel constructs and L1-constructions, since the latter involve a modification of the lexical template of the verb. Our definition of L1-constructions leads to the reorganization of the catalogue of FunGramKB L1-constructions, formerly based on Levin's (1993) alternations. Accordingly, a rearrangement of the internal configuration of the L1-Constructicon within the Grammaticon is proposed.