Frame Semantics across Languages: Towards a Multilingual FrameNet

A Bilingual Electronic Dictionary for Frame Semantics

2000

Frame semantics is a linguistic theory which is currently gaining ground. The creation of lexical entries for a large number of words presupposes the development of complex lexical acquisition techniques in order to identify the vocabulary for describing the elements of a 'frame'. In this paper, we show how a lexical-semantic database compiled on the basis of a bilingual (English-French) dictionary can be used to identify some general frame elements which are relevant in a frame-semantic approach such as the one adopted in the FrameNet project (Fillmore & Atkins 1998, Gahl 1998). The database has been systematically enriched with explicit lexical-semantic relations holding between some elements of the microstructure of the dictionary entries. The manifold relationships have been labelled in terms of lexical functions, based on Mel'čuk's notion of co-occurrence and lexical-semantic relations in Meaning-Text Theory (Mel'čuk et al. 1984). We show how these lexical f...

Building Multilingual Specialized Resources Based on FrameNet: Application to the Field of the Environment

Conference: International FrameNet Workshop 2020. Towards a Global, Multilingual FrameNet. Proceedings, Workshop of the Language Resources and Evaluation, LREC, 2020

The methodology developed within the FrameNet project is being used to compile resources in an increasing number of specialized fields of knowledge. The methodology and the theoretical principles on which it is based, i.e. Frame Semantics, are especially appealing as they allow domain-specific resources to account for the conceptual background of specialized knowledge and to explain the linguistic properties of terms against this background. This paper presents a methodology for building a multilingual resource that accounts for terms from the field of the environment. After listing some lexical and conceptual differences that need to be managed in such a resource, we explain how the FrameNet methodology is adapted for describing terms in different languages. We first applied our methodology to French and then extended it to English. Extensions to Spanish, Portuguese and Chinese were made more recently. Up to now, we have defined 190 frames: 112 are new, 38 are taken over unchanged from Berkeley FrameNet, and 40 differ slightly from their Berkeley FrameNet counterparts (a different number of obligatory participants, a significant alternation, etc.).

Learning to Align across Languages: Toward Multilingual FrameNet

2018

The FrameNet (FN) project, developed at ICSI since 1997, was the first lexical resource based on the theory of Frame Semantics, and documents contemporary English. It has inspired related projects in roughly a dozen other languages, which, while based on frame semantics, have evolved somewhat independently. Multilingual FrameNet (MLFN) is an attempt to find alignments between them all. The degree to which these projects have adhered to Berkeley FrameNet frames and the data release on which they are based varies, complicating the alignment problem. To minimize the resources needed to produce the alignments, we will rely on machine learning whenever that’s possible and appropriate. We briefly describe the various FrameNets and their history, and our ongoing work employing tools from the fields of machine translation and document classification to introduce a new relation of similarity between frames, combining structural and distributional similarity, and how this will contribute to t...
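The abstract above mentions a frame similarity relation that combines structural and distributional similarity but does not say how the two are combined. As a rough illustration only, the sketch below assumes that structural similarity is the Jaccard overlap of frame element names, that distributional similarity is the cosine of two precomputed frame vectors, and that the two are mixed with a simple weight. The function names, the weighting scheme and the example data are all hypothetical and are not taken from the MLFN project.

```python
import math


def jaccard(a, b):
    """Structural proxy (assumption): overlap of frame element names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Distributional proxy (assumption): cosine of two frame vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def frame_similarity(fes_a, fes_b, vec_a, vec_b, weight=0.5):
    """Hypothetical linear mix of the two scores; weight is illustrative."""
    return weight * jaccard(fes_a, fes_b) + (1 - weight) * cosine(vec_a, vec_b)


# Made-up frame element lists and vectors for two frames in different languages.
english_fes = ["Buyer", "Seller", "Goods"]
other_fes = ["Buyer", "Goods", "Money"]
score = frame_similarity(english_fes, other_fes, [1.0, 0.0], [1.0, 0.0])
print(round(score, 2))
```

A linear mix is only one possible design; it keeps the two signals separately inspectable, which may help when frame element inventories diverge across projects.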

A Danish FrameNet Lexicon and an Annotated Corpus Used for Training and Evaluating a Semantic Frame Classifier

2018

In this paper, we present an approach to efficiently compile a Danish FrameNet based on the Danish Thesaurus, focusing in particular on cognition and communication frames. The Danish FrameNet uses the frame and role inventory of the English FrameNet. We present the corresponding corpus annotations of frames and roles and show how our corpus can be used for training and evaluating a semantic frame classifier for cognition and communication frames. We also present results of cross-language transfer of a model trained on the English FrameNet. Our approach is significantly faster than building a lexicon from scratch, and we show that it is feasible to annotate Danish with frames developed for English, and finally, that frame annotations – even if limited in size at the current stage – are useful for automatic frame classification.

FrameNet, current collaborations and future goals

Language Resources and Evaluation, 2012

This paper will focus on recent and near-term future developments at FrameNet (FN) and the interoperability issues they raise. We begin by discussing the current state of the Berkeley FN database, including major changes in the data format for the latest data release. We then briefly review two recent local projects: "Rapid Vanguarding", which has created a new interface for the frame and lexical unit definition process based on the Word Sketch Engine, and "Beyond the Core", which has developed tools for annotating constructions and created a sample "constructicon" of especially "interesting" constructions which are neither simply lexical nor easy for the standard parsers to parse. We also cover two current collaborations: FN's part in the development of the manually annotated subcorpus of the American National Corpus, and a pilot study on aligning WordNet and FrameNet to exploit the complementary strengths of these quite different resources. We discuss FN-related research on Spanish, Japanese, German (SALSA), Chinese and other languages, and the language-independence of frames, along with interesting FN-related work by others, and a sketch of a large group of image-schematic frames which are now being added to FN. We close with some ideas about how FrameNet can be opened up to allow broader participation in the development process without losing precision and coherence, including a small-scale study on acquiring data for FN using Amazon's Mechanical Turk crowdsourcing system.

Developing a French FrameNet: Methodology and first results

The Asfalda project aims to develop a French corpus with frame-based semantic annotations and automatic tools for shallow semantic analysis. We present the first part of the project: focusing on a set of notional domains, we delimited a subset of English frames, adapted them to French data when necessary, and developed the corresponding French lexicon. We believe that working domain by domain helped us to enforce the coherence of the resulting resource, and also has the advantage that, though the number of frames is limited (around a hundred), we obtain full coverage within a given domain.

FrameNet and Typology

Proceedings of the Third Workshop on Computational Typology and Multilingual NLP

FrameNet and the Multilingual FrameNet project have produced multilingual semantic annotations of parallel texts that yield extremely fine-grained typological insights. Moreover, frame semantic annotation of a wide cross-section of languages can provide information on the limits of Frame Semantics (Fillmore, 1982, 1985). Multilingual semantic annotation offers critical input for research on linguistic diversity and recurrent patterns in computational typology. Drawing on results from FrameNet annotation of parallel texts, this paper proposes frame semantic annotation as a new component to complement the state of the art in computational semantic typology.

Exploring Crosslinguistic Frame Alignment

2020

The FrameNet (FN) project at the International Computer Science Institute in Berkeley (ICSI), which documents the core vocabulary of contemporary English, was the first lexical resource based on Fillmore’s theory of Frame Semantics. Berkeley FrameNet has inspired related projects in roughly a dozen other languages, which have evolved somewhat independently; the current Multilingual FrameNet project (MLFN) is an attempt to find alignments between all of them. The alignment problem is complicated by the fact that these projects have adhered to the Berkeley FrameNet model to varying degrees, and they were also founded at different times, when different versions of the Berkeley FrameNet data were available. We describe several new methods for finding relations of similarity between semantic frames across languages. We will demonstrate ViToXF, a new tool which provides interactive visualizations of these cross-lingual relations, between frames, lexical units, and frame elements, based on...

Frame semantics and lexical translation

This study deals with the use of Frame Semantics for lexical translation, particularly for the elaboration of bilingual dictionaries. To this end, we took Fillmore and Atkins's analysis of the word "risk" as our starting point. We analyzed the entries for RISK in three major English-Spanish/Spanish-English bilingual dictionaries in order to locate possible points of confusion, and found three main unclear points: a) the distinction between the lexical items provided as the Spanish equivalents of risk was not clear; b) the use of the reflexive or non-reflexive form of those equivalents was not clear either; and c) there was also some confusion regarding the syntactic complementation of the lexical entries in Spanish. We then set out to check whether Fillmore and Atkins's frame could help to translate risk in a more systematic and functional way. Our analysis showed that the three conceptual schemas distinguished by Fillmore and Atkins helped clarify these three problems, which demonstrates the greater explanatory capacity of Frame Semantics compared to traditional lexicographic methods. Building the frame that underlies the meaning of a word can help to increase not only the functional capacity of dictionaries but also the translator's ability to account for uses which do not appear in a dictionary.