Latent Relational Model for Relation Extraction

Learning Relational Representations by Analogy using Hierarchical Siamese Networks

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2019

We address relation extraction as an analogy problem by proposing a novel approach to learn representations of relations expressed by their textual mentions. We assume that if two pairs of entities belong to the same relation, then those two pairs are analogous. Following this idea, we collect a large set of analogous pairs by matching triples in knowledge bases with web-scale corpora through distant supervision. We leverage this dataset to train a hierarchical siamese network that learns entity-entity embeddings encoding relational information across the different linguistic paraphrases expressing the same relation. We evaluate our model in a one-shot learning task, showing promising generalization to unseen relation types, which makes this approach suitable for automatic knowledge base population with minimal supervision. Moreover, the model can be used to generate pretrained embeddings which provide a valuable signal when integrated into an existing neural-based model, outperforming state-of-the-art methods on a downstream relation extraction task.
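
As a rough illustration of the analogy-based training idea, the sketch below feeds two entity-pair mentions through one shared (siamese) encoder and applies a contrastive loss that pulls mentions of the same relation together. The encoder, loss, and all names here are hypothetical simplifications, not the paper's hierarchical architecture.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PairEncoder(nn.Module):
        """Encodes the textual mention of an entity pair into one vector."""
        def __init__(self, vocab_size, dim=128):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, dim)
            self.rnn = nn.GRU(dim, dim, batch_first=True)

        def forward(self, token_ids):
            states, _ = self.rnn(self.emb(token_ids))
            return states.mean(dim=1)  # mean-pool over tokens

    def analogy_loss(encoder, mention_a, mention_b, analogous, margin=0.5):
        """Pull analogous mentions (same relation) together, push others apart."""
        za, zb = encoder(mention_a), encoder(mention_b)  # shared weights: siamese
        d = 1 - F.cosine_similarity(za, zb)
        return torch.where(analogous.bool(), d, F.relu(margin - d)).mean()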

A Language Model for Extracting Implicit Relations

2015

Open Information Extraction has shown promise of overcoming a knowledge engineering bottleneck, but has a fundamental limitation: it is unable to extract implicit relations, where the sentence lacks an explicit relation phrase. We present IMPLIE (Implicit relation Information Extraction), which uses an open-domain syntactic language model and user-supplied semantic taggers to overcome this limitation. IMPLIE can extract implicit has nationality, has job title, and has city relations from "French journalist Paul Legall reported ... at Athens International Airport." Formal evaluations of IMPLIE show high precision, over 0.90 for nationality and job title on newswire text. IMPLIE nearly doubles recall for 2013 KBP Slot Filling queries and more than doubles it for 2014 when combined with an Open IE-based system, maintaining precision of 0.58 and 0.59 respectively.
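
The flavor of tagger-driven implicit extraction can be shown with a toy rule that attaches typed pre-modifiers to a following proper-noun head. The tag sets and the rule are invented for illustration; IMPLIE itself relies on an open-domain syntactic language model rather than fixed word lists.

    # Toy illustration only: tiny stand-ins for user-supplied semantic taggers.
    NATIONALITIES = {"French", "Greek", "German"}
    JOB_TITLES = {"journalist", "minister", "pilot"}

    def extract_implicit(tokens):
        """Attach typed pre-modifiers of a capitalized head noun to that head."""
        triples = []
        for i, tok in enumerate(tokens):
            if tok[0].isupper() and i >= 1:
                for mod in tokens[max(0, i - 3):i]:  # look at nearby modifiers
                    if mod in NATIONALITIES:
                        triples.append((tok, "has_nationality", mod))
                    if mod in JOB_TITLES:
                        triples.append((tok, "has_job_title", mod))
        return triples

    print(extract_implicit("French journalist Paul reported".split()))
    # [('Paul', 'has_nationality', 'French'), ('Paul', 'has_job_title', 'journalist')]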

State-of-the-Art Models for Relation Extraction

2021

The task of relation extraction aims at classifying the semantic relations between entities in a text. When coupled with named-entity recognition, these can be used as the building blocks for an information extraction procedure that results in the construction of a Knowledge Graph. While many NLP libraries support named-entity recognition, there is no off-the-shelf solution for relation extraction. In this paper, we evaluate and compare several state-of-the-art approaches on a subset of the FewRel data set as well as a manually annotated corpus. The custom corpus contains six relations from the area of market research and is available for public use. Our approach provides guidance for the selection of models and training data for relation extraction in real-world projects.

A Generative Model for Relation Extraction and Classification

arXiv (Cornell University), 2022

Relation extraction (RE) is an important information extraction task which provides essential information to many NLP applications such as knowledge base population and question answering. In this paper, we present a novel generative model for relation extraction and classification (which we call GREC), where RE is modeled as a sequence-to-sequence generation task. We explore various encoding representations for the source and target sequences, and design effective schemes that enable GREC to achieve state-of-the-art performance on three benchmark RE datasets. In addition, we introduce negative sampling and decoding scaling techniques which provide a flexible tool to tune the precision and recall performance of the model. Our approach can be extended to extract all relation triples from a sentence in one pass. Although the one-pass approach incurs certain performance loss, it is much more computationally efficient.
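
A minimal sketch of casting RE as sequence-to-sequence generation, assuming a generic pretrained encoder-decoder from the transformers library. The marker tokens and source/target layout are placeholders, not GREC's actual encoding schemes, and an unfine-tuned t5-small will not emit a meaningful relation label.

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tok = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

    # Source marks the entity pair; the target would be the relation label as text.
    source = ("relation between <e1> Paul </e1> and <e2> Athens </e2>: "
              "Paul reported from Athens yesterday.")
    ids = tok(source, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=8)
    # Meaningful only after fine-tuning on (source, relation-label) pairs.
    print(tok.decode(out[0], skip_special_tokens=True))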

Revisiting Unsupervised Relation Extraction

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Unsupervised relation extraction (URE) extracts relations between named entities from raw text without manually-labelled data or existing knowledge bases (KBs). URE methods can be categorised into generative and discriminative approaches, which rely either on hand-crafted features or on surface form. However, we demonstrate that by using only named entities to induce relation types, we can outperform existing methods on two popular datasets. We conduct a comparison and evaluation of our findings against other URE techniques to ascertain the important features in URE. We conclude that entity types provide a strong inductive bias for URE.
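
The paper's core observation, that entity types alone already induce useful relation clusters, can be illustrated with a trivial grouping of entity-pair mentions by their type signature. The data below is made up for the example.

    from collections import defaultdict

    # (head, head type, tail, tail type) mentions; hypothetical toy data.
    mentions = [
        ("Paris", "LOC", "France", "LOC"),
        ("Obama", "PER", "Hawaii", "LOC"),
        ("Lyon", "LOC", "France", "LOC"),
    ]

    clusters = defaultdict(list)
    for head, h_type, tail, t_type in mentions:
        clusters[(h_type, t_type)].append((head, tail))  # one cluster per type pair

    print(dict(clusters))
    # {('LOC', 'LOC'): [('Paris', 'France'), ('Lyon', 'France')],
    #  ('PER', 'LOC'): [('Obama', 'Hawaii')]}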

A Review of Relation Extraction

Many applications in information extraction, natural language understanding, information retrieval require an understanding of the semantic relations between entities. We present a comprehensive review of various aspects of the entity relation extraction task. Some of the most important supervised and semi-supervised classification approaches to the relation extraction task are covered in sufficient detail along with critical analyses. We also discuss extensions to higher-order relations. Evaluation methodologies for both supervised and semi-supervised methods are described along with pointers to the commonly used performance evaluation datasets. Finally, we also give short descriptions of two important applications of relation extraction, namely question answering and biotext mining.

Relation Extraction using Explicit Context Conditioning

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)

Relation Extraction (RE) aims to label relations between groups of marked entities in raw text. Most current RE models learn context-aware representations of the target entities that are then used to establish the relation between them. This works well for intra-sentence RE, and we call such directly established relations first-order relations. However, this methodology can sometimes fail to capture complex and long-range dependencies. To address this, we hypothesize that at times two target entities can be explicitly connected via a context token. We refer to such indirect relations as second-order relations and describe an efficient implementation for computing them. These second-order relation scores are then combined with first-order relation scores. Our empirical results show that the proposed method leads to state-of-the-art performance over two biomedical datasets.
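
A rough numeric sketch of the first-/second-order combination: a second-order score routes through the best intermediate context token, then is mixed with the direct score. The max-product formulation and the mixing weight are illustrative, not the paper's exact parameterization.

    import numpy as np

    n = 6                                   # tokens in a sentence
    first = np.random.rand(n, n)            # first[i, j]: direct relation score i -> j

    # second[i, j] = max over context tokens t of first[i, t] * first[t, j]
    second = (first[:, :, None] * first[None, :, :]).max(axis=1)

    alpha = 0.7                             # hypothetical mixing weight
    combined = alpha * first + (1 - alpha) * second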

PromptORE - A Novel Approach Towards Fully Unsupervised Relation Extraction

Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

Unsupervised Relation Extraction (RE) aims to identify relations between entities in text, without having access to labeled data during training. This setting is particularly relevant for domain-specific RE, where no annotated dataset is available, and for open-domain RE, where the types of relations are a priori unknown. Although recent approaches achieve promising results, they heavily depend on hyperparameters whose tuning would most often require labeled data. To mitigate the reliance on hyperparameters, we propose PromptORE, a "Prompt-based Open Relation Extraction" model. We adapt the novel prompt-tuning paradigm to work in an unsupervised setting, and use it to embed sentences expressing a relation. We then cluster these embeddings to discover candidate relations, and we experiment with different strategies to automatically estimate an adequate number of clusters. To the best of our knowledge, PromptORE is the first unsupervised RE model that does not need hyperparameter tuning. Results on three general- and specific-domain datasets show that PromptORE consistently outperforms state-of-the-art models with a relative gain of more than 40% in B³, V-measure and ARI. Qualitative analysis also indicates PromptORE's ability to identify semantically coherent clusters that are very close to true relations.
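
A minimal sketch of the prompt-then-cluster recipe, assuming a generic BERT encoder from the transformers library and scikit-learn's KMeans. The cloze template and the silhouette-based choice of the cluster count are simplifications of the paper's actual strategies.

    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def relation_embedding(sentence, head, tail):
        """Embed the relation as the [MASK] position of a cloze-style prompt."""
        prompt = f"{sentence} {head} [MASK] {tail}."   # hypothetical template
        inputs = tok(prompt, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]
        mask_pos = (inputs.input_ids[0] == tok.mask_token_id).nonzero()[0, 0]
        return hidden[mask_pos].numpy()

    # embs = np.stack([relation_embedding(s, h, t) for s, h, t in corpus])
    # Pick the cluster count k with the best silhouette score:
    # best_k = max(range(2, 20), key=lambda k: silhouette_score(
    #     embs, KMeans(n_clusters=k, n_init=10).fit_predict(embs)))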

Life is a Circus and We are the Clowns: Automatically Finding Analogies between Situations and Processes

arXiv (Cornell University), 2022

Analogy-making gives rise to reasoning, abstraction, flexible categorization and counterfactual inference, abilities lacking in even the best AI systems today. Much research has suggested that analogies are key to non-brittle systems that can adapt to new domains. Despite their importance, analogies have received little attention in the NLP community, with most research focusing on simple word analogies. Work that tackled more complex analogies relied heavily on manually constructed, hard-to-scale input representations. In this work, we explore a more realistic, challenging setup: our input is a pair of natural language procedural texts, describing a situation or a process (e.g., how the heart works / how a pump works). Our goal is to automatically extract entities and their relations from the text and find a mapping between the different domains based on relational similarity (e.g., blood is mapped to water). We develop an interpretable, scalable algorithm and demonstrate that it identifies the correct mappings 87% of the time for procedural texts and 94% for stories from cognitive-psychology literature. We show it can extract analogies from a large dataset of procedural texts, achieving 79% precision (analogy prevalence in data: 3%). Lastly, we demonstrate that our algorithm is robust to paraphrasing the input texts.
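
The relational-similarity mapping step can be caricatured with tiny made-up entity vectors and an optimal assignment between the two domains; the real system derives these representations from relations extracted from the texts.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # Hypothetical relational feature vectors for entities in each domain.
    heart = {"blood": [0.9, 0.1], "heart": [0.2, 0.8]}     # domain A
    pump  = {"water": [0.88, 0.12], "pump": [0.25, 0.77]}  # domain B

    A = np.array(list(heart.values()))
    B = np.array(list(pump.values()))
    sim = A @ B.T / (np.linalg.norm(A, axis=1)[:, None]
                     * np.linalg.norm(B, axis=1)[None, :])

    rows, cols = linear_sum_assignment(-sim)   # maximize total similarity
    for r, c in zip(rows, cols):
        print(list(heart)[r], "->", list(pump)[c])
    # blood -> water
    # heart -> pump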

Paths to Relation Extraction through Semantic Structure

2021

Syntactic and semantic structure directly reflect relations expressed by the text at hand and are thus very useful for the relation extraction (RE) task. Their symbolic nature allows increased interpretability for end-users and developers, which is particularly appealing in RE. Although they have been somewhat overshadowed recently by the use of end-to-end neural network models and contextualized word embeddings, we show that they may be leveraged as input for neural networks to positive effect. We present two methods for integrating broad-coverage semantic structure (specifically, UCCA) into supervised neural RE models, demonstrating benefits over the use of exclusively syntactic integrations. The first method involves reduction of UCCA into a bilexical structure, while the second leverages a novel technique for encoding semantic DAG structures. Our approach is general and can be used for integrating a wide range of graph-based semantic structures.
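
One way to picture the bilexical reduction: collapse each semantic node to a single head token so that node-to-node edges become token-to-token arcs a standard parser-based RE model can consume. The head rule below is a deliberate oversimplification of the paper's UCCA reduction.

    tokens = ["John", "gave", "Mary", "a", "book"]
    # Each node covers a token span; edges connect node ids with a label.
    nodes = {0: (0, 1), 1: (1, 2), 2: (2, 3), 3: (3, 5)}   # id -> (start, end)
    edges = [(1, 0, "A"), (1, 2, "A"), (1, 3, "A")]        # (parent, child, label)

    def head(span):
        start, end = span
        return end - 1        # naive rule: last token of the span acts as head

    bilexical = [(tokens[head(nodes[p])], tokens[head(nodes[c])], lab)
                 for p, c, lab in edges]
    print(bilexical)
    # [('gave', 'John', 'A'), ('gave', 'Mary', 'A'), ('gave', 'book', 'A')]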