BUAP_1: A Naïve Approach to the Entity Linking Task

Nus-i2r: Learning a combined system for entity linking

2010

In this paper, we report the joint participation of the NUS and I2R teams in the Knowledge Base Population track at the Text Analysis Conference 2010. For Entity Linking, we analyze IR approaches and SVM classification in the disambiguation stage and develop a supervised learner for combining these approaches. The combined system performs better than the individual components and achieves results much better than the median. Furthermore, according to our error analysis, a considerable number of errors are caused by the use of a different Wikipedia version, which prevents our system from showing significantly better performance.

Mining and Leveraging Background Knowledge for Improving Named Entity Linking

Proceedings of the 8th International Conference on Web Intelligence, Mining and Semantics, 2018

Knowledge-rich Information Extraction (IE) methods aspire towards combining classical IE with background knowledge obtained from third-party resources. Linked Open Data repositories that encode billions of machine readable facts from sources such as Wikipedia play a pivotal role in this development. The recent growth of Linked Data adoption for Information Extraction tasks has shed light on many data quality issues in these data sources that seriously challenge their usefulness such as completeness, timeliness and semantic correctness. Information Extraction methods are, therefore, faced with problems such as name variance and type confusability. If multiple linked data sources are used in parallel, additional concerns regarding link stability and entity mappings emerge. This paper develops methods for integrating Linked Data into Named Entity Linking methods and addresses challenges in regard to mining knowledge from Linked Data, mitigating data quality issues, and adapting algorithms to leverage this knowledge. Finally, we apply these methods to Recognyze, a graph-based Named Entity Linking (NEL) system, and provide a comprehensive evaluation which compares its performance to other well-known NEL systems, demonstrating the impact of the suggested methods on its own entity linking performance.

Context-Based Entity Linking – University of Amsterdam at TAC 2012

This paper describes our approach to the 2012 Text Analysis Conference (TAC) Knowledge Base Population (KBP) entity linking track. For this task, we turn to a state-of-the-art system for entity linking in microblog posts. Compared to the little context microblog posts provide, the documents in the TAC KBP track provide context of greater length and of a less noisy nature. In this paper, we adapt the entity linking system for microblog posts to the KBP task by extending it with approaches that explicitly rely on the query's context. We show that incorporating novel features that leverage the context on the entity-level can lead to improved performance in the TAC KBP task.

A Hybrid Approach to Domain-Specific Entity Linking

The current state-of-the-art Entity Linking (EL) systems are geared towards corpora that are as heterogeneous as the Web, and therefore perform sub-optimally on domain-specific corpora. A key open problem is how to construct effective EL systems for specific domains, as knowledge of the local context should in principle increase, rather than decrease, effectiveness. In this paper we propose the hybrid use of simple specialist linkers in combination with an existing generalist system to address this problem. Our main findings are the following. First, we construct a new reusable benchmark for EL on a corpus of domain-specific conversations. Second, we test the performance of a range of approaches under the same conditions, and show that specialist linkers obtain high precision in isolation, and high recall when combined with generalist linkers. Hence, we can effectively exploit local context and get the best of both worlds.
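The hybrid idea above can be sketched as a simple fallback chain. This is a minimal illustration under stated assumptions, not the paper's method: the abstract does not say how the linkers are combined, and the names `specialist_link`, `hybrid_link`, `domain_lexicon`, and the `generalist` callable are all hypothetical.

```python
def specialist_link(mention, domain_lexicon):
    """Exact-match lookup in a small curated domain lexicon (hypothetical)."""
    return domain_lexicon.get(mention.lower())


def hybrid_link(mention, domain_lexicon, generalist):
    """Prefer the specialist answer; fall back to the generalist linker.

    `generalist` is any callable mapping a mention string to an entity id
    or None. It stands in for an existing general-purpose EL system.
    """
    entity = specialist_link(mention, domain_lexicon)
    if entity is not None:
        return entity, "specialist"
    entity = generalist(mention)
    source = "generalist" if entity is not None else "none"
    return entity, source
```

Trying the specialist first keeps its precision on in-domain mentions, while the fallback recovers mentions the lexicon does not cover, which is one plausible way to obtain the precision and recall trade-off the abstract describes.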

Supervised Learning for Linking Named Entities to Knowledge Base Entries

2011

This paper addresses the challenging information extraction problem of linking named entities in text to entries in a knowledge base. Our approach uses supervised learning to (a) rank candidate knowledge base entries for each named entity, (b) classify the top-ranked entry as the correct disambiguation or not, and (c) group together the named entities without a corresponding entry in the knowledge base.
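The three stages (a) to (c) can be sketched end to end as follows. This is only an illustrative skeleton: the abstract describes supervised models for each stage, whereas this sketch substitutes a token-overlap score for the learned ranker, a fixed threshold (`nil_threshold`, an assumed value) for the learned classifier, and grouping by lower-cased name for the clustering step. All function and parameter names are hypothetical.

```python
from collections import defaultdict


def score(mention_context, entry_text):
    """Toy ranking score: Jaccard overlap between context and entry tokens."""
    a = set(mention_context.lower().split())
    b = set(entry_text.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def link(mentions, kb, nil_threshold=0.2):
    """mentions: list of (name, context); kb: dict entry_id -> (title, text)."""
    links = {}
    nil_groups = defaultdict(list)
    for idx, (name, context) in enumerate(mentions):
        # (a) rank candidate entries whose title contains the mention name
        candidates = [
            (score(context, text), entry_id)
            for entry_id, (title, text) in kb.items()
            if name.lower() in title.lower()
        ]
        best = max(candidates, default=None)
        # (b) accept or reject the top-ranked candidate
        if best is not None and best[0] >= nil_threshold:
            links[idx] = best[1]
        else:
            # (c) group mentions without a KB entry by normalized name
            nil_groups[name.lower()].append(idx)
    return links, dict(nil_groups)
```

In a real system each stand-in would be replaced by a trained model, but the control flow (rank, then validate the top candidate, then group the rejected mentions) stays the same.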

Entity Linking is the task of detecting, in text documents, relevant mentions to entities of a given knowledge base. To this end, entity-linking algorithms use several signals and features extracted from the input text or from the knowledge base. The most important of such features is entity relatedness. Indeed, we argue that these algorithms benefit from maximizing the relatedness among the relevant entities selected for annotation, since this minimizes errors in disambiguating entity-linking. The definition of an effective relatedness function is thus a crucial point in any entity-linking algorithm. In this paper we address the problem of learning high-quality entity relatedness functions. First, we formalize the problem of learning entity relatedness as a learning-to-rank problem. We propose a methodology to create reference datasets on the basis of manually annotated data. Finally, we show that our machine-learned entity relatedness function performs better than other relatedness functions previously proposed, and, more importantly, improves the overall performance of different state-of-the-art entity-linking algorithms.

Bhilai Institute of Technology Durg at TAC 2010: Knowledge Base Population Task Challenge

Theory and Applications of Categories, 2010

The present communication reports to the TAC forum on the system developed for the Entity-Linking task. Noun phrases relevant to the search term were used to correlate the entity-relevant happenings mentioned in the source documents with the entity-relevant information in the knowledge base. Care was taken to identify abbreviation and full-form variants of search names when present in the same set of queries or in the provided knowledge base as Wikipedia structured nodes. The resulting knowledge-base node hits led to the ranking of their respective content for retrieving entity-linking node-id responses. Three different runs were submitted to the Task Challenge, following different ways of searching for the most meaningfully relevant information about the entities mentioned in the Entity-Linking target list from the knowledge base of the track. The team's spirits feel elevated at the thought of using the free text from the Wikipedia pages associated with the knowl...
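Matching an abbreviation to its full-form variant, as mentioned in this abstract, could be approximated with an initials check. The abstract does not describe the actual matching rule, so the sketch below is an assumption: `initials` and `is_abbreviation_of` are hypothetical helpers that compare a short form against the initials of the capitalized words in a full name.

```python
def initials(full_name):
    """Return the initials of the capitalized words in a name."""
    return "".join(w[0] for w in full_name.split() if w[0].isupper())


def is_abbreviation_of(short, full_name):
    """True if `short` (dots ignored, case-insensitive) equals the initials."""
    return short.replace(".", "").upper() == initials(full_name).upper()
```

Skipping lower-cased words such as "of" lets "BIT" match "Bhilai Institute of Technology"; names whose abbreviations keep function words would need a looser rule.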

SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous Labels

Proceedings of the 13th International Conference on Semantic Systems, 2017

Webpages are an abundant source of textual information with manually annotated entity links, and are often used as a source of training data for a wide variety of machine learning NLP tasks. However, manual annotations such as those found on Wikipedia are sparse, noisy, and biased towards popular entities. Existing entity linking systems deal with those issues by relying on simple statistics extracted from the data. While such statistics can effectively deal with noisy annotations, they introduce bias towards head entities and are ineffective for long tail (e.g., unpopular) entities. In this work, we first analyze statistical properties linked to manual annotations by studying a large annotated corpus composed of all English Wikipedia webpages, in addition to all pages from the CommonCrawl containing English Wikipedia annotations. We then propose and evaluate a series of entity linking approaches, with the explicit goal of creating highly-accurate (precision > 95%) and broad anno...

Linking Entities to Wikipedia Documents

2013

This paper addresses the challenging information extraction problem of linking named entities in text to entries in a large knowledge base such as Wikipedia. The approach, which is essentially an evolution of a system originally developed in the context of the English Entity Linking Task of the Text Analysis Conference, uses supervised learning to rank candidate knowledge base entries for each named entity, and then to classify the top-ranked entry as the correct disambiguation or not. In this paper, I analyze the fundamental design challenges involved in the development of a learning-based entity-linking system, and provide extensive experimental results with both Portuguese and Spanish texts, for a wide range of methods and feature sets. The experiments demonstrate the effectiveness of supervised learning methods, showing that out-of-the-box algorithms and relatively simple-to-compute features can obtain high accuracy in this task.