DBpedia spotlight (shedding light on the web of documents)
Related papers
Proceedings of the 7th International Conference on Semantic Systems - I-Semantics '11, 2011
Interlinking text documents with Linked Open Data enables the Web of Data to be used as background knowledge within document-oriented applications such as search and faceted browsing. As a step towards interconnecting the Web of Documents with the Web of Data, we developed DBpedia Spotlight, a system for automatically annotating text documents with DBpedia URIs. DBpedia Spotlight allows users to configure the annotations to their specific needs through the DBpedia Ontology and quality measures such as prominence, topical pertinence, contextual ambiguity and disambiguation confidence. We compare our approach with the state of the art in disambiguation, and evaluate our results in light of three baselines and six publicly available annotation systems, demonstrating the competitiveness of our system. DBpedia Spotlight is shared as open source and deployed as a Web Service freely available for public use.
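The configuration idea in this abstract (restricting annotations by ontology class and by a confidence threshold) can be illustrated with a minimal sketch. This is not DBpedia Spotlight's API or implementation: the `Annotation` record, its field names, the `select_annotations` helper and the sample scores are all hypothetical, chosen only to show what filtering on these two settings could look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one annotated mention (not Spotlight's schema)."""
    surface_form: str      # text span found in the document
    uri: str               # DBpedia URI chosen for the span
    types: frozenset       # ontology classes of the resource (illustrative names)
    confidence: float      # assumed disambiguation score in [0, 1]


def select_annotations(annotations, allowed_types=None, min_confidence=0.0):
    """Keep annotations that meet the confidence threshold and, if a type
    whitelist is given, share at least one class with it."""
    kept = []
    for ann in annotations:
        if ann.confidence < min_confidence:
            continue
        if allowed_types is not None and not (ann.types & set(allowed_types)):
            continue
        kept.append(ann)
    return kept


# Made-up sample data, for illustration only.
candidates = [
    Annotation("Berlin", "http://dbpedia.org/resource/Berlin",
               frozenset({"Place", "City"}), 0.92),
    Annotation("Apple", "http://dbpedia.org/resource/Apple_Inc.",
               frozenset({"Organisation", "Company"}), 0.41),
    Annotation("Mercury", "http://dbpedia.org/resource/Mercury_(planet)",
               frozenset({"Place", "Planet"}), 0.35),
]

places = select_annotations(candidates, allowed_types={"Place"}, min_confidence=0.5)
```

Raising `min_confidence` trades recall for precision, while the type whitelist narrows the output to the classes an application cares about.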
A disambiguation resource extracted from Wikipedia for semantic annotation
The Semantic Annotation (SA) task consists in establishing the relation between a textual entity (a word or group of words designating a real-world named entity or a concept) and its corresponding entity in an ontology. The main difficulty of this task is that a textual entity might be highly polysemic and potentially related to many different ontological representations. To solve this specific problem, various Information Retrieval (IR) techniques can be used. Most of them rely on contextual words to estimate which exact entity has to be recognized. In this paper, we present a resource of contextual words that can be used by IR algorithms to establish a link between a named entity (NE) in a text and an entry point to its semantic description in the LinkedData Network.
DBpediaSameAs: An approach to tackle heterogeneity in DBpedia identifiers
2015
The DBpedia dataset contains multiple URIs, both within the dataset and from other datasets, that are connected by (transitive) owl:sameAs relations and thus refer to the same concepts. This heterogeneity of identifiers makes it difficult for users and agents to find the unique identifier that should preferably be used. We introduce the concept of the DBpedia Unique Identifier (DUI) and a dataset of linksets relating URIs to DUIs. To improve the quality of our dataset, we developed a mechanism that allows users to rate and suggest links. As a proof of concept, an implementation with a graphical web user interface is provided for accessing the linkset and rating the links. The DBpedia sameAs service is available at http://dbpsa.aksw.org/SameAsService.
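Because owl:sameAs is transitive, URIs linked by it fall into clusters, and one member of each cluster can serve as the preferred identifier. The sketch below shows this with a union-find structure. It is an illustration under stated assumptions, not the paper's method: the `canonical_ids` function, the rule for picking the representative (prefer a given namespace prefix, then lexicographic order) and the example URIs are hypothetical.

```python
from collections import defaultdict


def canonical_ids(same_as_pairs, preferred_prefix):
    """Map each URI to one representative of its owl:sameAs cluster.

    The representative rule is an assumption made for this sketch: prefer
    URIs starting with `preferred_prefix`, then the lexicographically smallest.
    """
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    # Merge the clusters of every linked pair (covers transitivity).
    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(list)
    for uri in list(parent):
        clusters[find(uri)].append(uri)

    mapping = {}
    for members in clusters.values():
        representative = min(
            members, key=lambda u: (not u.startswith(preferred_prefix), u)
        )
        for uri in members:
            mapping[uri] = representative
    return mapping


# Illustrative links; the example.org URI is made up.
links = [
    ("http://de.dbpedia.org/resource/Berlin", "http://dbpedia.org/resource/Berlin"),
    ("http://example.org/city/berlin", "http://de.dbpedia.org/resource/Berlin"),
    ("http://dbpedia.org/resource/Paris", "http://fr.dbpedia.org/resource/Paris"),
]
ids = canonical_ids(links, "http://dbpedia.org/resource/")
```

Here the `example.org` URI is never linked directly to the English DBpedia URI, yet it resolves to it through the German one, which is the transitive behaviour the abstract refers to.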
DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia
The DBpedia community project extracts structured, multilingual knowledge from Wikipedia and makes it freely available on the Web using Semantic Web and Linked Data technologies. The project extracts knowledge from 111 different language editions of Wikipedia. The largest DBpedia knowledge base which is extracted from the English edition of Wikipedia consists of over 400 million facts that describe 3.7 million things. The DBpedia knowledge bases that are extracted from the other 110 Wikipedia editions together consist of 1.46 billion facts and describe 10 million additional things. The DBpedia project maps Wikipedia infoboxes from 27 different language editions to a single shared ontology consisting of 320 classes and 1,650 properties. The mappings are created via a world-wide crowd-sourcing effort and enable knowledge from the different Wikipedia editions to be combined. The project publishes regular releases of all DBpedia knowledge bases for download and provides SPARQL query access to 14 out of the 111 language editions via a global network of local DBpedia chapters. In addition to the regular releases, the project maintains a live knowledge base which is updated whenever a page in Wikipedia changes. DBpedia sets 27 million RDF links pointing into over 30 external data sources and thus enables data from these sources to be used together with DBpedia data. Several hundred data sets on the Web publish RDF links pointing to DBpedia themselves and thus make DBpedia one of the central interlinking hubs in the Linked Open Data (LOD) cloud. In this system report, we give an overview of the DBpedia community project, including its architecture, technical implementation, maintenance, internationalisation, usage statistics and applications.
Round-trip semantics with sztakipedia and DBpedia spotlight
2012
We describe a toolkit to support a knowledge-enhancement cycle on the Web. In the first step, structured data extracted from Wikipedia is used to construct automatic content enhancement engines. Those engines can be used to interconnect knowledge in structured and unstructured information sources on the Web, including Wikipedia itself. Sztakipedia-toolbar is a MediaWiki user script which brings DBpedia Spotlight and other kinds of machine intelligence into the Wiki editor interface to provide enhancement suggestions to the user. The suggestions offered by the tool focus on complementing knowledge and increasing the availability of structured data on Wikipedia. This will, in turn, increase the information available to the content enhancement engines themselves, completing a virtuous cycle of knowledge enhancement. A 90-second screencast introduces the system on YouTube: http://www.youtube.com/watch?v=8VW0TrvXpl4. For those interested in more details, there is another 4-minute video: http://www.youtube.com/watch?v=cLqe-DOqKCM.
DBpedia Atlas: Mapping the Uncharted Lands of Linked Data
2015
In the last few years, Linked Open Data sources have increased enormously in number. Despite their potential, it is hard to find effective and efficient ways of navigating and exploring them, mainly because of complexity and volume issues. In fact, application developers, students and researchers who are not experts in Semantic Web technologies often lose themselves in the intricacies of the Web of Data. We propose to address this problem by providing users with a map-like visualization that acts as an entry point for the exploration of a dataset. To this end, we adapt a spatialization approach, based on cartographic and information visualisation techniques, to make it suitable for Linked Data sets with a hierarchical ontological structure. Finally, we apply our method to DBpedia, implementing and testing a prototype web application that shows a comprehensive and organic representation of the more than 4 million instances defined by the dataset.
Improving Wikipedia with DBpedia
Proceedings of the 21st international conference companion on World Wide Web - WWW '12 Companion, 2012
DBpedia is the semantic mirror of Wikipedia. DBpedia extracts information from Wikipedia and stores it in a semantic knowledge base. This semantic feature allows complex semantic queries, which could infer new relations that are missing in Wikipedia. This is an interesting source of knowledge for enriching Wikipedia content. But what is the best way to add these new relations while following the Wikipedia conventions? In this paper, we propose a path indexing algorithm (PIA) which takes the result set of a DBpedia query and discovers the best representative path in Wikipedia. We evaluate the algorithm with real data sets from DBpedia.
Wikidata through the eyes of DBpedia
Semantic Web, 2017
DBpedia is one of the first and most prominent nodes of the Linked Open Data cloud. It provides structured data for more than 100 Wikipedia language editions as well as Wikimedia Commons, has a mature ontology and a stable and thorough Linked Data publishing lifecycle. Wikidata, on the other hand, has recently emerged as a user curated source for structured information which is included in Wikipedia. In this paper, we present how Wikidata is incorporated in the DBpedia ecosystem. Enriching DBpedia with structured information from Wikidata provides added value for a number of usage scenarios. We outline those scenarios and describe the structure and conversion process of the DBpediaWikidata dataset.