Automatic Multilingual Lexicon Generation Using Wikipedia as a Resource

Strategies in automatic traversal of Wikipedia articles for mining multilingual resources

TKE 2012

Abstract. In this article we present Termontospider, a wiki crawler that optimally traverses Wikipedia in search of domain-specific texts for extracting terminological and ontological information. The crawler is part of a tool suite for automatically developing multilingual termontological databases, i.e., ontologically underpinned multilingual terminological databases. The focus is on determining the best values for internal links, categories and other metadata when assigning weights and choosing search mechanisms for network traversal. Keywords: data ...
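The weighted traversal described in this abstract can be illustrated with a best-first search over a small in-memory link graph. This is only a sketch under stated assumptions, not Termontospider's actual method: the function name `best_first_crawl`, the weights `w_cat` and `w_link`, the scoring formula, and the sample pages are all hypothetical, and no real Wikipedia API is called.

```python
import heapq

def best_first_crawl(links, categories, seed, domain_cats,
                     w_cat=2.0, w_link=1.0, max_pages=5):
    """Visit pages in order of a weighted relevance score (hypothetical scoring).

    links:       dict mapping a page title to the titles it links to
    categories:  dict mapping a page title to its set of categories
    domain_cats: categories considered relevant to the target domain
    """
    visited = []        # pages in the order they were crawled
    seen = set()
    inlinks = {}        # page -> number of already-visited pages linking to it
    heap = [(0.0, seed)]
    while heap and len(visited) < max_pages:
        _, page = heapq.heappop(heap)
        if page in seen:
            continue    # stale queue entry for a page visited earlier
        seen.add(page)
        visited.append(page)
        for target in sorted(links.get(page, ())):
            if target in seen:
                continue
            inlinks[target] = inlinks.get(target, 0) + 1
            overlap = len(categories.get(target, set()) & domain_cats)
            score = w_cat * overlap + w_link * inlinks[target]
            # heapq is a min-heap, so push the negated score.
            heapq.heappush(heap, (-score, target))
    return visited

# Hypothetical sample data.
links = {
    "Terminology": ["Ontology", "Football", "Lexicon"],
    "Ontology": ["Lexicon", "Football"],
}
categories = {
    "Terminology": {"Linguistics"},
    "Ontology": {"Knowledge representation"},
    "Lexicon": {"Linguistics"},
    "Football": {"Sports"},
}
domain_cats = {"Linguistics", "Knowledge representation"}

crawled = best_first_crawl(links, categories, "Terminology",
                           domain_cats, max_pages=3)
print(crawled)
```

With a budget of three pages, the off-domain page is reached last in priority and is therefore never crawled; changing `w_cat` and `w_link` changes that ordering, which is the kind of tuning the abstract refers to.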

WikiBABEL: A System for Multilingual Wikipedia Content

Proc. of AMTA …, 2010

This position paper outlines our project, WikiBABEL, which will be released as an open source project for the creation of multilingual Wikipedia content and has the potential to produce parallel data as a by-product for Machine Translation systems research. We ...

Building bilingual dictionaries from parallel web documents

2002

In this paper we describe a system for automatically constructing a bilingual dictionary for cross-language information retrieval applications. We describe how we automatically target candidate parallel documents, filter the candidate documents and process them to create parallel sentences. The parallel sentences are then automatically translated using an adaptation of the EMIM technique and a dictionary of translation terms is created. We evaluate our dictionary using human experts.
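To make the dictionary-building step concrete, the sketch below scores word pairs from aligned sentences with the textbook expected mutual information measure (EMIM) and keeps the best-scoring target word for each source word. The abstract only says that an adaptation of EMIM is used, so this is an assumption about the general idea, not the paper's method; the function names and the toy sentence pairs are hypothetical.

```python
import math
from collections import Counter
from itertools import product

def emim(n, n_s, n_t, n_st):
    """Textbook EMIM of two binary events, from counts over n sentence pairs.

    n_s / n_t: pairs containing the source / target word
    n_st:      pairs containing both
    """
    cells = {
        (1, 1): n_st,
        (1, 0): n_s - n_st,
        (0, 1): n_t - n_st,
        (0, 0): n - n_s - n_t + n_st,
    }
    marg_s = {1: n_s, 0: n - n_s}
    marg_t = {1: n_t, 0: n - n_t}
    total = 0.0
    for (a, b), joint in cells.items():
        if joint == 0:
            continue  # the term 0 * log(0) is taken as 0
        total += (joint / n) * math.log(joint * n / (marg_s[a] * marg_t[b]))
    return total

def build_dictionary(pairs):
    """Map each source word to its highest-EMIM co-occurring target word."""
    n = len(pairs)
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        joint.update(product(s_words, t_words))
    best = {}
    for (s, t), n_st in joint.items():
        score = emim(n, src_count[s], tgt_count[t], n_st)
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return {s: t for s, (t, _) in best.items()}

# Hypothetical toy corpus of aligned sentences.
pairs = [
    ("the cat", "le chat"),
    ("the dog", "le chien"),
    ("a cat", "un chat"),
    ("a dog", "un chien"),
]
dictionary = build_dictionary(pairs)
print(dictionary)
```

EMIM sums over presence and absence of both words, so strongly anti-correlated pairs can also score highly. The sketch sidesteps this by scoring only pairs that co-occur at least once; a real system would need more careful filtering.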

Automatic Wikipedia Link Generation Based On Interlanguage Links

ArXiv, 2017

This paper presents a new way to increase interconnectivity in small Wikipedias (fewer than 100,000 articles) by automatically linking articles based on interlanguage links. Many small Wikipedias have many articles with very few links, mainly because the articles are short. This makes it difficult to navigate between articles. In many cases the target article does exist in the small Wikipedia, but the link to it is missing. Because Wikipedias are translated into many languages, new links for small Wikipedias can be generated from the links in a large Wikipedia (more than 100,000 articles).
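The idea in this abstract can be sketched as a lookup through the interlanguage mapping: for each small-wiki article, take the links of its large-wiki counterpart and keep those whose targets also exist in the small wiki but are not yet linked. This is a minimal illustration, not the paper's implementation; the function name `suggest_links`, the dictionary-based data layout, and the sample titles are assumptions.

```python
def suggest_links(small_links, large_links, small_to_large):
    """Suggest missing links for a small wiki from a large wiki's links.

    small_links:    dict, small-wiki title -> set of titles it already links to
    large_links:    dict, large-wiki title -> set of titles it links to
    small_to_large: dict, small-wiki title -> large-wiki title
                    (the interlanguage links)
    """
    # Invert the interlanguage mapping to translate link targets back.
    large_to_small = {large: small for small, large in small_to_large.items()}
    suggestions = {}
    for article, counterpart in small_to_large.items():
        existing = small_links.get(article, set())
        candidates = set()
        for target in large_links.get(counterpart, set()):
            local = large_to_small.get(target)
            # Keep only targets that exist locally and are not linked yet.
            if local is not None and local != article and local not in existing:
                candidates.add(local)
        if candidates:
            suggestions[article] = candidates
    return suggestions

# Hypothetical sample data.
small_to_large = {"Katt": "Cat", "Hund": "Dog", "Mus": "Mouse"}
large_links = {"Cat": {"Dog", "Mouse", "Felidae"}, "Dog": {"Cat"}}
small_links = {"Katt": {"Hund"}}

suggested = suggest_links(small_links, large_links, small_to_large)
print(suggested)
```

In this toy data, "Felidae" has no counterpart in the small wiki and is dropped, while "Mus" is proposed as a new link from "Katt" because the article exists locally but is not yet linked.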