Extracting Semantics with Lexical and Ontological Knowledge Sources: A Methodology for Context-Aware Information Retrieval from the World Wide Web
Related papers
Making the Web More Semantic: A Methodology for Context-Aware Query Processing
The continued growth of the World Wide Web has made the retrieval of relevant information for a user's query increasingly difficult. A major obstacle to more accurate and semantically sound retrieval is the lack of intelligence in web search systems. This research presents a methodology to increase the semantic content of web query results by building context-aware queries. The methodology contains heuristic mechanisms that use lexical sources such as WordNet and ontologies such as the DAML library to augment a query. A semantic net representation facilitates the process. The methodology has been implemented in a research prototype that connects to search engines (Google and AlltheWeb) to execute the augmented query. An empirical test of the methodology, which compares its results against those obtained directly from the search engines, demonstrates that the proposed methodology provides more relevant results to users.
CONQUER: A Methodology for Context-Aware Query Processing on the World Wide Web
Information Systems Research, 2008
A major impediment to accurate information retrieval from the World Wide Web is the inability of search engines to incorporate semantics in the search process. This research presents a methodology, CONQUER (CONtext-aware QUERy processing), that enhances the semantic content of Web queries using two complementary knowledge sources: lexicons and ontologies. The methodology constructs a semantic net using the original query as a seed, and refines the net with terms from the two knowledge sources. The enhanced query, represented by the refined semantic net, can be executed by search engines. This paper describes the methodology and its implementation in a prototype. An empirical evaluation shows that queries suggested by the prototype produce more relevant results than those obtained by the original queries. The research, thus, provides a successful demonstration of the use of existing knowledge sources to enhance the semantic content of Web queries. The paper concludes by identifying potential uses of such enhancements of search technology in organizational contexts.
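The idea of seeding a semantic net with the query and refining it from a lexicon and an ontology can be illustrated with a minimal Python sketch. It is not the CONQUER implementation: the toy dictionaries `TOY_LEXICON` and `TOY_ONTOLOGY`, the function names, the relation labels, and the choice to render only synonyms as OR-alternatives are all assumptions made for illustration. A real system would draw on sources such as WordNet and an ontology library and apply the heuristics described in the paper.

```python
# Hypothetical stand-ins for the two knowledge sources (not real WordNet/DAML data).
TOY_LEXICON = {            # term -> synonyms, as a lexicon might supply
    "car": ["automobile", "auto"],
    "rental": ["hire"],
}
TOY_ONTOLOGY = {           # term -> related concepts, as an ontology might supply
    "car": ["vehicle"],
}


def build_semantic_net(query, lexicon, ontology):
    """Seed a net with the query terms, then attach labeled edges
    to terms found in the lexicon and the ontology."""
    net = {}
    for term in query.lower().split():
        edges = net.setdefault(term, [])  # each query term is a seed node
        for synonym in lexicon.get(term, []):
            edges.append(("synonym", synonym))
        for concept in ontology.get(term, []):
            edges.append(("related_concept", concept))
    return net


def to_boolean_query(net):
    """Render the net as a keyword query: synonyms become OR-alternatives,
    and the seed terms are joined with AND. Related concepts are kept in
    the net but left out of the query string in this sketch."""
    groups = []
    for term, edges in net.items():
        alternatives = [term] + [t for rel, t in edges if rel == "synonym"]
        if len(alternatives) > 1:
            groups.append("(" + " OR ".join(alternatives) + ")")
        else:
            groups.append(term)
    return " AND ".join(groups)


net = build_semantic_net("car rental", TOY_LEXICON, TOY_ONTOLOGY)
print(to_boolean_query(net))
# prints: (car OR automobile OR auto) AND (rental OR hire)
```

Keeping the net as a separate structure, rather than building the query string directly, mirrors the abstract's point that the enhanced query is represented by the refined net and only then handed to a search engine.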
Searching the Web: From Keywords to Semantic Queries
Third International Conference on Information Technology and Applications (ICITA'05), 2005
Within the emergent Semantic Web framework, traditional web search engines based on user-supplied keywords are no longer adequate. Instead, new methods based on the semantics of user keywords must be defined to search the vast Web space without incurring an undesirable loss of information.
Using the semantic web for web searches
2004
ReQuest is a semantic search system for specialized domains. It aims to offer context-based searches by integrating Semantic Web technologies such as ontologies and resource description files. ReQuest was built to evaluate, through a user survey, how the relevance of semantic searches compares with that of the regular searches used in current information retrieval systems.
From Web search to Semantic Web search
2008
Many experts predict that the next huge step forward in Web information technology will be achieved by adding semantics to Web data, and will possibly consist of (some form of) the Semantic Web. In this paper, we present an approach to Semantic Web search, which combines standard Web search with ontological background knowledge. In fact, we show how standard Web search engines can be used as the main inference motor for ontology-based search. To make this possible, lightweight software clients are used for annotation and query decomposition. We develop the formal model behind this approach and also provide an implementation in desktop search. Experiments show that the implementation scales quite well to very large amounts of data.
Context-Aware Query Processing on the Semantic Web
The continued growth of the World Wide Web makes the need for retrieval of relevant information for a user's query increasingly important. Current search engines provide the user many Web pages, but with varying levels of relevancy. In this research, an architecture is presented for the development of an intelligent agent methodology to automate the processing of a user's query, while taking into account the query's context. Four sample queries are processed simulating the methodology of the agent. The queries differ on two dimensions: the amount of clarity of the domain and the number of related terms. The results of the queries are compared to those obtained from the Google search engine. The comparisons show that applying the intelligent agent methodology to the queries produces significantly fewer Web pages that are equally or more relevant to the user.
Assessing the Effectiveness of the DAML Ontologies for the Semantic Web
2003
The continued growth of the World Wide Web makes the retrieval of relevant information for a user's query increasingly difficult. Current search engines provide the user with many web pages, but with varying levels of relevancy. In response, the Semantic Web has been proposed to retrieve and use more semantic information from the web. Our prior research has developed an intelligent agent to automate the processing of a user's query while taking into account the query's context. The intelligent agent uses WordNet and the DARPA Agent Markup Language (DAML) ontologies to act as surrogates for understanding the context of terms in a user's query. This research develops a set of syntactic, semantic, and pragmatic constructs to assess the effectiveness of the DAML ontologies so that the intelligent agent can select the most useful ontologies. These constructs have been implemented in a tool called the "Ontology Auditor" for use by the intelligent agent.
Word semantics for information retrieval: moving one step closer to the Semantic Web
ICTAI, 2001
The goal of the Semantic Web is to create a new form of Web content meaningful to computers. The Semantic Web aims to provide greater functionality via intelligent tools such as information extractors, brokers, reasoning services, or question answering systems. Semantics can be addressed at several levels. In this paper, we focus on the lowest level, word semantics, on which higher levels such as the concept, paragraph, or document level can be based. This model, which we call Word Semantics (WS), does not include the rich set of tags proposed by the XML/RDF standards. Nevertheless, this simpler WS format comes with a big advantage: it is feasible with existing technologies and resources. In practice, this new model relies on understanding word meanings, identifying important named entities such as persons, organizations, and others, and linking all this information via an external general-purpose ontology, namely WordNet. With these features, we regard the WS model as a short but strong step toward the long-term goal of a Semantic Web.
Context based semantic data retrieval
uam.es
The large quantity of information accessible across the Internet has raised the information retrieval problem in a new and urgent form. Search engines, services that return a list of more or less relevant documents based on keywords, represent a first response to this problem. Nevertheless, in spite of their great utility, search engines suffer from several limitations that affect the precision of query results and, therefore, their usefulness. Solutions based on formal annotations, like those proposed for the Semantic Web, suffer from important limitations since they do not take into account the context in which the search is performed. In this work we present a search system based on the idea that each search is performed in the context of a certain activity, a context which will be formalized.
Semantic Lexical Resources Applied to Content-based Querying - the OntoQuery Project
2002
This paper deals with the exploitation of the lexical and conceptual knowledge coded in the SIMPLE-DK lexicon in the methodology for content-based querying developed by the OntoQuery project. SIMPLE-DK has proven to be a rich and flexible lexical resource, which the project has taken advantage of in several ways. First, the paper explains how the ontology provided by SIMPLE is used by the current project prototype to derive conceptual descriptors on which to base the matching of documents to user queries. Furthermore, it discusses how selectional restrictions and qualia roles, both coded in SIMPLE, can be used to construct an ontological grammar to build more complex descriptors.