The Journey is the Reward - Towards New Paradigms in Web Search

2006

The Web has grown from a simple hypertext system for research labs into a ubiquitous information system covering virtually all human knowledge, e.g., movies, images, music, and documents. Traditional browsing is often inadequate for locating information that satisfies the user's needs. Even search engines based on the Information Retrieval approach, despite their huge indexes, show many drawbacks that force users to sift through long lists of results or reformulate queries several times.

Search Engines going beyond Keyword Search: A Survey

To solve the problem of information overkill on the web or in large domains, current information retrieval tools, especially search engines, need to be improved. Much more intelligence should be embedded in search tools so that they manage the search and filtering processes effectively and present relevant information. As the web swells with more and more data, the predominant way of sifting through that data, keyword search, will one day break down in its ability to deliver the exact information people want at their fingertips. Hence search engines are trying to break the shackles of keyword search, which is what most search engines typically do. This paper identifies the major challenges that today's keyword search engines face in adapting to the fast growth of the web and supporting comprehensive user demands quickly. It then surveys the non-keyword-based paradigms proposed, developed, or implemented by researchers and search engines, and classifies those approaches according to the features each search engine focuses on to deliver results.

A STUDY OF WEB INFORMATION RETRIEVAL AND E-CONTENT SEARCHING SYSTEMS

IAEME PUBLICATION, 2020

Searching for relevant content for a particular application area is a challenging task in today’s ocean of knowledge on the web. Various factors, such as the user query, the search technique, and the ranking algorithm, affect both the traditional information retrieval process and modern semantic-based search systems. The present study aims to provide a comprehensive review of the literature on the approaches used for retrieving and searching relevant e-content.

Semantic search engines

We are all aware of the two-word term Information Retrieval (IR), which is the process of retrieving or gathering information from a given document or file. The concept of Information Retrieval has gained prominence over many years because of the large collection of information available as documents on the Internet; arranging and retrieving the words used in them is a cumbersome task. The information can be structured, unstructured, or semi-structured. This paper consists of four sections. In section 1,

A Smart Query Formulation for an Efficient Web Search

2007

Traditional search engines rely on keyword-based matching: they retrieve the documents that contain occurrences of the input keywords but completely ignore the meaning of the retrieved documents. Thus, long lists of page links are returned, but only a handful of pages actually refer to relevant web resources and meet the needs of users. The need for greater awareness in the interpretation of web data has led to new approaches and methodologies for improving web search and retrieval by taking into account the context of information related to the user query. This work presents an approach for supporting the user in the Web search activity: it interprets the input query and, on the basis of the local knowledge, replies by providing (links to) web pages that are more relevant to the meaning of the input query. The approach combines the intrinsic potential of the agent-based paradigm with the modeling of knowledge through soft computing techniques. The agents encode the semantics of data, by exploiting ontologies, in order to grasp the actual query meaning. The information elicited by the query interpretation represents an add-on, aimed at augmenting the system knowledge, which is exploited in the discovery of web pages that match the user request.

[...] engines are not efficient in terms of time and bandwidth). On the other hand, reusing the existing large indexes of general-purpose search engines is a solution for retrieving, after a filtering activity, documents from a specific domain (though the response time to the user query is slow too). Similarly, other approaches cluster the results to organize documents automatically into categories (e.g., WiseNut and Vivisimo [34]). Metasearch environments instead implement strategies that apply user queries to several search engines simultaneously. However, many of these approaches do not consider the semantic relationships existing among terms: query ambiguity and the vocabulary gap remain impediments, confirming that search engine technology is still far from giving the ideal response to a given query.
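The vocabulary gap mentioned above can be illustrated with a minimal sketch of keyword-based matching. This is not the paper's system: the tiny corpus, the function names `build_index` and `keyword_search`, and the whitespace tokenization are all assumptions made for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def keyword_search(index, query):
    """Return ids of documents containing every query token (boolean AND)."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)

# Hypothetical three-document corpus; documents 1 and 2 describe the same thing
# with different words.
docs = {
    1: "cheap car rental in rome",
    2: "low cost automobile hire in rome",
    3: "history of rome",
}
index = build_index(docs)
car_hits = keyword_search(index, "car rome")
automobile_hits = keyword_search(index, "automobile rome")
```

With this corpus, the query "car rome" finds only document 1 and "automobile rome" finds only document 2, although both documents would plausibly satisfy either user. Pure term matching has no way to relate the two words, which is the kind of gap that ontology-based query interpretation is meant to close.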

Bridging the Gap: From Traditional Information Retrieval to the Semantic Web

Web is the nature of information search. The Semantic Web vision reveals a radical departure from the traditional theories of Information Retrieval (IR) upon which current search engine technology is built. Semantic Web researchers are very articulate about how the pillars of the Semantic Web (semantically aware, intelligent agents; ontologies; and markup languages) will revolutionize the way that we interact with information on the web. They are less articulate about how we will get there from here. While it's true that the traditional assumptions of IR (small, static, homogeneous, centrally located, monolingual document collections) don't hold for the Web, it is still important to note the success of search engines built on IR theory. This paper calls attention to the gap between traditional IR and the more visionary Semantic Web research. We describe a preliminary roadmap bridging the two areas, focusing on the concrete contributions and also calling attention to the weak points of both fields.

Design and Evaluation of Semantic Guided Search Engine

International Journal of Web Engineering, 2012

Search engines provide a gateway through which people can find relevant information in large collections of heterogeneous data. Search engines efficiently service the information needs of people who require access to the data therein. Web search engines service millions of queries per day and search collections that contain billions of documents. As the number of documents available in such collections continues to grow, the task of finding documents that are relevant to user queries becomes increasingly costly. In this work, the Semantic Guided Internet Search Engine is built as an efficient search engine (crawling, indexing, and ranking web pages) by applying two approaches. The first implements semantic principles in the searching stage, relying on morphology (stemming) and a synonyms dictionary. The second implements a guided concept during the query input stage, which assists the user in finding suitable, correctly spelled words. The advantage of the guided concept is that it reduces the probability of entering wrong words in the query. The research concludes that the returned web pages are semantic pages that match synonyms of the query terms, which achieves the concept of semantic search; compared with Google, good results are obtained on the Recall and Precision measurements, reaching 95%-100% for some queries in spite of the difference in environment between the two systems. The performance of the search is also improved by using guided search and the improved PageRank, which reduces the retrieval time. Finally, removing stop words from a document minimizes the storage space, which enhances the proposed system.
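The combination of stemming and a synonyms dictionary described in this abstract, together with the Recall and Precision measurements it reports, can be sketched as follows. This is only an illustration under stated assumptions: the toy suffix stripper `stem` is not the stemmer used in the paper, and the `SYNONYMS` table, the corpus, and the relevance judgments are hypothetical.

```python
def stem(word):
    """Toy suffix stripper (an assumption; not the paper's stemmer)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Hypothetical synonyms dictionary, keyed by stemmed term.
SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}

def expand_query(query):
    """Stem each query word and add the stems of its listed synonyms."""
    terms = set()
    for word in query.lower().split():
        base = stem(word)
        terms.add(base)
        terms.update(stem(s) for s in SYNONYMS.get(base, ()))
    return terms

def search(docs, query):
    """Return ids of documents sharing at least one stem with the expanded query."""
    terms = expand_query(query)
    return {
        doc_id
        for doc_id, text in docs.items()
        if terms & {stem(w) for w in text.lower().split()}
    }

def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

docs = {
    1: "renting cars in rome",
    2: "automobile hire prices",
    3: "history of rome",
}
retrieved = search(docs, "car")
precision, recall = precision_recall(retrieved, relevant={1, 2})
```

In this toy setting the query "car" retrieves document 1 through stemming ("cars") and document 2 through the synonym "automobile", so both precision and recall are 1.0 against the assumed relevant set. On a real collection the two measures usually trade off, since broader expansion tends to raise recall at the cost of precision.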

Prototype for Enhancing Search Engine Performance Using Semantic Data Search

Information on the Internet is vast, and search engines retrieve it based on page ranks. However, the search results are not related to a particular user's environment. Much research has been conducted to provide better results. In this project, we propose a new system called Semantic Search log Social Personalized Search, which provides results for a search query that relate to a particular user's environment based on the user's areas of interest, likes, dislikes, etc. Social networks are a domain from which user-oriented information can be obtained and used to provide personalized search results. Here a supervised learning technique is used to learn about the user based on his interactions inside the system. This process can be applied to every registered user of the application: users provide basic information in their profile and benefit from it in every search. When the user registers with the system, it creates an ontological profile; when the user logs in to the social network and interacts with it, the system updates the user's ontological profile based on those interactions. The search facility is available on the user's home page after login. When the user searches for a keyword using the search engine inside the social network, the system refers to the user's ontological profile and displays the personalized search results. The system should be able to intelligently identify whether a search result has been useful to the user and save it for future reference when the user next searches for the same or a similar keyword. The main objective of this project concerns the search engine and its optimization methods. A new technique called ontology search logs is introduced, which is used to customize search logs according to the user's defined input based on his/her areas of interest, likes, and dislikes.
This application can be used with any type of search engine.
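The profile-driven flow described in this abstract (create a profile at registration, update it from interactions, consult it when ranking) can be sketched roughly as below. This is not the authors' method: the abstract mentions a supervised learning technique and an ontological profile, whereas this sketch substitutes a flat topic-to-weight table with a fixed update rule. The class `UserProfile`, the function `personalize`, the weights, and the sample results are all hypothetical.

```python
from collections import Counter

class UserProfile:
    """Hypothetical stand-in for the ontological profile: topic -> weight."""

    def __init__(self, interests=()):
        # Interests declared at registration start with weight 1.0.
        self.weights = Counter({topic: 1.0 for topic in interests})

    def record_interaction(self, topics, useful=True):
        """Raise weights for topics of useful results, lower them otherwise."""
        delta = 1.0 if useful else -0.5
        for topic in topics:
            self.weights[topic] += delta

    def score(self, topics):
        """Sum of profile weights over a result's topics (0 for unknown topics)."""
        return sum(self.weights[topic] for topic in topics)

def personalize(results, profile):
    """Stable re-rank: higher profile score first, original order on ties."""
    return sorted(results, key=lambda r: -profile.score(r["topics"]))

# Hypothetical results for the ambiguous keyword "jaguar".
results = [
    {"title": "Jaguar habitat", "topics": ["animals"]},
    {"title": "Jaguar dealership", "topics": ["cars"]},
]
profile = UserProfile(interests=["cars"])
ranked = personalize(results, profile)
```

For a user who registered an interest in cars, the dealership page is ranked first. If the same user later marks animal-related results as useful often enough, calling `record_interaction(["animals"])` shifts the weights and the ordering flips on the next search.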

Information Retrieval on the World Wide Web

IEEE Internet Computing, 1997

The World Wide Web is a very large distributed digital information space. From its origins in 1991 as an organization-wide collaborative environment at CERN for sharing research documents in nuclear physics, the Web has grown to encompass diverse information resources: personal home pages; online digital libraries; virtual museums; product and service catalogs; government information for public dissemination; research publications; and Gopher, FTP, Usenet news, and mail servers. Some estimates suggest that the Web currently includes about 150 million pages and that this number doubles every four months.

Intelligent Semantic Web Search Engines: A Brief Survey

International Journal of Web & Semantic Technology, 2011

The World Wide Web (WWW) allows people to share information (data) from large database repositories globally. The amount of information grows across billions of databases. To search this information we need specialized tools known generically as search engines. Although many search engines are available today, retrieving meaningful information is difficult. To overcome this problem and retrieve meaningful information intelligently, semantic web technologies are playing a major role. In this paper we present a survey of the search engine generations and the role of search engines in the intelligent web and semantic search technologies.
