Clustering of Web Search Results Based on Document Segmentation

IJERT-Analysis and Comparison of Web Document Clustering Algorithms with Lingo

International Journal of Engineering Research and Technology (IJERT), 2013

https://www.ijert.org/analysis-and-comparison-of-web-document-clustering-algorithms-with-lingo In today's world, the increased use of the Internet has put a large amount of shared information on the World Wide Web. Finding a small piece of relevant information in this vast repository is overwhelming. Even with search engines, it is difficult to find the most relevant documents in the long list returned in response to a user query. Users who lack domain expertise sometimes give overly abstract query terms, which leads to more irrelevant pages, and the most relevant pages do not necessarily appear at the top of the query output sequence. This creates the need for document clustering using the snippets returned by the query. In this paper we discuss various clustering methods, document clustering, and web document clustering algorithms, and compare them with the Lingo algorithm.

Search Results Clustering: Comparison of Lingo and K-Means

When the intention behind the search query is not clear, the search engine returns a large number of results. The results are displayed in the form of a ranked list. The user needs to sift through the long list of results to find the one that suits his or her information need. This process is tedious, hence search results clustering is proposed to present the results in thematic groups. The aim of search results clustering is to provide quick focus on relevant search results. Search results clustering organizes the search results into different groups, each group corresponding to a different theme. For example, the query "engine" yields results that belong to search engines as well as car engine parts.

Clustering Web Search Results-A Review

The rapid growth of the Internet has made the Web a popular place for collecting information. Today, Internet users access billions of web pages online using search engines. Information on the Web comes from many sources, including websites of companies, organizations, communications and personal homepages, etc. Effective representation of Web search results remains an open problem in the Information Retrieval (IR) community. Web search result clustering has emerged as a method that overcomes these drawbacks of conventional information retrieval. It is the clustering of results returned by search engines into meaningful, thematic groups. This paper presents issues that must be addressed in the development of a Web clustering engine and categorizes various techniques that have been used in clustering web search results.

K-Means Clustering For Segment Web Search Results

Clustering is a powerful technique for segmenting relevant data into different levels. This study proposes a K-means clustering method for clustering web search results for search engines. To represent documents we used the vector space model, and we used cosine similarity to measure the similarity between the user query and the search results. As an improvement to K-means clustering, we used the distortion curve method to identify the optimal initial number of clusters.
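The pipeline this abstract describes (vector space model, cosine similarity, K-means, distortion curve) can be illustrated with a minimal sketch. It is not the authors' implementation: the smoothed IDF weighting, the farthest-first initialisation, the use of cosine similarity between snippets and centroids, and the four sample snippets for the query "engine" are all assumptions made here for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build unit-length TF-IDF vectors (as dicts) for a list of snippets."""
    tokenised = [doc.lower().split() for doc in docs]
    n = len(docs)
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed weighting, not taken from the paper).
        vec = {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors that are already unit length."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def mean_vector(vectors):
    """Unit-length mean of a group of sparse vectors (the cluster centroid)."""
    total = {}
    for vec in vectors:
        for t, w in vec.items():
            total[t] = total.get(t, 0.0) + w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}


def assign(vectors, centroids):
    """Give each vector the index of its most similar centroid."""
    return [max(range(len(centroids)), key=lambda j: cosine(v, centroids[j]))
            for v in vectors]


def kmeans(vectors, k, iterations=20):
    """K-means with cosine similarity; returns (labels, distortion)."""
    # Farthest-first initialisation keeps the sketch deterministic (assumption).
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids)))
    for _ in range(iterations):
        labels = assign(vectors, centroids)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            updated.append(mean_vector(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(vectors, centroids)
    # Distortion here is the summed cosine distance to the assigned centroid.
    distortion = sum(1.0 - cosine(v, centroids[lab])
                     for v, lab in zip(vectors, labels))
    return labels, distortion


# Hypothetical snippets for the ambiguous query "engine".
docs = [
    "search engine ranking web pages",
    "web search engine index pages",
    "car engine parts repair",
    "diesel car engine parts",
]
vectors = tfidf_vectors(docs)
labels, _ = kmeans(vectors, 2)
# Distortion for k = 1 .. len(docs); the "elbow" of this curve suggests k.
curve = [kmeans(vectors, k)[1] for k in range(1, len(docs) + 1)]
print(labels, [round(d, 3) for d in curve])
```

Plotting `curve` against k and choosing the point where the decrease flattens is one common reading of the distortion curve method; the abstract does not say how the authors pick that point.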

IJERT-Clustering of Web Search Results using Hybrid Algorithm

International Journal of Engineering Research and Technology (IJERT), 2016

https://www.ijert.org/clustering-of-web-search-results-using-hybrid-algorithm https://www.ijert.org/research/clustering-of-web-search-results-using-hybrid-algorithm-IJERTV4IS120183.pdf Clustering web search results has become a very fascinating research area among scientific and academic associations involved in information retrieval. Systems of this kind, also known as Web Clustering Engines, seek to increase the description of documents presented to the user for review while decreasing the time spent reviewing them. Many algorithms for web document clustering already exist, but conclusions show there is room for more. Our project provides concise information for an ambiguous search. This allows the user to gain precise information faster and reduces the time spent looking through thousands of pages for simple information. The information obtained will be segmented and sorted, and irrelevant information will be avoided.

Topical clustering of search results

2012

Search results clustering (SRC) is a challenging algorithmic problem that requires grouping together the results returned by one or more search engines in topically coherent clusters, and labeling the clusters with meaningful phrases describing the topics of the results included in them.
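The labeling half of the SRC problem can be illustrated with a small sketch. It assumes the grouping step has already produced clusters of snippets, and it uses a deliberately simple heuristic (the two-word phrase that occurs in the most snippets of a cluster); this is an illustrative assumption, not the method of the paper above. The stopword list and sample clusters are hypothetical.

```python
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger ones).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def label_cluster(snippets):
    """Label a cluster with the two-word phrase found in the most snippets."""
    counts = Counter()
    for snippet in snippets:
        tokens = snippet.lower().split()
        phrases = set()
        for first, second in zip(tokens, tokens[1:]):
            # Skip phrases that contain a stopword.
            if first in STOPWORDS or second in STOPWORDS:
                continue
            phrases.add(f"{first} {second}")
        # Count each phrase once per snippet (document frequency).
        counts.update(phrases)
    if not counts:
        return ""
    best = max(counts.values())
    # Break ties alphabetically so the result is deterministic.
    return min(p for p, c in counts.items() if c == best)


# Hypothetical clusters for the ambiguous query "engine".
clusters = [
    ["search engine optimisation tips",
     "best search engine for images",
     "how a search engine ranks pages"],
    ["car engine parts catalogue",
     "replacement car engine parts",
     "car engine repair guide"],
]
cluster_labels = [label_cluster(c) for c in clusters]
print(cluster_labels)  # ['search engine', 'car engine']
```

A frequency-based label like this is easy to compute but can pick phrases that are common rather than descriptive, which is part of why the abstract calls labeling with meaningful phrases a challenging problem.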

Clustering of Web Search Results using Hybrid Algorithm

International Journal of Engineering Research and, 2015

Clustering web search results has become a very fascinating research area among scientific and academic associations involved in information retrieval. Systems of this kind, also known as Web Clustering Engines, seek to increase the description of documents presented to the user for review while decreasing the time spent reviewing them. Many algorithms for web document clustering already exist, but conclusions show there is room for more. Our project provides concise information for an ambiguous search. This allows the user to gain precise information faster and reduces the time spent looking through thousands of pages for simple information. The information obtained will be segmented and sorted, and irrelevant information will be avoided.

An Analysis of Web Document Clustering Algorithms

Evidently there is a tremendous increase in the amount of information found today on the largest shared information source, the World Wide Web. The process of finding relevant information on the web is overwhelming. Even with today's search engines that index the web, it is difficult to wade through the large number of documents returned in response to a user query. Furthermore, users without domain expertise are not familiar with the appropriate terminology and so do not submit the right query terms, which leads to the retrieval of more irrelevant pages, and the most relevant documents do not necessarily appear at the top of the query output sequence. Users of Web search engines are thus often forced to sift through the long ordered list of document "snippets" returned by the engines. This fact has led to the need to organize a large set of documents into categories through clustering. The Information Retrieval community has explored document clustering as an alternative method of organizing retrieval results. Grouping similar documents together into clusters helps users find relevant information more quickly and allows them to focus their search in the appropriate direction. Various web document clustering techniques are now being used to give meaningful search results on the web. This paper presents an analysis of the various categories of web document clustering, and of the existing web clustering engines with their relevant clustering techniques.

Clustering of Web Page Search Results: A Full Text Based Approach

International Journal of …, 2008

With so much information available on the web, looking for relevant documents on the Internet has become a tough task. In this paper we present an approach which is a match between a query-based Google and a category-based Yahoo. WISE is a web page hierarchical ...

A Relative Study on Search Results Clustering

The performance of web search engines could be improved by properly clustering the search result documents. Most users are not able to give the appropriate query to get exactly what they want to retrieve. So the search engine will retrieve a massive list of data, which is ranked by the page rank algorithm (7), a relevancy algorithm, or a human judgment algorithm.

Related papers