Learning to rank relational objects and its application to web search
Related papers
Role of Ranking Algorithms for Information Retrieval
International Journal of Artificial Intelligence & Applications, 2012
As use of the web grows day by day, users easily get lost in the web's rich hyperlink structure. The main aim of a website owner is to give users the information relevant to their needs. We explain how Web mining is used to categorize users and pages by analyzing user behavior and page content, and then describe Web structure mining. This paper presents different page ranking algorithms used for information retrieval and compares them. PageRank-based algorithms such as PageRank (PR), Weighted PageRank (WPR), HITS (Hyperlink-Induced Topic Search), Distance Rank, and EigenRumor are discussed and compared. A simulation interface has been designed for the PageRank and Weighted PageRank algorithms, but PageRank is the only ranking algorithm on which the Google search engine works.
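The link-analysis idea behind the PageRank algorithm named in this abstract can be illustrated with a short sketch. It is a minimal power-iteration version written for illustration, not the simulation interface described in the paper. The function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy graph are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict mapping page -> list of linked pages.

    Assumptions (not from the abstract): damping=0.85, a fixed number of
    iterations, and pages without out-links spreading their rank uniformly.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped rank evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: distribute its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: "a" is linked from both "c" and "d".
toy_links = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
toy_rank = pagerank(toy_links)
```

In this toy graph, page "a" ends up with the highest score because it has the most in-links, and "d", which nothing links to, gets only the random-jump share.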
Relevance Vector Ranking for Information Retrieval
2010
In recent years, learning ranking functions for information retrieval has drawn the attention of researchers in the information retrieval and machine learning communities. In existing learning-to-rank approaches, a sparse prediction model can be learned only by support vector learning. However, the number of support vectors grows steeply with the size of the training data set. In this paper, we propose a sparse Bayesian kernel approach to learning a ranking function. This approach yields accurate prediction models that typically use fewer basis functions than comparable SVM-based approaches, while offering a number of additional advantages. Experimental results on a document retrieval data set show that the generalization performance of this approach is competitive with two state-of-the-art approaches and that the prediction model it learns is typically sparse.
[Hang Li] Learning to Rank for Information Retrieval
Technologies is edited by Graeme Hirst of the University of Toronto. The series consists of 50- to 150-page monographs on topics relating to natural language processing, computational linguistics, information retrieval, and spoken language understanding. Emphasis is on important new techniques, on new applications, and on topics that combine two or more HLT subfields.
An Improved Approach to Ranking Web Documents
Ranking thousands of web documents so that they match a user query is a challenging task. For this purpose, search engines apply different ranking mechanisms to the apparently related result documents to decide the order in which they should be displayed. Existing ranking mechanisms decide the order of a web page based on the number and popularity of the links pointing to it and emerging from it. Search engines sometimes place less relevant documents in the top positions in response to a user query, so there is a strong need to improve the ranking strategy. In this paper, a novel ranking mechanism is proposed that considers both the HTML structure of a page and the contextual senses of the keywords present within it and its back-links. The approach has been tested on data sets of URLs and their back-links in relation to different topics. The experimental results show that the overall search results, in response to user queries, are improved. The ordering of the links obtained is compared with the ordering produced by the PageRank score. The results show that the proposed mechanism puts more contextually related web pages in the top positions than the PageRank score does.
A Complete Survey on Web Document Ranking
IJCA Proceedings on International Conference on Advances in Computer Engineering and Applications, 2014
Today, the web plays a critical role in human life and simplifies it to a great extent. However, due to the steep increase in the number of web pages, the challenge of providing quality, relevant information to users also needs to be addressed. Search engines therefore need to implement algorithms that select pages according to the user's interest and satisfaction and rank them accordingly. Web mining greatly assists in this scenario: it helps retrieve potentially useful information and patterns from the web. This paper presents different page ranking algorithms used for information retrieval and compares them. It also presents some interesting facts about research in page ranking to identify further scope for research in this area.
Performance Evaluation of a Relation Based Page Ranking Algorithm Used for Improving Search Results
Search engines help us mine the web and get what we are looking for. But the problem with today's search engines is that they give us the most reputable results, not necessarily the most relevant ones. The search criteria of these traditional web search engines are based on keyword matching or, to some greater extent (as in Google), on matching words that are similar to or in the same context as the query word, and then ranking the matched pages with the PageRank algorithm, based purely on link analysis. PageRank, despite being quite efficient in sorting the results to be presented to the user, still returns a number of irrelevant pages. To overcome this problem and provide relevant results, the Relation Based Page Ranking Algorithm has been introduced recently. In this paper, we evaluate the performance of the Relation Based Page Ranking Algorithm on some academic web pages annotated with an ontology specifically built for the web pages' domain. Ranking of web pages through this algorithm is based ...
The whens and hows of learning to rank for web search
Web search engines are increasingly deploying many features, combined using learning to rank techniques. However, various practical questions remain concerning the manner in which learning to rank should be deployed. For instance, a sample of documents with sufficient recall is used, such that re-ranking of the sample by the learned model brings the relevant documents to the top. However, the properties of the document sample, such as when to stop ranking (i.e., its minimum effective size), remain unstudied.
HYBRID ALGORITHM FOR PAGE RANKING IN INFORMATION RETRIEVAL SYSTEMS
Information Retrieval (IR) systems store a large volume of unstructured data and provide search results for a user query. The performance of an IR system depends on the relevance of the search results to the user query. Page ranking algorithms are used to assign a rank to the results retrieved for a user query. Page ranking algorithms are mainly categorized into web structure mining and web content mining. Many page ranking algorithms have been proposed in the literature to improve the relevance of search results for a user query. In this paper, a new hybrid page ranking algorithm using web structure mining and web content mining is proposed. The algorithm is implemented and tested on a test data set; the results show that the proposed algorithm performs better than the existing algorithms.
Xhits: Learning to Rank in a Hyperlinked Structure
Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, 2011
The explosive growth and widespread accessibility of the Web have led to a surge of research activity in the area of information retrieval on the WWW. This is a huge and rich environment in which web pages can be viewed as a large community of elements connected through links for several reasons. The HITS approach introduces two basic concepts, hubs and authorities, which reveal some hidden semantic information from the links. In this paper, we review XHITS, a generalization of HITS that expands the model from two to several concepts, and present a new machine learning algorithm to calibrate an XHITS model. The new learning algorithm uses latent feature concepts. Furthermore, we provide some illustrative examples and empirical tests. Our findings indicate that the new learning approach provides a more accurate XHITS model.
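The hub and authority concepts of the original HITS approach mentioned above can be sketched in a few lines. This is a minimal illustration of plain HITS only, not of XHITS or its learning algorithm. The function name `hits`, the fixed iteration count, the L2 normalization, and the toy graph are assumptions.

```python
import math


def hits(links, iterations=50):
    """Basic HITS on a dict mapping page -> list of linked pages.

    Returns (hub, authority) score dicts. The iteration count and the
    L2 normalization are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: sum of hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {p: v / norm for p, v in auth.items()}

        # Hub update: sum of authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {p: v / norm for p, v in hub.items()}

    return hub, auth


# Hypothetical toy graph: three hub-like pages pointing at "a" and "b".
toy_links = {"h1": ["a", "b"], "h2": ["a"], "h3": ["a", "b"]}
toy_hub, toy_auth = hits(toy_links)
```

Here "a" receives the top authority score because all three hub-like pages link to it, while "h2" scores lower as a hub than "h1" because it points to fewer good authorities.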
SemRank: ranking refinement strategy by using the semantic intensity
Procedia Computer Science, 2011
The ubiquity of multimedia has raised the need for a system that can store, manage, and structure multimedia data in such a way that it can be retrieved intelligently. One of the current issues in media management and data mining research is the ranking of retrieved documents. Ranking is one of the challenging problems for information retrieval systems. A user query may return millions of relevant results, but if the ranking function cannot order them by relevance, the results are of little use. The current ranking techniques, however, operate at the level of keyword matching, and the ranking among the results is usually done using term frequency. This paper is concerned with ranking documents by relying on the rich semantics inside the document instead of the contents. Our proposed ranking refinement strategy, known as SemRank, ranks documents based on their semantic intensity. Our approach has been applied to the open benchmark LabelMe dataset and compared against a well-known ranking model, the Vector Space Model (VSM). The experimental results show that our approach achieves a significant improvement in retrieval performance over state-of-the-art ranking methods.
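The term-frequency ranking and the Vector Space Model baseline mentioned in this abstract can be illustrated with a small sketch. It shows only a generic VSM with raw term counts and cosine similarity, not SemRank or the paper's experimental setup. The function names, the whitespace tokenizer, and the sample documents are assumptions.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector using a naive lowercase whitespace tokenizer."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    q = tf_vector(query)
    scored = [(cosine(q, tf_vector(d)), i) for i, d in enumerate(docs)]
    # Sort by score descending; ties fall back to original document order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored]


# Hypothetical sample documents and query.
sample_docs = ["red car on the road", "blue sky over the sea", "red red car"]
sample_order = rank_documents("red car", sample_docs)
```

Because this baseline only counts matching keywords, the third document ranks first simply because it repeats a query term, which is the kind of limitation the abstract attributes to keyword-level ranking.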