On computing PageRank via lumping the Google matrix
Related papers
Analysis of iterative methods in PageRank computation
Journal of Information and Optimization Sciences, 2017
PageRank is one of the basic metrics used in web search technology to rank web pages. It uses the power method to compute the principal eigenvector of a web matrix of several billion nodes. The PageRank method incorporates a parameter α, called the damping factor, that plays a major role in the computation. In this study, we experimentally observe the efficiency of various iterative methods on a hyperlink graph for different values of α. We conclude from the experiments that the power method is effective and competitive for the well-conditioned problem, i.e., small values of α. However, for α ≈ 1 the power method converges slowly, and other methods such as Aitken-Power, SOR, and Gauss-Seidel are more efficient in terms of both CPU time and the number of iterations needed for convergence.
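The power iteration the abstract compares against can be sketched in a few lines. This is a minimal illustration on a toy three-page graph, not any paper's implementation; the function name `pagerank_power` and the tiny matrix are invented for the example, and the damping factor α appears as `alpha`:

```python
import numpy as np

def pagerank_power(P, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power method for the PageRank vector of a column-stochastic matrix P.

    Iterates x <- alpha * P x + (1 - alpha) * v until the L1 change
    falls below tol.  Smaller alpha means faster convergence, since the
    error shrinks roughly by a factor alpha per step.
    """
    n = P.shape[0]
    v = np.full(n, 1.0 / n)            # uniform teleportation vector
    x = v.copy()
    for _ in range(max_iter):
        x_new = alpha * P @ x + (1 - alpha) * v
        if np.abs(x_new - x).sum() < tol:
            return x_new
        x = x_new
    return x

# Toy graph: page 0 -> 1, page 1 -> 2, page 2 -> 0 and 2 -> 1.
# Column j holds the out-link probabilities of page j.
P = np.array([[0.0, 0.0, 0.5],
              [1.0, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
r = pagerank_power(P, alpha=0.85)
```

Running the same function with α close to 1 takes noticeably more iterations, which is exactly the regime where the abstract reports Gauss-Seidel and SOR pulling ahead.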
Google PageRanking problem: The model and the analysis
Journal of Computational and Applied Mathematics, 2010
The spectral and Jordan structures of the web hyperlink matrix G(c) = cG + (1 − c)ev^T have been analyzed when G is the basic (stochastic) Google matrix, c is a real parameter such that 0 < c < 1, v is a nonnegative probability vector, and e is the all-ones vector. Typical studies have relied heavily on special properties of nonnegative, positive, and stochastic matrices. There is a unique nonnegative vector y(c) such that y(c)^T G(c) = y(c)^T and y(c)^T e = 1. This PageRank vector y(c) can be computed effectively by the power method.
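The two conditions quoted in the abstract already determine y(c) in closed form; a standard rearrangement, using only the identities stated there, goes as follows:

```latex
y(c)^T G(c) = y(c)^T,\qquad G(c) = cG + (1-c)\,ev^T,\qquad y(c)^T e = 1
\;\Longrightarrow\; c\,y(c)^T G + (1-c)\,v^T = y(c)^T
\;\Longrightarrow\; y(c)^T = (1-c)\,v^T (I - cG)^{-1}.
```

The inverse exists for 0 < c < 1 because the spectral radius of cG is c < 1, and the last line also exhibits y(c) as a rational function of c, consistent with the Jordan-form analysis cited below.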
The way web pages are displayed within a search is not a mystery: it involves applied mathematics and solid computer science for a correct implementation, expressed in terms of vectors, matrices, and related notation. The PageRank vector must be calculated, which means computing the stationary distribution of a stochastic matrix. The matrices encode the link structure of the web and the behaviour of the web surfer. As links are added every day and the number of websites runs into the billions, modifications to the web's link structure affect PageRank, so search algorithms need continual improvement. Problems and misbehaviours may arise, but this topic attracts many researchers who make improvements day by day. Even though it rests on a simple formula, PageRank runs a successful business, and it may be considered the right example of applied mathematics and computer science fitting together.
Exploiting Web matrix permutations to speedup PageRank computation
2004
Recently, the research community has devoted increased attention to reducing the computational time needed by Web ranking algorithms. In particular, we have seen many proposals to speed up the well-known PageRank algorithm used by Google. This interest is motivated by two dominant factors: (1) the Web graph has huge dimensions and is subject to dramatic updates in terms of nodes and links, so a PageRank assignment tends to become obsolete very soon; (2) many PageRank vectors need to be computed, one for each personalization vector chosen. In the present paper, we address this problem from a numerical point of view. First, we show how to treat dangling nodes in a way that naturally fits the random surfer model and preserves the sparsity of the Web graph. This result allows the PageRank computation to be viewed as a sparse linear system, as an alternative to the commonly adopted eigenpair interpretation. Second, we exploit the reducibility of the Web matrix and compose suitable Web matrix permutations to speed up the PageRank computation. We tested our approaches on Web graphs crawled from the net; the largest contains about 24 million nodes and more than 100 million links. On this Web graph, the cost of computing the PageRank is reduced by 58% in terms of Mflops and by 89% in terms of time with respect to the power method commonly used.
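The sparse-linear-system view mentioned above can be sketched as follows. This is not the paper's code, only a small illustration of the idea under one common assumption: because a dangling node's contribution is a multiple of the personalization vector v, one can solve (I − cPᵀ)y = v with the substochastic link matrix P and recover PageRank by L1-normalisation. The function name `pagerank_linear` and the toy graph are invented for the example:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def pagerank_linear(A, c=0.85, v=None):
    """PageRank via the sparse linear system (I - c P^T) y = v.

    A is the raw adjacency matrix (A[i, j] = 1 if page i links to j).
    Rows with no out-links stay all-zero, keeping the matrix sparse;
    their mass is a multiple of v and is absorbed by the final
    normalisation, so no dense dangling patch is needed.
    """
    n = A.shape[0]
    out = np.asarray(A.sum(axis=1)).ravel()
    inv_out = np.where(out > 0, 1.0 / np.maximum(out, 1), 0.0)
    P = sp.diags(inv_out) @ A          # row-substochastic link matrix
    if v is None:
        v = np.full(n, 1.0 / n)
    y = spsolve(sp.eye(n, format="csc") - c * P.T.tocsc(), v)
    return y / y.sum()

# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2; node 2 is dangling.
A = sp.csr_matrix(np.array([[0.0, 1.0, 1.0],
                            [0.0, 0.0, 1.0],
                            [0.0, 0.0, 0.0]]))
r = pagerank_linear(A)
```

For large graphs the direct solve would be replaced by a sparse iterative solver, which is where the permutation strategies of the paper come into play.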
Jordan Canonical Form of the Google Matrix: A Potential Contribution to the PageRank Computation
SIAM Journal on Matrix Analysis and Applications, 2005
We consider the web hyperlink matrix used by Google for computing the PageRank, whose form is given by A(c) = [cP + (1 − c)E]^T, where P is a row-stochastic matrix, E is a row-stochastic rank-one matrix, and c ∈ [0, 1]. We determine the analytic expression of the Jordan form of A(c) and, in particular, a rational formula for the PageRank in terms of c. The use of extrapolation procedures is very promising for the efficient computation of the PageRank when c is close to or equal to 1.
Computing PageRank using power extrapolation
2003
We present a novel technique for speeding up the computation of PageRank, a hyperlink-based estimate of the "importance" of Web pages, building on ideas presented in earlier work. The original PageRank algorithm uses the power method to compute successive iterates that converge to the principal eigenvector of the Markov matrix representing the Web link graph. The algorithm presented here, called Power Extrapolation, accelerates the convergence of the power method by subtracting off the error along several nonprincipal eigenvectors from the current iterate, making use of known nonprincipal eigenvalues of the Web hyperlink matrix. Empirically, we show that Power Extrapolation speeds up PageRank computation by 30% over the standard power method on a Web graph of 80 million nodes, in realistic scenarios, in a way that is simple to understand and implement.
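The flavour of extrapolation described above can be sketched as follows. This is a simplified reconstruction, not the paper's algorithm: it assumes the dominant non-principal eigenvalues λ of the Google matrix satisfy λ^d = c^d (d = 2 covers ±c), so that in x_k − c^d·x_{k−d} those error terms cancel and (x_k − c^d·x_{k−d})/(1 − c^d) is a better estimate of the limit. The function name `pagerank_extrapolated` and the one-shot extrapolation schedule are invented for the example:

```python
import numpy as np
from collections import deque

def pagerank_extrapolated(P, c=0.85, d=2, tol=1e-10, max_iter=500):
    """Power method with one extrapolation step.

    Under the assumption that the dominant non-principal eigenvalues
    lam of the Google matrix satisfy lam**d == c**d, the combination
    (x_k - c**d * x_{k-d}) / (1 - c**d) cancels their error terms.
    """
    n = P.shape[0]
    v = np.full(n, 1.0 / n)        # uniform personalization vector
    x = v.copy()
    prev = deque([x], maxlen=d)    # sliding window of the last d iterates
    for k in range(1, max_iter + 1):
        x_new = c * (P @ x) + (1 - c) * v
        if k == 2 * d:             # one-shot extrapolation, for illustration
            x_new = (x_new - c**d * prev[0]) / (1 - c**d)
            x_new = x_new / x_new.sum()   # renormalise to a distribution
        if np.abs(x_new - x).sum() < tol:
            return x_new, k
        prev.append(x_new)
        x = x_new
    return x, max_iter

# Same toy 3-page graph as a column-stochastic matrix.
P = np.array([[0.0, 0.0, 0.5],
              [1.0, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
r, iters = pagerank_extrapolated(P)
```

After the extrapolation step the plain power iteration continues, so the final vector is the same PageRank fixed point; the extrapolation only shortens the path to it.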
The matrix method to calculate page rank
Choosing the right keywords is relatively easy, whereas obtaining a high PageRank is more complicated. The PageRank index is what determines a page's position in the result pages of search engines (for Google of course, but the other engines now use more or less the same kind of algorithm). It is therefore very important to understand how this type of algorithm works in order to appear on the first page of results (the only page read in 95% of cases), or at least among the first. In this paper we propose to clarify the operation of this algorithm using a matrix method and a JavaScript program that lets readers experiment with this type of analysis. It is of course a simplified version, but it can add value to a website, achieve a high ranking in the search results, and reach a larger customer base. The interest is to disclose an algorithm that calculates the relevance of each page. It is in fact a mathematical algorithm based on a web graph: the graph is formed from all the web pages, modeled as nodes, and the hyperlinks, modeled as arcs.