In website optimization, people often say that "content is king, and links are emperor." Today there are more than these two factors: user experience has joined them as a comprehensive third. Content, links, and user experience are the respective focus of three generations of search engines. Of the three, link analysis is the most heavily used, and many black hat SEO techniques work by exploiting and amplifying weaknesses in link analysis. Today Chengdu SEO sums up the link analysis algorithms used by search engines. The two most important are PageRank and HITS. They derive from two different models, the random walk model and the subset propagation model respectively, and both are used to calculate the weight of links.
First, look at the PageRank algorithm. PageRank was developed from the random walk model and judges the importance of a web page mainly by two factors: the number of its inbound links and their quality. Because PageRank does not take relevance into account, it was later extended into Topic-Sensitive PageRank, which gives more weight to topically relevant links. Two further link algorithms were also developed, the intelligent walk model and the biased walk model. Both address the randomness assumed by PageRank: a normal person browsing the web does not click links on a page at random, but is more likely to click the links that are relevant.

Next is the HITS algorithm, which grew out of the subset propagation model. HITS defines two kinds of pages, Hub pages and Authority pages. A Hub page contains many high-quality links that point to Authority pages; hao123 is a page of this kind. An Authority page is a high-quality page about a particular field or topic, such as SEOWHY in the SEO field, or Baidu and Google in the search engine field. In HITS the two kinds of pages reinforce each other: a good Hub page points to many good Authority pages, and a good Authority page is pointed to by many good Hub pages. The problems with HITS are fairly obvious: it is easy to manipulate maliciously, its results are structurally unstable, and it is computationally inefficient. The PHITS algorithm was introduced in response, but it is not covered here.

Finally, the SALSA algorithm was developed by weighing the strengths and weaknesses of PageRank and HITS. It keeps the topical relevance that characterizes HITS while adopting the random walk model of PageRank, and it is currently regarded as one of the best link analysis algorithms.
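To make the two core ideas above more concrete, here is a minimal sketch in Python of a basic PageRank iteration and a basic HITS iteration on a tiny made-up link graph. It is only an illustration, not how any search engine actually implements these algorithms. The page names (`directory`, `a`, `b`, `c`), the damping factor of 0.85, and the fixed number of iterations are all assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]  # page -> list of pages it links to


def all_pages(graph: Graph) -> List[str]:
    """Collect every page that appears as a source or a target."""
    return sorted(set(graph) | {t for targets in graph.values() for t in targets})


def pagerank(graph: Graph, damping: float = 0.85, iterations: int = 50) -> Dict[str, float]:
    """Basic PageRank: a random surfer follows a link with probability
    `damping` (an assumed value) and jumps to a random page otherwise."""
    pages = all_pages(graph)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank sitting on pages without outlinks is spread over all pages.
        dangling = sum(rank[page] for page in pages if not graph.get(page))
        new_rank = {page: (1.0 - damping) / n + damping * dangling / n for page in pages}
        for page in pages:
            targets = graph.get(page, [])
            for target in targets:
                # Each outlink passes on an equal share of the page's rank.
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


def hits(graph: Graph, iterations: int = 50) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Basic HITS: hub and authority scores reinforce each other."""
    pages = all_pages(graph)
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {page: 0.0 for page in pages}
        for source, targets in graph.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {page: sum(auth[t] for t in graph.get(page, [])) for page in pages}
        # Normalise so the scores do not grow without bound.
        auth_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        hub_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {page: v / auth_norm for page, v in auth.items()}
        hub = {page: v / hub_norm for page, v in hub.items()}
    return hub, auth


# Hypothetical toy web: "directory" plays the role of a link directory (a hub).
toy_web: Graph = {
    "directory": ["a", "b", "c"],
    "a": ["b"],
    "b": ["a"],
    "c": ["a"],
}

ranks = pagerank(toy_web)
hub_scores, auth_scores = hits(toy_web)
for page in all_pages(toy_web):
    print(f"{page:10s} pagerank={ranks[page]:.3f} "
          f"hub={hub_scores[page]:.3f} authority={auth_scores[page]:.3f}")
```

In this toy graph the `directory` page links to everything but receives no links, so it ends up with a low PageRank and a zero authority score, yet it has the highest hub score. That is the distinction HITS draws between Hub pages and Authority pages, and which plain PageRank does not make.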
This article from the Lenny