Section A.2 Computer Science: PageRank
Activity A.2.1. The $978,000,000,000 Problem.
In the picture below, each circle represents a webpage, and each arrow represents a link from one page to another.
Based on how these pages link to each other, write a list of the 7 webpages in order from most important to least important.
Observation A.2.2. The $978,000,000,000 Idea.
Links are endorsements. That is:
A webpage is important if it is linked to (endorsed) by important pages.
A webpage distributes its importance equally among all the pages it links to (endorses).
Example A.2.3.
Consider this small network with only three pages. Let \(x_1, x_2, x_3\) be the importance of the three pages respectively.
\(x_1\) splits its endorsement in half between \(x_2\) and \(x_3\)
\(x_2\) sends all of its endorsement to \(x_1\)
\(x_3\) sends all of its endorsement to \(x_2\text{.}\)
This corresponds to the page rank system:
Observation A.2.4.
By writing this linear system in terms of matrix multiplication, we obtain the page rank matrix \(A = \left[\begin{array}{ccc} 0 & 1 & 0 \\ \frac{1}{2} & 0 & 1 \\ \frac{1}{2} & 0 & 0 \end{array}\right]\) and page rank vector \(\vec{x}=\left[\begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array}\right]\text{.}\)
Thus, computing the importance of pages on a network is equivalent to solving the matrix equation \(A\vec{x}=1\vec{x}\text{.}\)
Activity A.2.5.
Thus, our $978,000,000,000 problem is what kind of problem?
An antiderivative problem
A bijection problem
A cofactoring problem
A determinant problem
An eigenvector problem
Activity A.2.6.
Find a page rank vector \(\vec x\) satisfying \(A\vec x=1\vec x\) for the following network's page rank matrix \(A\text{.}\)
That is, find the eigenspace associated with \(\lambda=1\) for the matrix \(A\text{,}\) and choose a vector from that eigenspace.
Observation A.2.7.
Row-reducing \(A-I = \left[\begin{array}{ccc} -1 & 1 & 0 \\ \frac{1}{2} & -1 & 1 \\ \frac{1}{2} & 0 & -1 \end{array}\right] \sim \left[\begin{array}{ccc} 1 & 0 & -2 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{array}\right]\) yields the basic eigenvector \(\left[\begin{array}{c} 2 \\ 2 \\1 \end{array}\right]\text{.}\)
Therefore, we may conclude that pages \(1\) and \(2\) are equally important, and both pages are twice as important as page \(3\text{.}\)
Activity A.2.8.
Compute the \(7 \times 7\) page rank matrix for the following network.
For example, since website \(1\) distributes its endorsement equally between \(2\) and \(4\text{,}\) the first column is \(\left[\begin{array}{c} 0 \\ \frac{1}{2} \\ 0 \\ \frac{1}{2} \\ 0 \\ 0 \\ 0 \end{array}\right]\text{.}\)
Activity A.2.9.
Find a page rank vector for the given page rank matrix.
Which webpage is most important?
Observation A.2.10.
Since a page rank vector for the network is given by \(\vec x\text{,}\) it's reasonable to consider page \(2\) as the most important page.
Based upon this page rank vector, here is a complete ranking of all seven pages from most important to least important:
Activity A.2.11.
Given the following diagram, use a page rank vector to rank the pages \(1\) through \(7\) in order from most important to least important.
Subsection A.2.1 Slideshow
Slideshow of activities available at https://teambasedinquirylearning.github.io/linear-algebra/2022/pagerank.slides.html
.