Skip to main content

Section A.2 Computer Science: PageRank

Activity A.2.1. The $978,000,000,000 Problem.

In the picture below, each circle represents a webpage, and each arrow represents a link from one page to another.

Figure 81. A seven-webpage network

Based on how these pages link to each other, write a list of the 7 webpages in order from most important to least important.

Observation A.2.2. The $978,000,000,000 Idea.

Links are endorsements. That is:

  1. A webpage is important if it is linked to (endorsed) by important pages.

  2. A webpage distributes its importance equally among all the pages it links to (endorses).

Example A.2.3.

Consider this small network with only three pages. Let \(x_1, x_2, x_3\) be the importance of the three pages respectively.

Figure 82. A three-webpage network
  1. \(x_1\) splits its endorsement in half between \(x_2\) and \(x_3\)

  2. \(x_2\) sends all of its endorsement to \(x_1\)

  3. \(x_3\) sends all of its endorsement to \(x_2\text{.}\)

This corresponds to the page rank system:

\begin{alignat*}{4} && x_2 && &=& x_1 \\ \frac{1}{2} x_1&& &+&x_3 &=& x_2\\ \frac{1}{2} x_1&& && &=& x_3 \end{alignat*}

Observation A.2.4.

Figure 83. A three-webpage network
\begin{alignat*}{4} && x_2 && &=& x_1 \\ \frac{1}{2} x_1&& &+&x_3 &=& x_2\\ \frac{1}{2} x_1&& && &=& x_3 \end{alignat*}
\begin{equation*} \left[\begin{array}{ccc}0&1&0\\\frac{1}{2}&0 & 1\\\frac{1}{2}&0&0\end{array}\right] \left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right] = \left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right] \end{equation*}

By writing this linear system in terms of matrix multiplication, we obtain the page rank matrix \(A = \left[\begin{array}{ccc} 0 & 1 & 0 \\ \frac{1}{2} & 0 & 1 \\ \frac{1}{2} & 0 & 0 \end{array}\right]\) and page rank vector \(\vec{x}=\left[\begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array}\right]\text{.}\)

Thus, computing the importance of pages on a network is equivalent to solving the matrix equation \(A\vec{x}=1\vec{x}\text{.}\)

Activity A.2.5.

Thus, our $978,000,000,000 problem is what kind of problem?

\begin{equation*} \left[\begin{array}{ccc}0&1&0\\\frac{1}{2}&0&\frac{1}{2}\\\frac{1}{2}&0&0\end{array}\right] \left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right] = 1\left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right] \end{equation*}
  1. An antiderivative problem

  2. A bijection problem

  3. A cofactoring problem

  4. A determinant problem

  5. An eigenvector problem

Activity A.2.6.

Find a page rank vector \(\vec x\) satisfying \(A\vec x=1\vec x\) for the following network's page rank matrix \(A\text{.}\)

That is, find the eigenspace associated with \(\lambda=1\) for the matrix \(A\text{,}\) and choose a vector from that eigenspace.

Figure 84. A three-webpage network
\begin{equation*} A = \left[\begin{array}{ccc} 0 & 1 & 0 \\ \frac{1}{2} & 0 & 1 \\ \frac{1}{2} & 0 & 0 \end{array}\right] \end{equation*}

Observation A.2.7.

Row-reducing \(A-I = \left[\begin{array}{ccc} -1 & 1 & 0 \\ \frac{1}{2} & -1 & 1 \\ \frac{1}{2} & 0 & -1 \end{array}\right] \sim \left[\begin{array}{ccc} 1 & 0 & -2 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{array}\right]\) yields the basic eigenvector \(\left[\begin{array}{c} 2 \\ 2 \\1 \end{array}\right]\text{.}\)

Therefore, we may conclude that pages \(1\) and \(2\) are equally important, and both pages are twice as important as page \(3\text{.}\)

Activity A.2.8.

Compute the \(7 \times 7\) page rank matrix for the following network.

Figure 85. A seven-webpage network

For example, since website \(1\) distributes its endorsement equally between \(2\) and \(4\text{,}\) the first column is \(\left[\begin{array}{c} 0 \\ \frac{1}{2} \\ 0 \\ \frac{1}{2} \\ 0 \\ 0 \\ 0 \end{array}\right]\text{.}\)

Activity A.2.9.

Find a page rank vector for the given page rank matrix.

\begin{equation*} A=\left[\begin{array}{ccccccc} 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 \\ \frac{1}{2} & 0 & 0 & 1 & 0 & 0 & \frac{1}{2} \\ 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 & 0 & \frac{1}{2} \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 0 & 0 \\ 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & \frac{1}{2} & 0 \end{array}\right] \end{equation*}
Figure 86. A seven-webpage network

Which webpage is most important?

Observation A.2.10.

Since a page rank vector for the network is given by \(\vec x\text{,}\) it's reasonable to consider page \(2\) as the most important page.

\begin{equation*} \vec{x} = \left[\begin{array}{c} 2 \\ 4 \\2 \\ 2.5 \\ 0 \\ 0 \\ 1\end{array}\right] \end{equation*}

Based upon this page rank vector, here is a complete ranking of all seven pages from most important to least important:

\begin{equation*} 2, 4, 1, 3, 7, 5, 6 \end{equation*}
Figure 87. A seven-webpage network

Activity A.2.11.

Given the following diagram, use a page rank vector to rank the pages \(1\) through \(7\) in order from most important to least important.

Figure 88. Another seven-webpage network

Subsection A.2.1 Slideshow

Slideshow of activities available at https://teambasedinquirylearning.github.io/linear-algebra/2022/pagerank.slides.html.