Massey Centrality

In 1997, Kenneth Massey, then an undergraduate, created a method for ranking college football teams. We wrote about this method, which uses the mathematical theory of least squares, as his honors thesis.

Math

The main idea of Massey's method is enclosed in the following equation:

rirj=yk
where ri and rj are the ratings of teams i and j and yk is the absolute margin of victory for game k between teams i and j. If there are n teams who played m games, we have a linear system:
Xr=y
where X is a m×n matrix such the k-th row of X contains all 0s with the exception of a 1 in location i and a 1 in location j, meaning that team i beat team j in match k (if match k ends with a draw, either i or j location can be assigned 1, and the other 1). Since typically m>>n, this system is highly overdetermined and inconsistent.

Massey proposes to find the solution r that minimizes:

||Xry||2
where ||||2 is the norm 2, that is ||x||2=ixi2. Let
M=XTX
and
p=XTy.
Notice that
Mi,j={the negation of the # of matches between i and jif ij,# of games played by iif i=j.
and pi is the signed sum of point spreads of every game played by i. The Massey's method is then defined by the following linear system:
Mr=p
The solution of the above system to the solution that minimizes ||Xry||2.

We observe that the Massey's team ratings are in fact interdependent. Indeed, Massey's matrix M can be decomposed as:

M=DA
where D is a diagonal matrix with Di,i equal to the number of games played by team i, and A is a matrix with Ai,j equal to the number of matches played by team i against team j. Hence, Massey's linear system is equivalent to:
DrAr=p
or, equivalently:
r=D1(Ar+p)=D1Ar+D1p
That is, for any team i:
ri=1Di,ijAi,jrj+piDi,i
This means that the rating ri of team i is the sum ri(1)+ri(2) of two meaningful components:

  1. the mean current rating of teams that i has matched:
    ri(1)=1Di,ijAi,jrj;
  2. the mean point spread of team i:
    ri(2)=piDi,i.
Finally, it is interesting to analyse what happens to Massey's system at the end of the season, assuming that all n teams matched all other teams once. In this case, the opponents rating component
ri(1)=rin1,
where we have used the fact that iri=0, and the point spread component
ri(2)=pin1,
hence
ri=ri(1)+ri(2)=rin1+pin1,
and thus
ri=pin.
The same result is obtained noticing that
Mp=(DA)p=(n1)pAp=np(p+Ap)=np
where we have used the fact that p sums to 0. Hence p is an eigenvector of M with eigenvalue n and hence, once again, r=p/n is a solution of the Massey's system. Hence, the final rating of a team is proportional to the final point spread of the team.

We now establish a connection between Massey's method and Katz's centrality. Observe that Massey's equation

r=D1(Ar+p)
is close to the equation defining Katz's centrality
r=αAr+β
where α is a given constant and β is a given vector. The Katz's equation can be solved by inverting matrix IαA as soon as 0<α<1/ρ(A), with ρ(A) the spectral radius of A. In particular, making the reasonable assumption that all teams played the same number k of matches, then Massey's equation simplifies:

r=1k(Ar+p)
which corresponds to Katz's centrality with parameters α=1/k and β=p/k. Nevertheless, notice that ρ(A)=k hence α=1/ρ(A). Massey's equation is an instance of Katz's centrality, but cannot be computed by inverting matrix IαA.

Massey's rating has also an electrical interpretation in terms of resistor networks. If we view the symmetric matrix A as the adjacency matrix of an undirected graph GA, then M is the Laplacian matrix of the graph GA. The Massey's rating vector r defined in system Mr=(DA)r=p is then equivalent to the potential vector over a resistor network defined by A with supply vector p.

Following this metaphor, the resistor network has a resistor between nodes i and j as soon as teams i and j has matched with conductance (reciprocal of resistance) equal to the number Ai,j of matches between i and j. Sources, that are nodes i with positive supply pi through which current enters the network, are teams with positive point spread, while targets, that are nodes i with negative supply pi through which current leaves the network, are teams with negative point spread. Notice that current entering and leaving the network must be equal, indeed the point spread vector p sums to 0. The potential ri for node i corresponds to the rating of team i: teams with large potential are teams high in the ranking. Furthermore, the current flow through edge (i,j) is, by Ohm's law, the quantity Ai,j(rirj), which corresponds to the rating difference between teams i and j (which is also an estimate of the point spread in a match between i and j) multiplied by the number of times they matched: current flows more intensely between teams of different strengths (as measured by Massey's method) that matched many times.

How do we solve the Massey's system Mr=p? Unfortunately, M=DA is a singular matrix, hence it is not possible to solve this system computing the inverse of M. Assuming that GA is connected, a solution of the system can be obtained as follows: let M¯ be the matrix obtained by replacing any row, say the last, of M with a vector e of all 1's, and let p¯ be the vector obtained by replacing the last element of p with a 0. Then, solve the system:

M¯r=p¯
Notice that the added constraint forced the ranking vector r to sum to 0.

Offensive and defensive ratings

Massey also proposed an offensive rating (o) and a defensive rating (d) characterizing the offensive and defensive strengths of teams. Massey assumes that r=o+d, that is, the overall strength of a team is the sum of its offensive and defensive powers. Let's decompose the point spread vector p=fa, where f holds the total number of points scored by each team and a holds the total number of points scored against each team. Then, the equations defining o and d are:

DoAd=fAoDd=a
That is, for any team i:
oi=1Di,i(jAi,jdj+fi)di=1Di,i(jAi,jojai)
This means that the offensive rating of team i multiplied by the number of games played by i is equal to the defensive ratings of opponents of i plus the number of points scored by i. On the other hand, the defensive rating of team i multiplied by the number of games played by i is equal to the offensive ratings of opponents of i minus the number of points scored against i. Since we know that r=o+d, we have that, knowing r, the defensive rating d can be computed as:
(D+A)d=Drf
and, knowing d and r, the offensive rating
o=rd.

Let N=D+A and q=Drf. It might happen that matrix N is singular, hence it cannot be inverted. Assuming that GA is connected, it holds that N is singular precisely when the graph GA of A is bipartite. If GA is bipartite, we can apply a perturbation trick similar to the one exploited for the Massey's matrix M=DA. Let V1 and V2 be the two set of nodes of the bipartite graph GA. Let v be a vector such that vi=1 if iV1 and vi=1 if iV2. Let N¯ be the matrix obtained by replacing any row, say the last, of N with vector v and q¯ be the vector obtained by replacing the last element of q with a 0. Then, a solution of system Massey's system for defensive ranking is obtained solving the following perturbed system:

N¯d=q¯

Code

The following user-defined function massey computes the various Massey's ratings described above:

## Massey # ASSUMPTION: graph of A is connected # INPUT # matches: matches data frame # teams: team names # OUTPUT # r: Massey rating (r = r1 + r2; r = o + d) # r1: opponents rating # r2: spread rating # o: offensive rating # d: defensive rating # A: match matrix # g: match graph massey = function(matches, teams) { # number of teams n = length(teams) # number of matches m = dim(matches)[1] # match matrix X = matrix(0, nrow=m, ncol=n, byrow=TRUE) # margin of victory vector y = rep(0, m) # points for vector f = rep(0, n) # points against vector a = rep(0, n) # populate match matrix for (i in 1:m) { if (matches[i,"score1"] >= matches[i,"score2"]) { X[i, matches[i,"team1"]] = 1; X[i, matches[i,"team2"]] = -1; y[i] = matches[i,"score1"] - matches[i,"score2"] } else { X[i, matches[i,"team2"]] = 1; X[i, matches[i,"team1"]] = -1; y[i] = matches[i,"score2"] - matches[i,"score1"] } f[matches[i,"team1"]] = f[matches[i,"team1"]] + matches[i,"score1"] f[matches[i,"team2"]] = f[matches[i,"team2"]] + matches[i,"score2"] a[matches[i,"team1"]] = a[matches[i,"team1"]] + matches[i,"score2"] a[matches[i,"team2"]] = a[matches[i,"team2"]] + matches[i,"score1"] } # compute Massey matrix M = t(X) %*% X # compute team spread vector (or p = f-a) p = t(X) %*% y # check assumptions D = diag(diag(M), n, n) A = D - M g = graph_from_adjacency_matrix(A, mode="undirected", weighted=TRUE) if (!is.connected(g)) { cat("The graph is not connected"); return(NULL) } # perturbate the system M1 = M p1 = p M1[n, ] = rep(1, n) p1[n,1] = 0 # compute ranking r r = solve(M1, p1) # compute partial rankings r1 and r2 (r = r1 + r2) D1 = solve(D) r1 = D1 %*% A %*% r r2 = D1 %*% p # compute offensive and defensive rankings (r = o + d) N = D + A q = (D %*% r) - f # check if graph of A is bipartite (we know that the graph of A is connected) bipa = bipartite_mapping(graph_from_adjacency_matrix(A, mode="undirected")) if (bipa$res == TRUE) { # if bipartite, perturbate the system v = bipa$type v[v == TRUE] = 1 v[v == FALSE] = -1 N1 = N q1 = q N1[n, ] = v q1[n,1] = 0 d = solve(N1, q1) } else { d = solve(N, q) } o = r - d r = as.vector(r) r1 = as.vector(r1) r2 = as.vector(r2) o = as.vector(o) d = as.vector(d) names(r) = teams names(r1) = teams names(r2) = teams names(o) = teams names(d) = teams V(g)$teams = teams return(list(r=r, r1=r1, r2=r2, o=o, d=d, A = A, g = g)) }

Example

The table below shows the results of 4 matches (numbered 1, 2, 3, 4) involving 4 fictitious teams (labelled A, B, C, D):

matchteam1team2score1score21AC202AD303BC114BD21
The match-team matrix X is given below:
ABCD11010210013011040101
The match spread vector y is:
teamy12233041
The Massey's matrix M=XTX is:
ABCDA2011B0211C1120D1102
and the team spread vector p=XTy is:
teampA5B1C2D4
The resulting Massey's rating is the following:
teamrA1.75B0.25C0.25D1.25
Notice that teams B and C have the same rating (and hence ranking), despite the point spread of B (1) is higher than the point spread of C (-2). This is because B against C ended in a draw, B won the second match, but against the weakest team (D), while C lost the second match, but against the strongest team (A). Indeed, the rating r can be decomposed in the partial ratings ri(1) and ri(2) that follows:
teamrri(1)ri(2)A1.750.752.5B0.250.750.5C0.250.751.0D1.250.752.0
Hence, B has a better point spread but an easier schedule with respect to C. Summing the two rating componentes we get the same rating for teams B and C. Moreover, the rating r can be decomposed in the offensive and defensive ratings that follows:
teamrodA1.751.8750.125B0.250.8751.125C0.250.1250.125D1.250.1251.125
where the points for and against vectors are as follows:
teampfaA550B132C213D415
Notice that team B has a better offensive rating than C, but a worse defensive rating, and the sum of offensive and defensive powers is the same for both teams. Also, A and C have the same defensive rating, although A has 0 points against and C has 3 points again; again, this because the schedule of C is harder than the schedule of A.