We have learnt how to exploit the network topology to gauge the similarity (or dissimilarity) of pairs of nodes. This opens the possibility of defining a measure of heterogeneity for a node in terms of the dissimilarity of its neighbors:
A node is heterogeneous if it has dissimilar neighbors.
We will work on weighted undirected graphs. In the case of directed graphs, one edge direction, either incoming or outgoing, must be chosen. The unweighted case is a special case of the weighted one (with unit edge weights). Let $A = (a_{i,j})$ be the adjacency matrix of a weighted undirected graph. Let $i$ be a node with positive degree. We set:

$$p_j = \frac{a_{i,j}}{\sum_k a_{i,k}}$$

that is, $p_j$ is the fraction of the strength of node $i$ carried by the edge toward neighbor $j$.
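As a concrete illustration, here is a minimal sketch (on a hypothetical toy graph, not taken from the text) that builds the weighted adjacency matrix with igraph and normalizes one of its rows into the distribution $p$ of a node:

library(igraph)

# a hypothetical toy weighted undirected graph
g = graph_from_literal(a - b, a - c, a - d, b - c)
E(g)$weight = c(3, 1, 1, 5)

A = as_adjacency_matrix(g, attr = "weight", sparse = FALSE)

# distribution p for node "a": the share of its strength on each neighbor
p = A["a", ] / sum(A["a", ])
p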
We will work out the math for a general probability distribution $p = (p_1, \ldots, p_n)$, with $p_j \geq 0$ and $\sum_j p_j = 1$. A measure of heterogeneity of $p$ is Shannon entropy:

$$H(p) = -\sum_j p_j \log_2 p_j$$
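As a quick sanity check of the interpretation (a hypothetical example; the helper H below is just an inline version of the shannon function defined later), the entropy is maximal for the uniform distribution and zero when all the probability sits on one element:

# inline Shannon entropy, with the convention 0 * log2(0) = 0
H = function(p) {
  x = p * log2(p)
  -sum(replace(x, is.nan(x), 0))
}

H(rep(1/4, 4))           # uniform on 4 elements: maximal, log2(4) = 2
H(c(1, 0, 0, 0))         # all mass on one element: 0
H(c(0.7, 0.1, 0.1, 0.1)) # in between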
If we have information about the pairwise distance (dissimilarity) $d_{j,k}$ of elements $j$ and $k$, then another measure of heterogeneity is Rao quadratic entropy:

$$Q(p) = \sum_{j,k} d_{j,k} \, p_j \, p_k$$
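Rao quadratic entropy is the expected distance between two elements drawn independently from $p$. A small sketch (with a hypothetical distribution and distance matrix) shows that the double sum can be computed either with an outer product or with the matrix form $\mathrm{diag}(p) \, D \, \mathrm{diag}(p)$ used in the rao function below:

# a hypothetical distribution on 3 elements and a distance matrix among them
p = c(0.5, 0.3, 0.2)
D = matrix(c(0, 1, 2,
             1, 0, 1,
             2, 1, 0), nrow = 3, byrow = TRUE)

# sum over all pairs (j, k) of d[j, k] * p[j] * p[k]
sum(outer(p, p) * D)

# the same value, written in matrix form
sum(diag(p) %*% D %*% diag(p))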
Let us now assume the simplest distance among nodes: $d_{j,k} = 1$ for $j \neq k$ and $d_{j,j} = 0$. In this case:

$$Q(p) = 1 - \sum_j p_j^2$$

which is the Simpson index of the distribution $p$.
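This reduction is a one-line consequence of the normalization of $p$:

$$Q(p) = \sum_{j \neq k} p_j \, p_k = \Big(\sum_j p_j\Big)^2 - \sum_j p_j^2 = 1 - \sum_j p_j^2.$$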
It follows that, in general, $Q(p)$ is large, hence $p$ is heterogeneous, when $p$ evenly distributes its probability among dissimilar elements. On the contrary, $p$ is homogeneous when it concentrates its probability on similar elements. If we go back to the heterogeneity of nodes of a graph and apply $Q$ as a measure, we have that a node is heterogeneous when it evenly distributes its strength among dissimilar neighbors, and it is homogeneous when it concentrates its strength on similar neighbors.
library(igraph)

# Shannon entropy of a probability distribution p
shannon = function(p) {
  x = p * log2(p)
  # by convention, 0 * log2(0) = 0 (R computes it as NaN)
  x = replace(x, is.nan(x), 0)
  return(-sum(x))
}

# Simpson index of a probability distribution p
simpson = function(p) {
  x = 1 - sum(p * p)
  return(x)
}

# Rao quadratic entropy of p with respect to the distance matrix D
rao = function(p, D) {
  x = diag(p) %*% D %*% diag(p)
  return(sum(c(x)))
}

# heterogeneity of the nodes of a weighted graph g, given a node distance matrix D;
# each column (or row) of the weighted adjacency matrix is normalized into the
# strength distribution of the corresponding node
heterogeneity = function(g, D, mode = "col") {
  A = as_adjacency_matrix(g, attr = "weight", sparse = FALSE)
  if (mode == "col") {
    A = A %*% diag(1/colSums(A))
    dim = 2
  } else {
    A = diag(1/rowSums(A)) %*% A
    dim = 1
  }
  return(list(shannon = apply(A, dim, shannon),
              simpson = apply(A, dim, simpson),
              rao = apply(A, dim, rao, D)))
}
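A minimal usage sketch, on a hypothetical toy graph and with the simplest 0/1 node distance (not an example from the text). With this distance the rao scores coincide with the simpson scores, which is a handy sanity check of the implementation.

# a hypothetical toy weighted undirected graph
g = graph_from_literal(a - b, a - c, a - d, b - c)
E(g)$weight = c(3, 1, 1, 5)

# simplest node distance: 1 between distinct nodes, 0 on the diagonal
n = vcount(g)
D = matrix(1, nrow = n, ncol = n) - diag(n)

heterogeneity(g, D)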