What is Closeness Centrality?
Closeness centrality is a popular centrality measure. This article explains how and why the different types of closeness centrality are calculated and discusses the important assumptions behind them.
Closeness in Undirected Networks
In 1950, Bavelas introduced closeness centrality as a measure for undirected networks [1]. It reflects how close a node is, on 'average', to the other nodes in the network. The closeness centrality of a node is defined as:
- = inverse of the 'average' distance (to the other nodes in the network);
- = reciprocal of the farness;
- = 1 / sum of the geodesic distances;
- = 1 / sum of the lengths of the shortest paths.
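The definition can also be written as a formula. The notation is our own assumption and is not taken from [1]: C(i) is the closeness of node i, and d(i, j) is the geodesic distance between nodes i and j.

```latex
C(i) = \frac{1}{\sum_{j \neq i} d(i, j)}
```

The denominator is the farness of node i, so a node with a small total distance to the others gets a high closeness.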
In an unweighted network, all links have the same length. To keep it simple, we take a length of 1.
Examples of Closeness Centrality
Closeness centrality can be used to find nodes in a network that can quickly interact with other nodes. An example of an undirected social network is Facebook. Quick interaction is important in diffusion processes, but it is not always a positive property: when a disease spreads across a network, nodes with high closeness centrality are potential super spreaders.
How Do I Compute It?
The geodesic distance matrix is another representation of the network: the entry in row i and column j holds the length of the shortest path between nodes i and j. The score is easier to calculate from this matrix. The closeness of a node in an undirected network is the inverse of the sum of the values in its row.
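The row-sum calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. The function names and the four-node chain network are hypothetical, and the graph is assumed to be undirected, unweighted, and connected.

```python
from collections import deque

def geodesic_distances(adjacency, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def closeness(adjacency):
    """Closeness = 1 / (sum of one row of the geodesic distance matrix)."""
    scores = {}
    for node in adjacency:
        row = geodesic_distances(adjacency, node)
        if len(row) != len(adjacency):
            # Some node is unreachable, so the row sum is undefined.
            raise ValueError("graph is not connected")
        scores[node] = 1 / sum(row.values())
    return scores

# Hypothetical example network: a chain A - B - C - D.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = closeness(chain)
print(scores)
```

In this chain, the row for B holds the distances 1, 0, 1, and 2, so its farness is 4 and its closeness is 0.25. The end node A has a farness of 6 and therefore a lower closeness of 1/6.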
Closeness centrality assumes that everything always follows the shortest path. However, reality is more complex, and the chosen path is not always the shortest.
Constraint: Reachable Nodes
Sometimes some nodes are unreachable, and the distance to them is undefined. There are some solutions to this problem. Some authors suggest replacing the undefined distances with an infinite distance or, more practically, a large value. Csardi & Nepusz propose replacing the infinite distance with the largest possible geodesic length [3]. In a network with n nodes, this is the length n-1, as in a chain network.
But these solutions can pose scaling issues, which can lead to wrong interpretations. Therefore, a constraint of closeness centrality is that there must be a path between each pair of nodes.
Improved Variations
Closeness centrality is just one of many measures of centrality. There are also variations of closeness centrality that deal with the limitation and can be applied to disconnected networks: harmonic centrality, information centrality, effective distance-based closeness centrality, random walk closeness centrality, hierarchical closeness centrality, forest distance closeness centrality, influence range closeness centrality, and more.
Some of these also contain other improvements. Most alternatives have similar use cases.
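As one illustration, harmonic centrality sums the reciprocals of the distances instead of taking the reciprocal of their sum, so an unreachable node simply contributes 0. The sketch below is our own minimal version for an unweighted, undirected network. The function name and the example network are hypothetical.

```python
from collections import deque

def harmonic_centrality(adjacency):
    """Sum of 1 / d(i, j) over all reachable nodes j other than i."""
    scores = {}
    for source in adjacency:
        # Breadth-first search from source; unreachable nodes never enter dist.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        # Skip the source itself (distance 0); unreachable nodes add nothing.
        scores[source] = sum(1 / d for d in dist.values() if d > 0)
    return scores

# Hypothetical disconnected network: chain A - B - C and a separate pair D - E.
network = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": ["E"], "E": ["D"]}
scores = harmonic_centrality(network)
print(scores)
```

Every node gets a finite score even though the network has two components, without substituting an arbitrary large distance.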
Closeness in Directed Networks
In practice, most use cases involve directed networks; fortunately, we can also calculate closeness centrality for these networks. For directed networks, most software only calculates the outbound closeness centrality. The nodes with the highest outbound closeness reach all others in the network most quickly, so this measure helps find good broadcasters. This is important for diffusion processes.
Inbound closeness can be useful if we want to find the optimal location for our customers or improve the search results of a website. In these cases, the incoming paths are meaningful.
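The difference between the two directions can be sketched as follows. Inbound closeness is computed here by running the same calculation on the reversed network. The function names and the three-node example are hypothetical, and the sketch assumes an unweighted network in which every node can reach every other node.

```python
from collections import deque

def bfs_distances(successors, source):
    """Hop counts along outgoing links from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in successors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def reverse(successors):
    """Flip the direction of every link."""
    reversed_graph = {node: [] for node in successors}
    for node, targets in successors.items():
        for target in targets:
            reversed_graph[target].append(node)
    return reversed_graph

def outbound_closeness(successors):
    """1 / (sum of distances from a node to all others)."""
    scores = {}
    for node in successors:
        dist = bfs_distances(successors, node)
        if len(dist) != len(successors):
            raise ValueError("some node is unreachable")
        scores[node] = 1 / sum(dist.values())
    return scores

# Hypothetical directed network: A -> B, A -> C, B -> C, C -> A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
outbound = outbound_closeness(graph)
inbound = outbound_closeness(reverse(graph))  # distances *to* each node
print(outbound, inbound)
```

Here A has the highest outbound closeness because it reaches both other nodes in one step, while C has the highest inbound closeness because both other nodes reach it in one step.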
Closeness in Weighted Networks
You can also calculate closeness centrality for weighted networks. Now, not all links have the same length. The length of a path is the sum of the weights along it, the farness is the sum of the resulting shortest-path lengths, and the inverse of this sum is the weighted closeness centrality.
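A minimal sketch of the weighted case is shown below. It uses Dijkstra's algorithm to find the shortest weighted paths, which is our choice for illustration. It assumes that the weights are non-negative lengths (a larger weight means the nodes are further apart) and that the network is connected. The function names and example weights are hypothetical.

```python
import heapq

def weighted_distances(graph, source):
    """Dijkstra's algorithm: shortest weighted path lengths from source."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry; a shorter path was already found
        for neighbour, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist

def weighted_closeness(graph):
    """1 / (sum of weighted shortest-path lengths to all other nodes)."""
    return {node: 1 / sum(weighted_distances(graph, node).values())
            for node in graph}

# Hypothetical undirected network; each weight is the length of a link.
graph = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2},
    "C": {"A": 5, "B": 2},
}
scores = weighted_closeness(graph)
print(scores)
```

The direct link between A and C has length 5, but the path through B has length 1 + 2 = 3, so the geodesic distance between A and C is 3. The farness of A is therefore 1 + 3 = 4.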
Closeness Normalisation
We should normalise if we want to compare the closeness centrality of nodes in graphs of varying sizes. We do this by multiplying the previous score by the number of other nodes in the network. If we do not allow a link between a node and itself, this number is n-1.
This places all scores in the range of 0 to 1. A node with a normalised closeness of 1 is directly connected to all other nodes in the network. Now networks of different sizes have the same scale and can be compared.
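In formula form, using our own notation (n is the number of nodes and d(i, j) is the geodesic distance between nodes i and j), the normalised score is the inverse of the mean distance to the other nodes:

```latex
C_{\text{norm}}(i) = \frac{n - 1}{\sum_{j \neq i} d(i, j)}
```

For example, the centre of a star network with 4 nodes is at distance 1 from each of the 3 other nodes, which gives 3 / 3 = 1. Each outer node is at distance 1 from the centre and distance 2 from the two other outer nodes, which gives 3 / 5 = 0.6.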
Pros Closeness Centrality
- Entire network: Closeness centrality takes into account the position of a node in relation to the entire network, which is a significant advantage.
Cons Closeness Centrality
- Sensitive: Because we need the total network to calculate closeness, this centrality is sensitive to changes in this network.
- All nodes reachable: Another limitation is the condition that all nodes must be reachable.
- Similar scores: In a highly connected network, you will often find that all nodes have a similar score. Searching for closeness in a subnetwork may then be more informative.
- Assumption shortest path: In real life, a flow doesn't always take the shortest path.
Conclusion
The type of network determines how to calculate closeness centrality. We also have to check whether there is a path between each pair of nodes; if not, this causes complications. Closeness centrality may be a popular centrality measure, but harmonic centrality or other alternatives are, in most cases, more practical.
References
[1] Bavelas, A. (1950). Communication patterns in task-oriented groups. The Journal of the Acoustical Society of America, 22(6), 725-730.
[2] Vignery, K., & Laurier, W. (2020). A methodology and theoretical taxonomy for centrality measures: What are the best centrality indicators for student networks?. PLoS One, 15(12), e0244377.
[3] Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, complex systems, 1695(5), 1-9.