Betweenness Centrality: Identifying Key Nodes in Networks and Graphs

Learn about betweenness centrality, a metric for measuring the importance of nodes in a network. This guide explains how betweenness centrality is calculated in both unweighted and weighted graphs, and its applications in identifying critical points within network structures.




What is Betweenness Centrality?

Betweenness centrality measures the importance of a node (a point or device) within a network or graph. It quantifies how often a node lies on the shortest paths between other pairs of nodes. Nodes with high betweenness centrality act as bridges, connecting different parts of the network. This makes the metric useful for understanding network structure and for identifying critical points whose failure or removal would disrupt many routes.

Calculating Betweenness Centrality

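For each node v, betweenness centrality sums, over every pair of other nodes s and t, the fraction of shortest paths from s to t that pass through v:

g(v) = Σ σ_st(v) / σ_st   (summed over all pairs s, t with s ≠ v ≠ t)

Here σ_st is the total number of shortest paths between s and t, and σ_st(v) is the number of those paths that pass through v. When every pair has exactly one shortest path, this reduces to counting the shortest paths that pass through v.

In an unweighted graph (where all connections have equal cost), a shortest path is one with the fewest edges. In a weighted graph (where connections have different costs), a shortest path is one with the smallest total edge weight.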
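As an illustration of the calculation, here is a minimal Python sketch for a small unweighted, undirected graph. It runs a breadth-first search from every node to count shortest paths, then walks back from the farthest nodes to accumulate each node's share of those paths. The adjacency-dictionary input format and the function name `betweenness` are assumptions made for this example, not part of any particular library.

```python
from collections import deque

def betweenness(adj):
    """Exact betweenness for an unweighted, undirected graph.

    adj maps each node to a list of its neighbours (hypothetical input format).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:              # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:     # v precedes w on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, passing path shares to predecessors.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was counted twice, once from each endpoint.
    return {v: c / 2 for v, c in bc.items()}

# Path graph: a - b - c - d
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = betweenness(path)
```

On the path a - b - c - d, nodes b and c each lie on two shortest paths between other nodes, so each scores 2, while the two endpoints score 0.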

Normalization:

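To make betweenness centrality scores easier to compare across different graphs, they are often normalized. One widely used convention divides each score by the number of node pairs that exclude v, which is (n - 1)(n - 2) / 2 in an undirected graph with n nodes. Another method, shown here, is min-max rescaling, which maps the scores within a single graph to a range between 0 and 1: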

normalized_centrality(v) = (centrality(v) - min_centrality) / (max_centrality - min_centrality)
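The formula above can be sketched in a few lines of Python. The function name and the raw scores below are hypothetical, and the handling of the case where all scores are equal (which would otherwise divide by zero) is a choice made for this example.

```python
def min_max_normalize(scores):
    """Rescale raw centrality scores to the range [0, 1]."""
    lo = min(scores.values())
    hi = max(scores.values())
    if hi == lo:
        # All scores are equal, so the formula would divide by zero.
        # Returning 0.0 for every node is an assumption of this sketch.
        return {v: 0.0 for v in scores}
    return {v: (c - lo) / (hi - lo) for v, c in scores.items()}

# Hypothetical raw scores for four nodes.
raw = {"a": 0.0, "b": 2.0, "c": 1.0, "d": 0.0}
normalized = min_max_normalize(raw)
```

Because the minimum and maximum come from the graph itself, the highest-scoring node always maps to 1 and the lowest to 0, whatever their raw values were.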

Betweenness Centrality in Weighted Networks

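In weighted networks, connections (edges) have weights representing factors like bandwidth, distance, or influence. The weights are often stored in a weighted adjacency matrix. The length of a path is then the sum of its edge weights, so shortest paths have to be found with a weighted search such as Dijkstra's algorithm rather than by counting hops. Because the weights act as costs, a weight where larger means stronger (bandwidth, influence) is usually converted to a cost before paths are computed. In some types of networks (scale-free networks), the strength of a node, meaning the total weight of its edges, might follow a power law in its degree k: S(k) ≈ k^β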
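To show how weights can change the result, here is a minimal Python sketch that replaces hop counting with a Dijkstra-style search. It assumes an undirected graph with strictly positive edge costs, given as a dictionary of dictionaries; the function name and input format are assumptions made for this example. Equal path lengths are detected with `==`, which is reliable for integer costs but may miss ties with floating-point costs.

```python
import heapq
from itertools import count

def weighted_betweenness(adj):
    """Exact betweenness for an undirected graph with positive edge costs.

    adj maps each node to a dict {neighbour: cost} (hypothetical input format).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        dist = {v: float("inf") for v in adj}
        dist[s] = 0
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        done, order = set(), []
        tie = count()                 # tie-breaker so the heap never compares nodes
        heap = [(0, next(tie), s)]
        while heap:
            d, _, v = heapq.heappop(heap)
            if v in done:
                continue              # stale heap entry
            done.add(v)
            order.append(v)
            for w, cost in adj[v].items():
                nd = d + cost
                if nd < dist[w]:      # strictly shorter path found
                    dist[w] = nd
                    sigma[w] = sigma[v]
                    preds[w] = [v]
                    heapq.heappush(heap, (nd, next(tie), w))
                elif nd == dist[w]:   # another path of equal length
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path shares from the farthest nodes back toward s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: c / 2 for v, c in bc.items()}

# Triangle where the direct a-c edge is expensive.
graph = {
    "a": {"b": 1, "c": 5},
    "b": {"a": 1, "c": 1},
    "c": {"a": 5, "b": 1},
}
weighted_scores = weighted_betweenness(graph)

# Node strength: the total weight of the edges attached to each node.
strength = {v: sum(nbrs.values()) for v, nbrs in graph.items()}
```

In this triangle every pair of nodes is directly connected, so no node would lie between the others if hops were counted. With the costs shown, the route from a to c through b (total cost 2) beats the direct edge (cost 5), which gives b a score of 1.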

Computational Challenges and Sampling

Calculating betweenness centrality can be computationally expensive for large graphs. Brandes' algorithm, the standard exact method, needs one shortest-path search from every node and runs in O(nm) time on an unweighted graph with n nodes and m edges. Even this can be slow for extremely large networks. To further improve efficiency, *sampling* is employed: instead of running the shortest-path search from *all* nodes, you run it from a randomly selected subset of source nodes and scale the results up to estimate the score of every node.
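A minimal sketch of the sampling idea in Python, for an unweighted, undirected graph: run the per-source computation from only k randomly chosen nodes, then scale the totals by n / k. The function names, the uniform choice of sources, and the scaling are assumptions made for this example; the result is an estimate, and it matches the exact scores only when k equals the number of nodes.

```python
import random
from collections import deque

def source_dependencies(adj, s):
    """One BFS from s: each node's share of the shortest paths that start at s."""
    dist = {s: 0}
    sigma = {v: 0 for v in adj}
    sigma[s] = 1
    preds = {v: [] for v in adj}
    order = []
    queue = deque([s])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    delta = {v: 0.0 for v in adj}
    for w in reversed(order):
        for v in preds[w]:
            delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
    delta[s] = 0.0  # the source itself is an endpoint, not an intermediate node
    return delta

def approx_betweenness(adj, k, seed=None):
    """Estimate betweenness from k uniformly sampled source nodes."""
    rng = random.Random(seed)
    nodes = list(adj)
    sources = rng.sample(nodes, k)      # k distinct sources
    est = {v: 0.0 for v in nodes}
    for s in sources:
        for v, d in source_dependencies(adj, s).items():
            est[v] += d
    # n / k extrapolates to all sources; / 2 because each undirected
    # pair would be seen from both of its endpoints.
    scale = len(nodes) / k / 2
    return {v: c * scale for v, c in est.items()}

path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
estimate = approx_betweenness(path, k=2, seed=0)   # uses half of the sources
full = approx_betweenness(path, k=4, seed=0)       # uses every source
```

With k equal to the number of nodes, every source is used and the scaling factor only removes the double counting, so `full` holds the exact scores. With a smaller k the work drops proportionally, at the price of estimation error that depends on which sources happen to be drawn.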

Factors Affecting Sampling:

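  • Parallelism: The searches from different source nodes are independent, so they can run on multiple processors to speed up the calculation (but each worker needs its own working memory, which can increase memory use).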
  • Sampling Size: A larger sample is more accurate but takes longer to compute.

Sampling Strategies:

Various strategies exist for selecting nodes for sampling. One common method is to select nodes with a probability proportional to their degree (the number of connections they have). This is because nodes with many connections are more likely to be on many shortest paths.
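Degree-proportional selection can be sketched with Python's standard `random.choices`, which draws with replacement according to the given weights. The function name and return format are assumptions made for this example. Note that when nodes are not chosen uniformly, an estimator generally has to reweight each sampled node's contribution by its selection probability to avoid bias, which is why the sketch also returns those probabilities.

```python
import random

def sample_by_degree(adj, k, seed=None):
    """Draw k nodes (with replacement) with probability proportional to degree.

    adj maps each node to a list of its neighbours (hypothetical input format).
    Returns the sampled nodes and each node's selection probability.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    degrees = [len(adj[v]) for v in nodes]
    total = sum(degrees)
    if total == 0:
        raise ValueError("graph has no edges, so degrees cannot serve as weights")
    probs = {v: d / total for v, d in zip(nodes, degrees)}
    sampled = rng.choices(nodes, weights=degrees, k=k)
    return sampled, probs

# Star graph: the hub has degree 3, each leaf has degree 1.
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
sampled, probs = sample_by_degree(star, k=5, seed=1)
```

In the star graph the degrees sum to 6, so the hub is drawn with probability 3/6 and each leaf with probability 1/6.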

Applications of Betweenness Centrality

Betweenness centrality is useful in many fields:

  • Network Analysis: Identifying critical nodes in infrastructure networks.
  • Social Network Analysis: Finding people who act as brokers between otherwise separate groups.
  • Transportation Networks: Identifying bottleneck junctions and optimizing routes.
  • Biology: Analyzing biological networks.

Conclusion

Betweenness centrality is a valuable metric for identifying important nodes in a network. While computationally intensive for large networks, efficient algorithms and sampling-based approximation help to make it practical for large-scale analysis.