A Python project that analyzes a social network graph to identify the most influential people using graph centrality metrics.
This project uses Zachary's Karate Club dataset — a well-known social network dataset that represents friendships between 34 members of a university karate club. We apply graph analytics techniques to figure out which members are the most influential in the network.
The analysis uses two key centrality metrics:
- Degree Centrality – how many direct connections a person has
- Betweenness Centrality – how often a person acts as a bridge between others
By combining these metrics, we can identify the top 5 most influential members of the club.
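As a quick sanity check on the dataset described above, the sketch below loads the graph and counts its members and connections. It assumes the copy of Zachary's Karate Club bundled with NetworkX (`nx.karate_club_graph()`); the project's own `graph_loader.py` may load it differently, and the variable names are illustrative.

```python
import networkx as nx

# Assumption: use the copy of Zachary's Karate Club that ships with NetworkX,
# so no external data file is needed.
G = nx.karate_club_graph()

num_members = G.number_of_nodes()      # one node per club member
num_connections = G.number_of_edges()  # one edge per friendship

print(f"Members: {num_members}, connections: {num_connections}")
```

The counts should match the 34 nodes and 78 edges reported in the sample output.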
- Python 3.8+
- NetworkX – for graph creation and analysis
- Pandas – for organizing and displaying results
- Matplotlib – for visualizing the network graph
social-network-analyzer/
│
├── data/
│   └── dataset_info.txt           # Info about the dataset
│
├── src/
│   ├── graph_loader.py            # Loads the karate club graph
│   ├── centrality_analysis.py     # Calculates centrality metrics
│   ├── visualization.py           # Creates the network plot
│   └── main.py                    # Main script to run everything
│
├── results/
│   └── graph_visualization.png    # Saved graph image (generated)
│
├── requirements.txt
└── README.md
- Clone the repository:

      git clone https://github.com/your-username/social-network-analyzer.git
      cd social-network-analyzer

- Install dependencies:

      pip install -r requirements.txt

- Run the analysis:

      python src/main.py

The script will print the results in the console and save a graph visualization to the results/ folder.
Degree centrality measures how many connections (edges) a node has compared to the maximum possible. A person with high degree centrality knows a lot of people directly.
Formula: degree_centrality(v) = degree(v) / (n - 1)
where n is the total number of nodes.
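The formula above can be checked against NetworkX's built-in `nx.degree_centrality`. This is a minimal sketch rather than the project's `centrality_analysis.py`; the names `manual` and `top5` are illustrative.

```python
import networkx as nx

G = nx.karate_club_graph()
n = G.number_of_nodes()

# Built-in normalized degree centrality.
degree_centrality = nx.degree_centrality(G)

# The same quantity computed by hand: degree(v) / (n - 1).
manual = {v: G.degree(v) / (n - 1) for v in G.nodes}

# Five nodes with the highest degree centrality, highest first.
top5 = sorted(degree_centrality.items(), key=lambda kv: kv[1], reverse=True)[:5]
for node, score in top5:
    print(f"Node {node}: {score:.4f}")
```

For example, node 33 has 17 connections in a 34-node graph, so its score is 17 / 33 ≈ 0.5152, matching the sample output.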
Betweenness centrality measures how often a node appears on the shortest path between two other nodes. A person with high betweenness centrality is a key connector — removing them would disrupt communication in the network.
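Formula: betweenness(v) = Σ (σ_st(v) / σ_st), summed over all pairs of nodes s, t with s ≠ v ≠ t
where σ_st is the total number of shortest paths from s to t, and σ_st(v) is the number of those paths that pass through v.
The scores in the sample output lie between 0 and 1 because the sum is normalized by the number of node pairs that exclude v, which is (n - 1)(n - 2) / 2 for an undirected graph.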
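A short sketch of how betweenness could be computed with NetworkX's `nx.betweenness_centrality`, assuming the bundled karate club graph and default arguments. It also compares the raw pair count with the normalized score; the variable names are illustrative.

```python
import networkx as nx

G = nx.karate_club_graph()
n = G.number_of_nodes()

# Normalized scores (the default), comparable across graphs of different sizes.
normalized = nx.betweenness_centrality(G)

# Raw scores: the plain sum of sigma_st(v) / sigma_st over unordered pairs s, t.
raw = nx.betweenness_centrality(G, normalized=False)

# For an undirected graph, normalizing divides by (n - 1)(n - 2) / 2.
pair_count = (n - 1) * (n - 2) / 2

top_node = max(normalized, key=normalized.get)
print(f"Top bridge: node {top_node}, "
      f"raw {raw[top_node]:.2f}, normalized {normalized[top_node]:.4f}")
```

Node 0 comes out on top here, which is consistent with the betweenness ranking in the sample output.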
We calculate a simple combined score by averaging the degree and betweenness centralities. This gives us a balanced view of overall influence.
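The averaging step might look like the following sketch, which assumes Pandas is used to hold the scores. The column names are borrowed from the sample output; the `scores` and `top5` names are illustrative and not necessarily those used in `centrality_analysis.py`.

```python
import networkx as nx
import pandas as pd

G = nx.karate_club_graph()

# One row per node, one column per metric.
scores = pd.DataFrame({
    "Degree Centrality": pd.Series(nx.degree_centrality(G)),
    "Betweenness Centrality": pd.Series(nx.betweenness_centrality(G)),
})
scores.index.name = "Node"

# Combined score: the plain average of the two metrics.
scores["Combined Score"] = scores[
    ["Degree Centrality", "Betweenness Centrality"]
].mean(axis=1)

# Rank by combined score, highest first, and keep the top five.
top5 = scores.sort_values("Combined Score", ascending=False).head(5)
print(top5.round(4))
```

Sorting by the combined score places node 0 (about 0.4612) ahead of node 33 (about 0.4096), since node 0's higher betweenness outweighs node 33's slightly higher degree centrality.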
=======================================================
SOCIAL NETWORK INFLUENCE ANALYZER
Analyzing Zachary's Karate Club Dataset
=======================================================
[Step 1] Loading the social network graph...
Graph loaded successfully!
Number of nodes (members): 34
Number of edges (connections): 78
[Step 2] Calculating centrality metrics...
=======================================================
CENTRALITY ANALYSIS RESULTS
=======================================================
--- Degree Centrality (top 5) ---
Node 33: 0.5152
Node 0: 0.4848
Node 32: 0.3636
Node 2: 0.3030
Node 1: 0.2727
--- Betweenness Centrality (top 5) ---
Node 0: 0.4376
Node 33: 0.3041
Node 32: 0.1452
Node 2: 0.1437
Node 31: 0.1383
--- Top 5 Most Influential Nodes (Combined Score) ---
Node  Degree Centrality  Betweenness Centrality  Combined Score
  33             0.5152                  0.3041          0.4096
   0             0.4848                  0.4376          0.4612
  32             0.3636                  0.1452          0.2544
   2             0.3030                  0.1437          0.2234
   1             0.2727                  0.0566          0.1647
[Step 3] Generating network visualization...
Graph visualization saved to: results/graph_visualization.png
=======================================================
Analysis complete! Check the results/ folder
for the saved graph visualization.
=======================================================
The generated plot shows the karate club network where:
- Node size represents the number of connections (bigger = more connections)
- Node color ranges from yellow (few connections) to red (many connections)
- Edges represent social connections between members
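A plot with these properties could be produced along the following lines. This is a sketch under assumptions, not the project's `visualization.py`: the spring layout, the size scaling, the `YlOrRd` colormap, and the output file name `karate_club_sketch.png` are all illustrative choices.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import networkx as nx

G = nx.karate_club_graph()
degrees = [G.degree(v) for v in G.nodes]

# Fixed seed so the layout is reproducible between runs.
pos = nx.spring_layout(G, seed=42)

fig, ax = plt.subplots(figsize=(8, 6))
nodes = nx.draw_networkx_nodes(
    G, pos,
    node_size=[100 + 60 * d for d in degrees],  # bigger = more connections
    node_color=degrees,                         # yellow (few) to red (many)
    cmap=plt.cm.YlOrRd,
    ax=ax,
)
nx.draw_networkx_edges(G, pos, alpha=0.4, ax=ax)
nx.draw_networkx_labels(G, pos, font_size=8, ax=ax)
fig.colorbar(nodes, ax=ax, label="Number of connections")
ax.set_axis_off()

# Hypothetical output path; the project itself writes to results/.
out_path = Path("karate_club_sketch.png")
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)
```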
This project is for educational purposes.