Minimum Spanning Tree (MST)

A Minimum Spanning Tree is a subset of edges from a connected, weighted graph that:

  1. Connects all nodes in the graph without forming cycles.
  2. Has the minimum possible total weight.

An MST is unique if all edge weights are distinct. It produces a list of edges


Applications of MST

  • Network Design:
    • Building cost-effective networks (e.g., power grids, telecommunication networks).
  • Approximation Algorithms:
    • Solving problems like the Traveling Salesman Problem (TSP).
  • Clustering:
    • Grouping data by using MSTs in hierarchical clustering algorithms.

Algorithms for Finding MST

1. Prim’s Algorithm

  • Approach: Greedily builds the MST starting from an arbitrary node.
  • Steps:
    1. Start with any node.
    2. Add the smallest edge that connects a node in the MST to a node outside the MST.
    3. Repeat until all nodes are included.
  • Data Structures:
    • Min-Heap for efficiently finding the smallest edge.
  • Complexity:
    • (O(E \log V)), where (E) is the number of edges and (V) is the number of vertices.

Python Implementation:

import heapq
 
def prim(graph, start):
    mst = []
    visited = set()
    min_heap = [(0, start, None)]  # (weight, current_node, parent)
 
    while min_heap:
        weight, current_node, parent = heapq.heappop(min_heap)
 
        if current_node in visited:
            continue
        visited.add(current_node)
        if parent is not None:
            mst.append((parent, current_node, weight))
 
        for neighbor, edge_weight in graph[current_node]:
            if neighbor not in visited:
                heapq.heappush(min_heap, (edge_weight, neighbor, current_node))
 
    return mst

2. Kruskal’s Algorithm

  • Approach: Greedily adds edges to the MST in increasing order of weight, avoiding cycles.
  • Steps:
    1. Sort all edges by weight.
    2. Add the smallest edge to the MST if it doesn’t form a cycle.
    3. Use a Union-Find data structure to detect cycles.
  • Complexity:
    • (O(E \log E)), where sorting edges dominates the runtime.

Python Implementation:

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
 
    def find(self, node):
        if self.parent[node] != node:
            self.parent[node] = self.find(self.parent[node])
        return self.parent[node]
 
    def union(self, node1, node2):
        root1 = self.find(node1)
        root2 = self.find(node2)
        if root1 != root2:
            if self.rank[root1] > self.rank[root2]:
                self.parent[root2] = root1
            elif self.rank[root1] < self.rank[root2]:
                self.parent[root1] = root2
            else:
                self.parent[root2] = root1
                self.rank[root1] += 1
 
def kruskal(edges, num_nodes):
    edges.sort(key=lambda x: x[2])  # Sort by weight
    uf = UnionFind(num_nodes)
    mst = []
 
    for u, v, weight in edges:
        if uf.find(u) != uf.find(v):
            uf.union(u, v)
            mst.append((u, v, weight))
 
    return mst

Example Graph and Outputs

    A ---1--- B
    |        / |
    4      2   6
    |    /     |
    C ---3-----D

Graph Representation for Prim’s:

graph = {
    'A': [('B', 1), ('C', 4)],
    'B': [('A', 1), ('C', 2), ('D', 6)],
    'C': [('A', 4), ('B', 2), ('D', 3)],
    'D': [('B', 6), ('C', 3)],
}
mst = prim(graph, 'A')
print(mst)  # Output: [('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 3)]

Graph Representation for Kruskal’s:

edges = [
    ('A', 'B', 1),
    ('B', 'C', 2),
    ('C', 'D', 3),
    ('A', 'C', 4),
    ('B', 'D', 6),
]
mst = kruskal(edges, 4)
print(mst)  # Output: [('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 3)]

Comparison of Prim’s and Kruskal’s

FeaturePrim’s AlgorithmKruskal’s Algorithm
ApproachGrows MST from a starting nodeAdds edges in increasing order of weight
Data StructurePriority QueueUnion-Find
Best forDense GraphsSparse Graphs
Complexity(O(E \log V))(O(E \log E))