Disjoint Set Union (Union-Find)

The Power of DSU

DSU maintains a collection of disjoint sets and supports two operations:

Find: Which set does element x belong to?
Union: Merge two sets together.

With path compression and union by rank, both operations run in O(α(n)) ≈ O(1) amortized time.

Pattern Recognition Signals:

“Connected components” that change over time → DSU
“Are x and y in the same group?” → DSU
“Merge groups” → DSU
“Minimum Spanning Tree” → Kruskal’s + DSU
“Online connectivity queries” → DSU

Standard Implementation

The Core Idea: Represent each set as a tree. Each element points to its parent, and the root is the set’s representative. Path Compression: When we find the root of an element, we make all nodes along the path point directly to the root. This flattens the tree, making future queries faster. Union by Rank: When merging two sets, attach the shorter tree under the taller one. This keeps trees balanced. Visual Example:

Before find(4):           After path compression:
    1                           1
    |                        / | \
    2                       2  3  4
    |
    3
    |
    4

Finding root of 4: 4→3→2→1 (root is 1)
After compression: 4→1, 3→1, 2→1 (all point to root)

Why O(α(n))? The inverse Ackermann function α(n) grows incredibly slowly—for any practical n (even n = 10^80), α(n) ≤ 4. So it’s effectively constant time!

class DSU {
public:
    vector<int> parent, rank_;
    int components;
    
    DSU(int n) : parent(n + 1), rank_(n + 1, 0), components(n) {
        iota(parent.begin(), parent.end(), 0);  // parent[i] = i
    }
    
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);  // Path compression
        }
        return parent[x];
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;  // Already in same set
        
        // Union by rank
        if (rank_[px] < rank_[py]) swap(px, py);
        parent[py] = px;
        if (rank_[px] == rank_[py]) rank_[px]++;
        
        components--;
        return true;
    }
    
    bool connected(int x, int y) {
        return find(x) == find(y);
    }
};

Usage

int n = 5;
DSU dsu(n);

dsu.unite(1, 2);
dsu.unite(3, 4);
dsu.unite(2, 4);

cout << dsu.connected(1, 3);  // true
cout << dsu.components;        // 2 (sets: {1,2,3,4} and {5})

DSU with Size Tracking

Track size of each set for weighted merging.

class DSU {
public:
    vector<int> parent, size_;
    
    DSU(int n) : parent(n + 1), size_(n + 1, 1) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);
        }
        return parent[x];
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;
        
        // Union by size
        if (size_[px] < size_[py]) swap(px, py);
        parent[py] = px;
        size_[px] += size_[py];
        
        return true;
    }
    
    int getSize(int x) {
        return size_[find(x)];
    }
};

Pattern 1: Kruskal’s MST

Problem: Find Minimum Spanning Tree of a weighted graph. Why Kruskal + DSU? Kruskal’s algorithm sorts edges by weight and greedily adds each edge if it does not form a cycle. The cycle check is exactly the DSU “are these in the same set?” query. Sort takes O(E log E), and E union/find operations take O(E * alpha(V)), so total is O(E log E). Key insight: An MST of n nodes always has exactly n-1 edges. If fewer edges are used after processing all edges, the graph is disconnected (no spanning tree exists).

struct Edge {
    int u, v, w;
    bool operator<(const Edge& other) const {
        return w < other.w;
    }
};

long long kruskal(int n, vector<Edge>& edges) {
    sort(edges.begin(), edges.end());  // Sort by weight ascending
    
    DSU dsu(n);
    long long mstWeight = 0;
    int edgesUsed = 0;
    
    for (auto& [u, v, w] : edges) {
        if (dsu.unite(u, v)) {     // If u and v are in different components
            mstWeight += w;         // Add this edge to MST
            edgesUsed++;
            
            if (edgesUsed == n - 1) break;  // MST complete (n-1 edges)
        }
        // If unite returns false, adding this edge would form a cycle -- skip
    }
    
    // If edgesUsed < n-1, graph is disconnected
    return mstWeight;
}

Codeforces Problems:

Problem	Rating	Link
Road Reparation	1400	CSES
Road Construction	1300	CSES

Pattern 2: Cycle Detection

Problem: Detect if adding an edge creates a cycle.

bool addEdge(DSU& dsu, int u, int v) {
    if (dsu.connected(u, v)) {
        return true;  // Cycle detected
    }
    dsu.unite(u, v);
    return false;
}

// Count edges that create cycles
int countCycles = 0;
for (auto [u, v] : edges) {
    if (dsu.connected(u, v)) {
        countCycles++;
    } else {
        dsu.unite(u, v);
    }
}

Pattern 3: Connected Components Queries

Problem: Answer “are u and v connected?” after edge additions.

// Process edges online
void solve() {
    int n, m, q;
    cin >> n >> m >> q;
    
    DSU dsu(n);
    
    // Add initial edges
    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        dsu.unite(u, v);
    }
    
    // Answer queries
    for (int i = 0; i < q; i++) {
        int type, u, v;
        cin >> type >> u >> v;
        
        if (type == 1) {
            dsu.unite(u, v);
        } else {
            cout << (dsu.connected(u, v) ? "YES" : "NO") << "\n";
        }
    }
}

Pattern 4: DSU with Weighted Edges

Track something along the path from node to root.

class WeightedDSU {
public:
    vector<int> parent;
    vector<long long> dist;  // Distance from node to parent
    
    WeightedDSU(int n) : parent(n + 1), dist(n + 1, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    pair<int, long long> find(int x) {
        if (parent[x] == x) return {x, 0};
        
        auto [root, d] = find(parent[x]);
        parent[x] = root;
        dist[x] += d;
        
        return {root, dist[x]};
    }
    
    void unite(int x, int y, long long w) {
        // w = dist[y] - dist[x] (relative distance)
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px == py) return;
        
        parent[py] = px;
        dist[py] = dx - dy + w;
    }
    
    long long getDist(int x, int y) {
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px != py) return LLONG_MAX;  // Not connected
        return dy - dx;
    }
};

Application: “A is 5 units more than B” type constraints. Also used in problems like “determine if relationships are consistent” (e.g., “if X says A is heavier than B by 3 kg, and Y says B is heavier than C by 2 kg, what is the difference between A and C?”).

Contest tip: Weighted DSU is commonly tested in problems that give pairwise relative information and ask you to detect contradictions or compute absolute values. If you see “relative differences” or “potentials” between elements in the same group, think weighted DSU.

Pattern 5: Rollback DSU (Offline)

Problem: Process queries where edges can be added AND removed. Solution: Use union by size (no path compression) to enable rollback. Why no path compression? Path compression is irreversible — it flattens the tree structure, making it impossible to “undo” a union. Without path compression, each union only changes one parent pointer, which is easy to save and restore. When to use: Offline dynamic connectivity, divide and conquer on queries, and problems where you need to “undo” merges (e.g., process edges that exist only during a time interval).

class RollbackDSU {
public:
    vector<int> parent, size_;
    vector<pair<int, int>> history;  // {node, old_parent}
    
    RollbackDSU(int n) : parent(n + 1), size_(n + 1, 1) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    int find(int x) {
        while (parent[x] != x) x = parent[x];  // No path compression!
        return x;
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;
        
        if (size_[px] < size_[py]) swap(px, py);
        
        history.push_back({py, py});  // Save state
        history.push_back({px, size_[px]});  // Actually store size
        
        parent[py] = px;
        size_[px] += size_[py];
        
        return true;
    }
    
    void rollback(int savedState) {
        while (history.size() > savedState) {
            // Restore previous state
            // Implementation depends on what you're tracking
        }
    }
    
    int saveState() {
        return history.size();
    }
};

Pattern 6: DSU on Tree

Merge child sets when processing nodes bottom-up.

// Problem: For each node, count distinct values in subtree
vector<set<int>> nodeSet(n + 1);
vector<int> answer(n + 1);

void dfs(int u, int p) {
    nodeSet[u].insert(val[u]);
    
    for (int v : adj[u]) {
        if (v != p) {
            dfs(v, u);
            
            // Merge smaller into larger (small-to-large)
            if (nodeSet[u].size() < nodeSet[v].size()) {
                swap(nodeSet[u], nodeSet[v]);
            }
            
            for (int x : nodeSet[v]) {
                nodeSet[u].insert(x);
            }
        }
    }
    
    answer[u] = nodeSet[u].size();
}

Pattern 7: Bipartiteness Check with DSU

Problem: Check if graph can be 2-colored (bipartite).

class BipartiteDSU {
public:
    vector<int> parent;
    vector<int> dist;  // 0 or 1 (parity from root)
    
    BipartiteDSU(int n) : parent(n + 1), dist(n + 1, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    pair<int, int> find(int x) {
        if (parent[x] == x) return {x, 0};
        
        auto [root, d] = find(parent[x]);
        parent[x] = root;
        dist[x] ^= d;
        
        return {root, dist[x]};
    }
    
    bool addEdge(int x, int y) {
        // Returns false if adding edge (x, y) violates bipartiteness
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px == py) {
            return dx != dy;  // Must have different parities
        }
        
        parent[py] = px;
        dist[py] = dx ^ dy ^ 1;  // Force different parity
        
        return true;
    }
};

Common Mistakes

Mistake 1: Not initializing parent correctly

// WRONG
vector<int> parent(n + 1);  // All zeros!

// CORRECT
iota(parent.begin(), parent.end(), 0);  // parent[i] = i

Mistake 2: Union without find

// WRONG
parent[x] = y;

// CORRECT
parent[find(x)] = find(y);

Mistake 3: Using path compression with rollback Path compression destroys the tree structure needed for rollback.

Complexity Analysis

Operation	Without Optimization	Path Compression	Path + Rank
Find	O(n)	O(log n) amortized	O(α(n)) ≈ O(1)
Union	O(n)	O(log n) amortized	O(α(n)) ≈ O(1)

α(n) is the inverse Ackermann function, effectively constant for all practical n.

Practice Problems

Beginner (1000-1300)

Problem	Pattern	Link
Road Construction	Basic DSU	CSES
Road Reparation	Kruskal MST	CSES

Intermediate (1300-1600)

Problem	Pattern	Link
Planets Queries I	DSU + binary lifting	CSES
Ice Skating	Components	CF 217A

Advanced (1600-1900)

Problem	Pattern	Link
New Roads Queries	Offline DSU	CSES
Giant Pizza	2-SAT + DSU	CSES

Key Takeaways

Path Compression

Point nodes directly to root during find.

Union by Rank/Size

Attach smaller tree under larger tree.

Kruskal's MST

Sort edges, greedily add if no cycle.

Near-Constant Time

O(α(n)) per operation with both optimizations.

Next Up

Chapter 17: Segment Trees

Master range queries and point updates with segment trees.

​Disjoint Set Union (Union-Find)

​The Power of DSU

​Standard Implementation

​Usage

​DSU with Size Tracking

​Pattern 1: Kruskal’s MST

​Pattern 2: Cycle Detection

​Pattern 3: Connected Components Queries

​Pattern 4: DSU with Weighted Edges

​Pattern 5: Rollback DSU (Offline)

​Pattern 6: DSU on Tree

​Pattern 7: Bipartiteness Check with DSU

​Common Mistakes

​Complexity Analysis

​Practice Problems

​Beginner (1000-1300)

​Intermediate (1300-1600)

​Advanced (1600-1900)

​Key Takeaways

Path Compression

Union by Rank/Size

Kruskal's MST

Near-Constant Time

​Next Up

Chapter 17: Segment Trees

Disjoint Set Union (Union-Find)

The Power of DSU

Standard Implementation

Usage

DSU with Size Tracking

Pattern 1: Kruskal’s MST

Pattern 2: Cycle Detection

Pattern 3: Connected Components Queries

Pattern 4: DSU with Weighted Edges

Pattern 5: Rollback DSU (Offline)

Pattern 6: DSU on Tree

Pattern 7: Bipartiteness Check with DSU

Common Mistakes

Complexity Analysis

Practice Problems

Beginner (1000-1300)

Intermediate (1300-1600)

Advanced (1600-1900)

Key Takeaways

Next Up