Skip to main content

Disjoint Set Union (Union-Find)

DSU Visualization

The Power of DSU

DSU maintains a collection of disjoint sets and supports two operations:
  • Find: Which set does element x belong to?
  • Union: Merge two sets together.
With path compression and union by rank, both operations run in O(α(n)) ≈ O(1) amortized time.
Pattern Recognition Signals:
  • “Connected components” that change over time → DSU
  • “Are x and y in the same group?” → DSU
  • “Merge groups” → DSU
  • “Minimum Spanning Tree” → Kruskal’s + DSU
  • “Online connectivity queries” → DSU

Standard Implementation

The Core Idea: Represent each set as a tree. Each element points to its parent, and the root is the set’s representative. Path Compression: When we find the root of an element, we make all nodes along the path point directly to the root. This flattens the tree, making future queries faster. Union by Rank: When merging two sets, attach the shorter tree under the taller one. This keeps trees balanced. Visual Example:
Before find(4):           After path compression:
    1                           1
    |                        / | \
    2                       2  3  4
    |
    3
    |
    4

Finding root of 4: 4→3→2→1 (root is 1)
After compression: 4→1, 3→1, 2→1 (all point to root)
Why O(α(n))? The inverse Ackermann function α(n) grows incredibly slowly—for any practical n (even n = 10^80), α(n) ≤ 4. So it’s effectively constant time!
class DSU {
public:
    vector<int> parent, rank_;
    int components;
    
    DSU(int n) : parent(n + 1), rank_(n + 1, 0), components(n) {
        iota(parent.begin(), parent.end(), 0);  // parent[i] = i
    }
    
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);  // Path compression
        }
        return parent[x];
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;  // Already in same set
        
        // Union by rank
        if (rank_[px] < rank_[py]) swap(px, py);
        parent[py] = px;
        if (rank_[px] == rank_[py]) rank_[px]++;
        
        components--;
        return true;
    }
    
    bool connected(int x, int y) {
        return find(x) == find(y);
    }
};

Usage

int n = 5;
DSU dsu(n);

dsu.unite(1, 2);
dsu.unite(3, 4);
dsu.unite(2, 4);

cout << dsu.connected(1, 3);  // true
cout << dsu.components;        // 2 (sets: {1,2,3,4} and {5})

DSU with Size Tracking

Track size of each set for weighted merging.
class DSU {
public:
    vector<int> parent, size_;
    
    DSU(int n) : parent(n + 1), size_(n + 1, 1) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);
        }
        return parent[x];
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;
        
        // Union by size
        if (size_[px] < size_[py]) swap(px, py);
        parent[py] = px;
        size_[px] += size_[py];
        
        return true;
    }
    
    int getSize(int x) {
        return size_[find(x)];
    }
};

Pattern 1: Kruskal’s MST

Problem: Find Minimum Spanning Tree of a weighted graph.
struct Edge {
    int u, v, w;
    bool operator<(const Edge& other) const {
        return w < other.w;
    }
};

long long kruskal(int n, vector<Edge>& edges) {
    sort(edges.begin(), edges.end());
    
    DSU dsu(n);
    long long mstWeight = 0;
    int edgesUsed = 0;
    
    for (auto& [u, v, w] : edges) {
        if (dsu.unite(u, v)) {
            mstWeight += w;
            edgesUsed++;
            
            if (edgesUsed == n - 1) break;  // MST complete
        }
    }
    
    return mstWeight;
}
Codeforces Problems:
ProblemRatingLink
Road Reparation1400CSES
Road Construction1300CSES

Pattern 2: Cycle Detection

Problem: Detect if adding an edge creates a cycle.
bool addEdge(DSU& dsu, int u, int v) {
    if (dsu.connected(u, v)) {
        return true;  // Cycle detected
    }
    dsu.unite(u, v);
    return false;
}

// Count edges that create cycles
int countCycles = 0;
for (auto [u, v] : edges) {
    if (dsu.connected(u, v)) {
        countCycles++;
    } else {
        dsu.unite(u, v);
    }
}

Pattern 3: Connected Components Queries

Problem: Answer “are u and v connected?” after edge additions.
// Process edges online
void solve() {
    int n, m, q;
    cin >> n >> m >> q;
    
    DSU dsu(n);
    
    // Add initial edges
    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        dsu.unite(u, v);
    }
    
    // Answer queries
    for (int i = 0; i < q; i++) {
        int type, u, v;
        cin >> type >> u >> v;
        
        if (type == 1) {
            dsu.unite(u, v);
        } else {
            cout << (dsu.connected(u, v) ? "YES" : "NO") << "\n";
        }
    }
}

Pattern 4: DSU with Weighted Edges

Track something along the path from node to root.
class WeightedDSU {
public:
    vector<int> parent;
    vector<long long> dist;  // Distance from node to parent
    
    WeightedDSU(int n) : parent(n + 1), dist(n + 1, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    pair<int, long long> find(int x) {
        if (parent[x] == x) return {x, 0};
        
        auto [root, d] = find(parent[x]);
        parent[x] = root;
        dist[x] += d;
        
        return {root, dist[x]};
    }
    
    void unite(int x, int y, long long w) {
        // w = dist[y] - dist[x] (relative distance)
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px == py) return;
        
        parent[py] = px;
        dist[py] = dx - dy + w;
    }
    
    long long getDist(int x, int y) {
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px != py) return LLONG_MAX;  // Not connected
        return dy - dx;
    }
};
Application: “A is 5 units more than B” type constraints.

Pattern 5: Rollback DSU (Offline)

Problem: Process queries where edges can be removed. Solution: Use union by size (no path compression) to enable rollback.
class RollbackDSU {
public:
    vector<int> parent, size_;
    vector<pair<int, int>> history;  // {node, old_parent}
    
    RollbackDSU(int n) : parent(n + 1), size_(n + 1, 1) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    int find(int x) {
        while (parent[x] != x) x = parent[x];  // No path compression!
        return x;
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;
        
        if (size_[px] < size_[py]) swap(px, py);
        
        history.push_back({py, py});  // Save state
        history.push_back({px, size_[px]});  // Actually store size
        
        parent[py] = px;
        size_[px] += size_[py];
        
        return true;
    }
    
    void rollback(int savedState) {
        while (history.size() > savedState) {
            // Restore previous state
            // Implementation depends on what you're tracking
        }
    }
    
    int saveState() {
        return history.size();
    }
};

Pattern 6: DSU on Tree

Merge child sets when processing nodes bottom-up.
// Problem: For each node, count distinct values in subtree
vector<set<int>> nodeSet(n + 1);
vector<int> answer(n + 1);

void dfs(int u, int p) {
    nodeSet[u].insert(val[u]);
    
    for (int v : adj[u]) {
        if (v != p) {
            dfs(v, u);
            
            // Merge smaller into larger (small-to-large)
            if (nodeSet[u].size() < nodeSet[v].size()) {
                swap(nodeSet[u], nodeSet[v]);
            }
            
            for (int x : nodeSet[v]) {
                nodeSet[u].insert(x);
            }
        }
    }
    
    answer[u] = nodeSet[u].size();
}

Pattern 7: Bipartiteness Check with DSU

Problem: Check if graph can be 2-colored (bipartite).
class BipartiteDSU {
public:
    vector<int> parent;
    vector<int> dist;  // 0 or 1 (parity from root)
    
    BipartiteDSU(int n) : parent(n + 1), dist(n + 1, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    pair<int, int> find(int x) {
        if (parent[x] == x) return {x, 0};
        
        auto [root, d] = find(parent[x]);
        parent[x] = root;
        dist[x] ^= d;
        
        return {root, dist[x]};
    }
    
    bool addEdge(int x, int y) {
        // Returns false if adding edge (x, y) violates bipartiteness
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px == py) {
            return dx != dy;  // Must have different parities
        }
        
        parent[py] = px;
        dist[py] = dx ^ dy ^ 1;  // Force different parity
        
        return true;
    }
};

Common Mistakes

Mistake 1: Not initializing parent correctly
// WRONG
vector<int> parent(n + 1);  // All zeros!

// CORRECT
iota(parent.begin(), parent.end(), 0);  // parent[i] = i
Mistake 2: Union without find
// WRONG
parent[x] = y;

// CORRECT
parent[find(x)] = find(y);
Mistake 3: Using path compression with rollback Path compression destroys the tree structure needed for rollback.

Complexity Analysis

OperationWithout OptimizationPath CompressionPath + Rank
FindO(n)O(log n) amortizedO(α(n)) ≈ O(1)
UnionO(n)O(log n) amortizedO(α(n)) ≈ O(1)
α(n) is the inverse Ackermann function, effectively constant for all practical n.

Practice Problems

Beginner (1000-1300)

ProblemPatternLink
Road ConstructionBasic DSUCSES
Road ReparationKruskal MSTCSES

Intermediate (1300-1600)

ProblemPatternLink
Planets Queries IDSU + binary liftingCSES
Ice SkatingComponentsCF 217A

Advanced (1600-1900)

ProblemPatternLink
New Roads QueriesOffline DSUCSES
Giant Pizza2-SAT + DSUCSES

Key Takeaways

Path Compression

Point nodes directly to root during find.

Union by Rank/Size

Attach smaller tree under larger tree.

Kruskal's MST

Sort edges, greedily add if no cycle.

Near-Constant Time

O(α(n)) per operation with both optimizations.

Next Up

Chapter 17: Segment Trees

Master range queries and point updates with segment trees.