Skip to main content

Documentation Index

Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt

Use this file to discover all available pages before exploring further.

Disjoint Set Union (Union-Find)

DSU Visualization

The Power of DSU

DSU maintains a collection of disjoint sets and supports two operations:
  • Find: Which set does element x belong to?
  • Union: Merge two sets together.
With path compression and union by rank, both operations run in O(α(n)) ≈ O(1) amortized time.
Pattern Recognition Signals:
  • “Connected components” that change over time → DSU
  • “Are x and y in the same group?” → DSU
  • “Merge groups” → DSU
  • “Minimum Spanning Tree” → Kruskal’s + DSU
  • “Online connectivity queries” → DSU

Standard Implementation

The Core Idea: Represent each set as a tree. Each element points to its parent, and the root is the set’s representative. Path Compression: When we find the root of an element, we make all nodes along the path point directly to the root. This flattens the tree, making future queries faster. Union by Rank: When merging two sets, attach the shorter tree under the taller one. This keeps trees balanced. Visual Example:
Before find(4):           After path compression:
    1                           1
    |                        / | \
    2                       2  3  4
    |
    3
    |
    4

Finding root of 4: 4→3→2→1 (root is 1)
After compression: 4→1, 3→1, 2→1 (all point to root)
Why O(α(n))? The inverse Ackermann function α(n) grows incredibly slowly—for any practical n (even n = 10^80), α(n) ≤ 4. So it’s effectively constant time!
class DSU {
public:
    vector<int> parent, rank_;
    int components;
    
    DSU(int n) : parent(n + 1), rank_(n + 1, 0), components(n) {
        iota(parent.begin(), parent.end(), 0);  // parent[i] = i
    }
    
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);  // Path compression
        }
        return parent[x];
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;  // Already in same set
        
        // Union by rank
        if (rank_[px] < rank_[py]) swap(px, py);
        parent[py] = px;
        if (rank_[px] == rank_[py]) rank_[px]++;
        
        components--;
        return true;
    }
    
    bool connected(int x, int y) {
        return find(x) == find(y);
    }
};

Usage

int n = 5;
DSU dsu(n);

dsu.unite(1, 2);
dsu.unite(3, 4);
dsu.unite(2, 4);

cout << dsu.connected(1, 3);  // true
cout << dsu.components;        // 2 (sets: {1,2,3,4} and {5})

DSU with Size Tracking

Track size of each set for weighted merging.
class DSU {
public:
    vector<int> parent, size_;
    
    DSU(int n) : parent(n + 1), size_(n + 1, 1) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);
        }
        return parent[x];
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;
        
        // Union by size
        if (size_[px] < size_[py]) swap(px, py);
        parent[py] = px;
        size_[px] += size_[py];
        
        return true;
    }
    
    int getSize(int x) {
        return size_[find(x)];
    }
};

Pattern 1: Kruskal’s MST

Problem: Find Minimum Spanning Tree of a weighted graph. Why Kruskal + DSU? Kruskal’s algorithm sorts edges by weight and greedily adds each edge if it does not form a cycle. The cycle check is exactly the DSU “are these in the same set?” query. Sort takes O(E log E), and E union/find operations take O(E * alpha(V)), so total is O(E log E). Key insight: An MST of n nodes always has exactly n-1 edges. If fewer edges are used after processing all edges, the graph is disconnected (no spanning tree exists).
struct Edge {
    int u, v, w;
    bool operator<(const Edge& other) const {
        return w < other.w;
    }
};

long long kruskal(int n, vector<Edge>& edges) {
    sort(edges.begin(), edges.end());  // Sort by weight ascending
    
    DSU dsu(n);
    long long mstWeight = 0;
    int edgesUsed = 0;
    
    for (auto& [u, v, w] : edges) {
        if (dsu.unite(u, v)) {     // If u and v are in different components
            mstWeight += w;         // Add this edge to MST
            edgesUsed++;
            
            if (edgesUsed == n - 1) break;  // MST complete (n-1 edges)
        }
        // If unite returns false, adding this edge would form a cycle -- skip
    }
    
    // If edgesUsed < n-1, graph is disconnected
    return mstWeight;
}
Codeforces Problems:
ProblemRatingLink
Road Reparation1400CSES
Road Construction1300CSES

Pattern 2: Cycle Detection

Problem: Detect if adding an edge creates a cycle.
bool addEdge(DSU& dsu, int u, int v) {
    if (dsu.connected(u, v)) {
        return true;  // Cycle detected
    }
    dsu.unite(u, v);
    return false;
}

// Count edges that create cycles
int countCycles = 0;
for (auto [u, v] : edges) {
    if (dsu.connected(u, v)) {
        countCycles++;
    } else {
        dsu.unite(u, v);
    }
}

Pattern 3: Connected Components Queries

Problem: Answer “are u and v connected?” after edge additions.
// Process edges online
void solve() {
    int n, m, q;
    cin >> n >> m >> q;
    
    DSU dsu(n);
    
    // Add initial edges
    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        dsu.unite(u, v);
    }
    
    // Answer queries
    for (int i = 0; i < q; i++) {
        int type, u, v;
        cin >> type >> u >> v;
        
        if (type == 1) {
            dsu.unite(u, v);
        } else {
            cout << (dsu.connected(u, v) ? "YES" : "NO") << "\n";
        }
    }
}

Pattern 4: DSU with Weighted Edges

Track something along the path from node to root.
class WeightedDSU {
public:
    vector<int> parent;
    vector<long long> dist;  // Distance from node to parent
    
    WeightedDSU(int n) : parent(n + 1), dist(n + 1, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    pair<int, long long> find(int x) {
        if (parent[x] == x) return {x, 0};
        
        auto [root, d] = find(parent[x]);
        parent[x] = root;
        dist[x] += d;
        
        return {root, dist[x]};
    }
    
    void unite(int x, int y, long long w) {
        // w = dist[y] - dist[x] (relative distance)
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px == py) return;
        
        parent[py] = px;
        dist[py] = dx - dy + w;
    }
    
    long long getDist(int x, int y) {
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px != py) return LLONG_MAX;  // Not connected
        return dy - dx;
    }
};
Application: “A is 5 units more than B” type constraints. Also used in problems like “determine if relationships are consistent” (e.g., “if X says A is heavier than B by 3 kg, and Y says B is heavier than C by 2 kg, what is the difference between A and C?”).
Contest tip: Weighted DSU is commonly tested in problems that give pairwise relative information and ask you to detect contradictions or compute absolute values. If you see “relative differences” or “potentials” between elements in the same group, think weighted DSU.

Pattern 5: Rollback DSU (Offline)

Problem: Process queries where edges can be added AND removed. Solution: Use union by size (no path compression) to enable rollback. Why no path compression? Path compression is irreversible — it flattens the tree structure, making it impossible to “undo” a union. Without path compression, each union only changes one parent pointer, which is easy to save and restore. When to use: Offline dynamic connectivity, divide and conquer on queries, and problems where you need to “undo” merges (e.g., process edges that exist only during a time interval).
class RollbackDSU {
public:
    vector<int> parent, size_;
    vector<pair<int, int>> history;  // {node, old_parent}
    
    RollbackDSU(int n) : parent(n + 1), size_(n + 1, 1) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    int find(int x) {
        while (parent[x] != x) x = parent[x];  // No path compression!
        return x;
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;
        
        if (size_[px] < size_[py]) swap(px, py);
        
        history.push_back({py, py});  // Save state
        history.push_back({px, size_[px]});  // Actually store size
        
        parent[py] = px;
        size_[px] += size_[py];
        
        return true;
    }
    
    void rollback(int savedState) {
        while (history.size() > savedState) {
            // Restore previous state
            // Implementation depends on what you're tracking
        }
    }
    
    int saveState() {
        return history.size();
    }
};

Pattern 6: DSU on Tree

Merge child sets when processing nodes bottom-up.
// Problem: For each node, count distinct values in subtree
vector<set<int>> nodeSet(n + 1);
vector<int> answer(n + 1);

void dfs(int u, int p) {
    nodeSet[u].insert(val[u]);
    
    for (int v : adj[u]) {
        if (v != p) {
            dfs(v, u);
            
            // Merge smaller into larger (small-to-large)
            if (nodeSet[u].size() < nodeSet[v].size()) {
                swap(nodeSet[u], nodeSet[v]);
            }
            
            for (int x : nodeSet[v]) {
                nodeSet[u].insert(x);
            }
        }
    }
    
    answer[u] = nodeSet[u].size();
}

Pattern 7: Bipartiteness Check with DSU

Problem: Check if graph can be 2-colored (bipartite).
class BipartiteDSU {
public:
    vector<int> parent;
    vector<int> dist;  // 0 or 1 (parity from root)
    
    BipartiteDSU(int n) : parent(n + 1), dist(n + 1, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    pair<int, int> find(int x) {
        if (parent[x] == x) return {x, 0};
        
        auto [root, d] = find(parent[x]);
        parent[x] = root;
        dist[x] ^= d;
        
        return {root, dist[x]};
    }
    
    bool addEdge(int x, int y) {
        // Returns false if adding edge (x, y) violates bipartiteness
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px == py) {
            return dx != dy;  // Must have different parities
        }
        
        parent[py] = px;
        dist[py] = dx ^ dy ^ 1;  // Force different parity
        
        return true;
    }
};

Common Mistakes

Mistake 1: Not initializing parent correctly
// WRONG
vector<int> parent(n + 1);  // All zeros!

// CORRECT
iota(parent.begin(), parent.end(), 0);  // parent[i] = i
Mistake 2: Union without find
// WRONG
parent[x] = y;

// CORRECT
parent[find(x)] = find(y);
Mistake 3: Using path compression with rollback Path compression destroys the tree structure needed for rollback.

Complexity Analysis

OperationWithout OptimizationPath CompressionPath + Rank
FindO(n)O(log n) amortizedO(α(n)) ≈ O(1)
UnionO(n)O(log n) amortizedO(α(n)) ≈ O(1)
α(n) is the inverse Ackermann function, effectively constant for all practical n.

Practice Problems

Beginner (1000-1300)

ProblemPatternLink
Road ConstructionBasic DSUCSES
Road ReparationKruskal MSTCSES

Intermediate (1300-1600)

ProblemPatternLink
Planets Queries IDSU + binary liftingCSES
Ice SkatingComponentsCF 217A

Advanced (1600-1900)

ProblemPatternLink
New Roads QueriesOffline DSUCSES
Giant Pizza2-SAT + DSUCSES

Key Takeaways

Path Compression

Point nodes directly to root during find.

Union by Rank/Size

Attach smaller tree under larger tree.

Kruskal's MST

Sort edges, greedily add if no cycle.

Near-Constant Time

O(α(n)) per operation with both optimizations.

Next Up

Chapter 17: Segment Trees

Master range queries and point updates with segment trees.