> ## Documentation Index
> Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Disjoint Set Union (Union-Find)

> Master Union-Find for connectivity, components, and Kruskal's MST

# Disjoint Set Union (Union-Find)

<img src="https://mintcdn.com/devweeekends/AEOaWh79Ur7CdHHv/images/courses/cp/dsu-visualization.svg?fit=max&auto=format&n=AEOaWh79Ur7CdHHv&q=85&s=970d477b14db857082234672acc55482" alt="DSU Visualization" width="1080" height="1080" data-path="images/courses/cp/dsu-visualization.svg" />

## The Power of DSU

DSU maintains a collection of disjoint sets and supports two operations:

* **Find**: Which set does element x belong to?
* **Union**: Merge two sets together.

With path compression and union by rank, both operations run in **O(α(n))** ≈ O(1) amortized time.

<Note>
  **Pattern Recognition Signals**:

  * "Connected components" that change over time → DSU
  * "Are x and y in the same group?" → DSU
  * "Merge groups" → DSU
  * "Minimum Spanning Tree" → Kruskal's + DSU
  * "Online connectivity queries" → DSU
</Note>

***

## Standard Implementation

**The Core Idea**: Represent each set as a tree. Each element points to its parent, and the root is the set's representative.

**Path Compression**: When we find the root of an element, we make all nodes along the path point directly to the root. This flattens the tree, making future queries faster.

**Union by Rank**: When merging two sets, attach the shorter tree under the taller one. This keeps trees balanced.

**Visual Example**:

```
Before find(4):           After path compression:
    1                           1
    |                        / | \
    2                       2  3  4
    |
    3
    |
    4

Finding root of 4: 4→3→2→1 (root is 1)
After compression: 4→1, 3→1, 2→1 (all point to root)
```

**Why O(α(n))?** The inverse Ackermann function α(n) grows incredibly slowly—for any practical n (even n = 10^80), α(n) ≤ 4. So it's effectively constant time!

```cpp theme={null}
class DSU {
public:
    vector<int> parent, rank_;
    int components;
    
    DSU(int n) : parent(n + 1), rank_(n + 1, 0), components(n) {
        iota(parent.begin(), parent.end(), 0);  // parent[i] = i
    }
    
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);  // Path compression
        }
        return parent[x];
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;  // Already in same set
        
        // Union by rank
        if (rank_[px] < rank_[py]) swap(px, py);
        parent[py] = px;
        if (rank_[px] == rank_[py]) rank_[px]++;
        
        components--;
        return true;
    }
    
    bool connected(int x, int y) {
        return find(x) == find(y);
    }
};
```

### Usage

```cpp theme={null}
int n = 5;
DSU dsu(n);

dsu.unite(1, 2);
dsu.unite(3, 4);
dsu.unite(2, 4);

cout << dsu.connected(1, 3);  // true
cout << dsu.components;        // 2 (sets: {1,2,3,4} and {5})
```

***

## DSU with Size Tracking

Track size of each set for weighted merging.

```cpp theme={null}
class DSU {
public:
    vector<int> parent, size_;
    
    DSU(int n) : parent(n + 1), size_(n + 1, 1) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);
        }
        return parent[x];
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;
        
        // Union by size
        if (size_[px] < size_[py]) swap(px, py);
        parent[py] = px;
        size_[px] += size_[py];
        
        return true;
    }
    
    int getSize(int x) {
        return size_[find(x)];
    }
};
```

***

## Pattern 1: Kruskal's MST

**Problem**: Find Minimum Spanning Tree of a weighted graph.

**Why Kruskal + DSU?** Kruskal's algorithm sorts edges by weight and greedily adds each edge if it does not form a cycle. The cycle check is exactly the DSU "are these in the same set?" query. Sort takes O(E log E), and E union/find operations take O(E \* alpha(V)), so total is O(E log E).

**Key insight**: An MST of n nodes always has exactly n-1 edges. If fewer edges are used after processing all edges, the graph is disconnected (no spanning tree exists).

```cpp theme={null}
struct Edge {
    int u, v, w;
    bool operator<(const Edge& other) const {
        return w < other.w;
    }
};

long long kruskal(int n, vector<Edge>& edges) {
    sort(edges.begin(), edges.end());  // Sort by weight ascending
    
    DSU dsu(n);
    long long mstWeight = 0;
    int edgesUsed = 0;
    
    for (auto& [u, v, w] : edges) {
        if (dsu.unite(u, v)) {     // If u and v are in different components
            mstWeight += w;         // Add this edge to MST
            edgesUsed++;
            
            if (edgesUsed == n - 1) break;  // MST complete (n-1 edges)
        }
        // If unite returns false, adding this edge would form a cycle -- skip
    }
    
    // If edgesUsed < n-1, graph is disconnected
    return mstWeight;
}
```

**Codeforces Problems**:

| Problem           | Rating | Link                                         |
| ----------------- | ------ | -------------------------------------------- |
| Road Reparation   | 1400   | [CSES](https://cses.fi/problemset/task/1675) |
| Road Construction | 1300   | [CSES](https://cses.fi/problemset/task/1676) |

***

## Pattern 2: Cycle Detection

**Problem**: Detect if adding an edge creates a cycle.

```cpp theme={null}
bool addEdge(DSU& dsu, int u, int v) {
    if (dsu.connected(u, v)) {
        return true;  // Cycle detected
    }
    dsu.unite(u, v);
    return false;
}

// Count edges that create cycles
int countCycles = 0;
for (auto [u, v] : edges) {
    if (dsu.connected(u, v)) {
        countCycles++;
    } else {
        dsu.unite(u, v);
    }
}
```

***

## Pattern 3: Connected Components Queries

**Problem**: Answer "are u and v connected?" after edge additions.

```cpp theme={null}
// Process edges online
void solve() {
    int n, m, q;
    cin >> n >> m >> q;
    
    DSU dsu(n);
    
    // Add initial edges
    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        dsu.unite(u, v);
    }
    
    // Answer queries
    for (int i = 0; i < q; i++) {
        int type, u, v;
        cin >> type >> u >> v;
        
        if (type == 1) {
            dsu.unite(u, v);
        } else {
            cout << (dsu.connected(u, v) ? "YES" : "NO") << "\n";
        }
    }
}
```

***

## Pattern 4: DSU with Weighted Edges

Track something along the path from node to root.

```cpp theme={null}
class WeightedDSU {
public:
    vector<int> parent;
    vector<long long> dist;  // Distance from node to parent
    
    WeightedDSU(int n) : parent(n + 1), dist(n + 1, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    pair<int, long long> find(int x) {
        if (parent[x] == x) return {x, 0};
        
        auto [root, d] = find(parent[x]);
        parent[x] = root;
        dist[x] += d;
        
        return {root, dist[x]};
    }
    
    void unite(int x, int y, long long w) {
        // w = dist[y] - dist[x] (relative distance)
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px == py) return;
        
        parent[py] = px;
        dist[py] = dx - dy + w;
    }
    
    long long getDist(int x, int y) {
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px != py) return LLONG_MAX;  // Not connected
        return dy - dx;
    }
};
```

**Application**: "A is 5 units more than B" type constraints. Also used in problems like "determine if relationships are consistent" (e.g., "if X says A is heavier than B by 3 kg, and Y says B is heavier than C by 2 kg, what is the difference between A and C?").

<Tip>
  **Contest tip**: Weighted DSU is commonly tested in problems that give pairwise relative information and ask you to detect contradictions or compute absolute values. If you see "relative differences" or "potentials" between elements in the same group, think weighted DSU.
</Tip>

***

## Pattern 5: Rollback DSU (Offline)

**Problem**: Process queries where edges can be added AND removed.

**Solution**: Use union by size (no path compression) to enable rollback.

**Why no path compression?** Path compression is irreversible -- it flattens the tree structure, making it impossible to "undo" a union. Without path compression, each union only changes one parent pointer, which is easy to save and restore.

**When to use**: Offline dynamic connectivity, divide and conquer on queries, and problems where you need to "undo" merges (e.g., process edges that exist only during a time interval).

```cpp theme={null}
class RollbackDSU {
public:
    vector<int> parent, size_;
    vector<pair<int, int>> history;  // {node, old_parent}
    
    RollbackDSU(int n) : parent(n + 1), size_(n + 1, 1) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    int find(int x) {
        while (parent[x] != x) x = parent[x];  // No path compression!
        return x;
    }
    
    bool unite(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;
        
        if (size_[px] < size_[py]) swap(px, py);
        
        history.push_back({py, py});  // Save state
        history.push_back({px, size_[px]});  // Actually store size
        
        parent[py] = px;
        size_[px] += size_[py];
        
        return true;
    }
    
    void rollback(int savedState) {
        while (history.size() > savedState) {
            // Restore previous state
            // Implementation depends on what you're tracking
        }
    }
    
    int saveState() {
        return history.size();
    }
};
```

***

## Pattern 6: DSU on Tree

Merge child sets when processing nodes bottom-up.

```cpp theme={null}
// Problem: For each node, count distinct values in subtree
vector<set<int>> nodeSet(n + 1);
vector<int> answer(n + 1);

void dfs(int u, int p) {
    nodeSet[u].insert(val[u]);
    
    for (int v : adj[u]) {
        if (v != p) {
            dfs(v, u);
            
            // Merge smaller into larger (small-to-large)
            if (nodeSet[u].size() < nodeSet[v].size()) {
                swap(nodeSet[u], nodeSet[v]);
            }
            
            for (int x : nodeSet[v]) {
                nodeSet[u].insert(x);
            }
        }
    }
    
    answer[u] = nodeSet[u].size();
}
```

***

## Pattern 7: Bipartiteness Check with DSU

**Problem**: Check if graph can be 2-colored (bipartite).

```cpp theme={null}
class BipartiteDSU {
public:
    vector<int> parent;
    vector<int> dist;  // 0 or 1 (parity from root)
    
    BipartiteDSU(int n) : parent(n + 1), dist(n + 1, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    
    pair<int, int> find(int x) {
        if (parent[x] == x) return {x, 0};
        
        auto [root, d] = find(parent[x]);
        parent[x] = root;
        dist[x] ^= d;
        
        return {root, dist[x]};
    }
    
    bool addEdge(int x, int y) {
        // Returns false if adding edge (x, y) violates bipartiteness
        auto [px, dx] = find(x);
        auto [py, dy] = find(y);
        
        if (px == py) {
            return dx != dy;  // Must have different parities
        }
        
        parent[py] = px;
        dist[py] = dx ^ dy ^ 1;  // Force different parity
        
        return true;
    }
};
```

***

## Common Mistakes

<Warning>
  **Mistake 1: Not initializing parent correctly**

  ```cpp theme={null}
  // WRONG
  vector<int> parent(n + 1);  // All zeros!

  // CORRECT
  iota(parent.begin(), parent.end(), 0);  // parent[i] = i
  ```
</Warning>

<Warning>
  **Mistake 2: Union without find**

  ```cpp theme={null}
  // WRONG
  parent[x] = y;

  // CORRECT
  parent[find(x)] = find(y);
  ```
</Warning>

<Warning>
  **Mistake 3: Using path compression with rollback**
  Path compression destroys the tree structure needed for rollback.
</Warning>

***

## Complexity Analysis

| Operation | Without Optimization | Path Compression   | Path + Rank    |
| --------- | -------------------- | ------------------ | -------------- |
| Find      | O(n)                 | O(log n) amortized | O(α(n)) ≈ O(1) |
| Union     | O(n)                 | O(log n) amortized | O(α(n)) ≈ O(1) |

α(n) is the inverse Ackermann function, effectively constant for all practical n.

***

## Practice Problems

### Beginner (1000-1300)

| Problem           | Pattern     | Link                                         |
| ----------------- | ----------- | -------------------------------------------- |
| Road Construction | Basic DSU   | [CSES](https://cses.fi/problemset/task/1676) |
| Road Reparation   | Kruskal MST | [CSES](https://cses.fi/problemset/task/1675) |

### Intermediate (1300-1600)

| Problem           | Pattern              | Link                                                       |
| ----------------- | -------------------- | ---------------------------------------------------------- |
| Planets Queries I | DSU + binary lifting | [CSES](https://cses.fi/problemset/task/1750)               |
| Ice Skating       | Components           | [CF 217A](https://codeforces.com/problemset/problem/217/A) |

### Advanced (1600-1900)

| Problem           | Pattern     | Link                                         |
| ----------------- | ----------- | -------------------------------------------- |
| New Roads Queries | Offline DSU | [CSES](https://cses.fi/problemset/task/2101) |
| Giant Pizza       | 2-SAT + DSU | [CSES](https://cses.fi/problemset/task/1684) |

***

## Key Takeaways

<CardGroup cols={2}>
  <Card title="Path Compression" icon="compress">
    Point nodes directly to root during find.
  </Card>

  <Card title="Union by Rank/Size" icon="scale-balanced">
    Attach smaller tree under larger tree.
  </Card>

  <Card title="Kruskal's MST" icon="diagram-project">
    Sort edges, greedily add if no cycle.
  </Card>

  <Card title="Near-Constant Time" icon="bolt">
    O(α(n)) per operation with both optimizations.
  </Card>
</CardGroup>

***

## Next Up

<Card title="Chapter 17: Segment Trees" icon="arrow-right" href="/courses/competitive-programming/17-segment-trees">
  Master range queries and point updates with segment trees.
</Card>
