Overview
Graph algorithms solve problems involving nodes and edges. Master these core algorithms to tackle most graph problems. The key to graph algorithm mastery is knowing which algorithm to reach for based on the problem constraints. Think of it as a decision tree:
- Unweighted shortest path? Use BFS (covered in the BFS chapter).
- Weighted, non-negative edges? Use Dijkstra.
- Negative edges possible? Use Bellman-Ford.
- All-pairs shortest paths? Use Floyd-Warshall.
- Minimum cost to connect everything? Use Kruskal or Prim (MST).
- Ordering with dependencies? Use Topological Sort.
Key Algorithms
1. Dijkstra’s Algorithm
Dijkstra finds the shortest path from a single source to all other nodes in a graph with non-negative edge weights. It works like a “greedy BFS” — always process the unvisited node with the smallest known distance. How it works: Maintain a priority queue (min-heap) of `(distance, node)` pairs. Pop the smallest, relax all its neighbors (update their distance if a shorter path is found). The `if dist > distances[node]: continue` line is a lazy deletion optimization — it skips stale entries instead of decreasing keys in the heap.
Critical constraint: Dijkstra fails with negative edge weights. Once a node is popped from the heap, Dijkstra assumes its distance is final. A negative edge could invalidate that assumption. Use Bellman-Ford instead.
Time: O((V + E) log V) with a binary heap. Space: O(V + E).
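A minimal sketch of the lazy-deletion Dijkstra described above. The function name and the edge-list input format are illustrative assumptions, not from the original article:

```python
import heapq
from collections import defaultdict

def dijkstra(n, edges, source):
    """Sketch: edges is a list of directed (u, v, w) with w >= 0; nodes are 0..n-1."""
    graph = defaultdict(list)
    for u, v, w in edges:
        graph[u].append((v, w))

    dist = [float("inf")] * n
    dist[source] = 0
    heap = [(0, source)]                      # (distance, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:                    # stale entry -> lazy deletion
            continue
        for nxt, w in graph[node]:
            if d + w < dist[nxt]:             # relax edge node -> nxt
                dist[nxt] = d + w
                heapq.heappush(heap, (dist[nxt], nxt))
    return dist
```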
2. Bellman-Ford
Bellman-Ford handles negative edge weights and can detect negative cycles (a cycle whose total weight is negative, making shortest paths undefined). How it works: Relax every edge V-1 times. After V-1 rounds, all shortest paths (which have at most V-1 edges) are finalized. Then do one more pass: if any distance can still be reduced, a negative cycle exists. Why V-1 iterations? The shortest path between any two nodes uses at most V-1 edges (a shortest path never needs to repeat a vertex when there is no negative cycle). Each iteration guarantees at least one more node gets its correct distance. When to use over Dijkstra: When edges can be negative, or when you need to detect negative cycles, or when you need “shortest path with at most K edges” (just run K iterations instead of V-1 — this is the “Cheapest Flights Within K Stops” pattern). Time: O(V * E). Space: O(V).
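A sketch of Bellman-Ford with the extra negative-cycle pass described above; the function name and edge-list format are illustrative assumptions:

```python
def bellman_ford(n, edges, source):
    """Sketch: edges is a list of (u, v, w); returns (dist, has_negative_cycle)."""
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):                    # V-1 relaxation rounds
        changed = False
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:                        # already converged, stop early
            break
    # One more pass: any further improvement means a negative cycle exists.
    has_negative_cycle = any(
        dist[u] != INF and dist[u] + w < dist[v] for u, v, w in edges
    )
    return dist, has_negative_cycle
```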
3. Topological Sort (Kahn’s BFS)
Topological sort orders vertices of a DAG (directed acyclic graph) so that for every edge u -> v, u comes before v. Think of it as task scheduling where some tasks depend on others. Kahn’s algorithm (BFS-based): Start with all nodes that have no incoming edges (in-degree 0). Process them, remove their outgoing edges (decrement neighbors’ in-degrees), and add any neighbor whose in-degree drops to 0. If the result includes all n nodes, the ordering is valid; otherwise a cycle exists. Why Kahn’s over DFS-based? Kahn’s is iterative (no recursion), gives a natural BFS-level ordering, and makes cycle detection trivial (just check `len(result) == n`). The DFS approach uses postorder reversal and three-color marking.
Time: O(V + E). Space: O(V + E).
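A sketch of Kahn’s algorithm as described, returning an empty list when a cycle makes ordering impossible; names and input format are illustrative assumptions:

```python
from collections import defaultdict, deque

def topological_sort(n, edges):
    """Sketch: edges is a list of directed (u, v) meaning u must come before v."""
    graph = defaultdict(list)
    in_degree = [0] * n
    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

    queue = deque(node for node in range(n) if in_degree[node] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            in_degree[nxt] -= 1               # "remove" the edge node -> nxt
            if in_degree[nxt] == 0:
                queue.append(nxt)
    return order if len(order) == n else []   # shorter result -> cycle exists
```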
4. Kruskal’s MST
Kruskal’s builds a Minimum Spanning Tree by greedily picking the cheapest edge that does not form a cycle. It uses Union-Find (disjoint set) to efficiently detect whether two nodes are already connected. How it works: Sort all edges by weight. Iterate through them in order. For each edge, if it connects two different components (Union-Find says their roots differ), add it to the MST and merge the components. Stop when the MST has V-1 edges. Union-Find internals: Two optimizations make it near-O(1) per operation: path compression (during `find`, point every node directly to the root) and union by rank (attach the shorter tree under the taller one). Together they achieve amortized O(alpha(n)) per operation, where alpha is the inverse Ackermann function — effectively constant.
Time: O(E log E) dominated by sorting. Space: O(V) for Union-Find.
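A sketch of Kruskal’s with a small Union-Find using path compression and union by rank, as described above; class and function names are illustrative assumptions:

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])   # path compression
        return self.parent[x]

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False                                  # already connected
        if self.rank[ra] < self.rank[rb]:                 # union by rank
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

def kruskal_mst(n, edges):
    """Sketch: edges is a list of (w, u, v). Returns (total_weight, mst_edges)."""
    uf = UnionFind(n)
    total, mst = 0, []
    for w, u, v in sorted(edges):             # cheapest edge first
        if uf.union(u, v):                    # skip edges that would form a cycle
            total += w
            mst.append((u, v, w))
            if len(mst) == n - 1:             # MST is complete
                break
    return total, mst
```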
Complexity Comparison
| Algorithm | Time | Space | Use Case |
|---|---|---|---|
| Dijkstra | O((V+E) log V) | O(V) | Non-negative weights, single source |
| Bellman-Ford | O(VE) | O(V) | Negative weights, negative cycle detection |
| Floyd-Warshall | O(V^3) | O(V^2) | All-pairs shortest paths (small V) |
| Kruskal | O(E log E) | O(V) | MST, works well with sparse graphs |
| Prim | O(E log V) | O(V) | MST, works well with dense graphs |
| Topological Sort | O(V + E) | O(V + E) | DAG ordering, dependency resolution |
Worked LeetCode Problems
These five cover the bulk of weighted-graph interview territory: pure Dijkstra (743), constrained shortest path that breaks Dijkstra (787), Dijkstra-on-bottleneck (1631), MST (1584), and topological sort with custom graph construction (269).
LC 743 — Network Delay Time (Medium, Dijkstra)
A signal originates at node `k` and propagates along directed edges with travel times. Return the time for ALL nodes to receive the signal — the maximum over the shortest paths from `k`.
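A sketch of the standard approach (Dijkstra from `k`, then take the max of the shortest distances); written here as a standalone function rather than the LeetCode `Solution` class, with the usual signature assumed:

```python
import heapq
from collections import defaultdict

def networkDelayTime(times, n, k):
    """Sketch: times is a list of [u, v, w] directed edges; nodes labeled 1..n."""
    graph = defaultdict(list)
    for u, v, w in times:
        graph[u].append((v, w))

    INF = float("inf")
    dist = [INF] * (n + 1)
    dist[k] = 0
    heap = [(0, k)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue                          # stale heap entry, skip it
        for nxt, w in graph[node]:
            if d + w < dist[nxt]:
                dist[nxt] = d + w
                heapq.heappush(heap, (dist[nxt], nxt))
    answer = max(dist[1:])                    # time for the slowest node
    return answer if answer < INF else -1
```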
The `if d > dist[node]: continue` check is critical — it skips stale heap entries instead of trying to decrease-key (which Python’s heapq does not support).
LC 787 — Cheapest Flights Within K Stops (Medium, Bellman-Ford)
Find the cheapest path with AT MOST `k` stops. Standard Dijkstra fails: a node could be reached cheaply with too many stops, blocking the discovery of a shorter-stops-but-pricier path. Bellman-Ford with exactly K+1 rounds is the natural fit.
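A sketch of the K+1-round Bellman-Ford with the copy-per-round detail; standalone function with the usual LeetCode signature assumed:

```python
def findCheapestPrice(n, flights, src, dst, k):
    """Sketch: flights is a list of [u, v, w]; at most k stops = at most k+1 edges."""
    INF = float("inf")
    dist = [INF] * n
    dist[src] = 0
    for _ in range(k + 1):                    # one round per allowed edge
        new_dist = dist[:]                    # relax from last round's snapshot
        for u, v, w in flights:
            if dist[u] != INF and dist[u] + w < new_dist[v]:
                new_dist[v] = dist[u] + w
        dist = new_dist
    return dist[dst] if dist[dst] != INF else -1
```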
Copy `dist` per round — relaxing from the previous round’s snapshot keeps a single round from chaining more than one new edge onto a path.
LC 1631 — Path With Minimum Effort (Medium, modified Dijkstra)
Move on a 2D grid; “effort” is the MAX absolute height difference along the path. Minimize the max — a bottleneck path. Dijkstra still works if you redefine “distance”: instead of summing edge weights, take the MAX along the path.
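A sketch of Dijkstra with the MAX combiner on the grid; standalone function with the usual LeetCode signature assumed:

```python
import heapq

def minimumEffortPath(heights):
    """Sketch: heights is a 2D list; returns the minimum possible max height jump."""
    rows, cols = len(heights), len(heights[0])
    effort = [[float("inf")] * cols for _ in range(rows)]
    effort[0][0] = 0
    heap = [(0, 0, 0)]                        # (effort so far, row, col)
    while heap:
        e, r, c = heapq.heappop(heap)
        if (r, c) == (rows - 1, cols - 1):
            return e
        if e > effort[r][c]:
            continue                          # stale entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # path "distance" = max height difference seen so far
                new_e = max(e, abs(heights[nr][nc] - heights[r][c]))
                if new_e < effort[nr][nc]:
                    effort[nr][nc] = new_e
                    heapq.heappush(heap, (new_e, nr, nc))
    return 0                                  # not reached for a non-empty grid
```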
LC 1584 — Min Cost to Connect All Points (Medium, MST via Kruskal)
Connect all points with minimum total Manhattan-distance edges. Classic MST. Build all O(n^2) edges, sort by weight, Kruskal with Union-Find.
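A sketch that builds all O(n^2) Manhattan edges and runs Kruskal with a compact Union-Find; standalone function with the usual LeetCode signature assumed:

```python
def minCostConnectPoints(points):
    """Sketch: points is a list of [x, y] pairs; returns the MST total cost."""
    n = len(points)
    edges = []
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            edges.append((abs(xi - xj) + abs(yi - yj), i, j))
    edges.sort()                              # cheapest edge first

    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    total, used = 0, 0
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                          # edge joins two components
            parent[ri] = rj
            total += w
            used += 1
            if used == n - 1:                 # MST complete
                break
    return total
```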
LC 269 — Alien Dictionary (Hard, topological sort)
Given a list of words from an alien language, sorted lexicographically by their unknown alphabet, derive a possible alphabet ordering. Two invalid inputs to handle: (1) `["abc", "ab"]` — "abc" cannot come before "ab" since "ab" is a prefix of "abc"; return "". (2) A cycle in the inferred graph (e.g., `["a", "b", "a"]`). Detect via `len(result) != len(in_degree)`.
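A sketch that builds the character graph from adjacent word pairs and runs Kahn’s algorithm; the prefix check and the `len(result) != len(in_degree)` check handle the two invalid cases. Standalone function with the usual LeetCode signature assumed:

```python
from collections import defaultdict, deque

def alienOrder(words):
    """Sketch: words is a list of strings; returns one valid alphabet or ''."""
    graph = defaultdict(set)
    in_degree = {c: 0 for w in words for c in w}

    for w1, w2 in zip(words, words[1:]):
        for c1, c2 in zip(w1, w2):
            if c1 != c2:                      # first difference fixes an order
                if c2 not in graph[c1]:
                    graph[c1].add(c2)
                    in_degree[c2] += 1
                break
        else:
            if len(w1) > len(w2):             # invalid: "abc" listed before "ab"
                return ""

    queue = deque(c for c in in_degree if in_degree[c] == 0)
    result = []
    while queue:
        c = queue.popleft()
        result.append(c)
        for nxt in graph[c]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)
    return "".join(result) if len(result) == len(in_degree) else ""
```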
Curated LeetCode List
| Difficulty | Problem | LC # | Pattern |
|---|---|---|---|
| Easy | Find the Town Judge | 997 | In-degree / out-degree counting |
| Easy | Find Center of Star Graph | 1791 | Trivial degree analysis |
| Easy | Find if Path Exists in Graph | 1971 | BFS / DFS / Union-Find |
| Medium | Network Delay Time | 743 | Dijkstra |
| Medium | Cheapest Flights Within K Stops | 787 | Bellman-Ford with K rounds |
| Medium | Path With Minimum Effort | 1631 | Dijkstra with MAX combiner |
| Medium | Swim in Rising Water | 778 | Dijkstra-style or binary search + BFS |
| Medium | Min Cost to Connect All Points | 1584 | MST (Kruskal or Prim) |
| Medium | Course Schedule | 207 | Topological sort / cycle detection |
| Medium | Course Schedule II | 210 | Topological sort returning order |
| Medium | Number of Connected Components | 323 | Union-Find or DFS |
| Medium | Reconstruct Itinerary | 332 | Hierholzer’s Eulerian path |
| Medium | Evaluate Division | 399 | Weighted Union-Find or DFS with weights |
| Hard | Alien Dictionary | 269 | Topological sort, careful char graph |
| Hard | Word Ladder II | 126 | BFS for distance + DFS for paths |
| Hard | Critical Connections in a Network | 1192 | Tarjan’s bridges algorithm |
| Hard | Reachable Nodes In Subdivided Graph | 882 | Modified Dijkstra |
| Hard | Minimum Cost to Make at Least One Valid Path | 1368 | 0-1 BFS with deque |
Graph Algorithms Interview Q&A
Dijkstra vs Bellman-Ford — when to use which
- Default to Dijkstra when all edges are non-negative — it is faster: O((V + E) log V) vs Bellman-Ford’s O(V * E).
- Switch to Bellman-Ford when any edge can be negative, or when you need to detect a negative cycle.
- Switch to Bellman-Ford for the “shortest path with at most K edges” variant — run only K rounds instead of V-1.
- Special case: 0/1 weighted edges -> 0-1 BFS with a deque, O(V + E).
- Special case: DAG -> topological sort + relaxation, O(V + E), handles negatives.
- Using Dijkstra for the K-stops problem. Dijkstra finalizes nodes greedily; once it commits to “shortest distance” for a node, it cannot revisit even if more stops would yield a cheaper route. You get wrong answers.
- Saying “Bellman-Ford is just slower Dijkstra.” It handles negative weights; Dijkstra does not. They solve different problems.
- Forgetting that Bellman-Ford detects negative cycles as a bonus. This is a unique capability — using it interchangeably with Dijkstra misses the point.
Kruskal vs Prim for MST — which to choose
- Both produce a Minimum Spanning Tree of the same total weight — the result is identical up to tie-breaking among equal-weight edges.
- Kruskal: sort all edges by weight, scan in order, use Union-Find to skip cycle-creating edges. O(E log E).
- Prim: start from any node, maintain a heap of edges crossing the cut, pop the lightest. O(E log V) with binary heap, O(E + V log V) with Fibonacci heap.
- Kruskal wins on sparse graphs (E close to V) — sorting fewer edges is cheaper.
- Prim wins on dense graphs (E close to V^2) — avoids sorting all edges.
- Both rely on the cut property: the lightest edge crossing any cut is in some MST.
- Using BFS or DFS for cycle detection in Kruskal. O(V+E) per edge -> O(E(V+E)) total. Use Union-Find for O(alpha(n)) per check.
- Picking Prim for sparse graphs because “it sounds fancier.” Kruskal beats Prim on sparse graphs because edge sorting dominates for both.
- Confusing MST with shortest path tree. MST minimizes TOTAL weight; SPT minimizes distance from a single source. They are different trees.
Topological sort — Kahn's BFS vs DFS postorder
- Recognize: ordering with dependencies -> topological sort on a DAG.
- Kahn’s BFS: compute in-degrees, seed queue with all in-degree-zero nodes, process and decrement neighbors’ in-degrees, push when in-degree hits zero.
- DFS postorder: run DFS from each unvisited node, append node to result when finished, reverse the result at the end.
- Cycle detection: Kahn’s — if `len(result) < V`, a cycle exists. DFS — three-color marking; hitting a GRAY node means a cycle.
- Default to Kahn’s: simpler, naturally iterative, fewer bugs.
- Using BFS shortest-path style. Topological sort is not a shortest-path problem; the distance metric is irrelevant.
- Returning the result without checking for a cycle. If `len(result) < V`, the input has a cycle — return empty / report invalid.
- Building edges in the wrong direction. For prerequisites, `[a, b]` means b is a prerequisite of a -> edge `b -> a`. Reversing this silently flips the entire ordering.
Interview Questions
Q1: Explain Dijkstra's algorithm. Why does it not work with negative edge weights?
- Core idea. “Dijkstra is a greedy single-source shortest path algorithm. It processes nodes in order of their current shortest known distance, using a min-heap. Once a node is popped from the heap, its distance is finalized — we will never find a shorter path to it.”
- Why the greedy works. With non-negative weights, any path that goes through an unvisited node must be at least as long as the direct distance we already have. If I have finalized node A at distance 5, any path to A through some unvisited node B must cost at least `dist(B) + edge(B, A)`, and since `dist(B) >= 5` (it is still in the heap) and the edge is non-negative, the total is `>= 5`. So the greedy choice is safe.
- How negatives break it. With negative edges, a node finalized at distance 5 might later be reachable at distance 3 via a path that includes a negative edge from an unfinalized node. The greedy invariant — “once popped, it is settled” — collapses. Concretely: node A popped at distance 5, then later we discover edge B->A with weight -10 and B was at distance 12, giving A a new distance of 2. But we already declared A done.
- What to use instead. Bellman-Ford handles negative weights by relaxing all edges V-1 times, costing O(VE) but correctly handling negative edges and detecting negative cycles.
- Complexity reminder. O((V+E) log V) with a binary heap, O(V^2) with a simple array (which is actually better for dense graphs).
- Can you construct a small graph (3-4 nodes) where Dijkstra gives the wrong answer due to a negative edge? Walk me through step by step. (Tests whether they can actually demonstrate the failure, not just state it.)
- What about negative edges but no negative cycles? Is there a way to adapt Dijkstra to handle that case? (Answer: yes, Johnson’s algorithm re-weights all edges to be non-negative using a potential function derived from a Bellman-Ford run, then runs Dijkstra from each source. O(VE + V(V+E) log V).)
Q2: You need to find the shortest path in a graph where all edge weights are 0 or 1. What algorithm do you use and why?
- The trick: 0-1 BFS. “When edge weights are only 0 or 1, I use a deque instead of a priority queue. Zero-weight edges push to the front of the deque, one-weight edges push to the back. This maintains the sorted order that Dijkstra needs, but without the log V overhead of a heap.”
- Why it works. The deque naturally maintains the invariant that elements are in non-decreasing order of distance. When we add a node via a 0-weight edge, its distance equals the current node’s distance — so it belongs at the front. When we add via a 1-weight edge, its distance is current + 1 — so it belongs at the back. This is exactly a BFS-like level ordering.
- Complexity. O(V + E) time, O(V) space. Compared to Dijkstra’s O((V+E) log V), we eliminate the log factor entirely.
- When to recognize this pattern. Any graph where weights come from a set of two values `{a, b}` where `b = a + 1` (or can be normalized to 0/1). Also works for problems like “minimum flips” in a grid where each move has cost 0 or 1.
- Real example. Grid problems where moving in the same direction is free (cost 0) but turning has cost 1 (see the sketch after this list).
- Can you extend this to weights of 0, 1, and 2? What about 0 through K? (Answer: for 0 to K, you can use a generalization with K+1 buckets — Dial’s algorithm, i.e., a bucket queue. Complexity becomes O(V + E + K*V). For small K, this beats Dijkstra.)
- How would you implement this for a grid where moving forward costs 0 but turning costs 1, to find the minimum number of turns from corner to corner? (Tests whether they can apply the 0-1 BFS idea to an actual problem encoding.)
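A minimal 0-1 BFS sketch under the assumptions above (adjacency list with 0/1 weights; names are illustrative). For the grid-turning follow-up, the same routine applies once the state is encoded as (cell, direction) with forward moves costing 0 and turns costing 1:

```python
from collections import defaultdict, deque

def zero_one_bfs(n, edges, source):
    """Sketch: edges is a list of (u, v, w) with w in {0, 1}; returns distances."""
    graph = defaultdict(list)
    for u, v, w in edges:
        graph[u].append((v, w))

    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    dq = deque([source])
    while dq:
        node = dq.popleft()
        for nxt, w in graph[node]:
            if dist[node] + w < dist[nxt]:
                dist[nxt] = dist[node] + w
                if w == 0:
                    dq.appendleft(nxt)        # 0-edge: same "level", push to front
                else:
                    dq.append(nxt)            # 1-edge: next "level", push to back
    return dist
```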
Q3: When would you choose Bellman-Ford over Dijkstra, and vice versa? Are there cases where neither is the best choice?
- Dijkstra when: All edge weights are non-negative. You need single-source shortest paths efficiently. O((V+E) log V) with a binary heap makes it the default choice for road networks, navigation systems, and most “find the shortest path” interview problems.
- Bellman-Ford when: The graph has negative edge weights. You need to detect negative cycles. The edge list representation is more natural than adjacency lists (Bellman-Ford iterates over edges, not neighbors). Also critical for the “cheapest flights within K stops” variant, where you run only K relaxation rounds instead of V-1.
- Neither is best when: You need all-pairs shortest paths — use Floyd-Warshall at O(V^3) for dense graphs or run Dijkstra from each vertex for sparse graphs. For unweighted graphs, plain BFS at O(V+E) beats both. For DAGs, topological sort + relaxation gives O(V+E) even with negative weights.
- The underrated option. For DAGs, you can compute shortest paths in O(V+E) by processing nodes in topological order and relaxing edges. This works with negative weights and is faster than both Dijkstra and Bellman-Ford. Most candidates forget this exists.
- SPFA note. SPFA (Shortest Path Faster Algorithm) is a queue-based optimization of Bellman-Ford that is often faster in practice (average O(E)) but has the same O(VE) worst case. Some competitive programmers default to SPFA, but it can be killed by adversarial inputs.
- You are building a navigation system for a city with 10 million intersections. Which algorithm do you use and what data structure optimizations would you consider? (Answer: Dijkstra with a Fibonacci heap for O(V log V + E) theoretically, though in practice a binary heap or a 4-ary heap is faster. Also consider A* with a good heuristic, bidirectional Dijkstra, or contraction hierarchies for repeated queries.)
- How does the “Cheapest Flights Within K Stops” problem modify Bellman-Ford, and why can you not just use Dijkstra with a visited-with-stops check? (Answer: you run Bellman-Ford for only K+1 rounds instead of V-1. Dijkstra with a stops counter does not work cleanly because a node can be reached in fewer stops at a higher cost, and you need to keep both options open.)
Q4: Explain topological sort. Can a graph have multiple valid topological orderings? How would you detect if one is impossible?
- What it is. “A topological ordering of a DAG is a linear sequence of vertices such that for every directed edge u->v, u appears before v. It is only possible on directed acyclic graphs.”
- Two approaches. Kahn’s algorithm (BFS-based): compute in-degrees, start with nodes at in-degree 0, process them and decrement neighbors’ in-degrees. DFS-based: run DFS and add nodes to a stack in post-order (when you finish processing all of a node’s neighbors), then reverse the stack. Both are O(V+E).
- Multiple valid orderings? Yes, whenever there are multiple nodes with in-degree 0 at the same time in Kahn’s algorithm. For example, if courses A and B have no prerequisites, either can come first. The number of valid topological orderings can be exponentially large.
- Detecting impossibility (cycles). With Kahn’s: if the result contains fewer than V nodes, a cycle exists — some nodes never reached in-degree 0 because they are in a cycle. With DFS: detect a back edge (an edge to a node currently on the recursion stack, typically tracked with a “gray” color in the white/gray/black coloring scheme).
- Uniqueness check. The topological order is unique if and only if the queue in Kahn’s algorithm never contains more than one node at any step. This implies a Hamiltonian path exists in the DAG. You can check this during the BFS.
- You have a build system with 50,000 packages and their dependency graph. How do you determine the maximum parallelism — the most packages you can build simultaneously? (Answer: under a level-by-level schedule, it is the maximum number of nodes with in-degree 0 at the same step in Kahn’s. More formally, the maximum parallelism is the size of the largest antichain in the DAG, which by Dilworth’s theorem equals the minimum number of chains needed to cover the DAG.)
- Can you find the longest path in a DAG? How? (Answer: process nodes in topological order, relax edges to compute the longest distance instead of shortest. O(V+E). This is impossible efficiently on general graphs — longest path is NP-hard on graphs with cycles.)
Q5: Describe Kruskal's algorithm. Why does the greedy strategy of always picking the cheapest edge produce a correct MST?
- The algorithm. “Sort all edges by weight. Process them in order from lightest to heaviest. For each edge, if it connects two different components (check with Union-Find), add it to the MST. Otherwise skip it. Stop when we have V-1 edges.”
- Why the greedy works — the cut property. For any cut (partition of vertices into two sets S and V-S), the minimum weight edge crossing the cut must be in every MST (assuming unique weights). Kruskal implicitly uses this: when it picks the cheapest edge connecting two different components, that edge is the minimum weight edge across the cut between those two components.
- Why Union-Find is the right data structure. Each “is this edge connecting two components?” query is a `find()` operation, and each “merge components” is a `union()` operation. With path compression and union by rank, each operation is nearly O(1) — amortized O(alpha(n)) where alpha is the inverse Ackermann function, effectively constant.
- Complexity. O(E log E) dominated by the sort. The Union-Find operations total O(E * alpha(V)), which is essentially O(E). Since E can be at most V^2, O(E log E) = O(E log V^2) = O(2E log V) = O(E log V).
- When Kruskal beats Prim. Kruskal is better for sparse graphs (E close to V) because sorting fewer edges is cheap. Prim’s with a binary heap is O(E log V) as well, but Prim’s with a Fibonacci heap achieves O(E + V log V) which wins on dense graphs.
- What if some edges have equal weights? Can the MST be non-unique? How does this affect Kruskal’s? (Answer: yes, the MST can be non-unique when ties exist. Different tie-breaking orders in the sort can produce different MSTs, but all will have the same total weight. The MST is unique if and only if all edge weights are distinct.)
- After building an MST, one edge weight in the original graph decreases. How do you efficiently update the MST without recomputing from scratch? (Answer: add the edge to the MST, which creates exactly one cycle. Find and remove the heaviest edge in that cycle. This can be done in O(V) with a path query on the MST. For repeated updates, link-cut trees give O(log V) per update.)
Q6: You are given a graph and need to detect if it contains a cycle. How does your approach differ between directed and undirected graphs?
- Undirected graphs — Union-Find or DFS. With Union-Find: for each edge (u, v), check if u and v are already in the same component. If yes, adding this edge creates a cycle. With DFS: if we visit a neighbor that is already visited and it is not the parent of the current node, we found a cycle. The parent check is critical — without it, the edge we just traversed from would be falsely flagged.
- Directed graphs — DFS with three colors. White (unvisited), Gray (on the current recursion stack), Black (fully processed). A cycle exists if and only if we encounter a Gray node during DFS — that is a back edge to an ancestor on the current path, and together with the tree path down to the current node it closes a cycle. A cross edge to a Black node is fine — it does not indicate a cycle. (A minimal sketch follows this list.)
- Why the distinction matters. In an undirected graph, edge (A, B) is traversed both ways, so we need the parent check to avoid false positives. In a directed graph, edge A->B does not imply B->A, so cross edges (to fully processed nodes in a different subtree) are harmless. Only back edges (to ancestors in the current DFS path) create cycles.
- Alternative for directed graphs. Kahn’s topological sort: if the result has fewer than V nodes, a cycle exists. This is sometimes cleaner than DFS coloring when you are already computing a topological order.
- Can you detect the length of the shortest cycle in an undirected graph? What is the complexity? (Answer: run BFS from each vertex. When BFS finds an already-visited node, the cycle length is `dist[u] + dist[v] + 1`. Total: O(V(V+E)). The shortest cycle length in a graph is called the “girth.”)
- In a directed graph, how would you find all nodes that are part of some cycle? (Answer: compute strongly connected components using Tarjan’s or Kosaraju’s algorithm. Any SCC with more than one node contains a cycle, and all nodes in it are part of a cycle. O(V+E).)
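A sketch of directed-cycle detection with the white/gray/black coloring described above; function name and edge-list input are illustrative assumptions:

```python
import sys
from collections import defaultdict

def has_cycle(n, edges):
    """Sketch: directed edges as (u, v) pairs, nodes 0..n-1; True if a cycle exists."""
    sys.setrecursionlimit(max(10_000, n + 10))
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)

    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n

    def dfs(node):
        color[node] = GRAY                    # node is on the current DFS path
        for nxt in graph[node]:
            if color[nxt] == GRAY:            # back edge to an ancestor -> cycle
                return True
            if color[nxt] == WHITE and dfs(nxt):
                return True
        color[node] = BLACK                   # fully processed, safe to revisit
        return False

    return any(color[node] == WHITE and dfs(node) for node in range(n))
```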
Q7: What is Floyd-Warshall and when would you use it over running Dijkstra from every vertex?
- What it is. “Floyd-Warshall computes shortest paths between all pairs of vertices in O(V^3) time and O(V^2) space. It uses a dynamic programming approach: for each intermediate vertex k, update `dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])`. The key insight is iterating over intermediate vertices in the outer loop.” (See the sketch after this list.)
- The DP formulation. `dist_k[i][j]` = shortest path from i to j using only vertices `{0, 1, ..., k}` as intermediates. The recurrence: either the shortest path through `{0..k}` uses vertex k (then it is `dist_{k-1}[i][k] + dist_{k-1}[k][j]`) or it does not (then it is `dist_{k-1}[i][j]`). We can update in-place because `dist[i][k]` and `dist[k][j]` do not change when we are processing vertex k.
- Floyd-Warshall vs. V times Dijkstra. Floyd-Warshall: O(V^3), handles negative weights (no negative cycles), extremely simple to implement (three nested loops). V times Dijkstra: O(V(V+E) log V) with a binary heap. For dense graphs (E ~ V^2), Dijkstra gives O(V^3 log V) — worse than Floyd-Warshall. For sparse graphs (E ~ V), Dijkstra gives O(V^2 log V) — much better. Rule of thumb: Floyd-Warshall for dense graphs or graphs with negative weights, repeated Dijkstra for sparse graphs.
- Negative cycle detection. After running Floyd-Warshall, if any `dist[i][i] < 0`, a negative cycle exists through vertex i.
- The simplicity advantage. Floyd-Warshall is 5 lines of code. In an interview or contest, when you need all-pairs and the graph is not huge, it is often the pragmatic choice.
- Why must the k-loop (intermediate vertex loop) be the outermost loop? What goes wrong if you put it inside? (Answer: the DP recurrence requires that when computing paths through vertex k, all paths through vertices 0..k-1 are already finalized. If k is not the outer loop, you use partially computed values and get wrong answers.)
- Floyd-Warshall uses O(V^2) space. Can you reconstruct the actual shortest path, not just the distance? How? (Answer: maintain a `next[i][j]` matrix. Initialize `next[i][j] = j` for each edge. When updating `dist[i][j]` through k, set `next[i][j] = next[i][k]`. To reconstruct the path from i to j, follow `next[i][j]`, `next[next[i][j]][j]`, etc. Same O(V^2) space.)
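A sketch of Floyd-Warshall with the intermediate-vertex loop outermost, as the answer stresses; names and the edge-list input are illustrative assumptions:

```python
def floyd_warshall(n, edges):
    """Sketch: edges is a list of directed (u, v, w); returns the full distance matrix."""
    INF = float("inf")
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)       # keep the cheapest parallel edge

    for k in range(n):                        # intermediate vertex: outermost loop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist                               # dist[i][i] < 0 flags a negative cycle
```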
Q8: Design a system to find the shortest path between any two users in a social network with 500 million users. How do you approach this at scale?
Q9: What are strongly connected components? Describe an algorithm to find them and give a real-world use case.
- Definition. “A strongly connected component (SCC) of a directed graph is a maximal set of vertices such that there is a directed path from every vertex to every other vertex in the set. In other words, within an SCC, you can reach any node from any other node.”
- Kosaraju’s algorithm (two-pass DFS). Pass 1: Run DFS on the original graph, recording the finish order of nodes (push to a stack when done). Pass 2: Transpose the graph (reverse all edges). Pop nodes from the stack and run DFS on the transposed graph. Each DFS in pass 2 discovers one SCC. O(V+E) total.
- Why it works. The first DFS orders nodes so that the “root” of each SCC’s DFS tree finishes last. Reversing edges means that starting from these roots in reverse finish order, the DFS can only reach nodes within the same SCC — the reversed edges prevent crossing into other SCCs.
- Tarjan’s algorithm (single-pass). Uses a single DFS with a stack and tracking of “low-link” values. When a node’s low-link equals its discovery time, it is the root of an SCC — pop everything down to it from the stack. Also O(V+E) but in a single pass with more complex bookkeeping.
- Real-world use cases. Compiler optimization: finding cycles in call graphs to determine which functions are mutually recursive. Web crawling: SCC analysis of web links reveals tightly connected clusters of pages. Dependency analysis: in a package manager, an SCC in the dependency graph means circular dependencies that must be resolved together.
- After finding all SCCs, if you contract each SCC into a single node, what kind of graph do you get? (Answer: a directed acyclic graph — the “condensation” of the original graph. This is extremely useful because you can now run topological sort and DP on the condensation. For example, finding the longest path in a directed graph with cycles reduces to finding SCCs, condensing, then doing longest-path DP on the DAG.)
- Can you find SCCs in a graph that does not fit in memory? (Answer: this is an active research area. External-memory SCC algorithms exist but are significantly more complex. One approach: partition the graph, compute SCCs within partitions, then iteratively merge across partitions. In practice, systems like MapReduce-based graph processing frameworks handle this with multiple rounds of computation.)
Q10: Given a weighted directed graph, how do you find the shortest path from source to destination that uses at most K edges? Why can you not just use standard Dijkstra?
- Why standard Dijkstra fails. Dijkstra pops a node and marks it as finalized. But with a K-edge constraint, a node might be reachable in 2 edges at cost 100 and in 5 edges at cost 50. If K >= 5, we want the second path. If K = 3, we want the first. Standard Dijkstra would finalize the node at cost 50 (via 5 edges) and never consider the 2-edge path. You cannot just track `(node, edges_used)` as your state in Dijkstra either, because the state space explodes and the greedy invariant gets complicated.
- Bellman-Ford approach. Run exactly K rounds of Bellman-Ford (instead of V-1). After i rounds, `dist[v]` contains the shortest path to v using at most i edges. The key detail: on each round, use the distances from the previous round (make a copy) to avoid “chaining” relaxations within the same round, which could use more than one additional edge.
- The copy trick. In standard Bellman-Ford, you relax in-place and it does not matter if relaxations chain within a round (it converges in V-1 rounds regardless). But for the K-edge variant, chaining within a round means a single round could use multiple edges. So you must relax from a snapshot of the previous round’s distances.
- DP formulation. `dp[k][v]` = minimum cost to reach v using at most k edges. Transition: `dp[k][v] = min(dp[k-1][v], min over all edges (u,v) of dp[k-1][u] + weight(u,v))`. Space-optimized to two arrays (current and previous round).
- Complexity. O(K * E) time, O(V) space (with the rolling array optimization).
- What if K is very large (close to V)? Is Bellman-Ford still the best approach? (Answer: if K >= V-1, the constraint is effectively lifted and standard Dijkstra or full Bellman-Ford works. For K close to V, Bellman-Ford at O(KE) is similar to its standard O(VE). Dijkstra at O((V+E) log V) would be faster if there are no negative weights.)
- Can you solve this problem with a modified BFS if all weights are equal to 1? (Answer: yes. With unit weights, “shortest path with at most K edges” is just “is the BFS distance at most K?” Standard BFS in O(V+E) handles it directly. The problem is only interesting with varying weights.)
Interview Deep-Dive
Given a new shortest-path problem, how do you decide between BFS, Dijkstra, Bellman-Ford, and Floyd-Warshall? What are the exact conditions for each?
- BFS: Unweighted graph (or all edges weight 1). Single-source. O(V + E). The fastest possible shortest-path algorithm for this case — never use Dijkstra when BFS suffices.
- Dijkstra: Weighted graph with non-negative edges. Single-source. O((V + E) log V) with binary heap. The default choice for most weighted shortest-path problems. Fails silently with negative edges.
- Bellman-Ford: Weighted graph, negative edges allowed but no negative cycles. Single-source. O(VE). Also detects negative cycles (if any distance decreases in the Vth relaxation round). Use for “cheapest flights within K stops” variant by running only K rounds.
- Floyd-Warshall: All-pairs shortest paths. O(V^3). Handles negative edges (no negative cycles). Best for dense graphs or when you need every pair’s distance. For sparse graphs, V runs of Dijkstra is faster: O(V(V+E) log V).
- 0-1 BFS: Edges with weights 0 or 1 only. Single-source. O(V + E) using a deque. Push weight-0 edges to front, weight-1 edges to back.
- DAG shortest path: Topological sort + relaxation. O(V + E). Works with negative edges. Fastest single-source shortest path for DAGs.
- Decision tree: Unweighted -> BFS. Weighted, non-negative -> Dijkstra. Negative edges -> Bellman-Ford. All-pairs -> Floyd-Warshall (dense) or V x Dijkstra (sparse). DAG -> topological sort.
- At scale (e.g., 10 million intersections), Dijkstra is the starting point, but even Dijkstra is too slow per query for real-time routing. Production systems use contraction hierarchies (preprocess shortcut edges) or A* search (heuristic-guided — use Euclidean distance as the heuristic for road networks). Google Maps uses a variant of contraction hierarchies that precomputes a hierarchy of important roads, reducing query time from seconds to milliseconds.
Explain why Dijkstra fails with negative edge weights. Construct a 3-node graph where it gives the wrong answer, and walk through step by step.
- The invariant Dijkstra relies on: Once a node is popped from the min-heap, its distance is final. No shorter path to it can be discovered later. This is true when all edges are non-negative because any path through an unvisited node must cost at least as much as the direct path.
- Concrete 3-node counterexample: Nodes A, B, C. Edges: A->B (weight 1), A->C (weight 5), B->C (weight -10).
- Initialize: dist = [A:0, B:inf, C:inf]. Heap: [(0, A)].
- Pop A (dist 0). Relax A->B: dist[B] = 1. Relax A->C: dist[C] = 5. Heap: [(1, B), (5, C)].
- Pop B (dist 1). Dijkstra finalizes B at distance 1. Relax B->C: dist[C] = min(5, 1 + (-10)) = -9. Heap: [(5, C), (-9, C)].
- Pop C (dist -9). Dijkstra finalizes C at -9. But wait — Dijkstra already finalized B at distance 1. Is that correct? Yes in this case.
- Actually, the real failure is subtler. Consider: A->B (weight 6), A->C (weight 2), C->B (weight -5).
- Initialize: dist = [A:0, B:inf, C:inf]. Heap: [(0, A)].
- Pop A. Relax A->B: dist[B] = 6. Relax A->C: dist[C] = 2. Heap: [(2, C), (6, B)].
- Pop C (dist 2). Finalize C at 2. Relax C->B: dist[B] = min(6, 2 + (-5)) = -3. Heap: [(-3, B), (6, B)].
- Pop B (dist -3). Finalize B at -3. Correct answer is -3 via A->C->B.
- But with lazy deletion, Dijkstra gets this right! The failure needs a settled node whose improvement can no longer propagate. Example: A->B (1), A->C (2), C->B (-2), B->D (1). B is settled at distance 1 and relaxed before C is processed, so D is finalized at 2; the cheaper route A->C->B->D of total cost 1 is never found, because the classic algorithm never reprocesses a settled node. Negative cycles are worse still: with A->B (1), B->C (1), A->C (10), C->A (-20), the cycle A->C->A has weight 10 + (-20) = -10, so shortest distances are not even well defined — and Dijkstra has no way to detect this.
- The fundamental issue: Negative edges violate the greedy invariant. Dijkstra assumes “settled” means “optimal,” but a negative edge from an unsettled node can retroactively improve a settled node’s distance.
- Can Dijkstra be adapted when there are negative edges but no negative cycles? Yes, via Johnson’s algorithm. Run Bellman-Ford from a virtual source node connected to all vertices with weight-0 edges. This computes a potential function h(v). Reweight each edge: w’(u,v) = w(u,v) + h(u) - h(v). All reweighted edges are non-negative. Then run Dijkstra from each source on the reweighted graph. Total: O(VE + V(V+E) log V).
What is the difference between Kruskal's and Prim's algorithm for MST? When does each one win, and what data structures make them efficient?
- Kruskal’s: Sort all edges by weight. Process edges from lightest to heaviest. If an edge connects two different components (checked via Union-Find), add it to the MST. Stop at V-1 edges. Time: O(E log E) dominated by sorting. Union-Find operations are O(E * alpha(V)), effectively O(E).
- Prim’s: Start from any node. Maintain a min-heap of edges crossing the cut between MST nodes and non-MST nodes. Pop the lightest crossing edge, add the new node to the MST, and push its edges. Time: O(E log V) with binary heap, O(E + V log V) with Fibonacci heap.
- When Kruskal wins: Sparse graphs (E close to V). Sorting fewer edges is cheap. Also, Kruskal naturally works with edge lists and does not need an adjacency list.
- When Prim wins: Dense graphs (E close to V^2). Prim with an adjacency matrix is O(V^2), which beats Kruskal’s O(E log E) = O(V^2 log V) on dense graphs. With Fibonacci heap, Prim achieves O(E + V log V), which is optimal.
- Correctness proof: Both rely on the cut property — for any cut of the graph, the minimum weight edge crossing the cut must be in every MST (assuming unique weights). Kruskal implicitly applies this by always choosing the globally cheapest edge that does not create a cycle. Prim applies it by always choosing the cheapest edge crossing the MST/non-MST boundary.
- MST uniqueness: If all edge weights are distinct, the MST is unique. If some weights are equal, multiple MSTs may exist, but all have the same total weight.
- If one edge weight decreases after the MST is built: add the updated (cheaper) edge to the MST. This creates exactly one cycle. Find and remove the heaviest edge in that cycle. If the removed edge is the updated edge itself (it was already the heaviest in the cycle), the MST is unchanged. Otherwise, you have a new, lighter MST. Finding the heaviest edge in the cycle can be done in O(V) with a path query, or O(log V) with link-cut trees.