What is Backtracking?
Backtracking is a systematic way to explore all possible solutions by building candidates incrementally and abandoning a path (“backtracking”) as soon as it violates constraints.

Think of it like navigating a maze. At every fork, you pick a direction and keep walking. If you hit a dead end, you retrace your steps to the last fork and try the next direction. You do not restart from the entrance each time — you only undo the most recent decision. That “undo and try something else” loop is the heart of backtracking.

More formally, backtracking explores a decision tree. Each node represents a partial solution, each edge represents a choice, and leaves are either valid solutions or dead ends. The algorithm performs a depth-first walk of this tree, pruning entire subtrees the moment a constraint is violated, which is what makes it dramatically faster than brute force in practice.

Complexity insight: Without pruning, backtracking visits every leaf of the decision tree (often O(2^n) or O(n!)). Good pruning can cut this down by orders of magnitude, but worst-case exponential behavior is inherent to the approach. Always discuss pruning strategies when explaining backtracking in an interview.
Pattern Recognition Checklist
Use Backtracking When
- Need all combinations/permutations
- Problem is about constraint satisfaction
- Generate all valid solutions
- Need to explore decision tree
- Pruning can reduce search space
Don't Use When
- Only need one solution (might use simpler approach)
- Optimal solution needed (use DP instead)
- Counting only (DP often better)
- No way to prune invalid paths
When to Use
- Combinations/Permutations
- Subsets
- Constraint Satisfaction
- Path Finding
Core Template
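The three steps below can be sketched as a minimal Python skeleton, shown here enumerating subsets (the function name and structure are illustrative, not code from the original page):

```python
def backtrack_template(nums):
    """Generic backtracking skeleton, instantiated for subsets of nums."""
    results = []
    path = []

    def backtrack(start):
        results.append(path[:])              # record a COPY of the current partial solution
        for i in range(start, len(nums)):
            path.append(nums[i])             # 1. choose
            backtrack(i + 1)                 # 2. explore (only look forward)
            path.pop()                       # 3. un-choose (backtrack)

    backtrack(0)
    return results
```

Adapting the skeleton to other variants usually means changing only the loop bounds, where results are collected, and the pruning condition.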
Every backtracking solution follows three steps: choose, explore, un-choose. Memorize this skeleton and you can adapt it to any variant.
Pattern Variations
1. Subsets (All Combinations)
For subsets, every node in the decision tree is a valid answer (not just leaves). At each position we decide “include this element or not,” and we only look forward from the current index to avoid building the same subset twice in different orders (picking [1, 2] and then [2, 1]). Time: O(n * 2^n) — 2^n subsets, each copied in O(n). Space: O(n) recursion depth.
2. Permutations
Unlike subsets, permutations use every element exactly once and order matters. We track which elements are already in the current arrangement with a used boolean array. Only leaf nodes (full-length arrangements) are collected.
Time: O(n * n!) — n! permutations, each copied in O(n). Space: O(n) for the used array and recursion stack.
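The permutation pattern above can be sketched in Python like this (names are illustrative):

```python
def permute(nums):
    """All permutations of nums via backtracking with a used array."""
    results, path = [], []
    used = [False] * len(nums)

    def backtrack():
        if len(path) == len(nums):           # leaf: a full-length arrangement
            results.append(path[:])
            return
        for i in range(len(nums)):
            if used[i]:                      # element already in the arrangement
                continue
            used[i] = True                   # choose
            path.append(nums[i])
            backtrack()                      # explore
            path.pop()                       # un-choose
            used[i] = False                  # forgetting this silently loses results

    backtrack()
    return results
```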
3. Combinations (Choose K)
Combinations are subsets of a fixed size k. The key optimization is pruning: if there are not enough remaining numbers to fill the combination, stop early instead of recursing pointlessly. For example, if you need 3 more numbers but only 2 remain, prune immediately. Time: O(k * C(n,k)) — C(n,k) combinations, each copied in O(k). Space: O(k) recursion depth.
4. N-Queens
The classic constraint satisfaction problem. We place one queen per row and use is_safe to prune placements that conflict with previously placed queens. We only need to check upward (rows already filled) because we place queens top to bottom.
Time: approximately O(n!) with pruning. Space: O(n^2) for the board.
Interview tip: You can speed up is_safe from O(n) to O(1) by maintaining sets for columns, left-diagonals (row - col), and right-diagonals (row + col). Mention this optimization even if you code the simpler version.
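A sketch combining both ideas, one queen per row plus the O(1) conflict sets (illustrative implementation, not code from the original page):

```python
def solve_n_queens(n):
    """Return all solutions; each solution lists the queen's column per row."""
    results, board = [], []                  # board[r] = column of the queen in row r
    cols, diag1, diag2 = set(), set(), set()

    def backtrack(row):
        if row == n:
            results.append(board[:])
            return
        for col in range(n):
            # O(1) safety check via the three conflict sets
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            cols.add(col); diag1.add(row - col); diag2.add(row + col)   # choose
            board.append(col)
            backtrack(row + 1)                                          # explore
            board.pop()                                                 # un-choose
            cols.remove(col); diag1.remove(row - col); diag2.remove(row + col)

    backtrack(0)
    return results
```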
Classic Problems
1. Subsets
Key point: when the input contains duplicates, sort it first and skip nums[i] == nums[i-1] at the same recursion level to avoid duplicate subsets.
2. Permutations
Permutations track choices with a used array instead of a start index. Complexity: O(n * n!) time, O(n) space. Common mistake: Forgetting to reset used[i] = False after backtracking — this silently produces fewer results without erroring out, making it hard to debug.
3. Combination Sum
The early-exit check if remaining < 0: return is essential. Interview tip: Sort the candidates first. Then when the current candidate exceeds the remaining target, break out of the loop because all subsequent candidates are even larger.
4. Word Search
Search from every cell with DFS, marking the current cell with a sentinel character so it cannot be reused, and restoring it when backtracking.
Worked LeetCode Problems
These six problems cover roughly 80% of the backtracking questions you will see in interviews. Master the templates here and the rest are variations.
LC 46: Permutations
Problem: Given an array of distinct integers, return all possible permutations. Why it matters: The cleanest “use a used array” template. Every interviewer asks some version of this.
LC 78: Subsets (Power Set)
Problem: Return all subsets (the power set) of an array of distinct integers. Key insight: Every recursive call’s path IS a valid subset — collect at the top of every call, not just at leaves.
LC 39: Combination Sum
Problem: Find all unique combinations of candidates that sum to target. Each candidate may be used unlimited times.
Key insight: Recurse with i (NOT i+1) to allow element reuse. Sort first, then break when a candidate exceeds the remaining target — this single optimization is huge.
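Both key points (recurse with `i`, sort plus break) appear in this sketch (illustrative names):

```python
def combination_sum(candidates, target):
    """All unique combinations summing to target; elements may be reused."""
    candidates = sorted(candidates)          # sorting enables break-based pruning
    results, path = [], []

    def backtrack(start, remaining):
        if remaining == 0:
            results.append(path[:])
            return
        for i in range(start, len(candidates)):
            if candidates[i] > remaining:    # sorted: everything after is larger too
                break
            path.append(candidates[i])
            backtrack(i, remaining - candidates[i])   # i, not i + 1: reuse allowed
            path.pop()

    backtrack(0, target)
    return results
```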
LC 51: N-Queens
Problem: Place N queens on an NxN board so no two attack each other. Return all distinct configurations. Key insight: Place exactly one queen per row. Track occupied columns and diagonals in O(1)-lookup sets. Diagonals are identified by row - col (left-to-right) and row + col (right-to-left).
LC 79: Word Search (Grid Backtracking)
Problem: Given an m x n grid of characters, return true if word exists by walking adjacent cells (no cell reused per word).
Key insight: Mark visited cells with a sentinel character ('#'), recurse 4 directions, restore after the recursion.
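A sketch of that sentinel-marking approach (illustrative; assumes board is a list of lists of single characters):

```python
def exist(board, word):
    """Return True if word can be traced through adjacent cells without reuse."""
    rows, cols = len(board), len(board[0])

    def dfs(r, c, k):
        if k == len(word):                                   # matched every character
            return True
        if not (0 <= r < rows and 0 <= c < cols) or board[r][c] != word[k]:
            return False
        saved, board[r][c] = board[r][c], '#'                # mark visited with sentinel
        found = (dfs(r + 1, c, k + 1) or dfs(r - 1, c, k + 1) or
                 dfs(r, c + 1, k + 1) or dfs(r, c - 1, k + 1))
        board[r][c] = saved                                  # restore on backtrack
        return found

    return any(dfs(r, c, 0) for r in range(rows) for c in range(cols))
```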
LC 22: Generate Parentheses
Problem: Given n, return all combinations of n pairs of well-formed parentheses. Key insight: Track how many '(' and ')' have been placed. Two pruning rules: cannot add '(' if open == n; cannot add ')' if close >= open.
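Those two pruning rules translate directly into code; a sketch (illustrative names):

```python
def generate_parenthesis(n):
    """All well-formed strings of n pairs of parentheses."""
    results, path = [], []

    def backtrack(open_count, close_count):
        if len(path) == 2 * n:                       # leaf: all 2n characters placed
            results.append(''.join(path))
            return
        if open_count < n:                           # rule 1: can still open a pair
            path.append('(')
            backtrack(open_count + 1, close_count)
            path.pop()
        if close_count < open_count:                 # rule 2: ')' must close something
            path.append(')')
            backtrack(open_count, close_count + 1)
            path.pop()

    backtrack(0, 0)
    return results
```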
Curated LeetCode Practice List
Grouped by difficulty, with annotations for what each problem teaches.
Easy (Warm-up)
| LC# | Problem | What It Teaches |
|---|---|---|
| 17 | Letter Combinations of a Phone Number | Backtracking on a fixed-depth tree. Cleanest first problem. |
| 784 | Letter Case Permutation | “Include or modify” choice at each character. |
| 401 | Binary Watch | Backtracking with constraint validation at the end. |
Medium (LeetCode bread-and-butter)
| LC# | Problem | What It Teaches |
|---|---|---|
| 78 | Subsets | Canonical “every node is a result” template. |
| 90 | Subsets II | Subsets with duplicates — sort plus skip-equal-at-depth. |
| 46 | Permutations | The used[] array template. |
| 47 | Permutations II | Permutations with duplicates — sort plus not used[i-1] skip. |
| 39 | Combination Sum | Reuse allowed; sort plus break-prune. |
| 40 | Combination Sum II | Each used once; duplicates in input. |
| 22 | Generate Parentheses | Constraint-based pruning (open/close counts). |
| 79 | Word Search | Grid backtracking with visited-marker. |
| 131 | Palindrome Partitioning | Try every prefix; recurse on the rest if valid. |
| 93 | Restore IP Addresses | Constraint-bounded depth (exactly 4 octets). |
| 698 | Partition to K Equal Sum Subsets | Backtracking + bitmask state. |
Hard (Stretch problems for senior interviews)
| LC# | Problem | What It Teaches |
|---|---|---|
| 51 | N-Queens | Constraint sets for O(1) validity. |
| 52 | N-Queens II | Same but counting only — bitmask optimization shines. |
| 37 | Sudoku Solver | Multiple interacting constraints; MRV heuristic. |
| 212 | Word Search II | Trie + grid backtracking — the Trie trick. |
| 301 | Remove Invalid Parentheses | Generate-and-validate with minimum-removal constraint. |
| 282 | Expression Add Operators | Track running value plus last operand for multiplication precedence. |
The Backtracking Template
Every backtracking solution follows the same choose, explore, un-choose structure described in the Core Template section above.
Problem Type Quick Reference
| Problem Type | Key Behavior | Start Index | Reuse Elements |
|---|---|---|---|
| Subsets | Every node is result | i + 1 | No |
| Combinations | Leaf nodes only | i + 1 | No |
| Combination Sum | Leaf nodes only | i (same) | Yes |
| Permutations | Leaf nodes only | 0 (use visited) | No |
| Unique Results | Skip duplicates at same level | i + 1 | No |
Interview Problems by Company
| Problem | Company | Key Concept |
|---|---|---|
| Subsets | All FAANG | Basic include/exclude |
| Permutations | All FAANG | Used array tracking |
| Combination Sum | Amazon, Meta | Reusable elements |
| Letter Combinations | Meta | Phone keypad mapping |
| Word Search | Amazon, Meta | Grid backtracking |
Interview Tips
How to Explain Your Approach
- “This requires generating all possibilities, so I’ll use backtracking.”
- “I’ll build solutions incrementally, making one choice at a time.”
- “At each step, I’ll check if the current path is valid.”
- “After exploring a choice, I’ll undo it (backtrack) and try the next.”
- “I’ll prune invalid paths early to optimize.”
When Interviewer Says...
| Interviewer Says | You Should Think |
|---|---|
| “Generate all combinations” | Backtracking |
| “Find all permutations” | Backtracking with used array |
| “All valid configurations” | Backtracking with constraints |
| “Can you prune?” | Early termination conditions |
| “Handle duplicates” | Sort + skip same at same level |
Avoiding Duplicates
- Sort the input first
- Skip duplicates at same level: if i > start and nums[i] == nums[i-1]: continue
- This ensures the same value isn’t used twice at the same position in the recursion tree
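Put together (this is Subsets II, LC 90), the steps above look like this sketch (illustrative names):

```python
def subsets_with_dup(nums):
    """All unique subsets of nums, which may contain duplicate values."""
    nums = sorted(nums)                      # sort so equal values are adjacent
    results, path = [], []

    def backtrack(start):
        results.append(path[:])
        for i in range(start, len(nums)):
            if i > start and nums[i] == nums[i - 1]:   # same value, same level: skip
                continue
            path.append(nums[i])
            backtrack(i + 1)
            path.pop()

    backtrack(0)
    return results
```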
Practice Problems
| Problem | Difficulty | Link |
|---|---|---|
| Subsets | Medium | LeetCode 78 |
| Permutations | Medium | LeetCode 46 |
| Combination Sum | Medium | LeetCode 39 |
| N-Queens | Hard | LeetCode 51 |
| Word Search | Medium | LeetCode 79 |
Practice Roadmap
Interview Questions
1. Walk me through the backtracking template. What are the three core steps and why does each one matter?
- The three steps are choose, explore, and un-choose (backtrack). Every backtracking solution is a depth-first traversal of a decision tree built from these three operations.
- Choose means committing to a candidate decision — appending an element to the current path, marking a cell as visited, or placing a queen on the board. This mutates shared state.
- Explore means recursing deeper with the updated state. You pass a narrower set of remaining choices (e.g., start = i + 1 for combinations, or the full range with a used array for permutations).
- Un-choose means undoing the mutation so the next sibling branch in the decision tree starts from a clean slate. This is the step that makes backtracking work — without it, choices from one branch leak into another and you get wrong results with zero error messages, which makes it one of the hardest bugs to catch.
- The reason we mutate and undo (instead of copying the path each time) is performance. Copying an array at every recursive call turns O(n) space into O(n * 2^n) space and adds a significant constant factor. The mutate-then-undo pattern keeps space at O(n) for the recursion stack.
- Example: In the subsets problem for [1, 2, 3], when you are at the node [1, 2], you explore by adding 3 to get [1, 2, 3]. After that recursive call returns, you pop 3 so the path is back to [1, 2]. Then you pop 2 so the path is [1], and the loop moves to i = 2, which adds 3 to get [1, 3]. If you forgot to pop, you would be building on stale state.
Common mistake: saving path directly; a copy via path[:] (or new ArrayList<>(path)) is required when saving results, because the path keeps mutating after you save it.
Follow-ups:
- “What happens if you forget the un-choose step but the code still runs? How would you debug that?” — A missing pop() or used[i] = False does not throw an error. The code produces fewer results or wrong results silently. You would debug by printing the path at each recursive call and noticing that elements from previous branches are still present.
- “When would you intentionally not undo a choice?” — When you are creating a new copy of the path at each level instead of mutating (the immutable approach). This trades space for simplicity. Some functional-style solutions do this, and it is fine for small inputs, but it blows up memory for large decision trees.
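A tiny sketch that makes the copy-vs-reference point concrete (a hypothetical helper written for this note, not code from the original page):

```python
def copy_vs_reference(nums):
    """Collect subsets two ways: storing the mutable path vs. a snapshot of it."""
    by_reference, by_copy, path = [], [], []

    def backtrack(start):
        by_reference.append(path)        # BUG: every entry aliases the same list
        by_copy.append(path[:])          # correct: freezes the current contents
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1)
            path.pop()

    backtrack(0)
    return by_reference, by_copy
```

After running on [1, 2], by_copy holds the four subsets, while every entry of by_reference has mutated back to the empty list.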
2. How do you handle duplicate elements in backtracking to avoid producing duplicate results?
- The strategy is: sort the input first, then skip duplicate values at the same recursion level. The check is if i > start and nums[i] == nums[i-1]: continue.
- Why this works: sorting groups identical values together. At any given recursion level, the loop iterates over choices for one “slot” in the result. If you have already tried value 2 in this slot, trying another 2 in the same slot produces the exact same subtree — so you skip it.
- The condition i > start (not i > 0) is critical. It ensures you only skip duplicates at the same level. Using i > 0 would incorrectly prevent picking the same value in deeper levels, which is valid. For example, with input [1, 2, 2], the subset [2, 2] is legitimate — the two 2s come from different recursion depths.
- For permutations with duplicates (Permutations II), the approach is slightly different. You sort, use a used array, and add the condition if i > 0 and nums[i] == nums[i-1] and not used[i-1]: continue. This ensures that among identical elements, they are always picked in left-to-right order, preventing duplicate permutations.
- Example: Subsets II with [1, 2, 2]. Sorted: [1, 2, 2]. At recursion level start=1, we pick nums[1]=2 and explore. When the loop reaches nums[2]=2, we see nums[2] == nums[1] and i > start, so we skip. Without this, we would get [1, 2] twice — once using the first 2 and once using the second.
Follow-ups:
- “Why is the condition i > start and not i > 0? What breaks if you use i > 0?” — Using i > 0 prevents picking duplicate values even at different depths. You would miss valid results like [2, 2] from input [1, 2, 2] because the second 2 at depth 2 would be skipped.
- “For Permutations II, the skip condition checks not used[i-1]. Some people write used[i-1] instead. Both produce correct results — why, and which is more efficient?” — Both enforce an ordering among identical elements. not used[i-1] means “only pick the i-th duplicate if the (i-1)-th is already in use” (pick duplicates left to right). used[i-1] means “skip if the previous duplicate is already used” (pick duplicates right to left). The not used[i-1] version prunes higher in the tree because it cuts off branches earlier, making it faster in practice.
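The Permutations II variant discussed above can be sketched like this (illustrative names):

```python
def permute_unique(nums):
    """All distinct permutations of nums, which may contain duplicates."""
    nums = sorted(nums)                      # duplicates must be adjacent
    results, path = [], []
    used = [False] * len(nums)

    def backtrack():
        if len(path) == len(nums):
            results.append(path[:])
            return
        for i in range(len(nums)):
            if used[i]:
                continue
            # among equal values, only take the i-th if the (i-1)-th is in use
            if i > 0 and nums[i] == nums[i - 1] and not used[i - 1]:
                continue
            used[i] = True
            path.append(nums[i])
            backtrack()
            path.pop()
            used[i] = False

    backtrack()
    return results
```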
3. Explain the time complexity of generating all subsets vs. all permutations. Why are they different?
- Subsets: O(n * 2^n). The decision tree is a binary tree of depth n — at each level you decide “include element i or not.” This produces exactly 2^n leaf nodes (subsets). Each subset is copied into the result in O(n) time on average, giving O(n * 2^n) total.
- Permutations: O(n * n!). The decision tree has a branching factor that decreases at each level: n choices at level 0, n-1 at level 1, …, 1 at level n-1. The number of leaves is n * (n-1) * … * 1 = n!. Each permutation is copied in O(n), giving O(n * n!).
- The difference comes from the structure of choices. Subsets are about include/exclude (2 choices per element, independent), while permutations are about ordering (each element goes to exactly one position, choices depend on what is already used).
- Space for both is O(n) for the recursion stack and current path (not counting the output). If you count the output, subsets use O(n * 2^n) and permutations use O(n * n!).
- Practical implication: n! grows much faster than 2^n. For n=10, 2^10 = 1,024 but 10! = 3,628,800. For n=20, subsets (about 1 million) are tractable, but permutations (about 2.4 * 10^18) are not. This is why interviewers ask “do you need all permutations or can we prune?” — it is not a theoretical question, it is a practical one about whether your code will finish.
Follow-ups:
- “If I only need to count subsets, do I need backtracking?” — No. The count is always 2^n (or C(n,k) for combinations of size k). You only need backtracking when you must enumerate the actual subsets. This is a common DP vs. backtracking decision point.
- “At what input size does generating all permutations become infeasible? How about subsets?” — Permutations: n around 10-12 is the practical limit (10! is about 3.6 million, 12! is about 479 million). Subsets: n around 20-25 is feasible (2^20 is about 1 million, 2^25 is about 33 million). Beyond these, you need to either prune aggressively or rethink the approach.
4. In the N-Queens problem, how does the pruning work, and how would you optimize is_safe from O(n) to O(1)?
- The naive is_safe checks three things for a candidate position (row, col): no queen in the same column, no queen on the upper-left diagonal, and no queen on the upper-right diagonal. We do not check the row because we place exactly one queen per row by construction.
- Each of those three checks walks backward through previously placed rows, making each call O(n) in the worst case. Over the full search, this adds up.
- The O(1) optimization uses three sets (or boolean arrays): cols for occupied columns, diag1 for occupied left-diagonals, and diag2 for occupied right-diagonals. The key insight is that all cells on the same left diagonal share the same value of row - col, and all cells on the same right diagonal share row + col. So is_safe(row, col) becomes col not in cols and (row - col) not in diag1 and (row + col) not in diag2 — three set lookups, O(1).
- When you choose (place a queen), you add to all three sets. When you un-choose, you remove from all three sets. The sets track exactly which constraints are currently active.
- Why interviewers love this: It demonstrates that you think about the data structure behind the constraint check, not just the algorithm. The same principle (precomputing conflict sets) applies to Sudoku solvers and many real constraint-satisfaction problems.
Common mistake: coding the O(n) is_safe and not mentioning the set-based optimization. Or claiming “we need to check all 8 directions” — we only check upward because queens below the current row have not been placed yet.
Follow-ups:
- “Why do row - col and row + col uniquely identify diagonals? Can row - col be negative, and does that cause issues?” — row - col is constant along any top-left-to-bottom-right diagonal, and row + col is constant along any top-right-to-bottom-left diagonal. Yes, row - col can be negative (e.g., row=0, col=3 gives -3). This is fine for sets or hash maps. If using arrays, you offset by n-1 so indices are non-negative: diag1[row - col + n - 1].
- “If I asked you to solve this for just the count of solutions (not the actual board configurations), would you change your approach?” — For counting only, you drop the board construction and just increment a counter at the base case. This saves O(n^2) per solution for board copying. For very large n (say n=15+), there are also mathematical results and bitwise tricks (using bitmasks for cols/diag1/diag2 instead of sets) that push performance further. The bitmask approach uses integer bits to represent occupied columns and diagonals, making the check and update a single bitwise AND/OR.
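The bitmask counting version mentioned in that follow-up can be sketched like this (a standard trick, not code from the original page):

```python
def total_n_queens(n):
    """Count N-Queens solutions using bitmasks for columns and diagonals."""
    full = (1 << n) - 1                      # n low bits set: all columns occupied

    def backtrack(cols, diag1, diag2):
        if cols == full:                     # every column filled: one solution found
            return 1
        count = 0
        free = ~(cols | diag1 | diag2) & full   # set bits = safe columns in this row
        while free:
            bit = free & -free               # take the lowest free column
            free -= bit
            count += backtrack(cols | bit,
                               (diag1 | bit) << 1,   # left diagonal shifts each row
                               (diag2 | bit) >> 1)   # right diagonal shifts back
        return count

    return backtrack(0, 0, 0)
```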
5. Subsets, Combinations, and Permutations all use backtracking -- what are the structural differences in their decision trees?
- Subsets: The decision tree is a binary tree of depth n. At each level, you decide “include element i or skip it.” Every node (not just leaves) is a valid result. The start index moves forward by 1 to avoid revisiting earlier elements. You collect results at every recursive call.
- Combinations (choose k): Same tree structure as subsets, but you only collect results at nodes where path.length == k. You add a pruning condition: if the remaining elements are fewer than the remaining slots (n - i + 1 < k - path.length), stop early. This pruning is the key difference from subsets.
- Permutations: The tree has a different shape entirely. At level 0, you have n choices. At level 1, you have n-1 (everything except what is used). At level 2, n-2, and so on. You collect only at leaves (when path.length == n). Instead of a start index, you use a used boolean array to skip already-chosen elements, and you loop from 0 every time.
- Quick structural summary:
| Aspect | Subsets | Combinations | Permutations |
|---|---|---|---|
| Loop starts at | start | start | 0 |
| Next call passes | i + 1 | i + 1 | same range, check used |
| Collect results at | every node | nodes where len == k | leaves only (len == n) |
| Branching factor | decreasing (n, n-1, …) | decreasing + pruned | decreasing (n, n-1, …) |
Common mistake: confusing when to use a used array vs. a start index. Also: not knowing that combinations benefit from the “remaining elements” pruning.
Follow-ups:
- “If I change the Combination Sum problem to allow reuse of elements, what changes structurally in the tree?” — You recurse with i (same index) instead of i + 1. This means the branching factor does not decrease — you can pick the same element again. The tree can be deeper (bounded by target / min(candidates)), but pruning with if remaining < 0: return keeps it tractable.
- “Could you convert permutations to use a start index approach instead of a used array?” — Not directly, because permutations need to consider elements before the current index (order matters). The start index approach only looks forward, which is exactly what prevents duplicate subsets. For permutations, you could alternatively use swapping: swap the element at index start with each element from start to n-1, recurse with start+1, then swap back. This avoids the used array but is trickier to handle duplicates with.
6. You are solving Word Search on a grid. Your backtracking solution works but is too slow. How do you optimize it?
- Baseline approach: For each cell in the grid, start a DFS that matches the word character by character. Mark cells as visited (e.g., replace with '#'), explore all 4 directions, then restore. Time: O(m * n * 4^L) where L is the word length.
- Optimization 1 — Early termination: Before starting any DFS, check if the grid even contains all the characters in the word (with correct frequencies). If the word needs three 'a's but the grid only has two, return False immediately. This is O(m * n) and catches many impossible cases.
- Optimization 3 — Prune by remaining length: If the remaining characters in the word exceed the number of unvisited cells reachable from the current position, prune. This is harder to compute exactly but a rough bound helps.
- Optimization 4 — For Word Search II (multiple words): Build a Trie from all words and search once. At each cell, you walk the Trie simultaneously with the grid DFS. If no Trie child exists for the current character, prune that entire branch. Remove words from the Trie once found to avoid redundant searches.
- What does NOT work: Memoization. The state would need to encode (position, set of visited cells, index in word), which is exponential. Backtracking’s strength here is that it avoids storing all states.
Follow-ups:
- “Why can you not use memoization/DP for word search like you can for other grid problems?” — Because the state includes which cells are visited, not just the current position. Two different paths can arrive at the same cell with different visited sets, so you cannot cache by position alone. The visited set makes the state space exponential.
- “If I gave you 10,000 words to search in the same grid, would you run your single-word solution 10,000 times?” — No. You build a Trie from all 10,000 words and do one DFS traversal. At each cell, you walk the Trie. If the current prefix is not in the Trie, you prune immediately. This is Word Search II, and it reduces the combined complexity massively compared to running 10,000 independent searches.
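Optimizations 1 and 2 from the answer above are cheap to implement; a sketch (the helper name is hypothetical, written for this note):

```python
from collections import Counter

def prefilter(board, word):
    """Return None if the word cannot possibly exist in the grid, else the
    (possibly reversed) word whose rarer end comes first."""
    grid_counts = Counter(ch for row in board for ch in row)
    need = Counter(word)
    if any(grid_counts[ch] < cnt for ch, cnt in need.items()):
        return None                          # Optimization 1: not enough characters
    if grid_counts[word[0]] > grid_counts[word[-1]]:
        return word[::-1]                    # Optimization 2: start from the rarer end
    return word
```

Run the DFS only if prefilter returns a word, and search for whatever it returns (checking the reversed word finds the same paths, just walked backward).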
7. Backtracking vs. Dynamic Programming -- when do you pick one over the other?
- Use backtracking when you need to enumerate all valid solutions (all subsets, all permutations, all valid configurations). Backtracking generates each solution explicitly.
- Use DP when you need a single optimal value (minimum cost, maximum profit, total count) and the problem has overlapping subproblems. DP collapses repeated work via memoization.
- The overlap zone: Some problems like “count the number of ways to reach target sum” can be solved by either. Backtracking generates all ways and counts them (O(2^n) typically). DP caches intermediate results and counts without generating (often O(n * target)). For counting-only problems, DP almost always wins.
- When backtracking + memoization = DP: If you add a cache to your backtracking solution and the state space is polynomial, you have effectively built a top-down DP. This is a valid approach and sometimes easier to think about. But if the state space is exponential (e.g., involves a bitmask of visited nodes), the “DP” is still exponential — caching does not magically fix it.
- Decision rule of thumb: Ask yourself: “Do I need the actual solutions, or just a count/optimum?” If actual solutions, backtracking. If count/optimum with overlapping subproblems, DP. If count/optimum without overlapping subproblems, backtracking (or greedy).
- Example: “Partition a set into two subsets with equal sum” — DP (you need yes/no, not the actual partition). “Generate all partitions of a set into two subsets with equal sum” — backtracking.
Follow-ups:
- “Can you give me a problem where backtracking is strictly better than DP?” — N-Queens. The state space (which columns and diagonals are occupied) can be represented as a bitmask, but there is no overlapping-subproblem structure to exploit. Each configuration of placed queens is unique. DP’s caching would add overhead without saving any work. Backtracking with pruning is the natural fit.
- “What about a problem where you initially think backtracking but DP is far superior?” — “How many distinct ways can you climb n stairs taking 1 or 2 steps at a time?” You could backtrack and enumerate all paths (O(2^n)), but it has classic overlapping subproblems: the number of ways from step k is the same regardless of how you got to step k. DP solves it in O(n).
8. In Combination Sum, what changes when elements can be reused vs. when they cannot? Walk me through both.
- Reusable elements (Combination Sum I): Recurse with index i (same index). This allows the same element to be picked again in the next recursive call. The tree can go deep — bounded by target / min(candidates).
- Non-reusable elements (Combination Sum II): Recurse with index i + 1. Each element can only be used once. The tree depth is bounded by the number of elements.
- That single change — i vs. i + 1 in the recursive call — is the entire structural difference.
- Additional complexity for Combination Sum II: The input may contain duplicate values (e.g., [1, 1, 2, 5, 6, 7, 10]). You need to sort the array and skip duplicates at the same recursion level (if i > start and candidates[i] == candidates[i-1]: continue) to avoid producing duplicate combinations.
- Pruning is critical for both: Sort the candidates first. In the loop, if candidates[i] > remaining, break (not continue). Because the array is sorted, all subsequent candidates are even larger, so the entire rest of the loop is useless. This single optimization can cut runtime by 10x+ on large inputs.
- Example: Candidates [2, 3, 6, 7], target = 7. With reuse: [2, 2, 3] and [7] are valid (we used 2 twice). Without reuse: only [7] works, because we cannot use 2 twice.
Common mistake: using continue instead of break when a sorted candidate exceeds the remaining target — this means you do not understand why sorting enables the optimization.
Follow-ups:
- “Why break and not continue when the candidate exceeds the remaining target?” — Because the array is sorted. If candidates[i] exceeds the remaining target, then candidates[i+1], candidates[i+2], etc. all exceed it too. continue would pointlessly check every remaining candidate. break skips them all in one step.
- “What if I told you the candidates are not sorted and you cannot sort them — can you still prune effectively?” — Without sorting, you cannot break early because a large candidate might be followed by a smaller one that fits. You can only continue on individual candidates that are too large. This is strictly less efficient. In an interview, you should always sort first unless the problem explicitly forbids it or the order of candidates matters for the output.
9. You wrote a backtracking solution but it produces duplicate results and you cannot figure out why. Walk me through your debugging process.
- Step 1 — Print the decision tree. Add a print statement at the start of each recursive call showing the current path, the start index, and the choice being made. Indent by recursion depth. This visualizes the actual tree your code is exploring.
- Step 2 — Check for the “same level duplicate” bug. If the input has duplicate values, verify that you sorted the input AND added the skip condition if i > start and nums[i] == nums[i-1]: continue. The most common cause of duplicate results is a missing skip condition.
- Step 3 — Verify the start index is advancing correctly. If you pass start instead of i + 1 to the recursive call, you allow reuse of elements. This is correct for Combination Sum I but wrong for Subsets and Combinations. Check which variant you are solving.
- Step 4 — Check if you are collecting results at the right nodes. Subsets collect at every node. Combinations and permutations collect only at specific depths. Collecting at the wrong point produces extra or missing results.
- Step 5 — Reduce to a tiny input. Use input [1, 2, 2] or [1, 1] and manually trace the expected output vs. actual output. Compare with the printed decision tree from Step 1.
- Step 6 — Verify the un-choose step resets ALL state. If your choose step modifies multiple things (e.g., appending to path AND marking a cell visited AND adding to a set), the un-choose step must undo ALL of them. A partial undo is a subtle bug.
- In my experience, 90% of backtracking duplicate bugs come from exactly two causes: missing sort + skip for duplicate inputs, or passing the wrong index to the recursive call.
- “Your backtracking solution returns an empty result. What are the most common causes?” — Three main causes: (1) you never reach the base case because your recursion never makes progress (infinite loop or wrong loop bounds), (2) you are saving `path` directly instead of `path[:]`, so all results point to the same empty list after recursion completes, (3) your `is_valid` function is too restrictive and prunes every branch.
- “How would you write a test to catch backtracking bugs before they reach production (or the interviewer)?” — Generate expected results for small inputs by hand, then assert on both the count and the content. For subsets, assert `len(result) == 2^n`. For permutations, assert `len(result) == n!`. Also assert that every result is unique and satisfies the constraints. These invariant checks catch most bugs immediately.
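The invariant-test idea can be sketched in Python. The `subsets` and `check_invariants` names here are illustrative, not from any particular library:

```python
def subsets(nums):
    """Generate all subsets via backtracking (choose / explore / un-choose)."""
    result = []

    def backtrack(start, path):
        result.append(path[:])           # collect at every node
        for i in range(start, len(nums)):
            path.append(nums[i])         # choose
            backtrack(i + 1, path)       # explore
            path.pop()                   # un-choose

    backtrack(0, [])
    return result

def check_invariants(nums):
    """Assert the count, uniqueness, and membership invariants."""
    result = subsets(nums)
    assert len(result) == 2 ** len(nums)                    # count invariant
    assert len({tuple(s) for s in result}) == len(result)   # uniqueness
    assert all(set(s) <= set(nums) for s in result)         # constraint check

check_invariants([1, 2, 3])
```

A failed `check_invariants` immediately distinguishes “too few results” (missing branch) from “duplicates” (missing skip condition).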
10. Design a Sudoku solver using backtracking. What pruning strategies make it practical?
- Basic approach: Find the next empty cell, try digits 1-9, check if the digit is valid (not already in the same row, column, or 3x3 box), place it, recurse to the next empty cell. If no digit works, backtrack (reset the cell to empty and return `False`).
- Why naive is slow: A 9x9 Sudoku has up to 81 empty cells. Without pruning, you try 9 options per cell, giving a worst case near 9^81 — astronomically large. Practical Sudoku solving requires aggressive pruning.
- Pruning strategy 1 — Precompute constraints. Maintain three data structures: `rows[9]`, `cols[9]`, and `boxes[9]`, each a set of digits already placed. Checking validity becomes three O(1) set lookups instead of scanning the row/col/box each time. The box index for cell `(r, c)` is `(r // 3) * 3 + c // 3`.
- Pruning strategy 2 — Choose the most constrained cell first (MRV heuristic). Instead of filling cells left to right, pick the empty cell with the fewest valid options. If a cell has only one valid digit, fill it immediately. If a cell has zero valid options, backtrack immediately without trying anything. This is the “Minimum Remaining Values” heuristic and it dramatically reduces the tree size.
- Pruning strategy 3 — Constraint propagation. After placing a digit, propagate: if any peer cell (same row/col/box) now has only one valid option, fill it too (and propagate recursively). This is what makes the solver feel “instant” on most puzzles. It is the technique behind Peter Norvig’s famous Sudoku solver.
- Pruning strategy 4 — Naked pairs/triples. If two cells in the same row can only contain digits `{3, 7}`, then no other cell in that row can contain 3 or 7. This is an advanced constraint propagation technique.
- Example: On a typical newspaper Sudoku (about 25 empty cells), the MRV heuristic plus basic constraint propagation solves it in under 100 recursive calls. Without these, the same puzzle might need millions of calls.
Follow-ups:
- “How would you represent the board and constraints to minimize memory and maximize speed?” — Use a 9x9 array for the board. For constraints, use bitmasks: each row, column, and box gets a 9-bit integer where bit `d` is set if digit `d+1` is present. Checking validity is a bitwise AND, and updating is a bitwise OR. This is faster than sets due to cache locality and avoids hash overhead.
- “What is the worst-case input for a Sudoku solver, and how does your pruning handle it?” — The worst case is a nearly empty board (few given digits) with many valid solutions. MRV helps less because many cells have similar option counts. Constraint propagation helps less because there are fewer forced moves. In the absolute worst case, the solver degenerates to near brute-force. However, for any valid Sudoku (unique solution), the combination of MRV + propagation keeps the search tree manageable — typically under 10,000 nodes.
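A minimal sketch of Pruning strategy 1 — a solver backed by `rows`/`cols`/`boxes` sets, without MRV or propagation. The `solve_sudoku` name and the `'.'`-for-empty convention are assumptions (matching LC 37's format):

```python
def solve_sudoku(board):
    """Solve a 9x9 board (list of lists of chars) in place. Empty cells are '.'.
    Precomputed row/col/box sets give O(1) validity checks."""
    rows = [set() for _ in range(9)]
    cols = [set() for _ in range(9)]
    boxes = [set() for _ in range(9)]
    empties = []
    for r in range(9):
        for c in range(9):
            d = board[r][c]
            if d == '.':
                empties.append((r, c))
            else:
                rows[r].add(d)
                cols[c].add(d)
                boxes[(r // 3) * 3 + c // 3].add(d)

    def backtrack(k):
        if k == len(empties):
            return True                       # every empty cell filled
        r, c = empties[k]
        b = (r // 3) * 3 + c // 3
        for d in '123456789':
            if d in rows[r] or d in cols[c] or d in boxes[b]:
                continue                      # three O(1) lookups
            board[r][c] = d                   # choose
            rows[r].add(d); cols[c].add(d); boxes[b].add(d)
            if backtrack(k + 1):              # explore
                return True
            board[r][c] = '.'                 # un-choose: reset ALL state
            rows[r].discard(d); cols[c].discard(d); boxes[b].discard(d)
        return False

    return backtrack(0)
```

MRV would replace the fixed `empties[k]` order with “pick the empty cell with the fewest remaining candidates”; the choose/un-choose skeleton stays the same.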
Interview Deep-Dive
When would you choose backtracking over other paradigms like BFS, DP, or greedy? What signals tell you backtracking is the right tool?
- Backtracking is the right choice when the problem asks you to enumerate all valid configurations or find any valid configuration that satisfies a set of constraints. The key signals are: “generate all,” “find all,” “list every,” or any constraint-satisfaction framing (Sudoku, N-Queens, crossword filling).
- Backtracking over BFS: BFS explores level by level and is ideal for shortest-path problems. Backtracking explores depth-first and is ideal when you need complete solutions, not shortest ones. If the question says “minimum steps,” think BFS. If it says “all valid arrangements,” think backtracking.
- Backtracking over DP: DP computes a single optimal value or count by collapsing overlapping subproblems. Backtracking enumerates actual solutions. If you only need the count of subsets summing to K, DP is O(n * K). If you need to list every such subset, backtracking is unavoidable because the output itself can be exponential.
- Backtracking over greedy: Greedy makes irrevocable local choices. Backtracking tries a choice, explores, and undoes it if it fails. If a greedy counterexample exists (local optimum does not lead to global optimum), backtracking or DP is needed. But if you only need one valid solution and greedy finds it, greedy is cheaper.
- The decision heuristic: Ask “do I need the actual solutions or just a count/optimum?” If actual solutions, backtracking. If count/optimum with overlapping subproblems, DP. If count/optimum with greedy-choice property, greedy.
- Complexity awareness: Backtracking is inherently exponential (O(2^n) for subsets, O(n!) for permutations). If n exceeds ~20-25 for subsets or ~10-12 for permutations, the solution will not finish in time without aggressive pruning.
Your backtracking solution is correct but too slow for the given constraints. What concrete optimizations do you apply, and in what order?
- First, profile where time is spent. In backtracking, the bottleneck is almost always the branching factor at upper levels of the tree, because exponentials amplify early branching.
- Optimization 1 — Better pruning. Add constraint checks earlier. For example, in Combination Sum, sort candidates and `break` (not `continue`) when a candidate exceeds the remaining target, eliminating the entire right subtree.
- Optimization 2 — Variable ordering (MRV). Choose the most constrained variable first. In N-Queens, place queens in the row with the fewest valid columns first. This prunes more branches higher in the tree where the payoff is largest.
- Optimization 3 — Precompute validity structures. Replace O(n) validity checks with O(1) set/bitmask lookups. For N-Queens, use `cols`, `diag1`, `diag2` sets instead of scanning placed queens.
- Optimization 4 — Symmetry breaking. If the problem has symmetry (e.g., the first queen in N-Queens can be placed in only half the columns due to mirror symmetry), exploit it to halve the search space.
- Optimization 5 — Early termination. If you only need one solution (not all), return immediately upon finding the first valid one instead of continuing to explore.
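Optimization 5 changes the recursion to return a boolean and stop on the first success. A sketch on a hypothetical subset-sum search (`first_solution` is an illustrative name), which also shows the Optimization 1 break-prune:

```python
def first_solution(nums, target):
    """Return ONE subset of nums summing to target, or None.
    Early termination: propagate True upward and skip all remaining branches."""
    path = []

    def backtrack(start, remaining):
        if remaining == 0:
            return True                  # found one: unwind immediately
        for i in range(start, len(nums)):
            if nums[i] > remaining:
                break                    # sorted break-prune (Optimization 1)
            path.append(nums[i])         # choose
            if backtrack(i + 1, remaining - nums[i]):
                return True              # success: do NOT un-choose, keep path
            path.pop()                   # un-choose only on failure
        return False

    nums.sort()
    return path if backtrack(0, target) else None
```

Compare with the enumerate-all version, which would keep exploring siblings after every hit; here the first `True` short-circuits every loop on the stack.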
You need to generate all unique combinations that sum to a target from a list with duplicates. What is the exact complexity, and what is the single most impactful optimization?
- This is Combination Sum II (LeetCode 40). The input has duplicates, each element can be used at most once, and we need unique combinations.
- Approach: Sort the array. Backtrack with `start = i + 1` (no reuse). Skip duplicates at the same recursion level: `if i > start and candidates[i] == candidates[i-1]: continue`.
- Exact complexity: The worst case is O(2^n) combinations (every subset could be valid if all elements are distinct and all sums happen to equal the target). Each combination takes O(k) to copy, where k is the combination length, so the worst-case output-sensitive complexity is O(k * 2^n). In practice, the target constraint prunes most branches.
- The single most impactful optimization: Sort the candidates and use `break` instead of `continue` when `candidates[i] > remaining`. Because the array is sorted, once one candidate exceeds the remaining budget, all subsequent candidates do too. This turns a linear scan of remaining candidates into an instant termination. On inputs with many candidates larger than the target, this can reduce the search space by 90% or more.
- Space complexity: O(n) for the recursion stack and current path (not counting output). Sorting is O(n log n) but dominated by the exponential search.
- Edge cases that break naive implementations: (1) All elements are identical — the skip condition prevents generating the same combination multiple times. (2) Target is 0 — the empty combination is valid; make sure your base case handles this. (3) All elements exceed the target — return empty immediately with the `break` optimization.
- “How does the analysis change if each candidate can be reused (as in Combination Sum I)?”
- The recursion depth is no longer bounded by n (the number of candidates). It is bounded by `target / min(candidates)`. The branching factor at each level is the number of candidates that do not exceed the remaining target.
- Worst-case complexity becomes O(n^(target/min_candidate)), which can be much larger than O(2^n) for small candidates and large targets. For example, candidates = [1], target = 100 produces a single combination [1, 1, …, 1] of length 100, but the tree explores every prefix along the way.
- The key structural difference: with reuse, the recursive call passes `i` (the same index), so the tree can go arbitrarily deep. Without reuse, it passes `i + 1`, bounding the depth to n.
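The reuse-allowed variant (Combination Sum I, LC 39) can be sketched as follows — note the recursive call passes `i`, not `i + 1`:

```python
def combination_sum(candidates, target):
    """LC 39: each candidate may be reused any number of times."""
    candidates.sort()
    result = []

    def backtrack(start, remaining, path):
        if remaining == 0:
            result.append(path[:])       # copy, not a reference
            return
        for i in range(start, len(candidates)):
            if candidates[i] > remaining:
                break                    # sorted: everything after is too big
            path.append(candidates[i])   # choose
            # Pass i (not i + 1): the same candidate stays available,
            # which is what lets the tree reach depth target / min(candidates).
            backtrack(i, remaining - candidates[i], path)
            path.pop()                   # un-choose

    backtrack(0, target, [])
    return result
```

With `candidates = [1]` and a large target, the recursion goes one level per unit of target, illustrating the `target / min(candidates)` depth bound above.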
Explain how backtracking on a grid (like Word Search) differs from backtracking on an array (like Permutations). What unique edge cases does grid backtracking introduce?
- Structural difference: Array backtracking has a 1D decision space — at each level you pick from a set of candidates. Grid backtracking has a 2D decision space — at each step you move in one of 4 (or 8) directions, and the “candidates” are spatial neighbors rather than array elements.
- Visited tracking differs fundamentally. In permutations, you use a `used` boolean array indexed by element position. In grid problems, you mark the cell itself (either mutating the grid or using a 2D visited matrix). The visited state is path-dependent — you must unmark cells when you backtrack so other paths can use them.
- Grid-specific edge cases:
- Stack overflow on large grids. A 500x500 all-land grid can have DFS depth of 250,000, blowing Python’s default 1000-frame recursion limit. Either increase the limit or convert to iterative DFS with an explicit stack.
- Starting point selection. Unlike array problems where you always start from index 0, grid backtracking may require trying every cell as a starting point (Word Search scans all cells for the first character). This adds an O(m*n) multiplier.
- Boundary checks. Every recursive call must validate `0 <= r < rows and 0 <= c < cols` before accessing the grid. Forgetting this causes index-out-of-bounds errors that do not occur in array backtracking.
- Diagonal vs. cardinal movement. Some problems use 4-directional movement, others use 8. The branching factor jumps from 4 to 8, doubling the work at each level.
- Grid mutation side effects. If you mark visited by setting `grid[r][c] = '#'`, you must restore the original character on backtrack. Forgetting the restoration is the single most common grid backtracking bug.
- Complexity: For Word Search, the worst case is O(m * n * 4^L) where L is word length. Each cell is a potential start (m*n), and from each start, the DFS branches into 4 directions up to L levels deep. In practice, pruning (character mismatch, visited cell) cuts this dramatically.
- Running single-word search once per word costs O(K * m * n * 4^L) for K words: each word triggers an independent grid traversal.
- The Trie approach does one grid traversal. At each cell, it walks the Trie simultaneously with the DFS. If the current prefix has no children in the Trie, the entire DFS branch is pruned. The Trie effectively shares prefix computation across all words.
- For K = 10,000 words with average length 5 on a 300x300 grid, single-word search does 10,000 * 90,000 * 4^5 = ~900 billion operations. The Trie approach does roughly 90,000 * 4^5 = ~90 million operations (shared across all words), with aggressive Trie pruning reducing it further.
- Additional optimization: remove a word from the Trie once found, and prune leaf nodes that no longer lead to any word. This prevents redundant searches for already-found words.
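A sketch of the shared-traversal idea, using a plain-dict Trie with a `'$'` sentinel for complete words (an implementation choice, not the only one). It also includes the remove-found-words and dead-leaf pruning from the last bullet:

```python
def find_words(board, words):
    """LC 212 sketch: one grid traversal, shared across all words via a Trie."""
    trie = {}
    for word in words:                       # build the Trie
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node['$'] = word                     # '$' marks a complete word

    rows, cols = len(board), len(board[0])
    found = []

    def dfs(r, c, node):
        ch = board[r][c]
        if ch not in node:
            return                           # prefix not in Trie: prune branch
        nxt = node[ch]
        word = nxt.pop('$', None)            # popping '$' also dedups the word
        if word is not None:
            found.append(word)
        board[r][c] = '#'                    # mark visited
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                dfs(nr, nc, nxt)
        board[r][c] = ch                     # restore on backtrack
        if not nxt:
            node.pop(ch)                     # prune Trie leaves with no words left

    for r in range(rows):
        for c in range(cols):
            dfs(r, c, trie)
    return found
```

Because dead Trie branches are deleted as words are found, later cells skip entire subtrees that can no longer yield anything new.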
What is the worst-case input for a backtracking algorithm, and how do you identify whether your pruning is effective enough for a given problem's constraints?
- Worst-case inputs are those where pruning eliminates nothing and the algorithm explores the full decision tree. For subsets, this is any input — 2^n subsets always exist. For constraint-satisfaction, the worst case is an input where every partial solution looks promising but ultimately fails at the last step, forcing exhaustive exploration.
- Classic worst-case example: Backtracking on an unsolvable Sudoku board with very few given digits. The solver tries every combination, backtracks at the end, and ultimately reports “no solution.” Every path is explored to maximum depth before failing.
- How to assess pruning effectiveness:
- Count nodes visited. Add a counter that increments at each recursive call. Compare against the theoretical maximum (2^n for subsets, n! for permutations). If your counter is within 10x of the output size, pruning is excellent. If it is 100x or more, pruning is weak.
- Estimate branching factor. If the average branching factor at each level is b and depth is d, the tree has roughly b^d nodes. If your pruning reduces the effective branching factor from b to b’, the speedup is (b/b’)^d — exponential in depth. Even reducing branching from 9 to 7 in Sudoku gives (9/7)^81 which is astronomically large.
- Check constraint tightness. Tight constraints (many cells filled in Sudoku, small target in Combination Sum) prune more. Loose constraints (few given digits, large target) prune less.
- Profile the rejection rate. Track what percentage of candidates at each level are pruned. If 90% are pruned at level 1, that eliminates 90% of the tree. If only 10% are pruned, the tree is barely reduced.
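The node-counter idea can be bolted onto any backtracking function; a sketch instrumenting plain permutations:

```python
def permutations_instrumented(nums):
    """Generate permutations and count recursive calls to gauge pruning."""
    result = []
    used = [False] * len(nums)
    stats = {'nodes': 0}

    def backtrack(path):
        stats['nodes'] += 1              # one tree node per call
        if len(path) == len(nums):
            result.append(path[:])
            return
        for i in range(len(nums)):
            if used[i]:
                continue
            used[i] = True               # choose
            path.append(nums[i])
            backtrack(path)              # explore
            path.pop()                   # un-choose
            used[i] = False

    backtrack([])
    return result, stats['nodes']

perms, nodes = permutations_instrumented([1, 2, 3])
# n = 3: 6 permutations from 1 + 3 + 6 + 6 = 16 calls — nodes/output well
# within the 10x "pruning is excellent" threshold described above.
```

Comparing `nodes` against `len(result)` (and against the theoretical 2^n or n! maximum) turns the “is my pruning working?” question into a measurement.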
- Rule of thumb for interview feasibility: For competitive programming, backtracking solutions with effective pruning can handle n up to about 20-25 (subsets), 10-12 (permutations), or 81 cells (Sudoku with MRV + constraint propagation). If n exceeds these, you need a fundamentally different approach (DP, greedy, math).
- For n = 15 with subsets, 2^15 = 32,768 — trivially fast. For permutations, 15! = 1.3 trillion — way too slow without pruning.
- Estimation method: Calculate the theoretical tree size. If it is under ~10^8 operations for Python or ~10^9 for C++/Java, it will likely pass within 2 seconds. For n = 15 permutations without pruning, 15! exceeds 10^12, so no. But with heavy pruning (like constraint-satisfaction where most branches are cut), the effective tree might be 10^6 — then it passes easily.
- Quick calibration: Python does roughly 10^7 simple operations per second. C++ does roughly 10^8-10^9. If your estimated tree size divided by the pruning factor is under these thresholds, proceed with backtracking. Otherwise, look for DP or a mathematical shortcut.
LeetCode Interview Q&A (Backtracking)
These four are the LeetCode-style backtracking questions you should be able to walk through cold. Each has a strong-answer framework, a worked example with full code, three senior follow-ups, common wrong answers, and pointers to related problems.
LC 47: Generate All Permutations -- Handle Duplicates
- State that LC 46 (distinct elements) uses a `used[]` array. The complication for LC 47 is that the input may contain duplicates — naive code produces duplicate permutations.
- The fix has two parts: (a) sort the input so identical elements are adjacent; (b) at each level, skip an element if the previous identical element has not been used yet (`if i > 0 and nums[i] == nums[i-1] and not used[i-1]: continue`).
- The intuition: among identical elements, always pick them in their original left-to-right order. This breaks the symmetry that would otherwise produce duplicates.
- State complexity: O(n * n!) worst case, but with heavy pruning when there are many duplicates.
Worked example with `nums = [1, 1, 2]` (already sorted):
- Without the dedup check, we would pick “first 1” then “second 1” then 2, AND “second 1” then “first 1” then 2 — two identical permutations.
- With the check: when at depth 0 we try `i = 1` (the second 1), we see `nums[1] == nums[0]` and `used[0] == False`, so we skip. Only one branch starting with 1 is explored.
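Putting the sort and the skip condition together, a sketch of the LC 47 solution:

```python
def permute_unique(nums):
    """LC 47: all unique permutations of an input that may contain duplicates."""
    nums.sort()                          # duplicates become adjacent
    result, path = [], []
    used = [False] * len(nums)

    def backtrack():
        if len(path) == len(nums):
            result.append(path[:])
            return
        for i in range(len(nums)):
            if used[i]:
                continue
            # Same-level dedup: among equal values, only pick them
            # left-to-right; skip nums[i] if its twin to the left is unused.
            if i > 0 and nums[i] == nums[i - 1] and not used[i - 1]:
                continue
            used[i] = True               # choose
            path.append(nums[i])
            backtrack()                  # explore
            path.pop()                   # un-choose
            used[i] = False

    backtrack()
    return result
```

On `[1, 1, 2]` this explores only one branch per distinct value at each level, yielding exactly three permutations instead of six.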
- “What is the difference between checking `used[i-1]` vs `not used[i-1]`? Both versions seem to dedup.” — Both produce correct results, but they prune at different points. `not used[i-1]` means “skip the i-th duplicate if the (i-1)-th has not been picked yet” — this enforces left-to-right order among duplicates and prunes higher in the tree. `used[i-1]` enforces right-to-left order and prunes lower, so it explores more nodes before cutting. The `not used[i-1]` version is faster in practice, though both are O(n * n!) asymptotically.
- “Suppose `nums = [1]*20` — how does the runtime change?” — With heavy duplication, the deduplication is enormously effective. For `[1]*20`, naive backtracking would explore 20! ≈ 2.4 * 10^18 nodes (intractable). With dedup, only 1 permutation exists, and the recursion explores roughly n nodes — linear time. The skip condition is what makes this practical.
- Using a Python `set` to dedup the final results — correct but exponentially slower because it generates all duplicates first.
- Forgetting to sort first — the skip condition only works on adjacent duplicates.
- Writing `if i > 0 and nums[i] == nums[i-1]: continue` without checking `used[i-1]` — this incorrectly skips valid permutations like `[1, 1, 2]` itself.
- LC 46 Permutations — the distinct-elements baseline.
- LC 90 Subsets II — same dedup principle but with a `start` index instead of `used[]`.
LC 51: N-Queens -- Explain Pruning
- State the constraint: place exactly one queen per row. This eliminates the need to check rows.
- For columns and the two diagonal types, maintain three sets so that `is_safe(row, col)` is O(1) instead of O(n).
- Diagonal identification: cells on the same `\` diagonal share `row - col`. Cells on the same `/` diagonal share `row + col`.
- Discuss the bitmask optimization for further speedup (LC 52).
Worked example: place a queen at `(0, 1)`. This puts `0 - 1 = -1` in `diag1` and `0 + 1 = 1` in `diag2`. Now consider `(1, 0)`: the `diag1` check computes `1 - 0 = 1`, which is not in `diag1`, but the `diag2` check computes `1 + 0 = 1`, which IS in `diag2` — so `(1, 0)` is correctly pruned, because `(0, 1)` and `(1, 0)` lie on the same `/` diagonal.
Bitmask variant: to place a queen at `(row, col)`, update three masks:
- `cols |= (1 << col)`
- `diag1 |= (1 << (row - col + n - 1))` (the offset keeps the shift non-negative)
- `diag2 |= (1 << (row + col))`
The `is_safe` check then tests each mask against its candidate bit with a bitwise AND — a handful of CPU instructions with no hashing. This is roughly 5-10x faster than set operations for large n. LC 52 (counting only, no boards) is where this optimization shines.
- Checking all 8 directions including downward — unnecessary because rows below have not been filled yet.
- Using O(n) `is_safe` checks (scanning previously placed queens) instead of O(1) set lookups — works for small n but TLEs for n above roughly 10.
- Forgetting that diagonals come in two types (`\` and `/`) and only tracking one.
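The three-set version described in the strong answer, sketched in Python:

```python
def solve_n_queens(n):
    """LC 51: all boards, with O(1) safety checks via three sets."""
    results = []
    cols, diag1, diag2 = set(), set(), set()   # diag1 keys: r - c; diag2: r + c
    placement = []                             # placement[r] = queen's column in row r

    def backtrack(row):
        if row == n:
            results.append(["." * c + "Q" + "." * (n - c - 1) for c in placement])
            return
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue                       # attacked: prune this column
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            placement.append(col)              # choose
            backtrack(row + 1)                 # explore
            placement.pop()                    # un-choose ALL state
            cols.discard(col); diag1.discard(row - col); diag2.discard(row + col)

    backtrack(0)
    return results
```

The bitmask variant replaces the three sets with three integers and the membership tests with AND operations; the recursion shape is identical.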
- LC 52 N-Queens II — counting variant where bitmask optimization shines.
- LC 37 Sudoku Solver — same constraint-set principle applied to a 9x9 grid.
LC 79: Word Search -- Backtracking on a Grid
- For every cell, start a DFS. At each step, check that the current grid character matches `word[idx]`, mark the cell as visited (with a sentinel like `'#'`), and try all 4 directions.
- CRITICAL: restore the cell to its original character on backtrack. Without this, sibling DFS paths cannot reuse the cell.
- Complexity: O(m * n * 4^L) where L is the word length. The `4^L` term is loose; in practice, the character match prunes aggressively.
- Discuss optimizations: an early character-frequency check, and reversing the word if its last character is rarer.
Worked example with `board = [["A","B","C","E"],["S","F","C","S"],["A","D","E","E"]]` and `word = "ABCCED"`:
- Start at `(0, 0) = 'A'`, matches `word[0]`. Mark as `#`, try directions.
- Move to `(0, 1) = 'B'`, matches `word[1]`. Mark.
- Move to `(0, 2) = 'C'`, matches `word[2]`. Mark.
- Move to `(1, 2) = 'C'`, matches `word[3]`. Mark.
- Move to `(2, 2) = 'E'`, matches `word[4]`. Mark.
- Move to `(2, 1) = 'D'`, matches `word[5]`. The full length matches. Return True.
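The traced DFS above, sketched in Python with in-place `'#'` marking and restoration on backtrack:

```python
def exist(board, word):
    """LC 79: DFS from every cell; mark visited with '#', restore on backtrack."""
    rows, cols = len(board), len(board[0])

    def dfs(r, c, idx):
        if idx == len(word):
            return True                      # matched every character
        if not (0 <= r < rows and 0 <= c < cols) or board[r][c] != word[idx]:
            return False                     # out of bounds or mismatch: prune
        saved, board[r][c] = board[r][c], '#'    # choose: mark visited
        found = any(dfs(r + dr, c + dc, idx + 1)
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))
        board[r][c] = saved                  # un-choose: ALWAYS restore
        return found

    return any(dfs(r, c, 0) for r in range(rows) for c in range(cols))
```

Note that the restoration line runs whether `found` is True or False — the board is back to its original state after every call, which is exactly the invariant the “CRITICAL” bullet demands.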
Follow-ups:
- “Can you memoize Word Search the way you would a DP problem?” — The full state is `(r, c, idx, visited_set)`. The visited set makes the state space exponential — two paths arriving at `(r, c, idx)` may have different visited cells, so they are NOT equivalent, and you cannot cache by `(r, c, idx)` alone. This is why backtracking (which avoids storing all states) is the right tool.
- “How does a Trie change the complexity for multiple words (LC 212)?” — It turns O(K * m * n * 4^L) for K independent single-word searches into roughly O(m * n * 4^L) shared across all words.
- “What is the worst-case input?” — A grid of all `'A'`s with `word = "A" * (m*n) + "B"`. The DFS explores every path of length m*n + 1, never finds the B, and backtracks fully. This is the classic adversarial input. In practice, real test cases with diverse characters prune early.
- Forgetting to restore the cell after recursion — subsequent paths cannot revisit it, so valid words are reported as not found.
- Using a separate `visited` set instead of in-place marking — correct, but it uses O(m*n) extra space where in-place marking adds only the O(L) recursion stack.
- Starting only at cells matching `word[0]` (an optimization) but forgetting to handle the empty-word edge case — LeetCode guarantees the word length is at least 1, but always check the constraint statement.
- LC 212 Word Search II — the Trie generalization.
- LC 980 Unique Paths III — grid backtracking with a “visit every cell exactly once” constraint.
LC 39 vs LC 40: Combination Sum -- The One-Index Difference
- Both problems ask for combinations summing to a target. The difference is whether each candidate can be used multiple times (LC 39) or only once (LC 40).
- The structural difference is one character: recurse with `i` (reuse allowed) vs `i + 1` (no reuse).
- LC 40 may have duplicates in the input — it requires sort plus skip-equal-at-depth.
- Both benefit from sorting plus `break`-pruning when the candidate exceeds the remaining target.
Worked example (LC 39) with `candidates = [2, 3, 6, 7]`, `target = 7`:
- `[2, 2, 3]`: pick 2, recurse with `i = 0` (so 2 is reusable), taking remaining from 7 to 5, then 3, then 0.
- `[7]`: pick 7 directly, remaining = 0.
- Note: we never get `[3, 2, 2]` because we always go forward by index, preventing same-multiset-different-order duplicates.
Worked example (LC 40) with `candidates = [10, 1, 2, 7, 6, 1, 5]`, `target = 8`:
- After sorting: `[1, 1, 2, 5, 6, 7, 10]`.
- Valid combinations: `[1, 1, 6]`, `[1, 2, 5]`, `[1, 7]`, `[2, 6]`.
- Without `if i > start and candidates[i] == candidates[i-1]: continue`, you would also get a duplicate `[1, 7]` (one with each of the two 1s).
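The LC 40 solution combining sort, same-level skip, and break-prune can be sketched as:

```python
def combination_sum2(candidates, target):
    """LC 40: unique combinations, each candidate used at most once."""
    candidates.sort()                    # needed by BOTH the skip and the break
    result = []

    def backtrack(start, remaining, path):
        if remaining == 0:
            result.append(path[:])
            return
        for i in range(start, len(candidates)):
            if candidates[i] > remaining:
                break                    # sorted: all later candidates too big
            if i > start and candidates[i] == candidates[i - 1]:
                continue                 # skip the same value at the SAME level
            path.append(candidates[i])   # choose
            backtrack(i + 1, remaining - candidates[i], path)  # i + 1: no reuse
            path.pop()                   # un-choose

    backtrack(0, target, [])
    return result
```

Changing the recursive call's `i + 1` to `i` is the single-character edit that turns this into the LC 39 (reuse-allowed) variant.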
Follow-ups:
- “Why `i > start` and not `i > 0` for the duplicate check?” — `i > 0` would prevent picking `[1, 1, 6]`, because the second 1 (at depth 1, with `start = 1`) would be skipped due to `nums[1] == nums[0]`. We DO want to pick both 1s in different positions of the same combination — that is valid. `i > start` only prevents picking the same value twice AT THE SAME DEPTH, which is what causes duplicate combinations.
- “Why `break` and not `continue` when `candidates[i] > remaining`?” — Because the array is sorted. If `candidates[i]` already exceeds `remaining`, then `candidates[i+1]`, `candidates[i+2]`, etc. are all larger and cannot fit either. `continue` would pointlessly check each; `break` skips them all in one step. On large inputs with widely varying candidate magnitudes, this can be a 10x speedup.
- “What is the complexity of LC 39, where reuse is allowed?” — The recursion depth is bounded by `target / min(candidates)` and the branching factor is at most n, so the worst case is roughly O(n^(target/min)). For candidates `[1]` and `target = 100`, there is 1 combination, but the tree explores every prefix `[1]`, `[1, 1]`, …, `[1]*100` — 100 nodes. A tighter bound is O(target * S), where S is the number of valid combinations. Most interviewers accept “exponential, with depth bounded by target/min and branching factor n” as the answer.
Because the array is sorted. If candidates[i] already exceeds remaining, then candidates[i+1], candidates[i+2], etc. are all larger and cannot fit either. continue would pointlessly check each. break skips them all in one step. On large inputs with widely-varying candidate magnitudes, this can be a 10x speedup.target / min(candidates) and the branching factor is at most n. Worst case is roughly O(n^(target/min)). For candidates [1] and target=100, this is 1 combination but the tree explores every prefix [1], [1,1], …, [1]*100, which is 100 nodes. So a tighter bound is O(target * S) where S is the number of valid combinations. Most interviewers accept “exponential, bounded by target/min branching factor n” as the answer.- Recursing with
i + 1in LC 39 — breaks reuse. You miss combinations like[2, 2, 3]. - Recursing with
iin LC 40 — allows reuse. You generate combinations that should not exist. - Using
continueinstead ofbreakfor the prune check — correctness preserved but huge perf hit. - Forgetting to sort before doing
break-prune — the prune is incorrect on unsorted input.
- LC 216 Combination Sum III — fixed pool (1-9), fixed count k.
- LC 377 Combination Sum IV — counting-only variant; backtracking is too slow, this is actually a DP problem.