
Backtracking Pattern

What is Backtracking?

Backtracking is a systematic way to explore all possible solutions by building candidates incrementally and abandoning a path (“backtracking”) as soon as it violates constraints. Think of it like navigating a maze. At every fork, you pick a direction and keep walking. If you hit a dead end, you retrace your steps to the last fork and try the next direction. You do not restart from the entrance each time — you only undo the most recent decision. That “undo and try something else” loop is the heart of backtracking.

More formally, backtracking explores a decision tree. Each node represents a partial solution, each edge represents a choice, and leaves are either valid solutions or dead ends. The algorithm performs a depth-first walk of this tree, pruning entire subtrees the moment a constraint is violated, which is what makes it dramatically faster than brute force in practice.

Complexity insight: Without pruning, backtracking visits every leaf of the decision tree (often O(2^n) or O(n!)). Good pruning can cut this down by orders of magnitude, but worst-case exponential behavior is inherent to the approach. Always discuss pruning strategies when explaining backtracking in an interview.
Quick Recognition: If you need to generate all possibilities, find all valid combinations, or solve constraint satisfaction problems, think Backtracking!
LeetCode Pattern Recognition Cheatsheet — if you see these phrases in the problem statement, backtracking is almost always the right tool:
  • “all permutations” — LC 46, LC 47. Use used[] boolean array; loop from index 0 each time.
  • “all subsets / power set” — LC 78, LC 90. Use start index; collect at every recursive call.
  • “all combinations summing to N” — LC 39 (with reuse), LC 40 (without reuse), LC 216, LC 377. Sort first, prune when remaining is negative, break when current candidate exceeds remaining.
  • “N-queens” — LC 51, LC 52. Place one queen per row; track occupied columns and diagonals.
  • “sudoku solver” — LC 37. Fill cells; use bitmasks for row/col/box constraints; choose most-constrained cell first (MRV heuristic).
  • “word search on grid” — LC 79, LC 212. Mark visited cells with a sentinel like '#', restore after recursion. For multiple words, use a Trie (LC 212).
  • “generate parens” or “all valid X” — LC 22, LC 17, LC 93 (Restore IP). Build incrementally, prune by validity invariants (open count, close count).
  • “partition string into palindromes” — LC 131. Try every prefix, if it is a palindrome, recurse on the rest.
If the problem asks for the count or the single best answer (not all answers), prefer DP. Backtracking is for enumeration.

Pattern Recognition Checklist

Use Backtracking When

  • Need all combinations/permutations
  • Problem is about constraint satisfaction
  • Generate all valid solutions
  • Need to explore decision tree
  • Pruning can reduce search space

Don't Use When

  • Only need one solution (might use simpler approach)
  • Optimal solution needed (use DP instead)
  • Counting only (DP often better)
  • No way to prune invalid paths

When to Use

Combinations/Permutations

Generate all arrangements or selections

Subsets

Generate all possible subsets

Constraint Satisfaction

Sudoku, N-Queens, crosswords

Path Finding

Find all paths, word search

Core Template

Every backtracking solution follows three steps: choose, explore, un-choose. Memorize this skeleton and you can adapt it to any variant.
def backtrack(path, choices):
    # BASE CASE: we have built a complete candidate
    if is_solution(path):
        result.append(path[:])  # Save a COPY -- path will keep mutating
        return
    
    for choice in choices:
        if is_valid(choice, path):  # PRUNE: skip choices that violate constraints
            # 1. CHOOSE -- commit to this option
            path.append(choice)
            
            # 2. EXPLORE -- recurse with the updated partial solution
            backtrack(path, updated_choices)
            
            # 3. UN-CHOOSE -- undo so the next iteration starts clean
            path.pop()

Pattern Variations

1. Subsets (All Combinations)

For subsets, every node in the decision tree is a valid answer (not just leaves). At each position we decide “include this element or not” and we only look forward to avoid duplicates like picking [1, 2] and [2, 1], which are the same subset in a different order. Time: O(n * 2^n) — 2^n subsets, each copied in O(n). Space: O(n) recursion depth.
def subsets(nums):
    """Generate all subsets of nums"""
    result = []
    
    def backtrack(start, current):
        # Every partial path is a valid subset -- collect it immediately
        result.append(current[:])
        
        for i in range(start, len(nums)):
            current.append(nums[i])       # CHOOSE: include nums[i]
            backtrack(i + 1, current)      # EXPLORE: only consider elements after i
            current.pop()                  # UN-CHOOSE: remove nums[i] before next iteration
    
    backtrack(0, [])
    return result

# Walkthrough with [1,2,3]:
#   start=0, current=[]       -> save []
#     i=0: current=[1]        -> save [1]
#       i=1: current=[1,2]    -> save [1,2]
#         i=2: current=[1,2,3]-> save [1,2,3]
#       i=2: current=[1,3]    -> save [1,3]
#     i=1: current=[2]        -> save [2]
#       i=2: current=[2,3]    -> save [2,3]
#     i=2: current=[3]        -> save [3]
# Result: [[], [1], [1,2], [1,2,3], [1,3], [2], [2,3], [3]]

2. Permutations

Unlike subsets, permutations use every element exactly once and order matters. We track which elements are already in the current arrangement with a used boolean array. Only leaf nodes (full-length arrangements) are collected. Time: O(n * n!) — n! permutations, each copied in O(n). Space: O(n) for the used array and recursion stack.
def permutations(nums):
    """Generate all permutations of nums"""
    result = []
    used = [False] * len(nums)
    
    def backtrack(current):
        # BASE CASE: arrangement is complete when it includes all elements
        if len(current) == len(nums):
            result.append(current[:])
            return
        
        for i in range(len(nums)):
            if used[i]:          # Skip elements already in the current permutation
                continue
            
            used[i] = True       # CHOOSE: mark element as used
            current.append(nums[i])
            
            backtrack(current)   # EXPLORE: build rest of the permutation
            
            current.pop()        # UN-CHOOSE: release element for other positions
            used[i] = False
    
    backtrack([])
    return result

3. Combinations (Choose K)

Combinations are subsets of a fixed size k. The key optimization is pruning: if there are not enough remaining numbers to fill the combination, stop early instead of recursing pointlessly. For example, if you need 3 more numbers but only 2 remain, prune immediately. Time: O(k * C(n,k)) — C(n,k) combinations, each copied in O(k). Space: O(k) recursion depth.
def combinations(n, k):
    """Choose k numbers from 1 to n"""
    result = []
    
    def backtrack(start, current):
        if len(current) == k:       # We have selected k numbers -- save and stop
            result.append(current[:])
            return
        
        # PRUNE: if remaining numbers (n - i + 1) < needed (k - len(current)), stop
        remaining_needed = k - len(current)
        for i in range(start, n - remaining_needed + 2):
            current.append(i)           # CHOOSE number i
            backtrack(i + 1, current)   # EXPLORE: only pick larger numbers next
            current.pop()               # UN-CHOOSE number i
    
    backtrack(1, [])
    return result

4. N-Queens

The classic constraint satisfaction problem. We place one queen per row and use is_safe to prune placements that conflict with previously placed queens. We only need to check upward (rows already filled) because we place queens top to bottom. Time: approximately O(n!) with pruning. Space: O(n^2) for the board. Interview tip: You can speed up is_safe from O(n) to O(1) by maintaining sets for columns, left-diagonals (row - col), and right-diagonals (row + col). Mention this optimization even if you code the simpler version.
def solve_n_queens(n):
    """Place n queens on n x n board so no two attack each other"""
    result = []
    board = [['.'] * n for _ in range(n)]
    
    def is_safe(row, col):
        # Check column -- has any queen been placed in this column above?
        for i in range(row):
            if board[i][col] == 'Q':
                return False
        
        # Check upper-left diagonal -- walk diagonally up-left
        i, j = row - 1, col - 1
        while i >= 0 and j >= 0:
            if board[i][j] == 'Q':
                return False
            i -= 1
            j -= 1
        
        # Check upper-right diagonal -- walk diagonally up-right
        i, j = row - 1, col + 1
        while i >= 0 and j < n:
            if board[i][j] == 'Q':
                return False
            i -= 1
            j += 1
        
        return True  # No conflicts -- this position is safe
    
    def backtrack(row):
        if row == n:  # All queens placed successfully
            result.append([''.join(r) for r in board])
            return
        
        for col in range(n):           # Try each column in this row
            if is_safe(row, col):      # PRUNE: skip columns that conflict
                board[row][col] = 'Q'  # CHOOSE: place queen
                backtrack(row + 1)     # EXPLORE: move to next row
                board[row][col] = '.'  # UN-CHOOSE: remove queen
    
    backtrack(0)
    return result

Classic Problems

Subsets (LC 78)
  • Pattern: Include or exclude each element.
  • Key: Every path is a valid subset — collect at every recursive call, not just leaves.
  • Complexity: O(n * 2^n) time, O(n) space (excluding output).
  • Edge case: If the input contains duplicates, sort first and skip nums[i] == nums[i-1] at the same recursion level to avoid duplicate subsets.

Permutations (LC 46)
  • Pattern: Use a boolean array to track used elements.
  • Key: All positions are valid choices. Unlike subsets, we loop from index 0 each time instead of a start index.
  • Complexity: O(n * n!) time, O(n) space.
  • Common mistake: Forgetting to reset used[i] = False after backtracking — this silently produces fewer results without erroring out, making it hard to debug.

Combination Sum (LC 39)
  • Pattern: Subsets with a target sum, elements reusable.
  • Key: Start from the same index (not i+1) to allow reuse of the same element. When elements can only be used once, use i+1 and sort + skip duplicates.
  • Complexity: Hard to express tightly — bounded by the number of valid combinations. Pruning with if remaining < 0: return is essential.
  • Interview tip: Sort the candidates first. Then when the current candidate exceeds the remaining target, break out of the loop because all subsequent candidates are even larger.

Worked LeetCode Problems

These six problems cover roughly 80% of the backtracking questions you will see in interviews. Master the templates here and the rest are variations.

LC 46: Permutations

Problem: Given an array of distinct integers, return all possible permutations. Why it matters: The cleanest “use a used array” template. Every interviewer asks some version of this.
def permute(nums):
    result, used = [], [False] * len(nums)
    def backtrack(path):
        if len(path) == len(nums):
            result.append(path[:])               # Save a COPY
            return
        for i in range(len(nums)):
            if used[i]: continue                 # Skip used elements
            used[i] = True; path.append(nums[i]) # CHOOSE
            backtrack(path)                       # EXPLORE
            path.pop(); used[i] = False           # UN-CHOOSE
    backtrack([])
    return result
# Time: O(n * n!). Space: O(n) recursion + O(n!) output.

LC 78: Subsets (Power Set)

Problem: Return all subsets (the power set) of an array of distinct integers. Key insight: Every recursive call’s path IS a valid subset — collect at the top of every call, not just at leaves.
def subsets(nums):
    result = []
    def backtrack(start, path):
        result.append(path[:])               # Every node is a result
        for i in range(start, len(nums)):
            path.append(nums[i])              # CHOOSE
            backtrack(i + 1, path)            # EXPLORE -- only forward
            path.pop()                        # UN-CHOOSE
    backtrack(0, [])
    return result
# Time: O(n * 2^n). Space: O(n) recursion + O(n * 2^n) output.

LC 39: Combination Sum

Problem: Find all unique combinations of candidates that sum to target. Each candidate may be used unlimited times. Key insight: Recurse with i (NOT i+1) to allow element reuse. Sort first, then break when a candidate exceeds the remaining target — this single optimization is huge.
def combinationSum(candidates, target):
    candidates.sort()                            # Sort to enable break-pruning
    result = []
    def backtrack(start, path, remaining):
        if remaining == 0:
            result.append(path[:])
            return
        for i in range(start, len(candidates)):
            if candidates[i] > remaining:        # PRUNE: rest are even bigger
                break                            # NOT continue -- the array is sorted
            path.append(candidates[i])
            backtrack(i, path, remaining - candidates[i])  # i, not i+1 (reuse allowed)
            path.pop()
    backtrack(0, [], target)
    return result
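# Time: output-sensitive and hard to bound tightly; recursion depth is at most target / min(candidates).
# Space: O(target / min(candidates)) recursion, plus the output.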

LC 51: N-Queens

Problem: Place N queens on an NxN board so no two attack each other. Return all distinct configurations. Key insight: Place exactly one queen per row. Track occupied columns and diagonals in O(1)-lookup sets. Diagonals are identified by row - col (left-to-right) and row + col (right-to-left).
def solveNQueens(n):
    result = []
    cols, diag1, diag2 = set(), set(), set()    # Occupied columns and diagonals
    board = [['.'] * n for _ in range(n)]
    
    def backtrack(row):
        if row == n:
            result.append([''.join(r) for r in board])
            return
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue                          # PRUNE: position attacked
            board[row][col] = 'Q'
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            backtrack(row + 1)
            board[row][col] = '.'                # UN-CHOOSE: revert all four mutations
            cols.remove(col); diag1.remove(row - col); diag2.remove(row + col)
    backtrack(0)
    return result
# Time: O(n!) approximately, with heavy pruning. Space: O(n^2) for the board.

LC 79: Word Search (Grid Backtracking)

Problem: Given an m x n grid of characters, return true if word exists by walking adjacent cells (no cell reused per word). Key insight: Mark visited cells with a sentinel character ('#'), recurse 4 directions, restore after the recursion.
def exist(board, word):
    rows, cols = len(board), len(board[0])
    
    def dfs(r, c, idx):
        if idx == len(word): return True
        if r < 0 or r >= rows or c < 0 or c >= cols: return False
        if board[r][c] != word[idx]: return False
        # CHOOSE: mark visited
        temp, board[r][c] = board[r][c], '#'
        # EXPLORE: 4 directions
        found = (dfs(r+1, c, idx+1) or dfs(r-1, c, idx+1)
                 or dfs(r, c+1, idx+1) or dfs(r, c-1, idx+1))
        # UN-CHOOSE: restore
        board[r][c] = temp
        return found
    
    for r in range(rows):
        for c in range(cols):
            if dfs(r, c, 0):
                return True
    return False
# Time: O(m * n * 4^L) where L is word length. Space: O(L) recursion.

LC 22: Generate Parentheses

Problem: Given n, return all combinations of n pairs of well-formed parentheses. Key insight: Track how many ( and ) have been placed. Two pruning rules: cannot add ( if open == n; cannot add ) if close >= open.
def generateParenthesis(n):
    result = []
    def backtrack(path, open_n, close_n):
        if len(path) == 2 * n:
            result.append(''.join(path))
            return
        if open_n < n:                            # Can still add an opener
            path.append('('); backtrack(path, open_n + 1, close_n); path.pop()
        if close_n < open_n:                      # Closer only valid if open exceeds close
            path.append(')'); backtrack(path, open_n, close_n + 1); path.pop()
    backtrack([], 0, 0)
    return result
# Time: O(4^n / sqrt(n)) -- the nth Catalan number. Space: O(n) recursion.

Common Mistakes

Avoid These Pitfalls:
  1. Not copying path: Use path[:] or new ArrayList<>(path) when saving. If you append path directly, every entry in your result list points to the same mutating list — and they all end up empty after recursion completes.
  2. Forgetting to backtrack: Always undo the choice after exploring. A missing path.pop() or used[i] = False is the number-one silent bug in backtracking — the code runs but produces wrong or incomplete results.
  3. Duplicate results: When the input has repeated values (e.g., [1,2,2]), sort first and skip nums[i] == nums[i-1] at the same recursion level. Without this, you get duplicate subsets or combinations.
  4. Infinite loops: Ensure progress by moving the start index forward. In combinations and subsets, always recurse with i + 1 (or i only if element reuse is explicitly allowed). Recursing with the same start without constraints causes infinite recursion.
  5. Pruning too late or not at all: Place constraint checks before the recursive call (inside is_valid), not after. Late pruning means you waste time building invalid partial solutions. For Combination Sum, sort candidates and break when a candidate exceeds the remaining target — this single optimization can cut runtime by 10x or more.

Caveats and Traps

LeetCode Backtracking Traps That Cost Real Submissions:
  1. Forgetting to un-choose corrupts state for sibling branches. A missing path.pop() or used[i] = False does NOT throw an error — the code runs and silently returns wrong results. After every backtrack(...) call, the very next line MUST undo every mutation made before it.
  2. Modifying the result list reference (append a copy, not the live list). result.append(path) is a bug; result.append(path[:]) or result.append(list(path)) is correct. Without the copy, all entries point to the same list and after recursion completes, they all reflect the final empty state.
  3. No pruning equals TLE on most LeetCode backtracking problems. Combination Sum without sorting plus break will time out on large inputs. N-Queens without the diagonal sets times out at n = 12-13. Always think about pruning BEFORE writing code.
  4. Duplicates in input require the sort plus skip-equal-at-same-depth pattern. For LC 47 (Permutations II) or LC 90 (Subsets II), the input may have repeats. Without nums.sort() and if i > start and nums[i] == nums[i-1]: continue, you produce duplicate results. Naively dedup’ing with a set wastes exponential work.
  5. Wrong recursive index (i vs i+1) breaks reuse semantics. Combination Sum I (reuse allowed) recurses with i. Combination Sum II (no reuse) recurses with i + 1. Subsets recurses with i + 1. Permutations does NOT use a start index — it loops from 0 with a used array. Mixing these up is the most common backtracking bug.
Solutions and Patterns That Avoid These Traps:
  1. The “always pop after recurse” rule: Whenever you write backtrack(...), the very next line must undo every mutation you made BEFORE that call. Treat them as paired statements. If you find yourself with a mutation but no matching undo, you have a bug.
  2. The “save a copy” idiom:
    • Python: result.append(path[:]) or result.append(list(path)) or result.append(tuple(path)).
    • Java: result.add(new ArrayList<>(path)).
    • JavaScript: result.push([...path]).
    • C++: result.push_back(path) (vector copy by value).
  3. Sort plus skip duplicates pattern (LC 40, LC 47, LC 90):
    nums.sort()
    for i in range(start, len(nums)):
        if i > start and nums[i] == nums[i-1]:
            continue   # Skip duplicate at same depth
        # ... CHOOSE, EXPLORE, UN-CHOOSE ...
    
  4. Break-pruning on sorted candidates (LC 39, LC 40):
    for i in range(start, len(candidates)):
        if candidates[i] > remaining:
            break   # Sorted; rest are even bigger -- skip them all
    
    Use break, NOT continue — this is the entire point of sorting first.
  5. Decision matrix for index/reuse:
    • Subsets / Combinations (no reuse): for i in range(start, n) plus recurse with i + 1.
    • Combination Sum (reuse allowed): same loop, recurse with i.
    • Permutations: for i in range(n) plus used[] array, recurse without start.

Curated LeetCode Practice List

Grouped by difficulty, with annotations for what each problem teaches.

Easy (Warm-up)

LC# | Problem | What It Teaches
17 | Letter Combinations of a Phone Number | Backtracking on a fixed-depth tree. Cleanest first problem.
784 | Letter Case Permutation | "Include or modify" choice at each character.
401 | Binary Watch | Backtracking with constraint validation at the end.

Medium (LeetCode bread-and-butter)

LC# | Problem | What It Teaches
78 | Subsets | Canonical "every node is a result" template.
90 | Subsets II | Subsets with duplicates — sort plus skip-equal-at-depth.
46 | Permutations | The used[] array template.
47 | Permutations II | Permutations with duplicates — sort plus not used[i-1] skip.
39 | Combination Sum | Reuse allowed; sort plus break-prune.
40 | Combination Sum II | Each used once; duplicates in input.
22 | Generate Parentheses | Constraint-based pruning (open/close counts).
79 | Word Search | Grid backtracking with visited-marker.
131 | Palindrome Partitioning | Try every prefix; recurse on the rest if valid.
93 | Restore IP Addresses | Constraint-bounded depth (exactly 4 octets).
698 | Partition to K Equal Sum Subsets | Backtracking + bitmask state.

Hard (Stretch problems for senior interviews)

LC# | Problem | What It Teaches
51 | N-Queens | Constraint sets for O(1) validity.
52 | N-Queens II | Same but counting only — bitmask optimization shines.
37 | Sudoku Solver | Multiple interacting constraints; MRV heuristic.
212 | Word Search II | Trie + grid backtracking — the Trie trick.
301 | Remove Invalid Parentheses | Generate-and-validate with minimum-removal constraint.
282 | Expression Add Operators | Track running value plus last operand for multiplication precedence.

The Backtracking Template

Every backtracking solution follows this structure:
function backtrack(path, choices):
    if is_solution(path):
        save(path)
        return
    
    for choice in choices:
        if is_valid(choice):
            make_choice(path, choice)
            backtrack(path, remaining_choices)
            undo_choice(path, choice)  # BACKTRACK!

Problem Type Quick Reference

Problem Type | Result Collection | Start Index | Reuse Elements
Subsets | Every node is a result | i + 1 | No
Combinations | Leaf nodes only | i + 1 | No
Combination Sum | Leaf nodes only | i (same) | Yes
Permutations | Leaf nodes only | 0 (use visited) | No
Unique Results | Skip duplicates at same level | i + 1 | No

Interview Problems by Company

Problem | Company | Key Concept
Subsets | All FAANG | Basic include/exclude
Permutations | All FAANG | Used array tracking
Combination Sum | Amazon, Meta | Reusable elements
Letter Combinations | Meta | Phone keypad mapping
Word Search | Amazon, Meta | Grid backtracking

Interview Tips

Script for interviews:
  1. “This requires generating all possibilities, so I’ll use backtracking.”
  2. “I’ll build solutions incrementally, making one choice at a time.”
  3. “At each step, I’ll check if the current path is valid.”
  4. “After exploring a choice, I’ll undo it (backtrack) and try the next.”
  5. “I’ll prune invalid paths early to optimize.”
Interviewer Says | You Should Think
"Generate all combinations" | Backtracking
"Find all permutations" | Backtracking with used array
"All valid configurations" | Backtracking with constraints
"Can you prune?" | Early termination conditions
"Handle duplicates" | Sort + skip same at same level
When input has duplicates but you need unique results:
  1. Sort the input first
  2. Skip duplicates at same level: if i > start and nums[i] == nums[i-1]: continue
  3. This ensures the same value is not used twice at the same position in the recursion tree
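A minimal sketch of this three-step recipe applied to Subsets II (LC 90); the function name subsetsWithDup follows LeetCode's naming, and the rest mirrors the subsets template from earlier on this page:
def subsetsWithDup(nums):
    """All unique subsets when nums may contain duplicates (LC 90)."""
    nums.sort()                                   # Step 1: sort so equal values are adjacent
    result = []
    
    def backtrack(start, path):
        result.append(path[:])                    # Every node is a valid subset
        for i in range(start, len(nums)):
            if i > start and nums[i] == nums[i - 1]:
                continue                          # Step 2: skip duplicate at the same level
            path.append(nums[i])
            backtrack(i + 1, path)
            path.pop()
    
    backtrack(0, [])
    return result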

Practice Problems

Problem | Difficulty | Link
Subsets | Medium | LeetCode 78
Permutations | Medium | LeetCode 46
Combination Sum | Medium | LeetCode 39
N-Queens | Hard | LeetCode 51
Word Search | Medium | LeetCode 79

Practice Roadmap

Day 1: Subsets

  • Solve: Subsets, Subsets II
  • Focus: Include/exclude pattern

Day 2: Combinations

  • Solve: Combinations, Combination Sum
  • Focus: Target constraints

Day 3: Permutations

  • Solve: Permutations, Permutations II
  • Focus: Used array tracking

Day 4: Grid Problems

  • Solve: Word Search, N-Queens
  • Focus: 2D backtracking
Interview Tip: Draw the decision tree to visualize choices at each step. This helps identify the base case and pruning conditions.

Interview Questions

Question: Explain the three steps of backtracking. Why is the un-choose step necessary?
What interviewers are really testing: Whether you have internalized the pattern deeply enough to reproduce it from first principles under pressure, or if you just memorized a LeetCode solution. They also want to see if you understand why the undo step exists — candidates who skip that reveal they do not understand mutation vs. immutability in recursive state.
Strong Answer:
  • The three steps are choose, explore, and un-choose (backtrack). Every backtracking solution is a depth-first traversal of a decision tree built from these three operations.
  • Choose means committing to a candidate decision — appending an element to the current path, marking a cell as visited, or placing a queen on the board. This mutates shared state.
  • Explore means recursing deeper with the updated state. You pass a narrower set of remaining choices (e.g., start = i + 1 for combinations, the full range with a used array for permutations).
  • Un-choose means undoing the mutation so the next sibling branch in the decision tree starts from a clean slate. This is the step that makes backtracking work — without it, choices from one branch leak into another and you get wrong results with zero error messages, which makes it one of the hardest bugs to catch.
  • The reason we mutate and undo (instead of copying the path each time) is performance. Copying an array at every recursive call turns O(n) space into O(n * 2^n) space and adds a significant constant factor. The mutate-then-undo pattern keeps space at O(n) for the recursion stack.
  • Example: In the subsets problem for [1, 2, 3], when you are at the node [1, 2], you explore by adding 3 to get [1, 2, 3]. After that recursive call returns, you pop 3 so the path is back to [1, 2]. Then you pop 2 so the path is [1], and the loop moves to i=2 which adds 3 to get [1, 3]. If you forgot to pop, you would be building on stale state.
Red flag answer: “You just add the element, recurse, and remove it.” This shows rote memorization without understanding why each step matters. Also a red flag: not mentioning that path[:] (or new ArrayList<>(path)) is required when saving results, because the path keeps mutating after you save it.
Follow-ups:
  1. “What happens if you forget the un-choose step but the code still runs? How would you debug that?” — A missing pop() or used[i] = False does not throw an error. The code produces fewer results or wrong results silently. You would debug by printing the path at each recursive call and noticing that elements from previous branches are still present.
  2. “When would you intentionally not undo a choice?” — When you are creating a new copy of the path at each level instead of mutating (the immutable approach). This trades space for simplicity. Some functional-style solutions do this, and it is fine for small inputs, but it blows up memory for large decision trees.
Question: How do you avoid producing duplicate results when the input contains duplicate values?
What interviewers are really testing: This is the #1 follow-up on any subset/combination/permutation problem. It separates candidates who have solved five problems from those who have solved fifty. The interviewer wants to see if you understand why sorting + skipping works, not just that it does.
Strong Answer:
  • The strategy is sort the input first, then skip duplicate values at the same recursion level. The check is: if i > start and nums[i] == nums[i-1]: continue.
  • Why this works: sorting groups identical values together. At any given recursion level, the loop iterates over choices for one “slot” in the result. If you have already tried value 2 in this slot, trying another 2 in the same slot produces the exact same subtree — so you skip it.
  • The condition i > start (not i > 0) is critical. It ensures you only skip duplicates at the same level. Using i > 0 would incorrectly prevent picking the same value in deeper levels, which is valid. For example, with input [1, 2, 2], the subset [2, 2] is legitimate — the two 2s come from different recursion depths.
  • For permutations with duplicates (Permutations II), the approach is slightly different. You sort, use a used array, and add the condition: if i > 0 and nums[i] == nums[i-1] and not used[i-1]: continue. This ensures that among identical elements, they are always picked in left-to-right order, preventing duplicate permutations.
  • Example: Subsets II with [1, 2, 2]. Sorted: [1, 2, 2]. At recursion level start=1, we pick nums[1]=2 and explore. When the loop reaches nums[2]=2, we see nums[2] == nums[1] and i > start, so we skip. Without this, we would get [1, 2] twice — once using the first 2 and once using the second.
Red flag answer: “I use a set to deduplicate the results at the end.” This works correctness-wise but is a performance disaster. You are still generating all duplicate subtrees (exponential wasted work) and then deduplicating O(2^n) results. The sort-and-skip approach avoids generating duplicates entirely, which can be orders of magnitude faster.
Follow-ups:
  1. “Why is the condition i > start and not i > 0? What breaks if you use i > 0?” — Using i > 0 prevents picking duplicate values even at different depths. You would miss valid results like [2, 2] from input [1, 2, 2] because the second 2 at depth 2 would be skipped.
  2. “For Permutations II, the skip condition checks not used[i-1]. Some people write used[i-1] instead. Both produce correct results — why, and which is more efficient?” — Both enforce an ordering among identical elements. not used[i-1] means “only pick the i-th duplicate if the (i-1)-th is already in use” (pick duplicates left to right). used[i-1] means “skip if the previous duplicate is already used” (pick duplicates right to left). The not used[i-1] version prunes higher in the tree because it cuts off branches earlier, making it faster in practice.
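To make the not used[i-1] skip concrete, here is a minimal Permutations II (LC 47) sketch in the same style as the permutations template earlier on this page; the name permuteUnique follows LeetCode's naming and the rest is illustrative:
def permuteUnique(nums):
    """All unique permutations when nums may contain duplicates (LC 47)."""
    nums.sort()                                   # Group equal values together
    result, used = [], [False] * len(nums)
    
    def backtrack(path):
        if len(path) == len(nums):
            result.append(path[:])
            return
        for i in range(len(nums)):
            if used[i]:
                continue
            if i > 0 and nums[i] == nums[i - 1] and not used[i - 1]:
                continue                          # Consume equal values strictly left to right
            used[i] = True
            path.append(nums[i])
            backtrack(path)
            path.pop()
            used[i] = False
    
    backtrack([])
    return result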
Question: What are the time complexities of generating all subsets vs. all permutations, and where does the difference come from?
What interviewers are really testing: Complexity analysis for backtracking is where most candidates crumble. The interviewer wants to see if you can reason about the shape of the decision tree (branching factor, depth) and connect it to output size, not just recite “O(2^n)” from memory.
Strong Answer:
  • Subsets: O(n * 2^n). The decision tree is a binary tree of depth n — at each level you decide “include element i or not.” This produces exactly 2^n leaf nodes (subsets). Each subset is copied into the result in O(n) time on average, giving O(n * 2^n) total.
  • Permutations: O(n * n!). The decision tree has a branching factor that decreases at each level: n choices at level 0, n-1 at level 1, …, 1 at level n-1. The number of leaves is n * (n-1) * … * 1 = n!. Each permutation is copied in O(n), giving O(n * n!).
  • The difference comes from the structure of choices. Subsets are about include/exclude (2 choices per element, independent), while permutations are about ordering (each element goes to exactly one position, choices depend on what is already used).
  • Space for both is O(n) for the recursion stack and current path (not counting the output). If you count the output, subsets use O(n * 2^n) and permutations use O(n * n!).
  • Practical implication: n! grows much faster than 2^n. For n=10, 2^10 = 1,024 but 10! = 3,628,800. For n=20, subsets (about 1 million) are tractable, but permutations (about 2.4 * 10^18) are not. This is why interviewers ask “do you need all permutations or can we prune?” — it is not a theoretical question, it is a practical one about whether your code will finish.
Red flag answer: Saying “both are exponential” without distinguishing 2^n from n!. Also a red flag: not being able to explain where the complexity comes from (the decision tree structure) and just quoting a number.
Follow-ups:
  1. “If I only need to count subsets, do I need backtracking?” — No. The count is always 2^n (or C(n,k) for combinations of size k). You only need backtracking when you must enumerate the actual subsets. This is a common DP vs. backtracking decision point.
  2. “At what input size does generating all permutations become infeasible? How about subsets?” — Permutations: n around 10-12 is the practical limit (10! is about 3.6 million, 12! is about 479 million). Subsets: n around 20-25 is feasible (2^20 is about 1 million, 2^25 is about 33 million). Beyond these, you need to either prune aggressively or rethink the approach.
Question: In N-Queens, what does is_safe check, and how can you make that check O(1)?
What interviewers are really testing: Two things at once — your understanding of constraint propagation in backtracking, and whether you think about optimization beyond the first working solution. The O(1) optimization shows you understand amortized data structure tricks, not just brute force.
Strong Answer:
  • The naive is_safe checks three things for a candidate position (row, col): no queen in the same column, no queen on the upper-left diagonal, and no queen on the upper-right diagonal. We do not check the row because we place exactly one queen per row by construction.
  • Each of those three checks walks backward through previously placed rows, making each call O(n) in the worst case. Over the full search, this adds up.
  • The O(1) optimization uses three sets (or boolean arrays): cols for occupied columns, diag1 for occupied left-diagonals, and diag2 for occupied right-diagonals. The key insight is that all cells on the same left diagonal share the same value of row - col, and all cells on the same right diagonal share row + col. So is_safe(row, col) becomes: col not in cols and (row - col) not in diag1 and (row + col) not in diag2 — three set lookups, O(1).
  • When you choose (place a queen), you add to all three sets. When you un-choose, you remove from all three sets. The sets track exactly which constraints are currently active.
  • Why interviewers love this: It demonstrates that you think about the data structure behind the constraint check, not just the algorithm. The same principle (precomputing conflict sets) applies to Sudoku solvers and many real constraint-satisfaction problems.
Red flag answer: Only describing the brute-force is_safe and not mentioning the set-based optimization. Or claiming “we need to check all 8 directions” — we only check upward because queens below the current row have not been placed yet.
Follow-ups:
  1. “Why do row - col and row + col uniquely identify diagonals? Can row - col be negative, and does that cause issues?” — row - col is constant along any top-left-to-bottom-right diagonal, and row + col is constant along any top-right-to-bottom-left diagonal. Yes, row - col can be negative (e.g., row=0, col=3 gives -3). This is fine for sets or hash maps. If using arrays, you offset by n-1 so indices are non-negative: diag1[row - col + n - 1].
  2. “If I asked you to solve this for just the count of solutions (not the actual board configurations), would you change your approach?” — For counting only, you drop the board construction and just increment a counter at the base case. This saves O(n^2) per solution for board copying. For very large n (say n=15+), there are also mathematical results and bitwise tricks (using bitmasks for cols/diag1/diag2 instead of sets) that push performance further. The bitmask approach uses integer bits to represent occupied columns and diagonals, making the check and update a single bitwise AND/OR.
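A sketch of the counting-only bitmask approach mentioned in that follow-up, assuming the standard shift-the-diagonals trick; the function name totalNQueens follows LeetCode's naming for LC 52:
def totalNQueens(n):
    """Count N-Queens solutions using bitmasks for columns and both diagonals."""
    full = (1 << n) - 1                          # n low bits set: every column in a row
    
    def count(row, cols, diag1, diag2):
        if row == n:
            return 1
        total = 0
        free = ~(cols | diag1 | diag2) & full    # Bits set where this row is not attacked
        while free:
            bit = free & -free                   # Pick one available column (lowest set bit)
            free ^= bit
            total += count(row + 1,
                           cols | bit,
                           (diag1 | bit) << 1,   # Diagonal attacks shift by one column per row
                           (diag2 | bit) >> 1)
        return total
    
    return count(0, 0, 0, 0)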
Question: How do subsets, combinations, and permutations differ structurally as backtracking problems?
What interviewers are really testing: Whether you see backtracking as one pattern with variations or as three separate things you memorized independently. Strong candidates can explain the differences by pointing at the decision tree shape, base case, and how choices narrow.
Strong Answer:
  • Subsets: The decision tree is a binary tree of depth n. At each level, you decide “include element i or skip it.” Every node (not just leaves) is a valid result. The start index moves forward by 1 to avoid revisiting earlier elements. You collect results at every recursive call.
  • Combinations (choose k): Same tree structure as subsets, but you only collect results at nodes where path.length == k. You add a pruning condition: if the remaining elements are fewer than the remaining slots (n - i + 1 < k - path.length), stop early. This pruning is the key difference from subsets.
  • Permutations: The tree has a different shape entirely. At level 0, you have n choices. At level 1, you have n-1 (everything except what is used). At level 2, n-2, and so on. You collect only at leaves (when path.length == n). Instead of a start index, you use a used boolean array to skip already-chosen elements, and you loop from 0 every time.
  • The mental model: subsets = include/exclude, combinations = include/exclude with a size constraint, permutations = arrange all in some order.
  • Quick structural summary:
Aspect | Subsets | Combinations | Permutations
Loop starts at | start | start | 0
Next call passes | i + 1 | i + 1 | same range, check used
Collect results at | every node | nodes where len == k | leaves only (len == n)
Branching factor | decreasing (n, n-1, …) | decreasing + pruned | decreasing (n, n-1, …)
Red flag answer: Treating all three as “basically the same thing” or being unable to explain why permutations loop from 0 while subsets/combinations use a start index. Also: not knowing that combinations benefit from the “remaining elements” pruning.
Follow-ups:
  1. “If I change the Combination Sum problem to allow reuse of elements, what changes structurally in the tree?” — You recurse with i (same index) instead of i + 1. This means the branching factor does not decrease — you can pick the same element again. The tree can be deeper (bounded by target / min(candidates)), but pruning with if remaining < 0: return keeps it tractable.
  2. “Could you convert permutations to use a start index approach instead of a used array?” — Not directly, because permutations need to consider elements before the current index (order matters). The start index approach only looks forward, which is exactly what prevents duplicate subsets. For permutations, you could alternatively use swapping: swap element at index start with each element from start to n-1, recurse with start+1, then swap back. This avoids the used array but is trickier to handle duplicates with.
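A minimal sketch of the swap-based approach described in that follow-up; it is correct for distinct elements, and the function name permute_by_swapping is illustrative:
def permute_by_swapping(nums):
    """Permutations via in-place swaps instead of a used[] array (distinct elements)."""
    result = []
    
    def backtrack(start):
        if start == len(nums):
            result.append(nums[:])                        # Save a copy of the arrangement
            return
        for i in range(start, len(nums)):
            nums[start], nums[i] = nums[i], nums[start]   # CHOOSE: put nums[i] in slot start
            backtrack(start + 1)                          # EXPLORE the remaining slots
            nums[start], nums[i] = nums[i], nums[start]   # UN-CHOOSE: swap back
    
    backtrack(0)
    return result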
Question: How would you optimize a grid word-search backtracking solution that is too slow?
What interviewers are really testing: Real-world optimization intuition. Everyone can write the basic DFS + backtracking for word search. The interviewer wants to see if you can identify where time is wasted and apply targeted pruning, not just say “add memoization” (which does not work here because state space is too large).
Strong Answer:
  • Baseline approach: For each cell in the grid, start a DFS that matches the word character by character. Mark cells as visited (e.g., replace with '#'), explore all 4 directions, then restore. Time: O(m * n * 4^L) where L is word length.
  • Optimization 1 — Early termination: Before starting any DFS, check if the grid even contains all the characters in the word (with correct frequencies). If the word needs three 'a's but the grid only has two, return False immediately. This is O(m * n) and catches many impossible cases.
  • Optimization 2 — Reverse the word: Count the frequency of the first character vs. the last character. If the last character appears fewer times in the grid, reverse the word and search for the reversed version. This reduces the number of DFS starting points, sometimes dramatically.
  • Optimization 3 — Prune by remaining length: If the remaining characters in the word exceed the number of unvisited cells reachable from the current position, prune. This is harder to compute exactly but a rough bound helps.
  • Optimization 4 — For Word Search II (multiple words): Build a Trie from all words and search once. At each cell, you walk the Trie simultaneously with the grid DFS. If no Trie child exists for the current character, prune that entire branch. Remove words from the Trie once found to avoid redundant searches.
  • What does NOT work: Memoization. The state would need to encode (position, set of visited cells, index in word), which is exponential. Backtracking’s strength here is that it avoids storing all states.
Red flag answer: “Just add memoization” or “use BFS instead.” BFS does not help because we need to track the exact path (visited cells), and memoization’s state space is too large. Also a red flag: not knowing the Trie optimization for Word Search II.
Follow-ups:
  1. “Why can you not use memoization/DP for word search like you can for other grid problems?” — Because the state includes which cells are visited, not just the current position. Two different paths can arrive at the same cell with different visited sets, so you cannot cache by position alone. The visited set makes the state space exponential.
  2. “If I gave you 10,000 words to search in the same grid, would you run your single-word solution 10,000 times?” — No. You build a Trie from all 10,000 words and do one DFS traversal. At each cell, you walk the Trie. If the current prefix is not in the Trie, you prune immediately. This is Word Search II, and it reduces the combined complexity massively compared to running 10,000 independent searches.
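A compact sketch of the Trie-driven multi-word search described above, using a plain dict-based Trie with a '$' end-of-word marker; the name findWords follows LeetCode's LC 212 signature, and the structure is one reasonable way to set it up rather than the only one:
def findWords(board, words):
    """Return every word from words that can be traced in the grid (LC 212)."""
    root = {}                              # Trie of nested dicts; '$' marks a complete word
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node['$'] = w
    
    rows, cols = len(board), len(board[0])
    found = []
    
    def dfs(r, c, node):
        ch = board[r][c]
        if ch not in node:
            return                         # PRUNE: no word continues with this letter
        nxt = node[ch]
        if '$' in nxt:
            found.append(nxt.pop('$'))     # Record once, then remove from the Trie
        board[r][c] = '#'                  # CHOOSE: mark visited
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                dfs(nr, nc, nxt)
        board[r][c] = ch                   # UN-CHOOSE: restore the letter
    
    for r in range(rows):
        for c in range(cols):
            dfs(r, c, root)
    return found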
Question: How do you decide between backtracking and dynamic programming for a given problem?
What interviewers are really testing: Judgment. Many problems can technically be solved by both, but choosing the wrong one leads to TLE or unnecessary complexity. The interviewer wants to see your decision framework, not a single rule.
Strong Answer:
  • Use backtracking when you need to enumerate all valid solutions (all subsets, all permutations, all valid configurations). Backtracking generates each solution explicitly.
  • Use DP when you need a single optimal value (minimum cost, maximum profit, total count) and the problem has overlapping subproblems. DP collapses repeated work via memoization.
  • The overlap zone: Some problems like “count the number of ways to reach target sum” can be solved by either. Backtracking generates all ways and counts them (O(2^n) typically). DP caches intermediate results and counts without generating (often O(n * target)). For counting-only problems, DP almost always wins.
  • When backtracking + memoization = DP: If you add a cache to your backtracking solution and the state space is polynomial, you have effectively built a top-down DP. This is a valid approach and sometimes easier to think about. But if the state space is exponential (e.g., involves a bitmask of visited nodes), the “DP” is still exponential — caching does not magically fix it.
  • Decision rule of thumb: Ask yourself: “Do I need the actual solutions, or just a count/optimum?” If actual solutions, backtracking. If count/optimum with overlapping subproblems, DP. If count/optimum without overlapping subproblems, backtracking (or greedy).
  • Example: “Partition a set into two subsets with equal sum” — DP (you need yes/no, not the actual partition). “Generate all partitions of a set into two subsets with equal sum” — backtracking.
Red flag answer: “Backtracking is brute force and DP is the optimized version.” This is wrong. They solve fundamentally different types of questions. Backtracking is not just “DP without caching.” Also a red flag: not recognizing that memoized backtracking and top-down DP are the same thing.
Follow-ups:
  1. “Can you give me a problem where backtracking is strictly better than DP?” — N-Queens. The state space (which columns and diagonals are occupied) can be represented as a bitmask, but there is no overlapping subproblems structure to exploit. Each configuration of placed queens is unique. DP’s caching would add overhead without saving any work. Backtracking with pruning is the natural fit.
  2. “What about a problem where you initially think backtracking but DP is far superior?” — “How many distinct ways can you climb n stairs taking 1 or 2 steps at a time?” You could backtrack and enumerate all paths (O(2^n)), but it has classic overlapping subproblems: the number of ways from step k is the same regardless of how you got to step k. DP solves it in O(n).
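For contrast, a minimal O(n) DP sketch of the stair-climbing count from that follow-up (the function name climb_stairs is illustrative):
def climb_stairs(n):
    """Number of distinct ways to climb n stairs taking 1 or 2 steps at a time."""
    a, b = 1, 1                 # ways to reach step 0 and step 1
    for _ in range(2, n + 1):
        a, b = b, a + b         # ways[k] = ways[k-1] + ways[k-2]
    return b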
Question: In Combination Sum, what changes between the variant that allows element reuse and the one that does not?
What interviewers are really testing: Whether you understand the single line of code that changes between these two variants and, more importantly, why it changes the structure of the decision tree. This is a favorite Amazon and Meta question.
Strong Answer:
  • Reusable elements (Combination Sum I): Recurse with index i (same index). This allows the same element to be picked again in the next recursive call. The tree can go deep — bounded by target / min(candidates).
  • Non-reusable elements (Combination Sum II): Recurse with index i + 1. Each element can only be used once. The tree depth is bounded by the number of elements.
  • That single change — i vs. i + 1 in the recursive call — is the entire structural difference.
  • Additional complexity for Combination Sum II: The input may contain duplicate values (e.g., [1, 1, 2, 5, 6, 7, 10]). You need to sort the array and skip duplicates at the same recursion level (if i > start and candidates[i] == candidates[i-1]: continue) to avoid producing duplicate combinations.
  • Pruning is critical for both: Sort the candidates first. In the loop, if candidates[i] > remaining, break (not continue). Because the array is sorted, all subsequent candidates are even larger, so the entire rest of the loop is useless. This single optimization can cut runtime by 10x+ on large inputs.
  • Example: Candidates [2, 3, 6, 7], target = 7. With reuse: [2, 2, 3] and [7] are valid (we used 2 twice). Without reuse and candidates [2, 3, 6, 7]: only [7] works because we cannot reuse 2.
Red flag answer: Not knowing which index to pass in the recursive call, or confusing the two variants. Also a red flag: using continue instead of break when a sorted candidate exceeds the remaining target — this means you do not understand why sorting enables the optimization.
Follow-ups:
  1. “Why break and not continue when the candidate exceeds the remaining target?” — Because the array is sorted. If candidates[i] exceeds the remaining target, then candidates[i+1], candidates[i+2], etc. all exceed it too. continue would pointlessly check every remaining candidate. break skips them all in one step.
  2. “What if I told you the candidates are not sorted and you cannot sort them — can you still prune effectively?” — Without sorting, you cannot break early because a large candidate might be followed by a smaller one that fits. You can only continue on individual candidates that are too large. This is strictly less efficient. In an interview, you should always sort first unless the problem explicitly forbids it or the order of candidates matters for the output.
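Putting the i + 1 recursion, the duplicate skip, and the break-pruning together, here is a minimal Combination Sum II sketch; the name combinationSum2 follows LeetCode's LC 40 naming:
def combinationSum2(candidates, target):
    """Each candidate used at most once; input may contain duplicates (LC 40)."""
    candidates.sort()                        # Enables both break-pruning and duplicate skipping
    result = []
    
    def backtrack(start, path, remaining):
        if remaining == 0:
            result.append(path[:])
            return
        for i in range(start, len(candidates)):
            if candidates[i] > remaining:
                break                        # Sorted: every later candidate is too large
            if i > start and candidates[i] == candidates[i - 1]:
                continue                     # Skip duplicate value at the same level
            path.append(candidates[i])
            backtrack(i + 1, path, remaining - candidates[i])   # i + 1: no reuse
            path.pop()
    
    backtrack(0, [], target)
    return result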
Question: Your backtracking solution produces duplicate or incorrect results with no errors. How do you debug it?
What interviewers are really testing: Debugging skill and systematic thinking under pressure. Backtracking bugs are notoriously hard to find because the code runs without errors but produces wrong output. The interviewer wants to see a methodical approach, not random guessing.
Strong Answer:
  • Step 1 — Print the decision tree. Add a print statement at the start of each recursive call showing the current path, start index, and the choice being made. Indent by recursion depth. This visualizes the actual tree your code is exploring.
  • Step 2 — Check for the “same level duplicate” bug. If the input has duplicate values, verify that you sorted the input AND added the skip condition if i > start and nums[i] == nums[i-1]: continue. The most common cause of duplicate results is a missing skip condition.
  • Step 3 — Verify the start index is advancing correctly. If you pass start instead of i + 1 to the recursive call, you allow reuse of elements. This is correct for Combination Sum I but wrong for Subsets and Combinations. Check which variant you are solving.
  • Step 4 — Check if you are collecting results at the right nodes. Subsets collect at every node. Combinations and permutations collect only at specific depths. Collecting at the wrong point produces extra or missing results.
  • Step 5 — Reduce to a tiny input. Use input [1, 2, 2] or [1, 1] and manually trace the expected output vs. actual output. Compare with the printed decision tree from Step 1.
  • Step 6 — Verify the un-choose step resets ALL state. If your choose step modifies multiple things (e.g., appending to path AND marking a cell visited AND adding to a set), the un-choose step must undo ALL of them. A partial undo is a subtle bug.
  • In my experience, 90% of backtracking duplicate bugs come from exactly two causes: missing sort + skip for duplicate inputs, or passing the wrong index to the recursive call.
Red flag answer: “I would add the results to a set to deduplicate.” This masks the bug instead of fixing it. The underlying issue is that your code is exploring redundant branches, wasting exponential time. Also a red flag: “I would rewrite it from scratch” — this suggests no systematic debugging ability.
Follow-ups:
  1. “Your backtracking solution returns an empty result. What are the most common causes?” — Three main causes: (1) you never reach the base case because your recursion never makes progress (infinite loop or wrong loop bounds), (2) you are saving path directly instead of path[:] so all results point to the same empty list after recursion completes, (3) your is_valid function is too restrictive and prunes every branch.
  2. “How would you write a test to catch backtracking bugs before they reach production (or the interviewer)?” — Generate expected results for small inputs by hand, then assert on both the count and the content. For subsets, assert len(result) == 2^n. For permutations, assert len(result) == n!. Also assert that every result is unique and satisfies the constraints. These invariant checks catch most bugs immediately.
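A small invariant-check sketch along the lines of that second follow-up, assuming the subsets and permutations functions defined earlier on this page; the test name is illustrative:
import itertools
from math import factorial

def test_backtracking_invariants():
    nums = [1, 2, 3]
    subs = subsets(nums)          # subsets as defined earlier on this page
    perms = permutations(nums)    # permutations as defined earlier on this page
    assert len(subs) == 2 ** len(nums)                      # Count invariant for subsets
    assert len(perms) == factorial(len(nums))               # Count invariant for permutations
    assert len(set(map(tuple, subs))) == len(subs)          # No duplicate subsets
    assert sorted(map(tuple, perms)) == sorted(itertools.permutations(nums))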
Question: How would you design a Sudoku solver, and what pruning strategies make it fast?
What interviewers are really testing: Whether you can apply backtracking to a complex constraint-satisfaction problem with multiple interacting constraints, and whether you understand that naive backtracking is unusable without intelligent pruning. This is a staff-level question that tests system-level thinking applied to algorithms.
Strong Answer:
  • Basic approach: Find the next empty cell, try digits 1-9, check if the digit is valid (not already in the same row, column, or 3x3 box), place it, recurse to the next empty cell. If no digit works, backtrack (reset the cell to empty and return False).
  • Why naive is slow: A 9x9 Sudoku has up to 81 empty cells. Without pruning, you try 9 options per cell, giving a worst case near 9^81 — astronomically large. Practical Sudoku solving requires aggressive pruning.
  • Pruning strategy 1 — Precompute constraints. Maintain three data structures: rows[9], cols[9], and boxes[9], each a set of digits already placed. Checking validity becomes three O(1) set lookups instead of scanning the row/col/box each time. The box index for cell (r, c) is (r // 3) * 3 + c // 3.
  • Pruning strategy 2 — Choose the most constrained cell first (MRV heuristic). Instead of filling cells left to right, pick the empty cell with the fewest valid options. If a cell has only one valid digit, fill it immediately. If a cell has zero valid options, backtrack immediately without trying anything. This is the “Minimum Remaining Values” heuristic and it dramatically reduces the tree size.
  • Pruning strategy 3 — Constraint propagation. After placing a digit, propagate: if any peer cell (same row/col/box) now has only one valid option, fill it too (and propagate recursively). This is what makes the solver feel “instant” on most puzzles. It is the technique behind Peter Norvig’s famous Sudoku solver.
  • Pruning strategy 4 — Naked pairs/triples. If two cells in the same row can only contain digits {3, 7}, then no other cell in that row can contain 3 or 7. This is an advanced constraint propagation technique.
  • Example: On a typical newspaper Sudoku (about 25 empty cells), the MRV heuristic + basic constraint propagation solves it in under 100 recursive calls. Without these, the same puzzle might need millions of calls.
Red flag answer: Only describing the brute-force “try 1-9 in each cell” without mentioning any pruning. Or claiming it is “just like N-Queens” without acknowledging the additional complexity of row, column, AND box constraints interacting. Also: not knowing how to compute the 3x3 box index from (row, col).
Follow-ups:
  1. “How would you represent the board and constraints to minimize memory and maximize speed?” — Use a 9x9 array for the board. For constraints, use bitmasks: each row, column, and box gets a 9-bit integer where bit d is set if digit d+1 is present. Checking validity is a bitwise AND, and updating is a bitwise OR. This is faster than sets due to cache locality and avoids hash overhead.
  2. “What is the worst-case input for a Sudoku solver, and how does your pruning handle it?” — The worst case is a nearly empty board (few given digits) with many valid solutions. MRV helps less because many cells have similar option counts. Constraint propagation helps less because there are fewer forced moves. In the absolute worst case, the solver degenerates to near brute-force. However, for any valid Sudoku (unique solution), the combination of MRV + propagation keeps the search tree manageable — typically under 10,000 nodes.
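A minimal sketch of the bitmask bookkeeping described in the first follow-up, without the MRV heuristic or constraint propagation; it assumes a 9x9 board of ints with 0 for empty cells (LC 37 itself uses '.' in a board of strings), and the name solve_sudoku is illustrative:
def solve_sudoku(board):
    """Fill a 9x9 board in place; 0 marks an empty cell. Returns True if solved."""
    rows = [0] * 9    # rows[r] has bit d set when digit d+1 already appears in row r
    cols = [0] * 9
    boxes = [0] * 9   # box index for cell (r, c) is (r // 3) * 3 + c // 3
    empties = []
    
    for r in range(9):
        for c in range(9):
            v = board[r][c]
            if v == 0:
                empties.append((r, c))
            else:
                bit = 1 << (v - 1)
                rows[r] |= bit
                cols[c] |= bit
                boxes[(r // 3) * 3 + c // 3] |= bit
    
    def backtrack(idx):
        if idx == len(empties):
            return True                          # Every empty cell filled consistently
        r, c = empties[idx]
        b = (r // 3) * 3 + c // 3
        used = rows[r] | cols[c] | boxes[b]      # Three O(1) lookups, OR'd together
        for d in range(9):
            bit = 1 << d
            if used & bit:
                continue                         # Digit d+1 conflicts with row/col/box
            board[r][c] = d + 1                  # CHOOSE
            rows[r] |= bit; cols[c] |= bit; boxes[b] |= bit
            if backtrack(idx + 1):
                return True
            board[r][c] = 0                      # UN-CHOOSE: revert all four mutations
            rows[r] ^= bit; cols[c] ^= bit; boxes[b] ^= bit
        return False
    
    return backtrack(0)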

Interview Deep-Dive

Question: When is backtracking the right approach compared to BFS, dynamic programming, or greedy?
Strong Answer:
  • Backtracking is the right choice when the problem asks you to enumerate all valid configurations or find any valid configuration that satisfies a set of constraints. The key signals are: “generate all,” “find all,” “list every,” or any constraint-satisfaction framing (Sudoku, N-Queens, crossword filling).
  • Backtracking over BFS: BFS explores level by level and is ideal for shortest-path problems. Backtracking explores depth-first and is ideal when you need complete solutions, not shortest ones. If the question says “minimum steps,” think BFS. If it says “all valid arrangements,” think backtracking.
  • Backtracking over DP: DP computes a single optimal value or count by collapsing overlapping subproblems. Backtracking enumerates actual solutions. If you only need the count of subsets summing to K, DP is O(n * K). If you need to list every such subset, backtracking is unavoidable because the output itself can be exponential.
  • Backtracking over greedy: Greedy makes irrevocable local choices. Backtracking tries a choice, explores, and undoes it if it fails. If a greedy counterexample exists (local optimum does not lead to global optimum), backtracking or DP is needed. But if you only need one valid solution and greedy finds it, greedy is cheaper.
  • The decision heuristic: Ask “do I need the actual solutions or just a count/optimum?” If actual solutions, backtracking. If count/optimum with overlapping subproblems, DP. If count/optimum with greedy-choice property, greedy.
  • Complexity awareness: Backtracking is inherently exponential (O(2^n) for subsets, O(n!) for permutations). If n exceeds ~20-25 for subsets or ~10-12 for permutations, the solution will not finish in time without aggressive pruning.
Follow-up: Your backtracking solution for a constraint-satisfaction problem runs correctly but takes 30 seconds on the largest test case. Walk me through how you would optimize it without changing the paradigm.
  • First, profile where time is spent. In backtracking, the bottleneck is almost always the branching factor at upper levels of the tree, because exponentials amplify early branching.
  • Optimization 1 — Better pruning. Add constraint checks earlier. For example, in Combination Sum, sort candidates and break (not continue) when a candidate exceeds the remaining target, eliminating the entire right subtree.
  • Optimization 2 — Variable ordering (MRV). Choose the most constrained variable first. In N-Queens, place queens in the row with the fewest valid columns first. This prunes more branches higher in the tree where the payoff is largest.
  • Optimization 3 — Precompute validity structures. Replace O(n) validity checks with O(1) set/bitmask lookups. For N-Queens, use cols, diag1, diag2 sets instead of scanning placed queens.
  • Optimization 4 — Symmetry breaking. If the problem has symmetry (e.g., the first queen in N-Queens can be placed in only half the columns due to mirror symmetry), exploit it to halve the search space.
  • Optimization 5 — Early termination. If you only need one solution (not all), return immediately upon finding the first valid one instead of continuing to explore, as sketched below.
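A minimal sketch of Optimization 5 applied to N-Queens — the boolean return value short-circuits the remaining siblings as soon as one full placement is found:
def solve_n_queens_one(n):
    # Return ONE valid placement as a list of column indices per row, or None.
    cols, diag1, diag2 = set(), set(), set()
    placement = []

    def backtrack(row):
        if row == n:
            return True                      # full solution found: stop searching
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            placement.append(col)
            if backtrack(row + 1):
                return True                  # bubble success up, skip all other siblings
            placement.pop()
            cols.remove(col); diag1.remove(row - col); diag2.remove(row + col)
        return False

    return placement if backtrack(0) else None

print(solve_n_queens_one(8))                 # one valid placement, e.g. [0, 4, 7, 5, 2, 6, 1, 3]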
Strong Answer:
  • This is Combination Sum II (LeetCode 40). The input has duplicates, each element can be used at most once, and we need unique combinations.
  • Approach: Sort the array. Backtrack with start = i + 1 (no reuse). Skip duplicates at the same recursion level: if i > start and candidates[i] == candidates[i-1]: continue.
  • Exact complexity: The worst case is O(2^n) combinations (every subset could be valid if all elements are distinct and all sums happen to equal the target). Each combination takes O(k) to copy where k is the combination length. So worst-case output-sensitive complexity is O(k * 2^n). In practice, the target constraint prunes most branches.
  • The single most impactful optimization: Sort the candidates and use break instead of continue when candidates[i] > remaining. Because the array is sorted, once one candidate exceeds the remaining budget, all subsequent candidates do too. This turns a linear scan of remaining candidates into an instant termination. On inputs with many candidates larger than the target, this can reduce the search space by 90% or more.
  • Space complexity: O(n) for the recursion stack and current path (not counting output). The sorting is O(n log n) but dominated by the exponential search.
  • Edge cases that break naive implementations: (1) All elements are identical — the skip condition prevents generating the same combination multiple times. (2) Target is 0 — the empty combination is valid; make sure your base case handles this. (3) All elements exceed the target — return empty immediately with the break optimization.
Follow-up: If the same problem allows unlimited reuse of each element (Combination Sum I), how does the time complexity change and why?
  • The recursion depth is no longer bounded by n (the number of candidates). It is bounded by target / min(candidates). The branching factor at each level is the number of candidates that do not exceed the remaining target.
  • Worst-case complexity becomes O(n^(target/min_candidate)), which can be much larger than O(2^n) for small candidates and large targets. For example, candidates = [1], target = 100 produces a single combination [1,1,…,1] of length 100, but the tree explores every prefix along the way.
  • The key structural difference: with reuse, the recursive call passes i (same index), so the tree can go arbitrarily deep. Without reuse, it passes i + 1, bounding depth to n.
Strong Answer:
  • Structural difference: Array backtracking has a 1D decision space — at each level you pick from a set of candidates. Grid backtracking has a 2D decision space — at each step you move in one of 4 (or 8) directions, and the “candidates” are spatial neighbors rather than array elements.
  • Visited tracking differs fundamentally. In permutations, you use a used boolean array indexed by element position. In grid problems, you mark the cell itself (either mutating the grid or using a 2D visited matrix). The visited state is path-dependent — you must unmark cells when you backtrack so other paths can use them.
  • Grid-specific edge cases:
    1. Stack overflow on large grids. A 500x500 all-land grid can have DFS depth of 250,000, blowing Python’s default 1000-frame recursion limit. Either increase the limit or convert to iterative DFS with an explicit stack (see the sketch after this answer).
    2. Starting point selection. Unlike array problems where you always start from index 0, grid backtracking may require trying every cell as a starting point (Word Search scans all cells for the first character). This adds an O(m*n) multiplier.
    3. Boundary checks. Every recursive call must validate 0 <= r < rows and 0 <= c < cols before accessing the grid. Forgetting this causes index-out-of-bounds errors that do not occur in array backtracking.
    4. Diagonal vs. cardinal movement. Some problems use 4-directional movement, others use 8. The branching factor jumps from 4 to 8, doubling the work at each level.
    5. Grid mutation side effects. If you mark visited by setting grid[r][c] = '#', you must restore the original character on backtrack. Forgetting the restoration is the single most common grid backtracking bug.
  • Complexity: For Word Search, the worst case is O(m * n * 4^L) where L is word length. Each cell is a potential start (m*n), and from each start, the DFS branches into 4 directions up to L levels deep. In practice, pruning (character mismatch, visited cell) cuts this dramatically.
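A minimal sketch of the explicit-stack alternative from edge case 1, shown for a flood-fill-style traversal (counting islands on a '1'/'0' grid) where cells never need to be un-marked; true backtracking with un-marking usually stays recursive:
def count_islands(grid):
    # Iterative DFS: depth is bounded by the explicit stack (a Python list),
    # not by the ~1000-frame recursion limit, so a 500x500 all-land grid is safe.
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    islands = 0
    for sr in range(rows):
        for sc in range(cols):
            if grid[sr][sc] != '1' or seen[sr][sc]:
                continue
            islands += 1
            seen[sr][sc] = True
            stack = [(sr, sc)]
            while stack:
                r, c = stack.pop()
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if 0 <= nr < rows and 0 <= nc < cols \
                            and grid[nr][nc] == '1' and not seen[nr][nc]:
                        seen[nr][nc] = True
                        stack.append((nr, nc))
    return islands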
Follow-up: In Word Search II (searching for multiple words simultaneously), why does building a Trie over the word list and searching once outperform running single-word search for each word?
  • Single-word search for K words is O(K * m * n * 4^L). Each word triggers an independent grid traversal.
  • The Trie approach does one grid traversal. At each cell, it walks the Trie simultaneously with the DFS. If the current prefix has no children in the Trie, the entire DFS branch is pruned. The Trie effectively shares prefix computation across all words.
  • For K = 10,000 words with average length 5 on a 300x300 grid, single-word search does 10,000 * 90,000 * 4^5 = ~900 billion operations. The Trie approach does roughly 90,000 * 4^5 = ~90 million operations (shared across all words), with aggressive Trie pruning reducing it further.
  • Additional optimization: remove a word from the Trie once found, and prune leaf nodes that no longer lead to any word. This prevents redundant searches for already-found words.
Strong Answer:
  • Worst-case inputs are those where pruning eliminates nothing and the algorithm explores the full decision tree. For subsets, this is any input — 2^n subsets always exist. For constraint-satisfaction, the worst case is an input where every partial solution looks promising but ultimately fails at the last step, forcing exhaustive exploration.
  • Classic worst-case example: Backtracking on an unsolvable Sudoku board with very few given digits. The solver tries every combination, backtracks at the end, and ultimately reports “no solution.” Every path is explored to maximum depth before failing.
  • How to assess pruning effectiveness:
    1. Count nodes visited. Add a counter that increments at each recursive call. Compare against the theoretical maximum (2^n for subsets, n! for permutations). If your counter is within 10x of the output size, pruning is excellent. If it is 100x or more, pruning is weak. An instrumented example follows this list.
    2. Estimate branching factor. If the average branching factor at each level is b and depth is d, the tree has roughly b^d nodes. If your pruning reduces the effective branching factor from b to b’, the speedup is (b/b’)^d — exponential in depth. Even reducing the branching factor from 9 to 7 in Sudoku gives a theoretical speedup of (9/7)^81, roughly 7 * 10^8.
    3. Check constraint tightness. Tight constraints (many cells filled in Sudoku, small target in Combination Sum) prune more. Loose constraints (few given digits, large target) prune less.
    4. Profile the rejection rate. Track what percentage of candidates at each level are pruned. If 90% are pruned at level 1, that eliminates 90% of the tree. If only 10% are pruned, the tree is barely reduced.
  • Rule of thumb for interview feasibility: For competitive programming, backtracking solutions with effective pruning can handle n up to about 20-25 (subsets), 10-12 (permutations), or 81 cells (Sudoku with MRV + constraint propagation). If n exceeds these, you need a fundamentally different approach (DP, greedy, math).
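A minimal instrumented example of point 1 — subsets have no pruning at all, so the node count exactly equals the output size:
def subsets_with_counter(nums):
    # Count recursive calls to measure pruning: compare against the output size
    # and against the theoretical maximum (2^n for subsets).
    result = []
    nodes_visited = 0

    def backtrack(start, path):
        nonlocal nodes_visited
        nodes_visited += 1
        result.append(path[:])               # collect at every node
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1, path)
            path.pop()

    backtrack(0, [])
    return result, nodes_visited

subsets, visited = subsets_with_counter([1, 2, 3])
print(len(subsets), visited)                 # 8 8 -- every node is a valid subset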
Follow-up: You are told the constraints are n up to 15 and time limit is 2 seconds. Can you use backtracking? How do you estimate whether it will pass?
  • For n = 15 with subsets, 2^15 = 32,768 — trivially fast. For permutations, 15! = 1.3 trillion — way too slow without pruning.
  • Estimation method: Calculate the theoretical tree size. If it is under ~10^8 operations for Python or ~10^9 for C++/Java, it will likely pass within 2 seconds. For n = 15 permutations without pruning, 15! exceeds 10^12, so no. But with heavy pruning (like constraint-satisfaction where most branches are cut), the effective tree might be 10^6 — then it passes easily.
  • Quick calibration: Python does roughly 10^7 simple operations per second. C++ does roughly 10^8-10^9. If your estimated tree size divided by the pruning factor is under these thresholds, proceed with backtracking. Otherwise, look for DP or a mathematical shortcut.

LeetCode Interview Q&A (Backtracking)

These four are the LeetCode-style backtracking questions you should be able to walk through cold. Each has a strong-answer framework, a worked example with full code, three senior follow-ups, common wrong answers, and pointers to related problems.
Strong Answer Framework:
  1. State that LC 46 (distinct elements) uses a used[] array. The complication for LC 47 is that the input may contain duplicates — naive code produces duplicate permutations.
  2. The fix has two parts: (a) sort the input so identical elements are adjacent; (b) at each level, skip an element if the previous identical element has not been used yet (if i > 0 and nums[i] == nums[i-1] and not used[i-1]: continue).
  3. The intuition: among identical elements, always pick them in their original left-to-right order. This breaks the symmetry that would otherwise produce duplicates.
  4. State complexity: O(n * n!) worst case, but with heavy pruning when there are many duplicates.
Real LeetCode Example — Full Walkthrough:
def permuteUnique(nums):
    nums.sort()                                      # Group duplicates together
    result, used = [], [False] * len(nums)
    
    def backtrack(path):
        if len(path) == len(nums):
            result.append(path[:])
            return
        for i in range(len(nums)):
            if used[i]:
                continue
            # Skip a duplicate if its identical predecessor has NOT been used.
            # This forces left-to-right pick order among duplicates.
            if i > 0 and nums[i] == nums[i-1] and not used[i-1]:
                continue
            used[i] = True; path.append(nums[i])
            backtrack(path)
            path.pop(); used[i] = False
    
    backtrack([])
    return result
# Time: O(n * n!) worst case. Space: O(n) recursion + O(n!) output.
For nums = [1, 1, 2] (already sorted):
  • Without the dedup check, we would pick “first 1” then “second 1” then 2, AND “second 1” then “first 1” then 2 — two identical permutations.
  • With the check: when at depth 0 we try i=1 (the second 1), we see nums[1]==nums[0] and used[0]==False, so we skip. Only one branch starting with 1 is explored.
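Running the permuteUnique code above on this input confirms the dedup behavior (output order follows the loop order of the code as written):
print(permuteUnique([1, 1, 2]))
# [[1, 1, 2], [1, 2, 1], [2, 1, 1]] -- three unique permutations, no duplicates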
Senior Follow-up 1 — “Why check not used[i-1] rather than used[i-1]? Both versions seem to dedup.” Both produce correct results, but they prune at different points. not used[i-1] means “skip the i-th duplicate if the i-1-th has not been picked yet” — this enforces left-to-right pick order among duplicates and prunes higher in the tree. used[i-1] enforces right-to-left order and prunes lower, so it explores more nodes before cutting. The not used[i-1] version is faster in practice, though both are O(n * n!) asymptotically.
Senior Follow-up 2 — “Could you solve this without sorting, using a per-level set of seen values?” Yes:
def permuteUnique(nums):
    result, used = [], [False] * len(nums)
    def bt(path):
        if len(path) == len(nums): result.append(path[:]); return
        seen = set()                              # Reset for each level
        for i in range(len(nums)):
            if used[i] or nums[i] in seen: continue
            seen.add(nums[i])
            used[i] = True; path.append(nums[i])
            bt(path)
            path.pop(); used[i] = False
    bt([]); return result
This avoids the O(n log n) sort but uses O(k) extra space per level (where k is unique values per level). Sort plus skip is more cache-friendly and is what most interviewers expect.
Senior Follow-up 3 — “What if duplicates are very frequent? For example, nums = [1]*20 — how does the runtime change?” With heavy duplication, the deduplication is enormously effective. For [1]*20, naive backtracking would explore 20! roughly equals 2.4 * 10^18 nodes (intractable). With dedup, only 1 permutation exists, and the recursion explores roughly n nodes — linear time. The skip condition is what makes this practical.
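A quick check of that claim with the permuteUnique code above:
print(len(permuteUnique([1] * 20)))          # 1 -- the skip condition collapses the tree to a single path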
Common Wrong Answers:
  • Using a Python set to dedup the final results — correct but exponentially slower because it generates all duplicates first.
  • Forgetting to sort first — the skip condition only works on adjacent duplicates.
  • Writing if i > 0 and nums[i] == nums[i-1]: continue without checking used[i-1] — this incorrectly skips valid permutations like [1, 1, 2] itself.
Further Reading:
  • LC 46 Permutations — the distinct-elements baseline.
  • LC 90 Subsets II — same dedup principle but with start index instead of used[].
Strong Answer Framework:
  1. State the constraint: place exactly one queen per row. This eliminates the need to check rows.
  2. For columns and the two diagonal types, maintain three sets so that is_safe(row, col) is O(1) instead of O(n).
  3. Diagonal identification: cells on the same \ diagonal share row - col. Cells on the same / diagonal share row + col.
  4. Discuss the bitmask optimization for further speedup (LC 52).
Real LeetCode Example — Full Walkthrough:
def solveNQueens(n):
    result = []
    cols, diag1, diag2 = set(), set(), set()
    board = [['.'] * n for _ in range(n)]
    
    def backtrack(row):
        if row == n:
            result.append([''.join(r) for r in board])
            return
        for col in range(n):
            # PRUNE: skip if any of the three constraints is violated
            if col in cols: continue
            if (row - col) in diag1: continue
            if (row + col) in diag2: continue
            # CHOOSE: place queen and update all three constraint sets
            board[row][col] = 'Q'
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            # EXPLORE
            backtrack(row + 1)
            # UN-CHOOSE: revert all four mutations
            board[row][col] = '.'
            cols.remove(col); diag1.remove(row - col); diag2.remove(row + col)
    
    backtrack(0)
    return result
For n=4, the search tree has 4 choices at row 0, but most are quickly pruned. Placing Q at (0, 1) adds 0-1 = -1 to diag1 and 0+1 = 1 to diag2. At row 1, column 0: the diag1 check (1-0 = 1) passes, but the diag2 check (1+0 = 1) fails because 1 is already in diag2. So (1, 0) is correctly pruned — (0, 1) and (1, 0) lie on the same / diagonal.
Senior Follow-up 1 — “Walk me through how to prune even more aggressively using bitmasks.” Replace the three sets with three integers where bit k means “column k / diagonal k is occupied.” For a queen at (row, col):
  • cols |= (1 << col)
  • diag1 |= (1 << (row - col + n - 1)) (offset to keep non-negative)
  • diag2 |= (1 << (row + col))
With this encoding, is_safe is three single-bit tests: cols & (1 << col), diag1 & (1 << (row - col + n - 1)), and diag2 & (1 << (row + col)). A common refinement passes the diagonal masks down the recursion and shifts them by one bit per row so all three masks align on the column index, collapsing the whole check to (cols | diag1 | diag2) & (1 << col). Either way, this is roughly 5-10x faster than set operations for large n. LC 52 (just count, no boards) hits this hard; a sketch of the shifted-mask variant follows.
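A minimal sketch of the shifted-mask counting variant (LC 52 style — count only, no boards):
def totalNQueens(n):
    full = (1 << n) - 1                      # n low bits set: every column

    def bt(cols, ld, rd):
        # cols/ld/rd mark attacked columns for the CURRENT row; the diagonal
        # masks are pre-shifted each level so all three align on column index.
        if cols == full:                     # every column (hence every row) is filled
            return 1
        count = 0
        free = ~(cols | ld | rd) & full      # columns still safe in this row
        while free:
            bit = free & -free               # lowest set bit = next candidate column
            free ^= bit
            count += bt(cols | bit,
                        ((ld | bit) << 1) & full,   # \ diagonals shift left for the next row
                        (rd | bit) >> 1)            # / diagonals shift right for the next row
        return count

    return bt(0, 0, 0)

print(totalNQueens(8))                       # 92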
Senior Follow-up 2 — “What if n is huge — like n equals 30? Can backtracking handle it?” Finding one solution for n=30 is easy (constructive patterns exist for every n greater than 3), but enumerating or even counting all solutions is not: the count grows roughly factorially, and the largest n for which the exact count is known is 27, which required a massive distributed computation. Returning all boards is hopeless well before that — the output itself exceeds RAM. For very large n, constructive solutions and parallelized counting are the tools. For LeetCode purposes, n up to 9 returns all boards comfortably.
Senior Follow-up 3 — “Is N-Queens an NP-hard problem? Why or why not?” Solving N-Queens (finding any solution) is in P — there are constructive solutions known for all n greater than 3. But COUNTING solutions (LC 52 generalized) is what makes the search interesting; no closed-form formula is known and the count grows roughly factorially. This is a useful nuance to mention — it shows you distinguish “find any” from “count all.”
Common Wrong Answers:
  • Checking all 8 directions including downward — unnecessary because rows below have not been filled yet.
  • Using O(n) is_safe checks (scanning rows) instead of O(1) set lookups — works for small n but TLE on n above roughly 10.
  • Forgetting that diagonals come in two types (\ and /) and only tracking one.
Further Reading:
  • LC 52 N-Queens II — counting variant where bitmask optimization shines.
  • LC 37 Sudoku Solver — same constraint-set principle applied to a 9x9 grid.
Strong Answer Framework:
  1. For every cell, start a DFS. At each step, check the current grid character matches word[idx], mark the cell as visited (with a sentinel like '#'), and try all 4 directions.
  2. CRITICAL: restore the cell to its original character on backtrack. Without this, sibling DFS paths cannot reuse the cell.
  3. Complexity: O(m * n * 4^L) where L is word length. The 4^L term is loose; in practice, the character match prunes aggressively.
  4. Discuss optimizations: early character frequency check, reverse the word if the last character is rarer.
Real LeetCode Example — Full Walkthrough:
def exist(board, word):
    rows, cols = len(board), len(board[0])
    
    def dfs(r, c, idx):
        if idx == len(word):                         # Matched whole word
            return True
        if (r < 0 or r >= rows or c < 0 or c >= cols  # Out of bounds
                or board[r][c] != word[idx]):         # Or wrong character
            return False
        # CHOOSE: mark visited
        temp = board[r][c]
        board[r][c] = '#'
        # EXPLORE: 4 directions
        found = (dfs(r+1, c, idx+1)
              or dfs(r-1, c, idx+1)
              or dfs(r, c+1, idx+1)
              or dfs(r, c-1, idx+1))
        # UN-CHOOSE: restore the original character
        board[r][c] = temp
        return found
    
    for r in range(rows):
        for c in range(cols):
            if dfs(r, c, 0):
                return True
    return False
For board = [["A","B","C","E"],["S","F","C","S"],["A","D","E","E"]] and word = "ABCCED":
  • Start at (0, 0)='A', matches word[0]. Mark as #, try directions.
  • Move to (0, 1)='B', matches word[1]. Mark.
  • Move to (0, 2)='C', matches word[2]. Mark.
  • Move to (1, 2)='C', matches word[3]. Mark.
  • Move to (2, 2)='E', matches word[4]. Mark.
  • Move to (2, 1)='D', matches word[5]. The next recursive call sees idx == len(word) and returns True.
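A quick sanity check of that trace with the exist code above:
board = [["A","B","C","E"],["S","F","C","S"],["A","D","E","E"]]
print(exist(board, "ABCCED"))                # True -- follows the exact path traced above
print(exist(board, "ABCB"))                  # False -- the 'B' cell cannot be revisited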
Senior Follow-up 1 — “Why is memoization useless here, when it works for so many other grid problems?” The state for this problem is (r, c, idx, visited_set). The visited_set makes the state exponential — two paths arriving at (r, c, idx) may have different visited cells, so they are NOT equivalent. You cannot cache by (r, c, idx) alone. This is why backtracking (which avoids storing all states) is the right tool.
Senior Follow-up 2 — “How would you optimize for finding many words at once (LC 212)?” Build a Trie from all the words. At each cell, walk the Trie in lockstep with the DFS. If the current cell’s character is not a child in the current Trie node, prune the branch immediately. Found words are recorded when you hit a Trie node marked as a word-end. Optionally, remove found words from the Trie to skip redundant work. This turns single-word time O(K * m * n * 4^L) for K words into roughly O(m * n * 4^L) shared across all words.
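A minimal sketch of the Trie-in-lockstep idea, assuming a plain dict-based Trie with '$' as the word-end marker (the structure is mine, not prescribed by LeetCode):
def findWords(board, words):
    trie = {}
    for w in words:                          # build the Trie over all words
        node = trie
        for ch in w:
            node = node.setdefault(ch, {})
        node['$'] = w                        # '$' marks a complete word

    rows, cols = len(board), len(board[0])
    found = []

    def dfs(r, c, node):
        ch = board[r][c]
        child = node.get(ch)
        if child is None:                    # prefix leads to no word: prune immediately
            return
        word = child.pop('$', None)          # pop so each word is reported only once
        if word is not None:
            found.append(word)
        board[r][c] = '#'                    # mark visited
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and board[nr][nc] != '#':
                dfs(nr, nc, child)
        board[r][c] = ch                     # restore on backtrack
        if not child:                        # dead Trie branch: prune it for later starts
            node.pop(ch)

    for r in range(rows):
        for c in range(cols):
            dfs(r, c, trie)
    return found

print(findWords([["o","a","a","n"], ["e","t","a","e"], ["i","h","k","r"], ["i","f","l","v"]],
                ["oath", "pea", "eat", "rain"]))      # ['oath', 'eat']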
Senior Follow-up 3 — “What is the worst-case input that hits the full O(m * n * 4^L) bound?” A grid filled entirely with the same character and a word that almost-but-not-quite fits. For example, all 'A's with word = "A" * k + "B" for a large k (up to m*n). The DFS explores every simple path of length k from every starting cell, never finds the B, and backtracks fully. This is the classic adversarial input. In practice, real test cases with diverse characters prune early.
Common Wrong Answers:
  • Forgetting to restore the cell after recursion — subsequent paths cannot revisit it, so valid words are reported as not found.
  • Using a separate visited set instead of in-place marking — correct but uses O(m*n) extra space when in-place uses O(L) recursion only.
  • Starting only at cells matching word[0] (an optimization) but forgetting to handle the empty word edge case — LeetCode says word length is at least 1, but always check the constraint statement.
Further Reading:
  • LC 212 Word Search II — the Trie generalization.
  • LC 980 Unique Paths III — grid backtracking with a “visit every cell exactly once” constraint.
Strong Answer Framework:
  1. Both problems ask for combinations summing to a target. The difference is whether each candidate can be used multiple times (LC 39) or only once (LC 40).
  2. The structural difference is one character: recurse with i (reuse allowed) vs i + 1 (no reuse).
  3. LC 40 may have duplicates in the input — requires sort plus skip-equal-at-depth.
  4. Both benefit from sorting plus break-pruning when the candidate exceeds the remaining target.
Real LeetCode Example — LC 39 (Reuse) and LC 40 (No Reuse):
# LC 39: Combination Sum (reuse allowed)
def combinationSum(candidates, target):
    candidates.sort()
    result = []
    def bt(start, path, remaining):
        if remaining == 0:
            result.append(path[:]); return
        for i in range(start, len(candidates)):
            if candidates[i] > remaining: break       # Sorted -- rest are bigger
            path.append(candidates[i])
            bt(i, path, remaining - candidates[i])     # i: same element reusable
            path.pop()
    bt(0, [], target)
    return result

# LC 40: Combination Sum II (no reuse, may have duplicates)
def combinationSum2(candidates, target):
    candidates.sort()
    result = []
    def bt(start, path, remaining):
        if remaining == 0:
            result.append(path[:]); return
        for i in range(start, len(candidates)):
            if candidates[i] > remaining: break
            # Skip duplicates at the same recursion depth
            if i > start and candidates[i] == candidates[i-1]:
                continue
            path.append(candidates[i])
            bt(i + 1, path, remaining - candidates[i]) # i+1: each used once
            path.pop()
    bt(0, [], target)
    return result
For LC 39 with candidates = [2, 3, 6, 7], target = 7:
  • [2, 2, 3]: pick 2, recurse with i=0 (so 2 is reusable); remaining goes 7 → 5 → 3 → 0.
  • [7]: pick 7 directly, remaining=0.
  • Note: we never get [3, 2, 2] because we always go forward by index, preventing same-multiset-different-order duplicates.
For LC 40 with candidates = [10, 1, 2, 7, 6, 1, 5], target = 8:
  • After sort: [1, 1, 2, 5, 6, 7, 10].
  • Valid combinations: [1, 1, 6], [1, 2, 5], [1, 7], [2, 6].
  • Without if i > start and candidates[i] == candidates[i-1]: continue, you would also get duplicate [1, 7] (one with each of the two 1s).
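Running the LC 40 code above on this input confirms the list (output order follows the sorted candidates):
print(combinationSum2([10, 1, 2, 7, 6, 1, 5], 8))
# [[1, 1, 6], [1, 2, 5], [1, 7], [2, 6]]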
Senior Follow-up 1 — “Why i > start and not i > 0 for the duplicate check?” i > 0 would prevent picking [1, 1, 6] because the second 1 (at depth 1, with start=1) would be skipped due to candidates[1] == candidates[0]. We DO want to pick both 1s in different positions of the same combination — that is valid. i > start only prevents picking the same value twice AT THE SAME DEPTH, which is what causes duplicate combinations.
Senior Follow-up 2 — “Why break and not continue when candidates[i] > remaining?” Because the array is sorted. If candidates[i] already exceeds remaining, then candidates[i+1], candidates[i+2], etc. are all larger and cannot fit either. continue would pointlessly check each. break skips them all in one step. On large inputs with widely-varying candidate magnitudes, this can be a 10x speedup.
Senior Follow-up 3 — “What is the time complexity of LC 39, and why is it tricky to express?” The depth is bounded by target / min(candidates) and the branching factor is at most n. Worst case is roughly O(n^(target/min)). For candidates [1] and target=100, this is 1 combination but the tree explores every prefix [1], [1,1], …, [1]*100, which is 100 nodes. So a tighter bound is O(target * S) where S is the number of valid combinations. Most interviewers accept “exponential, bounded by target/min branching factor n” as the answer.
Common Wrong Answers:
  • Recursing with i + 1 in LC 39 — breaks reuse. You miss combinations like [2, 2, 3].
  • Recursing with i in LC 40 — allows reuse. You generate combinations that should not exist.
  • Using continue instead of break for the prune check — correctness preserved but huge perf hit.
  • Forgetting to sort before doing break-prune — the prune is incorrect on unsorted input.
Further Reading: