Mastering Dynamic Programming: A Deep Dive into 12 Essential Patterns

February 21, 2026

Dynamic Programming is one of those topics that separates programmers who can solve hard problems from those who cannot. It is not a single algorithm — it is a problem-solving paradigm: a way of thinking about computation that transforms exponentially slow recursive solutions into polynomial-time ones by systematically eliminating redundant work.

Yet DP has a reputation for being intimidating. Many learners encounter it, struggle with it, and eventually memorize a handful of solutions without truly understanding the underlying structure. This guide takes a different approach. Rather than cataloguing isolated problems, it organizes the entire landscape of DP into 12 recurring patterns. Each pattern represents a family of problems that share the same state representation, recurrence structure, and fill order. Once you internalize these 12 patterns, you will find that most DP problems you encounter in interviews or competitive programming are variations of something you have already seen.

This post is structured to serve multiple audiences simultaneously. If you are a learner, you will find careful explanations of why each pattern works, not just how to implement it. If you are a senior developer refreshing your knowledge, the pattern tables and skeleton code will let you quickly locate the structure you need. If you are preparing for a technical interview, the "Interview Tip" sections and LeetCode problem lists will help you practice efficiently.


The Foundations: What Makes a Problem a DP Problem?

Before exploring the patterns, it is worth being precise about what DP actually is and when it applies.

The Two Necessary Conditions

A problem is amenable to DP if and only if it exhibits both of the following properties.

Overlapping Subproblems means that a naive recursive solution would compute the same sub-answer multiple times. The canonical example is the Fibonacci sequence: computing fib(5) requires fib(4) and fib(3), but computing fib(4) also requires fib(3) — so fib(3) is computed twice. In a deep recursion, this redundancy grows exponentially.

Optimal Substructure means that the optimal solution to the overall problem can be constructed from the optimal solutions to its subproblems. This is the property that allows you to trust the recurrence relation: if you know the best answer for every subproblem, you can combine them to get the best answer for the full problem.

If a problem has overlapping subproblems but no optimal substructure (e.g., finding the longest simple path in a general graph), DP will not work correctly. If it has optimal substructure but no overlapping subproblems (e.g., merge sort), DP is not needed — divide and conquer suffices.

The Four Building Blocks

Every DP solution is built from four components. Understanding these clearly is the key to designing DP solutions from scratch rather than memorizing them.

Component  | Definition                                | Example
State      | What one subproblem represents            | dp[i] = best value for prefix ending at index i
Recurrence | How to derive dp[i] from smaller states   | dp[i] = max(dp[i-1] + arr[i], arr[i])
Base Case  | The smallest subproblems, solved directly | dp[0] = arr[0]
Fill Order | The order in which states are computed    | Left to right, or by increasing interval length

Top-Down vs. Bottom-Up

DP can be implemented in two equivalent ways. Top-down (memoization) starts from the original problem and recursively breaks it down, caching results as they are computed. Bottom-up (tabulation) starts from the base cases and iteratively fills the DP table in the correct order.

Top-down is often easier to write because the recursion mirrors the natural problem structure. Bottom-up is often more efficient in practice because it avoids function call overhead and is more cache-friendly. For the patterns below, both approaches are valid; the skeletons shown use the bottom-up style unless otherwise noted.
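Since the skeletons below are bottom-up, here is what the top-down style looks like for Fibonacci, as a minimal sketch (the function name fib_memo is my own): the recursion mirrors the mathematical definition, and functools.lru_cache supplies the memoization.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_memo(n):
    # Cached recursion: each distinct n is computed exactly once
    if n <= 1:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)
```

Compare this with the iterative version in Pattern 1: same recurrence, same base cases, different fill mechanism.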


Pattern 1: 1D Linear DP

The Core Idea

This is the most fundamental DP pattern. The state is a single index i, and the recurrence computes dp[i] from one or more earlier values dp[j] where j < i. The table is a one-dimensional array, filled from left to right (or right to left, depending on the problem).

The key insight is that the "best" way to reach position i depends only on the best ways to reach earlier positions. This is optimal substructure in its simplest form.

When to Recognize This Pattern

You are looking at a 1D linear DP problem when the problem involves a sequence (array or string), you need to find an optimal value (max, min, count) over the sequence, and the answer at position i depends on answers at positions before i. Specifically:

  • Fibonacci-style: dp[i] depends on dp[i-1] and dp[i-2].
  • Rod cutting / max product: dp[i] depends on dp[j] for all j < i, representing all possible "cut points."
  • Array hopper: dp[i] asks whether position i is reachable, or what the minimum cost to reach it is.

Worked Example: Fibonacci

The Fibonacci problem is the "Hello, World" of DP. The naive recursive solution has O(2^n) time complexity because it recomputes the same values exponentially many times. The DP solution recognizes that there are only n distinct subproblems and solves each exactly once.

def fib(n):
    if n <= 1:
        return n
    a, b = 0, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
    return b

Notice that this solution uses only O(1) space because dp[i] depends only on the two immediately preceding values. This is a common optimization: when the recurrence only looks back a fixed number of steps, you can replace the full array with a few variables.

Worked Example: Max Product of Cutting Rope

This problem asks for the maximum product you can obtain by cutting a rope of length n into at least two pieces. The state dp[i] represents the maximum product obtainable from a rope of length i with at least one cut.

def max_product_cutting(n):
    # dp[i] = max product from cutting rope of length i (at least one cut)
    dp = [0] * (n + 1)
    for i in range(2, n + 1):
        for cut in range(1, i):
            # Either stop cutting after this cut (cut * (i - cut))
            # or continue cutting the right piece (cut * dp[i - cut])
            dp[i] = max(dp[i], cut * (i - cut), cut * dp[i - cut])
    return dp[n]

The inner loop tries every possible first cut. The expression cut * (i - cut) represents making exactly one cut, while cut * dp[i - cut] represents making at least one more cut in the right piece. This is the "try all options and take the best" structure that characterizes many 1D DP problems.

Worked Example: Array Hopper (Minimum Jumps)

Given an array where each element represents the maximum jump length from that position, find the minimum number of jumps to reach the last index.

def min_jumps(nums):
    n = len(nums)
    dp = [float('inf')] * n
    dp[n - 1] = 0  # Base case: already at the end
    for i in range(n - 2, -1, -1):
        for j in range(i + 1, min(i + nums[i] + 1, n)):
            dp[i] = min(dp[i], 1 + dp[j])
    return dp[0] if dp[0] != float('inf') else -1

Here the table is filled right to left. dp[i] = minimum jumps from position i to the end. The inner loop considers all positions reachable from i in one jump.

Interview Tip: For 1D DP, always ask: "What does dp[i] represent?" and "What are all the ways to arrive at state i?" The recurrence is the answer to the second question. If you can answer both clearly, the code almost writes itself.

Complexity Summary

Problem                  | Time  | Space (optimized)
Fibonacci                | O(n)  | O(1)
Max product cutting      | O(n²) | O(n)
Array hopper (min jumps) | O(n²) | O(n)

Classic LeetCode Problems

  • Fibonacci Number (LC 509)
  • Integer Break (LC 343)
  • Jump Game (LC 55) — reachability variant
  • Jump Game II (LC 45) — minimum jumps variant

Pattern 2: Subarray DP

The Core Idea

Subarray DP focuses on contiguous portions of an array. The state dp[i] represents the best value of some property for the subarray ending at index i. This "ending at" formulation is crucial: it ensures that you can extend the subarray by one element at each step.

When to Recognize This Pattern

The hallmark of subarray DP is the phrase "contiguous subarray" in the problem statement. You need to find the longest, largest, or most optimal contiguous subarray satisfying some condition.

Worked Example: Longest Ascending Subarray

dp[i] = length of the longest ascending subarray ending at index i. The recurrence is simple: if nums[i] > nums[i-1], the ascending run continues; otherwise, it resets to 1.

def longest_ascending_subarray(nums):
    if not nums:
        return 0
    dp = 1    # dp[i], initialized for i = 0
    best = 1
    for i in range(1, len(nums)):
        if nums[i] > nums[i - 1]:
            dp += 1
        else:
            dp = 1
        best = max(best, dp)
    return best

Since dp[i] depends only on dp[i-1], the array is replaced with a single variable.

Worked Example: Kadane's Algorithm (Maximum Subarray Sum)

Kadane's algorithm is one of the most elegant algorithms in computer science. The state cur represents the maximum subarray sum ending at the current position.

def max_subarray_sum(nums):
    best = cur = nums[0]
    for x in nums[1:]:
        # Either extend the previous subarray or start a new one at x
        cur = max(x, cur + x)
        best = max(best, cur)
    return best

The key decision at each step is: "Is it better to extend the existing subarray, or to start fresh at the current element?" If cur + x < x, it means the previous subarray sum is negative and it is better to discard it.

Interview Tip: Kadane's algorithm is a template that appears in many disguises. The 2D submatrix sum problem (Pattern 6) reduces to Kadane's. Whenever you see "maximum sum of a contiguous region," think Kadane's.

Classic LeetCode Problems

  • Longest Continuous Increasing Subsequence (LC 674)
  • Maximum Subarray (LC 53)
  • Max Consecutive Ones (LC 485)

Pattern 3: Word Break / Dictionary DP

The Core Idea

This pattern addresses segmentation problems: given a string and a set of valid "pieces," can the string be fully decomposed into those pieces? The state dp[i] is a boolean: can the prefix s[0..i-1] be segmented using the dictionary?

When to Recognize This Pattern

Look for problems that ask whether a string can be "broken," "split," or "segmented" into valid components from a given set. The key structural observation is that dp[i] is true if there exists some split point j < i such that dp[j] is true and s[j:i] is in the dictionary.

Skeleton

def word_break(s, word_dict):
    word_set = set(word_dict)
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True  # Empty string is always valid
    for i in range(1, n + 1):
        for j in range(i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break
    return dp[n]

The base case dp[0] = True represents the empty prefix, which is trivially valid. The break statement is an optimization: once we find one valid split, we do not need to check others.

Complexity and Optimization

The naive implementation above is O(n³) because of the string slicing s[j:i] inside the loop. Using a hash set for the dictionary makes the lookup O(1) on average, but the slicing itself is O(n). For large inputs, you can precompute all valid word lengths and only check those, reducing the inner loop iterations.
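The length-based optimization can be sketched as follows (the function name word_break_by_lengths is my own): instead of trying every split point j, only try split points whose distance from i is an actual word length in the dictionary.

```python
def word_break_by_lengths(s, word_dict):
    word_set = set(word_dict)
    lengths = sorted({len(w) for w in word_set})  # distinct word lengths only
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True  # Empty prefix is trivially valid
    for i in range(1, n + 1):
        for L in lengths:
            if L > i:
                break  # lengths are sorted; no longer word can fit
            if dp[i - L] and s[i - L:i] in word_set:
                dp[i] = True
                break
    return dp[n]
```

If the dictionary has few distinct word lengths, the inner loop shrinks from O(n) iterations to a small constant.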

Interview Tip: This pattern generalizes to "can we tile/cover/partition a sequence using pieces from a given set?" The DP structure is always the same: dp[i] is reachable if some earlier dp[j] is reachable and the segment [j, i) is a valid piece.

Classic LeetCode Problems

  • Word Break (LC 139)
  • Word Break II (LC 140) — enumerate all valid segmentations

Pattern 4: 2D DP — Edit Distance and Two-Sequence Problems

The Core Idea

When a problem involves two sequences (two strings, two arrays), the natural state is a 2D table where dp[i][j] represents the answer for the first i elements of sequence A and the first j elements of sequence B. The recurrence considers what happens at the "boundary" of both sequences: do the current elements match? If not, what operation (insert, delete, replace) is cheapest?

When to Recognize This Pattern

Two-sequence DP problems typically involve:

  • Computing the "distance" or "similarity" between two strings.
  • Finding the longest common subsequence or substring.
  • Aligning two sequences optimally.

Worked Example: Edit Distance (Levenshtein Distance)

The edit distance between two strings is the minimum number of single-character operations (insert, delete, substitute) needed to transform one string into the other.

def min_distance(word1, word2):
    m, n = len(word1), len(word2)
    # dp[i][j] = edit distance between word1[:i] and word2[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    # Base cases: transforming to/from the empty string
    for i in range(m + 1):
        dp[i][0] = i  # Delete all i characters
    for j in range(n + 1):
        dp[0][j] = j  # Insert j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                dp[i][j] = dp[i-1][j-1]  # Characters match: no operation needed
            else:
                dp[i][j] = 1 + min(
                    dp[i-1][j],    # Delete from word1
                    dp[i][j-1],    # Insert into word1
                    dp[i-1][j-1],  # Replace in word1
                )
    return dp[m][n]

The base cases encode the cost of transforming a string to or from the empty string. The recurrence at each cell considers all three operations and takes the minimum.

Understanding the Recurrence Visually

It helps to think of the DP table as a grid where you move from the top-left (both strings empty) to the bottom-right (both strings fully processed). Each cell dp[i][j] can be reached from:

  • dp[i-1][j] by deleting a character from word1 (move down).
  • dp[i][j-1] by inserting a character into word1 (move right).
  • dp[i-1][j-1] by either matching (if characters are equal) or substituting (if they differ).

Interview Tip: Edit distance is a foundational problem. Many variants exist: what if substitution costs 2? What if you want the actual sequence of operations, not just the count? The structure of the DP table remains the same; only the recurrence changes.
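Recovering the actual operation sequence, as the tip mentions, can be sketched by backtracking through the filled table from dp[m][n] to dp[0][0] (the helper name edit_operations is my own; this is one of several valid optimal sequences):

```python
def edit_operations(word1, word2):
    # Fill the standard edit-distance table, then walk it backward
    m, n = len(word1), len(word2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1])
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and word1[i-1] == word2[j-1] and dp[i][j] == dp[i-1][j-1]:
            i, j = i - 1, j - 1  # match: no operation, no cost
        elif i > 0 and j > 0 and dp[i][j] == dp[i-1][j-1] + 1:
            ops.append(f"replace {word1[i-1]} -> {word2[j-1]}")
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i-1][j] + 1:
            ops.append(f"delete {word1[i-1]}")
            i -= 1
        else:
            ops.append(f"insert {word2[j-1]}")
            j -= 1
    return ops[::-1]  # operations in front-to-back order
```

Each appended operation accounts for exactly one unit of cost, so the list length always equals the edit distance.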

Classic LeetCode Problems

  • Edit Distance (LC 72)
  • Longest Common Subsequence (LC 1143)
  • Minimum ASCII Delete Sum for Two Strings (LC 712)
  • Interleaving String (LC 97)

Pattern 5: 2D DP — Shapes in a Grid (Squares and Crosses)

The Core Idea

This pattern applies to 2D binary matrices where you need to find the largest shape (square, cross, X, rectangle) made entirely of 1s. The key insight is that the "size" of a shape at a given cell can be computed from the sizes at neighboring cells, enabling a DP recurrence.

Worked Example: Largest Square of 1s

dp[i][j] = the side length of the largest square of 1s whose bottom-right corner is at cell (i, j).

The recurrence is elegant: a square of side k with its bottom-right at (i, j) requires squares of side at least k-1 at (i-1, j), (i, j-1), and (i-1, j-1). Therefore:

dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1])   if matrix[i][j] == 1
dp[i][j] = 0                                                  if matrix[i][j] == 0
def max_square(matrix):
    if not matrix or not matrix[0]:
        return 0
    rows, cols = len(matrix), len(matrix[0])
    # Use a (rows+1) x (cols+1) table with a 1-cell border of zeros
    dp = [[0] * (cols + 1) for _ in range(rows + 1)]
    best = 0
    for i in range(rows):
        for j in range(cols):
            if matrix[i][j] == 1:
                dp[i+1][j+1] = 1 + min(dp[i][j], dp[i][j+1], dp[i+1][j])
                best = max(best, dp[i+1][j+1])
    return best * best  # Return area; use best for side length

The 1-cell border of zeros in the DP table eliminates the need for boundary checks in the recurrence.

Cross and X of 1s

For the longest cross of 1s (a "+" shape), precompute four direction arrays for each cell: up[i][j], down[i][j], left[i][j], right[i][j], each representing the number of consecutive 1s in that direction including the cell itself. The arm length of the cross centered at (i, j) is min(up, down, left, right), and the total number of 1s in the cross is 4 * arm - 3.

For the largest X of 1s (a "×" shape), the same idea applies but with the four diagonal directions.
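The four direction arrays for the cross can be sketched as follows (the function name largest_cross is my own; it returns the total number of 1s in the largest cross, using the 4 * arm - 3 formula above):

```python
def largest_cross(matrix):
    if not matrix or not matrix[0]:
        return 0
    rows, cols = len(matrix), len(matrix[0])
    up = [[0] * cols for _ in range(rows)]
    left = [[0] * cols for _ in range(rows)]
    down = [[0] * cols for _ in range(rows)]
    right = [[0] * cols for _ in range(rows)]
    # Top-left pass: consecutive 1s looking up and looking left
    for i in range(rows):
        for j in range(cols):
            if matrix[i][j] == 1:
                up[i][j] = 1 + (up[i-1][j] if i > 0 else 0)
                left[i][j] = 1 + (left[i][j-1] if j > 0 else 0)
    # Bottom-right pass: consecutive 1s looking down and looking right
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if matrix[i][j] == 1:
                down[i][j] = 1 + (down[i+1][j] if i < rows - 1 else 0)
                right[i][j] = 1 + (right[i][j+1] if j < cols - 1 else 0)
    # Arm length at each center is the minimum of the four directions
    best_arm = 0
    for i in range(rows):
        for j in range(cols):
            best_arm = max(best_arm, min(up[i][j], down[i][j], left[i][j], right[i][j]))
    return 4 * best_arm - 3 if best_arm else 0
```

For the X variant, replace the four axis-aligned arrays with four diagonal ones; the min-of-four combination step is unchanged.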

Interview Tip: The "bottom-right corner" framing is a powerful technique. It transforms a 2D shape problem into a 2D DP problem where each cell stores a compact summary of the shape that ends there. Always ask: "What is the right 'anchor point' for the shape I am looking for?"

Classic LeetCode Problems

  • Maximal Square (LC 221)
  • Maximal Rectangle (LC 85)
  • Count Square Submatrices with All Ones (LC 1277)

Pattern 6: 2D DP — Submatrix Sum

The Core Idea

This pattern combines two techniques: 2D prefix sums for O(1) rectangle sum queries, and Kadane's algorithm applied to a compressed 1D representation. Together, they enable finding the maximum-sum submatrix in O(n²m) time (where the matrix is n×m).

2D Prefix Sums

A 2D prefix sum table pre[r][c] stores the sum of all elements in the rectangle from (0,0) to (r-1, c-1). The sum of any rectangle from (r1, c1) to (r2, c2) can then be computed in O(1):

sum = pre[r2+1][c2+1] - pre[r1][c2+1] - pre[r2+1][c1] + pre[r1][c1]

This is the inclusion-exclusion principle applied to rectangles.

The Full Algorithm

def max_submatrix_sum(matrix):
    if not matrix or not matrix[0]:
        return 0
    rows, cols = len(matrix), len(matrix[0])
    # Build 2D prefix sum table
    pre = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            pre[r+1][c+1] = matrix[r][c] + pre[r][c+1] + pre[r+1][c] - pre[r][c]
    best = matrix[0][0]
    for r1 in range(rows):
        for r2 in range(r1, rows):
            # Compress columns into a 1D array for the row band [r1, r2]
            cur = 0
            for c in range(cols):
                col_sum = pre[r2+1][c+1] - pre[r1][c+1] - pre[r2+1][c] + pre[r1][c]
                # Apply Kadane's algorithm on the 1D column sums
                cur = max(col_sum, cur + col_sum)
                best = max(best, cur)
    return best

The outer two loops fix the top and bottom row boundaries of the submatrix. For each such pair of rows, the column sums are computed in O(1) using the prefix table, giving a 1D array. Kadane's algorithm then finds the maximum subarray sum of this 1D array, which corresponds to the best column range for the fixed row range.

Interview Tip: The time complexity is O(n²m) where n is the number of rows and m the number of columns. If n > m, transpose the matrix first to ensure the outer loops iterate over the smaller dimension.

Classic LeetCode Problems

  • Max Sum of Rectangle No Larger Than K (LC 363)
  • Maximal Rectangle (LC 85)

Pattern 7: Knapsack DP

The Core Idea

The knapsack family of problems is one of the most important in combinatorial optimization. The general setup: you have a set of items, each with a weight and a value, and a capacity constraint. You want to select items to maximize total value without exceeding the capacity.

The power of knapsack DP lies in its generality. Many real-world problems — resource allocation, portfolio optimization, scheduling — can be modeled as knapsack variants.

The Four Knapsack Variants

Variant           | Constraint                       | Key Recurrence Difference
0-1 Knapsack      | Each item used at most once      | dp[i-1][j - w[i]] (previous row)
Complete Knapsack | Each item used unlimited times   | dp[i][j - w[i]] (same row)
Multi Knapsack    | Item i has count s[i]            | Extra loop over copies, or binary splitting
Group Knapsack    | Pick at most one from each group | Outer loop over groups, inner over capacity

0-1 Knapsack: Full Derivation

State: dp[i][j] = maximum value using the first i items with capacity j.

Recurrence:

  • If we skip item i: dp[i][j] = dp[i-1][j]
  • If we take item i (only if w[i] <= j): dp[i][j] = v[i] + dp[i-1][j - w[i]]
  • dp[i][j] = max(dp[i-1][j], v[i] + dp[i-1][j - w[i]])

Space Optimization to 1D: Since dp[i] only depends on dp[i-1], we can use a single array. The trick is to iterate j from high to low (from capacity down to w[i]). This ensures that when we update dp[j], the value dp[j - w[i]] still reflects the previous row (i.e., item i has not yet been used).

def knapsack_01(weights, values, capacity):
    dp = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        for j in range(capacity, w - 1, -1):  # Iterate high to low
            dp[j] = max(dp[j], v + dp[j - w])
    return dp[capacity]

Complete Knapsack: The Key Difference

In the complete knapsack, each item can be used unlimited times. The only change is that we iterate j from low to high. This means when we update dp[j], the value dp[j - w[i]] already reflects the current row — meaning item i may have been used already, which is exactly what we want.

def knapsack_complete(weights, values, capacity):
    dp = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        for j in range(w, capacity + 1):  # Iterate low to high
            dp[j] = max(dp[j], v + dp[j - w])
    return dp[capacity]
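The multi knapsack's binary splitting, mentioned in the variant table, can be sketched as follows (the function name knapsack_multi is my own): split s copies of an item into bundles of size 1, 2, 4, ..., so that every count from 0 to s is expressible as a sum of bundles, then run plain 0-1 knapsack over the bundles.

```python
def knapsack_multi(weights, values, counts, capacity):
    # Binary splitting: turn "up to s copies" into O(log s) 0-1 items
    items = []
    for w, v, s in zip(weights, values, counts):
        k = 1
        while s > 0:
            take = min(k, s)
            items.append((w * take, v * take))  # one bundle of `take` copies
            s -= take
            k *= 2
    # Standard 0-1 knapsack over the bundles (iterate high to low)
    dp = [0] * (capacity + 1)
    for w, v in items:
        for j in range(capacity, w - 1, -1):
            dp[j] = max(dp[j], v + dp[j - w])
    return dp[capacity]
```

This reduces the per-item work from O(s) inner passes to O(log s), which matters when counts are large.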

The "According to Last Step" Technique

A powerful technique in knapsack DP (and DP in general) is to define the state by the last decision made. For example, in the Longest Increasing Subsequence (LIS) problem, dp[i] = length of the longest increasing subsequence ending at index i. This "ending at i" formulation is the "last step" — the last element chosen is nums[i].

This technique is useful whenever the problem has a natural "last action" that determines what came before it.
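The LIS formulation described above can be sketched directly (a standard O(n²) version; the function name lis_length is my own):

```python
def lis_length(nums):
    # dp[i] = length of the longest increasing subsequence ending at index i
    n = len(nums)
    dp = [1] * n  # every element alone is a subsequence of length 1
    for i in range(1, n):
        for j in range(i):
            if nums[j] < nums[i]:
                # nums[i] can extend the subsequence ending at j
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp) if dp else 0
```

The "last step" here is the choice of the final element nums[i]; the answer is the best over all possible last elements, hence max(dp).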

Interview Tip: The 0-1 vs. complete knapsack distinction comes down to one word: "iterate high to low" vs. "iterate low to high." In an interview, if you can explain why the direction matters (to prevent or allow reuse), you will stand out.


Pattern 8: Interval DP

The Core Idea

Interval DP is used when the subproblems are defined over contiguous intervals [i, j] of a sequence. The key structural property is that the optimal solution for an interval can be computed by splitting it at some point k and combining the optimal solutions for the two sub-intervals.

The Fill Order

The critical insight about interval DP is the fill order: you must compute all intervals of length 1 before length 2, all length 2 before length 3, and so on. This ensures that when you compute dp[i][j], all sub-intervals dp[i][k] and dp[k+1][j] are already available.

# Pseudocode skeleton: use min or max as the problem requires
for length in range(1, n + 1):          # Iterate by interval length
    for i in range(n - length + 1):     # Iterate over all start positions
        j = i + length - 1              # Compute end position
        for k in range(i, j):           # Try all split points
            dp[i][j] = min_or_max(dp[i][j], dp[i][k] + dp[k+1][j] + cost(i, k, j))

Worked Example: Merge Stones

Given n piles of stones, merge adjacent piles into one. The cost of merging is the total number of stones in the merged piles. Find the minimum total cost to merge all piles into one.

def merge_stones(piles):
    n = len(piles)
    prefix = [0] * (n + 1)
    for i in range(n):
        prefix[i+1] = prefix[i] + piles[i]
    dp = [[0] * n for _ in range(n)]  # Base case: dp[i][i] = 0 (single pile, no cost)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            dp[i][j] = float('inf')
            for k in range(i, j):
                cost = prefix[j+1] - prefix[i]  # Cost of merging [i, j]
                dp[i][j] = min(dp[i][j], dp[i][k] + dp[k+1][j] + cost)
    return dp[0][n-1]

The prefix sum array gives O(1) range sum queries, making the total time complexity O(n³).

Initialization Pitfalls

Interval DP has a common initialization trap. For minimization problems, initialize dp[i][j] to a large value (infinity) for all i < j, and to 0 (or the base case value) for i == j. Do not leave all entries as 0, because 0 is a valid answer for empty intervals but not for non-trivial ones.

Interview Tip: Interval DP problems are often disguised. "Burst Balloons" (LC 312), "Strange Printer" (LC 664), and "Minimum Cost to Cut a Stick" (LC 1547) all use interval DP. The tell-tale sign is that the problem involves making decisions about a range and the cost depends on what remains in the range.


Pattern 9: Tree DP

The Core Idea

Tree DP defines the DP state on the nodes of a tree and computes answers in a bottom-up fashion using a post-order DFS traversal. The state for a node is computed from the states of its children. This is natural because trees have no cycles, so the dependency order is well-defined.

The General Shape

def tree_dp(root):
    # Skeleton: base_value and combine are problem-specific placeholders
    def dfs(node):
        if not node:
            return base_value
        left = dfs(node.left)
        right = dfs(node.right)
        # Combine children results at the current node
        return combine(node.val, left, right)
    return dfs(root)

The dfs function returns a value (or a tuple of values) representing the DP state for the subtree rooted at node.

Worked Example: Binary Tree Maximum Path Sum (LC 124)

A path in a binary tree can go through any node and must be a connected sequence of nodes. Find the maximum sum of any path.

The key insight is to separate two quantities:

  1. The maximum "gain" from a node going down one branch (used when the path passes through the node's parent).
  2. The maximum path sum passing through the current node (used to update the global answer).
def max_path_sum(root):
    ans = float('-inf')
    def dfs(node):
        nonlocal ans
        if not node:
            return 0
        # If a branch has negative gain, we ignore it (take max with 0)
        left = max(dfs(node.left), 0)
        right = max(dfs(node.right), 0)
        # Update global answer: path goes left → node → right
        ans = max(ans, node.val + left + right)
        # Return the max gain going down one branch
        return node.val + max(left, right)
    dfs(root)
    return ans

Worked Example: House Robber III (LC 337)

In this problem, you cannot rob two directly connected nodes. The state is a pair: (rob_this, skip_this).

def rob(root):
    def dfs(node):
        if not node:
            return (0, 0)
        l_rob, l_skip = dfs(node.left)
        r_rob, r_skip = dfs(node.right)
        # If we rob this node, we cannot rob its children
        rob_this = node.val + l_skip + r_skip
        # If we skip this node, we take the best of each child
        skip_this = max(l_rob, l_skip) + max(r_rob, r_skip)
        return (rob_this, skip_this)
    return max(dfs(root))

Returning a tuple is a powerful technique in tree DP. It allows the parent node to choose the best option based on the full state of the subtree, rather than a single aggregated value.

Worked Example: Diameter of Binary Tree (LC 543)

The diameter is the length of the longest path between any two nodes. The path may or may not pass through the root.

def diameter(root):
    ans = 0
    def dfs(node):
        nonlocal ans
        if not node:
            return 0
        left = dfs(node.left)
        right = dfs(node.right)
        # The diameter through this node is left_depth + right_depth
        ans = max(ans, left + right)
        # Return the max depth below this node
        return 1 + max(left, right)
    dfs(root)
    return ans

Interview Tip: In tree DP, the function return value and the global answer are often different things. The return value is what the parent needs; the global answer is updated as a side effect. This separation is the key to solving problems like Maximum Path Sum and Diameter.

Classic LeetCode Problems

  • Binary Tree Maximum Path Sum (LC 124)
  • House Robber III (LC 337)
  • Diameter of Binary Tree (LC 543)
  • Longest Univalue Path (LC 687)
  • Count Ways to Build Rooms in Ant Colony (LC 1916)

Pattern 10: Bitmask DP (State Compression DP)

The Core Idea

When the state of a problem involves a subset of a small set of elements (typically n ≤ 20), you can represent each subset as an integer from 0 to 2^n - 1. Each bit in the integer indicates whether the corresponding element is included in the subset. DP over all 2^n subsets is called bitmask DP or state compression DP.

Bitmask Operations

  • Check whether element i is in mask: (mask >> i) & 1 (is bit i set?)
  • Add element i to mask: mask | (1 << i) (set bit i)
  • Remove element i from mask: mask & ~(1 << i) (clear bit i)
  • Enumerate all non-empty submasks of mask: start with sub = mask and repeat sub = (sub - 1) & mask while sub > 0
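The submask enumeration trick is the least obvious entry in the list above, so here is a minimal sketch of it in isolation (the helper name submasks_of is my own):

```python
def submasks_of(mask):
    # Collect all non-empty submasks of `mask`, in decreasing numeric order.
    # (sub - 1) & mask clears the lowest set bit of sub and "borrows" into
    # the lower bits of mask, visiting every submask exactly once.
    subs = []
    sub = mask
    while sub > 0:
        subs.append(sub)
        sub = (sub - 1) & mask
    return subs
```

For mask = 0b101 this yields 0b101, 0b100, 0b001. Iterating submasks of every mask this way costs O(3^n) in total, which is the basis of subset-sum-over-subsets DPs.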

Worked Example: Traveling Salesman Problem (TSP)

The TSP asks for the shortest Hamiltonian cycle — a route that visits every city exactly once and returns to the start. The brute-force approach tries all n! permutations, which is infeasible for n > 12. Bitmask DP reduces this to O(2^n × n²).

State: dp[mask][i] = minimum cost to reach city i, having visited exactly the cities in mask.

def tsp(dist):
    n = len(dist)
    INF = float('inf')
    dp = [[INF] * n for _ in range(1 << n)]
    dp[1][0] = 0  # Start at city 0; mask = 0b1 (only city 0 visited)
    for mask in range(1 << n):
        for u in range(n):
            if dp[mask][u] == INF:
                continue
            if not (mask >> u) & 1:
                continue  # u must be in the visited set
            for v in range(n):
                if (mask >> v) & 1:
                    continue  # v must not be visited yet
                new_mask = mask | (1 << v)
                dp[new_mask][v] = min(dp[new_mask][v], dp[mask][u] + dist[u][v])
    full = (1 << n) - 1
    return min(dp[full][i] + dist[i][0] for i in range(1, n))

Worked Example: Partition to K Equal Sum Subsets (LC 698)

def can_partition_k_subsets(nums, k):
    total = sum(nums)
    if total % k:
        return False
    target = total // k
    n = len(nums)
    dp = [None] * (1 << n)
    dp[0] = 0  # dp[mask] = current sum in the active bucket
    for mask in range(1 << n):
        if dp[mask] is None:
            continue
        for i in range(n):
            if (mask >> i) & 1:
                continue  # nums[i] already used
            if dp[mask] % target + nums[i] <= target:
                dp[mask | (1 << i)] = (dp[mask] + nums[i]) % target
    return dp[(1 << n) - 1] == 0

Interview Tip: Bitmask DP is the go-to technique when the problem involves "visiting all elements" or "assigning elements to groups" with a small n. The complexity O(2^n × n) or O(2^n × n²) is acceptable for n ≤ 20.

Classic LeetCode Problems

  • Shortest Path Visiting All Nodes (LC 847)
  • Partition to K Equal Sum Subsets (LC 698)
  • Maximum AND Sum of Array (LC 2172)

Pattern 11: Digit DP

The Core Idea

Digit DP is a specialized technique for counting integers in a range [lo, hi] that satisfy some property defined at the digit level — for example, "sum of digits equals s," "no two consecutive digits are the same," or "the number is divisible by k."

The key idea is to build the number digit by digit from the most significant to the least significant, while tracking a tight constraint: are we still bounded by the upper limit, or have we already chosen a smaller digit at some position?

The Tight Flag

The tight flag is the heart of digit DP. At each position:

  • If tight = True, the current digit can range from 0 to digits[pos] (the corresponding digit of the upper limit).
  • If tight = False, the current digit can range from 0 to 9 (no upper bound constraint).

Once you choose a digit strictly less than the limit digit, tight becomes False for all subsequent positions.

General Skeleton

from functools import lru_cache

def count_in_range(n):
    """Count integers in [0, n] satisfying some property."""
    # is_valid, transition, and initial_state are problem-specific placeholders
    digits = list(map(int, str(n)))
    length = len(digits)

    @lru_cache(maxsize=None)
    def dfs(pos, tight, *state):
        if pos == length:
            return 1 if is_valid(state) else 0
        limit = digits[pos] if tight else 9
        result = 0
        for d in range(0, limit + 1):
            new_tight = tight and (d == limit)
            new_state = transition(state, d)
            result += dfs(pos + 1, new_tight, *new_state)
        return result

    return dfs(0, True, *initial_state)

# For a range [lo, hi]: count_in_range(hi) - count_in_range(lo - 1)

Worked Example: Count Numbers with No Consecutive Same Digits

from functools import lru_cache

def no_consecutive_same(n):
    digits = list(map(int, str(n)))
    k = len(digits)

    @lru_cache(maxsize=None)
    def dfs(pos, tight, started, prev):
        if pos == k:
            return 1  # Every completed path corresponds to one integer in [0, n]
        limit = digits[pos] if tight else 9
        result = 0
        for d in range(0, limit + 1):
            if started and d == prev:
                continue  # No consecutive same digit
            # A started flag handles leading zeros: padding zeros are not
            # digits of the number, so they must not trigger the constraint
            result += dfs(pos + 1, tight and d == limit, started or d != 0, d)
        return result

    return dfs(0, True, False, 0)

Key Invariants

Three invariants make digit DP correct:

  1. Always memoize on (pos, tight, ...state). The tight flag is part of the memoization key because the same (pos, state) pair has different valid digit ranges depending on whether tight is true or false.

  2. tight is True on exactly one path — the path that exactly matches the prefix of the upper limit. All other paths have tight = False.

  3. Leading zeros require a separate started flag if they affect the validity of the number (e.g., "007" should not be counted as a 3-digit number).

Interview Tip: Digit DP problems are rare in standard interviews but common in competitive programming. The template is always the same; the only thing that changes is the state and the is_valid / transition functions. Practice by identifying what "state" you need to track at each digit position.

Classic LeetCode Problems

  • Count Numbers with Unique Digits (LC 357)
  • Numbers At Most N Given Digit Set (LC 902)
  • Non-negative Integers Without Consecutive Ones (LC 600)
  • Digit Count in Range (LC 1067)

Pattern 12: Probability and Expected Value DP

The Core Idea

This pattern computes the probability of an event or the expected value of a quantity in a random process. The state represents the current situation (position, round, remaining resources), and the transitions are weighted by the probability of each outcome.

The fundamental equation is:

dp[state] = Σ (probability of transition i) × dp[next_state_i]

This can be computed either forward (spreading probability from the initial state) or backward (pulling expected value from the terminal states).

Forward vs. Backward DP

Approach      Direction             When to Use
Forward DP    Initial → Terminal    You know the starting state and want the probability of reaching various end states
Backward DP   Terminal → Initial    You know the terminal states and want the expected value from the start
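A minimal sketch of the backward direction (the problem and the name expected_steps are my own illustration, not a specific LeetCode problem): a token at position 0 moves +1 or +2 with equal probability each step, and we pull the expected number of steps to reach position n or beyond back from the terminal states.

```python
def expected_steps(n):
    # E[i] = expected number of steps to reach a position >= n, starting from i
    E = [0.0] * (n + 2)  # E[n] = E[n + 1] = 0: already at or past the target
    for i in range(n - 1, -1, -1):  # backward: from terminal states toward the start
        E[i] = 1 + 0.5 * E[i + 1] + 0.5 * E[i + 2]
    return E[0]
```

For n = 2 this gives 1.5: half the time a single +2 jump finishes, half the time two steps are needed.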

Worked Example: Knight Probability (LC 688)

A knight starts at (row, col) on an n×n chessboard and makes k random moves. What is the probability that the knight remains on the board after all k moves?

def knight_probability(n, k, row, col):
    moves = [(-2, -1), (-2, 1), (-1, -2), (-1, 2),
             (1, -2), (1, 2), (2, -1), (2, 1)]
    # dp[r][c] = probability of being at (r, c) after the current step
    dp = [[0.0] * n for _ in range(n)]
    dp[row][col] = 1.0
    for _ in range(k):
        new_dp = [[0.0] * n for _ in range(n)]
        for r in range(n):
            for c in range(n):
                if dp[r][c] == 0:
                    continue
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < n and 0 <= nc < n:
                        new_dp[nr][nc] += dp[r][c] / 8.0
        dp = new_dp
    return sum(dp[r][c] for r in range(n) for c in range(n))

This is a forward DP: we start with probability 1 at the initial position and spread it forward through each move.

Worked Example: New 21 Game (LC 837)

Alice starts with 0 points. While her score is less than k, she draws a number uniformly at random from [1, max_pts] and adds it to her score; she stops as soon as her score reaches k or more. What is the probability that her final score is at most n?

def new21_game(n, k, max_pts):
    if k == 0 or n >= k + max_pts:
        return 1.0
    dp = [0.0] * (n + 1)
    dp[0] = 1.0
    window_sum = 1.0  # Sliding window sum of the last max_pts dp values
    for i in range(1, n + 1):
        dp[i] = window_sum / max_pts
        if i < k:  # only scores below k keep drawing, so only they feed later states
            window_sum += dp[i]
        if i >= max_pts:
            window_sum -= dp[i - max_pts]
    return sum(dp[k:n + 1])

The sliding window optimization reduces the time complexity from O(n × max_pts) to O(n). This is a crucial optimization technique: when each state depends on the sum of a fixed-size window of previous states, maintain a running sum instead of recomputing it.

Interview Tip: Probability DP problems often have a "sliding window" optimization hiding in them. If dp[i] depends on dp[i-1] + dp[i-2] + ... + dp[i-W] for a fixed window size W, maintain a running sum and update it in O(1) per step.
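The same trick applies outside probability DP. A hedged sketch (the staircase framing and the name climb_ways are my own): counting the ways to climb n stairs taking between 1 and W steps at a time, where dp[i] is the sum of the previous W values.

```python
def climb_ways(n, W):
    """Ways to climb n stairs taking 1..W steps at a time."""
    dp = [0] * (n + 1)
    dp[0] = 1
    window = 1  # running sum of dp[i - W .. i - 1]
    for i in range(1, n + 1):
        dp[i] = window          # O(1) instead of summing W previous values
        window += dp[i]         # dp[i] enters the window for the next state
        if i - W >= 0:
            window -= dp[i - W]  # dp[i - W] slides out of the window
    return dp[n]
```

With W = 2 this reproduces the Fibonacci-style climbing-stairs counts in O(n) regardless of W.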

Classic LeetCode Problems

  • Knight Probability in Chessboard (LC 688)
  • New 21 Game (LC 837)
  • Soup Servings (LC 808)
  • Dice Roll Simulation (LC 1223)

Putting It All Together: A Pattern Recognition Guide

When you encounter a DP problem, use the following decision process to identify the right pattern.

Signal in the Problem                                        Likely Pattern
Single sequence, value at i depends on earlier values        1D Linear DP
"Contiguous subarray" with max/min/length                    Subarray DP (Kadane's)
"Can string be split into dictionary words?"                 Word Break DP
Two strings or sequences, alignment/distance                 2D Two-Sequence DP
2D binary grid, find largest shape of 1s                     2D Shapes DP
2D grid, find maximum sum rectangle                          Submatrix Sum DP
Items with weights/values, capacity constraint               Knapsack DP
Optimal value over a range [i, j], split at k                Interval DP
Tree structure, combine child results                        Tree DP
Small n, "visit all" or "assign to groups"                   Bitmask DP
Count integers in range with digit property                  Digit DP
Random process, find probability or expected value           Probability/EV DP

Common Mistakes and How to Avoid Them

Off-by-one errors in base cases. Always be explicit about what dp[0] means. Does it represent an empty prefix, a single element, or something else? Draw out the first few entries of the table manually before coding.
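As an example, in the classic coin-change recurrence (LC 322; a minimal sketch, with the name coin_change my own) it pays to state the base case in words: dp[0] = 0 literally means "the empty selection of coins makes amount 0".

```python
def coin_change(coins, amount):
    """Minimum number of coins summing to `amount`, or -1 if impossible."""
    INF = float('inf')
    dp = [0] + [INF] * amount  # dp[0] = 0: zero coins make amount 0
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and dp[a - c] + 1 < dp[a]:
                dp[a] = dp[a - c] + 1
    return dp[amount] if dp[amount] < INF else -1
```

Writing the meaning of dp[0] as a comment before filling the table is a cheap way to catch off-by-one bugs early.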

Wrong fill order in interval DP. If you fill i and j in simple ascending order, you will read dp[k+1][j] before it is computed, because row k+1 comes after row i. Always fill by interval length, so every shorter interval is ready before any longer interval that depends on it.
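A sketch of the correct fill order using Longest Palindromic Subsequence (LC 516), where every dp[i][j] depends only on strictly shorter intervals:

```python
def longest_pal_subseq(s):
    n = len(s)
    dp = [[0] * n for _ in range(n)]
    for i in range(n):
        dp[i][i] = 1  # length-1 intervals are the base case
    for length in range(2, n + 1):        # fill by increasing interval length
        for i in range(n - length + 1):
            j = i + length - 1
            if s[i] == s[j]:
                dp[i][j] = dp[i + 1][j - 1] + 2
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j - 1])
    return dp[0][n - 1]
```

The outer loop over length guarantees that dp[i+1][j-1], dp[i+1][j], and dp[i][j-1] (all shorter intervals) are already final when dp[i][j] is computed.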

0-1 vs. complete knapsack direction. The direction of the inner loop (high-to-low vs. low-to-high) determines whether items can be reused. Mixing them up produces incorrect results that are hard to debug.
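A side-by-side sketch (the names knapsack_01 and knapsack_complete are mine) makes the contrast concrete: the only difference between the two is the direction of the inner loop.

```python
def knapsack_01(weights, values, cap):
    """0-1 knapsack: each item may be taken at most once."""
    dp = [0] * (cap + 1)
    for w, v in zip(weights, values):
        for c in range(cap, w - 1, -1):   # high-to-low: dp[c - w] is from the
            dp[c] = max(dp[c], dp[c - w] + v)  # previous item's row, so no reuse
    return dp[cap]

def knapsack_complete(weights, values, cap):
    """Complete knapsack: each item may be taken any number of times."""
    dp = [0] * (cap + 1)
    for w, v in zip(weights, values):
        for c in range(w, cap + 1):       # low-to-high: dp[c - w] may already
            dp[c] = max(dp[c], dp[c - w] + v)  # include this item, allowing reuse
    return dp[cap]
```

With weights [2, 3], values [3, 4], and capacity 7, the 0-1 version yields 7 (one of each), while the complete version yields 10 (two copies of the first item plus the second).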

Forgetting the tight flag in digit DP memoization. Without tight in the cache key, the function will return the wrong count for states that are reachable both with and without the tight constraint.

Not considering negative gains in tree DP. In problems like Maximum Path Sum, always clamp child contributions to 0 with max(dfs(child), 0). A negative subtree should be ignored, not included.
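The clamp in code, sketched on a minimal tree node (the Node class here is a stand-in for LeetCode's TreeNode, following the usual LC 124 formulation):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def max_path_sum(root):
    best = float('-inf')

    def dfs(node):
        nonlocal best
        if node is None:
            return 0
        left = max(dfs(node.left), 0)    # ignore a negative left subtree
        right = max(dfs(node.right), 0)  # ignore a negative right subtree
        best = max(best, node.val + left + right)  # path bending at this node
        return node.val + max(left, right)  # path continuing up to the parent

    dfs(root)
    return best
```

For the tree [-10, 9, 20, null, null, 15, 7] the answer is 42 (the path 15 → 20 → 7); without the clamp, the -10 root could wrongly drag down otherwise-optimal paths.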


Conclusion

Dynamic Programming is not a single algorithm but a way of thinking. The 12 patterns in this guide represent the major structural families that appear repeatedly across hundreds of problems. By learning to recognize these patterns — by their state representation, their recurrence structure, and their fill order — you transform DP from an intimidating black box into a systematic toolkit.

The path to mastery is straightforward: understand the pattern, implement the skeleton from memory, and then solve 3–5 problems in each family. After that, you will find that most "hard" DP problems are simply combinations or variations of what you already know.

The table below summarizes all 12 patterns for quick reference.

#    Pattern          State Shape                    Key Technique                       Typical Complexity
1    1D Linear        dp[i]                          Recurrence on previous indices      O(n) or O(n²)
2    Subarray         dp[i] (ending at i)            Kadane's algorithm                  O(n)
3    Word Break       dp[i] (prefix of length i)     Dictionary lookup                   O(n²)
4    Two-Sequence     dp[i][j]                       Insert/delete/replace               O(mn)
5    Shapes in Grid   dp[i][j] (anchor at (i,j))     Direction runs                      O(mn)
6    Submatrix Sum    2D prefix + 1D Kadane          Row compression                     O(n²m)
7    Knapsack         dp[i][j] (items, capacity)     Loop direction                      O(nW)
8    Interval         dp[i][j] (interval [i,j])      Fill by length                      O(n³)
9    Tree             dfs(node) (subtree)            Post-order DFS                      O(n)
10   Bitmask          dp[mask][i]                    Subset enumeration                  O(2^n × n)
11   Digit            dfs(pos, tight, state)         Tight flag + memoization            O(n × states)
12   Probability/EV   dp[state]                      Forward/backward + sliding window   Varies

Happy coding, and may your recurrences always converge.
