Those who cannot remember the past are condemned to repeat it. - Dynamic Programming

Dynamic programming is a paradigm that is meant to solve hard problems in a way that is much better than brute force.

The first thing that needs to be said about dynamic programming is that it is not a subtype of divide and conquer. We do not need to divide the problem into fractionally smaller subproblems. Rather, we use properties of the correct solution to rewrite it as a combination of smaller cases.

Thus, divide and conquer is a subtype of dynamic programming.

Here are a few more distinctions between the two paradigms:

Dynamic programming keeps it's options open, considering multiple ways of defining smaller sub problems and closing the best of them.
Because each recursive call of a dynamic programming tries out multiple choice of smaller sub problems, there is usually a significant overlap between sub problems (overlapping sub-problems). This is why caching sub problems is a useful optimisation.
In divide and conquer on the other hand, there is often little overlap. For example, in Merge Sort, there is no overlap. Thus, there is little performance to be gained through memoization.
Most of the canonical applications of divide and conquer replace a straightforward polynomial time algorithm with faster versions. Dynamic programming algorithms, on the other hand, are used for problems where the straightforward solution requires an exponential amount of time.
When you find that the merge step in divide and conquer is excessively complicated, use dynamic programming.

Principles

The following are key characteristics of the dynamic programming paradigm which lay out how any problem may be identified and solved.

Identify how sub problems may be constructed.
Show how to quickly solve larger sub problems given the solutions to smaller ones.
Based on this, rewrite the algorithm to quickly and correctly infer the solution from the set of solutions to all sub problems. Prove that this algorithm is correct.

Step 1 is the most important feature of dynamic programming: identifying a way in which smaller subproblems (in an optimisation problem, these are optimal substructures) can be put together to get the solution to the larger problem.

Step 2 leads to a recursive solution. Here, a recursion tree is created which represents the relationships between the sub problems.

This is a top-down approach to the problem where the solution is built up starting from the biggest instance.

Step 3 is where we convert this recursive solution to an iterative solution. This is equivalent to the proof by induction to prove the correctness of Step 2.

We do this by storing the solutions to the list of smaller subproblems in an array. Then, in each iterative step, we compute the next element in the array in similar fashion to the inductive step.

This yields an elegant direct algorithm for the problem.

The difficult part of dynamic programming is often step 1. We need to find the sub problems that can be put together to get a correct solution.

Once we’re done, the last step is to do complexity analysis. There is a simple way to determine the running time:

\[f(n) \times g(n) +h(n)\]

where:

\(f(n)\) is the number of subproblems (number of states)
\(g(n)\) is the time taken to solve each subproblem given the previous solutions
\(h(n)\) is postprocessing to extract the final answer

Often, dynamic programming algorithms have a reconstruction step where the required answer is extracted from the set of all solutions to the sub-problems that we have constructed.

Steps

The following steps are to be followed to solve any DP problem:

Analyse the structure of an optimal solution to determine how it may be expressed in terms of sub problems
- Define the terms of the recurrence
- Get the base case (only what can be solved trivially)
Write down the recursive definition of the optimal solution (doesn’t need to be code; writing down a mathematical expression is best in most cases) (perhaps the most important)
- This process is all about coming up with all ways to rewrite the problem, so be creative and consider all naive ideas you have
- A hint is to think about all possibilities in which a larger problem might be constructed (start naive and come up with better formulations)
- Figure out if any additional information is needed, if yes, pass it on through the recurrence
- See if you can get rid of redundant terms in the recurrence (terms that are taken care for implicitly)
- Figure out how to make it faster (yes this is part of the planning process too!) Going Faster
Implement a Dynamic Programming algorithm (with a reconstruction step if necessary)

Remember than DP can only be used if the sub-problems form a Directed Acyclic Graph (DAG). This is because there cannot be circular dependencies. We need to have a fixed direction in which problems are reduced so that we can solve the sub-problems in order of the topological sort.

One useful technique that can be used to arrive at the recurrence for DP problems is to fill out a table as follows:

Problem Being Solved	State the problem
Valid Choices, c	The decision that needs to be made in the recurrence
Sub-problems(c)	The immediate sub-problems we need to solve to get the solution to the problem
Contribution(c)	How we can transform the solution of the sub-problem to solve the problem
Combination over all c	The technique to combine the solutions to all subproblems

Implementing the DP Algorithm

The first thing we need to identify is the state space.

The state space is the set of all possible states (or configurations) that a system can occupy during the process of solving a problem. It represents the total collection of subproblems that need to be considered to find the optimal solution to the overall problem.

Once we have the state space, we can cast it as an array where we can directly index into the solution of any sub-problem.

Create the array at the start of the problem, then add to the array as you process the various sub-problems. Don’t try to create the array along the way.

Then, use the recurrence to fill in the array and build up to the largest solution.

However, be careful to ensure that the order in which you will solve the sub-problems is going to be correct.

Reconstruction Step

Dynamic programming typically stores optimal values (like minimum cost, maximum profit, longest subsequence length). Reconstruction means retracing the decisions that produced the optimal value, so we can recover the actual solution (sequence, path, set of items, etc.).

Here are some Common Strategies for Reconstruction:

Parent Pointers
- While filling the DP table, store not just the optimal value but also the choice that led to it (e.g., which index, which transition).
- Example: In longest increasing subsequence (LIS), store the predecessor index for each element.
- Reconstruction: Start from the best endpoint, follow parent pointers backward, then reverse the sequence.
Decision Tracking
- Instead of storing full parent pointers, store a boolean flag or decision marker (e.g., "take item" vs. "skip item").
- Example: In knapsack, for each state (i, w) store whether item i was included.
- Reconstruction: Walk backward from (n, W) and collect items where the flag says "take."
Traceback via Table Comparison
- If parent pointers weren’t stored, you can still reconstruct by comparing DP values:
  - At state (i, j), check if the value equals the transition from (i-1, j) (skip) or (i-1, j-w) + value of item (take).
- Example: In edit distance, compare whether the current cell came from substitution, insertion, or deletion.
Recursive Reconstruction
- Write a recursive function that:
  - Starts at the final state.
  - At each step, decides which transition was used.
  - Recursively reconstructs the solution.
- Often used in problems like matrix chain multiplication (to reconstruct parenthesization).
Path Reconstruction in Graph DP
- For shortest paths (Bellman-Ford, Floyd-Warshall), store next-hop matrices or predecessor arrays.
- Reconstruction: Start at source, repeatedly follow the stored "next" node until destination.
Hybrid Approaches
- Sometimes you combine strategies:
  - Store parent pointers for efficiency.
  - Use table comparison as a fallback.
  - Or store partial reconstruction info (like split points in interval DP).

DP vs Memoization

Dynamic programming is the use of a loop in lieu of recursion, explicitly solving all subproblems in order from smallest to largest. So what’s better: memoization or dynamic programming?

Aspect	Memoization (Top-Down)	Tabulation (Bottom-Up DP)
Approach	Recursive + cache results	Iterative + build table
Execution Order	Solves only needed subproblems	Solves all subproblems systematically
Ease of Implementation	Easier (modify recursive solution)	Harder (requires careful iteration design)
Overhead	Recursion stack overhead	No recursion overhead
Space Usage	May use more memory (stack + cache)	Usually more compact (single table) and sometimes not all states need to be stored
Performance	Slower due to recursion calls	Faster due to iterative nature
Best Use Case	When subproblems are sparse or irregular	When subproblems are dense and predictable

You’ll commonly see people refer to memoized solutions as top-down solutions and dynamic-programming solutions as bottom-up solutions. It’s called “top-down” because, to solve large subproblems, we recurse down to small subproblems. In “bottom-up” solutions, we start from the bottom—the smallest subproblems—and work our way to the top.

One famous example to explain the difference is computing the \(n^{th}\) Fibonacci Number. The naive approach (recursion) has \(O(2^n)\) time complexity.

The memoized solution uses a table to store all computed Fibonacci values. To analyse this, we need to think of the set of problems we need to solve instead of following the recursion.

In the DP approach, the table is the solution space which we iteratively build till we reach our required solution. The analysis is again the product of the number of sub-problems and the time to solve each sub-problem.

Note that when implementing a bottom up algorithm, you need to create the DP table carefully. You need to ensure you create the correct dimensional list. The lists need to be independent (this is easy to miss out in languages like python).

Example Problems

There are a wide range of classic problems that can be solved using dynamic programming.

Longest Increasing Subsequence - This classic problem needs us to think about different ways to formulate the state. We can either store the LIS of first \(n\) elements or the LIS that ends at \(n^{th}\) element. The second formulation turns out to be better. Using this, the update we need to perform is to find the largest solved LIS that ends at an element smaller than the current element.
The Knapsack Problem
Maximum-Weight Independent Set (MWIS) - Our goal here is to find the subset of mutually non-adjacent vertices with the largest total weight. We proceed by examining two cases: one with a vertex included and another with the same vertex excluded. This allows us to rewrite the problem in terms of a smaller graph without that vertex.
Minimise Needleman-Wunsch score of two strings - More commonly called the sequence alignment problem, the algorithm determines the similarity score between two strings.
Levenshtein Distance between Strings - This is the edit distance between two strings, measuring the number of insertions, deletions and substitutions to convert between two strings. Note that there are multiple more ways to define edit distance but this is the vanilla version of the problem.
The solve this, we enumerate all possible ways in which one sub-string can be be converted to another, leading to the following recurrence, leading to a \(O(mn)\) runtime:
```
edit_dist(A, B, n, m) = min(
	edit_dist(A, B, n - 1, m) + 1,
	edit_dist(A, B, n, m - 1) + 1,
	edit_dist(A, B, n - 1, m - 1) + d # d is 1 if the last characters of A and B are different
)
```
Longest Common Subsequence - The goal is to find the longest possible sequence that is a sub-sequence of both sequences.
We use a similar idea to the above to define the recurrence: some modifications exist but it is the same way to think about the decomposition. However, here, the goal is to get a string so we need to perform backtracking to get the solution (where diagonal moves in the state table correspond to added characters).
Optimal Binary Search Tree - Here, we try to determine the search tree that achieves the minimum average search time. This is similar to Huffman codes but we need to factor in the search tree property.
See Optimal Binary Search Tree for more details.
Kadane’s Algorithm - This solves the “Maximum Subarray” problem in \(O(n)\) time.
Pretty simple DP algorithm that stores the solution for the first \(n\) elements using which we can get the solution for the \(n+1\) th element. This gives the largest subarray ending at every element in the array.
Note that we don’t need to store all the sums explicitly but rather just the solution to the problem with the previous index.

Minimum Vertex Cover - The goal is to find the subset of vertices such that all edges connect to some vertex in this subset.

Here, we need to define two formulations of the recurrence: one that includes the vertex abnd one that doesn’t.

cover(T, r, true) =
min(cover(T, c1, true), cover(T, c1, false)) +
min(cover(T, c2, true), cover(T, c2, false)) + …
min(cover(T, ct, true), cover(T, ct, false)) + 1

cover(T, r, false) =
	cover(T, c1, true) + 
	cover(T, c2, true) + … 
	+ cover(T, ct, true)

For all of these problems, the step that involves identifying the subproblems is the most important.

One place where algorithms involving DP come in useful is the Single Source Shortest Path problem. Here are some DP algorithms we use here:

DP Anti-Patterns

DP is great but sometimes, it leads to sub-optimal solutions. We could do better by studying the specifics of the problem. Here are a few examples:

Split-Array Largest Sum (LeetCode 410): Given an integer array nums and an integer k, split nums into k non-empty subarrays such that the largest sum of any subarray is minimized.
This seems like a textbook case of DP however this problem turns out to have an easy optimality check and a clear feasible-infeasible boundary. This means we can binary search directly for the largest sum of any sub-array. The feasibility check is a greedy strategy that build subarrays as big as possible from the left.
This brings down complexity from \(O(n^2 \times k)\) to \(O(n \log ( \sum \text {nums}))\).
Minimum Jumps to Reach End via Prime Teleportation (LeetCode 3629): Most “minimum jumps” problems ask for the minimum number of moves to reach some target. That naturally suggests a DP state like:
\[dp[i] = \text{minimum jumps needed to reach index } i\]
Then you define transitions from earlier reachable states. But in teleportation jump problems:
- you can move forward
- backward
- sideways through prime links
- cycles exist
So states do not form a clean DAG. That means bottom-up DP becomes messy, because:
- state i may depend on j
- and j may depend back on i
This breaks the normal DP flow. You’d need repeated relaxation (Bellman-Ford style), memoized search with cycle handling, or graph algorithms. At that point, BFS is superior. So before choosing DP, ask yourself whether the problem has acyclic dependency order.