The goal of the study of algorithms is to stock your algorithmic toolbox with as many primitives as possible. Primitives are algorithms that we use all the time and often appear as a part of code we use to solve real-world problems.
Here is a great resource for learning about algorithms: https://cp-algorithms.com/
The fantastical trunk of algorithms
Developing algorithms
When we develop algorithms, we are often seeking to put together knowledge that we have in novel ways that is adapted to the problem at hand. You might be doing 1 of two things:
- Modifying algorithms that we know
- Adapting the inputs so that we can use our algorithms as black boxes
Here is a rough field guide for how to set about algorithm design:
- Figure out if the problem is a disguised version of a problem that you already know how to solve. If it is, you have the solution.
- If it is not, see if you can use a for free primitive to simplify the problem.
- If none of the above possible, identify a line in the sand by identifying the obvious (naive) solution, usually using exhaustive search.
- If the straightforward algorithm is inefficient, think about greedy algorithms that you can use. Most likely they will fail but the way in which they fail will tell you about possible ways to approach the problem.
- See if you can apply any of the paradigms that you have learned: divide and conquer and dynamic programming.
- Think about what data structure can help make the solution even better.
We have tools for reasoning about code correctness, which allows us to come up with reasonable proofs. Thinking along the lines of this tools allows us to come up with a piece of code that is intuitive to understand.
Invariants
In programming, an invariant is a condition or property of a program's state that is always true at specific, well-defined points during its execution. These are highly useful to start thinking about how to solve problems.
If you can identify such a property, you can often:
- Prove that some configuration is impossible.
- Prove that a process must terminate in a desired state.
- Design the algorithm to maintain that property.
Here are a few examples of invariants:
Invariants in famous algorithms
Basic sorting algorithms
- Insertion sort:
“The insertion sort algorithm maintains the invariant that after inserting the element at position i, the prefix consisting of the first i+1 elements of the array appears in sorted order.”
- Selection sort:
“The selection sort algorithm maintains the invariant that after ii passes of the algorithm on the array, the first ii elements of the array are the ii smallest elements of the array and appear in sorted order.”
- Merge sort (top‑down):
“The merge sort algorithm maintains the invariant that immediately before merging two subarrays, each subarray individually appears in sorted order.”
- Quick sort partition (Lomuto):
“The Lomuto partition procedure maintains the invariant that at any point during the scan, all elements strictly before index ii are less than or equal to the pivot, all elements between indices ii and j−1j−1 are greater than the pivot, and all elements from index jj onward are not yet classified.”
- Quick sort partition (Hoare):
“The Hoare partition procedure maintains the invariant that all elements strictly to the left of index ii are less than or equal to the pivot and all elements strictly to the right of index jj are greater than or equal to the pivot.”
Searching and selection
- Linear search (forward):
“The linear search algorithm maintains the invariant that after examining the first ii positions, if the key appears among those positions then its index has already been found.”
- Binary search:
“The binary search algorithm maintains the invariant that the target value, if it appears in the array, must lie within the current interval [low,high][low,high] of indices being considered.”
- Binary search (open/closed form):
“In the half‑open variant of binary search, the algorithm maintains the invariant that the target value, if present, lies within the interval [low,high)[low,high), and all indices strictly less than lowlow or greater than or equal to highhigh have been ruled out.”
- Selection (Quickselect):
“The quickselect algorithm maintains the invariant that after each partition step, the desired kk-th smallest element is contained entirely within the subarray that will be recursively selected.”
Graph algorithms
- Breadth‑first search (BFS):
“The BFS algorithm maintains the invariant that when a vertex is dequeued, all vertices at a smaller distance from the source have already been fully processed, and all still‑queued vertices are at distance either equal to or exactly one greater than that vertex.”
- Depth‑first search (DFS, recursive):
“The DFS algorithm maintains the invariant that when entering a recursive call on a vertex, every vertex on the recursion stack is currently being explored along a simple path from the start vertex.”
- Dijkstra’s shortest path:
“Dijkstra’s algorithm maintains the invariant that after a vertex is extracted from the priority queue and marked as finalized, the current distance value stored for that vertex equals the length of the shortest path from the source to that vertex.”
- Bellman–Ford:
“The Bellman–Ford algorithm maintains the invariant that after kk relaxation rounds, for every vertex, the stored distance is at most the length of any path from the source to that vertex using at most kk edges.”
- Floyd–Warshall:
“The Floyd–Warshall algorithm maintains the invariant that after considering the first kk intermediate vertices, the distance value dist[i][j]dist[i][j] is the length of the shortest path from vertex ii to vertex jj that uses only intermediate vertices from the set {1,…,k}{1,…,k}.”
- Topological sort (Kahn’s algorithm):
“Kahn’s topological sort algorithm maintains the invariant that the list built so far is a valid topological ordering of the vertices that have already been removed from the graph.”
Data structures
- Heap (array representation of a binary heap):
“The binary heap operations maintain the invariant that for every node in the heap, the key stored at that node is less than or equal to the keys stored at its children (for a min‑heap).”
- Heapify (build‑heap):
“The bottom‑up heap construction algorithm maintains the invariant that after heapifying the subtree rooted at index ii, the subarray representing that subtree satisfies the heap order property.”
- Union–find (disjoint set forest):
“The union–find data structure maintains the invariant that each set is represented by a unique root, and for any node, following parent pointers repeatedly eventually reaches that root.”
- Balanced search trees (e.g., red‑black tree):
“A red‑black tree maintains the invariant that every path from the root to a leaf contains the same number of black nodes, and no red node has a red child.”
Dynamic programming algorithms
- Dynamic programming for Fibonacci numbers:
“The bottom‑up Fibonacci algorithm maintains the invariant that after filling entry ii in the table, each entry from 00 through ii stores the correct Fibonacci value for its index.”
- Longest common subsequence (LCS):
“The LCS dynamic program maintains the invariant that the entry dp[i][j]dp[i][j] equals the length of the longest common subsequence of the prefixes consisting of the first ii characters of the first string and the first jj characters of the second string.”
- Knapsack (0/1 DP):
“The 0/1 knapsack dynamic program maintains the invariant that the entry dp[i][w]dp[i][w] equals the maximum total value achievable using only the first ii items with total weight at most ww.”
- Edit distance (Levenshtein):
“The edit distance dynamic program maintains the invariant that the entry dp[i][j]dp[i][j] equals the minimum edit distance between the prefix of length ii of the source string and the prefix of length jj of the target string.”
Greedy algorithms and flows
- Kruskal’s minimum spanning tree:
“Kruskal’s algorithm maintains the invariant that the set of edges chosen so far forms a forest that is a subset of some minimum spanning tree of the graph.”
- Prim’s minimum spanning tree:
“Prim’s algorithm maintains the invariant that the edges chosen so far always form a tree that is a prefix of some minimum spanning tree, and every chosen edge connects a vertex inside the tree to a vertex outside it with minimum possible weight.”
- Ford–Fulkerson / Edmonds–Karp:
“The max‑flow algorithm maintains the invariant that the current flow always satisfies capacity constraints and flow conservation at every vertex other than the source and sink.”
- Greedy interval scheduling:
“The greedy interval scheduling algorithm that repeatedly selects the interval with earliest finishing time maintains the invariant that the set of intervals chosen so far is compatible and can be extended to some optimal solution.”
Divide‑and‑conquer and others
- Karatsuba multiplication (or any divide‑and‑conquer multiplication):
“The divide‑and‑conquer multiplication algorithm maintains the invariant that at each recursion level, the subproblems are defined so that the sum of their appropriately scaled results equals the product of the original inputs.”
- Strassen’s matrix multiplication:
“Strassen’s algorithm maintains the invariant that the linear combinations forming its seven products are chosen so that recombining those products reproduces the exact product of the original matrices.”
- Euclid’s algorithm for gcd:
“Euclid’s algorithm maintains the invariant that at every step, the greatest common divisor of the pair of integers being processed is equal to the greatest common divisor of the original pair of integers.”
- Exponentiation by squaring:
“The fast exponentiation algorithm maintains the invariant that at each step, the product of the accumulated result and the current base raised to the remaining exponent equals the original base raised to the original exponent.”
Invariants aren’t just useful in understanding algorithms but also in a variety of problem-solving contexts:
- Proving impossibility / reachability
Many “operations on an array/graph” problems ask whether you can reach a target configuration by repeatedly applying allowed moves.
- Typical idea:
- Find a quantity (sum, parity, number of components, etc.) that is unchanged (invariant) or changes in a restricted way (monovariant).
- Show that the target configuration would require that quantity to have a different value.
- Example pattern:
- Operation: “Pick two elements \(a_i, a_j\) and add 1 to both.”
- Invariant: The difference \(a_i - a_j\) for any fixed pair is unchanged, or the array sum parity stays the same.
- Conclusion: If target has different parity or inconsistent differences, it is unreachable.
This is extremely common in Codeforces/AtCoder “game” or “operations” tasks, where the intended solution is a short invariant argument.
- Typical idea:
- Guiding constructive algorithms
In harder constructive tasks (“build a sequence/graph satisfying conditions”), invariants help you design an algorithm that never violates constraints.
- You maintain some invariant like:
- “The partial structure is always valid.”
- “The partial graph is always acyclic.”
- “The current sequence still can be extended to a full solution.”
- Example pattern:
- Building a permutation with constraints: maintain that used values form a set that still leaves enough remaining values to satisfy future requirements.
- Greedy graph building (e.g., building a tree): maintain that the current edges form a forest (no cycles), so you never accidentally create an invalid structure.
- You maintain some invariant like:
- Loop invariants in standard algorithms you implement a lot
Knowing the invariants of common algorithms keeps you from introducing off‑by‑one or logic bugs.
- Prefix/suffix processing:
- Invariant: after processing index ii, the answer for prefix [0,i][0,i] is correct and all needed state (prefix sums, DP values, etc.) is up to date.
- Sliding window / two pointers:
- Invariant: the current window [l,r)[l,r) always satisfies the condition (e.g., sum ≤ K or contains at most K distinct elements).
- You move pointers while preserving this, and use the invariant to count contributions.
- Binary search on answer:
- Invariant: predicate is false for all values in (−∞,L](−∞,L] and true for all in [R,+∞)[R,+∞), or vice versa.
- Each mid update must preserve this, otherwise the search becomes incorrect.
These invariants are rarely written out in contests, but understanding them mentally drastically reduces WA from subtle boundary mistakes.
- Prefix/suffix processing:
- Invariants and monovariants in “game theory” problems
Problems often give a game with two players and moves like “remove stones,” “flip bits,” or “change edges.”
- Invariants:
- Something that never changes: e.g., parity of a pile sum, XOR of pile sizes, color balance.
- Monovariants:
- A quantity that always strictly increases/decreases, guaranteeing termination or forbidding cycles.
- Usage:
- Prove that the game ends in finite steps (monovariant).
- Prove that only certain positions are reachable (invariant).
- Determine winning/losing positions by preserved quantities (e.g., Nim XOR).
- Invariants:
Analysing Algorithms
When we write programs until this point, correctness is the main consideration. However, sometimes that is not enough: programs that are correct but that take forever to run are not good programs.
This is why we start thinking about efficiency:
- Time efficiency: a measure of amount of time for an algorithm to execute
- Space efficiency: a measure of the amount of memory needed for an algorithm to execute
These two are at a tradeoff: you can store some results to improve time efficiency but you are sacrificing space efficiency.
However, measuring efficiency is a big problem in Computer Science because the same algorithms can have different efficiency on different systems and with different implementations. Thus, as a benchmark for a metric for measuring efficiency, we define the requirements:
- Running time should vary between algorithms
- Running time should not vary between implementations
- Running time should not vary between computers
- Running time should not vary between languages
- Running time should be predictable for small inputs
Time efficiency
When it comes to dealing with Time Efficiency, we have 3 approaches to measure efficiency:
- Measure with timer
- Count the operations
- Abstract notion of order of growth
We will go through each of the approaches.
Measure with timer
This can be done with a wrapper coded similar to the following:
>>> def wrapper(fn, inp):
... print("Evaluating runtime for ", fn.__name__)
... dt_prev = 1
... for i in inp:
... start = time.time()
... res = fn(i)
... dt = time.time() - start
... print(f"Input {i} took time {dt} which is {dt/dt_prev} times more than the previous step.")
... dt_prev = dtHowever, this approach fails in the red-highlighted points:
- Running time should vary between algorithms
- Running time should not vary between implementations
- Running time should not vary between computers
- Running time should not vary between languages
- Running time is should be predictable for small inputs
Count the operations
Here, we assume that the following steps take a constant time. We make this assumption because these operations are carried out so fast that they virtually take the same, very small amount of time \(dt\):
- Mathematical Operations
- Comparisons
- Assignments
- Accessing objects in memory
sums = 0
for i in range(x):
sums += iHere, we can count operations as follows:
sums = 0- 1 operation- The following operations are repeated
xtimes:i = 1- 1 operationsums += 1- 2 operations [sum + 1andsum = _are two distinct operations]
Thus, overall, the numbers of operations would be: 1 + x * (1 + 2) or 1 + 3x.
However, this approach fails in the red highlighted points:
- Running time should vary between algorithms
- Running time should not vary between implementations
- Running time should not vary between computers
- Running time should not vary between languages
- Running time is should be predictable for small inputs
The biggest failure of this method is there is no real definition of which operations to count.
Thus, the assumption we can make is that all of the following of these (according to the RAM model) run for the same amount of time:
- Basic Arithmetic Operations
- Making Copies of Integers
- Basic Comparisons
- Returning Single Integers
- Loading Single Integer from memory
- Saving a single integer into memory
This allows us to focus on the “bigger picture” of how the algorithm perform overall.
The most precise understanding would be obtained using the “Word RAM” model but this is an advanced algorithm analysis technique that is not required for this level of analysis.
Order of growth
This is a generalised way to figure out run-time where the size of the problem gets arbitrarily large, for the worst case scenario of the problem.
We ignore any additive or multiplicative constants (coefficients, smaller terms disappear). We only keep one term so there won’t be any sums. By focusing on what grows the fastest, we obtain a very general picture of the run-time, expressed in asymptotic notation.
Note: If there are multiple variables that affect runtime, it is alright to have a theta notation containing a sum of two functions.
To get the asymptotic notation for run-time, follow the steps:
- Count the numbers of operations carried out as per the counting method
- Only count steps that are a function of the input parameter we are concerned about
- When loops (functions) are in series: do addition; when loops are in parallel: do multiplication
- Get rid of all the coefficients and smaller terms
It is important to recognize the different complexity classes that you might encounter:
O(1)
An algorithm with a time complexity of O(1) is one where the number of steps required to complete an operation stays the same, regardless of the size of the input.
Examples:
- Retrieving a value from a hash table
- Accessing a python list element by its index
- Checking if a number is even or odd:
number % 2 == 0is a single operation. - Pushing or popping an element from a stack (implemented with an array): Assuming no resizing is needed.
- Returning the first element of a linked list:
list.head.
O(\(logN\))
An algorithm with O(logN) complexity means the execution time grows logarithmically with the input size. This typically occurs when an algorithm repeatedly halves the input size in each step. This makes sense because doubling the input size takes 1 more step.
Examples:
- Binary Search: In a sorted array, you repeatedly divide the search interval in half.
- Finding an element in a balanced binary search tree: Each comparison eliminates half of the remaining nodes.
- Euclidean algorithm for finding the greatest common divisor (GCD): The numbers involved decrease exponentially.
O(\(N\))
An algorithm with O(N) complexity means the execution time grows linearly with the input size. This often involves iterating through each element of the input once.
Examples:
- Traversing a linked list or array: Visiting each element once.
- Finding the maximum or minimum element in an unsorted array: You need to check every element.
- Linear search in an unsorted array: In the worst case, you might have to check every element.
- Summing all elements in an array: You need to add each element.
O(\(NlogN\)) - Log linear time
This complexity is seen in algorithms that perform a logarithmic operation multiple times, once for each element in the input. The key to understanding O(NlogN) complexity lies in recognizing the combination of linear and logarithmic components.
Examples:
- Merge Sort: Divides the array in half, sorts each half, and then merges them.
- Heap Sort: Builds a heap (O(N)) and then repeatedly extracts the maximum element (O(logN) for N elements).
- Quick Sort (average case): Similar to Merge Sort in its divide-and-conquer approach.
- Fast Fourier Transform (FFT): A highly efficient algorithm for computing the Discrete Fourier Transform.
O(\(N^2\))
An algorithm with \(O(N^2)\) complexity means the execution time grows proportionally to the square of the input size. This often occurs when you have nested loops that iterate through the input.
Examples:
- Bubble Sort: Compares adjacent elements and swaps them, potentially multiple passes.
- Selection Sort: Finds the minimum element and places it at the beginning, repeating for the rest.
- Insertion Sort: Inserts each element into its correct position in the sorted part of the array.
- Calculating all pairs of elements in a set: Nested loops for
(i, j)whereiandjare indices.
O(\(N^3\))
An algorithm with O(N^3) complexity means the execution time grows proportionally to the cube of the input size. This typically involves three nested loops iterating through the input.
Examples:
- Standard matrix multiplication of two N x N matrices: If implemented with three nested loops where the outermost two loops iterate over rows and columns of the result, and the innermost loop performs the dot product.
- Floyd-Warshall Algorithm: Finds the shortest paths between all pairs of vertices in a weighted graph.
- Brute-force solution for 3-SUM problem: Finding three numbers in an array that sum to a target value using three nested loops.
O(\(2^N\)) - Exponential Time
This exponential growth occurs because the algorithm generates an increasing number of subproblems or recursive calls with each input element.
Examples:
- Calculating Fibonacci numbers recursively without memoization:
fib(n) = fib(n-1) + fib(n-2)leads to redundant calculations. - Solving the Traveling Salesperson Problem (TSP) using dynamic programming (Held-Karp algorithm): Explores subsets of cities.
- Solving the Subset Sum problem using brute force: Checking every possible subset of numbers.
- Brute-force solution for the Knapsack problem: Checking all possible combinations of items.
- Solving the satisfiability problem (SAT) for a Boolean formula using brute force: Checking all 2N possible truth assignments for N variables.
O(\(N!\)) - Factorial Time
Factorial time complexity, represented as O(N!), indicates an even more rapid growth in algorithm execution time than exponential time complexity. Factorial complexity is typically encountered in scenarios requiring the exploration of all possible permutations or combinations.
Examples:
- Solving the Traveling Salesperson Problem (TSP) using a brute-force approach: Trying every possible permutation of cities.
- Generating all permutations of a given set of N elements: For example, listing all possible orderings of cards in a deck.
- Brute-force solution to the shortest common supersequence problem: Checking all permutations of the input strings.
- Some exact solutions for the Hamiltonian Cycle problem: Which involves finding a cycle that visits every vertex exactly once.
- Trying all possible arrangements for seating guests at a dinner party: To find an optimal arrangement based on certain criteria.
The complexity of a piece of code comes down to analysing the statements that make it up. Here are some common examples:
- Sequences of statements: add up time complexities
- Conditionals with n blocks have complexity \(\max(time(block_1), time(block_2), \dots time(block_n))\)
- Multiply the number of times the loop runs with the complexity of the block.
Space Efficiency
The reason we need to measure other metrics is because we are not always pursuing the fastest algorithm but the best one in the context of the problem. Sometimes this might be memory usage, parallelism, information communication. Here we look at the problem of memory use, which we can deal with by looking at the space efficiency of the algorithm.
Space complexity indicates the maximum amount of memory that the algorithm requires to solve a problem.
This measurement is important because an algorithm's memory usage can significantly impact its efficiency and practicality, particularly in environments where memory resources are limited.
These are principles to remember with space efficiency:
- A variable takes O(1) space
- A list of length n takes O(n) space
- A k-dimensional list, with each list length k takes \(O(k^k)\) space
- Unless recursion is used, space complexity is easy to deal with: just count the number of variables and lists in play
- We measure the maximum (not total) amount of space that is used at any given time during execution.
This is as opposed to the total space that is used when the algorithm runs. This is to factor in the role of garbage collection.
Here is a key clarification that needs to be made: Auxiliary Space vs. Total Space Complexity
- Auxiliary space is the temporary workspace an algorithm uses during execution. For bubble sort, this is O(1) because it uses a fixed number of variables (e.g.,
i,j,swapped). - Total space complexity includes both the input storage and auxiliary space. If we account for the input array (which requires O(n) space), the total becomes O(n).
In algorithm analysis, "space complexity" commonly refers to auxiliary space, not total space. This convention arises because input space is unavoidable and identical across algorithms.
Amortized Analysis
Amortized analysis evaluates the average performance of each operation in a sequence, guaranteeing the worst-case total cost over a long series of operations, even if a few individual operations are expensive.
It provides a tighter, more realistic bound than standard worst-case analysis by spreading the costs of infrequent, heavy operations across multiple cheap operations.
The principle is as follows:
- Count the cost of an operation
- Put in some more time (surplus) along with the computed cost
The surplus can be then stored to compensate an expensive computation, averaging out the complexity to a lower bound in the long run. This is true as long as we can prove that the we never draw an amount (for an expensive computation) that lands the net count to below 0.
The key is that this technique only works when we are concerned with total runtime of a set of operations. If we only look at one operation, then in the worst case, the runtime cannot be further bounded.
There are various ways to do amortized analysis:
- Accounting Method: Assigns different charges to operations and smooths out the operations
- Aggregate Method: Find total cost of \(n\) operations and then divide this cost by \(n\)
- Potential Method: Models expenses for operations as energy with a potential function \(\Phi(h)\)
Limits of efficient computation
Understanding what problems can be efficiently computed is important because it allows us to understand what tasks can be solved with an efficient algorithm so that time isn’t wasted hunting for a polynomial-time solution that probably doesn’t exist.
Understanding what can be computed and what cannot forces us to choose he right goal (exact vs approximate), the right formulation (decision vs optimization), and the right technique (reduction, DP, branch-and-bound, etc.).
There are 4 main classes of problems when we study algorithms from this perspective. These include:
- Problems in P: Can be solved in polynomial time by a deterministic Turing machine
- Problems in NP: Problems have solutions that can be verified in polynomial time by a deterministic Turing machine
- NP-hard: A problem is NP-hard if it is at least as hard as every problem in NP, in the sense that any NP problem can be efficiently transformed (reduced) into it; NP-hard problems don’t have to be “quickly checkable” themselves.
- NP-complete: A problem is NP-complete if it is both in NP and NP-hard—informally, it’s among the “hardest” problems whose solutions are still efficiently verifiable.
So now we know how to figure out if a problem is NP complete. Now what?
Many problems of practical significance are NP-complete, yet they are too important to abandon merely because we don’t know how to find an optimal solution in polynomial time.
Even if a problem is NP-complete, there may be hope. We have at least three ways to get around NP-completeness.
- If the actual inputs are small, an algorithm with exponential running time may be perfectly satisfactory.
- We may be able to isolate important special cases that we can solve in polynomial
time.
We can arrive at these cases by thinking of the verification methods that make the problem NP-complete in the first place. This can lead to the required narrowing of cases.
- We might come up with approaches to find near-optimal solutions in polynomial time (either in the worst case or the expected case).
In practice, nearoptimality is often good enough. We call an algorithm that returns near-optimal solutions an approximation algorithm.
Reductions
At its core, a reduction is a formal way of saying:
“If I can solve problem A, then I can solve problem B by transforming B into A.”
Reductions let us:
- Transfer algorithms
If you reduce B → A, then any algorithm for A becomes an algorithm for B.
- Transfer lower bounds
If B is known to be hard, and B → A, then A is at least as hard as B.
- Explain algorithmic structure
Many algorithms are best understood as “hidden” versions of simpler problems.
Let A and B be computational problems.
Rigorously, a reduction from B to A is a function \(f\) such that:
- For every instance \(x\) of problem B, \(f(x)\) is an instance of problem A.
- Solving A on input \(f(x)\) gives enough information to solve B on input x.
- The transformation f must be efficient (usually polynomial time, or linear time in algorithm design).
A reduction is denoted by:
Here is a concrete example which explain Dijkstra as a family of reductions. You want to solve Single‑Source Shortest Path (SSSP) on a graph with non‑negative weights. You reduce it to the following abstract problem:
Given a set of keys that only decrease over time, repeatedly extract the smallest key.
This is exactly the monotone priority queue problem.
The reduction is:
- Each vertex v has a key = current best distance estimate.
- Relaxing edges corresponds to decreasing keys.
- Choosing the next vertex to finalize corresponds to extract‑min.
Thus:
\(\mathrm{SSSP}\; \leq \; \mathrm{Monotone\ PQ}\)
And the complexity of Dijkstra is:
\(T_{\mathrm{Dijkstra}}=V\cdot T_{\mathrm{extract-min}}+E\cdot T_{\mathrm{decrease-key}}\)
This is the example of the reduction that has taken place.