The Dijkstra algorithm is the famous solution to the single source shortest path (SSSP) problem where given a directed graph \(G\) and a starting vertex \(s\), our goal is to find the length of the shortest path from \(s\) to \(v\) for all \(s\) and \(v\).

In the case where all edges have lengths that are positive integers, the problem can be reduced into breadth-first search where all edges of length \(n\) are replaced with \(n\) edges of length 1. However, this is an inefficient solution when edges become very long. However, we want an efficient (preferably linear time in \(|V|\) and \(|E|\) solution) to the single source shortest path problem.

Working

The key idea behind Dijkstra, is that at some stage, we can say for certain that some distance is indeed the shortest distance to a vertex.

This is because the shortest edge going out from vertex A is also the shortest path from vertex A to the vertex the shortest path leads to.

The key invariant is that at any step, all the nodes explored have an SSSP distance smaller than the distance of the current node. This means they don't need to be revisited, allowing us to do linear work.

To keep track of nearest vertices, we keep track of a Dijkstra Score. It is the shortest distance (or cost/time) calculated by Dijkstra's algorithm from the source node to all other reachable nodes in a weighted graph with non-negative edge weights.

The next edge we add is the edge with the smallest Dijkstra Score.

Then, we can iteratively apply is procedure for finding the true shortest path for every vertex to solve the SSSP problem.

Implementation

Below is the standard progression of Dijkstra implementations, from worst to best, culminating in the linear‑time version.

Only the first implementation uses an adjacency matrix representation.

Implementation	Data Structure	Edge Weight Type	Time Complexity
Matrix + linear scan	Adjacency Matrix	Any non‑negative	\(O(V^2)\)
Brute force Dijkstra	Adjacency List	Any non‑negative	\(O(VE)\)
List + Unsorted Priority Queue	Array	Any non‑negative	\(O(V^2+E)\)
Binary heap	Binary heap	Any non‑negative	\(O((V+E)\log V)\)
Fibonacci heap	Fib heap	Any non‑negative	\(O(E + V\log V)\)
Dial’s algorithm	Buckets	Small integers	\(O(V + E + C)\)
0–1 BFS	Deque	0/1 weights	\(O(V + E)\)
Radix heap	Radix buckets	Integers	\(O(E + V\log C)\)
Thorup’s algorithm	Hierarchical buckets	Positive integers	\(O(V + E)\)

Below we go through how each of these approaches work.

Binary Heap

Here, we are presented with two choice for what to store in the heap:

Edges (so that we can extract the shortest edge)
Vertices (so that we can identify the vertex with the smallest Dijkstra Score)

It turns out that the latter is much easier to implement because the set of vertices only becomes smaller with iterations of the algorithm. On the other hand, the edges change completely from step to step based on which edge we are starting from.

The invariant in the heap is that it is going to contain all the unvisited vertices and their updated Dijkstra scores.

With this decided, here is how we implement the algorithm:

Initialize a set \(X\) of the vertices for which the shortest distance is confirmed (initially empty), and an empty heap \(H\) (initially with all vertices with keys as the Dijkstra score)
Until \(H\) is empty: [the loop runs \(|V|\) times]
1. Remove the vertex with the smallest Dijkstra score from \(H\) and add it to \(X\) [Time complexity \(O(\log{|V|})\)]
2. Assign the shortest distance to the vertex moved as the Dijkstra score of the vertex [Constant time operation]
3. Update the heap so that the Dijkstra scores are updated based on the new set \(X\) [The time complexity changes each loop so instead we analyse the total number of times heap updates are done. This happens \(2|E|\) times because each outgoing edge from each vertex is only processed once.]

The intuition between Dijkstra is that it repeatedly picks the closest unvisited node.

Overall time complexity:

As the loop runs, the time complexity of 2a will be \(O(|V|\log{|V|})\).

The total time complexity of 2c will be \(O(|E|\log{|V|})\).

Thus, overall, we have \(O((V+E)\log V)\).

This can be expressed in pseudo-code as follows:

def dijkstra(G, s):
	dist_est = [an array of all ∞]; set dist_est[s] = 0
	heap = heapify(copy(dist_est))
	while heap is not empty:
		curr_node = heap.extract_min()
		for all neighbours (n, w) in G.get_neighbours(curr_node)
			if dist_est[curr_node] + w < dist_est[n]:
				heap.decrease_key(curr_node, dist_est[curr_node] + w)

This however requires the decrease_key operation which is not supported by most APIs or is difficult to implement. Thus, instead of using this operation, we do lazy deletion by maintaining a visited array to keep track of which nodes we don’t need to process again.

This however means that the the the heap can store up to \(|V|\) elements. This leads to a different complexity analysis leading to \(|V| \log (|V|)\). This however is the same complexity as the A

Another implementation detail for the algorithm is that for infinity, the usual convention is to use the maximum integer that can be stored on a 32-bit machine. This is \(2^{31}-1\), approximately given by \(10^{18}\).

Limitations

Note that Dijkstra is general does not work with negative edge weights. Thus, none of these implementations can handle non-negative weights.

This is because Dijkstra assumes that once a vertex is extracted as the current minimum‑distance vertex, its shortest path is final. Negative edges break this assumption.

To solve the single source shortest path problem with negative weights in play, turn to:

Floyd-Warshall Algorithm - Complexity \(O(V³)\)
Bellman–Ford Algorithm - Complexity \(O(VE)\)
SPFA (Shortest Path Faster Algorithm) - A practical improvement over Bellman–Ford: worst case still \(O(VE)\), but often much faster in practice.
Johnson’s Algorithm

0-1 BFS

If edge weights are restricted to 0 or 1, the problem structure simplifies dramatically.

The intuition to have is that at any vertex, we can at most increase the distance by 1 unit. Otherwise, the distance does not change. Thus, when we encounter a 0 weight edge, we know that we have to process the associated node first.

This is similar in spirit to Dijkstra where we explore the closest nodes first.

To do this, all we need to do is keep track of the 0 edges to be relaxed and the 1 edges. To do this we can do either of the following:

Use a deque to maintain both the 0 and 1 edges
Use 2 separate queues

We will always dequeue the 0 edges and relax them first so that we fully explore the shortest path first. This allows us to do linear time SSSP like BFS even if we are using Dijkstra’s principle. The main time saving comes from the use of a heap.

Example Problems

Here we look at generalized versions of the SSSP problems for which we can adapt Dijkstra:

Teleporting SSSP: If we are given the ability to teleport between a set of nodes, we are tasked to figure out the next set of shortest distances.
The basic solution to this would be to add 0 weight edges to the graph. However, given T teleportation nodes, this add \(T^2\) edges. This slows down Dijkstra.
A better solution would be to add a cycle of teleportation nodes where each teleportation nodes can only allow teleportation to the next one in the list. This does not change the problem because the edge weights are all 0. Thus, here we only add \(T\) additional edges.
A similar way to capture this idea is to add a central teleportation node where all the teleportation nodes lead to.
SSSP with Expressways: We can take k edges out of any path and make them 0 cost to traverse.
The naive approach would be to inspect the path taken to each node and removing the k longest edges traversed. For this, we keep a parent array where we store the parent of each node. At the end of the traversal, we can traverse back to the source using parent pointers.
This pre-processing does the following:
1. Select the \(k^{th}\) largest nodes in expected \(O(n)\) time (because there are at most \(n - 1\) edges in the path)
2. Do this for all nodes
This leads to the overall time complexity of \(O(n^2)\). This is the majority of the cost of the algorithm because Dijkstra is linear.
A better solution is to take the original graph and make \(k+1\) copies. We call these copies layers and we are allowed to traverse between layers with 0 cost. We assume we start at the top layer and work our way downwards to the bottom layer.
Storing and processing the \(k\) nodes blows up the time complexity by a factor of \(k\) so the runtime is \(O(k(m+n \log n))\). However, in the worst case this too can becomes \(O(n^2)\) if \(k=\Theta (n)\). Additionally, this approach is memory inefficienct because all \(k\) layers need to be stored.
💡
This technique is a common technique that can be applied to graph problems.
- Discounted travel tickets: Model situations where you can use up to free rides or discounts.
- Limited fuel boosts in navigation: Represent paths where you can consume up to boosts to skip fuel cost.
- Packet routing with retries: Capture networks where up to retransmissions are free.
- Scheduling with exceptions: Model tasks where up to delays can be ignored.
- Budget-constrained logistics: Capture routes where up to tolls can be waived.
Each allowance (discount, boost, retry, etc.) corresponds to moving down one layer, so the layered graph encodes the state “how many special moves have been used so far.”
SSSP from any vertex in one class to one vertex in another: We can introduce a super source and a super sink into the graph. The super source is connected to all vertices in the first class with edges of weight zero, while the super sink is connected to all vertices in the second class, also with edges of weight zero.
This construction effectively transforms the problem into finding the shortest path from the super source to the super sink.
Grid Disconnection: We have a 2D grid of nodes which are weighted. We want to remove the smallest weight subset of nodes which we can remove to disconnect the top left node with the top right node.
We can reformulate the problem as a shortest path problem by considering the shortest path from a top-right L shaped node and a bottom-left L shaped node. Then, we can run Dijkstra from these two nodes to find the shortest path between these nodes.
Running Dijkstra Twice: This is a simple, robust strategy to answer many “shortest path with one modification” queries: (note that this strategy can be used with other SSSP algorithms too)
- compute distances to every vertex from the source
- compute distances to every vertex from the target
  This is done by performing Dijkstra on the transpose graph (i.e. with the edge directions reversed).
- combine those distances to evaluate using a single modified edge or forced intermediate
  Combining them tests every possible place where a single local change could be applied without recomputing full shortest paths for each change.
This costs two Dijkstra runs plus an O(m) scan of edges and is practical for contest and real‑world graphs. Here are some examples:
1. “One discounted edge” — find shortest path from s to t if you may choose exactly one edge to pay half (or zero) of its weight. Evaluate distS[u] + floor(w/2) + distT[v] for every edge (u,v).
2. “Use one special portal/teleport edge” — you may add one extra edge (a portal) between two nodes with fixed cost; test all candidate portal placements by combining distS and distT.
3. “Shortest path that must pass through a hub set H” — compute distS and distT and for each hub h evaluate distS[h] + distT[h].
4. “Edge removal / roadblock” — to test effect of removing a single edge, you can compare original shortest path with best alternative computed by scanning edges and using precomputed distances. These are standard contest patterns and appear frequently in problem sets and tutorials.
Now, it turns out that there is another way in which Dijkstra can be run twice to achieve the same result. Here is how that works:
- compute distances to every vertex from the source (first Dijkstra run)
- connect a super node to all the vertices of interest with the weight of the edge being the distance of the vertex from the source
- run Dijkstra from the super node
Number of Edges on All Shortest Paths: To determine the number of edges that lie on at least one shortest path between a start node \(s\) and an end node \(t\) in a positively weighted graph, we first run Dijkstra’s algorithm from \(s\). Next, we run Dijkstra’s algorithm again, but this time from \(t\), to compute the shortest distance from \(t\) to all other nodes.
With these two sets of distances, we can evaluate each edge \((u,v)\) with weight \(w\). If the condition:
\[dist_s[u]+w+dist_t[v]=dist_s[t]\]
holds, then edge \((u,v)\) lies on some shortest path from \(s\) to \(t\).
Pizza Pirate Encounters: We need to deliver pizzas but with every edge we take, we are left with \(0<f_e<1\) fraction of the pizza we entered with. To maximise the amount of Pizza we have, we can use two approaches that work with Dijkstra:
1. We can modify the algorithm so that the cost we maintain is the amount of Pizza that is left at each node. Then, in the PQ, we use a max heap to retrieve the the maximum Pizza available. This works because we are operating in fractions.
2. We transform the weights to \(- \log (f_e)\) because we want to do minimisation and we want to minimise the product of the weights.
SSSP with one negative weighted edge: We have a weighted, directed graph but no negative cycles. It turns out we can still use Dijkstra (if we take care of the one negative weighted edge ourselves).
If we know the negative edge \(e\) is from \(x\) to \(y\) and we are doing SSSP from \(u\) to \(v\), we run SSP using Dijkstra from:
1. Node \(u\)
2. Node \(x\) or Node \(y\)
Now, we have two lists of distances. Then we manually check which paths benefit from taking edge e. The true shortest path from node v to any node t, is the min of: dist_from_u[v] and dist_from_u[x] - negative_edge_length + dist_from_y[v].

Single Source Longest Path

The Pizza Pirate Encounters problem does not mean we can use Dijkstra for the Single Source Longest Path problem. This is because we lose the greedy guarantee. A path that looks longest now might later be extended into an even longer one, so you can’t safely finalize it.

In fact, the longest path problem is NP-hard in general graphs (it’s equivalent to the Hamiltonian path problem) but can be solved in special graphs (DAGs).

However, we can solve the problem when the following are satisfied:

repeating nodes is allowed
there are no positive cycles

For example, if we want to solve the problem, here is how we can modify Bellman-Ford’s relax subroutine:

if D[u] + w > D[v] then
    D[v] ← D[u] + w 
end

What Dijkstra cannot solve

There are several problems that look like they should be solvable by Dijkstra’s greedy shortest-path algorithm but actually break its assumptions. Here are a few:

Shortest Path with Resource Constraints

Minimum cost path with at most weight (W) where weight and cost are independently quantified.

Greedy expansion ignores global constraints; a locally optimal path may exceed the resource budget.

Shortest Path with Multiple Objectives

Example: Find path minimizing both travel time and toll cost.

Greedy choice works for a single scalar metric, not for multi-dimensional trade-offs.