Dynamic Programming

Divide & Conquer works when a problem can be solved by solving smaller problems of the same type.

In Dynamic Programming (DP), we typically have to define and solve a more general problem.

Shortest Path

Problem: Given a graph $G = (V, E)$ with edge lengths $l(e)$, find the shortest path from vertex $s$ to $t$.

• If all edges had unit length, a breadth-first search would solve the problem in $O(|V|+|E|)$ time.
• The previous solution can be extended to the weighted version by breaking each edge into unit-length edges, though the complexity of such a solution would be exponential in the size of the input: an edge of length $L$ can be represented in $O(\log L)$ bits, yet it contributes $L$ unit edges.
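The unit-length case mentioned above can be sketched as a standard breadth-first search; the adjacency-dict format here is an assumption for illustration:

```python
from collections import deque

def bfs_distances(adj, s):
    """Unweighted shortest-path distances from s via breadth-first search.

    adj: dict mapping each vertex to a list of its neighbors (assumed format).
    Runs in O(|V| + |E|).
    """
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first visit = shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist
```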

Observation: A shortest $s-t$ path $P$ is also a shortest $s-u$ path, for all vertices $u$ on $P$.

More General Problem: Find shortest $s-u$ paths, $\forall u$.

Algorithm (Dijkstra):

Idea: Maintain a tentative distance $T(u)$ to each vertex.

• $T(s) = 0$
• $T(u) = \infty, \forall u \ne s$
• $S = \{ \}$
• while $S \ne V$:
• Let $j$ be the vertex not in $S$ with smallest $T(j)$
• $S \leftarrow S \cup \{ j \}$
• Update $T(u) \leftarrow \min \{ T(u), T(j) + l(j,u) \}$ for every neighbor $u$ of $j$
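The pseudocode above can be translated almost line by line into Python; the adjacency-dict graph format is an assumption for illustration:

```python
import math

def dijkstra(adj, s):
    """Direct translation of the pseudocode above.

    adj: dict mapping each vertex to a list of (neighbor, edge length)
    pairs (assumed format). Returns T, where T[u] is the shortest s-u
    distance. The linear scan for the minimum makes this O(|V|^2 + |E|).
    """
    T = {u: math.inf for u in adj}
    T[s] = 0
    S = set()
    while len(S) < len(adj):
        # let j be the vertex not in S with smallest T(j)
        j = min((v for v in adj if v not in S), key=lambda v: T[v])
        S.add(j)
        # update tentative distances of j's neighbors
        for u, l in adj[j]:
            T[u] = min(T[u], T[j] + l)
    return T
```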

Theorem: Dijkstra’s algorithm correctly finds shortest paths from $s$ to all vertices.

Either of the following lemmas immediately implies the theorem.

Lemma: At every step, for every vertex $u \in S$, $T(u)$ is the length of a shortest $s-u$ path, and the members of $S$ are the $|S|$ vertices nearest to $s$.

Proof: We prove this by induction on $|S|$. Let $S_k$ denote the set after $k$ iterations. The basis trivially holds for $|S_1| = 1$. Induction step ($k$ to $k+1$): Let $u$ be the $(k+1)$-st nearest vertex to $s$. The vertex preceding $u$ on a shortest $s-u$ path (call it $v$) is nearer to the source, so by the induction hypothesis $v \in S_k$ and $T(v)$ equals the distance of $v$ from the source. When $v$ was added to $S$ at an earlier step, it updated its neighbor $u$, so $T(u) \leq T(v) + l(v,u)$. On the other hand, at any point of the algorithm and for every vertex $x$, $T(x)$ is at least the distance of $x$ from the source, because the sequence of updates that produced $T(x)$ corresponds to an actual path of that length from the source to $x$. Hence we must have $T(u) = T(v) + l(v,u)$, equal to the distance of $u$ from the source. Since every other vertex $x \in V \setminus S_k \setminus \{u\}$ satisfies $T(x) \geq d(s,x) \geq d(s,u) \geq T(u)$, $u$ is the vertex added to $S$ at this step; $T(u)$ is its distance from the source, and it is the $(k+1)$-st nearest vertex to $s$, as required for $S_{k+1}$.

Alternatively we can prove the theorem using the following lemma.

Lemma: At every step, for each vertex $u \in S$, $T(u)$ is the shortest-path distance from $s$ to $u$, and there is an $s-u$ path of length $T(u)$ that lies wholly inside $S$.

Proof: Similarly, we use a simple induction on $|S|$. The basis holds for $S_1 = \{s\}$. Induction step: assuming the lemma for $k$, let $u$ be the nearest vertex to the source in $V \setminus S_k$. A shortest path from $s$ to $u$ starts inside $S_k$ and ends outside it. By the choice of $u$, all vertices preceding $u$ on this path must lie in $S_k$; otherwise the first one outside $S_k$ would be nearer to $s$ than $u$, contradicting the choice of $u$. Call the vertex preceding $u$ on that path $v$. When $v$ was added to $S$ at an earlier step, it updated its neighbor $u$, so $T(u) \leq T(v) + l(v,u)$. On the other hand, at any point of the algorithm and for every vertex $x$, $T(x)$ is at least the distance of $x$ from the source, because the sequence of updates that produced $T(x)$ corresponds to an actual path of that length from the source to $x$. Hence we must have $T(u) = T(v) + l(v,u)$, equal to the distance of $u$ from the source. Since every other vertex $x \in V \setminus S_k \setminus \{u\}$ satisfies $T(x) \geq d(s,x) \geq d(s,u) \geq T(u)$, $u$ is the vertex added to $S$ at this step; $T(u)$ is its distance from the source, and there exists a shortest path to $u$ using only vertices of $S_{k+1}$, completing the induction.

Note: Using a data structure such as a binary heap, the runtime of the above procedure becomes $O( (|V|+|E|) \log |V| )$.

Note: We found the length of the shortest path to each vertex. To retrieve a shortest path itself, we can either iteratively check all neighbors of the destination to find a predecessor on a shortest path, or store the predecessor of each vertex whenever its $T()$ is updated.
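The second option, recording predecessors during updates, can be combined with the heap-based variant mentioned above; the adjacency-dict format is again an assumption:

```python
import heapq
import math

def dijkstra_with_paths(adj, s):
    """Heap-based Dijkstra that also records, for each vertex, the
    previous vertex on a shortest path from s (stored while updating).

    adj: dict mapping u to a list of (v, l(u, v)) pairs (assumed format).
    """
    dist = {u: math.inf for u in adj}
    prev = {u: None for u in adj}
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; u was already finalized
        for v, l in adj[u]:
            if d + l < dist[v]:
                dist[v] = d + l
                prev[v] = u          # remember the predecessor
                heapq.heappush(heap, (dist[v], v))
    return dist, prev

def path_to(prev, t):
    """Walk the prev pointers back from t to recover a shortest path."""
    path = []
    while t is not None:
        path.append(t)
        t = prev[t]
    return path[::-1]
```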

Longest Increasing Subsequence

Problem: Given a sequence of $n$ numbers $a_1, \ldots, a_n$, find (the length of) its longest increasing subsequence (LIS).

Example: LIS for $1, 4, -1, 0, 6, 2, 4, 5$ is $-1, 0, 2, 4, 5$ that is of length $5$.

Note that the D&C technique does not work for this problem.

Let’s solve a more general problem: for every index $i$, find (the length of) the LIS ending at element $a_i$; call it $s(i)$.

Recursion: $s(i) = \max\left( \{1\} \cup \{ 1 + s(j) : j < i,\ a_j < a_i \} \right)$.

Using this recursion we solve the LIS problem: computing all $s(i)$ takes $O(n^2)$ time, and the answer is $\max_i s(i)$.
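The recursion above can be computed iteratively, since $s(i)$ depends only on values with smaller indices:

```python
def lis_length(a):
    """O(n^2) DP: s[i] is the length of the longest increasing
    subsequence ending at a[i], per the recursion above."""
    n = len(a)
    s = [1] * n                     # a[i] alone is always a subsequence
    for i in range(n):
        for j in range(i):
            if a[j] < a[i]:         # a[i] can extend an LIS ending at a[j]
                s[i] = max(s[i], 1 + s[j])
    return max(s) if s else 0
```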

Note: This problem also has an $O(n \log n)$ solution, which exploits the fact that, among increasing subsequences of a fixed length, the smallest possible tail value is monotone in the length.
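One way to realize the $O(n \log n)$ idea is to keep an array of those smallest tails; since it stays sorted, each element can be placed by binary search. A sketch:

```python
from bisect import bisect_left

def lis_length_fast(a):
    """O(n log n) LIS. tails[k] holds the smallest possible tail of an
    increasing subsequence of length k + 1; this array is sorted, so
    binary search finds where each new element belongs."""
    tails = []
    for x in a:
        k = bisect_left(tails, x)   # first tail >= x
        if k == len(tails):
            tails.append(x)         # x extends the longest subsequence
        else:
            tails[k] = x            # x is a smaller tail for length k + 1
    return len(tails)
```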

Matrix Chain Multiplication

Problem: Given $k$ matrices $M_1, \dots, M_k$, we want to compute $M_1 \times \dots \times M_k$ most efficiently by choosing the order in which we perform the multiplications.

For simplicity say for each multiplication we use the naive method, taking time $O(m n p)$ to multiply an $m \times n$ and an $n \times p$ matrix, and we only want to find the minimum cost of computing the final matrix.

Let the $i$‘th matrix be of size $m_i \times m_{i+1}$.

Example: If the sizes of the matrices are $(2 \times 6), (6 \times 10), (10 \times 6)$, multiplying the left pair first results in a total cost of $2 \cdot 6 \cdot 10 + 2 \cdot 10 \cdot 6 = 240$, while the other scenario (right pair first) costs $6 \cdot 10 \cdot 6 + 2 \cdot 6 \cdot 6 = 432$.

More general problem: For every $i \leq j$, what is the minimum cost of multiplying the chain from the $i$‘th matrix to the $j$‘th one, i.e., $M_i \times \dots \times M_j$? Call this value $C[i,j]$.

Observation:

• $C[i, i] = 0$ and $C[i, i+1] = m_i m_{i+1} m_{i+2}$
• $C[i, j] = \min_{i \le k < j} \{ C[i, k] + C[k+1, j] + m_i m_{k+1} m_{j+1} \}$

Analysis: Storing $C[\cdot,\cdot]$ in a two-dimensional array, we need to compute the entries on and above the diagonal: $O(k^2)$ values, each taking $O(k)$ time, summing up to $O(k^3)$ for our total computation.

Note: In DP we usually use smaller instances of the general problem to solve larger ones. While we compute the answer for an instance using others, it is important that the referenced instances have been previously solved and are known. In the last example, an acceptable order for computing the values $C[i,j]$ is in increasing order of $j-i$, i.e., along the diagonals of the $2$-d array $C$, starting at the main diagonal and moving up.
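The diagonal-by-diagonal order can be sketched as follows, with the sizes passed as the list $m_0, \dots, m_k$ so that the $i$‘th matrix (0-indexed here) is $m_i \times m_{i+1}$:

```python
def matrix_chain_cost(m):
    """Minimum multiplication cost for a chain of k = len(m) - 1 matrices,
    where matrix i (0-indexed) has size m[i] x m[i + 1]. Fills C in
    increasing order of j - i, i.e., diagonal by diagonal, so every
    referenced subproblem is already solved."""
    k = len(m) - 1                     # number of matrices
    C = [[0] * k for _ in range(k)]    # C[i][i] = 0 on the main diagonal
    for d in range(1, k):              # d = j - i
        for i in range(k - d):
            j = i + d
            # try every split point: (M_i..M_t) x (M_{t+1}..M_j)
            C[i][j] = min(C[i][t] + C[t + 1][j] + m[i] * m[t + 1] * m[j + 1]
                          for t in range(i, j))
    return C[0][k - 1]
```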