Dynamic Programming: Tables

February 2, 2021

Assignment Comments, Sample Solutions

Pseudocode is not code

// Let price[1..N] be a constant array of prices.
// So price[i] is the cost of a churro of length i.

CutChurro(n):
    if n = 0
        return 0
    maxTotPrice <- -1
    for i <- 1 to n
        totPriceTry <- price[i] + CutChurro(n-i)
        if totPriceTry > maxTotPrice
            maxTotPrice <- totPriceTry 
    return maxTotPrice

The goal is to communicate your algorithm in a way that makes its correctness transparent.
It’s OK to have things a little less “buttoned-up” if it makes it clearer what the algorithm is doing.

Step 1: understand the recursive structure

// Let price[1..N] be a constant array of prices.
// So price[i] is the cost of a churro of length i.

CutChurro(n):
    if n = 0
        return 0
    maxTotPrice <- -1
    for i <- 1 to n
        totPriceTry <- price[i] + CutChurro(n-i)
        if totPriceTry > maxTotPrice
            maxTotPrice <- totPriceTry 
    return maxTotPrice

CutChurro(n) depends on the Recursion Fairy taking care of CutChurro(n-1), CutChurro(n-2), …, CutChurro(0).
Data structure: One-dimensional array CC[0..n]
Evaluation order: CC[0] is the first thing to fill in, then CC[1], CC[2], … etc, in that order.

Fast Churro Cutting

FastCutChurro(n):
    CC[0] <- 0
    for i <- 1 to n
        maxTotPrice <- 0
        for j <- 1 to i
            totPriceTry <- price[j] + CC[i-j]
            if maxTotPrice < totPriceTry
                maxTotPrice <- totPriceTry
        CC[i] <- maxTotPrice
    return CC[n]

The recurrence in this algorithm is transparently the same as the recursive call in the recursive algorithm.

Fast Churro Cutting: Space and Time

FastCutChurro(n):
    CC[0] <- 0
    for i <- 1 to n
        maxTotPrice <- 0
        for j <- 1 to i
            totPriceTry <- price[j] + CC[i-j]
            if maxTotPrice < totPriceTry
                maxTotPrice <- totPriceTry
        CC[i] <- maxTotPrice
    return CC[n]

Old (recursive) version: \(O(2^n)\) (exponential time)
Dynamic programming (iterative) time: \(O(n^2)\) (nested for-loops)
- Space: \(O(n)\), because of the one-dimensional array
Since \(n^2 \in o(2^n)\), this time is subexponential.
- Exponential functions dominate polynomials.

Fast Churro Cutting: Exponentially better time

FastCutChurro(n):
    CC[0] <- 0
    for i <- 1 to n
        maxTotPrice <- 0
        for j <- 1 to i
            totPriceTry <- price[j] + CC[i-j]
            if maxTotPrice < totPriceTry
                maxTotPrice <- totPriceTry
        CC[i] <- maxTotPrice
    return CC[n]

Actual timings show that going from \(O(2^n)\) to \(O(n^2)\) is very noticable, even for small examples:

> system.time(CutChurro(20))
   user  system elapsed 
  0.736   0.002   0.738 
> system.time(FastCutChurro(20))
   user  system elapsed 
      0       0       0

Coding Optimizations

Note: Better code is not necessarily better pseudocode.
These examples are cleaner, but the connection between the recursive algorithm is less transparent.

FastCutChurro(n):
    CC[0] = 0
    for (i in 1:n)
        CC[i] = -1
        for (j in 1:i)
            p = price[j] + CC[i-j]
            if p > CC[i]: CC[i] = p
    return CC[n]
    

FastCutChurro(n):
    Initialize CC[1..n] to price[1..n]
    for i in 2..n:
        for j in 1..floor(i/2):
            if CC[j] + CC[i-j] > CC[i]:
                CC[i] <- CC[j] + CC[i-j]
    return CC[n]

Dynamic Programming

Dynamic Programming Problems

Problem has several options, and recursive substructure.
- String splitting: find a word, then split what’s left.
- Churro cutting: cut off a piece, then price what’s left.
Many of the subproblems are the same: overlapping subproblems.
- The same substrings recur.
- Several churro pieces have same length.

Dynamic programming keeps track of the subproblem solutions, and iteratively builds an array/table of solutions.

Dynamic Programming Steps

Specify the problem as a recursive function, and solve it recursively (with an algorithm or formula).
Build the solutions from the bottom up:
- What are the subproblems?
- What data structure do I store the solutions in? (Usually an array.)
- Which subproblem does each subproblem depend on?
  - Determine an evaluation order.
- Space and time?
- Write the algorithm. Usually it will just be a couple nested loops, with the recursive calls replaced by array look-ups.

Edit Distance

Minimum edit distance between two strings

How many operations does it take to transform one string (e.g., DNA sequence) to another? The operations are:

Mutation: Change one letter to another value.
Deletion: Delete one letter.
Insertion: Insert a letter.

SATURDAY -mutate-> SUTURDAY -mutate-> SUNURDAY -delete-> SUNRDAY -delete-> SUNDAY
SATURDAY -delete-> STURDAY -delete-> SURDAY -mutate-> SUNDAY
KITTEN -mutate-> SITTEN -mutate-> SITTIN -insert-> SITTING

Alignment diagrams: |-bars represent 1 unit of cost.

SATURDAY    SATURDAY    KITTEN     ALGOR I THM
 ||||        || |       |   | |      || | | ||
SUN  DAY    S  UNDAY    SITTING    AL TRUISTIC

Recursive Substructure

SATURDAY    SATURDAY    KITTEN     ALGOR I THM
 ||||        || |       |   | |      || | | ||
SUN  DAY    S  UNDAY    SITTING    AL TRUISTIC

The rightmost element in an alignment diagram can be:

A match (cost 0) \(\substack{x \\ | \\ x}\)
A mutation (cost 1) \(\substack{x \\ | \\ y}\)
An insertion (cost 1) \(\substack{- \\ | \\ x}\)
A deletion (cost 1) \(\substack{x \\ | \\ -}\)

Try all four, then have the Recursion Fairy optimize the rest.

Table Groups

Table #1	Table #2	Table #3	Table #4	Table #5	Table #6	Table #7
Graham	Kristen	John	James	Bri	Trevor	Blake
Isaac	Josiah	Drake	Andrew	Jack	Talia	Logan
Levi	Kevin	Claire	Jordan	Grace	Ethan	Nathan

Use the Recursion Fairy

Suppose A[1..m] and B[1..n] are two strings. Suppose also that you have a function EditDistance(i,j) that returns the minimum edit distance from A[1..i] to B[1..j], as long as \(i<m\) or \(j<n\).

Suppose A[m] = B[n] (the last characters of the strings match). If the rightmost element of the alignment diagram is a match \(\substack{x \\ | \\ x}\), what is the minimum edit distance from A[1..m] to B[1..n]? (Use the EditDistance function.)
Suppose A[m] != B[n] (the last characters of the strings don’t match). If the rightmost element of the alignment diagram is a mutation \(\substack{x \\ | \\ y}\), what is the minimum edit distance?

Insertions and Deletions

If the rightmost element of the alignment diagram is an insertion \(\substack{- \\ | \\ x}\), what is the minimum edit distance?
If the rightmost element of the alignment diagram is a deletion \(\substack{x \\ | \\ -}\), what is the minimum edit distance?

Using Dynamic Programming

Data Structure and recurrence

Let ED[1..m, 1..n] be an array in which we will store the minimum edit distance from A[1..i] to B[1..j].

ED[i,j] will be the minimum of the following:

ED[i-1, j-1] (if A[i] = B[j])
- match
ED[i-1, j-1] + 1 (if A[i] != B[j])
- mutation
ED[i, j-1] + 1
- insertion
ED[i-1, j] + 1
- deletion

Evaluation Order

ED[i,j] is the minimum of the following:

ED[i-1, j-1] (if A[i] = B[j])
ED[i-1, j-1] + 1 (if A[i] != B[j])
ED[i, j-1] + 1
ED[i-1, j] + 1

So the evaluation order must obey the arrows:

\[ \begin{array}{cccccc} \cdots & \mathtt{ED[i-2, j-2]} & \rightarrow & \mathtt{ED[i-2, j-1]} & \rightarrow &\mathtt{ED[i-2, j]} \\ & \downarrow & \searrow & \downarrow & \searrow & \downarrow \\ \cdots & \mathtt{ED[i-1, j-2]} & \rightarrow & \mathtt{ED[i-1, j-1]} & \rightarrow &\mathtt{ED[i-1, j]} \\ & \downarrow & \searrow & \downarrow & \searrow & \downarrow \\ \cdots & \mathtt{ED[i, j-2]} & \rightarrow & \mathtt{ED[i, j-1]} & \rightarrow & \mathtt{ED[i, j]} \\ \end{array} \]

Row major or column major

Either row major (row by row) or column major (column by column) order will work.
For example, here is row major as a nested for-loop:

for i from 1 to m
    for j from 1 to n
         TODO: populate E[i,j]

Space and Time

Data structure: ED[1..m, 1..n], so space is \(O(mn)\).
Recursive structure: ED[i,j] is the minimum of the following:
- ED[i-1, j-1] (if A[i] = B[j])
- ED[i-1, j-1] + 1 (if A[i] != B[j])
- ED[i, j-1] + 1
- ED[i-1, j] + 1
  - This minimum can be taken in constant time (\(O(1)\)).
Evaluation order:

for i from 1 to m
    for j from 1 to n
         TODO: populate E[i,j]

The populate step takes constant time, so time is \(mn \cdot O(1) = O(mn)\)

Algorithm

Finally, we can write the algorithm. (Note we add a row and a column of 0’s as a base case.)

EditDistance(A[1 .. m], B[1 .. n]):
    Initialize ED[0, *] and ED[*, 0] to *, for *=0,1,2,... 
    for i <- 1 to m
        for j <- 1 to n
            insert <- ED[i, j − 1] + 1
            delete <- ED[i − 1, j] + 1
            if A[i] = B[j]
                mutate <- ED[i − 1, j − 1]
            else
                mutate <- ED[i − 1, j − 1] + 1
            ED[i, j] <- min {insert, delete, mutate}
    return ED[m, n]

Check out the memoization table.

Exam #1: February 10 (Wed.)

Are you ready?
- You should be able to do tonight’s assignment by yourself, without getting help from others.
  - It’s OK to get help from others, but before you do you should try to solve the problem by yourself.
- If you can’t do this assignment by yourself, that’s a sign that you are relying too much on others to do your work for you.
What if you’re not ready?
- Get a notebook and a pencil and redo all the assignments by yourself without looking at the solutions.
- Try some other Exercises in the book. The goal is to exercise the techniques you have learned, not necessarily produce a perfect solution.