Dynamic Programming: Tables

February 2, 2021

Assignment Comments, Sample Solutions

Pseudocode is not code

// Let price[1..N] be a constant array of prices.
// So price[i] is the cost of a churro of length i.

CutChurro(n):
    if n = 0
        return 0
    maxTotPrice <- -1
    for i <- 1 to n
        totPriceTry <- price[i] + CutChurro(n-i)
        if totPriceTry > maxTotPrice
            maxTotPrice <- totPriceTry 
    return maxTotPrice
  • The goal is to communicate your algorithm in a way that makes its correctness transparent.
  • It’s OK to have things a little less “buttoned-up” if it makes it clearer what the algorithm is doing.

Step 1: understand the recursive structure

// Let price[1..N] be a constant array of prices.
// So price[i] is the cost of a churro of length i.

CutChurro(n):
    if n = 0
        return 0
    maxTotPrice <- -1
    for i <- 1 to n
        totPriceTry <- price[i] + CutChurro(n-i)
        if totPriceTry > maxTotPrice
            maxTotPrice <- totPriceTry 
    return maxTotPrice
  • CutChurro(n) depends on the Recursion Fairy taking care of CutChurro(n-1), CutChurro(n-2), …, CutChurro(0).
  • Data structure: One-dimensional array CC[0..n]
  • Evaluation order: CC[0] is the first thing to fill in, then CC[1], CC[2], … etc, in that order.

Fast Churro Cutting

FastCutChurro(n):
    CC[0] <- 0
    for i <- 1 to n
        maxTotPrice <- 0
        for j <- 1 to i
            totPriceTry <- price[j] + CC[i-j]
            if maxTotPrice < totPriceTry
                maxTotPrice <- totPriceTry
        CC[i] <- maxTotPrice
    return CC[n]
  • The recurrence in this algorithm is transparently the same as the recursive call in the recursive algorithm.

Fast Churro Cutting: Space and Time

FastCutChurro(n):
    CC[0] <- 0
    for i <- 1 to n
        maxTotPrice <- 0
        for j <- 1 to i
            totPriceTry <- price[j] + CC[i-j]
            if maxTotPrice < totPriceTry
                maxTotPrice <- totPriceTry
        CC[i] <- maxTotPrice
    return CC[n]
  • Old (recursive) version: \(O(2^n)\) (exponential time)
  • Dynamic programming (iterative) time: \(O(n^2)\) (nested for-loops)
    • Space: \(O(n)\), because of the one-dimensional array
  • Since \(n^2 \in o(2^n)\), this time is subexponential.
    • Exponential functions dominate polynomials.

Fast Churro Cutting: Exponentially better time

FastCutChurro(n):
    CC[0] <- 0
    for i <- 1 to n
        maxTotPrice <- 0
        for j <- 1 to i
            totPriceTry <- price[j] + CC[i-j]
            if maxTotPrice < totPriceTry
                maxTotPrice <- totPriceTry
        CC[i] <- maxTotPrice
    return CC[n]
  • Actual timings show that going from \(O(2^n)\) to \(O(n^2)\) is very noticable, even for small examples:
> system.time(CutChurro(20))
   user  system elapsed 
  0.736   0.002   0.738 
> system.time(FastCutChurro(20))
   user  system elapsed 
      0       0       0 

Coding Optimizations

  • Note: Better code is not necessarily better pseudocode.
  • These examples are cleaner, but the connection between the recursive algorithm is less transparent.
FastCutChurro(n):
    CC[0] = 0
    for (i in 1:n)
        CC[i] = -1
        for (j in 1:i)
            p = price[j] + CC[i-j]
            if p > CC[i]: CC[i] = p
    return CC[n]
    

FastCutChurro(n):
    Initialize CC[1..n] to price[1..n]
    for i in 2..n:
        for j in 1..floor(i/2):
            if CC[j] + CC[i-j] > CC[i]:
                CC[i] <- CC[j] + CC[i-j]
    return CC[n]

Dynamic Programming

Dynamic Programming Problems

  • Problem has several options, and recursive substructure.
    • String splitting: find a word, then split what’s left.
    • Churro cutting: cut off a piece, then price what’s left.
  • Many of the subproblems are the same: overlapping subproblems.
    • The same substrings recur.
    • Several churro pieces have same length.

Dynamic programming keeps track of the subproblem solutions, and iteratively builds an array/table of solutions.

Dynamic Programming Steps

  1. Specify the problem as a recursive function, and solve it recursively (with an algorithm or formula).

  2. Build the solutions from the bottom up:

    • What are the subproblems?
    • What data structure do I store the solutions in? (Usually an array.)
    • Which subproblem does each subproblem depend on?
      • Determine an evaluation order.
    • Space and time?
    • Write the algorithm. Usually it will just be a couple nested loops, with the recursive calls replaced by array look-ups.

Edit Distance

Minimum edit distance between two strings

How many operations does it take to transform one string (e.g., DNA sequence) to another? The operations are:

  • Mutation: Change one letter to another value.
  • Deletion: Delete one letter.
  • Insertion: Insert a letter.
SATURDAY -mutate-> SUTURDAY -mutate-> SUNURDAY -delete-> SUNRDAY -delete-> SUNDAY
SATURDAY -delete-> STURDAY -delete-> SURDAY -mutate-> SUNDAY
KITTEN -mutate-> SITTEN -mutate-> SITTIN -insert-> SITTING

Alignment diagrams: |-bars represent 1 unit of cost.

SATURDAY    SATURDAY    KITTEN     ALGOR I THM
 ||||        || |       |   | |      || | | ||
SUN  DAY    S  UNDAY    SITTING    AL TRUISTIC

Recursive Substructure

SATURDAY    SATURDAY    KITTEN     ALGOR I THM
 ||||        || |       |   | |      || | | ||
SUN  DAY    S  UNDAY    SITTING    AL TRUISTIC

The rightmost element in an alignment diagram can be:

  • A match (cost 0) \(\substack{x \\ | \\ x}\)
  • A mutation (cost 1) \(\substack{x \\ | \\ y}\)
  • An insertion (cost 1) \(\substack{- \\ | \\ x}\)
  • A deletion (cost 1) \(\substack{x \\ | \\ -}\)

Try all four, then have the Recursion Fairy optimize the rest.

Table Groups

Table #1 Table #2 Table #3 Table #4 Table #5 Table #6 Table #7
Graham Kristen John James Bri Trevor Blake
Isaac Josiah Drake Andrew Jack Talia Logan
Levi Kevin Claire Jordan Grace Ethan Nathan

Use the Recursion Fairy

Suppose A[1..m] and B[1..n] are two strings. Suppose also that you have a function EditDistance(i,j) that returns the minimum edit distance from A[1..i] to B[1..j], as long as \(i<m\) or \(j<n\).

  1. Suppose A[m] = B[n] (the last characters of the strings match). If the rightmost element of the alignment diagram is a match \(\substack{x \\ | \\ x}\), what is the minimum edit distance from A[1..m] to B[1..n]? (Use the EditDistance function.)

  2. Suppose A[m] != B[n] (the last characters of the strings don’t match). If the rightmost element of the alignment diagram is a mutation \(\substack{x \\ | \\ y}\), what is the minimum edit distance?

Insertions and Deletions

  1. If the rightmost element of the alignment diagram is an insertion \(\substack{- \\ | \\ x}\), what is the minimum edit distance?

  2. If the rightmost element of the alignment diagram is a deletion \(\substack{x \\ | \\ -}\), what is the minimum edit distance?

Using Dynamic Programming

Data Structure and recurrence

Let ED[1..m, 1..n] be an array in which we will store the minimum edit distance from A[1..i] to B[1..j].

ED[i,j] will be the minimum of the following:

  • ED[i-1, j-1] (if A[i] = B[j])
    • match
  • ED[i-1, j-1] + 1 (if A[i] != B[j])
    • mutation
  • ED[i, j-1] + 1
    • insertion
  • ED[i-1, j] + 1
    • deletion

Evaluation Order

ED[i,j] is the minimum of the following:

  • ED[i-1, j-1] (if A[i] = B[j])
  • ED[i-1, j-1] + 1 (if A[i] != B[j])
  • ED[i, j-1] + 1
  • ED[i-1, j] + 1

So the evaluation order must obey the arrows:

\[ \begin{array}{cccccc} \cdots & \mathtt{ED[i-2, j-2]} & \rightarrow & \mathtt{ED[i-2, j-1]} & \rightarrow &\mathtt{ED[i-2, j]} \\ & \downarrow & \searrow & \downarrow & \searrow & \downarrow \\ \cdots & \mathtt{ED[i-1, j-2]} & \rightarrow & \mathtt{ED[i-1, j-1]} & \rightarrow &\mathtt{ED[i-1, j]} \\ & \downarrow & \searrow & \downarrow & \searrow & \downarrow \\ \cdots & \mathtt{ED[i, j-2]} & \rightarrow & \mathtt{ED[i, j-1]} & \rightarrow & \mathtt{ED[i, j]} \\ \end{array} \]

Row major or column major

\[ \begin{array}{cccccc} \cdots & \mathtt{ED[i-2, j-2]} & \rightarrow & \mathtt{ED[i-2, j-1]} & \rightarrow &\mathtt{ED[i-2, j]} \\ & \downarrow & \searrow & \downarrow & \searrow & \downarrow \\ \cdots & \mathtt{ED[i-1, j-2]} & \rightarrow & \mathtt{ED[i-1, j-1]} & \rightarrow &\mathtt{ED[i-1, j]} \\ & \downarrow & \searrow & \downarrow & \searrow & \downarrow \\ \cdots & \mathtt{ED[i, j-2]} & \rightarrow & \mathtt{ED[i, j-1]} & \rightarrow & \mathtt{ED[i, j]} \\ \end{array} \]

  • Either row major (row by row) or column major (column by column) order will work.
  • For example, here is row major as a nested for-loop:
for i from 1 to m
    for j from 1 to n
         TODO: populate E[i,j]

Space and Time

  • Data structure: ED[1..m, 1..n], so space is \(O(mn)\).
  • Recursive structure: ED[i,j] is the minimum of the following:
    • ED[i-1, j-1] (if A[i] = B[j])
    • ED[i-1, j-1] + 1 (if A[i] != B[j])
    • ED[i, j-1] + 1
    • ED[i-1, j] + 1
      • This minimum can be taken in constant time (\(O(1)\)).
  • Evaluation order:
for i from 1 to m
    for j from 1 to n
         TODO: populate E[i,j]
  • The populate step takes constant time, so time is \(mn \cdot O(1) = O(mn)\)

Algorithm

Finally, we can write the algorithm. (Note we add a row and a column of 0’s as a base case.)

EditDistance(A[1 .. m], B[1 .. n]):
    Initialize ED[0, *] and ED[*, 0] to *, for *=0,1,2,... 
    for i <- 1 to m
        for j <- 1 to n
            insert <- ED[i, j − 1] + 1
            delete <- ED[i − 1, j] + 1
            if A[i] = B[j]
                mutate <- ED[i − 1, j − 1]
            else
                mutate <- ED[i − 1, j − 1] + 1
            ED[i, j] <- min {insert, delete, mutate}
    return ED[m, n]

Check out the memoization table.

Exam #1: February 10 (Wed.)

  • Are you ready?
    • You should be able to do tonight’s assignment by yourself, without getting help from others.
      • It’s OK to get help from others, but before you do you should try to solve the problem by yourself.
    • If you can’t do this assignment by yourself, that’s a sign that you are relying too much on others to do your work for you.
  • What if you’re not ready?
    • Get a notebook and a pencil and redo all the assignments by yourself without looking at the solutions.
    • Try some other Exercises in the book. The goal is to exercise the techniques you have learned, not necessarily produce a perfect solution.