Notes

Notes, etc. of Samuel Flint

Design and Analysis of Algorithms

Lecture 0 <2019-08-26 Mon>

  • Syllabus review

  • Course Webpage

  • Proof-heavy course

  • Design – methods used to create new algorithms

  • Analysis – Mathematical assessment of an algorithm's correctness and efficiency

  • Correctness – Does it do what it's supposed to do on all possible inputs?

  • Efficiency – Measure the algorithm's running time, counting the number of basic operations

Lecture 1 <2019-08-26 Mon>

Asymptotic Notation – Starts on Slide 3

  • Big-$O$

  • Big-$\Omega$

  • Big-$\Theta$

  • Little-$o$

  • Little-$\omega$

  • Not Interchangeable

  • Asymptotic Upper Bound – Big-$O$, $O(g(n)) = \{f(n) : \exists c, n_0 > 0 \mid \forall n \geq n_0, 0 \leq f(n) \leq cg(n) \}$

  • Asymptotic Lower Bound, Big-$\Omega$, $\Omega(g(n)) = \{f(n) : \exists c, n_0 > 0 \mid \forall n \ge n_0, 0 \leq cg(n) \leq f(n) \}$

  • Asymptotic Tight Bound, Big-$\Theta$, $\Theta(g(n)) = \{f(n) : \exists c_1, c_2, n_0 > 0 \mid \forall n \geq n_0, 0 \leq c_1 g(n) \leq f(n) \leq c_2 g(n) \}$ – a worked example follows this list

  • Upper Bound, not tight, little-$o$

  • Lower bound, not tight, little-$\omega$
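
  As a quick worked example of the tight-bound definition (an
  illustration, not from the slides): for $f(n) = 3n^2 + 10n$, take
  $c_1 = 3$, $c_2 = 4$, and $n_0 = 10$. Then for all $n \geq n_0$,
  $0 \leq 3n^2 \leq 3n^2 + 10n \leq 4n^2$ (since $10n \leq n^2$ once
  $n \geq 10$), so $f(n) \in \Theta(n^2)$.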

Lecture 1, Cont. <2019-08-28 Wed>

Upper And Lower Bounds

  • 01.9

  • Analyze algorithms and problems in terms of time complexity (count of operations)

  • Sometimes in space complexity (memory usage)

  • upper and lower bounds are used

  • applies to general problems

  • 01.10 – upper is most common form

  • Algorithm $A$ has an upper bound of $f(n)$ for input of size $n$ if there exists no input of size $n$ such that $A$ requires more than $f(n)$ time

  • Consider sort, Quicksort and Bubblesort are $\mathcal{O}(n^2)$, and Mergesort is $\mathcal{O}(n \log{n})$

    • Quicksort is used more often in practice, when randomized tends to be closer to $n \log{n}$ time

  • Lower Bounds are like best-case results, typically not as interesting

  • 01.11 – Upper bound of a problem

  • A problem has an upper bound of $f(n)$ if there exists at least one algorithm that has an upper bound of $f(n)$

    • there exists an algorithm with time/space complexity of at most $f(n)$ on all inputs of size $n$

  • Mergesort has worst-case $\mathcal{O}(n \log{n})$, sorting has an upper bound of $\mathcal{O}(n \log{n})$

  • 01.12 – Lower bounds of a problem

  • A problem has a lower bound of $f(n)$ if, for any algorithm $A$, to solve it, there exists at least one input of size $n$ that forces $A$ to take at least $f(n)$ time/space.

    • Called the pathological input, depends on $A$ itself.

    • A reverse-sorted input of size $n$ forces Bubblesort to take $\Omega(n^2)$

    • Since every sorting algorithm has an input of size $n$ forcing $\Omega(n \log{n})$ steps, the sorting problem has a time complexity lower bound of $\Omega(n \log{n})$. Therefore, Mergesort is asymptotically optimal.

  • 01.13 Lower Bounds, 2

    • To argue a lower bound, use an adversarial argument: the adversary simulates an arbitrary algorithm $A$ to build a pathological input

    • Needs to be a general form, pathological input depends on specific algorithm $A$

    • Can reduce one problem to another to establish lower bounds

      • Show that if we can compute convex hull in $o(n \log{n})$ time, then we can sort in time $o(n \log{n})$; cannot be true, so convex hull takes $\Omega(n \log{n})$

Efficiency

  • 01.14 – Efficiency

    • An algorithm is time- or space- efficient if worst case time/space complexity is $\mathcal{O}(n^c)$ for constant $c$ and input size $n$, polynomial in the input size

    • Input size is measured in the number of bits needed to represent the input

      • A graph of $n$ nodes takes $\mathcal{O}(n \log{n})$ bits to represent nodes and $\mathcal{O}(n^2 \log{n})$ to represent edges; an algorithm in $\mathcal{O}(n^c)$ is efficient

      • A problem that includes as an input a number $k$ (e.g., a threshold) only needs $\mathcal{O}(\log{k})$ bits to represent it, so to be efficient an algorithm must run in $\mathcal{O}(\log^c{k})$

      • If polynomial in $k$, called pseudopolynomial

Recurrence Relations

  • 01.15 – Recurrence Relations

    • Non-recursive relations are relatively easy to analyse.

    • Recurrence relations are used for time complexity of recursive algorithms, bound relation asymptotically

    • For example, Mergesort

      • $T(n) = 2 T(n/2) + \mathcal{O}(n)$

      • Bound on $T(n)$ needed

      • Use the Master theorem (See 235/156 notes)

      • Other approaches include 01.17–18

Graphs

  • 01.19–24 – Graphs

    • Set of vertices $V$, set of unordered pairs between vertices $E$ (edges)

    • Directed graph has ordered pairs

    • A weighted graph is a (directed) graph augmented with weights for the edges

    • Adjacency list: each vertex has a linked list of its neighbors, $\mathcal{O}(n + m)$ space

    • Adjacency matrix, $n \times n$ – $\mathcal{O}(n^2)$

Lecture 1, Cont. <2019-08-30 Fri>

Graphs

  • 01.24 – Adjacency matrices

  • $n \times n$ matrix $M$, $M(i,j) = w$ for weight, $\infty$ for non-edges, takes $\Theta(n^2)$ space

Algorithmic Techniques, Dynamic Programming

  • 01.25

  • Technique for solving optimization problems, where a best solution is needed, evaluated by an objective function

  • Decomposes a problem into subproblems, optimally solves recursively, combines into final/optimal solution

  • Exponential number of subproblems, many overlap, re-use solutions

  • Number of distinct subproblems is polynomial

  • Works if they have optimal substructure property – optimal solution is made up of optimal solutions to subproblems

  • All-pairs shortest paths, knapsack

  • Can prove correctness if optimal substructure property exists

  • Must be careful, very carefully proven

Algorithmic Techniques, Greedy Algorithms

  • 01.26

  • Must consider optimal substructure property

  • Each step, commits to locally optimal

  • Tend to be very quick, does not use all possible subproblems

  • MST, Dijkstra's

Divide And Conquer

  • 01.27

  • Not limited to optimization, solve each sub-problem recursively, then combine

  • Think Mergesort

Proof By Contradiction

  • 01.28

  • Assume opposite of premise to be proved, arrive at a contradiction of some other assumption

  • To prove there is no greatest even integer:

    • Assume for a reductio that there exists a greatest even integer $N$; that is, for all even integers $n$, we have $N \geq n$. (1)

    • But $M = N + 2$ is an even integer, since it's the sum of two even integers, and $M > N$.

    • Therefore, (1) is false, so there is no greatest even integer. $\square$

Proof by Induction

  • 01.29

  • Base case, and inductive step, involves non-negative integers

  • Useful for proving invariants in algorithms, properties that always hold at every step (especially at the final step)

  • See PHIL 411 notes

Proof by Construction

  • 01.30

  • Proof of existence by directly constructing it.

  • Will be used extensively for NP-completeness

Proof by Contraposition

  • 01.31

  • $P \to Q \equiv \lnot Q \to \lnot P$

  • Proof that if $x^2$ is even, then $x$ is even. Contraposition is the simplest

  • Useful for proving biconditionals as well

  • Useful for NP-completeness

Lecture 2: Medians and Order Statistics <2019-08-30 Fri>

  • Given an array $A$ of $n$ distinct numbers, $i$th order statistic of $A$ is its $i$th smallest element

  • $i = 1$ – minimum

  • $i = n$ – max

  • $i = \lfloor (n+1)/2 \rfloor$ – (lower) median

  • Given array $A$ of $n$ elements and a number $i \in \{1, \dots, n\}$ find the $i$th order statistic of $A$

  • Can we do better?

  • Minimum: linear time – just scan through the array, keeping track of the smallest so far (a minimal sketch follows this list)

    • The variable small holds the smallest of $A_1, \ldots, A_i$ – the loop invariant

    • Loop invariant can be shown easily via induction

    • Correctness by observing $i = n$ at the end, but before return

    • $\Theta(n)$ – Cannot do better

    • Must find a lower bound

    • Must make at least $n - 1$ comparisons

      • no element can be eliminated until shown it's greater than at least one other element

      • All still eligible to be smallest are in a bucket, removed after shown to be $>$ another.

      • Each comparison removes at most one element, from the bucket, therefore at least $n - 1$ comparisons are needed to remove all but one from the bucket

      • Follows by pigeonhole principle
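
  A minimal sketch of the linear scan in Common Lisp (the name and the
  vector representation are assumptions of mine, not from the slides):

      (defun find-minimum (vec)
        "Return the smallest element of the non-empty vector VEC using
      exactly (1- (length VEC)) comparisons."
        ;; Invariant: SMALL is the minimum of VEC[0..i-1].
        (let ((small (aref vec 0)))
          (loop for i from 1 below (length vec)
                when (< (aref vec i) small)
                  do (setf small (aref vec i)))
          small))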

  • Minimum and Maximum (02.06)

    • Given $A$ with $n$ elements, find both min and max

    • The obvious algorithm (separate min and max scans) makes $2n - 2$ comparisons; no asymptotic improvement is possible, but the constant can be reduced

    • Do better by checking for both

    • Look at 02.07

Lecture 2: Medians and Order Statistics, Continued <2019-09-04 Wed>

The simultaneous Minimum and Maximum (starts 02.06)

  • find-min and find-max are optimal, as they are at $n - 1$ comparisons for $n$ elements.

  • Algorithm presented on 02.07 – be familiar with the algorithm (a cleaned-up version below)

      (defun min-and-max (nums)
        "Return (VALUES MAX MIN) of the sequence NUMS (length >= 2)
      using about 3n/2 comparisons: the elements of each pair are
      compared to each other once, then only the larger is checked
      against MAX and only the smaller against MIN."
        (let* ((vec (coerce nums 'vector))    ; O(1) indexing, unlike NTH
               (len (length vec))
               (max (max (aref vec 0) (aref vec 1)))
               (min (min (aref vec 0) (aref vec 1))))
          (do ((i 2 (+ i 2)))
              ((>= (1+ i) len))
            (let ((a (aref vec i))
                  (b (aref vec (1+ i))))
              (when (> a b) (rotatef a b))    ; 1 comparison: now a <= b
              (when (> b max) (setq max b))   ; 1 comparison for MAX
              (when (< a min) (setq min a)))) ; 1 comparison for MIN
          ;; An odd element count leaves one unpaired trailing element.
          (when (oddp len)
            (setq max (max max (aref vec (1- len)))
                  min (min min (aref vec (1- len)))))
          (values max min)))
  • This algorithm makes about $\frac{3}{2}n$ comparisons, still $\mathcal{O}(n)$ but with a smaller constant

Selection of $i$th smallest value (02.10)

  • Select $i$th smallest value. Obvious solution (sort, return $i$th element), complexity is $\Theta(n \log n)$

  • Can do better, running in linear time

  • Uses divide-and-conquer, similar to binary search

  • Definition on 02.12

  • Uses partition like in quicksort (deterministic)

  • $q$ from algorithm is in its final sorted position

  • Select calls partition, which chooses a pivot element $A_q$ and reorders $A$ such that all elements $< A_q$ are to the left of $A_q$ and all elements $> A_q$ are to the right

  • If $A_q$ is the $i$th element, return it; if its rank is greater than $i$, search left; if less, search right.

  • Partition chooses a pivot element $A_x$, exchanges $A_x$ and $A_r$, then loops to exchange elements appropriately (a sketch follows this list)
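
  A Lomuto-style sketch of the partition step in Common Lisp, assuming
  0-based inclusive bounds and distinct elements; the pseudocode on the
  slides (02.12) may differ in its index conventions:

      (defun partition-around (vec p r pivot-index)
        "Partition VEC[p..r] around the element at PIVOT-INDEX and
      return the pivot's final position q, so that afterwards
      VEC[p..q-1] < VEC[q] < VEC[q+1..r]."
        (rotatef (aref vec pivot-index) (aref vec r)) ; pivot to the end
        (let ((pivot (aref vec r))
              (q p))
          (loop for i from p below r
                when (< (aref vec i) pivot)
                  do (rotatef (aref vec i) (aref vec q))
                     (incf q))
          (rotatef (aref vec q) (aref vec r))         ; pivot into place
          q))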

Lecture 2: Medians and Order Statistics, Continued <2019-09-06 Fri>

Selection, Continued

  • Recall algorithm, from 02.12

  • Requires partition, defined later on

  • Partition chooses the pivot element

  • Partition helps to preserve the invariant

  • Choice of pivot element is crucial

Choosing a Pivot element

  • The choice is critical to low time complexity

  • The best choice of pivot element splits the partition as evenly as possible

  • Ideally use the median, but computing the median is itself a selection problem – circular

  • Median of Medians

  • Given $A$ of $n$ elements, partition $A$ into $m = \lfloor n/5 \rfloor$ groups of 5 elements, and at most one other group of $n \bmod 5$ elements

  • $A^{\prime} = [ x_1, x_2, \ldots, x_{\lceil n / 5 \rceil}]$, where $x_i$ is the median of group $i$, found by sorting group $i$

  • $y = \mathrm{Select}(A^{\prime}, 1, \lceil n/5 \rceil, \lfloor(\lceil n/5 \rceil + 1)/2 \rfloor)$ is the pivot element; scan $A[p \cdots r]$ for $y$ and return its index – this is ChoosePivotElement

  • As a team, do 02.19

Time Complexity

  • See 02.20

  • Recurrence form (02.22):

    • $T(n)$ total time for size $n$

    • Pivot selection takes $\mathcal{O}(n)$ to split and $T(n/5)$ for finding median of medians, partitioning $n$ elements takes $\mathcal{O}(n)$ time

    • The recursive Select call takes at most $T(3n/4)$

    • ∴, $T(n) \leq T(n/5) + T(3n/4) + \mathcal{O}(n)$

    • Or, rewritten as $T(\alpha n) + T(\beta n) + O(n)$ where $\alpha = 1/5$ and $\beta = 3/4$

    • By the theorem from Lecture 1, when $\alpha + \beta < 1$, $T(n) \in \mathcal{O}(n)$ (a substitution check follows this list)

  • And thus, Select has complexity $\mathcal{O}(n)$

  • Proof shown on 02.23

  • Why is the recursion on at most $3/4$ of the elements? The median of medians is guaranteed to be greater than (and less than) at least roughly $1/4$ of the elements – see 02.21
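
  A quick substitution check of that theorem's conclusion (illustrative,
  not from the slides): assume $T(m) \leq cm$ for all $m < n$, and
  $T(n) \leq T(n/5) + T(3n/4) + an$. Then
  $T(n) \leq cn/5 + 3cn/4 + an = \frac{19}{20}cn + an \leq cn$ whenever
  $c \geq 20a$, so $T(n) \in \mathcal{O}(n)$.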

Lecture 2: Medians and Order Statistics, Continued <2019-09-09 Mon>

Practice, Median of Medians, 02.19

  • $A = [4, 9, 12, 17, 6, 5, 21, 14, 8, 11, 13, 29, 3]$

  • select(A, 1, 13, 4)

    • Partition(A, 1, 13)

      • ChoosePivotElement(A, 1, 13)

        • $[4, 9, 12, 17, 6], [5, 21, 14, 8, 11], [13, 29, 3]$

        • $[9, 11, 13]$

        • $11$ is the median of medians; its index in $A$ is $x = 10$

      • Proceed to swap

      • After partitioning, the low side holds $\{4, 9, 6, 3, 8, 5\}$ and the high side $\{12, 17, 21, 14, 13, 29\}$, with the pivot 11 at $q = 7$; thus $k = q - p + 1 = 7$, and since $i = 4 < k$ we recurse on the low side

  • Make sure to use lower median!

Lecture 3: Dynamic Programming <2019-09-09 Mon>

  • solving optimization problems

  • Decompose problem into subproblems, solving them recursively and combine the solutions into a final, optimal solution

  • An exponential number of subproblems to solve, but they overlap, meaning solutions can be reused

  • Number of distinct subproblems is polynomial

The Rod cutting problem

  • Rod of length $n$, cut into integer length smaller rods to maximize profit.

  • Rod of length $i$ has price $p_i$

  • Cuts are free; profit is based solely on prices charged for rods

  • Cuts only occur at integral boundaries, so there are $n - 1$ positions to cut, total number of possible solutions is $2^{n - 1}$

  • Cuts $i_1, \ldots, i_k$ (with $i_1 + \cdots + i_k = n$), such that revenue $r_n = p_{i_1} + \cdots + p_{i_k}$ is maximized

  • Has the optimal substructure property

Lecture 3: Dynamic Programming, Continued <2019-09-11 Wed>

Rod cutting, an implementation

  • Recall the optimal substructure property, an optimal solution is made up of optimal solutions to subproblems

  • Easy to prove via contradiction

  • $r_n = \mathrm{max}(p_n, r_1 + r_{n - 1}, \ldots, r_{n - 1} + r_1)$

  • Helpful to simply store previously calculated values

Lecture 3: Dynamic Programming, Continued <2019-09-13 Fri>

Rod Cutting, continued

  • Requires considering all possibilities (with one initial cut) for a given $r_n$, which gives two subproblems.

  • Proof of correctness/proof of optimal substructure property by contradiction

    • Assume for a reductio that the solution is not optimal for the subproblem…

  • Some dynamic programming problems can be considered slightly differently.

  • $r_n = \max_{1 \leq i \leq n} (p_i + r_{n-i})$ is an alternative form (03.05)

  • Rod cutting proof available in book, §15.1 (p. 362)

  • Cut-rod on (03.06)

  • Missing memoization

Time Complexity

  • $T(0) = 1$, See 03.07

  • Non-memoized complexity is exponential: $T(n) = 2^n$

  • Top-down with memoization

  • Bottom-up, fill in table

  • Have the same running time

  • Memoized (recursive) implementation at 03.10

  • Bottom up at 03.11 – easier to analyse (a bottom-up sketch follows this list)
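
  A bottom-up sketch in Common Lisp, following
  $r_n = \max_{1 \leq i \leq n}(p_i + r_{n - i})$; the vector-of-prices
  representation (index 0 unused) is my own assumption, not the slides'
  exact pseudocode:

      (defun cut-rod (prices n)
        "Bottom-up rod cutting.  (AREF PRICES i) is the price of a rod
      of length i.  Returns the maximum revenue r_n, filling the table
      of subproblem solutions from shorter rods to longer ones."
        (let ((r (make-array (1+ n) :initial-element 0)))
          (loop for j from 1 to n
                do (setf (aref r j)
                         (loop for i from 1 to j
                               maximize (+ (aref prices i)
                                           (aref r (- j i))))))
          (aref r n)))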

Reconstructing a solution

  • add a second array $s$ to keep track of the optimal size of the first piece cut in each subproblem

  • Printing from this table is reasonably simple

Lecture 3: Dynamic Programming, Continued <2019-09-16 Mon>

Matrix Chain Multiplication

  • long chain of $n$ matrices we want to multiply

  • $p \times q$ by $q \times r$ requires $pqr$ steps

  • Brute force solution is $\Omega(4^n / n^{3/2})$

  • Steps:

    1. Characterize structure of an optimal solution

    2. Recursively define the value of an optimal solution

    3. Compute the value of an optimal solution

    4. Construct an optimal solution from computed information (03.18)

  • Step 1

    • $A_{i \dots j}$ is the matrix from $A_i A_{i + 1} \cdots A_j$

    • Must split the product: compute $A_{i \dots k}$ and $A_{k+1 \dots j}$, then multiply the two together

    • The split point $k$ determines the optimal parenthesization

    • Assume the cost is not optimal; then there exists some other solution $s^{\prime}$ for the subchain from $i$ to $k$ such that $cost(s^{\prime}) < cost(s)$

    • Then replace $s$ with $s^{\prime}$

  • Step 2:

    • $m[i, j]$ is the minimum number of multiplications for $A_i$ to $A_j$

    • $m[i, j]$

      1. $i = j$: $m[i, j] = 0$

      2. $i < j$, split at $k$, $m[i, j] = m[i, k] + m[k + 1, j] + p_{i - 1}p_{k}p_j$

      3. $k$ must be tried for all possible values, store in $s[i, j]$ to find $k$ for a given $i, j$ pair. (03.20)

  • Step 3:

    • There are $\Theta(n^2)$ distinct subproblems

    • Bottom-up computation proceeds on increasing chain length

      • Linear to solve each problem, a quadratic number of subproblems, ∴, $\Theta(n^3)$

      • Implementation on 03.22; a sketch follows below
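
  A sketch of the bottom-up table computation in Common Lisp; the
  dimension-vector representation is an assumption of mine, and the
  pseudocode on 03.22 may differ in details:

      (defun matrix-chain-order (dims)
        "DIMS is the dimension vector p of length n+1; matrix A_i is
      p[i-1] x p[i].  Returns the table m, where (AREF M i j) is the
      minimum number of scalar multiplications to compute A_i..A_j."
        (let* ((n (1- (length dims)))
               (m (make-array (list (1+ n) (1+ n)) :initial-element 0)))
          (loop for len from 2 to n                  ; chain length
                do (loop for i from 1 to (+ (- n len) 1)
                         for j = (+ i len -1)
                         do (setf (aref m i j) most-positive-fixnum)
                            (loop for k from i below j
                                  do (let ((cost (+ (aref m i k)
                                                    (aref m (1+ k) j)
                                                    (* (aref dims (1- i))
                                                       (aref dims k)
                                                       (aref dims j)))))
                                       (when (< cost (aref m i j))
                                         (setf (aref m i j) cost))))))
          m))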

Lecture 3: Dynamic Programming, Continued <2019-09-18 Wed>

Matrix Chain Multiplication, Continued

  • Optimal substructure defined with regards to characterization of subproblems

Optimal Substructure

  • Consider shortest path?

  • Does shortest path have optimal substructure?

  • Yes!

  • But not for Longest Simple Path. LSP is in fact NP complete, because subproblems are not independent

Longest Common Subsequence

  • Sequence $Z$ is a subsequence of another sequence $X$ if there is a strictly increasing sequence $i_1, \ldots, i_k$ of indices of $X$ such that for all $j = 1, \ldots, k$, $x_{i_j} = z_j$

  • As one reads through $Z$, one can find a match to each symbol of $Z$ in $X$ in order.

  • $Z$ is a common subsequence of $X$ and $Y$ if it is a subsequence of both

  • The goal of LCS is to find a maximum-length common subsequence of sequences $X$ and $Y$

Characterizing Structure of LCS Optimal Solution

  • Given $X$, the $i$th prefix of $X$ is $X_i$

  • If $X$ and $Y$ have LCS $Z$, then

    1. $x_m = y_n \implies z_k = x_m = y_n$ and $Z_{k - 1}$ is an LCS of $X_{m - 1}$ and $Y_{n - 1}$

      • if $z_k \neq x_m$, can lengthen $Z$. Contradiction

      • If $Z_{k - 1}$ is not LCS of $X_{m - 1}$ and $Y_{n - 1}$, then a longer CS of $X_{m - 1}$ and $Y_{n - 1}$ could have $x_m$ appended to it to get CS of $X$ and $Y$ that is longer than $Z$. Contradiction

    2. See 03.31!

  • Last symbol must be equal!

Recursive definition of Optimal Solution

  • $c[i, j]$ defined on 03.32

  • Find the LCS of the prefixes if $X$ and $Y$'s last characters match

  • If not, find LCS of $X$ and $Y_{n - 1}$, find LCS of $X_{m - 1}$ and $Y$ and use the longer.

  • Remember to use mathematical style

Define the Algorithm

  • 03.33

  • $b$ is used to keep direction

  • A diagonal arrow marks a position where an LCS symbol is printed

  • But the LCS is printed backwards; use recursion to print in the correct order (a sketch of the table computation follows)
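
  A sketch of the $c$-table computation in Common Lisp, omitting the
  direction table $b$ from 03.33; string inputs and names are my own
  assumptions:

      (defun lcs-length (x y)
        "Length of the longest common subsequence of strings X and Y.
      c[i][j] holds the LCS length of the prefixes X_i and Y_j."
        (let* ((m (length x))
               (n (length y))
               (c (make-array (list (1+ m) (1+ n)) :initial-element 0)))
          (loop for i from 1 to m
                do (loop for j from 1 to n
                         do (setf (aref c i j)
                                  (if (char= (char x (1- i))
                                             (char y (1- j)))
                                      (1+ (aref c (1- i) (1- j)))
                                      (max (aref c (1- i) j)
                                           (aref c i (1- j)))))))
          (aref c m n)))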

Lecture 3: Dynamic Programming, Continued <2019-09-20 Fri>

  • exactly 4

  • 5 and 4 hinge on finding the correct subproblem

Lecture 3: Dynamic Programming, Continued <2019-09-23 Mon>

Optimal Binary Search Trees

  • Goal is to construct BSTs such that most frequently sought values are near the root, minimizing expected search time.

  • Given a sequence $K$ of $n$ distinct keys in sorted order

  • Key $k_i$ has a probability $p_i$ that it will be sought on a particular search

  • To handle searches for values not in $K$, have $n + 1$ dummy keys, $d_0, d_1, \ldots, d_n$ to serve as leaves. $d_i$ is reached with probability $q_i$

  • If $\mathrm{depth}_{T}(k_i)$ is the distance of $k_i$ from the root in tree $T$, then the expected cost of $T$ is $1 + \sum_{i=1}^{n} p_i \cdot \mathrm{depth}_{T}(k_i) + \sum_{i=0}^{n} q_i \cdot \mathrm{depth}_{T}(d_i)$

  • An optimal BST is one with minimum expected search cost

Step 1: Structure of Optimal Solution

  • Observation: Since $K$ is sorted and dummy keys are interspersed in order, any subtree of a BST must contain contiguous keys from $i$ to $j$ and leaves from $d_{i - 1}$ to $d_j$

  • "Proof" in 03.44

  • Given keys $k_i, \ldots, k_j$, some optimal BST on them is rooted at $k_r$ for some $i \leq r \leq j$

  • Since we don't know the optimal $k_r$, all must be tried

Recursive definition

  • $e[i, j]$ is expected cost from $i$ to $j$

  • if $j = i - 1$, then there is only the dummy key $d_{i - 1}$ and $e[i, i - 1] = q_{i - 1}$

  • if $j \geq i$, then choose root $k_r$ from $k_i$ to $k_j$, optimally solving subproblems.

  • When combining optimal trees from subproblems, every node's depth increases by one.

  • Define $w(i, j) = \sum_{\ell = i}^{j} p_{\ell} + \sum_{\ell = i - 1}^{j} q_{\ell}$ as the sum of the probabilities of the nodes in the subtree built on $k_i, \ldots, k_j$, meaning $e[i, j] = p_r + (e[i, r - 1] + w(i, r - 1)) + (e[r + 1, j] + w(r + 1, j))$

  • Since $w(i, j) = w(i, r - 1) + p_r + w(r + 1, j)$, this simplifies to $e[i, j] = e[i, r - 1] + e[r + 1, j] + w(i, j)$

  • Minimize over all choices of $r$, recording the choices in a root array

Define the function

  • Pseudocode in 03.43

  • is in $\mathcal{O}(n^3)$

Lecture 4: Greedy Algorithms <2019-09-27 Fri>

  • A technique for solving optimization problems

  • Still examine subproblems, still exploits optimal substructure

  • Greedy algorithms instead choose the best choice at the moment, rather than examine all subproblems

    • ability to do this is called the greedy choice property, this allows us to limit what subproblems we look at

  • MST, single-source shortest path

Activity Selection

  • scheduling classes in a classroom

  • Courses are candidates to be scheduled in that room, not all can have it

  • Maximize utilization of the room in terms of number of classes scheduled

  • Generally:

    • Given $S = \{a_1, a_2, \ldots, a_n\}$ of $n$ proposed activities that wish to use a resource that can serve only one activity at a time

    • $a_i$ has a start time $s_i$ and a finish time $f_i$, such that $0 \leq s_i < f_i < \infty$

    • If $a_i$ is scheduled to use the resource, it occupies it during the interval $[s_i, f_i)$; we can schedule both $a_i$ and $a_j$ iff $s_i \geq f_j$ or $s_j \geq f_i$ (if this holds, $a_i$ and $a_j$ are compatible)

    • Find a largest subset $S^{\prime} \subseteq S$ such that all activities in $S^{\prime}$ are pairwise compatible

    • Assume that activities are sorted by finish time.

  • Let $S_{ij}$ be the set of activities that start after $a_i$ finishes and end before $a_j$ starts, and $A_{ij}$ be a maximum-size mutually compatible subset of $S_{ij}$

  • See optimal substructure (04.05)

  • Recursive definition in 04.07, but we can optimize

  • Greedy choice – choose the one with the earliest finish time of all those still compatible with currently scheduled activities

  • Now, consider $S_k = \{a_i \in S \mid s_i \geq f_k\}$ be the activities that start after $a_k$ finishes

    • Then greedily choose $a_1$ first (earliest finish time), then repeatedly make the same choice within each remaining $S_k$

  • Proof on 04.09

  • Argument uses the Optimal Substructure and Greedy Choice properties (an iterative sketch follows this list)
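
  An iterative sketch in Common Lisp, assuming the activities arrive as
  parallel start/finish vectors already sorted by finish time (the
  representation is mine, not the slides'):

      (defun greedy-activity-selector (starts finishes)
        "Return (0-based) indices of a maximum-size set of mutually
      compatible activities, always taking the earliest finisher that
      is compatible with what has been scheduled so far."
        (let ((selected (list 0))              ; a_1 is a safe choice
              (last-finish (aref finishes 0)))
          (loop for i from 1 below (length starts)
                when (>= (aref starts i) last-finish) ; compatible?
                  do (push i selected)
                     (setf last-finish (aref finishes i)))
          (nreverse selected)))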

Lecture 4: Greedy Algorithms, Continued <2019-09-30 Mon>

Rod cutting, redux

  • Subproblem $n$ is rod of length $n$

  • Let $r_n$ be value of optimal solution to subproblem $n$

  • $r_n = p_i + r_{n - i}$

  • Assume that $r_{n - i}$ is not optimal; the contradiction follows from there

Activity Selection, Continued

  • Optimal solutions are done with dynamic programming in $\mathcal{O}(n^3)$

  • But greedy solution is better

The Greedy Choice

  • The greedy choice maximizes the amount of the resource left over

  • State Greedy Choice Property

  • Argue property

  • argue optimal substructure

  • Greedy Choice Property means that there exists a greedy choice that leads to an optimal solution

    • Use a proof by construction

  • See 04.09

Lecture 4: Greedy Algorithms, Continued <2019-10-02 Wed>

Proof of greedy choice

  • Slide 04.09

  • Let $A_k$ be an optimal solution to $S_k$, and let $a_j$ have earliest finish time of all in $A_k$

  • Exchange $a_j$ for the greedy choice; the activities in the resulting $A^{\prime}$ are mutually compatible

  • Therefore, there exists an optimal solution taking the greedy choice

  • $A_k$ is such that all elements in it are compatible with $a_k$

  • Thus, $A_1 = A_0 \cap S_1$ is optimal for $S_1$, by contradiction:

    • Assume for a reductio that $A_1^{\prime}$ exists such that $|A_1^{\prime}| > |A_1|$ and all activities in $A_1^{\prime}$ are mutually compatible

    • Let $A_0^{\prime} = (A_0 \setminus A_1) \cup A_1^{\prime}$

    • $|A_0^{\prime}| > |A_0|$, contradiction!

  • The Greedy Choice property here is that some optimal solution $A_0$ contains $a_1$

Greedy Vs Dynamic Programming

  • Greedy algorithms leverage optimal substructure

  • When we can argue that a greedy choice is part of an optimal solution, this implies that we don't need to explore all subproblems

  • Consider the knapsack problem

  • 0-1: each item must be taken in its entirety or not at all

  • Fractional: items may be taken in part

  • Fractional: compute each item's value density $v_i / w_i$ and take items in descending order of density (a sketch follows this list)

    • knapsack can always be filled in the end

  • 0-1 is solved with dynamic programming in $\mathcal{O}(nW)$, which is pseudopolynomial (polynomial in the value $W$, not its bit length); the decision version is NP-complete
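
  A sketch of the fractional greedy rule in Common Lisp, assuming
  positive weights and list inputs (the representation is my own):

      (defun fractional-knapsack (values weights capacity)
        "Take items in descending order of value density v_i/w_i,
      splitting the final item if needed; returns the total value."
        (let ((items (sort (map 'list #'list values weights) #'>
                           :key (lambda (item)
                                  (/ (first item) (second item)))))
              (total 0))
          (dolist (item items total)
            (destructuring-bind (v w) item
              (when (<= capacity 0) (return total))
              (let ((take (min w capacity)))    ; take all or a fraction
                (incf total (* v (/ take w)))
                (decf capacity take))))))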

Lecture 4: Greedy Algorithms, Continued <2019-10-04 Fri>

Huffman Coding

  • encoding a file of symbols from some alphabet

  • Want to minimize size, based on frequency

  • Fixed-length uses $\lceil \log_2{n} \rceil$ bits per symbol

  • Variable-length uses fewer bits for more frequent symbols

  • Prefix(-free) codes – no codeword is a prefix of another – can be represented as a binary tree

  • Corresponds to a full binary tree

  • Pseudocode on 04.18

  • The main loop runs $\mathcal{O}(n)$ times with $\mathcal{O}(\log{n})$ heap operations; therefore, total, $\mathcal{O}(n \log{n})$ (a list-based sketch follows this list)
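
  A sketch in Common Lisp that uses a sorted list in place of the
  min-priority queue from 04.18 (so this version is $\mathcal{O}(n^2)$
  rather than $\mathcal{O}(n \log{n})$); the (freq . symbol) node
  representation is my own assumption:

      (defun huffman-tree (freqs)
        "FREQS is a list of (symbol . frequency) pairs.  Returns the
      code tree: leaves are (freq . symbol), internal nodes are
      (freq left right)."
        (let ((queue (sort (mapcar (lambda (p) (cons (cdr p) (car p)))
                                   freqs)
                           #'< :key #'car)))
          (loop while (rest queue)
                do (let* ((x (pop queue))  ; two lowest-frequency nodes
                          (y (pop queue))
                          (z (list (+ (car x) (car y)) x y)))
                     (setf queue (merge 'list (list z) queue
                                        #'< :key #'car))))
          (first queue)))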

Greedy Choice

  • 04.20–22

  • Lemma: Let $C$ be an alphabet in which symbol $c \in C$ has frequency $c.freq$ and let $x, y \in C$ have the lowest frequencies. Then there exists an optimal prefix code for $C$ in which the codewords for $x$ and $y$ have the same length and differ only in the last bit. I.e., an optimal solution exists that merges the lowest frequencies first.

  • Proof:

    • Let $T$ be a tree representing an arbitrary optimal prefix code, and let $a$ and $b$ be siblings of maximum depth in $T$

    • Assume, without loss of generality, that $x.freq \leq y.freq$ and $a.freq \leq b.freq$

    • Since $x$ and $y$ are the two least frequent nodes, we get $x.freq \leq a.freq$ and $y.freq \leq b.freq$

    • Convert $T$ to $T^{\prime}$ by exchanging $a$ and $x$, then $T^{\prime}$ to $T^{\prime\prime}$ by exchanging $b$ and $y$

    • Then, in $T^{\prime\prime}$, $x$ and $y$ are siblings of maximum depth

    • Cost difference between $T$ and $T^{\prime}$ is $B(T) - B(T^{\prime})$

      • $= \sum_{c \in C} c.freq \cdot d_T(c) - \sum_{c \in C} c.freq \cdot d_{T^{\prime}}(c)$

      • $= x.freq \cdot d_T(x) + a.freq \cdot d_T(a) - x.freq \cdot d_{T^{\prime}}(x) - a.freq \cdot d_{T^{\prime}}(a)$

      • $= (a.freq - x.freq)(d_T(a) - d_T(x)) \geq 0$

    • Since $a.freq \geq x.freq$ and $d_T(a) \geq d_T(x)$

    • Similarly, $B(T^{\prime}) - B(T^{\prime\prime}) \geq 0$ , so $B(T^{\prime\prime}) \leq B(T)$, so $T^{\prime\prime}$ is optimal. QED

Optimal Substructure

  • 04.23–25

Lemma
  • Let $C$ be an alphabet in which symbol $c \in C$ has frequency $c.freq$ and let $x, y \in C$ have lowest frequencies

  • Let $C^{\prime} = C \setminus \{x, y\} \cup \{z\}$ and $z.freq = x.freq + y.freq$

  • Let $T^{\prime}$ be any tree representing an optimal prefix code for $C^{\prime}$

  • Then $T$, which is $T^{\prime}$ with leaf $z$ replaced by an internal node with children $x$ and $y$, represents an optimal prefix code for $C$

Proof
  • See 04.24–25

  • Look back at it.

Lecture 5: Elementary Graph Algorithms <2019-10-07 Mon>

  • Graphs are ADTs applicable to numerous problems

    • can capture entities and the relationships between them

    • degree of the relationship

    • and more

  • Basics in graph theory, representation, algorithms for basic graph-theory problems

Breadth-First Search

  • Given a graph $G = (V, E)$, and a source node $s \in V$, BFS visits every vertex that is reachable from $s$

  • Uses a queue to search in a breadth-first manner

  • Creates a structure called a BFS tree such that for each vertex $v$, the tree path from $s$ to $v$ is a shortest path in $G$

  • Initializes each node's color to WHITE

  • As nodes are visited, color to GREY when in the queue, then BLACK when finished.

Lecture 5: Elementary Graph Algorithms, Continued <2019-10-09 Wed>

Breadth-First Search, continued

  • BFS tree is shortest paths tree

  • Algorithm presented in 05.04

  • Array $d$ is distance from start node

  • Array $\pi$ is parent node.

  • $\mathcal{O}(|V|^2)$ for adjacency matrix

  • With an adjacency list, it is $\mathcal{O}(|E| + |V|)$

  • $d[v]$ is shortest distance from $s$ to $v$ for unweighted shortest paths

  • Can print shortest path by recursively following $\pi[v]$, $\pi[\pi[v]]$, etc.

  • If $d[v] = \infty$, then $v$ is not reachable from $s$ (a sketch of BFS follows this list)
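
  A sketch of BFS in Common Lisp over an adjacency-list vector; NIL in
  $d$ plays the role of WHITE/$\infty$, and the list-append queue is a
  simplification of a proper FIFO (representation is my own):

      (defun bfs (adj s)
        "ADJ is a vector of adjacency lists over vertices 0..n-1.
      Returns (VALUES D PARENT): distance from S (NIL if unreachable)
      and parent in the BFS tree."
        (let* ((n (length adj))
               (d (make-array n :initial-element nil))
               (parent (make-array n :initial-element nil))
               (queue (list s)))
          (setf (aref d s) 0)
          (loop while queue
                do (let ((u (pop queue)))
                     (dolist (v (aref adj u))
                       (unless (aref d v)           ; still WHITE
                         (setf (aref d v) (1+ (aref d u))
                               (aref parent v) u)
                         (setf queue (nconc queue (list v)))))))
          (values d parent)))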

Depth First Search

  • Graph traversal, follows path as deep as possible before backtracking

  • DFS is implemented recursively

  • Tracks discovery time and finish time of each node

  • All nodes coloured black at end

  • Produces a "forest"

  • Presented on 05.09 and 05.10

  • Time complexity equivalent to BFS, $\Theta(|V| + |E|)$

  • Vertex $u$ is a proper descendant of vertex $v$ in the DF forest iff $d[v] < d[u] < f[u] < f[v]$

    • Parenthesis structure: if one prints "(u" when discovering $u$ and "u)" when finishing $u$, then the printed text will be a well-formed parenthesized expression

  • Types of tree edges

    • Tree edges are in depth-first forest

    • Back edges $(u, v)$ connect a vertex $u$ to its ancestor $v$ in the DF tree (includes self-loops)

    • Forward edges are non-tree edges connecting a node to one of its DF tree descendants

    • Cross edges go between vertices with no ancestor relationship, within a DF tree or between DF trees

  • A graph has a cycle iff DFS discovers a back edge (deadlock detection)

  • When dfs first explores an edge, look at $v$'s color

    • white implies tree edge

    • grey implies back edge

    • black implies forward or cross edge

Lecture 5: Elementary Graph Algorithms, Continued <2019-10-11 Fri>

Topological Sorting

  • Begins with a directed acyclic graph, edge implies event must occur before another

  • Topological sort of a dag $G$ is a linear ordering of its vertices such that if $G$ contains an edge $(u, v)$, then $u$ appears before $v$ in the ordering

  • Call DFS on dag $G$

  • As each vertex is finished, insert into the front of a linked list, return the linked list of vertices

  • Thus, topological sort is a descending sort of vertices based on DFS finish times

  • What is time complexity – $\Theta(|V| + |E|)$

  • See 05.15 (a sketch follows below)
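
  A DFS-based sketch in Common Lisp; the :white/:grey/:black markers
  follow the coloring above, and the adjacency-list-vector input is an
  assumption of mine:

      (defun topological-sort (adj)
        "Topologically sort the DAG given by the vector ADJ of
      adjacency lists.  Vertices are pushed onto the result as they
      finish, so the list is in descending order of finish time."
        (let* ((n (length adj))
               (color (make-array n :initial-element :white))
               (order '()))
          (labels ((visit (u)
                     (setf (aref color u) :grey)
                     (dolist (v (aref adj u))
                       (when (eq (aref color v) :white)
                         (visit v)))
                     (setf (aref color u) :black)
                     (push u order)))
            (loop for u from 0 below n
                  when (eq (aref color u) :white)
                    do (visit u)))
          order))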

Strongly Connected Components

  • Given $G = \langle V, E \rangle$, a strongly connected component of $G$ is a maximal set of vertices $C \subseteq V$ such that for every pair of vertices $u, v \in C$, $u$ is reachable from $v$ and $v$ is reachable from $u$

  • Collapsing the edges within an SCC yields an acyclic component graph

  • Algorithm requires the transpose of $G$, $G^T$, which is $G$ with the edges reversed. $G^T$ and $G$ have the same SCCs

  • Call DFS on $G$

  • Call DFS on $G^T$, looping through vertices in order of decreasing finishing times from the first DFS call

  • Each DFS tree in second DFS run is an SCC in $G$

  • Is $\Theta(|V| + |E|)$: $G^T$ can be computed in $\Theta(|V| + |E|)$ with adjacency lists, and the first run of DFS generates the ordering as in topological sorting

  • The first component node of $G^T$ has no outgoing edges, since in $G$ it has only incoming edges, the second has edges into the first, etc. (Lemma 22.14 in book)

  • And, DFS on $G^T$ visits components in topological order of $G$'s component graph

Lecture 6: Minimum-Weight Spanning Trees <2019-10-14 Mon>

Intro

  • Given a connected, undirected graph $G = (V, E)$, a spanning tree is an acyclic subset $T \subseteq E$, that connects all vertices in $V$

    • $T$ is acyclic implies tree

    • $T$ connects all vertices, spans $G$

  • If $G$ is weighted, then $T$'s weight is $w(T) = \sum_{(u, v) \in T} w(u, v)$

  • An MST is a spanning tree of minimum weight, not necessarily unique.

Kruskal's Algorithm

  • Greedy algorithm, makes locally best choice at each step

  • Starts by declaring each vertex to be its own tree (so all nodes together make a forest)

  • Iteratively identify the minimum-weight edge $(u, v)$ that connects two distinct trees and add it to the MST $T$, merging the two trees

Lecture 6: Minimum-Weight Spanning Trees, Continued <2019-10-16 Wed>

Kruskal's Algorithm

  • Relies on Disjoint Set data structure

  • Algorithm presented on 06.05

  • Algorithm is $\mathcal{O}(|E| \log{|E|})$ (06.12)

Disjoint Set Data Structure

  • Given universe $U$ of elements, DSDS maintains collection $\mathcal{S} = \{S_1, \ldots, S_k\}$

    • Each element in $U$ is in exactly one set $S_j$

    • No set $S_j$ is empty

  • Dynamically updated throughout program

  • Each set $S \in \mathcal{S}$ has a representative element $x \in S$

  • Chapter 21

  • $\alpha$ is the inverse Ackermann function

  • Each operation runs in $\mathcal{O}(\alpha(n))$ amortized time – very nearly constant (a sketch follows below)
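
  A sketch of a disjoint-set forest with union by rank and path
  compression (the (parents . ranks) representation is my own):

      (defun ds-make (n)
        "Create a disjoint-set forest over elements 0..n-1, each
      initially its own singleton set."
        (let ((parents (make-array n)))
          (dotimes (i n) (setf (aref parents i) i))
          (cons parents (make-array n :initial-element 0))))

      (defun ds-find (ds x)
        "Return the representative of X's set, compressing the path."
        (let ((p (car ds)))
          (if (= (aref p x) x)
              x
              (setf (aref p x) (ds-find ds (aref p x))))))

      (defun ds-union (ds x y)
        "Merge the sets containing X and Y, using union by rank."
        (let ((p (car ds))
              (rank (cdr ds))
              (rx (ds-find ds x))
              (ry (ds-find ds y)))
          (unless (= rx ry)
            (when (< (aref rank rx) (aref rank ry))
              (rotatef rx ry))
            (setf (aref p ry) rx)
            (when (= (aref rank rx) (aref rank ry))
              (incf (aref rank rx))))))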

Prim's Algorithm

  • Still greedy

  • Maintains a single tree, not a forest

  • Starts with an arbitrary root $r$

  • Repeatedly finds minimum weight edge incident to the tree

  • algorithm on 06.14

  • Uses min queue

  • Proof in 06.18

    • Nodes already in the tree are those in $V \setminus Q$

    • For all vertices $v \in Q$, if $\pi[v] \neq \mathrm{NIL}$, then $key[v] < \infty$

  • Proof is same, more or less

  • Prove invariant via cut that respects $A$ (no edges span the cut)

  • Greedy choice: look for a safe edge $e$

Minimum Heap

  • Binary tree where key at each node is $\leq$ keys of children

  • Any subtree also a heap

  • Insert and extract-min take $\Theta(\log{n})$

  • Build is $\mathcal{O}(n)$

  • Filter new min in $\mathcal{O}(\log{n})$

Lecture 6: Minimum Spanning Trees, Continued <2019-10-18 Fri>

Proof of correctness of both algorithms

  • Both use greedy approach

  • Maintain invariant that at any time, set of edges $A$ selected so far is subset of some MST (optimal substructure)

  • Each iteration of the algorithms look for a safe edge $e$ such that $A \cup \{e\}$ is also a subset of an MST (greedy choice)

  • Prove invariant via use of cut $(S, V - S)$ that respects $A$ (no edges span cut)

  • Cut is partition of graph into two disjoint subsets

  • Theorem Let $A \subseteq E$ be included in some MST of $G$, $(S, V - S)$ be a cut respecting $A$ and $(u, v) \in E$ be a minimum-weight edge crossing the cut. Then $(u, v)$ is a safe edge for $A$. (06.20)

  • Proof.

    • Let $T$ be an MST including $A$ and not including $(u, v)$

    • Let $p$ be the path from $u$ to $v$ in $T$, and let $(x, y)$ be an edge of $p$ crossing the cut (it is not in $A$, since the cut respects $A$).

    • Since $T$ is a spanning tree, so is $T^{\prime} = (T \setminus \{(x, y)\}) \cup \{(u, v)\}$.

    • Both $(u, v)$ and $(x, y)$ cross cut, so $w(u, v) \leq w(x, y)$

    • So $w(T^{\prime}) = w(T) - w(x, y) + w(u, v) \leq w(T)$

    • Thus, $T^{\prime}$ is an MST

    • And $(u, v)$ is a safe edge for $A$ since $A \cup \{(u, v)\} \subseteq T^{\prime}$

Lecture 7: Single-Source Shortest Path <2019-10-18 Fri>

  • Given a weighted, directed graph $G = (V, E)$ with weight function $w: E \to \mathbb{R}$

  • The weight of a path $p = \langle v_0, v_1, \ldots, v_k \rangle$ is the sum of the weights of its edges: $w(p) = \sum_{i = 1}^{k} w(v_{i - 1}, v_i)$

  • The shortest-path weight from $u$ to $v$, $\delta(u, v)$, is the minimum weight over all paths between the two, or $\infty$ if no path exists

  • A shortest path from $u$ to $v$ is any path $p$ such that $w(p) = \delta(u, v)$

  • Network routing, driving directions

Types

Single-source

finding shortest paths from source $s$ to every other node.

Single-destination

find shortest paths from every node to destination $t$ (solved by running single-source on the transpose)

Single-pair

find shortest path from $u$ to $v$ (solvable with SSSP; nothing asymptotically faster is known)

All-pairs

find shortest paths between every pair of nodes (can be solved with repeated applications of single-source, but can be faster/simpler)

Optimal substructure

  • Shortest paths problem has optimal substructure property. Proof on 07.04.

Negative-Weight Edges and Cycles

  • What happens if $G$ has edges with negative weights?

  • Dijkstra's algorithm cannot handle this!! Bellman-Ford can, under the right circumstances, namely no negative-weight cycles

  • Cycles include

    • Negative-weight cycles

    • Zero-weight cycles

    • Positive-weight cycles

Lecture 7: Single-Source Shortest Paths, Continued <2019-10-23 Wed>

Relaxation

  • $d[v]$, for $v \in V$, is an upper bound on $\delta(s, v)$

  • Relaxation of an edge $(u, v)$ is the process of testing whether we can decrease $d[v]$ and yielding a tighter upper bound

  • Initialize-Single-Source sets each $d[v]$ to $\infty$ and each parent to non-existent

  • Relax is such that if the current $d[v]$ is greater than $d[u] + w(u, v)$, update $d[v]$ to that, and set $v$'s parent to $u$

Lecture 7: Single-Source Shortest Paths, Continued <2019-10-25 Fri>

Relaxation

  • Provides core to shortest path

  • Basic method works simply, varied based on implementation

Bellman-Ford

  • Works with negative-weight edges and detects negative-weight cycles

  • Makes $|V| - 1$ passes over all edges, relaxing each edge during each pass

  • No cycles implies all shortest paths have $\leq |V| - 1$ edges, so that the number of relaxations is sufficient

  • Can be modified to tag negative-weight cycles

  • Algorithm shown on 07.13 (a sketch appears at the end of this section)

  • Pass is through every single edge, $|V| - 1$ times

  • Lines 7–10 detect negative-weight cycles

  • Time Complexity:

    • Initialize Single Source takes $\mathcal{O}(|V|)$

    • Relax itself is $\mathcal{O}(1)$

    • Negative Weight Cycle is $\mathcal{O}(|E|)$

    • Dual loop is $\mathcal{O}(|V| |E|)$

    • Thus, total time complexity is $\mathcal{O}(|V| |E|)$

  • Correctness

    • Assume no negative weight cycles

    • Since no cycles appear in SPs, every SP has at most $|V| - 1$ edges

    • Then define sets $S_0, \ldots, S_{|V| - 1}$

      • $S_k = \{ v \in V : \exists\, p = (s \leadsto v) \text{ s.t. } \delta(s, v) = w(p) \text{ and } |p| \leq k \}$

      • Invariant: After the $i$th iteration of outer relaxation loop, for all $v \in S_i$, we have $d[v] = \delta(s, v)$

        • This is the path-relaxation property (Lemma 24.15)

        • Proven via induction on $i$:

          • obvious for $i = 0$

          • If holds for $v \in S_{i - 1}$, then relaxation and optimal substructure implies holding for $v \in S_i$

      • Thus, after $|V| - 1$ iterations, $d[v] = \delta(s, v)$ for all $v \in V = S_{|V| - 1}$

    • Detection of Negative-weight cycles (07.18)

      • Let $c = \langle v_0, v_1, \ldots, v_k = v_0 \rangle$ be a neg-weight cycle reachable from $s$: $\sum_{i = 1}^k w(v_{i - 1}, v_i) < 0$

      • If the algorithm incorrectly returns true, then for all edges of the cycle, $i = 1, \ldots, k$, $d[v_i] \leq d[v_{i - 1}] + w(v_{i - 1}, v_i)$

      • Summing around the cycle gives $\sum_{i = 1}^{k} d[v_i] \leq \sum_{i = 1}^{k} d[v_{i - 1}] + \sum_{i = 1}^{k} w(v_{i - 1}, v_i)$

      • Since $v_0 = v_k$, the two sums of $d$ values are equal; cancelling them gives $0 \leq \sum_{i = 1}^{k} w(v_{i - 1}, v_i)$, contradicting that the cycle has negative weight (07.18)
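
  A sketch of Bellman-Ford in Common Lisp, assuming an edge-list
  representation of (u v w) triples; MOST-POSITIVE-FIXNUM stands in
  for $\infty$:

      (defun bellman-ford (edges n s)
        "Vertices are 0..n-1, S is the source.  Returns (VALUES D
      PARENT OK) where OK is NIL iff a negative-weight cycle is
      reachable from S."
        (let ((d (make-array n :initial-element most-positive-fixnum))
              (parent (make-array n :initial-element nil)))
          (setf (aref d s) 0)
          (loop repeat (1- n)                 ; |V| - 1 passes
                do (dolist (e edges)          ; relax every edge
                     (destructuring-bind (u v w) e
                       (when (and (< (aref d u) most-positive-fixnum)
                                  (> (aref d v) (+ (aref d u) w)))
                         (setf (aref d v) (+ (aref d u) w)
                               (aref parent v) u)))))
          ;; Any edge that still relaxes witnesses a negative cycle.
          (values d parent
                  (loop for (u v w) in edges
                        never (and (< (aref d u) most-positive-fixnum)
                                   (> (aref d v) (+ (aref d u) w)))))))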

Lecture 7: Single-Source Shortest Paths, Continued <2019-10-28 Mon>

  • Billboard problem and avocado problem are very similar

Shortest paths in a DAG

  • topo sort vertices, initialize single source, relax from left to right

  • Correctness follows from path-relaxation of Bellman-Ford, but relaxing in topo order implies edges relaxed in order

  • Total time complexity is $\mathcal{O}(|V| + |E|)$

Dijkstra's Algorithm

  • Greedy

  • Faster than Bellman-Ford

  • Requires all edge weights to be non-negative

  • Maintains set $S$ of vertices whose final shortest path weights from $s$ have been determined

  • Repeatedly selects $u \in V \setminus S$ with minimum SP estimate, adds $u$ to $S$ and relaxes all edges leaving $u$

  • Uses min-priority queue to repeatedly make greedy choice

  • Pseudocode on 07.25

  • Use an array to implement the min queue (a sketch follows this list)
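
  A sketch of the array-based variant in Common Lisp (linear-scan
  extract-min, giving the $\mathcal{O}(|V|^2)$ running time analyzed
  later); the (v . w) adjacency representation is my own assumption:

      (defun dijkstra (adj s)
        "ADJ is a vector of lists of (v . w) pairs with non-negative
      weights; vertices are 0..n-1.  Returns (VALUES D PARENT)."
        (let* ((n (length adj))
               (d (make-array n :initial-element most-positive-fixnum))
               (parent (make-array n :initial-element nil))
               (done (make-array n :initial-element nil)))
          (setf (aref d s) 0)
          (loop repeat n
                do (let ((u nil))
                     (dotimes (v n)       ; extract-min by linear scan
                       (when (and (not (aref done v))
                                  (or (null u)
                                      (< (aref d v) (aref d u))))
                         (setq u v)))
                     (setf (aref done u) t)
                     (dolist (edge (aref adj u)) ; relax edges from u
                       (destructuring-bind (v . w) edge
                         (when (> (aref d v) (+ (aref d u) w))
                           (setf (aref d v) (+ (aref d u) w)
                                 (aref parent v) u))))))
          (values d parent)))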

Correctness

  • Invariant: At the start of each iteration of the while loop, $d[v] = \delta(s, v)$ for all $v \in S$

  • Proof: Let $u$ be first node added to $S$ where $d[u] \neq \delta(s, u)$

  • Let $p = s \to x \to y \to u$ be SP to $u$ and $y$ first node on $p$ in $V \setminus S$

  • Since $y$'s predecessor $x \in S$, $d[y] = \delta(s, y)$ due to relaxation of $(x, y)$

Lecture 7: Single-Source Shortest Paths, Continued <2019-10-30 Wed>

Dijkstra's and Time Complexity

  • Initialize single source – $\Theta(|V|)$

  • Create Q – array, $\mathcal{O}(|V|)$; heap, $\mathcal{O}(|V|)$

  • Extract-Min – $\mathcal{O}(|V|)$ calls

    • Time Complexity – $\mathcal{O}(|V|)$ if array; $\mathcal{O}(\log{|V|})$ if heap

  • Relax – $\mathcal{O}(|E|)$

    • Time Complexity – $\mathcal{O}(1)$ ; $\mathcal{O}(\log{|V|})$

  • Total Time Complexity

    • $\mathcal{O}(|V|^2)$ if array

    • $\mathcal{O}(|E| \log{|V|})$ if heap

  • If $|E|$ is $o(|V|^2 / \log{|V|})$, prefer the heap

Correctness, continued

  • Since $y$ precedes $u$ on $p$ and edge weights are non-negative, $d[y] = \delta(s, y) \leq \delta(s, u) \leq d[u]$

  • Since $u$ was chosen before $y$ in line 3, $d[u] \leq d[y]$, so $d[y] = \delta(s, y) = \delta(s, u) = d[u]$, a contradiction

  • Since all vertices eventually end up in $S$, we see that the algorithm is correct. "QED"

Lecture 8: All Pairs Shortest Paths <2019-11-04 Mon>

  • Similar to SSSP, but for all pairs

  • Given weighted directed graph, find $\delta(u, v)$ for all $(u, v) \in V \times V$

  • Run SSSP $|V|$ times

    • If no negative weights, use Dijkstra's, gives $\mathcal{O}(|V|^3)$

    • If negative weights, use Bellman-Ford, for $\mathcal{O}(|V|^4)$

  • Can do better, and handle neg-weight edges

    • Matrix-multiplication algorithm, $\Theta(|V|^3 \log |V|)$

    • Floyd-Warshall, $\Theta(|V|^3)$

Adjacency Matrix Representation

  • Assume vertices are numbered, 1 to $n$

  • Input is $n \times n$ matrix: \[ w_{ij} = \begin{cases} 0 & i = j \\ w(i, j) & (i, j) \in E \\ \infty & (i, j) \not\in E \end{cases} \]

  • Assume, for now, that neg weight cycles are absent

  • Distance matrices $L$ and $D$ are produced by algorithms, predecessor matrix $\Pi$ where $\pi_{ij}$ is the predecessor of $j$ on a shortest path from $i$ to $j$ or NIL if $i = j$ or no path exists (well-defined due to optimal substructure)

Printing from $\Pi$

  • defined recursively, printing after recursive call

  • See 08.04

Matrix Multiplication

  • Maintain a series of matrices $L^{(m)} = \left( \ell_{ij}^{(m)} \right)$, where $\ell_{ij}^{(m)}$ is the minimum weight of any path from $i$ to $j$ that uses at most $m$ edges

    • When $m = 0$, $\ell_{ij}^{(0)} = 0$ if $i = j$, $\infty$ otherwise

  • Exploits optimal substructure to get a recursive definition of $\ell$

  • Uses $\min$, see 08.06

Lecture 8: All Pairs Shortest Paths, continued <2019-11-08 Fri>

Matrix Multiplication, continued

  • Allows better paths to come up

  • Recursive solution due to the optimal substructure property (08.06): \[ \ell_{ij}^{(m)} = \min_{1 \leq k \leq n} \left(\ell_{ik}^{(m - 1)} + w_{kj}\right) \]

  • Assuming no negative-weight cycles, $\delta(i, j) = \ell_{ij}^{(n - 1)}$ (because a shortest path uses at most $n - 1$ edges)

  • Bottom-up computation of the $L^{(m)}$: iteratively compute $L^{(1)} = W, L^{(2)}, \ldots, L^{(n - 1)}$

  • Defined on 08.08–09

  • Faster version on 08.13

Lecture 8: All Pairs Shortest Paths, continued <2019-11-11 Mon>

Floyd-Warshall

  • Shaves off the log factor

  • Start assuming no neg weight cycles, can detect neg cycles as before

  • Considers decomposition by intermediate vertex,

    • If a simple path $p = \langle v_1, v_2, v_3, \ldots, v_{\ell - 1}, v_{\ell} \rangle$, then the set of intermediate vertices is $\{v_2, v_3, \ldots, v_{\ell - 1}\}$

  • Let $V = \{1, \ldots, n\}$, and fix $i, j \in V$

  • For some $1 \leq k \leq n$, consider set of vertices $V_k = \{1, \ldots, k\}$

  • Consider all paths from $i$ to $j$ whose intermediate vertices come from $V_k$ and let $p$ be a minimum-weight path from them

  • Is $k \in p$?

    • If not, then all intermediate vertices of $p$ are in $V_{k - 1}$ and an SP from $i$ to $j$ based on $V_{k - 1}$ is also an SP from $i$ to $j$ based on $V_k$

    • If so, then we can decompose $p$ into $i \overset{p_1}{\leadsto} k \overset{p_2}{\leadsto} j$

  • Note: $V_0 = \emptyset$ means no intermediate vertices, i.e., only the direct edges between pairs of vertices

  • Now $D^{(k)}_{ij}$ is the weight of an SP from $i$ to $j$ using only $V_k$ as intermediates

    • $D^{(0)} = W$

    • $D^{(n)}$ is the shortest paths

  • $k \notin p \implies D^{(k)}_{ij} = D^{(k - 1)}_{ij}$

  • $k \in p \implies D^{(k)}_{ij} = D^{(k - 1)}_{ik} + D^{(k - 1)}_{kj}$

  • Shortest path from $i$ to $j$ based on $V_k$ is either going to be the same as that based on $V_{k - 1}$ or will be going to go through $k$

  • $D^{(k)} = \left(d_{ij}^{(k)}\right)$, where $d_{ij}^{(k)}$ is the weight of a shortest path using only $V_k$ as intermediates (08.17); computing all of them is in $\Theta(n^3)$

  • Algorithm on (08.18)

  • Transitive Closure: whether paths exist between all pairs of vertices; for $G = (V, E)$, the transitive closure is $G^{*} = (V, E^{*})$

    • $E^{*} = \{(i, j) : \exists i \overset{p}{\leadsto} j \in G\}$

    • Can be solved with Floyd-Warshall but with bitwise ops instead of min/plus

  • Memory can potentially be saved; the computation can be done in place (a sketch follows)
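
  A sketch of the triple loop in Common Lisp; W is assumed to be an
  $n \times n$ array with 0 on the diagonal and a large value for
  non-edges:

      (defun floyd-warshall (w)
        "Return the matrix of shortest-path weights, computed over the
      growing intermediate-vertex sets V_0, ..., V_n."
        (let* ((n (array-dimension w 0))
               (d (make-array (list n n))))
          (dotimes (i n)                 ; copy W so the input survives
            (dotimes (j n)
              (setf (aref d i j) (aref w i j))))
          (dotimes (k n)                 ; allow k as an intermediate
            (dotimes (i n)
              (dotimes (j n)
                (setf (aref d i j)
                      (min (aref d i j)
                           (+ (aref d i k) (aref d k j)))))))
          d))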

Lecture 9: Lower Bounds <2019-11-11 Mon>

Upper Bound

  • An algorithm $A$ has an upper bound of $f(n)$ if there exists no input of size $n$ such that $A$ requires more than $f(n)$ time

  • Of a problem, has an upper bound of $f(n)$ if there exists at least one algorithm that has an upper bound of $f(n)$

  • Generally, tightest is preferred

Lower Bound

  • A problem has a lower bound $f(n)$ if for any algorithm $A$ to solve the problem there exists at least one input of size $n$ that forces $A$ to take at least $f(n)$ time or space

  • The pathological input depends on the specific algorithm $A$; consider Bubble sort, where reverse-sorted input forces $\Omega(n^2)$

  • Since every sorting algorithm has an input of size $n$ forcing $\Omega(n \log{n})$ steps, the sorting problem has a time complexity lower bound of $\Omega(n \log{n})$

  • Adversarial argument, simulate arbitrary algorithm $A$ to build pathological input

  • Reduce one problem to another to establish lower bounds

Lecture 9: Lower Bounds, continued <2019-11-13 Wed>

Decision trees

  • A full binary tree that represents the comparisons between elements performed by a particular sorting algorithm operating on inputs of a given size

  • Tree represents an algorithm's behavior on all possible inputs of size $n$

  • Each node represents one comparison made by the algorithm

    • Each node is labeled $i:j$, which represents the comparison $A[i] \leq A[j]$

    • If the comparison holds for the particular input, control flows left; otherwise, right

    • Each leaf represents a possible output of the algorithm, a permutation of the input

    • All permutations must be in the tree in order for the algorithm to work properly

Proof of lower bound (using an adversarial approach)

  • Length of path from root to output leaf is number of comparisons made by algorithm

  • Worst-case number of comparisons is the length of the longest path

  • Adversary chooses a deepest leaf to create worst-case input

  • Number of leaves is $n!$

  • A binary tree of height $h$ has at most $2^h$ leaves

  • Thus, $2^h \geq n! \geq \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$ (by Stirling's approximation)

  • Take base-2 logs

  • $h \geq \log \sqrt{2\pi} + (1/2) \log n + n \log n - n \log e = \Omega(n \log n)$

  • Every comparison-based sorting algorithm has some input that forces it to make $\Omega(n \log n)$ comparisons. QED

  • And thus, mergesort and heapsort are asymptotically optimal.

Reductions, using Convex Hull

  • Use sorting lower bound to get lower bound on convex hull

  • Given a set $Q = \{p_1, p_2, \ldots, p_n\}$ of $n$ points, each from $\mathbb{R}^2$, output $CH(Q)$, which is the smallest convex polygon $P$ such that each point from $Q$ is on $P$'s boundary or in its interior

  • Output is an ordered list of points that make up the boundary edges such that from one to the next, the hull will be drawn

  • Reduce problem of sorting to convex hull

  • Given any instance of the sorting problem, $A = \{x_1, \ldots, x_n\}$, we will transform it to an instance of convex hull.

  • Transformation is $Q = \{(x_1, x_1^2), \ldots, (x_n, x_n^2)\}$

  • Feed $Q$ into convex hull; the hull output is then transformed back into the sorted array (a worked example follows this list)
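
  A small worked instance (an illustration): to sort $A = [3, 1, 2]$,
  build $Q = \{(3, 9), (1, 1), (2, 4)\}$. All three points lie on the
  parabola $y = x^2$, so all are hull vertices; walking the hull from
  the leftmost point gives $(1, 1), (2, 4), (3, 9)$, i.e., the sorted
  order $1, 2, 3$, recovered in linear time.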

Lecture 9: Lower Bounds, continued <2019-11-15 Fri>

Reductions, using Convex Hull, continued

  • Known Lower Bound of Sorting, reducing to Convex Hull

  • Since Sorting is in $\Omega(n \log{n})$, by reduction, we will find that Convex Hull is $\Omega(n \log{n})$

  • Takes any instance of problem A (the problem reduced from) to some instance of problem B (the problem reduced to)

  • Because of the mapping, the hull will include all points (as all lie on a parabola)

  • Because all points lie on the parabola, the hull consists of a single chain containing every point

  • Conversion is still linear time

  • Proves a lower bound of convex hull, but not necessarily the lower bound (it is the lower bound of Convex Hull)

  • Can easily be done via contradiction. (09.12)

Lecture 10: NP Completeness <2019-11-15 Fri>

  • So far, focused on problems with efficient algorithms (i.e., those that run in polynomial time)

    • Called efficient even if $c$ is large, since it is likely that another, more efficient algorithm exists

    • Polynomial in terms of size of input, $n$.

  • Some problems have a fastest known algorithm in superpolynomial time (e.g., exponential, or sub-exponential such as $2^{n^{1/3}}$)

  • Some problems cannot be solved in any amount of time

  • Focus on lower bounds, but used to argue that some problems don't have any efficient solution

Lecture 10: NP Completeness, continued <2019-11-18 Mon>

  • Focus on complexity classes P and NP

  • P is deterministic polynomial time, the set of problems that can be solved with a deterministic TM (deterministic algorithm in polynomial time)

  • NP is non-deterministic polynomial time, i.e., set of problems that can be solved by a non-deterministic TM in polynomial time (can explore many possible paths of computation at once)

  • NP is the set of problems whose solutions, if given, can be verified in polynomial time

  • Hamiltonian cycle – does $G = (V, E)$ contain a Hamiltonian cycle (a simple cycle visiting every vertex exactly once)?

    • NP, since given a specific $G$, plus yes/no answer, plus certificate, we can verify a yes in polynomial time using the certificate

    • What would be an appropriate certificate?

    • Not known if $HAM-CYCLE \in P$

  • Euler – does a directed graph $G = (V, E)$ contain an Euler tour – a cycle that visits every edge in $E$ exactly once and can visit vertices multiple times

    • Check if each vertex's in-degree is equivalent to its out-degree

    • Certificate is just the tour, or even the graph itself

  • Any problem in $P$ is also in $NP$, since if we can efficiently solve the problem, we get poly-time verification for free, i.e., $P \subseteq NP$

  • Not known whether $P \subsetneq NP$, i.e., whether there exists a problem in $NP$ that is not in $P$

  • A subset of the problems in $NP$ is the set of NP-complete (NPC) problems

    • Every problem in NPC is at least as hard as all others in NP

    • These problems are believed to be intractable (no efficient algorithm), but not yet proven to be so

    • If any NPC problem is in $P$, then $P = NP$ and life is glorious and scary

Proving NP completeness

  • To prove that a problem is NPC, we can take an indirect approach:

  • To prove $B \in NPC$:

    1. Prove that $B \in NP$ by identifying certificate that can be used to verify a YES answer in polynomial time

      • Use obvious choice

      • Need to argue that verification requires polynomial time

      • The certificate is not the instance itself (unless $B \in P$)

    2. Show that $B$ is as hard as any other NP problem by showing that if we can efficiently solve $B$ then we can efficiently solve all problems in NP

  • Step one is easy

  • The second looks hard, but just use a reduction from a known NPC problem (often 3SAT)

Reductions

  • Takes an instance of $A$ and transforms it to an instance of $B$ such that if $B$ is solvable, then $A$ is solvable

  • The time complexity of solving $A$ this way is the time of the reduction to $B$ plus the time to solve $B$

  • Start with $A$ and $B$

    • Transform any instance $\alpha$ of $A$ to some instance $\beta$ of $B$ such that

      1. Transformation must take polynomial time

      2. The answer for $\alpha$ is yes iff the answer for $\beta$ is yes.

  • If such a reduction exists, then $B$ is at least as hard as $A$, since if an efficient algorithm exists for $B$, any instance of $A$ can be solved efficiently as well

  • $A \leq_{p} B$ (read $A$ is reducible to $B$ modulo polynomial-time conversion)

Only decision problems

  • Only focus on decision problems, simplifies things

  • Only output is yes or no

  • Not asked for shortest path or cycle or similar

  • Don't ask for shortest path from $i$ to $j$, just ask if there exists a path from $i$ to $j$ with weight at most $k$

  • Optimization problem implies decision problem

  • Decision problems are no harder than the original optimization problems, so if the decision version is hard, so is the optimization version

  • Decision problems are especially convenient if you think in terms of TMs and languages

Lecture 10: NP Completeness, continued <2019-11-20 Wed>

Proving NP completeness

  • Same form as the reduction for convex hull, but no need to transform a solution for $B$ to a solution for $A$

  • As with the convex hull, the reduction's time complexity must be strictly less than the lower bound we are proving for $B$'s algorithm.

    • The problem conversion must happen in $\mathcal{O}(n^c)$, where $c \in \mathbb{Z}$

  • Must show that output of $B$ is yes iff $A$ would be yes

  • But if we want to prove that a problem $B$ is NPC, do we have to reduce it to every problem in NP?

    • No! reductions are transitive!

Circuit Sat

  • First NPC problem

  • An instance is a boolean combinational circuit, no feedback, no memory

  • Is there a satisfying assignment, i.e., an assignment of inputs that makes it output 1?

  • To prove that circuit sat is in NP: verify by running the circuit on the certificate

    • Certificate is satisfying assignment

    • Any problem in NP reduces to Circuit Sat

  • Skipping the second part here; it leverages the existence of an algorithm that verifies certificates for any NP problem

Other NPC Problems

  • Circuit Sat

  • SAT

  • 3CNF-Sat

    • Clique

      • Vertex Cover

      • Ham Cycle

      • TSP

    • Subset Sum

Proving the problem

  • 10.18

  • Follow every step

    1. Prove that B is in NP

      1. Describe certificate that can verify yes (often simple and obvious, but not merely the instance)

      2. Describe how the certificate is verified

      3. Argue that verification takes polynomial time

    2. Prove that the problem $B$ is NP-hard

      1. Take any other NP-complete problem $A$ and reduce it to $B$

        • Reduction must transform any instance of $A$ to some instance of $B$

      2. Prove that the reduction takes polynomial time (an algorithm, analyze like any other)

      3. Prove that the reduction is valid

        • Answer is "yes" for $A$ if and only if the answer is "yes" for $B$

        • Must argue both directions

        • Constructive proofs work well, e.g., "Assume the instance of VERTEX-COVER (problem $A$) has a vertex cover of size $\leq k$. We will now construct from that a Hamiltonian cycle in problem $B$." (and vice versa)

Lecture 10: NP Completeness, continued <2019-11-22 Fri>

Satisfiability (SAT) through Circuit Satisfiability

  • Given a boolean formula $\phi$ consisting of $n$ variables, $x_1, \ldots, x_n$, $m$ connectives (of the usual types), and parentheses

  • SAT is $B$

    • Certificate is satisfying assignment

    • Certificate is verified by evaluating $\phi$ on the satisfying assignment

    • Verification clearly takes polynomial time ($\mathcal{O}(n + m)$)

  • SAT is in fact in NP

  • Will show CIRCUIT-SAT $\leq_{P}$ SAT by reducing CIRCUIT-SAT to SAT, thus showing that SAT is NP-hard

  • Map any instance $C$ of circuit sat to some formula $\phi$ in polynomial time such that $\phi$ is satisfiable iff $C$ is satisfiable

    • Define a variable for each wire in $C$

    • Then define a clause of $\phi$ for each gate that defines the function for that gate

    • And include the output wire as a separate clause

    • Given a satisfying assignment for $\phi$, extract the values of the input variables $x_{1}$, $x_{2}$, $x_{3}$ as a satisfying assignment to $C$

    • $\phi$'s size is polynomial in size of $C$

    • If circuit has a satisfying assignment, final output of circuit is $1$ and value on each internal wire matches the output of the gate that feeds it, thus $\phi$ evaluates to 1

    • If $\phi$ has a satisfying assignment, then each of $\phi$'s clauses is satisfied, so each internal wire's value matches the output of the gate feeding it

    • Since a satisfying assignment for $C$ implies a satisfying assignment for $\phi$ and vice-versa, we get that $C$ has a satisfying assignment iff $\phi$ does.

3-CNF Satisfiability

  • Given a boolean formula that is in 3-conjunctive normal form (3-CNF), which is a conjunction of clauses, each a disjunction of 3 literals

  • Assignment is certificate

  • Certificate verified by evaluation of satisfied assignment

  • Reduce SAT $\leq_{P}$ 3CNFSAT

  • Again, need to map any instance of $SAT$ to some instance of 3CNFSAT

    • Parenthesize $\phi$ and build its parse tree, which can be viewed as a circuit

    • Assign a variable to each wire in the circuit, giving $\phi^{\prime}$

    • Build a truth table for each clause $\phi_{i}^{\prime}$, take the DNF of its negation, and convert to CNF via DeMorgan's laws, giving $\phi_{i}^{\prime\prime}$

    • Add auxiliary variables

    • Take the conjunction of all the $\phi_{i}^{\prime\prime}$, padding each clause to exactly three literals (examples after this list):

      • If it has three literals, add it as a clause unchanged

      • If it has two literals, make two copies, add $p$ to one and $\lnot p$ to the other, and add both

      • Otherwise (one literal), add auxiliary variables $p$ and $q$ in all four combinations

    • Can show the biconditional relatively easily
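  For instance (the standard padding, stated here from memory rather than from the slides): a two-literal clause $(l_{1} \lor l_{2})$ becomes \[(l_{1} \lor l_{2} \lor p) \land (l_{1} \lor l_{2} \lor \lnot p)\] and a one-literal clause $(l)$ becomes \[(l \lor p \lor q) \land (l \lor p \lor \lnot q) \land (l \lor \lnot p \lor q) \land (l \lor \lnot p \lor \lnot q)\] Each padded form is satisfiable exactly when the original clause is, since the auxiliary variables appear in every combination.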

Lecture 10: NP Completeness, continued <2019-11-25 Mon>

Clique

  • Given an undirected graph $G = (V, E)$ and a value $k$, does $G$ contain a clique (complete subgraph) of size $k$?

  • Certificate is the set of vertices $V^{\prime}$ of the clique

    • Verify that $V^{\prime}$ forms a clique

    • $|V^{\prime}| \geq k$

  • Reduce 3CNF to Clique

    • $\langle \phi \rangle$ becomes $\langle G, k \rangle$

    • $\phi = C_{1} \land \cdots \land C_{k}$

    • Create a triple of vertices $v^{r}_{1}$, $v^{r}_{2}$, $v^{r}_{3}$ in $V$ for each clause $C_{r}$ (a code sketch of the construction follows this list)

    • Create an edge between any pair of vertices $(v^{r}_{j}, v^{s}_{m})$ where:

      • $r \neq s$ (the vertices come from different triples)

      • The corresponding literals are not negations of each other

    • Obviously done in polynomial time

  • If $\phi$ has a satisfying assignment, at least one literal in each clause is true

  • Picking the vertex of one true literal from each clause gives a clique $V^{\prime}$ of size $k$

  • If $G$ has a clique of size $k$, assign $1$ to the literal corresponding to each clique vertex

  • The clique's vertices lie in distinct triples, so each clause has a literal set to $1$

  • No edge joins a literal and its negation, so both are never set to $1$

  • This yields a satisfying assignment

  • See 10.37.
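  A code sketch of the construction, assuming a hypothetical encoding where a literal is a nonzero integer (negation is its negative) and a clause is a 3-tuple of literals:

    # Sketch of the 3-CNF-SAT -> CLIQUE reduction. One vertex per literal
    # occurrence; edges join occurrences in different clauses whose
    # literals are not negations of each other. Ask for a clique of size
    # k = number of clauses.
    def cnf_to_clique(clauses):
        vertices = [(r, j, lit)  # (clause index, slot, literal)
                    for r, clause in enumerate(clauses)
                    for j, lit in enumerate(clause)]
        edges = {(u, v) for u in vertices for v in vertices
                 if u[0] < v[0]       # different triples (r != s)
                 and u[2] != -v[2]}   # literals are consistent
        return vertices, edges, len(clauses)

    # Tiny example: (x1 v ~x2 v x3) ^ (~x1 v x2 v x3)
    V, E, k = cnf_to_clique([(1, -2, 3), (-1, 2, 3)])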

Vertex Cover

  • A vertex covers all edges incident to it

  • A vertex cover of a graph is a set of vertices that covers all edges in the graph

  • Given an undirected graph $G$ and a value $k$, does $G$ contain a vertex cover of size $k$?

  • Set of vertices is certificate, verify size and all edges are covered.

  • Reduce Clique to Vertex Cover

  • Given $\langle G = (V, E), k \rangle$ of Clique, vertex cover is $\langle \bar{G}, |V| - k \rangle$ where $\bar{G} = (V, \bar{E})$ is $G$'s complement.

  • Clear that conversion is polynomial

  • Proof of the biconditional is straightforward from the construction via the complement (see 10.41 and the note below)
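  Spelled out (a standard equivalence, stated here for completeness): \[V^{\prime} \text{ is a clique of size } k \text{ in } G \iff V \setminus V^{\prime} \text{ is a vertex cover of size } |V| - k \text{ in } \bar{G}\] since any edge $(u, v) \in \bar{E}$ is absent from $E$, so $u$ and $v$ cannot both lie in the clique $V^{\prime}$; hence at least one endpoint is in $V \setminus V^{\prime}$ and covers the edge (and conversely).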

Lecture 10: NP Completeness, continued <2019-12-02 Mon>

  • Review method of reduction

  • Review Vertex Cover reduction

Lecture 10: NP Completeness, continued <2019-12-04 Wed>

Subset Sum

  • Given a finite set $S$ of positive integers and a positive integer target $t$, is there a subset $S^{\prime} \subseteq S$ whose elements sum exactly to $t$?

  • In NP

    • Certificate is $S^{\prime}$, verification in polytime

  • Subset Sum is NP-hard, by reduction from 3-CNF-SAT

    • 3-CNF-SAT $\leq_{P}$ SUBSET-SUM

    • Assume no clause contains both a variable and its negation

    • Assume each variable appears in at least one clause

    • Reduction

      • Let $\phi$ have $k$ clauses $C_{1}, \ldots, C_{k}$ over $n$ variables $x_{1}, \ldots, x_{n}$

      • Reduction creates two numbers in $S$ for each variable $x_{i}$ and two numbers for each clause $C_{j}$

      • Each number has $n + k$ digits, most significant $n$ tied to variables, least significant $k$ tied to clauses

        • $t$ has a one in each digit tied to a variable, and a 4 in each digit tied to a clause.

        • For each $x_{i}$, $S$ contains $v_{i}$ and $v_{i}^{\prime}$, each with a $1$ in the $i$th most significant digit and $0$ in the other variable digits; put a $1$ in $C_{j}$'s digit of $v_{i}$ if $x_{i}$ appears unnegated in $C_{j}$, and a $1$ in $C_{j}$'s digit of $v_{i}^{\prime}$ if it appears negated

        • For each $C_{j}$, $S$ contains integers $s_{j}$ and $s_{j}^{\prime}$, where $s_{j}$ has a $1$ in $C_{j}$'s digit and $s_{j}^{\prime}$ has a $2$ in $C_{j}$'s digit ($0$ elsewhere)

      • The greatest possible sum in any digit is $6$, so there are no carries

      • The variable digits force exactly one of $v_{i}$, $v_{i}^{\prime}$ into $S^{\prime}$ (excluded middle)

      • The slack numbers $s_{j}$, $s_{j}^{\prime}$ let each clause digit reach exactly $4$ iff at least one of its literals is true (worked micro-example below)
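  A worked micro-instance (hypothetical, single clause): for $\phi = C_{1} = (x_{1} \lor \lnot x_{2} \lor x_{3})$, $n = 3$ and $k = 1$, so each number has $4$ digits: \[\begin{array}{r|ccc|c} & x_{1} & x_{2} & x_{3} & C_{1}\\\hline v_{1} & 1 & 0 & 0 & 1\\ v_{1}^{\prime} & 1 & 0 & 0 & 0\\ v_{2} & 0 & 1 & 0 & 0\\ v_{2}^{\prime} & 0 & 1 & 0 & 1\\ v_{3} & 0 & 0 & 1 & 1\\ v_{3}^{\prime} & 0 & 0 & 1 & 0\\ s_{1} & 0 & 0 & 0 & 1\\ s_{1}^{\prime} & 0 & 0 & 0 & 2\\\hline t & 1 & 1 & 1 & 4 \end{array}\] The satisfying assignment $x_{1} = 1, x_{2} = 0, x_{3} = 1$ corresponds to $S^{\prime} = \{v_{1}, v_{2}^{\prime}, v_{3}, s_{1}\}$, which sums to $1114 = t$.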

Lecture 10: NP Completeness, continued <2019-12-06 Fri>

Subset Sum, Continued

  • The target digit $4$ is used because a clause can have $1$, $2$, or $3$ true literals, and the slack numbers can contribute $1$, $2$, or $3$ ($= 1 + 2$), so the digit sums to exactly $4$ iff at least one literal is true

  • See 10.46, 10.47 for if and only if

Lecture 11: Maximum Flow <2019-12-09 Mon>

Introduction

  • Can use a digraph as a flow network to model things (data communication networks, water through pipes, assembly lines, etc.)

  • a flow network is a digraph with two special vertices, a source ($s$) that produces flow, and a sink ($t$) that takes the flow

  • Each directed edge is a conduit with a capacity

  • Vertices are conduit junctions

  • Except at $s$ and $t$, flow is conserved: the flow into a vertex must equal the flow out

  • Max flow problem: given a flow network, determine max amount of flow that can get from $s$ to $t$, also: bipartite matching

Flow networks

  • every edge $(u, v) \in E$ has non-negative capacity, $c(u, v) \geq 0$

  • If $(u, v) \in E$, then $(v,u) \not\in E$

  • If $(u, v) \not\in E$, then $c(u, v) = 0$

  • No self-loops

  • Assume that every vertex in $V$ lies on some path from the source $s \in V$ to the sink vertex $t \in V$

Flows

  • A flow in $G$ is a function $f : V \times V \to \mathbb{R}$ that satisfies

    1. The capacity constraint: for all $u, v \in V$, $0 \leq f(u, v) \leq c(u, v)$, i.e., flow is non-negative and does not exceed capacity

    2. Flow conservation: for all $u \in V \setminus \{s, t\}$: \[\sum_{v \in V }f(v, u) = \sum_{v \in V }f(u, v)\]

  • Value of flow $f$ is the net flow out of $s$, equivalently the net flow into $t$ (checked in the sketch below) \[|f| = \sum_{v \in V} f(s, v) - \sum_{v \in V} f(v, s)\]

  • Maximum Flow Problem: Given the graph and capacities, find a flow of maximum value
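  A direct transcription of the two constraints and the flow value into code (a hypothetical checker; the network is given as a dict of edge capacities):

    # Check that f is a valid flow for capacities c; both are dicts keyed
    # by edge (u, v). Transcribes the two constraints above directly.
    def is_valid_flow(vertices, c, f, s, t):
        # Capacity constraint: 0 <= f(u, v) <= c(u, v) on every edge.
        if any(not 0 <= f[e] <= c[e] for e in c):
            return False
        # Flow conservation at every vertex other than s and t.
        for u in vertices - {s, t}:
            if (sum(f[(v, w)] for (v, w) in c if w == u)
                    != sum(f[(v, w)] for (v, w) in c if v == u)):
                return False
        return True

    # |f| = flow out of s minus flow into s.
    def flow_value(c, f, s):
        return (sum(f[e] for e in c if e[0] == s)
                - sum(f[e] for e in c if e[1] == s))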

Example

  • flow/capacity on edges

Multiple Sources/Sinks

  • Add a supersource $s$ and supersink $t$: an edge from $s$ to every source and from every sink to $t$, each with infinite capacity

Ford-Fulkerson Method

  • method, not algorithm

  • many ways to implement

  • Uses the residual network – a network which is $G$ with capacities updated based on the amount of flow $f$ already going through it

  • augmenting path – a simple path from $s$ to $t$ in the residual network; if such a path exists, more flow can be pushed through

  • cut – a partition of $V$ into $S$ and $T$ where $s \in S$ and $t \in T$; can measure the flow and capacity crossing the cut

  • Repeatedly finds an augmenting path in the residual network, adds flow along the path, and updates the residual network

Residual Networks

  • given capacity $c$ and flow $f$, the residual network consists of edges with capacities that show how one can change the flow in $G$ (sketch below)

  • Residual capacity of an edge is: \[c_{f}(u, v) = \begin{cases}c(u, v) - f(u, v) & (u, v) \in E\\f(v, u) & (v, u) \in E\\0 & \text{otherwise}\end{cases}\]

  • $E_{f} = \{(u, v) \in V \times V : c_{f}(u, v) > 0\}$
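  The case analysis translates directly into code (a sketch; assumes, as required above, that $E$ never contains both $(u, v)$ and $(v, u)$):

    # Residual capacity c_f(u, v) for capacities c and flow f, both dicts
    # keyed by edge. Exactly the three cases of the definition above.
    def residual_capacity(c, f, u, v):
        if (u, v) in c:                  # forward edge: unused capacity
            return c[(u, v)] - f[(u, v)]
        if (v, u) in c:                  # backward edge: cancellable flow
            return f[(v, u)]
        return 0                         # otherwise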

Lecture 11: Maximum Flow, Continued <2019-12-11 Wed>

Flow Augmentation

  • if $f$ is a flow in $G$ and $f^{\prime}$ is a flow in $G_{f}$, the augmentation is: \[(f \uparrow f^{\prime})(u, v) = \begin{cases} f(u, v) + f^{\prime}(u, v) - f^{\prime}(v, u) & (u, v) \in E \\0 & \text{otherwise}\end{cases}\]

  • $f \uparrow f^{\prime}$ is a flow in $G$ with value $|f \uparrow f^{\prime}| = |f| + |f^{\prime}|$

  • Amount of flow that can be put on a path $p$ is $p$'s residual capacity, $c_{f}(p) = \min\{c_{f}(u, v) : (u, v) \in p\}$

  • Then the path flow $f_{p}$ pushes $c_{f}(p)$ units of flow along each edge of $p$ (and $0$ elsewhere)

  • Full Ford-Fulkerson algorithm presented on 11.16

Analysis

  • Assume all of $G$'s capacities are integers

  • If we choose augmenting paths arbitrarily, each augmentation increases $|f|$ by at least $1$, so the number of iterations is at most $|f^{*}|$, the value of a maximum flow

Edmonds-Karp

  • Uses the Ford-Fulkerson method with BFS to find augmenting paths; time complexity is $\mathcal{O}(|V||E|^{2})$ (implementation sketch below)
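  A compact implementation sketch (not the slides' pseudocode): capacities as a nested dict c[u][v], with residual capacities maintained in place:

    from collections import deque

    # Edmonds-Karp: Ford-Fulkerson where each augmenting path is found by
    # BFS, i.e., a shortest path in the residual network.
    def edmonds_karp(c, s, t):
        # Give every edge a reverse entry (initial residual capacity 0).
        for u in list(c):
            for v in list(c[u]):
                c.setdefault(v, {}).setdefault(u, 0)
        max_flow = 0
        while True:
            # BFS for a shortest augmenting path from s to t.
            parent = {s: None}
            queue = deque([s])
            while queue and t not in parent:
                u = queue.popleft()
                for v, cap in c[u].items():
                    if cap > 0 and v not in parent:
                        parent[v] = u
                        queue.append(v)
            if t not in parent:          # no augmenting path: f is maximum
                return max_flow
            # Collect the path's edges; bottleneck is c_f(p), the minimum.
            path, v = [], t
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            bottleneck = min(c[u][v] for u, v in path)
            # Augment: push flow forward, make it cancellable backward.
            for u, v in path:
                c[u][v] -= bottleneck
                c[v][u] += bottleneck
            max_flow += bottleneck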

Maximum Bipartite Matching

  • In an undirected graph, a matching is a subset of edges $M \subseteq E$ such that for all $v \in V$, at most one edge from $M$ is incident on $v$

  • If an edge from $M$ is incident on $v$, $v$ is matched, otherwise, unmatched

  • Find a matching of maximum cardinality

  • Special case: $G$ is bipartite, i.e., $V$ is partitioned into two disjoint sets $L$ and $R$ and all edges cross between the two

  • Matching machines to tasks, arranging marriages between interested parties

  • Can cast bipartite matching as max flow, given bipartite graph $G$, flow network is:

    • $V^{\prime} = V \cup \{s, t\}$

    • $E^{\prime} = \{(u, v) : (u, v) \in E\} \cup \{(s, u) : u \in L\} \cup \{(v, t) : v \in R\}$

    • $c(u, v) = 1$ for all $(u, v) \in E^{\prime}$

  • The value of a maximum flow equals the number of edges in a maximum matching (usage sketch below)
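  A usage sketch reusing the edmonds_karp sketch above (hypothetical helper; assumes vertex names are distinct from the added "s" and "t"):

    # Maximum bipartite matching as max flow with unit capacities.
    def max_bipartite_matching(L, R, edges):
        c = {"s": {u: 1 for u in L}, "t": {}}
        for u in L:
            c[u] = {}
        for u, v in edges:
            c[u][v] = 1                    # original edge, capacity 1
        for v in R:
            c.setdefault(v, {})["t"] = 1   # each right vertex to the sink
        return edmonds_karp(c, "s", "t")   # value = matching cardinality

    # L = {a, b}, R = {x, y}, edges a-x, a-y, b-x: maximum matching is 2.
    assert max_bipartite_matching({"a", "b"}, {"x", "y"},
                                  [("a", "x"), ("a", "y"), ("b", "x")]) == 2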

Lecture 12: Final Exam Review

  • Monday, Dec 16 13:00-15:00, AvH 106

  • Topics: MSTs, Single Source Shortest Paths, All Pairs Shortest Paths, Lower Bounds, NP Completeness, Max Flow

  • Two reference sheets allowed (double-sided, mechanically printed, standard letter)

  • Four Questions, 25 pts each

  • Guaranteed: NP-completeness, Max Flow (Edmonds-Karp)