Design and Analysis of Algorithms
Lecture 0
-
Syllabus review
-
Proof-heavy course
-
Design – methods used to create new algorithms
-
Analysis – Mathematical assessment of an algorithm's correctness and efficiency
-
Correctness – Does it do what it's supposed to do on all possible inputs?
-
Efficiency – Measure the algorithm's running time, counting the number of basic operations
Lecture 1
Asymptotic Notation – Starts on Slide 3
-
Big-$O$
-
Big-$\Omega$
-
Big-$\Theta$
-
Little-$o$
-
Little-$\omega$
-
Not Interchangeable
-
Asymptotic Upper Bound – Big-$O$, $O(g(n)) = \{f(n) : \exists c, n_0 > 0 \mid \forall n \geq n_0, 0 \leq f(n) \leq cg(n) \}$
-
Asymptotic Lower Bound, Big-$\Omega$, $\Omega(g(n)) = \{f(n) : \exists c, n_0 > 0 \mid \forall n \ge n_0, 0 \leq cg(n) \leq f(n) \}$
-
Asymptotic Tight Bound, Big-$\Theta$, $\Theta(g(n)) = \{f(n) : \exists c_1, c_2, n_0 > 0 \mid \forall n \geq n_0, 0 \leq c_1 g(n) \leq f(n) \leq c_2 g(n) \}$
-
Upper Bound, not tight, little-$o$
-
Lower bound, not tight, little-$\omega$
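-
A quick worked check of the Big-$O$ definition (my own example, not from the slides): $3n^2 + 2n \in O(n^2)$ with witnesses $c = 5$ and $n_0 = 1$, since for all $n \geq 1$, $0 \leq 3n^2 + 2n \leq 3n^2 + 2n^2 = 5n^2$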
Lecture 1, Cont.
Upper And Lower Bounds
-
01.9
-
Analyze algorithms and problems in terms of time complexity (count of operations)
-
Sometimes in space complexity (memory usage)
-
upper and lower bounds are used
-
applies to general problems
-
01.10 – upper is most common form
-
Algorithm $A$ has an upper bound of $f(n)$ for input of size $n$ if there exists no input of size $n$ such that $A$ requires more than $f(n)$ time
-
Consider sorting: Quicksort and Bubblesort are $\mathcal{O}(n^2)$, and Mergesort is $\mathcal{O}(n \log{n})$
-
Quicksort is used more often in practice; when randomized, it tends to run closer to $n \log{n}$ time
-
-
Lower Bounds are like best-case results, typically not as interesting
-
01.11 – Upper bound of a problem
-
A problem has an upper bound of $f(n)$ if there exists at least one algorithm that has an upper bound of $f(n)$
-
there exists an algorithm with time/space complexity of at most $f(n)$ on all inputs of size $n$
-
-
Mergesort has worst-case $\mathcal{O}(n \log{n})$, so sorting has an upper bound of $\mathcal{O}(n \log{n})$
-
01.12 – Lower bounds of a problem
-
A problem has a lower bound of $f(n)$ if, for any algorithm $A$, to solve it, there exists at least one input of size $n$ that forces $A$ to take at least $f(n)$ time/space.
-
Called the pathological input, depends on $A$ itself.
-
There is an input of size $n$ (reverse-sorted) that forces Bubblesort to take $\Omega(n^2)$
-
Since every sorting algorithm has an input of size $n$ forcing $\Omega(n \log{n})$ steps, the sorting problem has a time complexity lower bound of $\Omega(n \log{n})$. Therefore, Mergesort is asymptotically optimal.
-
-
01.13 Lower Bounds, 2
-
To argue a lower bound, use an adversarial argument: the adversary simulates an arbitrary algorithm $A$ to build a pathological input
-
Needs to be in general form, since the pathological input depends on the specific algorithm $A$
-
Can reduce one problem to another to establish lower bounds
-
Show that if we can compute convex hull in $o(n \log{n})$ time, then we can sort in time $o(n \log{n})$; cannot be true, so convex hull takes $\Omega(n \log{n})$
-
-
Efficiency
-
01.14 – Efficiency
-
An algorithm is time- or space-efficient if its worst-case time/space complexity is $\mathcal{O}(n^c)$ for constant $c$ and input size $n$, i.e., polynomial in the input size
-
Input size is measured in the number of bits needed to represent the input
-
A graph of $n$ nodes takes $\mathcal{O}(n \log{n})$ bits to represent its nodes and $\mathcal{O}(n^2 \log{n})$ bits to represent its edges; an algorithm in $\mathcal{O}(n^c)$ is efficient
-
A problem that includes as an input a number $k$ (e.g., a threshold) only needs $\mathcal{O}(\log{k})$ bits to represent it, so an efficient algorithm must run in $\mathcal{O}(\log^c{k})$ time
-
If polynomial in $k$ (rather than $\log{k}$), the algorithm is called pseudopolynomial
-
-
Recurrence Relations
-
01.15 – Recurrence Relations
-
Non-recursive relations are relatively easy to analyse.
-
Recurrence relations are used for time complexity of recursive algorithms, bound relation asymptotically
-
For example, Mergesort
-
$T(n) = 2 T(n/2) + \mathcal{O}(n)$
-
Bound on $T(n)$ needed
-
Use the Master theorem (See 235/156 notes)
-
Other approaches include 01.17–18
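-
Worked example (applying the Master theorem from the 235/156 notes): for Mergesort, $T(n) = 2T(n/2) + \mathcal{O}(n)$ gives $a = 2$, $b = 2$, $f(n) = \Theta(n)$; since $n^{\log_b a} = n^{\log_2 2} = n = \Theta(f(n))$, this is the middle case, so $T(n) = \Theta(n \log{n})$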
-
-
Graphs
-
01.19–24 – Graphs
-
Set of vertices $V$, set of unordered pairs between vertices $E$ (edges)
-
Directed graph has ordered pairs
-
A weighted graph is a directed graph augmented with weights on its edges
-
Adjacency list: each vertex has a linked list of its edges, $\mathcal{O}(n + m)$ space
-
Adjacency matrix, $n \times n$ – $\mathcal{O}(n^2)$
-
Lecture 1, Cont.
Graphs
-
01.24 – Adjacency matrices
-
$n \times n$ matrix $M$, $M(i,j) = w$ for weight, $\infty$ for non-edges, takes $\Theta(n^2)$ space
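-
A minimal adjacency-list sketch in Lisp (my own representation, using a hash table of neighbor lists rather than literal linked-list nodes; later sketches below assume this shape):
(defun make-graph ()
  ;; vertex -> list of neighbors
  (make-hash-table :test #'equal))

(defun add-edge (graph u v &optional directed)
  (push v (gethash u graph))
  (unless directed
    (push u (gethash v graph))))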
Algorithmic Techniques, Dynamic Programming
-
01.25
-
Technique for solving optimization problems, where a best solution is needed, evaluated by an objective function
-
Decomposes a problem into subproblems, optimally solves recursively, combines into final/optimal solution
-
Exponential number of subproblems, many overlap, re-use solutions
-
Number of distinct subproblems is polynomial
-
Works if they have optimal substructure property – optimal solution is made up of optimal solutions to subproblems
-
All-pairs shortest paths, knapsack
-
Can prove correctness if optimal substructure property exists
-
Must be careful: the optimal substructure property must be proven rigorously
Algorithmic Techniques, Greedy Algorithms
-
01.26
-
Must consider optimal substructure property
-
At each step, commits to the locally optimal choice
-
Tend to be very quick, since they do not examine all possible subproblems
-
MST, Dijkstra's
Divide And Conquer
-
01.27
-
Not limited to optimization, solve each sub-problem recursively, then combine
-
Think Mergesort
Proof By Contradiction
-
01.28
-
Assume opposite of premise to be proved, arrive at a contradiction of some other assumption
-
To prove there is no greatest even integer:
-
Assume for a reductio that there exists a greatest even integer $N$: for all even integers $n$, we have $N \geq n$. (1)
-
But $M = N + 2$ is an even integer, since it's the sum of two even integers, and $M > N$.
-
Therefore, (1) is false, so there is no greatest even integer. $\blacksquare$
-
Proof by Induction
-
01.29
-
Base case, and inductive step, involves non-negative integers
-
Useful for proving invariants in algorithms, properties that always hold at every step (especially at the final step)
-
See PHIL 411 notes
Proof by Construction
-
01.30
-
Proof of existence by directly constructing it.
-
Will be used extensively for NP-completeness
Proof by Contraposition
-
01.31
-
$P \to Q \equiv \lnot Q \to \lnot P$
-
Example: to prove that if $x^2$ is even, then $x$ is even, contraposition is the simplest approach
-
Useful for proving biconditionals as well
-
Useful for NP-completeness
Lecture 2: Medians and Order Statistics
-
Given an array $A$ of $n$ distinct numbers, $i$th order statistic of $A$ is its $i$th smallest element
-
$i = 1$ – minimum
-
$i = n$ – max
-
$i = \lfloor (n+1)/2 \rfloor$ – (lower) median
-
Given array $A$ of $n$ elements and a number $i \in \{1, \dots, n\}$ find the $i$th order statistic of $A$
-
Can we do better?
-
Minimum: linear time; simply scan through the array, keeping track of the smallest so far
-
small is smallest from $A_1$ to $A_i$ – the loop invariant
-
Loop invariant can be shown easily via induction
-
Correctness by observing $i = n$ at the end, but before return
-
$\Theta(n)$ – Cannot do better
-
Must find a lower bound
-
Must make at least $n - 1$ comparisons
-
no element can be eliminated until shown it's greater than at least one other element
-
All still eligible to be smallest are in a bucket, removed after shown to be $>$ another.
-
Each comparison removes at most one element, from the bucket, therefore at least $n - 1$ comparisons are needed to remove all but one from the bucket
-
Follows by pigeonhole principle
-
-
-
Minimum and Maximum (02.06)
-
Given $A$ with $n$ elementts, find both min and max
-
Obvious algorithm (find-min, then find-max) takes roughly $2n$ comparisons; the improvement below is by a constant factor, not asymptotic
-
Do better by processing elements in pairs, checking each pair against both min and max
-
Look at 02.07
-
Lecture 2: Medians and Order Statistics, Continued
The simultaneous Minimum and Maximum (starts 02.06)
-
find-min and find-max are optimal, as they are at $n - 1$ comparisons for $n$ elements.
-
Algorithm presented on 02.07 – be familiar with the algorithm (retyped below)
;; Simultaneous min and max using about 3n/2 comparisons: compare the
;; two elements of each pair first, then the smaller against the
;; running min and the larger against the running max.
(defun min-and-max (list-of-nums)
  (let* ((vec (coerce list-of-nums 'vector))
         (len (length vec))
         (start (if (evenp len) 2 1))
         (mn (if (evenp len) (min (aref vec 0) (aref vec 1)) (aref vec 0)))
         (mx (if (evenp len) (max (aref vec 0) (aref vec 1)) (aref vec 0))))
    (do ((i start (+ i 2)))
        ((>= i len) (values mx mn))
      (let ((a (aref vec i))
            (b (aref vec (1+ i))))
        (if (< a b)
            (setf mn (min mn a) mx (max mx b))
            (setf mn (min mn b) mx (max mx a)))))))
-
This algorithm makes roughly $\frac{3}{2}n$ comparisons, i.e., it is in $\mathcal{O}(\frac{3}{2}n) = \mathcal{O}(n)$
Selection of $i$th smallest value (02.10)
-
Select $i$th smallest value. Obvious solution (sort, return $i$th element), complexity is $\Theta(n \log n)$
-
Can do better, running in linear time
-
Uses divide-and-conquer, similar to binary search
-
Definition on 02.12
-
Uses partition like in quicksort (deterministic)
-
$q$ from algorithm is in its final sorted position
-
Select calls Partition, which chooses a pivot element $A_q$ and reorders $A$ such that all elements $< A_q$ are to the left of $A_q$ and all elements $> A_q$ are to the right
-
If $A_q$ is the sought element, return it; if its rank is greater than $i$, search left; if less, search right.
-
Partition chooses a pivot element $A_x$, exchanges $A_x$ and $A_r$, then loops, exchanging elements appropriately
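-
A Lisp sketch of the idea (mine, not the slides' exact pseudocode from 02.12; 0-based vector indices, and the pivot here is simply the last element rather than the median of medians):
(defun partition-around-last (vec p r)
  ;; Lomuto-style partition of vec[p..r] around pivot vec[r];
  ;; returns the pivot's final index q.
  (let ((pivot (aref vec r))
        (i (1- p)))
    (loop for j from p below r
          when (<= (aref vec j) pivot)
            do (incf i)
               (rotatef (aref vec i) (aref vec j)))
    (rotatef (aref vec (1+ i)) (aref vec r))
    (1+ i)))

(defun select-ith (vec p r i)
  ;; i-th smallest (1-based i) of vec[p..r], quickselect style.
  (if (= p r)
      (aref vec p)
      (let* ((q (partition-around-last vec p r))
             (k (1+ (- q p))))        ; rank of the pivot within vec[p..r]
        (cond ((= i k) (aref vec q))
              ((< i k) (select-ith vec p (1- q) i))
              (t (select-ith vec (1+ q) r (- i k)))))))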
Lecture 2: Medians and Order Statistics, Continued
Selection, Continued
-
Recall algorithm, from 02.12
-
Requires partition, defined later on
-
Partition chooses the pivot element
-
Partition helps to preserve the invariant
-
Choice of pivot element is crucial
Choosing a Pivot element
-
Choice is critical to low time complexity
-
The best choice of pivot element for partitioning would be the true median
-
Use the median – but finding the median is itself a selection problem, so this is circular
-
Median of Medians
-
$A$ of $n$ elements: partition $A$ into $m = \lfloor n/5 \rfloor$ groups of 5 elements, and at most one other group of $n \bmod 5$ elements
-
$A^{\prime} = [ x_1, x_2, \ldots, x_{\lceil n / 5 \rceil}]$, where $x_i$ is the median of group $i$, found by sorting group $i$
-
$y = \mathrm{Select}(A^{\prime}, 1, \lceil n/5 \rceil, \lfloor(\lceil n/5 \rceil + 1)/2 \rfloor)$ is the pivot element; scan $A[p \cdots r]$ and return $y$'s index – this is ChoosePivotElement
-
As a team, do 02.19
Time Complexity
-
See 02.20
-
Recurrence form (02.22):
-
$T(n)$ total time for size $n$
-
Pivot selection takes $\mathcal{O}(n)$ to split into groups and $T(n/5)$ to find the median of medians; partitioning $n$ elements takes $\mathcal{O}(n)$ time
-
The recursive Select call takes $T(3n/4)$, since the pivot is guaranteed to eliminate at least $1/4$ of the elements
-
∴, $T(n) \leq T(n/5) + T(3n/4) + \mathcal{O}(n)$
-
Or, rewritten as $T(\alpha n) + T(\beta n) + O(n)$ where $\alpha = 1/5$ and $\beta = 3/4$
-
By theorem from Lecture 1, when $\alpha + \beta < 1$, $T(n) \in \mathcal{O}(n)$
-
-
And thus, Select has complexity $\mathcal{O}(n)$
-
Proof shown on 02.23
-
How can we lower-bound the number of eliminated elements to $1/4$ of the elements – see 02.21
Lecture 2: Medians and Order Statistics, Continued
Practice, Median of Medians, 02.19
-
$A = [4, 9, 12, 17, 6, 5, 21, 14, 8, 11, 13, 29, 3]$
-
select(A, 1, 13, 4)
-
Partition(A, 1, 13)
-
ChoosePivotElement(A, 1, 13)
-
$[4, 9, 12, 17, 6], [5, 21, 14, 8, 11], [13, 29, 3]$
-
$[9, 11, 13]$
-
$11$ is the median of medians; its index in $A$ is $x = 10$
-
-
Proceed to swap
-
$A = [4, 9, 6, 3, 8, 5, 11, 12, 17, 21, 14, 13, 29]$ after partitioning; the pivot $11$ lands at index $q = 7$, so $k = 7$, and since $i = 4 < k$, recurse left
-
-
-
Make sure to use lower median!
Lecture 3: Dynamic Programming
-
solving optimization problems
-
Decompose problem into subproblems, solving them recursively and combine the solutions into a final, optimal solution
-
exponential number of subproblems to solve, but with overlap, means solutions can be reused
-
Number of distinct subproblems is polynomial
The Rod cutting problem
-
Rod of length $n$, cut into integer length smaller rods to maximize profit.
-
Rod of length $i$ has price $p_i$
-
Cuts are free; profit is based solely on prices charged for rods
-
Cuts only occur at integral boundaries, so there are $n - 1$ positions to cut, total number of possible solutions is $2^{n - 1}$
-
Cuts $i_1, \ldots, i_k$ ($i_1 + \cdots + i_k = n$), and revenue $r_n = p_{i_1} + \cdots + p_{i_k}$ is maximized
-
Has the optimal substructure property
Lecture 3: Dynamic Programming, Continued
Rod cutting, an implementation
-
Recall the optimal substructure property, an optimal solution is made up of optimal solutions to subproblems
-
Easy to prove via contradiction
-
$r_n = \mathrm{max}(p_n, r_1 + r_{n - 1}, \ldots, r_{n - 1} + r_1)$
-
Helpful to simply store previously calculated values
Lecture 3: Dynamic Programming, Continued
Rod Cutting, continued
-
Requires considering all possibilities (with one initial cut) for a given $r_n$, which gives two subproblems.
-
Proof of correctness/proof of optimal substructure property by contradiction
-
Assume for a reductio that the solution is not optimal for the subproblem…
-
-
Some dynamic programming problems can be considered slightly differently.
-
$r_n = \max_{1 \leq i \leq n} (p_i + r_{n-i})$ is an alternative form (03.05)
-
Rod cutting proof available in book, §15.1 (p. 362)
-
Cut-rod on (03.06)
-
Missing memoization
Time Complexity
-
$T(0) = 1$, See 03.07
-
Non-memoized complexity is exponential: $T(n) = 2^n$
-
Top-down with memoization
-
Bottom-up, fill in table
-
Have the same running time
-
Memoized (recursive) implementation at 03.10
-
Bottom up at 03.11 – easier to analyse
Reconstructing a solution
-
add a second array $s$ to keep track of the optimal size of the first piece cut in each subproblem
-
Printing from this table is reasonably simple
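-
A bottom-up Lisp sketch of the idea (mine, following the shape of 03.11 with the solution-tracking array $s$ described above; prices is a vector where (aref prices i) is $p_i$, index 0 unused):
(defun cut-rod (prices n)
  ;; r[j] = best revenue for a rod of length j;
  ;; s[j] = optimal size of the first piece cut from length j.
  (let ((r (make-array (1+ n) :initial-element 0))
        (s (make-array (1+ n) :initial-element 0)))
    (loop for j from 1 to n do
      (let ((best most-negative-fixnum))
        (loop for i from 1 to j do
          (let ((candidate (+ (aref prices i) (aref r (- j i)))))
            (when (> candidate best)
              (setf best candidate
                    (aref s j) i))))
        (setf (aref r j) best)))
    (values r s)))

(defun print-cut-rod-solution (prices n)
  ;; Reconstruct the cuts by repeatedly taking the stored first piece.
  (multiple-value-bind (r s) (cut-rod prices n)
    (declare (ignore r))
    (loop while (> n 0)
          do (format t "~a " (aref s n))
             (decf n))))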
Lecture 3: Dynamic Programming, Continued
Matrix Chain Multiplication
-
long chain of $n$ matrices we want to multiply
-
$p \times q$ by $q \times r$ requires $pqr$ steps
-
Brute force solution is $\Omega(4^n / n^{3/2})$
-
Steps:
-
Characterize structure of an optimal solution
-
Recursively define the value of an optimal solution
-
Compute the value of an optimal solution
-
Construct an optimal solution from computed information (03.18)
-
-
Step 1
-
$A_{i \dots j}$ is the matrix from $A_i A_{i + 1} \cdots A_j$
-
Must split the product: compute $A_{i \dots k}$ and $A_{k + 1 \dots j}$, then multiply the two together
-
$k$ is the split point of an optimal parenthesization
-
-
Assume the cost of the subsolution $s$ from $i$ to $k$ is not optimal; then there exists some other solution $s^{\prime}$ such that $cost(s^{\prime}) < cost(s)$
-
Then replace $s$ with $s^{\prime}$, yielding a cheaper overall solution – a contradiction
-
-
Step 2:
-
$m[i, j]$ is the minimum number of multiplications needed to compute $A_{i \dots j}$
-
$m[i, j]$ cases:
-
$i = j$: $m[i, j] = 0$
-
$i < j$: split at $k$, $m[i, j] = m[i, k] + m[k + 1, j] + p_{i - 1}p_{k}p_j$
-
$k$ must be tried for all possible values, store in $s[i, j]$ to find $k$ for a given $i, j$ pair. (03.20)
-
-
-
Step 3:
-
There are $\Theta(n^2)$ subproblems (table entries)
-
Bottom-up computation proceeds by increasing chain length
-
Linear to solve each problem, a quadratic number of subproblems, ∴, $\Theta(n^3)$
-
Implementation on 03.22
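-
A Lisp sketch of the table computation (mine, not the exact pseudocode on 03.22; dims is the vector $p_0, \ldots, p_n$, so $A_i$ is $p_{i-1} \times p_i$, and the $m$/$s$ tables are 1-based):
(defun matrix-chain-order (dims)
  (let* ((n (1- (length dims)))
         (m (make-array (list (1+ n) (1+ n)) :initial-element 0))
         (s (make-array (list (1+ n) (1+ n)) :initial-element 0)))
    (loop for len from 2 to n do             ; chain length
      (loop for i from 1 to (- n len -1) do
        (let ((j (+ i len -1)))
          (setf (aref m i j) most-positive-fixnum)
          (loop for k from i below j do      ; try every split point
            (let ((q (+ (aref m i k) (aref m (1+ k) j)
                        (* (aref dims (1- i)) (aref dims k) (aref dims j)))))
              (when (< q (aref m i j))
                (setf (aref m i j) q
                      (aref s i j) k)))))))
    (values m s)))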
-
-
Lecture 3: Dynamic Programming, Continued
Matrix Chain Multiplication, Continued
-
Optimal substructure is defined with regard to the characterization of subproblems
Optimal Substructure
-
Consider shortest path?
-
Does shortest path have optimal substructure?
-
Yes!
-
But not for Longest Simple Path: LSP is in fact NP-complete, because its subproblems are not independent
Longest Common Subsequence
-
Sequence $Z$ is a subsequence of another sequence $X$ if there is a strictly increasing sequence $i_1, \ldots, i_k$ of indices of $X$ such that for all $j = 1, \ldots, k$, $x_{i_j} = z_j$
-
As one reads through $Z$, one can find a match to each symbol of $Z$ in $X$ in order.
-
$Z$ is a common subsequence of $X$ and $Y$ if it is a subsequence of both
-
Goal of LCS is to find a maximum-length common subsequence of sequences $X$ and $Y$
Characterizing Structure of LCS Optimal Solution
-
Given $X$, the $i$th prefix of $X$ is $X_i$
-
If $X$ and $Y$ have LCS $Z$, then
-
$x_m = y_n \implies z_k = x_m = y_n$ and $Z_{k - 1}$ is an LCS of $X_{m - 1}$ and $Y_{n - 1}$
-
if $z_k \neq x_m$, can lengthen $Z$. Contradiction
-
If $Z_{k - 1}$ is not LCS of $X_{m - 1}$ and $Y_{n - 1}$, then a longer CS of $X_{m - 1}$ and $Y_{n - 1}$ could have $x_m$ appended to it to get CS of $X$ and $Y$ that is longer than $Z$. Contradiction
-
-
See 03.31!
-
-
Last symbol must be equal!
Recursive definition of Optimal Solution
-
$c[i, j]$ defined on 03.32
-
Find the LCS of the prefixes if $X$ and $Y$'s last characters match
-
If not, find LCS of $X$ and $Y_{n - 1}$, find LCS of $X_{m - 1}$ and $Y$ and use the longer.
-
Remember to use mathematical style
Define the Algorithm
-
03.33
-
$b$ is used to keep track of the direction
-
A diagonal entry in $b$ marks a matched symbol – that is where a symbol of the LCS is printed
-
But the LCS is printed backwards; use recursion to print in the correct order
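-
A Lisp sketch (mine, computing the $c$ table of 03.32; I omit the direction array $b$, since the print step can re-derive the moves by comparing table entries):
(defun lcs-length (x y)
  ;; c[i][j] = length of an LCS of the prefixes X_i and Y_j.
  (let* ((m (length x))
         (n (length y))
         (c (make-array (list (1+ m) (1+ n)) :initial-element 0)))
    (loop for i from 1 to m do
      (loop for j from 1 to n do
        (setf (aref c i j)
              (cond ((char= (char x (1- i)) (char y (1- j)))
                     (1+ (aref c (1- i) (1- j))))
                    (t (max (aref c (1- i) j) (aref c i (1- j))))))))
    c))

(defun print-lcs (c x y i j)
  ;; Recursion prints in correct order despite the backwards walk.
  (cond ((or (zerop i) (zerop j)) nil)
        ((char= (char x (1- i)) (char y (1- j)))
         (print-lcs c x y (1- i) (1- j))
         (write-char (char x (1- i))))
        ((>= (aref c (1- i) j) (aref c i (1- j)))
         (print-lcs c x y (1- i) j))
        (t (print-lcs c x y i (1- j)))))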
Lecture 3: Dynamic Programming, Continued
-
exactly 4
-
5 and 4 hinge on finding the correct subproblem
Lecture 3: Dynamic Programming, Continued
Optimal Binary Search Trees
-
Goal is to construct BSTs such that most frequently sought values are near the root, minimizing expected search time.
-
Given a sequence $K$ of $n$ distinct keys in sorted order
-
Key $k_i$ has a probability $p_i$ that it will be sought on a particular search
-
To handle searches for values not in $K$, have $n + 1$ dummy keys, $d_0, d_1, \ldots, d_n$ to serve as leaves. $d_i$ is reached with probability $q_i$
-
If $\mathrm{depth}_{T}(k_i)$ is the distance from the root to $k_i$ in tree $T$, then the expected cost of $T$ is $1 + \sum_{i=1}^{n} p_i \, \mathrm{depth}_{T}(k_i) + \sum_{i=0}^{n} q_i \, \mathrm{depth}_{T}(d_i)$
-
An optimal BST is one with minimum expected search cost
Step 1: Structure of Optimal Solution
-
Observation: Since $K$ is sorted and dummy keys are interspersed in order, any subtree of a BST must contain contiguous keys from $i$ to $j$ and leaves from $d_{i - 1}$ to $d_j$
-
"Proof" in 03.44
-
Given keys $k_i, \ldots, k_j$, an optimal BST over them is rooted at some $k_r$ for $i \leq r \leq j$
-
Since we don't know the optimal $k_r$, all must be tried
Recursive definition
-
$e[i, j]$ is the expected search cost of an optimal BST on keys $k_i$ through $k_j$
-
if $j = i - 1$, then there is only the dummy key $d_{i - 1}$ and $e[i, i - 1] = q_{i - 1}$
-
if $j \geq i$, then choose root $k_r$ from $k_i$ to $k_j$, optimally solving subproblems.
-
When combining optimal trees from subproblems, the depth of every node in a subtree increases by one.
-
Define $w(i, j) = \sum_{\ell = i}^{j} p_{\ell} + \sum_{\ell = i - 1}^{j} q_{\ell}$ as the sum of the probabilities of the nodes in the subtree built on $k_i, \ldots, k_j$, meaning $e[i, j] = p_r + (e[i, r - 1] + w(i, r - 1)) + (e[r + 1, j] + w(r + 1, j))$
-
Using $w(i, j) = w(i, r - 1) + p_r + w(r + 1, j)$, this becomes $e[i, j] = e[i, r - 1] + e[r + 1, j] + w(i, j)$
-
Minimize over $r$, recording choices in a root array
Define the function
-
Pseudocode in 03.43
-
is in $\mathcal{O}(n^3)$
Lecture 4: Greedy Algorithms
-
A technique for solving optimization problems
-
Still examines subproblems, still exploits optimal substructure
-
Greedy algorithms instead choose the best choice at the moment, rather than examine all subproblems
-
ability to do this is called the greedy choice property, this allows us to limit what subproblems we look at
-
-
MST, single-source shortest path
Activity Selection
-
scheduling classes in a classroom
-
Courses are candidates to be scheduled in that room, not all can have it
-
Maximize utilization of the room in terms of number of classes scheduled
-
Generally:
-
Given $S = \{a_1, a_2, \ldots, a_n\}$ of $n$ proposed activities that wish to use a resource that can serve only one activity at a time
-
$a_i$ has a start time $s_i$ and a finish time $f_i$, such that $0 \leq s_i < f_i < \infty$
-
If $a_i$ is scheduled to use the resource, it occupies it during the interval $[s_i, f_i)$; we can schedule both $a_i$ and $a_j$ iff $s_i \geq f_j$ or $s_j \geq f_i$ (if this holds, $a_i$ and $a_j$ are compatible)
-
Find a largest subset $S^{\prime} \subseteq S$ such that all activities in $S^{\prime}$ are pairwise compatible
-
Assume that activities are sorted by finish time.
-
-
Let $S_{ij}$ be the set of activities that start after $a_i$ finishes and end before $a_j$ starts, and $A_{ij}$ be a maximum-size subset of mutually compatible activities in $S_{ij}$
-
See optimal substructure (04.05)
-
Recursive definition in 04.07, but we can optimize
-
Greedy choice – choose the one with the earliest finish time of all still compatible with currently scheduled activities
-
Now, consider $S_k = \{a_i \in S \mid s_i \geq f_k\}$ be the activities that start after $a_k$ finishes
-
Then greedily choose $a_1$ first (earliest finish time), and the earliest-finishing compatible activity in each $S_k$ thereafter
-
-
Proof on 04.09
-
Argument of Optimal Substructure and Greedy Choice properties
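-
An iterative Lisp sketch of the greedy choice (mine; it assumes activities are already sorted by finish time, per the assumption above, and represents each activity as a (start . finish) cons):
(defun select-activities (activities)
  ;; Keep each activity whose start time is at or after the finish
  ;; time of the last activity chosen.
  (let ((chosen '())
        (last-finish 0))
    (dolist (a activities (nreverse chosen))
      (when (>= (car a) last-finish)
        (push a chosen)
        (setf last-finish (cdr a))))))
For example, (select-activities '((1 . 4) (3 . 5) (5 . 7) (6 . 8))) returns ((1 . 4) (5 . 7)).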
Lecture 4: Greedy Algorithms, Continued
Rod cutting, redux
-
Subproblem $n$ is rod of length $n$
-
Let $r_n$ be value of optimal solution to subproblem $n$
-
$r_n = p_i + r_{n - i}$
-
Assume that $r_{n - i}$ is not optimal. The contradiction follows from there
Activity Selection, Continued
-
An optimal solution can be computed with dynamic programming in $\mathcal{O}(n^3)$
-
But greedy solution is better
The Greedy Choice
-
The greedy choice maximizes the amount of the resource left over
-
State Greedy Choice Property
-
Argue property
-
argue optimal substructure
-
Greedy Choice Property means that there exists a greedy choice that leads to an optimal solution
-
Use a proof by construction
-
-
See 04.09
Lecture 4: Greedy Algorithms, Continued
Proof of greedy choice
-
Slide 04.09
-
Let $A_k$ be an optimal solution to $S_k$, and let $a_j$ have earliest finish time of all in $A_k$
-
Swap the greedy choice in for $a_j$ to form $A^{\prime}$; activities in $A^{\prime}$ are still mutually compatible
-
Therefore, there exists an optimal solution taking the greedy choice
-
$A_k$ is such that all elements in it are compatible with $a_k$
-
Thus, $A_0 \cap S_1$ is optimal for $S_1$, by contradiction:
-
Assume for a reductio that $A_1^{\prime}$ exists such that $|A_1^{\prime}| > |A_1|$ and all activities in $A_1^{\prime}$ are mutually compatible
-
$A_0^{\prime} = (A_0 \setminus A_1) \cup A_1^{\prime}$
-
$|A_0^{\prime}| > |A_0|$, contradiction!
-
-
The Greedy Choice property here is that some optimal solution $A_0$ contains $a_1$
Greedy Vs Dynamic Programming
-
Greedy algorithms leverage optimal substructure
-
When we can argue that a greedy choice is part of an optimal solution, implying that we don't need to explore all subproblems
-
Consider the knapsack problem
-
0-1: each item is taken in its entirety or not at all
-
Fractional: items may be taken in part
-
Fractional computes value density $v_i / w_i$ and chooses items in descending order of density
-
The knapsack can always be filled exactly, taking a fraction of the last item
-
-
0-1 is solved with dynamic programming in $\mathcal{O}(nW)$, which is pseudopolynomial; the problem is NP-complete
Lecture 4: Greedy Algorithms, Continued
Huffman Coding
-
encoding a file of symbols from some alphabet
-
Want to minimize size, based on frequency
-
Fixed-length uses $\lceil \log_2{n} \rceil$ bits per symbol
-
Variable-length uses fewer bits for more frequent symbols
-
Prefix codes (prefix-free: no codeword is a prefix of another) can be represented as binary trees
-
An optimal code corresponds to a full binary tree
-
Pseudocode on 04.18
-
Tree computation is a linear number of iterations with logarithmic heap operations; therefore, total $\mathcal{O}(n \log{n})$
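-
A Lisp sketch of the construction (mine, not 04.18's exact pseudocode; for brevity it keeps a frequency-sorted list instead of a min-heap, so this version is $\mathcal{O}(n^2)$ rather than the $\mathcal{O}(n \log{n})$ quoted above):
(defun huffman (freqs)
  ;; freqs: list of (symbol . frequency) pairs.
  ;; Nodes are (freq symbol left right); internal nodes have nil symbol.
  (let ((nodes (sort (mapcar (lambda (p) (list (cdr p) (car p) nil nil))
                             freqs)
                     #'< :key #'first)))
    (loop while (cdr nodes) do
      (let* ((a (pop nodes))          ; two lowest-frequency nodes
             (b (pop nodes))
             (merged (list (+ (first a) (first b)) nil a b)))
        (setf nodes (merge 'list nodes (list merged) #'< :key #'first))))
    (first nodes)))                   ; root of the code tree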
Greedy Choice
-
04.20–22
-
Lemma: Let $C$ be an alphabet in which symbol $c \in C$ has frequency $c.freq$ and let $x, y \in C$ have lowest frequencies. Then there exists an optimal prefix code for $C$ in which codewords $x$ and $y$ have the same length and differ only in the last bit. I.e., an optimal solution exists that merges lowest frequencies first.
-
Proof:
-
Let $T$ be a tree representing an arbitrary optimal prefix code, and let $a$ and $b$ be siblings of maximum depth in $T$
-
Assume, without loss of generality, that $x.freq \leq y.freq$ and $a.freq \leq b.freq$
-
Since $x$ and $y$ are the two least frequent nodes, we get $x.freq \leq a.freq$ and $y.freq \leq b.freq$
-
Convert $T$ to $T^{\prime}$ by exchanging $a$ and $x$, then $T^{\prime}$ to $T^{\prime\prime}$ by exchanging $b$ and $y$
-
Then, in $T^{\prime\prime}$, $x$ and $y$ are siblings of maximum depth
-
Cost difference between $T$ and $T^{\prime}$ is $B(T) - B(T^{\prime})$
-
$= \sum_{c \in C} c.freq \cdot d_T(c) - \sum_{c \in C} c.freq \cdot d_{T^{\prime}}(c)$
-
$= x.freq \cdot d_T(x) + a.freq \cdot d_T(a) - x.freq \cdot d_{T^{\prime}}(x) - a.freq \cdot d_{T^{\prime}}(a)$
-
$= (a.freq - x.freq)(d_T(a) - d_T(x)) \geq 0$
-
-
Since $a.freq \geq x.freq$ and $d_T(a) \geq d_T(x)$
-
Similarly, $B(T^{\prime}) - B(T^{\prime\prime}) \geq 0$ , so $B(T^{\prime\prime}) \leq B(T)$, so $T^{\prime\prime}$ is optimal. QED
-
Optimal Substructure
-
04.23–25
Lemma
-
Let $C$ be an alphabet in which symbol $c \in C$ has frequency $c.freq$ and let $x, y \in C$ have lowest frequencies
-
Let $C^{\prime} = (C \setminus \{x, y\}) \cup \{z\}$ and $z.freq = x.freq + y.freq$
-
Let $T^{\prime}$ be any tree representing an optimal prefix code for $C^{\prime}$
-
Then $T$, which is $T^{\prime}$ with leaf $z$ replaced by an internal node with children $x$ and $y$, represents an optimal prefix code for $C$
Proof
-
See 04.24–25
-
Look back at it.
Lecture 5: Elementary Graph Algorithms
-
Graphs are ADTs applicable to numerous problems
-
can capture entities and the relationships between them
-
degree of the relationship
-
and more
-
-
Basics in graph theory, representation, algorithms for basic graph-theory problems
Breadth-First Search
-
Given a graph $G = (V, E)$, and a source node $s \in V$, BFS visits every vertex that is reachable from $s$
-
Uses a queue to search in a breadth-first manner
-
creates a structure called a BFS tree such that for each vertex $v$, the tree path from $s$ to $v$ is a shortest path in $G$
-
Initializes each node's color to WHITE
-
As nodes are visited, color to GREY when in the queue, then BLACK when finished.
Lecture 5: Elementary Graph Algorithms, Continued
Breadth-First Search, continued
-
BFS tree is shortest paths tree
-
Algorithm presented in 05.04
-
Array $d$ is distance from start node
-
Array $\pi$ is parent node.
-
$\mathcal{O}(|V|^2)$ for adjacency matrix
-
With an adjacency list, $\mathcal{O}(|E| + |V|)$
-
$d[v]$ is shortest distance from $s$ to $v$ for unweighted shortest paths
-
Can print shortest path by recursively following $\pi[v]$, $\pi[\pi[v]]$, etc.
-
If $d[v] = \infty$, then $v$ is not reachable from $s$
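-
A Lisp sketch of BFS (mine, in the shape of 05.04, using the hash-table adjacency lists from the Lecture 1 sketch; presence in $d$ stands in for the WHITE/GREY coloring):
(defun bfs (adj s)
  ;; Returns hash tables d (distance from s) and parent (BFS tree).
  (let ((d (make-hash-table :test #'equal))
        (parent (make-hash-table :test #'equal))
        (queue (list s)))
    (setf (gethash s d) 0)
    (loop while queue do
      (let ((u (pop queue)))                   ; dequeue
        (dolist (v (gethash u adj))
          (unless (gethash v d)                ; still undiscovered
            (setf (gethash v d) (1+ (gethash u d))
                  (gethash v parent) u)
            (setf queue (append queue (list v)))))))  ; enqueue at tail
    (values d parent)))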
Depth First Search
-
Graph traversal, follows path as deep as possible before backtracking
-
DFS is implemented recursively
-
Tracks discovery time and finish time of each node
-
All nodes coloured black at end
-
Produces a "forest"
-
Presented on 05.09 and 05.10
-
Time complexity equivalent to BFS, $\Theta(|V| + |E|)$
-
Vertex $u$ is a proper descendant of vertex $v$ in the DF forest iff $d[v] < d[u] < f[u] < f[v]$
-
Parenthesis structure: if one prints "(u" when discovering $u$ and "u)" when finishing $u$, then the printed text will be a well-formed parenthesized expression
-
-
Types of tree edges
-
Tree edges are in depth-first forest
-
Back edges $(u, v)$ connect a vertex $u$ to its ancestor $v$ in the DF tree (includes self-loops)
-
Forward edges are non-tree edges connecting a node to one of its DF tree descendants
-
Cross edges go between vertices that are not ancestrally related, within a DF tree or between DF trees
-
-
A graph has a cycle iff DFS discovers a back edge (deadlock detection)
-
When DFS first explores an edge $(u, v)$, look at $v$'s color
-
white implies tree edge
-
grey implies back edge
-
black implies forward or cross edge
-
Lecture 5: Elementary Graph Algorithms, Continued
Topological Sorting
-
Begins with a directed acyclic graph, edge implies event must occur before another
-
Topological sort of a dag $G$ is a linear ordering of its vertices such that if $G$ contains an edge $(u, v)$, then $u$ appears before $v$ in the ordering
-
Call DFS on dag $G$
-
As each vertex is finished, insert into the front of a linked list, return the linked list of vertices
-
Thus, topological sort is a descending sort of vertices based on DFS finish times
-
Time complexity is $\Theta(|V| + |E|)$
-
See 05.15
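-
A Lisp sketch of the DFS-based sort (mine; adj maps each vertex to its successors, vertices is the full vertex list, and pushing each vertex as it finishes builds the list in descending finish time):
(defun topological-sort (vertices adj)
  (let ((visited (make-hash-table :test #'equal))
        (order '()))
    (labels ((visit (u)
               (unless (gethash u visited)
                 (setf (gethash u visited) t)
                 (dolist (v (gethash u adj))
                   (visit v))
                 (push u order))))   ; u finished: prepend to the list
      (dolist (u vertices)
        (visit u)))
    order))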
Strongly Connected Components
-
Given $G = \langle V, E \rangle$, a strongly connected component of $G$ is a maximal set of vertices $C \subseteq V$ such that for every pair of vertices $u, v \in C$, $u$ is reachable from $v$ and $v$ is reachable from $u$
-
Collapsing the edges within an SCC yields an acyclic component graph
-
Algorithm requires the transpose of $G$, $G^T$, which is $G$ with the edges reversed. $G^T$ and $G$ have the same SCCs
-
Call DFS on $G$
-
Call DFS on $G^T$, Looping through vertices in order of decreasing finishing times from first DFS call
-
Each DFS tree in second DFS run is an SCC in $G$
-
Is $\Theta(|V| + |E|)$, because $G^T$ can be computed in $\Theta(|V| + |E|)$ from an adjacency list, and the first DFS run produces the decreasing-finish-time ordering (as in topological sort)
-
The first component of $G^T$ visited has no outgoing edges, since in $G$ it has only incoming edges; the second has edges into the first, etc. (Lemma 22.14 in book)
-
And, DFS on $G^T$ visits components in topological order of $G$'s component graph
Lecture 6: Minimum-Weight Spanning Trees
Intro
-
Given a connected, undirected graph $G = (V, E)$, a spanning tree is an acyclic subset $T \subseteq E$, that connects all vertices in $V$
-
$T$ is acyclic implies tree
-
$T$ connects all vertices, spans $G$
-
-
If $G$ is weighted, then $T$'s weight is $w(T) = \sum_{(u, v) \in T} w(u, v)$
-
An MST is a spanning tree of minimum weight, not necessarily unique.
Kruskal's Algorithm
-
Greedy algorithm, makes locally best choice at each step
-
Starts by declaring each vertex to be its own tree (so all nodes together make a forest)
-
Iteratively identify the minimum-weight edge $(u, v)$ that connects two distinct trees and add it to the MST $T$, merging the two trees
Lecture 6: Minimum-Weight Spanning Trees, Continued
Kruskal's Algorithm
-
Relies on Disjoint Set data structure
-
Algorithm presented on 06.05
-
Algorithm is $\mathcal{O}(|E| \log{|E|})$ (06.12)
Disjoint Set Data Structure
-
Given universe $U$ of elements, DSDS maintains collection $\mathcal{S} = \{S_1, \ldots, S_k\}$
-
Each element in $U$ is in exactly one set $S_j$
-
No set $S_j$ is empty
-
-
Dynamically updated throughout program
-
Each set $S \in \mathcal{S}$ has a representative element $x \in S$
-
Chapter 21
-
$\alpha$ is the inverse Ackermann function
-
very near constant
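-
A Lisp sketch of the structure (mine, per Chapter 21's union by rank with path compression; named union-sets because union is a built-in Lisp function):
(defstruct dset parent (rank 0))

(defun make-set ()
  (let ((s (make-dset)))
    (setf (dset-parent s) s)   ; each element starts as its own root
    s))

(defun find-set (x)
  (unless (eq (dset-parent x) x)
    (setf (dset-parent x) (find-set (dset-parent x))))  ; path compression
  (dset-parent x))

(defun union-sets (x y)
  (let ((rx (find-set x)) (ry (find-set y)))
    (unless (eq rx ry)
      (when (< (dset-rank rx) (dset-rank ry))
        (rotatef rx ry))                ; attach the shorter tree
      (setf (dset-parent ry) rx)
      (when (= (dset-rank rx) (dset-rank ry))
        (incf (dset-rank rx))))))
Kruskal would call make-set once per vertex, then find-set to test whether an edge connects two distinct trees and union-sets when it does.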
Prim's Algorithm
-
Still greedy
-
Maintains a single tree, not a forest
-
Starts with an arbitrary root $r$
-
Repeatedly finds minimum weight edge incident to the tree
-
algorithm on 06.14
-
Uses min queue
-
Proof in 06.18
-
Nodes already in the tree are those in $V \setminus Q$
-
For all vertices $v \in Q$, if $\pi[v] \neq \textsc{Nil}$, then $key[v] < \infty$
-
-
Proof is same, more or less
-
Prove invariant via cut that respects $A$ (no edges span the cut)
-
Greedy choice: look for a safe edge $e$
Minimum Heap
-
Binary tree where key at each node is $\leq$ keys of children
-
Any subtree also a heap
-
Insert/extract operations are $\Theta(\log{n})$
-
Build is $\mathcal{O}(n)$
-
Filter new min in $\mathcal{O}(\log{n})$
Lecture 6: Minimum Spanning Trees, Continued
Proof of correctness of both algorithms
-
Both use greedy approach
-
Maintain invariant that at any time, set of edges $A$ selected so far is subset of some MST (optimal substructure)
-
Each iteration of the algorithms looks for a safe edge $e$ such that $A \cup \{e\}$ is also a subset of an MST (greedy choice)
-
Prove invariant via use of cut $(S, V - S)$ that respects $A$ (no edges span cut)
-
Cut is partition of graph into two disjoint subsets
-
Theorem Let $A \subseteq E$ be included in some MST of $G$, $(S, V - S)$ be a cut respecting $A$ and $(u, v) \in E$ be a minimum-weight edge crossing the cut. Then $(u, v)$ is a safe edge for $A$. (06.20)
-
Proof.
-
Let $T$ be an MST including $A$ and not including $(u, v)$
-
Let $p$ be the path from $u$ to $v$ in $T$, and let $(x, y)$ be an edge on $p$ crossing the cut (it is not in $A$, since the cut respects $A$)
-
Since $T$ is a spanning tree, so is $T^{\prime} = (T \setminus \{(x, y)\}) \cup \{(u, v)\}$.
-
Both $(u, v)$ and $(x, y)$ cross cut, so $w(u, v) \leq w(x, y)$
-
So $w(T^{\prime}) = w(T) - w(x, y) + w(u, v) \leq w(T)$
-
Thus, $T^{\prime}$ is an MST
-
And $(u, v)$ is a safe edge for $A$ since $A \cup \{(u, v)\} \subseteq T^{\prime}$
-
Lecture 7: Single-Source Shortest Path
-
Given a weighted, directed graph $G = (V, E)$ with weight function $w: E \to \mathbb{R}$
-
The weight of a path $p = \langle v_0, v_1, \ldots, v_k \rangle$ is the sum of the weights of its edges: $w(p) = \sum_{i = 1}^{k} w(v_{i - 1}, v_i)$
-
The shortest-path weight from $u$ to $v$, $\delta(u, v)$, is the minimum weight over all paths between the two ($\infty$ if no path exists)
-
A shortest path from $u$ to $v$ is any path $p$ such that $w(p) = \delta(u, v)$
-
Network routing, driving directions
Types
- Single-source
-
finding shortest paths from source $s$ to every other node.
- Single-destination
-
find shortest paths from every node to destination $t$ (solved by running single-source on the transpose)
- Single-pair
-
find shortest path from $u$ to $v$ (solvable with SSSP; nothing asymptotically faster is known)
- All-pairs
-
find shortest paths between every pair of nodes (can be solved with repeated applications of single-source, but can be faster/simpler)
Optimal substructure
-
Shortest paths problem has optimal substructure property. Proof on 07.04.
Negative-Weight Edges and Cycles
-
What happens if $G$ has edges with negative weights?
-
Dijkstra's algorithm cannot handle this! Bellman-Ford can, under the right circumstances, namely no negative-weight cycles
-
Cycles include
-
Negative-weight cycles
-
Zero-weight cycles
-
Positive-weight cycles
-
Lecture 7: Single-Source Shortest Paths, Continued
Relaxation
-
$d[v]$, for $v \in V$, is an upper bound on $\delta(s, v)$
-
Relaxation of an edge $(u, v)$ is the process of testing whether we can decrease $d[v]$ and yielding a tighter upper bound
-
Initialize-Single-Source sets each $d[v]$ to $\infty$ ($d[s]$ to 0) and each parent to non-existent
-
Relax is such that if the current $d[v]$ is greater than $d[u] + w(u, v)$, update $d[v]$ to that, and set $v$'s parent to $u$
Lecture 7: Single-Source Shortest Paths, Continued
Relaxation
-
Provides the core of shortest-path algorithms
-
Basic method works simply, varied based on implementation
Bellman-Ford
-
Works with negative-weight edges and detects negative-weight cycles
-
Makes $|V| - 1$ passes over all edges, relaxing each edge during each pass
-
No cycles implies all shortest paths have $\leq |V| - 1$ edges, so that the number of relaxations is sufficient
-
Can be modified to tag negative-weight cycles
-
Algorithm shown on 07.13
-
Pass is through every single edge, $|V| - 1$ times
-
Lines 7–10 detect negative-weight cycles
-
Time Complexity:
-
Initialize Single Source takes $\mathcal{O}(|V|)$
-
Relax itself is $\mathcal{O}(1)$
-
The negative-weight-cycle check is $\mathcal{O}(|E|)$
-
Dual loop is $\mathcal{O}(|V| |E|)$
-
Thus, total time complexity is $\mathcal{O}(|V| |E|)$
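-
A Lisp sketch of the whole routine (mine, following 07.13's shape; edges is a list of (u v w) triples, and most-positive-fixnum stands in for $\infty$). Returns nil if a reachable negative-weight cycle is detected:
(defun bellman-ford (vertices edges s)
  (let ((d (make-hash-table :test #'equal))
        (parent (make-hash-table :test #'equal)))
    (dolist (v vertices) (setf (gethash v d) most-positive-fixnum))
    (setf (gethash s d) 0)
    (loop repeat (1- (length vertices)) do   ; |V| - 1 passes
      (dolist (e edges)
        (destructuring-bind (u v w) e        ; relax edge (u, v)
          (when (and (< (gethash u d) most-positive-fixnum)
                     (< (+ (gethash u d) w) (gethash v d)))
            (setf (gethash v d) (+ (gethash u d) w)
                  (gethash v parent) u)))))
    ;; one more pass: any further improvement means a negative cycle
    (dolist (e edges (values d parent))
      (destructuring-bind (u v w) e
        (when (and (< (gethash u d) most-positive-fixnum)
                   (< (+ (gethash u d) w) (gethash v d)))
          (return nil))))))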
-
-
Correctness
-
Assume no negative weight cycles
-
Since no cycles appear in SPs, every SP has at most $|V| - 1$ edges
-
Then define sets $S_0, \ldots, S_{|V| - 1}$
-
$S_k = \{ v \in V \mid \exists p = (s \leadsto v) \text{ s.t. } \delta(s, v) = w(p) \land |p| \leq k\}$
-
Invariant: After the $i$th iteration of outer relaxation loop, for all $v \in S_i$, we have $d[v] = \delta(s, v)$
-
This is the path-relaxation property (Lemma 24.15)
-
Proven via induction on $i$:
-
obvious for $i = 0$
-
If holds for $v \in S_{i - 1}$, then relaxation and optimal substructure implies holding for $v \in S_i$
-
-
-
Thus, after $|V| - 1$ iterations, $d[v] = \delta(s, v)$ for all $v \in V = S_{|V| - 1}$
-
-
Detection of Negative-weight cycles (07.18)
-
Let $c = \langle v_0, v_1, \ldots, v_k = v_0 \rangle$ be a neg-weight cycle reachable from $s$: $\sum_{i = 1}^k w(v_{i - 1}, v_i) < 0$
-
If the algorithm incorrectly returns true, then for all nodes in the cycle $i = 0, 1, \ldots, k$, $d[v_i] \leq d[v_{i - 1}] + w(v_{i - 1}, v_i)$
-
By summing these inequalities around the cycle, we get $\sum_{i = 1}^{k} d[v_i] \leq \sum_{i = 1}^{k} d[v_{i - 1}] + \sum_{i = 1}^{k} w(v_{i - 1}, v_i)$
-
Since $v_0 = v_k$, the two sums of $d$ values are equal, leaving $0 \leq \sum_{i = 1}^{k} w(v_{i - 1}, v_i)$, contradicting the cycle's negative weight (07.18)
-
-
Lecture 7: Single-Source Shortest Paths, Continued
-
Billboard problem and avocado problem are very similar
Shortest paths in a DAG
-
topo sort vertices, initialize single source, relax from left to right
-
Correctness follows from the path-relaxation property used for Bellman-Ford: relaxing edges in topological order means every path's edges are relaxed in order along the path
-
Total time complexity is $\mathcal{O}(|V| + |E|)$
Dijkstra's Algorithm
-
Greedy
-
Faster than Bellman-Ford
-
Requires all edge weights to be non-negative
-
Maintains set $S$ of vertices whose final shortest path weights from $s$ have been determined
-
Repeatedly selects $u \in V \setminus S$ with minimum SP estimate, adds $u$ to $S$ and relaxes all edges leaving $u$
-
Uses min-priority queue to repeatedly make greedy choice
-
Pseudocode on 07.25
-
Use array to implement min queue
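-
A Lisp sketch with a linear-scan extract-min (mine, the $\mathcal{O}(|V|^2)$ array variant of 07.25; adj maps $u$ to a list of (v . w) pairs, all weights non-negative):
(defun dijkstra (vertices adj s)
  (let ((d (make-hash-table :test #'equal))
        (parent (make-hash-table :test #'equal))
        (q (copy-list vertices)))
    (dolist (v vertices) (setf (gethash v d) most-positive-fixnum))
    (setf (gethash s d) 0)
    (loop while q do
      ;; extract-min by scanning the remaining queue
      (let ((u (reduce (lambda (a b)
                         (if (< (gethash a d) (gethash b d)) a b))
                       q)))
        (setf q (remove u q :test #'equal :count 1))
        (when (< (gethash u d) most-positive-fixnum)  ; u reachable
          (dolist (vw (gethash u adj))
            (let ((v (car vw)) (w (cdr vw)))
              (when (< (+ (gethash u d) w) (gethash v d))  ; relax
                (setf (gethash v d) (+ (gethash u d) w)
                      (gethash v parent) u)))))))
    (values d parent)))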
Correctness
-
Invariant: At the start of each iteration of the while loop, $d[v] = \delta(s, v)$ for all $v \in S$
-
Proof: Let $u$ be first node added to $S$ where $d[u] \neq \delta(s, u)$
-
Let $p = s \leadsto x \to y \leadsto u$ be an SP to $u$, with $y$ the first node on $p$ in $V \setminus S$
-
Since $y$'s predecessor $x \in S$, $d[y] = \delta(s, y)$ due to relaxation of $(x, y)$
Lecture 7: Single-Source Shortest Paths, Continued
Dijkstra's and Time Complexity
-
Initialize single source – $\Theta(|V|)$
-
Create Q – array, $\mathcal{O}(|V|)$; heap, $\mathcal{O}(|V|)$
-
Extract-Min – $\mathcal{O}(|V|)$ calls
-
Time Complexity – $\mathcal{O}(|V|)$ if array; $\mathcal{O}(\log{|V|})$ if heap
-
-
Relax – $\mathcal{O}(|E|)$
-
Time Complexity – $\mathcal{O}(1)$ ; $\mathcal{O}(\log{|V|})$
-
-
Total Time Complexity
-
$\mathcal{O}(|V|^2)$ if array
-
$\mathcal{O}(|E| \log{|V|})$ if heap
-
-
If $|E| \in o(|V|^2 / \log{|V|})$, prefer the heap
Correctness, continued
-
Since $y$ precedes $u$ on $p$, and edge weights are non-negative, $d[y] = \delta(s, y) \leq \delta(s, u) \leq d[u]$
-
Since $u$ was chosen before $y$ in line 3, $d[u] \leq d[y]$, so $d[y] = \delta(s, y) = \delta(s, u) = d[u]$, a contradiction
-
Since all vertices eventually end up in $S$, we see that the algorithm is correct. "QED"
Lecture 8: All Pairs Shortest Paths
-
Similar to SSSP, but for all pairs
-
Given weighted directed graph, find $\delta(u, v)$ for all $(u, v) \in V \times V$
-
Run SSSP $|V|$ times
-
If no negative weights, use Dijkstra's, gives $\mathcal{O}(|V|^3)$
-
If negative weights, use Bellman-Ford, for $\mathcal{O}(|V|^4)$
-
-
Can do better, and handle neg-weight edges
-
Matrix-multiplication algorithm, $\Theta(|V|^3 \log |V|)$
-
Floyd-Warshall, $\Theta(|V|^3)$
-
Adjacency Matrix Representation
-
Assume vertices are numbered, 1 to $n$
-
Input is $n \times n$ matrix: \[ w_{ij} = \begin{cases} 0 & i = j \\ w(i, j) & (i, j) \in E \\ \infty & (i, j) \not\in E \end{cases} \]
-
Assume, for now, that neg weight cycles are absent
-
Distance matrices $L$ and $D$ are produced by algorithms, predecessor matrix $\Pi$ where $\pi_{ij}$ is the predecessor of $j$ on a shortest path from $i$ to $j$ or NIL if $i = j$ or no path exists (well-defined due to optimal substructure)
Printing from $\Pi$
-
defined recursively, printing after recursive call
-
See 08.04
Matrix Multiplication
-
Maintain a series of matrices $L^{(m)} = \left( \ell_{ij}^{(m)} \right)$, where $\ell_{ij}^{(m)}$ is the minimum weight of any path from $i$ to $j$ that uses at most $m$ edges
-
When $m = 0$, $\ell_{ij}^{(0)} = 0$ if $i = j$, $\infty$ otherwise
-
-
Exploits optimal substructure to get a recursive definition of $\ell$
-
Uses $\min$, see 08.06
Lecture 8: All Pairs Shortest Paths, continued
Matrix Multiplication, continued
-
Increasing $m$ allows better paths to appear
-
Recursive solution due to the optimal substructure property (08.06): \[ \ell_{ij}^{(m)} = \min_{1 \leq k \leq n} \left(\ell_{ik}^{(m - 1)} + w_{kj}\right) \]
-
Ignoring negative-weight cycles means $\delta(i, j) = \ell_{ij}^{(n - 1)}$ (because every shortest path uses at most $n - 1$ edges)
-
Bottom-up: compute $L^{(1)} = W$, then $L^{(2)}, \ldots, L^{(n - 1)}$ iteratively
-
Defined on 08.08–09
-
Faster version on 08.13
Lecture 8: All Pairs Shortest Paths, continued
Floyd-Warshall
-
Shaves off the log factor
-
Start assuming no neg weight cycles, can detect neg cycles as before
-
Considers decomposition by intermediate vertex
-
If a simple path $p = \langle v_1, v_2, v_3, \ldots, v_{\ell - 1}, v_{\ell} \rangle$, then the set of intermediate vertices is $\{v_2, v_3, \ldots, v_{\ell - 1}\}$
-
-
Let $V = \{1, \ldots, n\}$, and fix $i, j \in V$
-
For some $1 \leq k \leq n$, consider set of vertices $V_k = \{1, \ldots, k\}$
-
Consider all paths from $i$ to $j$ whose intermediate vertices come from $V_k$ and let $p$ be a minimum-weight path from them
-
Is $k \in p$?
-
If not, then all intermediate vertices of $p$ are in $V_{k - 1}$ and an SP from $i$ to $j$ based on $V_{k - 1}$ is also an SP from $i$ to $j$ based on $V_k$
-
If so, then we can decompose $p$ into $i \overset{p_1}{\leadsto} k \overset{p_2}{\leadsto} j$
-
-
Note, $V_0$ is no intermediate vertices, i.e., only the direct edges between pairs of vertices
-
Now $D^{(k)}_{ij}$ is the weight of an SP from $i$ to $j$ using only $V_k$ as intermediates
-
$D^{(0)} = W$
-
$D^{(n)}$ holds the shortest-path weights
-
-
$k \notin p \implies D^{(k)}_{ij} = D^{(k - 1)}_{ij}$
-
$k \in p \implies D^{(k)}_{ij} = D^{(k - 1)}_{ik} + D^{(k - 1)}_{kj}$
-
Shortest path from $i$ to $j$ based on $V_k$ is either going to be the same as that based on $V_{k - 1}$ or will be going to go through $k$
-
$D^{(k)} = \left(d_{ij}^{(k)}\right)$, where $d_{ij}^{(k)}$ is the weight of the shortest path at $k$ (08.17), is in $\Theta(n^3)$
-
Algorithm on (08.18)
-
Transitive Closure: whether paths exist between all pairs of vertices of $G = (V, E)$; the transitive closure of $G$ is $G^{*} = (V, E^{*})$
-
$E^{*} = \{(i, j) : \exists i \overset{p}{\leadsto} j \in G\}$
-
Can be solved with Floyd-Warshall but with logical OR/AND in place of min/plus
-
-
Memory can potentially be saved; the computation can be done in place with a single $D$ matrix
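-
A Lisp sketch (mine, updating $D$ in place per the remark above; w is an $n \times n$ array with 0 on the diagonal and most-positive-fixnum standing in for $\infty$):
(defun floyd-warshall (w n)
  ;; Copy W into D, then allow intermediate vertices 1..k one at a time.
  (let ((d (make-array (list n n))))
    (dotimes (i n)
      (dotimes (j n)
        (setf (aref d i j) (aref w i j))))
    (dotimes (k n)
      (dotimes (i n)
        (dotimes (j n)
          (when (and (< (aref d i k) most-positive-fixnum)
                     (< (aref d k j) most-positive-fixnum)
                     (< (+ (aref d i k) (aref d k j)) (aref d i j)))
            (setf (aref d i j) (+ (aref d i k) (aref d k j)))))))
    d))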
Lecture 9: Lower Bounds
Upper Bound
-
Of an algorithm: $A$ has an upper bound of $f(n)$ if there exists no input of size $n$ such that $A$ requires more than $f(n)$ time
-
Of a problem, has an upper bound of $f(n)$ if there exists at least one algorithm that has an upper bound of $f(n)$
-
Generally, tightest is preferred
Lower Bound
-
A problem has a lower bound $f(n)$ if for any algorithm $A$ to solve the problem there exists at least one input of size $n$ that forces $A$ to take at least $f(n)$ time or space
-
The pathological input depends on the specific algorithm $A$; consider Bubblesort: a reverse-sorted input forces $\Omega(n^2)$
-
Since every sorting algorithm has an input of size $n$ forcing $\Omega(n \log{n})$ steps, the sorting problem has a time complexity lower bound of $\Omega(n \log{n})$
-
Adversarial argument, simulate arbitrary algorithm $A$ to build pathological input
-
Reduce one problem to another to establish lower bounds
Lecture 9: Lower Bounds, continued
Decision trees
-
A full binary tree that represents the comparisons between elements performed by a particular sorting algorithm operating on an input of a certain size
-
Tree represents an algorithm's behavior on all possible inputs of size $n$
-
Each node represents one comparison made by the algorithm
-
Each node is labeled $i:j$, which represents the comparison $A[i] \leq A[j]$
-
If the comparison holds for the particular input, control flow moves left; otherwise, right
-
Each leaf represents a possible output of the algorithm, a permutation of the input
-
All permutations must be in the tree in order for the algorithm to work properly
-
Proof of lower bound (using an adversarial approach)
-
Length of path from root to output leaf is number of comparisons made by algorithm
-
Worst-case number of comparisons is the length of the longest path
-
Adversary chooses a deepest leaf to create worst-case input
-
Number of leaves is $n!$
-
A binary tree of height $h$ has at most $2^h$ leaves
-
Thus, $2^h \geq n! \geq \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$ (by Stirling's approximation)
-
Take base-2 logs
-
$h \geq \log \sqrt{2\pi} + (1/2) \log n + n \log n - n \log e = \Omega(n \log n)$
-
Every comparison-based sorting algorithm has some input that forces it to make $\Omega(n \log n)$ comparisons. QED
-
And thus, mergesort and heapsort are asymptotically optimal.
Reductions, using Convex Hull
-
Use sorting lower bound to get lower bound on convex hull
-
Given a set $Q = \{p_1, p_2, \ldots, p_n\}$ of $n$ points, each from $\mathbb{R}^2$, output $CH(Q)$, which is the smallest convex polygon $P$ such that each point from $Q$ is on $P$'s boundary or in its interior
-
Output is an ordered list of points that make up the boundary edges such that from one to the next, the hull will be drawn
-
Reduce problem of sorting to convex hull
-
Given any instance of the sorting problem, $A = \{x_1, \ldots, x_n\}$, we will transform it to an instance of convex hull.
-
Transformation is $Q = \{(x_1, x_1^2), \ldots, (x_n, x_n^2)\}$
-
Feed into convex hull, convex hull transformed to sorted array
Lecture 9: Lower Bounds, continued
Reductions, using Convex Hull, continued
-
Known Lower Bound of Sorting, reducing to Convex Hull
-
Since Sorting is in $\Omega(n \log{n})$, by reduction, we will find that Convex Hull is $\Omega(n \log{n})$
-
Takes any instance of problem A (the problem reduced from) to some instance of problem B (the problem reduced to)
-
Because of the mapping, the hull will include all points (as all will lie on a parabola)
-
By definition, only one chunk
-
Conversion is still linear time
-
Proves a lower bound of convex hull, but not necessarily the lower bound (it is the lower bound of Convex Hull)
-
Can easily be done via contradiction. (09.12)
Lecture 10: NP Completeness
-
So far, focused on problems with efficient algorithms (i.e., those that run in polynomial time)
-
Called efficient even if $c$ is large, since it is likely that another, more efficient algorithm exists
-
Polynomial in terms of size of input, $n$.
-
-
Some problems have a fastest known algorithm in superpolynomial time (e.g., exponential, or sub-exponential such as $2^{n^{1/3}}$)
-
Some problems cannot be solved in any amount of time
-
Focus on lower bounds, but used to argue that some problems don't have any efficient solution
Lecture 10: NP Completeness, continued
-
Focus on complexity classes P and NP
-
P is deterministic polynomial time, the set of problems that can be solved with a deterministic TM (deterministic algorithm in polynomial time)
-
NP is non-deterministic polynomial time, i.e., set of problems that can be solved by a non-deterministic TM in polynomial time (can explore many possible paths of computation at once)
-
NP is the set of problems whose solutions, if given, can be verified in polynomial time
-
Hamiltonian cycle – does $G = (V, E)$ contain a Hamiltonian cycle (a simple cycle visiting every vertex exactly once)?
-
NP, since given a specific $G$, plus yes/no answer, plus certificate, we can verify a yes in polynomial time using the certificate
-
What would be an appropriate certificate?
-
Not known if HAM-CYCLE $\in P$
-
-
Euler – does a directed graph $G = (V, E)$ contain an Euler tour – a cycle that visits every edge in $E$ exactly once and can visit vertices multiple times
-
Check if each vertex's in-degree is equivalent to its out-degree
-
Certificate is just the tour, or even the graph itself
-
-
Any problem in $P$ is also in $NP$, since if we can efficiently solve the problem, we get poly-time verification for free, i.e., $P \subseteq NP$
-
Not known if $P \subsetneq NP$: unknown whether there exists a problem in $NP$ that is not in $P$
-
A subset of the problems in $NP$ is the set of NP-complete (NPC) problems
-
Every problem in NPC is at least as hard as all others in NP
-
These problems are believed to be intractable (no efficient algorithm), but not yet proven to be so
-
If any NPC problem is in $P$, then $P = NP$ and life is glorious and scary
-
Proving NP completeness
-
To prove that a problem is NPC, we don't argue against all of NP directly; we take a different approach
-
To prove $B \in NPC$:
-
Prove that $B \in NP$ by identifying certificate that can be used to verify a YES answer in polynomial time
-
Use obvious choice
-
Need to argue that verification requires polynomial time
-
Certificate is not instance unless $B \in P$
-
-
Show that $B$ is as hard as any other NP problem by showing that if we can efficiently solve $B$ then we can efficiently solve all problems in NP
-
-
Step one is easy
-
Second looks hard, but just use a reduction (probably from 3SAT)
Polynomial-Time Reductions
-
Takes an instance of $A$ and transforms it to an instance of $B$ such that if $B$ is solvable, then $A$ is solvable
-
Time complexity of solving $A$ this way is the time of the reduction to $B$ plus the time to solve $B$
-
Start with $A$ and $B$
-
Transform any instance $\alpha$ of $A$ to some instance $\beta$ of $B$ such that
-
Transformation must take polynomial time
-
The answer for $\alpha$ is yes iff the answer for $\beta$ is yes.
-
-
-
If such a reduction exists, then $B$ is at least as hard as $A$, since if an efficient algorithm for $B$ exists, any instance of $A$ can be solved similarly
-
$A \leq_{p} B$ (read $A$ is reducible to $B$ modulo polynomial-time conversion)
Only decision problems
-
Only focus on decision problems, simplifies things
-
Only output is yes or no
-
Not asked for shortest path or cycle or similar
-
Don't ask for shortest path from $i$ to $j$, just ask if there exists a path from $i$ to $j$ with weight at most $k$
-
Optimization problem implies decision problem
-
The decision version is no harder than the original optimization problem; showing that decision is hard shows that optimization is hard too
-
Decision problems are especially convenient when thinking in terms of Turing machines and languages
Lecture 10: NP Completeness, continued
Proving NP completeness
-
Same form as the reduction for convex hull, but no need to transform the solution for $B$ back to a solution for $A$
-
As with the convex hull, the reduction's time complexity must be strictly less than the lower bound we are proving for $B$'s algorithm.
-
The problem conversion must happen in $\mathcal{O}(n^c)$, where $c \in \mathbb{Z}$
-
-
Must show that output of $B$ is yes iff $A$ would be yes
-
But if we want to prove that a problem $B$ is NPC, do we have to reduce it to every problem in NP?
-
No! Reductions are transitive!
-
Circuit Sat
-
First NPC problem
-
An instance is a boolean combinational circuit, no feedback, no memory
-
Is there a satisfying assignment, i.e., an assignment of inputs that makes it output 1?
-
To prove CIRCUIT-SAT is in NP: run the circuit on the certificate
-
Certificate is satisfying assignment
-
Any problem in NP reduces to Circuit Sat
-
-
Skipping the second part; it leverages the existence of an algorithm that verifies certificates for any NP problem
Other NPC Problems
-
Circuit Sat
-
SAT
-
3CNF-Sat
-
Clique
-
Vertex Cover
-
Ham Cycle
-
TSP
-
-
Subset Sum
-
Proving the problem
-
10.18
-
Follow every step
-
Prove that B is in NP
-
Describe certificate that can verify yes (often simple and obvious, but not merely the instance)
-
Describe how the certificate is verified
-
Argue that verification takes polynomial time
-
-
Prove that the problem $B$ is NP-hard
-
Take any other NP-complete problem $A$ and reduce it to $B$
-
Reduction must transform any instance of $A$ to some instance of $B$
-
-
Prove that the reduction takes polynomial time (an algorithm, analyze like any other)
-
Prove that the reduction is valid
-
Answer is "yes" for $A$ if and only if the answer is "yes" for $B$
-
Must argue both directions
-
Constructive proofs work well, e.g., "Assume the instance of VERTEX-COVER (problem $A$) has a vertex cover of size $\leq k$. We will now construct from that a Hamiltonian cycle in problem $B$." (and vice versa)
-
-
-
Lecture 10: NP Completeness, continued
Satisfiability (SAT) through Circuit Satisfiability
-
Given a Boolean formula $\phi$ consisting of $n$ variables $x_1, \ldots, x_n$, $m$ connectives (of the usual types), and parentheses
-
SAT is $B$
-
Certificate is satisfying assignment
-
Certificate verified by evaluating $\phi$ under the assignment
-
Verification clearly takes polynomial time ($\mathcal{O}(n + m)$)
-
-
SAT is in fact in NP
-
Will show CIRCUIT-SAT $\leq_{P}$ SAT by reducing CIRCUIT-SAT to SAT, thus showing that SAT is NP-hard
-
Map any instance $C$ of CIRCUIT-SAT to some formula $\phi$ in polynomial time such that $\phi$ is satisfiable iff $C$ is satisfiable
-
Define a variable for each wire in $C$
-
Then define a clause of $\phi$ for each gate that defines the function for that gate
-
And include the output wire as a separate clause
-
Given a satisfying assignment for $\phi$, extract the values of the input-wire variables ($x_{1}$, $x_{2}$, $x_{3}$) as a satisfying assignment to $C$
-
$\phi$'s size is polynomial in size of $C$
-
If circuit has a satisfying assignment, final output of circuit is $1$ and value on each internal wire matches the output of the gate that feeds it, thus $\phi$ evaluates to 1
-
If $\phi$ has a satisfying assignment, then each of $\phi$'s clauses is satisfied, so each internal wire's value is consistent with its gate
-
Since a satisfying assignment for $C$ implies a satisfying assignment for $\phi$ and vice-versa, we get that $C$ has a satisfying assignment iff $\phi$ does.
-
3-CNF Satisfiability
-
Given a Boolean formula that is in 3-conjunctive normal form (3-CNF): a conjunction of clauses, each a disjunction of 3 literals
-
Assignment is certificate
-
Certificate verified by evaluating the formula under the assignment
-
Reduce SAT $\leq_{P}$ 3CNFSAT
-
Again, need to map any instance of $SAT$ to some instance of 3CNFSAT
-
Parenthesize $\phi$ and build its parse tree, which can be viewed as a circuit
-
Assign variables to wires in the circuit, yielding $\phi^{\prime}$
-
Build a truth table for each clause $\phi_{i}^{\prime}$ to get a DNF of its negation, then convert to CNF (by DeMorgan's), yielding $\phi_{i}^{\prime\prime}$
-
Add auxiliary variables
-
Take the conjunction of all the $\phi_{i}^{\prime\prime}$
-
If a clause has three literals, add it as-is
-
If it has two literals, make two copies, adding $p$ to one and $\lnot p$ to the other
-
Otherwise (one literal), add auxiliary variables $p$ and $q$ analogously, over four copies
-
-
Can show the biconditional relatively easily
-
Lecture 10: NP Completeness, continued
Clique
-
Given an undirected graph $G = (V, E)$ and a value $k$, does $G$ contain a clique (complete subgraph) of size $k$?
-
Certificate is the set of vertices $V^{\prime}$ of the clique
-
Verify that $V^{\prime}$ forms a clique
-
$|V^{\prime}| \geq k$
-
-
Reduce 3CNF to Clique
-
$\langle \phi \rangle$ becomes $\langle G, k \rangle$
-
$\phi = C_{1} \land \cdots \land C_{k}$
-
Create a triple of vertices $v^{r}_{1}$, $v^{r}_{2}$, $v^{r}_{3}$ added to $V$ for every clause $C_r$
-
Create an edge between vertices $v^{r}_{j}$ and $v^{s}_{m}$ when:
-
$r \neq s$ (they come from different clauses)
-
The corresponding literals are not the negation of each other
-
-
Obviously done in polynomial time
-
-
If $\phi$ has a satisfying assignment, at least one literal in each clause is true
-
Picking one true literal's vertex per clause yields a clique $V^{\prime}$ of size $k$
-
If $G$ has a size-$k$ clique, we can assign 1 to the literal corresponding to each vertex
-
Each vertex in its own triple, so each clause has a literal set to $1$
-
Will not set both literal and negation to 1
-
Gets satisfying assignment
-
See 10.37.
Vertex Cover
-
A vertex covers all edges incident to it
-
A vertex cover of a graph is a set of vertices that covers all edges in the graph
-
An undirected graph $G$ and value $k$
-
Does $G$ contain a vertex cover of size $k$?
-
Set of vertices is certificate, verify size and all edges are covered.
-
Reduce Clique to Vertex Cover
-
Given $\langle G = (V, E), k \rangle$ of Clique, vertex cover is $\langle \bar{G}, |V| - k \rangle$ where $\bar{G} = (V, \bar{E})$ is $G$'s complement.
-
Clear that conversion is polynomial
-
Proof of biconditional trivial by construction given complement (see 10.41)
Lecture 10: NP Completeness, continued
-
Review method of reduction
-
Review Vertex Cover reduction
Lecture 10: NP Completeness, continued
Subset Sum
-
Given a finite set $S$ of positive integers and a positive integer target $t$, is there a subset $S^{\prime} \subseteq S$ whose elements sum exactly to $t$?
-
In NP
-
Certificate is $S^{\prime}$, verification in polytime
-
-
Subset sum is NP Hard, from 3-CNF-SAT
-
3-CNF-SAT $\leq_{P}$ SUBSET-SUM
-
No clause contains a variable and its negation
-
Each variable appears in at least one clause
-
Reduction
-
Let $\phi$ have $k$ clauses $C_{1}, \ldots, C_{k}$ over $n$ variables $x_{1}, \ldots, x_{n}$
-
Reduction creates two numbers in $S$ for each variable $x_{i}$ and two numbers for each clause $C_{j}$
-
Each number has $n + k$ digits, most significant $n$ tied to variables, least significant $k$ tied to clauses
-
$t$ has a one in each digit tied to a variable, and a 4 in each digit tied to a clause.
-
For each $x_{i}$, $S$ contains $v_{i}$ and $v_{i}^{\prime}$, each with a 1 in the $i$th variable digit and 0 in the other variable digits; put a 1 in $C_{j}$'s digit of $v_{i}$ if $x_{i}$ appears unnegated in $C_{j}$, and a 1 in $C_{j}$'s digit of $v_{i}^{\prime}$ if it appears negated
-
For each $C_{j}$, $S$ contains integers $s_{j}$ and $s_{j}^{\prime}$, where $s_{j}$ has a 1 in the $C_{j}$ digit and $s_{j}^{\prime}$ has a 2 in the $C_{j}$ digit
-
-
Greatest sum of any digit is 6, therefore no carries between digits
-
Variable digits provide mutual exclusion of variables (excluded middle)
-
The slack values $s_{j}, s_{j}^{\prime}$ and the excluded-middle choice for each $x_{i}$ drive the proof
-
-
Lecture 10: NP Completeness, continued
Subset Sum, Continued
-
4 is used because there can be 1, 2 or 3 literals that are true, the additional 1 makes the problem easier and allows the slack to work
-
See 10.46, 10.47 for if and only if
Lecture 11: Maximum Flow
Introduction
-
Can use digraph as a flow network to model things (data communications networks, water etc. through pipes, assembly lines, etc.)
-
a flow network is a digraph with two special vertices, a source ($s$) that produces flow, and a sink ($t$) that takes the flow
-
A directed edge is a conduit with a capacity
-
Vertices are conduit junctions
-
Except $s$ and $t$, flow must be conserved, the flow into a vertex must match flow out
-
Max flow problem: given a flow network, determine max amount of flow that can get from $s$ to $t$, also: bipartite matching
Flow networks
-
every edge $(u, v) \in E$ has a non-negative capacity, $c(u, v) \geq 0$
-
If $(u, v) \in E$, then $(v,u) \not\in E$
-
If $(u, v) \not\in E$, then $c(u, v) = 0$
-
No self-loops
-
Assume that every vertex in $V$ lies on some path from the source $s \in V$ to the sink vertex $t \in V$
Flows
-
A flow in $G$ is a function $f: V \times V \to \mathbb{R}$ that satisfies
-
The capacity constraint: for all $u, v \in V$, $0 \leq f(u, v) \leq c(u, v)$, i.e., flow is non-negative and does not exceed capacity
-
Flow conservation: for all $u \in V \setminus \{s, t\}$: \[\sum_{v \in V }f(v, u) = \sum_{v \in V }f(u, v)\]
-
-
Value of flow $f$ is net flow out of $s$, net flow into $t$ \[|f| = \sum_{v \in V} f(s, v) - \sum_{v \in V} f(v, s)\]
-
Maximum Flow Problem: Given the graph and capacities, find a flow of maximum value
Example
-
flow/capacity on edges
Multiple Sources/Sinks
-
Supersource $s$ and supersink $t$, edge from $s$ to every source, from every sink to $t$, add edges with infinite capacity
Ford-Fulkerson Method
-
method, not algorithm
-
many ways to implement
-
Uses residual network – a network which is $G$ with capacities updated based on the amount of flow $f$ already going through it
-
augmenting path – a simple path from $s$ to $t$ in a residual network; if such a path exists, then more flow can be pushed through
-
cut – a partition of $V$ into $S$ and $T$ where $s \in S$ and $t \in T$; can measure flow and capacity crossing the cut
-
Repeatedly finds augmenting path in residual, adds in flow along path, updates residual network
Residual Networks
-
given capacity $c$ and flow $f$, the residual network consists of edges with capacities showing how one can change the flow in $G$
-
Residual capacity of an edge is: \[c_{f}(u, v) = \begin{cases}c(u, v) - f(u, v) & (u, v) \in E\\f(v, u) & (v, u) \in E\\0 & \text{otherwise}\end{cases}\]
-
$E_{f} = \{(u, v) \in V \times V : c_{f}(u, v) > 0\}$
Lecture 11: Maximum Flow, Continued
Flow Augmentation
-
if $f$ is a flow in $G$ and $f^{\prime}$ is a flow in $G_{f}$, the augmentation is: \[(f \uparrow f^{\prime})(u, v) = \begin{cases} f(u, v) + f^{\prime}(u, v) - f^{\prime}(v, u) & (u, v) \in E \\0 & \text{otherwise}\end{cases}\]
-
$f \uparrow f^{\prime}$ is a flow in $G$ with value $|f \uparrow f^{\prime}| = |f| + |f^{\prime}|$
-
Amount of flow that can be put on a path $p$ is $p$'s residual capacity, $c_{f}(p) = \min\{c_{f}(u, v) : (u, v) \in p\}$
-
Then, $f_{p}(u, v)$ is based on the value of $c_{f}(p)$
-
Full Ford-Fulkerson algorithm presented on 11.16
Analysis
-
Assume all of $G$'s capacities are integers
-
If we choose augmenting paths arbitrarily, the number of iterations is bounded by $|f^{*}|$, the value of a maximum flow (each augmentation adds at least 1)
Edmonds-Karp
-
Ford-Fulkerson with BFS used to find augmenting paths; time complexity is $\mathcal{O}(|V||E|^{2})$
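-
A Lisp sketch (mine; cap is an $n \times n$ capacity matrix over vertices $0 \ldots n-1$, and negative flow entries play the role of residual back edges). Returns the max flow value from source to sink:
(defun edmonds-karp (cap n source sink)
  (let ((flow (make-array (list n n) :initial-element 0))
        (total 0))
    (loop
      ;; BFS in the residual network for a shortest augmenting path
      (let ((parent (make-array n :initial-element nil))
            (queue (list source)))
        (setf (aref parent source) source)
        (loop while queue do
          (let ((u (pop queue)))
            (dotimes (v n)
              (when (and (null (aref parent v))
                         (> (- (aref cap u v) (aref flow u v)) 0))
                (setf (aref parent v) u)
                (setf queue (append queue (list v)))))))
        (when (null (aref parent sink))
          (return total))                ; no augmenting path: done
        ;; bottleneck = the path's residual capacity; then push flow
        (let ((bottleneck most-positive-fixnum))
          (do ((v sink (aref parent v)))
              ((= v source))
            (let ((u (aref parent v)))
              (setf bottleneck
                    (min bottleneck (- (aref cap u v) (aref flow u v))))))
          (do ((v sink (aref parent v)))
              ((= v source))
            (let ((u (aref parent v)))
              (incf (aref flow u v) bottleneck)
              (decf (aref flow v u) bottleneck)))  ; residual back edge
          (incf total bottleneck))))))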
Maximum Bipartite Matching
-
In an undirected graph, a matching is a subset of edges $M \subseteq E$ such that for all $v \in V$, at most one edge from $M$ is incident on $v$
-
If an edge from $M$ is incident on $v$, $v$ is matched, otherwise, unmatched
-
Find a matching of maximum cardinality
-
Special case: $G$ is bipartite, meaning $V$ is partitioned into two disjoint sets $L$ and $R$ and all edges cross between the two
-
Matching machines to tasks, arranging marriages between interested parties
-
Can cast bipartite matching as max flow, given bipartite graph $G$, flow network is:
-
$V^{\prime} = V \cup \{s, t\}$
-
$E^{\prime} = \{(u, v) : (u, v) \in E, u \in L, v \in R\} \cup \{(s, u) : u \in L\} \cup \{(v, t): v \in R\}$
-
$c(u, v) = 1$ for all $(u, v) \in E^{\prime}$
-
-
Value of the flow equals the number of edges in the matching
Lecture 12: Final Exam Review
-
Monday, Dec 16 13:00-15:00, AvH 106
-
Topics: MSTs, Single Source Shortest Paths, All Pairs Shortest Paths, Lower Bounds, NP Completeness, Max Flow
-
Two reference sheets (double-sided, mechanically printed, standard letter)
-
Four Questions, 25 pts each
-
Guaranteed: NP-completeness, Max Flow (Edmonds-Karp)