Notes

Notes, etc. of Samuel Flint

Fundamentals of Constraint Processing

Lecture 2

Resources

  • check website – has many, many links

  • conferences

  • journals

    • include logic programming and CSP

    • Linear Programming, Integer Programming, Mixed Integer Programing, Dynamic Programming

  • journals are also pretty big – and tend to be easier to get into

  • many good ways to get jobs with this

Constraint Satisfaction 101

  • examples

    • generating optimal decisions

      • have unary, binary, n-ary constraints

      • often have many choices, but restrictions can make it easier to make the right decision

  • modeling constraints remains an art

  • sets of variables with sets of choices, and a set of constraints restricting combinations of values

  • applications

    • radio resource management

    • databases

    • temporal and spatial reasoning

    • allocation

    • design/config

    • graphics, visualization, interfaces

    • verification and engineering

    • HCI and decision support

    • molecular biology

    • analytical chemistry

    • robotics, machine vision, computational linguistics

    • transportation

    • qualitative and diagnostic reasoning

  • constraint processing is modeling, plus inference and search

    • modeling – variables, domains, constraints, query

    • inference – enforcing consistency (filtering) and propagation (AC, PC)

    • search – systematic (chronological backtracking, with or without intelligence, etc) or non-systematic (hill climbing, tabu search, etc)

  • Real Example

    • the Netherlands has a dense network

    • wanted a more robust, more effective schedule, increasing profit

    • used OR and AI

  • Singapore

    • large container hub

    • has a custom, special container-loading system built using CP

  • CP is:

    • solving decision problems

    • with arbitrary constraints

    • providing concise and high-level feedback about alternatives and conflicts

  • Very Practical

Defining a problem

  • has a given and a question

  • given is a set of objects, the relations, etc

  • question ex: find $x$ such that condition $y$ is satisfied

    • these can be simple or advanced

  • a problem must always, always, always have a question!

  • Given $\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$

    • $\mathcal{V} = \{V_1, V_2, \ldots, V_n\}$ Variables

    • $\mathcal{D} = \{D_{V_1}, D_{V_2}, \ldots, D_{V_n}\}$ Domains for Variables

    • Constraints $\mathcal{C} = \{C_1, C_2, \ldots, C_l\}$ where $C_{V_i,V_j,\ldots,V_k} = \{(x,y,\ldots,z)\} \subseteq D_{V_i} \times D_{V_j} \times \ldots \times D_{V_k}$

  • Query: can we find a value for each variable such that all constraints are satisfied? Queries can also include weights (making it a Constrained Optimization Problem)

  • Various levels of consistency

    • AC

    • GAC

    • SAC

    • SGAC

  • Sudoku can often be solved by consistency enforcement alone, though not every instance – some still require search

  • find solution – decisions

  • find the number of, or all, solutions – counting

  • find set of constraints that can be removed so that a solution exists – optimization

  • specials

    • binary CSPs have constraints over at most two variables

    • restricted to $\{0, 1\}$, boolean CSP

    • finite, discrete domains, enumeration can work

    • continuous domains require sophisticated algebraic techniques – consistency techniques are then used on the domain bounds
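The triple $\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$ above lends itself to a direct sketch. A minimal Python sketch, assuming binary constraints given in extension; the variable names and the particular "<"/"≠" relations are invented for illustration:

```python
# P = (V, D, C): variables, domains, and binary constraints in extension,
# each constraint stored as its set of allowed tuples (supports).
variables = ["V1", "V2", "V3"]
domains = {"V1": {1, 2, 3}, "V2": {1, 2, 3}, "V3": {1, 2, 3}}
constraints = {  # scope -> allowed tuples
    ("V1", "V2"): {(x, y) for x in range(1, 4) for y in range(1, 4) if x < y},
    ("V2", "V3"): {(x, y) for x in range(1, 4) for y in range(1, 4) if x != y},
}

def satisfies(assignment, constraints):
    """True iff a full assignment satisfies every constraint."""
    return all((assignment[vi], assignment[vj]) in allowed
               for (vi, vj), allowed in constraints.items())

print(satisfies({"V1": 1, "V2": 2, "V3": 3}, constraints))  # True
print(satisfies({"V1": 2, "V2": 2, "V3": 1}, constraints))  # False
```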

Lecture 3

  • always a given $\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$

  • Constraints are relations

    • relations are a subset of the cartesian product of $n \geq 2$ sets

    • relations can be represented as a table of allowed tuples, referred to as supports; such an enumerated constraint is a table constraint

    • extension: a set with all of its members enumerated

    • conflicts are the set of tuples that are no-goods

    • Constraints can also be given in intension – defined in terms of a logical function; equivalent to extension, but can be easier to understand

    • A check function should always be a simple implementation – easy to build with

  • Code must always be well structured!
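The extension/intension distinction above can be made concrete; a small sketch (the "<" relation is an arbitrary example) in which both representations drive the same simple check function:

```python
# The same relation given two ways: in extension (enumerated supports)
# and in intension (a logical function). A check() built on either must agree.
allowed = {(x, y) for x in range(5) for y in range(5) if x < y}  # extension

def check_extension(x, y):
    return (x, y) in allowed        # table lookup

def check_intension(x, y):
    return x < y                    # defining formula

# the two representations are equivalent on the whole Cartesian product
assert all(check_extension(x, y) == check_intension(x, y)
           for x in range(5) for y in range(5))
```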

Lecture 4

  • $P = (V, D, C)$

    • $V$ variables

    • $D$ domains

    • $C$ constraints

  • Constraints are defined as $C_1 = \langle\mathrm{scope}(C_1), \mathrm{rel}(C_1)\rangle$

    • Scope is the variables it discusses – the set of variables over which the constraint is defined

    • relation is an abstract relation – may be in extension (listed as either supports or conflicts) or intension (roughly as a function or boolean expression)

    • Also have arity, the cardinality of the scope

    • universal constraints allow any member of the relevant cartesian product, but may be induced so as to be something else – equivalent to no constraint, and is binary alone

  • VVP – Variable Value Pair

  • make sure to implement check function, and to do so in a very independent manner

  • keep track of a number of constraint checks, always incrementing for every check

  • will be writing abscom to parse

  • Graph representation

    • Macro representation

    • Variables are represented as nodes (vertices)

    • domains as node labels

    • constraints as arcs (edges) between nodes

    • hypergraphs allow for special constraint nodes, or use a bubble to connect them instead – the hyperedges

  • relations in intension are defined by set-builder notation

    • used when not practical/possible to list all tuples

    • define types/templates of common constraints

    • linear constraints

    • all-diff constraints

    • at most

    • TSP-constraints

    • cycle-constraints

  • constraints implemented

    • in extension

      • set of tuples (list/table)

      • binary matrix (bit-matrix)

      • multi-valued decision diagrams

    • in intension

      • constrained decision diagrams

      • predicate functions

Lecture 5

  • examples of modeling

    • can have temporal csp

    • graph or map coloring – planar maps require at most 4 colors

    • resource allocation – airline trip planning, list coloring

    • product configuration – allowed components and what they require/forbid

    • logic puzzles

  • Constraint types

    • algebraic constraints

    • constraints of bounded difference

    • coloring

    • mutual exclusion

    • difference constraints

    • arbitrary constraints, must be made explicit

  • Databases

    • Join in DB is a CSP

    • View materialization is a CSP

  • Interactive systems

    • data-flow

    • spreadsheets

    • graphical layout systems and animation

    • graphical user interfaces

  • Molecular biology

    • threading and similar

Formal characterization

  • macrostructures

    • networks/constraint graphs, $G = (V, E)$, may be hypergraphs

  • Micro-structure – includes every possible Variable Value Pair

    • VVPs (of different variables) are linked iff they are consistent, i.e., form a support

    • cliques of size $n$ are possible solutions

    • however, finding cliques of a given size is NP-complete

  • Co-microstructure – includes every possible VVP

    • VVPs are linked iff they are a no-good

    • cliques of size $n$ prevent a given solution

  • Complexity

    • decision problem

    • NP-complete by reduction from 3SAT

    • a solution can be verified in polynomial time, but no known algorithm generates one in polynomial time

Proof of CSP into SAT

  • CSP has Variables, Domains, Constraints

  • Every VVP is a Boolean variable in SAT

  • Requires, for each variable, a clause that is the disjunction of its VVPs (at least one value), plus, for each pair of distinct VVPs $a, b$ of the same variable, the clause $\neg a \vee \neg b$ (at most one value); all clauses are added to the conjunction

  • Constraints are then mapped to SAT
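The clause construction above can be sketched as follows; a minimal assumed encoding (variables and values invented for illustration), with constraints mapped from their forbidden tuples:

```python
# CSP-to-SAT sketch: one Boolean per VVP, an at-least-one clause per
# variable, pairwise at-most-one clauses, one clause per forbidden tuple.
from itertools import combinations

def csp_to_sat(domains, conflicts):
    """domains: var -> values; conflicts: (vi, vj) -> forbidden pairs.
    Clauses are lists of signed literals ('+'/'-', variable, value)."""
    clauses = []
    for v, dom in domains.items():
        clauses.append([("+", v, x) for x in dom])        # at least one value
        for x, y in combinations(dom, 2):                 # at most one value
            clauses.append([("-", v, x), ("-", v, y)])
    for (vi, vj), bad in conflicts.items():               # map the constraints
        for x, y in bad:
            clauses.append([("-", vi, x), ("-", vj, y)])
    return clauses

clauses = csp_to_sat({"A": [1, 2], "B": [1, 2]}, {("A", "B"): {(1, 1)}})
print(len(clauses))  # 5: 2 at-least-one + 2 at-most-one + 1 conflict clause
```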

Lecture 6

Solving a CSP

  • by search

    • constructive, systematic, exhaustive – involves enumerating all possible VVPs

    • iterative repair, local search – pick an assignment and change unsatisfied constraints until all constraints are satisfied

    • Before start

      • consider the model/formulation – it controls the size of the search space (defining a model is an art)

      • pre-process

        • i.e. constraint filtering/propagation, consistency checking – reduce the size of search space

  • Constraint Checking

    • arc consistency

      • removes unsupported values from the domains of the variables in the constraint

      • Revise Function (B, C):

        • $\forall x \in D_B$

          • if not support($x$, $D_C$)

            • $D_B \gets D_B \setminus \{x\}$

      • Support function: for each element of the other domain, if one supports $x$, return true; if none exists, return false

      • will not always find a solution, but will reduce problem size
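The Revise/support pair above, sketched in Python; the binary check predicate is an assumed argument, and the "<" relation below is an invented example:

```python
# Revise(B, C): prune from D_B every value with no support in D_C.
def supported(x, d_c, check):
    """True iff some y in d_c supports x under the binary check."""
    return any(check(x, y) for y in d_c)

def revise(d_b, d_c, check):
    """Remove unsupported values from d_b; True iff anything was removed."""
    removed = {x for x in d_b if not supported(x, d_c, check)}
    d_b -= removed
    return bool(removed)

d_b, d_c = {1, 2, 3}, {2, 3}
revise(d_b, d_c, lambda x, y: x < y)   # keep values of B below some value of C
print(sorted(d_b))  # [1, 2]
```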

    • k-consistency

      • consistency on $k\leq n$ variables

  • Systematic search

    • for every value of V_i, check for consistency with previous assignments (starting from the bottom), cuts off portions of the tree if false (using backchecking to prune a search tree)

    • back-checking expands a partial solution iff it is thus-far complete

    • chronological backtracking – DFS with only the current path, undoes the most recent variable assignment and tries a new one – can be used for only one solution

      • is complete – will always find a solution if it exists

      • is sound / correct – a solution it gives will always be correct – will not give a false positive

    • intelligent backtracking

      • jump back based on the reason for failure – generally the deepest

    • Ordering heuristics

      • deciding what to expand first

      • often most constraint and most promising values first

      • strategies for variable/value ordering could be static/dynamic

      • effects branching factor – best to branch as little as possible

      • generally variable ordering is dynamic, but value is static (lexicographic)

    • Keep in mind forward-checking – removing from future domains the values that aren't consistent with the current assignment; emptying a domain this way is domain annihilation – and it is more efficient

    • Real Full Lookahead is used to further restrict domains and search spaces more than plain forward-checking

    • balance between reasoning and search

Lecture 7

Arc Consistency

  • revise should only modify one variable's domain

  • requires support, takes VVP and domain of other

    • checks for VVP in X with every element of Y, returning true if found, false if none found

    • check always increments if there's a constraint

    • revise should check if there's a constraint to begin with

  • review reading!

  • VVPs can be represented as tuples, may use angle brackets

  • AC-1:

    • assume the CSP has $n$ variables, each with domain size $a$, and $e$ constraints

    • $Q$ is a set of directed edges to revise, from x to y, from y to x, size $2e$

    • repeat until revisedp stays false

      • set revisedp to false

      • for every edge in $Q$

        • if Revise removes a value, set revisedp to true

        • but, if a variable's domain becomes empty, break and report – this CSP has no solution

    • once a full pass makes no change, the CSP is arc-consistent

    • Make sure to sort list of edges!

  • Unary constraints should be done first!
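AC-1 as described above, sketched in Python; the "<" chain A < B < C is an invented example. The loop keeps sweeping the whole arc list until a full pass changes nothing:

```python
# AC-1: revise every directed arc each pass; repeat until a pass is quiet.
def ac1(domains, arcs, check):
    """arcs: directed pairs (b, c); check(b, c, x, y) tests a tuple.
    Returns False on domain wipe-out, True once arc-consistent."""
    changed = True
    while changed:
        changed = False
        for b, c in arcs:
            removed = {x for x in domains[b]
                       if not any(check(b, c, x, y) for y in domains[c])}
            if removed:
                domains[b] -= removed
                changed = True
                if not domains[b]:
                    return False          # wipe-out: no solution
    return True

rel = {("A", "B"), ("B", "C")}            # both are "<" constraints (example)
check = lambda b, c, x, y: x < y if (b, c) in rel else y < x
domains = {v: {1, 2, 3} for v in "ABC"}
arcs = [(b, c) for (i, j) in rel for (b, c) in ((i, j), (j, i))]
ac1(domains, arcs, check)
print(domains)  # {'A': {1}, 'B': {2}, 'C': {3}}
```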

Lecture 8

Property  Algorithm  Complexity
AC        AC-1       $O(na^3e) = O(n^3a^3)$
  • solutions found by

    • detecting components

    • NC

    • normalize constraints

    • NC, AC, other forms of AC

    • Search

    • Verification

    • Solution, Node Visits, Constraint Checks, CPU Time

  • when implementing AC1, terminate if domain wipe-out occurs

  • worst case complexity is seldom reached

  • remember, AC may discover a solution to a CSP, or may find the problem inconsistent (by domain annihilation)

  • AC3

    • for $i \gets 1$ to $n$, do $NC(i)$

    • $Q \gets \{(i, j) | (i, j) \in arcs, i \neq j\}$

    • while $Q$ is non-empty:

      • select and delete any arc $(k, m) \in Q$

      • if $\textsc{Revise}(k, m)$ then $Q \gets Q \cup \{(i, k) | (i, k) \in arcs(G), i \neq k, i \neq m\}$

    • Check for domain wipe-out!
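A sketch of the AC-3 pseudocode above, reusing the same invented "<" chain; the difference from AC-1 is that only arcs pointing at a revised variable are re-enqueued:

```python
# AC-3: worklist of directed arcs; when Revise(k, m) prunes D_k,
# re-enqueue only the arcs (i, k) with i != m.
from collections import deque

def ac3(domains, neighbors, check):
    q = deque((i, j) for i in neighbors for j in neighbors[i])
    while q:
        k, m = q.popleft()
        removed = {x for x in domains[k]
                   if not any(check(k, m, x, y) for y in domains[m])}
        if removed:
            domains[k] -= removed
            if not domains[k]:
                return False                  # domain wipe-out
            q.extend((i, k) for i in neighbors[k] if i != m)
    return True

rel = {("A", "B"), ("B", "C")}                # both "<" constraints (example)
check = lambda i, j, x, y: x < y if (i, j) in rel else y < x
domains = {v: {1, 2, 3} for v in "ABC"}
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
print(ac3(domains, neighbors, check), domains)
```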

Lecture 9

  • Remember to deal with Node Consistency – and don't include it in the check count

  • see lecture 8 notes

  • AC3 enforces the same property as AC-1, but more efficiently

  • the number of calls to Revise is at most $2e + a(2e - n)$

  • AC3 is $O(a^2 (2e + a(2e - n))) = O(a^3e) = O(a^3n^2)$

  • AC4 is even more efficient, but requires special bookkeeping, is $O(a^2n^2)$

  • A variant of AC-3, called AC-2001, requires cubic space; it has yet another variant, AC-3.1$^{m}$, which requires different bookkeeping

  • AC4

    • generate m, s, and counter

      • m and s have as many rows as there are VVPs in the problem, m is set to true, s to empty list

      • m is for alive

      • s is for a list of all VVPs that support a given VVP

      • counter: given a VVP, the number of supports provided to it by a given variable

    • prepare data structures

      • for every constraint all tuples:

        • when allowed, update s

          • increment counter

    • Then process all constraints

      • if counter is 0, remove from domain, and update $m$

      • since $x$ is removed, then for everything such that it was a support, decrement the counter

  • AC3 tends to be a bit better than AC4

Lecture 10

  • soundness – can trust

  • complete – will always find a solution if exists

  • more efficient AC variants are useful for use during search

  • but when enforcing AC during search, you must be able to keep track of what was removed

  • remember properties are not the same as the algorithms! there may be many algorithms to implement each property!

  • most consistency methods are local initially

    • node/arc remove inconsistent values

    • path removes inconsistent tuples

      • used when you have positive tables of supports

    • they get closer to the solution

    • reduce the size of the problem

    • are cheap (polynomial time)

    • choice of consistency properties helps to reduce time to solution

Intelligent Backtracking Algorithms

  • Read Prosser, CI 93. Hybrid Algorithms for the Constraint Satisfaction Problem

  • Consider Reading 5/6 of Dechter

  • Issues

    • Variable/Value ordering

    • Variable Instantiation

    • Current Path

    • Current Variable

    • Past Variables

    • Future Variables

    • Shallow/Deep Levels/Nodes

    • Search space/Search Tree

    • Back-checking

    • Backtracking

  • As you go, un-assigned variables are the future variables/future sub-problem

  • those that have been instantiated are the past variables

  • the current path is current instantiations

  • always check a new instantiation against past variables (back-checking)

  • algorithms

    • Vanilla: BT

    • Improved Backward: BJ (back-jumping), CBJ (conflict-directed BJ)

    • Improving Forward: BM (back-marking), FC (forward-checking)

  • D-Way branching is conceptually easier to understand than 2-way

  • algorithms from Prosser are iterative back-tracking and pluggable/modularized

  • Kondrak and van Beek looked at the efficiency of algorithms

Lecture 11

  • Variables $V_i$, $i \in [1, n]$

  • Domains $D_i = \{v_{i1}, v_{i2}, \ldots, v_{iM_i}\}$

  • Constraint between $V_i$ and $V_j$ : $C_{i,j}$

  • Constraint graph $G$

  • Arcs of $G$: $\mathrm{Arc}(G)$

  • Instantiation order is static or dynamic

  • lang primitives are from lisp

  • Data structures

    • $v$ is a 1 by $n$ array to store assignments, $v_0$ is the root, backtracking to this indicates insolvability, $v_i$ is the assignment to the $i$th variable

    • $domain$ is the original domain

    • $current-domain$ is the current domain, must be updated on back-tracking

    • check is as before

  • Generic form

    • takes n, status

    • set consistent to true

    • status to unknown

    • $i \gets 1$

    • while the status is unknown

      • if consistent

        • $i \gets \mathrm{label}(i, consistent)$

        • else $i \gets \mathrm{unlabel}(i, consistent)$

      • if $i > n$

        • set status to solution

        • else if $i = 0$, set status to impossible

  • label and unlabel functions are provided by the backtrack algorithm

  • label is forward move, unlabel is backward move

  • For BT

    • label

      • when $v_i$ is assigned a value from the $current-domain_i$, we perform back-checking against past variables

      • if back-checking succeeds

      • return $i + 1$

      • else, remove assigned value from current-domain[i], assign next value, etc

      • if no value exists, set consistent to false

      • function

        • takes i, consistent

        • set consistent to false

        • for v[i] in each element of the current domain while not consistent

          • set consistent to true

          • for h from 1 to i - 1 and while consistent

            • set consistent to check(i, h)

          • if not consistent

            • then remove i from current-domain

        • if consistent, return i+1, otherwise i

    • unlabel

      • current level is set to $i - 1$

      • for future variables $j$, reset the current-domain

      • if the domain of $h$ is not empty, set consistent to true

      • Function

        • takes i, consistent

        • set h to i - 1

        • set current-domain[i] to domain[i]

        • remove v[h] from current-domain[h]

        • set consistent to current-domain[h] not-equal empty-set

        • return h

    • label does the actual instantiation
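The generic driver plus BT's label/unlabel above fit together as follows; a Python sketch with 1-based arrays ($v[0]$ as the root) and tuple returns in place of the notes' in/out parameters. The triangle-coloring check is an invented example:

```python
# Iterative BT in the generic label/unlabel form of the notes.
def bt_search(n, domain, check):
    v = [None] * (n + 1)                                # v[0] is the root
    current = [list(domain[i]) for i in range(n + 1)]   # current-domain
    i, consistent, status = 1, True, "unknown"
    while status == "unknown":
        if consistent:
            i, consistent = label(i, v, current, check)
        else:
            i, consistent = unlabel(i, v, current, domain)
        if i > n:
            status = "solution"
        elif i == 0:
            status = "impossible"
    return status, v[1:]

def label(i, v, current, check):                        # forward move
    for value in list(current[i]):
        v[i] = value
        if all(check(i, h, v[i], v[h]) for h in range(1, i)):  # back-check
            return i + 1, True
        current[i].remove(value)          # value conflicts with the past
    return i, False

def unlabel(i, v, current, domain):                     # backward move
    h = i - 1
    current[i] = list(domain[i])          # reset the future domain
    if h >= 1:
        current[h].remove(v[h])           # discard the failed assignment
        return h, bool(current[h])
    return 0, False                       # backtracked past the root

domain = [[]] + [[1, 2, 3] for _ in range(3)]           # triangle, 3 colors
status, sol = bt_search(3, domain, lambda i, h, xi, xh: xi != xh)
print(status, sol)  # solution [1, 2, 3]
```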

Lecture 12: Phase Transitions

  • Cheeseman 1991

  • Order parameter

  • Probability solution exists for random problems is almost 1

  • there's a critical value whereby, after it, probability is almost 0

  • around the critical value, the probability is around 0.5

  • the cost of solving drops sharply to the right of the critical value, but rises steeply as problems approach it

  • around 0.5 probability, the cost of solving is high no matter the algorithm – this is referred to as a phase transition

  • conjecture regarding the characterization of NP complete problems

  • applies to detecting/implementing arc-consistency

  • random graphs are almost always easy to color – conjecture from a famous paper

    • graph coloring is a reduction operator

    • friends are vertices with the same neighborhood

    • this is the degree

  • For CSPs, it's either density or tightness

  • this currently affects how experiments are conducted – try to deal with the hardest problems

  • but be careful not to focus exclusively on the region around the critical value

  • Run on random CSPs – given statistical analysis

  • Vary params $\langle n, a, t, p \rangle$

    • $n$ – number of variables

    • $a$ – domain size

    • $t$ – tightness $\frac{|\mathrm{forbidden tuples}|}{|\mathrm{all tuples}|}$

    • $p$ – proportion of constraints $p = \frac{e}{e_{max} - e_{min}}$, also $\frac{2e}{n(n-1)}$

    • $t$ and $p$ will be between 0 and 1

    • Can have issue with uniformity, difficulty (phase transition), solvability

    • empirical studies can be done simply by varying one of the parameters

    • do it (Model B):

      1. generate $n$ nodes

      2. generate a list of $\frac{n(n-1)}{2}$ tuples of all combinations of 2 nodes.

      3. choose e elements from the above list as constraints to go between the $n$ nodes

      4. if the graph is not connected, throw it away and go back to step 3, otherwise proceed

      5. generate a list of $a^2$ tuples of all combinations of 2 values

      6. For each constraint, choose randomly a number of tuples from the list to guarantee tightness $t$ for the constraint.

    • Model B instances may be flawed, but it's rare
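Steps 1-6 above can be sketched as a Model B generator; parameter names follow $\langle n, a, t, p\rangle$, with $e$ passed directly, conflicts stored per constraint, and the connectivity retry implementing step 4:

```python
# Model B generator: n nodes, e random constraints over a connected graph,
# round(t * a^2) forbidden tuples per constraint.
import random
from itertools import combinations

def connected(n, edges):
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j); adj[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        for j in adj[stack.pop()]:
            if j not in seen:
                seen.add(j); stack.append(j)
    return len(seen) == n

def model_b(n, a, e, t, seed=0):
    rng = random.Random(seed)
    while True:                                              # steps 1-4
        edges = rng.sample(list(combinations(range(n), 2)), e)
        if connected(n, edges):
            break
    pairs = [(x, y) for x in range(a) for y in range(a)]     # step 5
    nogoods = round(t * a * a)                               # step 6
    return {edge: set(rng.sample(pairs, nogoods)) for edge in edges}

csp = model_b(n=6, a=4, e=8, t=0.25)
print(len(csp), all(len(bad) == 4 for bad in csp.values()))  # 8 True
```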

Lecture 13: Back-tracking: Continued

  • keep track of path – an array of the instantiations thus far and to be created

  • remember what unlabel does – it actually performs the backtracking

  • many different ways to order variables

  • when doing value-ordering, use lexicographic ordering!

  • eventually, variable ordering heuristics must be broken with lexicographic ordering

Back-Jumping

  • uses $max-check$, an array, and when check(i, h) succeeds, $max-check[i] \gets max(max-check[i], h)$

  • if the current domain of $h$ is empty, chronological back-tracking is performed from h

  • you go back to $max-check[i]$ to fix stuff when you have an empty variable, unless you've already back-jumped in the current path

  • BJ just modifies BT-label and BT-unlabel

    • BJ-label just updates $max-check$

    • bj-unlabel

      • backtrack to h = max-check[i]

      • resets max-check[j] = 0 for j in $h+1$ to $i$

Conflict-directed BJ

  • requires a new data structure, the conflict-set array

    • tracking at what levels conflicts occur

  • knows how to jump-back again, to the deepest level of conflict

  • can jump-back more than once!

  • useless if good variable ordering

  • conflict sets are initialized to $\{0\}$

  • At any point, conf-set is a subset of past variables that are in conflict with $i$

  • when check(i, h) fails add h to conf-set[i]

  • when domain is empty

    • jumps to deepest past variable in conf-set

      • update $conf-set[h] = conf-set[h] \cup conf-set[i] \setminus \{h\}$

  • To calc all solutions – don't use CBJ

Lecture 14: Back-tracking: Continued

CBJ Continued

  • max-check – deepest level checked against – if instantiated, will be $i - 1$

  • conflict set is used for holding what's conflicted

  • Finding all solutions with CBJ

    • use the conflict sets

    • or use CBF from Kondrak

    • keep track of number of solutions

  • make sure to check generated solutions with the solution checker!!

  • modify the algorithm so that, after a solution is found, it backtracks to every level at least once!

    • in actuality, each level must be backtracked to exactly once

  • look at Chris Thiel's response

    • when a solution is found, force the last var to conflict with everything before it. In turn, this forces some chrono backtracking as the conf-sets are propagated backward

  • Kondrak's response

    • cbf flag, vector

      • $\forall i, cbf[i] \gets 0$

      • when you find a solution, $\forall i, cbf[i] \gets 1$

      • In unlabel

        • if $cbf[i] = 1$

          • $h \gets i - 1$, $cbf[i] = 0$

          • else $h \gets max(conf-set[i])$

Lecture 14: Forward Steps

Backmarking

  • tries to reduce the amount of consistency checking

    • $v[i]$ to reassign to $k$

    • $v[i] \gets k$ last checked against $v[h] \gets g$

    • $v[h]$ has not been modified

  • not super useful – as consistency enforcement is more useful

  • this helps to reduce back checking

  • data structures, max back-check level, minimum backup level

    • $mcl$ max check level, $n \times m$ array, $mcl[i, k]$ stores deepest variable that $v[i] \gets k$ was checked positively against, finer-grained mcl

    • $mbl$ min. backup level, $n$ array, gives the shallowest past variable whose value has changed since $v[i]$ was the current variable

  • BM does not allow dynamic variable ordering

  • BM is aware that the deepest variable $v[i] \gets k$ was checked against is $v[j]$

  • values of variables in the past of $v[j]$ ($h<j$) have not changed

  • so we need to check $v[i] \gets k$ against the values of the variables between $v[j]$ and $v[i]$

  • we do not need to check against the values of the variables in the past of $v[j]$

  • when $mcl[i,k] < mbl[i]$: don't check – it fails again (the variable it failed against hasn't changed) – type A savings

  • when $mcl[i,k] \geq mbl[i]$: the checks against variables shallower than $mbl[i]$ still succeed, so only re-check from $mbl[i]$ on – type B savings
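The two savings rules can be stated as one decision function; a sketch, assuming $mcl[i,k]$ stores the deepest level $v[i] \gets k$ was checked against:

```python
# Backmarking decision: compare mcl[i][k] with mbl[i] before checking.
def bm_decision(mcl_ik, mbl_i):
    if mcl_ik < mbl_i:
        # type A: the level the value failed against has not changed since
        return "fail-without-checking"
    # type B: checks shallower than mbl[i] still hold; re-check from mbl[i] on
    return "recheck-from-mbl-only"

print(bm_decision(2, 4))  # fail-without-checking
print(bm_decision(4, 2))  # recheck-from-mbl-only
```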

Forward checking

  • Looks ahead from the current variable, considering all future variables, and clears from their domains the values that aren't consistent with the current partial solution

  • FC means more work in the future, but causes fewer nodes to expand

  • When it moves forward, the values of the current domains of future variables are all compatible with past assignments, saves backchecking

  • FC may cause domain wipe-out, but it discovers conflicts early on and will backtrack

  • goal is to find failures early – avoids expanding fruitless subtrees

  • is not truly forcing arc consistency, that's RFL

  • requires a DS to keep track of filtering, can be done in various ways

Lecture 15: Forward Steps, Continued

Forward Checking

  • look-ahead – revises future variables

  • makes more work at instantiation, but expands fewer nodes

  • as you move forward, values in future variables are consistent with past variables

  • can cause wipe-out and conflict discovery early on, backtracking chronologically

  • requires current domain, stack of variables removed by previous instantiations

  • future-fc is the subset of future variables that v[i] checks against

  • past-fc is past variables that checked against v[i]

  • requires functions

    • check-forward – called when instantiating v[i]

      • revise(j, i)

      • returns false if current-domain[j] is empty, true otherwise

      • values removed from current-domain[j] are pushed as a set onto reductions[j]

    • update-current-domain

      • $current-domain[i] \gets domain[i] \setminus reductions[i]$

    • fc-label

      • attempt to instantiate current-variable

      • filters domains of current

      • on wipe-out of future variable, uninstantiate and undo domain filtering

  • Suffers from thrashing, as it is based on BT
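The check-forward / undo pair above, sketched in Python; the argument convention check(j, i, y, x) and the per-call reduction frames (one pushed even when empty, so pops stay aligned) are implementation assumptions:

```python
# check-forward / undo pair: prune future domains against v[i],
# recording prunings in reductions so backtracking can restore them.
def check_forward(i, j, v, current, reductions, check):
    """Revise(j, i) against the assignment v[i]; False on wipe-out of j."""
    removed = [y for y in current[j] if not check(j, i, y, v[i])]
    for y in removed:
        current[j].remove(y)
    reductions[j].append(removed)      # push even if empty: one frame per call
    return bool(current[j])

def undo_reductions(i, future, current, reductions):
    for j in future:                   # pop this instantiation's prunings
        current[j].extend(reductions[j].pop())

v = [None, 2, None]                    # v[1] just assigned the value 2
current = [None, [2], [1, 2, 3]]
reductions = [None, [], []]
ok = check_forward(1, 2, v, current, reductions, lambda j, i, y, x: y > x)
print(ok, current[2])                  # only 3 survives "y > 2"
undo_reductions(1, [2], current, reductions)
print(sorted(current[2]))              # domain of 2 fully restored
```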

Variable Ordering

  • dom, least domain, smallest domain first

  • degree of variable (number of neighbors)

  • dom/deg is most common

  • Degree is difficult, as it changes very rapidly

    • can/should be implemented statically
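A dynamic dom/deg selection can be sketched in one line; the degrees here are static, as the notes suggest, and the example numbers are invented:

```python
# dom/deg ordering: pick the future variable minimizing
# |current-domain| / degree, breaking ties lexicographically.
def select_variable(future, current, degree):
    return min(future, key=lambda v: (len(current[v]) / degree[v], v))

current = {"A": {1, 2}, "B": {1, 2, 3}, "C": {1, 2}}
degree = {"A": 1, "B": 2, "C": 4}
print(select_variable(["A", "B", "C"], current, degree))  # C
```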

Consistency Check Summary

  • CBJ

    • uses back-jumping driven by the conf-set array

  • backmarking uses mcl and mbl

    • two types of consistency-checking savings

  • forward checking

    • works more at each instantiation, but expands fewer subtrees

    • uses reductions, future-fc, past-fc

  • FC-CBJ is the best. Period

  • Other Backtracking algos

    • Graph-based back-jumping – Dechter

    • Pseudo-trees [Freuder, 85]

  • Other lookahead – DAC, MAC – DAC seems to be pretty good as a compromise between MAC and FC

  • More empirical evals

    • now by generating random problems

  • theoretical comparisons

Implementation notes

  • Enforce NC

  • Normalize constraints

  • check for empty relations

  • Interrupt as soon as you detect domain wipe-out

  • Remember dynamic variable ordering includes the domino effect

    • a/k/a pick singleton domains first

Lecture 15: Intelligent Backtracking – Theoretical Eval

  • Kondrak and van Beek, IJCAI 95

  • Dechter 5, 6

  • most backtracking algorithms are evaluated empirically

  • performance depends heavily on problem instance

  • is therefore average case analysis

  • representativesness of the test problems

  • CSP is NP-complete – always possible to construct examples where BJ/CBJ are worse than BT

  • comparison criteria are different

  • paper gives a theoretical approach

    • compares by number of nodes visited

    • and by number of constraint checks

  • BT, CBJ perform the same amount of consistency checks

  • BM reduces consistency checks

  • BM does not affect the number of nodes visited

  • FC-CBJ may visit more nodes than CBJ

  • proves correctness of BJ/CBJ

  • determines partial order between algos

  • proves BJ, CBJ are correct

  • proves FC never visits more nodes than BJ

  • Improves BMJ and BM-CBJ to perform even fewer consistency checks

  • provides framework for characterizing future BT algorithms

  • BT extends partial solutions – a partial solution is consistent with a set of uninstantiated variables if it can be consistently extended to them – that is, extended by one variable into a new partial solution that is still consistent

  • Dead end – when all values of current variables are rejected

  • lower level are closer to the root

  • higher levels are closer to the fringe

  • 2 BT algos are equiv if on every csp they generate the same tree and perform the same consistency checks

  • Lemma 1: If BJ backtracks to variable $x_h$ from a dead end at $x_i$, then the partial solution $(a_1, \ldots, a_h)$ cannot be consistently extended to $x_i$

Lecture 16: Theoretical Eval, Continued

  • sufficient conditions

    • BT visits a node if its parent is consistent

    • BJ visits a node if its parent is consistent with all variables

    • CBJ visits a node if its parent is consistent with all sets of variables

    • FC visits a node if it is consistent and its parent is consistent with all variables

  • Necessary Conditions

    • BT visits a node only if its parent is consistent

    • BJ visits a node only if its parent is consistent with all variables

    • CBJ visits a node only if its parent is consistent with all sets of variables

    • FC visits a node only if it is consistent and its parent is consistent with all variables

  • BT visits all nodes FC visits

  • BJ visits all nodes CBJ visits

  • CBJ and FC are not comparable.

  • FC will visit no more nodes than BJ

  • BT and BM are equivalent, as are BJ and BMJ

  • proof of correctness – termination, soundness, completeness

  • extends for seeking all solutions

  • BM hybrids should be fixed, by changing MBL to an $n \times m$ array, creating BM2, BMJ2 and BM-CBJ2

Lecture 17: Search orders in CSPs

  • assumptions

    • finite domains

    • binary constraints

  • Chapter 6, Tsang is req'd reading

  • Dual viewpoint heuristics for binary constraint satisfaction problems is recommended reading

  • In BT, some orders have fewer backtracks

  • in lookahead, some orderings detect failure earlier

  • variable ordering is often the most constrained variable

  • value ordering is often the most promising value first

  • using dynamic ordering in lookahead algos often removes the need for intelligent backtracking

  • R\'egin and Bessi\`ere show that CBJ and similar aren't needed – but not so sure

  • Select values that don't cause constraint violation – do the most promising value first, avoiding constraint violation

  • Discover constraint violation quickly – select variables that don't delay discovery of constraint violation, go with the most constrained variable first

  • Notation: For a given future variable $V_j$, the assignment $V_i \gets x$ partitions the domain of $V_j$ into a set of values consistent with the assignment, of size $\mathrm{Left}(V_j | V_i \gets x)$, and a set of inconsistent values, of size $\mathrm{Lost}(V_j | V_i \gets x)$

  • Value selection of Lost and Left

  • Value Ordering to quickly find a solution

    • min-conflict – orders values according to the conflicts in which they are involved with the future variables – most popular (Minton), $Cost(V_i \gets x) = \sum_{V_j \neq V_i} \mathrm{Lost}(V_j | V_i \gets x)$

    • Cruciality – (Keng and Yu, 1989) $Cruciality(V_i \gets x) = \sum_{V_j \neq V_i} Lost(V_j | V_i\gets x) / |D_{V_j}|$

    • promise – most powerful (Geelen, 1992) $Promise(V_i \gets x) = \prod_{V_j \neq V_i} Left(V_j | V_i \gets x)$, the number of assignments that can be made such that no constraint on $V_i$ is broken.

    • Others are defined

  • Promise

    • recognizes domain wipe-out

    • product discriminates better than sum

    • semantics of promise gives an upper bound on number of solutions
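The Left/Lost bookkeeping behind min-conflict and promise, sketched for a single candidate value; the "≠" relation and the domains are invented:

```python
# Value-ordering scores: Left counts supported future values,
# promise multiplies them, min-conflict sums the Lost counts.
def left(x, d_j, check):
    return sum(1 for y in d_j if check(x, y))

def promise(x, future_domains, check):
    p = 1
    for d_j in future_domains:
        p *= left(x, d_j, check)     # a factor of 0 means x wipes out some D_j
    return p

def min_conflict_cost(x, future_domains, check):
    return sum(len(d_j) - left(x, d_j, check)    # Lost = |D_j| - Left
               for d_j in future_domains)

future = [{1, 2, 3}, {2, 3}]
neq = lambda x, y: x != y
print(promise(2, future, neq), min_conflict_cost(2, future, neq))  # 2 2
```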

  • FFP – the fail-first principle; the name is contested

    • goal is to recognize dead-ends asap to save search effort

    • choose ordering that reduces branching factor

    • Early ones

      • LD

      • maximal degree

      • minimal ratio of domain size over degree

      • promise for variables

      • Br\'elaz heuristic

      • weighted degree (Boussemart et al, ECAI 2004)

      • graph-based

        • minimal width

        • induced width ordering $w^*$

        • maximal cardinality ordering

        • minimal bandwidth ordering

      • remember to exploit the domino effect, especially under dynamic var ordering

      • always state how you are breaking ties – generally lexicographically

      • is often cheap and effective

      • in most cases, suitable for both static and dynamic ordering

    • Promise for variables – Sum of the promises of the values, minimize the number of assignments that can be done such that no constraint on $V_i$ is broken. Has upper bound on the number of solutions

Lecture 18: Ordering Heuristics

  • Least Domain

  • Max Domain

  • Domain/Degree

  • Promise is based on what has the most promise given the current state

  • Br\'elaz heuristic – originally used for graph coloring

  • Remember, variable ordering heuristics are cheap and effective

  • generally suitable for static and dynamic ordering

  • graph-based heuristics aren't often used anymore

  • Br\'elaz

    • originally designed for coloring

    • assigns most constrained nodes first

    • arrange vars in decreasing order of degrees

    • color a vertex with maximal degree with color 1

    • choose a vertex with a maximal saturation degree (number of different colors to which is it adjacent)

        • if there's an equality, choose any vertex of maximal degree among the uncolored

    • color chosen vertex, with lowest numbered color

    • if all vertices are colored, stop, otherwise, choose another vertex and continue
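
A compact sketch of the Brélaz/DSATUR procedure above, with `graph` mapping each vertex to its neighbor set (an assumed input format); ties here are broken by degree and then arbitrarily:

```python
# Sketch of Brélaz (DSATUR) greedy coloring: repeatedly color the vertex
# with maximal saturation degree, using the lowest-numbered feasible color.
def brelaz_coloring(graph):
    colors = {}
    while len(colors) < len(graph):
        uncolored = [v for v in graph if v not in colors]
        # saturation degree: number of distinct colors among neighbors
        def saturation(v):
            return len({colors[n] for n in graph[v] if n in colors})
        # max saturation, ties broken by degree (then arbitrarily)
        v = max(uncolored, key=lambda u: (saturation(u), len(graph[u])))
        # lowest-numbered color not used by any neighbor
        used = {colors[n] for n in graph[v] if n in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors
```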

  • wdeg

    • All constraints are assigned a weight, initially 1

    • every time a constraint is broken, during propagation with look-ahead (constraint causing domain wipe-out), weight is increased

    • the weight of an un-assigned variable is defined as the sum of the weights of the constraints that apply to the variable

    • the variable with the largest weight is chosen for instantiation

    • weights are not reinitialized at back-track

    • refined with dom/wdeg

    • inspired by breakout heuristic in local search

    • However, there may be places where dom/deg is more efficient than dom/wdeg

    • A decay function may help improve this
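
The wdeg bookkeeping can be sketched as below; constraint scopes as tuples of variable names and the wipe-out callback are illustrative assumptions about the data layout:

```python
# Sketch of the wdeg variable-selection rule: constraint weights start at 1,
# get bumped whenever a constraint causes a domain wipe-out during
# look-ahead, and a variable's weight is the sum over constraints on it.
class Wdeg:
    def __init__(self, constraints):
        # constraints: iterable of tuples of variable names (scopes)
        self.weight = {c: 1 for c in constraints}

    def on_wipeout(self, constraint):
        # called when propagation finds this constraint caused a wipe-out
        self.weight[constraint] += 1

    def var_weight(self, var):
        # weighted degree of an unassigned variable
        return sum(w for c, w in self.weight.items() if var in c)

    def pick(self, unassigned):
        # choose the variable with the largest weighted degree
        return max(unassigned, key=self.var_weight)
```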

  • Graph heuristics

    • ordering by elimination or instantiation

    • minimal width

      • given an ordering, the width of a node is the number of its parents

      • then the ordering's width is the largest width of any of the nodes

      • the sum of the widths of the nodes is always equal to the number of edges

      • take minimum of the orders, by width – this is the width of the graph

      • this can reduce chance of backtracking, can help to find minimum width ordering

      • Algo:

        • remove all nodes not connected to any others

        • $k \gets 0$

        • while there are nodes left in the graph

          • $k \gets k + 1$

          • while there are nodes not connected to more than k others

            • remove such nodes from graph, along with any edges connected to them

        • return $k$

        • minimal width ordering of the nodes is obtained by taking the nodes in the reverse order that they were removed
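
The minimal-width procedure above, sketched in Python (`graph` maps nodes to neighbor sets; an assumed representation):

```python
# Sketch of the minimal-width algorithm: repeatedly strip nodes of degree
# <= k, incrementing k when stuck; the reverse removal order is a
# minimal-width ordering of the graph.
def min_width(graph):
    adj = {v: set(ns) for v, ns in graph.items()}
    order = [v for v in adj if not adj[v]]      # isolated nodes first
    for v in order:
        del adj[v]
    k = 0
    while adj:
        k += 1
        low = [v for v in adj if len(adj[v]) <= k]
        while low:
            for v in low:
                for n in adj[v]:
                    adj[n].discard(v)           # drop incident edges
                del adj[v]
                order.append(v)
            low = [v for v in adj if len(adj[v]) <= k]
    # k is the width; instantiate in the reverse of removal order
    return k, list(reversed(order))
```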

    • Inducted Width Ordering

      • What happens when you reconnect neighbors of what you remove to each other

Guest Speaker: Philip Walton, Workday

  • From Workday, SaaS company for HR management

  • work has primarily been in Data Analysis and optimization

  • go from techniques and technologies to money

  • Could be a Linear Program

    • Measures, Items (index sets)

    • indices i, m

    • variables Quantity, 0 to N

    • model sets, limit, value, weight

    • maximize value and quantity

    • but must be within weight

  • big thing is assigning resources in a data center

  • truck scheduling

    • many truck loads that a company has to move

      • origin and destination specified

      • and time windows at origin and destination

    • many drivers, many trucks

    • some time horizon

    • DOT rules

    • operational rules

      • returning to base

      • equipment preventative maintenance

    • Taking a run at the whole thing is probably too much

      • run a route model to pick loads that go together

      • schedule loads – the big thing for SCPs

      • drivers pick the sets of loads they prefer

  • modeling

    • discrete time forms an integer domain

      • can be clumpy

    • use start time of the task as the domain

    • constraints like before and after are relatively straightforward

    • duration of driving/time windows are pretty easy to express

    • small disruptions are normally readily repaired by freeing up parts of the model and letting it re-solve

    • time windows tend to be pretty large

    • isn't always a notion of an optimal schedule

    • side constraints can be pretty easy to represent

      • work rules aren't always easy

      • drivers may have odd drive/sleep patterns

      • side resource use is tolerable

  • Robustness for many things has to be dealt with, but how that's done is very different

  • And remember how important interface is – they're both important, and very, very different

  • And even more important is understanding what's going on within the problem domain

  • good large software is built on top of amazing small software

Lecture 19: Variable Orderings, continued

Induced Width Ordering, $\omega^{*}$

  • requires designing induced constraints – can cause much, much less backtracking

  • these are called fill-in edges

  • this procedure moralizes the graph

  • computing min induced width is NP hard

  • MIN FILL heuristic is used

    • when removing a node, add an edge between every two nodes connected to the node, but not already connected to each other

    • remove the node whose elimination adds the smallest number of fill-in edges

    • Remember to remove the node with the smallest fill-in first!
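
A sketch of the MIN-FILL elimination ordering described above (neighbor-set representation assumed):

```python
# Sketch of the MIN-FILL heuristic: at each step eliminate the node whose
# elimination adds the fewest fill-in edges, connecting its remaining
# neighbors into a clique before removing it.
from itertools import combinations

def min_fill_ordering(graph):
    adj = {v: set(ns) for v, ns in graph.items()}
    order = []

    def fill_in(v):
        # number of edges needed to make v's neighborhood a clique
        return sum(1 for a, b in combinations(adj[v], 2) if b not in adj[a])

    while adj:
        v = min(adj, key=fill_in)              # fewest fill-in edges
        for a, b in combinations(adj[v], 2):
            if b not in adj[a]:                # add the fill-in edge
                adj[a].add(b)
                adj[b].add(a)
        for n in adj[v]:
            adj[n].discard(v)
        del adj[v]
        order.append(v)
    return order
```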

  • A tree is a graph with induced width 1

  • An ordering has a width, and so does a graph: the graph's width is the minimum over all orderings, and likewise for induced width. $\omega^{*}$ is called the induced width or tree width.

  • triangulated or chordal graphs are graphs in which every cycle of length 4 or more has a chord (an edge between two vertices that are not adjacent in the cycle)

    • Makes it easy to find largest clique, can actually be done in polynomial time

    • is a type of perfect graphs

    • interval graphs are a type of triangulated graph

    • finding minimum number of triangulating edges is also np-hard

Maximal Cardinality ordering

  • Choose an arbitrary node

  • among remaining nodes

    • choose the one connected to the maximum number of already chosen nodes

    • break ties arbitrarily

  • Repeat until done

  • Obtained ordering is instantiation ordering

  • reverse ordering yields a PEO

  • this is the perfect elimination ordering when reversed

  • If the graph does not need to be moralized, the graph is triangulated
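
The maximal-cardinality ordering is short enough to sketch directly (neighbor sets assumed; ties broken by iteration order rather than a stated rule):

```python
# Sketch of maximal-cardinality ordering: start anywhere, then repeatedly
# pick the node adjacent to the most already-chosen nodes. Reversing the
# result gives a perfect elimination ordering on a triangulated graph.
def max_cardinality_ordering(graph, start):
    chosen = [start]
    while len(chosen) < len(graph):
        rest = [v for v in graph if v not in chosen]
        v = max(rest, key=lambda u: len(graph[u] & set(chosen)))
        chosen.append(v)
    return chosen          # reverse for a perfect elimination ordering
```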

Lecture 20: Variable Orderings, continued

Max Cardinality, Continued

  • width of an induced ordering is same as graph

  • MWO can help to guarantee back-track free search

  • Lots of algos came from RDBMs

  • Computing the induced width is NP Hard!!!

  • Simplicial graph is such that looking at an elimination ordering, all parents of a node form a clique

  • moralized is triangulated

Minimal Bandwidth Ordering

  • bandwidth is

    • of a node – the largest distance (in the ordering) between the node and its neighbors

    • of an ordering – the largest bandwidth of the nodes in the ordering

    • of a graph – the smallest bandwidth across all possible orderings

  • heuristic localizes and confines backtracking

  • smaller bandwidth, sooner ability to backtrack

  • find minimum bandwidth ordering is np-hard

  • but finding an ordering of a given bandwidth is polynomial

  • Tsang 6.2.2, only one paper
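
The three bandwidth notions above reduce to one small computation once an ordering is fixed; a sketch (neighbor-set graphs assumed):

```python
# Sketch of the bandwidth of an ordering: the largest distance, in
# ordering positions, between any node and one of its neighbors.
def ordering_bandwidth(graph, ordering):
    pos = {v: i for i, v in enumerate(ordering)}
    return max(abs(pos[v] - pos[n]) for v in graph for n in graph[v])
```

The graph's bandwidth would be the minimum of this over all orderings, which (as noted above) is NP-hard to find.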

Ordering Heuristics

  • When

    • static var, val ordering

    • dynamic var, static val

    • dynamic var, dynamic val – dynamic vvp

  • at each given level

    • VVPs pertaining to the same variable across a given level, static-static

    • Vars are different, dyn-static

    • vvps are different at each and every level

  • Computing

    • sort vars at preprocessing for static

    • dynamic selects var during search, based on current domain, all uninstantiated neighbors, exploits the domino effect!!

Lecture 21: Filtering For Constraints of Difference

  • GAC – Generalized Arc Consistency, for non-binary CSPs

    • Operates on table constraints, by supports or by no-goods

    • $\pi_A(R)$ gives a new relation with only values of A

      • May be a Set, or a Bag

  • finite csps are linear in space, quadratic time when dealing with all-diff constraints

  • Sometimes, all diff may become many constraints that together are equivalent, but not always

  • GAC4 handles n-ary constraints

    • has good pruning power

    • but is very expensive – $\frac{d!}{(d-p)!}$

  • a value is consistent if there exist values for all other variables in the scope such that, together, they simultaneously satisfy the constraint

  • a constraint is consistent if all values for all variables are consistent

  • csp is arc-consistent if all constraints are consistent

  • csp is diff-arc-consistent iff all all-diff constraints are arc-consistent

  • value graphs are bipartite

    • vertices are $X_C$, $\cup_{x \in X_C}(D_x)$

    • edges $(x_i, a)$ iff $a \in D_{x_i}$

  • Space is $\mathcal{O}(nd)$

  • Matching a subset of edges in G with no vertex in common

  • max matching is the biggest possible

  • matching covers a set $X$ – every vertex in $X$ is an endpoint for an edge in matching

  • $M$ that covers $X_C$ is a max matching.

  • If every edge in $GV(C)$ is in a matching that covers $X_C$, $C$ is consistent.

  • CSP is diff-arc-consistent iff, for every all-diff $C \in \mathcal{C}$, for every edge $GV(C)$ belongs to a matching that covers $X_C$ in $GV(C)$

  • Build graph, compute max matching, if size of max matching is smaller than $X_C$, return false, otherwise remove edges from G given the max matching and return true

    • use Maximal Flow in Bipartite graph

    • Or Hopcroft/Karp algo
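
A minimal satisfiability check for one all-diff constraint via maximum bipartite matching; this uses a simple augmenting-path matching rather than maximal flow or Hopcroft-Karp, purely for illustration:

```python
# Sketch: an all-different constraint is satisfiable iff a maximum
# matching of the bipartite value graph covers every variable. This uses
# a basic augmenting-path matching (Kuhn's algorithm), not Hopcroft-Karp.
def all_diff_satisfiable(domains):
    match = {}                         # value -> variable holding it

    def augment(var, seen):
        # try to match var, reassigning earlier variables if needed
        for val in domains[var]:
            if val in seen:
                continue
            seen.add(val)
            if val not in match or augment(match[val], seen):
                match[val] = var
                return True
        return False

    return all(augment(v, set()) for v in domains)
```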

  • Then we have an all-dif constraint, value graph and a maximum covering

    • remove edges that aren't in a matching covering

  • given a Matching M

    • matching edge is an edge in M

    • free edge is an edge not in M

    • matched vertex is incident to a matching edge

    • free vertex is otherwise

    • alternating path (cycle) – a path (cycle) whose edges are alternatively matching and free

      • flip colors on even alternating path – gives another matching of same size

      • flip colors on alternating cycle – another matching

    • length of path – number of edges in path

    • vital edge belongs to every maximum matching

  • An edge belongs to some but not all maximum matchings iff, for an arbitrary maximum matching $M$, it belongs to either an even alternating cycle or an even alternating path that begins at a free vertex

  • Edges to remove are those in no maximum matching: not vital, not on an even alternating path starting at a free vertex, and not on an even alternating cycle.

  • every directed cycle in $G_0$ corresponds to an even alternating cycle of $G$

  • every directed simple path starting at a free vertex corresponds to an even alternating path of $G$ starting at a free vertex

  • make black point one way, red the other

  • allows for all-diff and other global constraints

Lecture 22: CSPs and Relational DB

  • connections

    • constraint databases and constraint logic programming

    • query processing and solving csps

  • shared

    • constraint DBs (deductive DBs, Datalog) and constraint logic programming share representation

    • relational databases and constraint satisfaction share computations

  • Constraint databases use LP – Prolog for DB, Datalog

  • Prolog was a bit of a failed experiment

  • CLP(R,F) is prolog with CSP techniques for either reals or finite domains

  • Relations

    • binary relations – given $D_a$ and $D_b$, a set of any 2-tuples $\langle x, y \rangle$ with $x \in D_a$ and $y \in D_b$, this defines a relation $R_{ab}$, $R_{ab} = \{(x, y)\} \subseteq D_a \times D_b$

    • Function – a binary relation such that there is at most one tuple $\langle x, ? \rangle \in R_{ab}$ for each $x$. $D_a$ is the domain, $D_b$ is the co-domain.

    • $k$-ary relations – are relations on $k$ elements

  • representation

    • 2-dimensional binary array, though k-dimension works

    • tables

    • linked-lists of pairs

    • hash-maps of sets

    • many other options

  • Terms

    • table, relation – constraint

    • arity of relation – arity of variable

    • attribute – variable

    • value of attribute – value of var

    • domain of attribute – domain of variable

    • tuple in table – tuple in constraint/allowed by constraint/consistent with constraint

  • Relational algebra

    • intersection

      • input two relations of same scope

      • output a new, more restrictive relation with the same scope made of tuples in all the input relations simultaneously

      • logical and

    • union

      • input 2 rel of same scope

      • less restrictive relation with same scope of tuples that are in any input relation

      • logical or

    • difference – same as set theory

    • selection

    • projection

    • join

    • composition (from CSP)

Lecture 23: CSPs and Relational DBs, continued

  • Selection and projection are unary operators

  • Selection

    • Input – a relation $R$ and a predicate on attributes of $R$

    • Output a relation $R^\prime$ with the same scope as $R$ but having only a subset of the tuples in $R$

    • Row selection

    • $\sigma_{p}(R)$

  • Projection

    • Input – a relation $R$ and a subset $s$ of the scope

    • Output – a relation $R^\prime$ of scope $s$ with the tuples rewritten such that positions not in $s$ are removed

    • column elimination, in CSP set semantics, in DB bag

    • $\pi_{\{x_1, x_2\}}(R)$

  • Join (natural join)

    • Input – $R$ and $R^\prime$

    • Output a relation $R^{\prime\prime}$ whose scope is the union of $R$ and $R^\prime$ and tuples satisfy both

      • if they have no common attribute, cartesian product

      • if they have an attribute in common, compute all possibilities

    • Computes all solutions to a csp

    • $R \bowtie R^\prime$

  • Composition of two relations

    • Input – two binary relations $R_{ab}$ and $R_{bc}$ with 1 var in common

    • Output – a new induced relation $R_{ac}$ to be combined by intersection to a pre-existsing relation between them

    • bit matrix op is matrix multiplication

    • Generalization as constraint synthesis

    • Produces induced relations a/k/a indirect, inferred, implicit

  • Given $V_1$ and $V_2$ and $C_1$ and $C_2$ between them, how do $C_1 \cap C_2$ and $C_1 \bowtie C_2$ relate? – They'll be the same in this situation

  • Given $V_1$, $V_2$, $V_3$, $C_{V_1, V_2}$ and $C_{V_2, V_3}$, write the induced $C_{V_1, V_3}$ in relational algebra. – $C_{V_1, V_3} = \pi_{V_1, V_3}(C_{V_1, V_2} \bowtie C_{V_2, V_3}) = C_{V_1, V_2} \circ C_{V_2, V_3}$

  • Given $V_1, V_2, V_3$, $C_{V_1, V_2}, C_{V_2, V_3}, C_{V_1, V_3}$, $C^{\prime}_{V_1, V_3} = (C_{V_1, V_2} \circ C_{V_2, V_3}) \cap C_{V_1, V_3}$
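
The composition-and-intersection identities above can be sketched with relations stored as plain sets of pairs (an illustrative representation):

```python
# Sketch of binary-relation operations: composition chains the shared
# variable, and tightening intersects the induced relation with the
# pre-existing one, i.e. C'_13 = (C_12 composed with C_23) intersect C_13.
def compose(r_ab, r_bc):
    """Relational composition of two binary relations sharing variable b."""
    return {(a, c) for a, b in r_ab for b2, c in r_bc if b == b2}

def tighten(c12, c23, c13):
    """Induced constraint intersected with the existing direct constraint."""
    return compose(c12, c23) & c13
```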

  • Natural join – synthesized constraint

  • left/right join – synth constraint with some inconsistent tuples

  • Differences

    • number of relations

      • db have few

      • csp have many

    • size of relations

      • dbs have high-arity

      • csps are mostly binary

    • bounds of relations

      • dbs are very tight

      • csps are loose

    • size of domains

      • db large

      • csp small

    • type of domain

      • almost always finite, but csp may be continuous

    • number of solutions sought

      • dbs looking for almost all

      • csp only looks for one in general

    • cost-model

      • DB is I/O, memory

      • CSP is computational

Lecture 24: Path Consistency and Global Consistency

  • consistency helps to reduce possible search-space

  • A path of length $m$ is consistent iff for any value $x \in D_{V_0}$ and for any value $y \in D_{V_m}$ that are consistent, there exists a sequence of values in the domains of the variables $V_1, V_2, \ldots, V_{m - 1}$ such that all constraints between them (along the path, not across) are satisfied

  • A CSP is path consistent if any path of any length is consistent

  • However, by Mackworth AIJ77, only paths of length 2 are necessary, $\mathcal{O}(n^3)$

  • algo looks at every vertex as a mid-point $k$, and makes $i$ and $j$ together consistent given $k$,

    • and $R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})$

    • Closure algorithm. $k$ is outer loop

    • Until fixed point

      • for k from 1 to n

        • for i from 1 to n

          • for j from 1 to n

            • $R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})$

    • Algo

      • $Y^n \gets R$

      • Until $Y^n = Y^0$

        • For k from 1 to n

          • For i from 1 to n

            • For j from 1 to n

              • See slide for equation

      • $Y \gets Y^n$
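
A sketch of PC-1 along the lines of the pseudo-code above, with relations stored as sets of pairs and $R_{ii}$ holding the identity relation on each domain (both representational assumptions):

```python
# Sketch of PC-1 on a complete binary network: composition is relational
# composition, intersection prunes the direct relation, and the triple
# loop repeats until a fixed point. Illustrative, not optimized.
def pc1(R, variables):
    """R maps ordered pairs (i, j) of variables to sets of allowed pairs;
    R[i, i] must hold the identity relation on variable i's domain."""
    def compose(r1, r2):
        return {(a, c) for a, b in r1 for b2, c in r2 if b == b2}
    changed = True
    while changed:                      # until fixed point
        changed = False
        for k in variables:
            for i in variables:
                for j in variables:
                    new = R[i, j] & compose(R[i, k], R[k, j])
                    if new != R[i, j]:
                        R[i, j] = new
                        changed = True
    return R
```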

  • Difference between k consistent and strong k consistency

  • PC1 terminates, is sound and complete for discrete CSPs - $\mathcal{O}(a^5n^5)$

  • In a complete graph, if every path of length 2 is consistent, the network is path consistent

    • PC1 has two operations, composition and intersection

    • proof is done by induction

  • Mohr and Henderson also developed PC-2, 3

  • and Han, Lee created PC4

  • PC5 from Singh uses support bookkeeping from AC6

  • PC8 uses domains, not constraints; a 2001 algorithm improves on PC8, but isn't well tested

  • PC is rarely used except for certain types of constraints

  • monotonic constraints – ordering based

Lecture 25: Path Consistency and Global Consistency – continued

  • completion allows you to use paths of length 2.

  • Strong PC will ensure both PC and AC

  • AC first, then if no inconsistency, PC (likely PC2)

  • Go from weaker first to stronger – making later steps likely cheaper

  • But PC is often unused – except some situations make it much more likely to find a solution

  • PC is also called 3 consistency

Partial Path consistency

  • same as PC, but no universal constraints, defined over cycles

  • graph is triangulated

  • run closure only over the triangles – $\mathcal{O}(n^3)$

  • must be careful for articulation points in the graph

  • PPC is much easier to implement

  • PPC is strictly weaker than PC as a property

  • PC completes graph, PPC will triangulate

  • PC filters more than PPC, edge-by-edge

  • PPC enforces PC on common edges

  • which means that you can find if something is not path consistent

  • From Bliek and Sam-Haroud 1999

  • Can PC detect unsatisfiability when PPC does not? – It seems not – PPC and PC are roughly equivalent, but there are some cases – and they were hand built

Global Consistency properties

  • minimality/decomposability

  • A minimal network – a network where all tuples in a given constraint appear in at least one solution

  • computing a minimal network is NP complete

  • Solving a minimal network is itself NP complete

Lecture 26: Consistency properties

  • Local Consistencies

    • AC

    • GAC

    • Path Consistency – modifies binary constraints, requires completed graph

    • Partial Path Consistency – triangulates graph, modifies binary relations

  • Global consistency properties

    • Consistency in general – it has a solution

    • minimality – constraints are as tight as possible: on binary constraints, elimination of a tuple eliminates a solution, and every tuple is a member of at least one solution; an idealized form of path consistency that guarantees consistency; important in interactive product configuration – NP complete in general (Gottlob, 2011)

      • determining whether or not a CSP is minimal is NP complete

      • Finding a solution of a minimal CSP is still NP complete

      • occasionally called globally consistent

    • decomposability

      • CSP is decomposable if any combination of assignments to any number of variables that is consistent given the constraints can be extended to a complete solution

      • guarantees back-track free search

      • strongly $n$-consistent

  • there are some islands of tractability

    • tree structure – AC ensures tractability

    • convex constraints – path consistency ensures minimality and decomposability

Lecture 27: Consistency properties, continued

  • minimality doesn't guarantee bt-free search

  • decomposability ensures no back tracking

  • PC approximates minimality

  • when constraints are convex, PPC on triangulated graph guarantees minimality and decomposability

  • when $\circ$ distributes over $\cap$, then PC1 guarantees that a CSP is minimal and decomposable, and that the outer loop can be removed

  • but this doesn't always hold

  • constraints of bounded difference and temporal reasoning

    • vars,

    • constraints of the form $a \leq Y - X \leq b$, i.e., $Y - X \in [a, b] = I$

    • composition and intersection from slides

Lecture 28: Consistency properties, continued

  • Remember, once decomposability is enforced, search is guaranteed backtrack free

  • When constraints are convex, use of PPC ensures minimality and decomposability, but triangles must be done in PEO

  • When composition distributes over intersection

    • PC1 generalizes Floyd-Warshall (all-pairs shortest path)

    • PC-1 generalizes Warshall (transitive closure)

  • Don't confuse properties with algorithms

  • Local consistency

    • remove inconsistent values (node, arc)

    • remove tuples (path consistency)

    • get closer to solution

    • reduce size of problem

    • are cheap

  • Aim for global consistency, but it's hard

  • Sometimes (given certain types of constraints or graphs), local consistency guarantees global consistency

    • distributivity in PC, row-convex constraints, special networks

Lecture 28: Local Search

  • Systematic/constructive search is all we've talked about

  • Starts at a solution that's not correct, try to repair

  • General method:

    • for each variable, choose a value

    • check if solution

      • You're done. Lucky.

      • pick a value that broke a constraint and change it (using min-conflict)

  • Local search is in worst case, $\mathcal{O}(d^n)$
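
The repair loop above, sketched with the min-conflict value choice; the `conflicts(var, val, assignment)` counting function is an assumed interface:

```python
# Sketch of min-conflicts local search: start from an arbitrary complete
# assignment, then repeatedly move a conflicted variable to its
# least-conflicting value until a solution is found or patience runs out.
import random

def min_conflicts(domains, conflicts, max_steps=1000, seed=0):
    rng = random.Random(seed)
    assign = {v: rng.choice(sorted(d)) for v, d in domains.items()}
    for _ in range(max_steps):
        bad = [v for v in assign if conflicts(v, assign[v], assign) > 0]
        if not bad:
            return assign               # solution found
        v = rng.choice(bad)             # pick a conflicted variable
        assign[v] = min(sorted(domains[v]),
                        key=lambda x: conflicts(v, x, assign))
    return None                          # ran out of patience
```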

  • States laid on a surface

  • quality/cost is height

  • optimum state maximizes solution quality while minimising cost

  • move around from state to state to find a local optimum

  • Gradient descent – will only ever find a local optimum

  • State is a complete assignment of values to variables

  • how do you move – CSP uses min-conflict

  • eval function rates the quality of a state, generally in number of broken constraints

  • Try

    • Start w/ full but arbitrary assignment of values

    • reduce inconsistencies by local repairs

    • repeat until

      • A solution is found (a global minimum)

      • The current solution cannot be repaired (local minimum)

      • You've gone through max tries (running out of patience)

  • Local repair for decision problems, local optimization for optimization problems

  • Greedy types are hill climbing, local beam

  • Local beam

    • start in parallel with a number of different points

    • choose among those another number of best moves

    • remove those without promise

    • and repeat as necessary

  • Stochastic methods

    • break out, helps to avoid getting stuck in a local minimum

    • when stuck in a local minimum, walk randomly, pick a random variable, change to a random value

    • tabu search

      • keep a list of tabu positions (those that aren't good), but only a certain number

      • when a new state is visited, added to tabu, and farthest back forgotten

      • if no better, go to the next state not tabu

      • repeat until a global minimum is found

    • sometimes you may randomly ignore a heuristic

  • Local search problems

    • get stuck in a local optimum

    • can have issues with plateaus (flat local maxima) – be ready to move to a state with the same quality

    • ridges can go from side to side with minimal progress

  • when stuck, do random walk

  • Simulated annealing

    • When stuck in a local optimum, allow steps towards less good neighbors to escape it

    • be more random at the start, be less random as you go

    • Start at a random state, start countdown and loop until over

      • pick a neighbor at random

      • $d$ = quality of neighbor - quality of current state

      • if $d > 0$

        • then move to neighbor and restart

        • otherwise, move to a neighbor with a transition probability $p<1$

    • Transition probability proportional to $e^{\frac{d}{t}}$

      • $d$ is negative

      • $t$ is countdown

      • As time passes, you're less and less likely to move towards bad neighbors
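
A sketch of the annealing loop above with a simple linear cooling schedule (the schedule, and keeping the best state seen, are illustrative choices):

```python
# Sketch of simulated annealing: always accept improving neighbors,
# accept worsening ones with probability e^(d/t) where d < 0, and lower
# the temperature t as the countdown proceeds.
import math
import random

def anneal(start, neighbors, quality, t0=10.0, steps=500, seed=0):
    rng = random.Random(seed)
    state = best = start
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9   # simple linear cooling
        nxt = rng.choice(neighbors(state))
        d = quality(nxt) - quality(state)
        if d > 0 or rng.random() < math.exp(d / t):
            state = nxt
            if quality(state) > quality(best):
                best = state                  # keep the best state seen
    return best
```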

  • But all forms of local search are:

    • non-exhaustive

    • non-systematic

    • liable to get stuck in a local optimum

    • non-deterministic – outcome depends on where you start

    • heavy-tailed – the longer you go, the less likely you are to find a solution

  • Is often somewhat probabilistic

Lecture 29: Local Search, Continued

  • starts from a complete assignment, and moves to a new assignment that's better

  • Very greedy approach, but can be parameterized very easily

  • Tends to be very greedy, but only allows improvements

  • others can be better, and allow for random walks/steps – stochastic algorithms

  • stochastic algorithms are not deterministic!

  • Tabu search moves to a state that may be worse, provided it's not in a list of previous states

  • break-out modifies weight of objective function

  • simulated annealing involves countdowns and loops, controlled by the quality of the neighbor and the time; with a sufficiently slow schedule, it is guaranteed to find an optimum

  • Have issues

    • non-systematic and non-exhaustive

    • have issues getting stuck in local optima

    • non-deterministic!!

    • generally the probability of improving a solution gets smaller, but never actually dies out – heavy-tailed

  • So, when you hit a point where it seems improvability drops

    • keep the most promising solution

    • and start someplace else

    • Random restart

  • Luby restart profile

    • uses the Luby sequence of cutoffs to decide how long to run before each restart
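
For reference, the Luby cutoff sequence (1, 1, 2, 1, 1, 2, 4, 1, 1, 2, ...) can be computed as follows; this is the standard definition, not anything specific to the slides:

```python
# Sketch of the Luby restart sequence: the i-th cutoff (1-based) is
# 2^(k-1) when i == 2^k - 1, and otherwise recurses into the repeated
# prefix. Cutoffs are usually scaled by a base number of search steps.
def luby(i):
    while True:
        k = 1
        while (1 << k) - 1 < i:        # smallest k with i <= 2^k - 1
            k += 1
        if (1 << k) - 1 == i:
            return 1 << (k - 1)
        i = i - (1 << (k - 1)) + 1     # recurse into the subsequence
```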

Genetic algos for local search

  • Combine two complete assignments to generate offspring

  • Start from an initial position

  • encode assignments in a compact manner

  • combine partial solutions to generate new solutions

  • requires initial generation of many random assignments

  • Choose solutions to combine based on quality

  • fitness is used to describe quality, assigns probability for solution

  • Select pairs depending on fitness

  • crossover point is chosen randomly, offspring are generated

  • mutation will randomly change a state

  • and this gets put back in the original population

ERA – Environment, Rules, Agents

  • Environ is $n \times a$ board

  • each var is an agent

  • each position is a value of a domain

  • agent moves in a row on the board

  • each position records the number of violations caused by the fact that agents are occupying other positions

  • agents try to occupy positions where no constraints are broken

  • agents move according to reactive/reaction rules

  • agents may themselves have preferences

  • move rules can be as simple or complex as possible

  • rules:

    • least-move – choose position with min violation

    • better-move – choose position with smaller violation

    • random-move – randomly choose a position

    • others – combinations of those above

  • No one talks to each other, but share a common context

  • everyone kicks each other out until everyone is happy

  • very effective in solving very tight, but solvable instances

  • unstable in over-constrained cases

    • agents keep kicking each other out (livelock)

    • livelocks can be used to identify bottlenecks

  • very stable on solvable instances

  • oscillates wildly on unsolvable instances

  • Agent motion can be categorized as variable, stable, constant

Lecture 30: Local Search, continued

ERA, Continued

  • Rules help to find a solution somewhat quickly

  • Has less of an issue with getting stuck in various local optima

  • May be very good for resource-allocation problems

  • more dynamic, as there's no global evaluation function

Nothing works, what to do

  • Random restart

    • when no progress is made, restart from a new randomly selected state

    • save the best results so far

  • repeat random restart

    • for a fixed number of iterations

    • until best results have not been improved on for a certain number of iterations – e.g., a geometric law

Evaluation of Local Search

  • Test on a given problem instance

    • an ensemble of problem instances

  • Experiment

    • running the algo thousands of times

    • measure some metric as a random variable – i.e., the time needed to find a solution

  • Then provide

    • probability distribution function approximated by the number of runs that took a given time to find a solution

    • the cumulative distribution function approximated by the number of runs that found a solution within a given time

  • Then compare performance with statistical tests

    • t-tests assume normal distribution of the metrics

    • nonparametric tests don't

    • Consult with a trained statistician

Lecture 30: Structure-based methods

  • CSPs represented by networks

    • for bCSPs, graph

      • constraint graph

      • microstructure

      • co-microstructure

    • for general CSPs

      • hypergraph

      • dual – allowed tuples connected to each other when shared variables match

      • primal graph – vertices in graph are original variables, and variables in a constraint form a clique

Lecture 31: Structure-based methods, continued

  Graph               Vertices     Edges
  Binary              Variables    Constraints
  Non-binary          Variables    Constraints
  Dual                Constraints  Equality constraints on shared variables
  Primal              Variables    Cliques over scopes of constraints
  $\mu$-structure     VVPs         Supports
  co-$\mu$-structure  VVPs         Conflicts
  • Incidence graph – Variables are one type of node, constraints another, edges connect variables to the constraints, is bipartite

Tree-structured CSPs

  • Tree-structured CSPs can be solved in polynomial time

    • apply $revise(V_i, V_j)$ for all nodes from leaves to root

    • instantiate variables from root to leaves

  • Generally

    • directional consistency from leaves to root

    • solutions built from root to leaves

  • This is often the case for configuration problems

  • The width of a tree is always 1

  • If a CSP graph has $w = 1$, it is guaranteed to be solved with 2 consistency

  • Freuder has a theorem that if a CSP's graph has width $w$, then strong $(w+1)$-consistency guarantees it can be solved backtrack-free.

    • May have issues w/ changing width of graph

    • There are ways to prevent this, but is not necessarily true

Cycle-cutset

  • identify a cycle cutset $S$ in the CSP

    • cutset – nodes when removed yield a tree

  • Decompose CSP into two partitions

    • $S$

    • $T$ the nodes forming a tree

  • Iterate until a solution is found

    • Solve the nodes in $S$

    • Try to extend the solution to $T$

  • $|S| < |V|$

  • if $|S| = c$, then $\mathcal{O}(d^c (n - c) d^2) = \mathcal{O}(n d^{c+2})$

  • But finding the smallest cutset is NP-hard

Lecture 32: Structure-based methods, continued

Dynamic Dangle Identification

  • graham reduction used to detect if schema is acyclic

    • join of acyclic schema is polynomial

    • remove vertices w/ degree 1

  • After each var assignment/lookahead

    • check connectivity

    • identify dangling trees using graph reduction

  • for each dangling root

    • do DAC from leaf to root

    • wipeout means unsolvability

  • Restrict search to nodes outside of the identified dangling trees

Directional Path Consistency

  • Given a binary CSP and a fixed variable ordering

  • $\mathcal{O}(n {w^*}^2 d^3)$

  • if $w^* \leq 2$

    • DPC determines consistency

    • guarantees BT-free search given the ordering

    • But this only works iff induced width is less than or equal to 2, you get strong consistency, minimality and tractability

Biconnected components

  • decomposition into separable graphs

  • consider a CSP whose graph has articulation nodes

  • assume that the largest biconnected component is $b$

  • build a tree whose nodes are the biconnected components, with each articulation node assigned to the parent component (or to both parent and child)

  • build an order using a pre-order traversal of the tree

  • complexity is then bound by the size of the largest biconnected component

Lecture 33: Structure-based methods, continued

Biconnected components

  • must have articulation nodes – nodes common to two or more biconnected components

  • complexity is $\mathcal{O}(d^b)$, where $d$ is the domain size and $b$ is the size of the largest component

  • Use a pre-order of the component tree, which bounds the BT effort by the size of the largest component

  • Originally from 1982, in Journal of the ACM
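
Articulation nodes can be found with a standard DFS low-link computation; a sketch, assuming a hypothetical adjacency-list encoding of the constraint graph:

```python
def articulation_points(adj):
    """Returns the set of articulation nodes of an undirected graph."""
    depth, low, cut = {}, {}, set()
    def dfs(u, parent, d):
        depth[u] = low[u] = d
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in depth:                      # back edge
                low[u] = min(low[u], depth[v])
            else:
                children += 1
                dfs(v, u, d + 1)
                low[u] = min(low[u], low[v])
                # no path from v's subtree climbs above u: u is a cut vertex
                if parent is not None and low[v] >= depth[u]:
                    cut.add(u)
        if parent is None and children > 1:     # root rule
            cut.add(u)
    for x in adj:
        if x not in depth:
            dfs(x, None, 0)
    return cut

# Two triangles sharing vertex C: C is the only articulation node.
g = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B', 'D', 'E'],
     'D': ['C', 'E'], 'E': ['C', 'D']}
cut_nodes = articulation_points(g)
```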

Tree Clustering

  • bi-connected components, but with more than one shared variable for each cluster

  • triangulates the graph

  • finds maximal cliques

  • create a tree structure from cliques

  • Then repeat:

    • solve a clique at each node in the tree

    • apply DAC from leaves to root

    • generates solutions in a BT-free manner

  • MaxCliques from Algorithmic Graph Theory and Perfect Graphs, Golumbic

  • Triangulation is $\mathcal{O}(n^2)$

  • Max clique is $\mathcal{O}(n + e)$ in triangulated graph

  • $\mathcal{O}(k^r)$ for solving clusters, $k$ is domain size, $r$ is size of largest clique ($t = k^r$)

  • generating a solution is $\mathcal{O}(nt\log t)$, where $t$ is the number of tuples in each cluster (the sorted domain of a 'super' variable – sorting speeds this up); in the best case we have $n$ cliques

  • complexity bounded by size of largest clique:

    • $\mathcal{O}(n^2) + \mathcal{O}(k^r) + \mathcal{O}(nt\log t) = \mathcal{O}(nk^r\log k^r) = \mathcal{O}(nrk^r)$
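
Triangulation is typically done heuristically; a min-fill sketch, assuming a hypothetical adjacency-set encoding: repeatedly eliminate the vertex whose neighbourhood needs the fewest fill edges, adding those edges to the graph as you go.

```python
def min_fill_triangulate(adjacency):
    """Returns (elimination_order, fill_edges)."""
    adj = {x: set(ns) for x, ns in adjacency.items()}
    order, fill = [], []
    def missing(x):
        ns = list(adj[x])
        return [(u, v) for i, u in enumerate(ns) for v in ns[i + 1:]
                if v not in adj[u]]          # neighbour pairs not yet adjacent
    while adj:
        x = min(adj, key=lambda y: len(missing(y)))
        for u, v in missing(x):              # make x's neighbourhood a clique
            adj[u].add(v)
            adj[v].add(u)
            fill.append((u, v))
        for y in adj[x]:
            adj[y].discard(x)
        del adj[x]
        order.append(x)
    return order, fill

# A 4-cycle A-B-C-D needs exactly one chord to become triangulated.
g = {'A': {'B', 'D'}, 'B': {'A', 'C'}, 'C': {'B', 'D'}, 'D': {'A', 'C'}}
order, fill = min_fill_triangulate(g)
```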

Lecture 34: Structure-based methods, continued

Bucket Elimination

  • Built on non-binary CSPs

  • Dechter 1997

  • Choose a variable ordering – PEO recommended

  • create buckets for each variable, placing each constraint in the bucket of the deepest variable in its scope

  • create tree node for each bucket

  • From bottom to top

  • join relations in bucket (including on the domain)

  • project out the bucket's variable from the join, and place the result in the deepest bucket sharing a variable with it

  • top-down creates solution backtrack free

  • Can handle cycles!

  • And is a pretty quick form of DAC, but for non-binary CSPs
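
A relational sketch of the bottom-up pass, assuming a toy encoding in which a constraint is a (scope, set-of-tuples) pair; the example problems are hypothetical:

```python
def join(r1, r2):
    """Natural join of two relations given as (scope, tuples)."""
    (s1, t1), (s2, t2) = r1, r2
    shared = [v for v in s1 if v in s2]
    scope = s1 + [v for v in s2 if v not in s1]
    out = set()
    for a in t1:
        for b in t2:
            row1, row2 = dict(zip(s1, a)), dict(zip(s2, b))
            if all(row1[v] == row2[v] for v in shared):
                merged = {**row1, **row2}
                out.add(tuple(merged[v] for v in scope))
    return scope, out

def project_out(rel, x):
    """Project variable x out of a relation."""
    scope, tuples = rel
    keep = [v for v in scope if v != x]
    idx = [scope.index(v) for v in keep]
    return keep, {tuple(t[i] for i in idx) for t in tuples}

def bucket_elimination(order, domains, constraints):
    """order: shallowest first. Returns True iff the problem is consistent;
    a full version would keep the messages to build a solution top-down."""
    buckets = {x: [] for x in order}
    for scope, tuples in constraints:
        buckets[max(scope, key=order.index)].append((scope, tuples))
    for x in reversed(order):                      # bottom-up pass
        rels = buckets[x] or [([x], {(v,) for v in domains[x]})]
        joined = rels[0]
        for r in rels[1:]:
            joined = join(joined, r)
        msg = project_out(joined, x)               # eliminate x
        if msg[0]:
            if not msg[1]:
                return False                       # empty message: inconsistent
            buckets[max(msg[0], key=order.index)].append(msg)
        elif not joined[1]:
            return False
    return True

# Hypothetical examples: X < Y < Z is satisfiable; adding Z < X is not.
doms = {'X': [1, 2, 3], 'Y': [1, 2, 3], 'Z': [1, 2, 3]}
lt = lambda s: (list(s), {(a, b) for a in doms[s[0]] for b in doms[s[1]] if a < b})
chain_ok = bucket_elimination(['X', 'Y', 'Z'], doms, [lt(['X', 'Y']), lt(['Y', 'Z'])])
cycle_ok = bucket_elimination(['X', 'Y', 'Z'], doms,
                              [lt(['X', 'Y']), lt(['Y', 'Z']), lt(['Z', 'X'])])
```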

Lecture 35: Structure-based methods, continued

Bucket Elimination/Tree Clustering

  • linear in $n$, exponential in induced width + 1

  • induced width is the number of vars in the largest component

  • gives fixed parameter complexity

Bucket Elimination

  • very much like Gaussian elimination

  • Useful for Bayesian networks – graphical models of all kinds

Tree Decomposition

  • Generalizes all of structure methods

  • comes from database theory

  • organizes any constraint problem into a tree

  • $(T, \chi, \Psi)$ of $P = (X, D, C)$

    • $T$ – $(V, E)$ the tree

    • $\chi(v) \subseteq X$ – the variables labeling node $v$

    • $\Psi(v) \subseteq C$ – the constraints labeling node $v$

    • Each constraint must appear on at least one node in the tree, and all its variables are in that node

    • the nodes in which any given variable appears induce a connected subtree (the connectedness property): every node on the path between two nodes containing a variable also contains that variable

  • join trees and acyclic DBs from database theory

  • organizes the constraint network into a tree structure, a join-tree

  • labels tree nodes with variables and constraints – labels must obey some rules

  • Solving

    • bottom-up – joins relations and projects on parent

    • top-down generates the solution

  • Complexity bound by width

    • tree width is the max number of variables in a node - 1

    • hyperwidth is the max number of constraints in a node

  • Group of variables are the variables in common to two nodes

  • However, selecting the tree decomposition is NP-hard, use approximation by min-fill

  • Quite a few different tree-decomposition techniques – Gottlob's HYPERTREE seems to be one of the best

  • parameters

    • tree width max number of variables in any node minus 1

    • hyperwidth – max number of constraints in any node

  • theory – characterizes tractable csps

    • complexity is bound by tree/hyper-width

    • fixed parameter complexity – independent of number of variables

  • practice – uses structure to solve csps

  • requires balancing time and space
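
The two labeling rules above (covering and connectedness) can be checked mechanically; a sketch, assuming a hypothetical encoding of the decomposition:

```python
def check_decomposition(tree_edges, chi, scopes):
    """tree_edges: (u, v) node pairs of T; chi: node -> set of variables;
    scopes: constraint scopes (sets of variables). True iff both rules hold."""
    # Covering: every constraint fits entirely inside at least one node.
    covered = all(any(set(s) <= chi[n] for n in chi) for s in scopes)
    # Connectedness: the nodes containing each variable form a subtree.
    def connected(nodes):
        nodes = set(nodes)
        if not nodes:
            return True
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            for a, b in tree_edges:
                if a == u and b in nodes:
                    stack.append(b)
                if b == u and a in nodes:
                    stack.append(a)
        return seen == nodes
    variables = set().union(*chi.values())
    running = all(connected([n for n in chi if x in chi[n]]) for x in variables)
    return covered and running

# Path 1-2-3 of nodes; in the good labeling each variable's nodes are
# contiguous, in the bad one variable B appears only in nodes 1 and 3.
good = check_decomposition([(1, 2), (2, 3)],
                           {1: {'A', 'B'}, 2: {'B', 'C'}, 3: {'C', 'D'}},
                           [{'A', 'B'}, {'B', 'C'}, {'C', 'D'}])
bad = check_decomposition([(1, 2), (2, 3)],
                          {1: {'A', 'B'}, 2: {'C'}, 3: {'B', 'D'}},
                          [{'A', 'B'}])
```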

Lecture 26: More on Constraint Consistency

Global Consistency Properties

  • minimality – relations join completely, $n$-consistency

  • decomposability – strong $n$-consistency

  • solvability

Local Properties

  • AC/GAC – Domains

  • Path Consistency – Constraints, add new constraints (constraint synthesis)

  • Partial Path Consistency – Constraints, add new constraints

  • Directional Path Consistency – Constraints, add new constraints

  • Types of constraint algorithms

    • Binary

    • Non-Binary

    • Global Constraints

  • For continuous domains, work on box-consistency

  • On interval, box consistency can occasionally also be used

  • weighted CSPs can also be dealt with, with a variant of AC

Info

  • arc consistency – $(1,1)$-consistency – every consistent assignment of one variable can be extended to one more variable

  • path consistency – $(2,1)$-consistency – every consistent assignment of two variables can be extended to one more variable

  • $(i,j)$-consistency – every consistent assignment of $i$ variables can be extended to $j$ more variables ($i + j \leq n$)

  • strong $k$-consistency is $k$-consistency down to $1$ consistency

Lecture 27: Still more on Constraint Consistency

  • domain minimality – constraints over arity 1, Gottlob 2011

  • however, even if domain minimality and constraint minimality hold, solving is still NP complete

  • global inverse consistency – given a value for one variable, it can be extended to an assignment of the remaining $n - 1$ variables

$(i, j)$-consistency

  • $j = 1$

    • 2 – AC

    • 3 – PC

    • k – $k$-consistency

  • $(1, n - 1)$ – domain minimality, global inverse consistency

  • $(1, k)$ – costs $\mathcal{O}(d^{k + 1})$; guarantees that a single assignment can be extended over $k$ more variables

  • NIC – Neighborhood Inverse Consistency, for every variable, look at its neighbors, and make consistent with those

Lecture 28: More Constraint Consistency

Singleton Arc Consistency

  • family of algos based on singleton test

  • the singleton test resembles a one-step search probe

  • A CSP is singleton arc consistent iff it remains AC after asserting any single VVP (variable-value pair)

  • Algo

    • Until No change

      • for each var

        • for each value

          • assign the val to the var

          • if CSP is AC, keep

          • else Remove

  • Sometimes, when you get lots of backtracks, trigger a high level consistency
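
The loop above can be sketched as SAC-1 with a plain AC-3 subroutine; the CSP encoding (a dict of directed binary constraints) is a hypothetical example:

```python
from collections import deque

def ac3(domains, constraints):
    """constraints: dict (x, y) -> predicate(vx, vy), both directions listed.
    Filters domains in place; returns False on a domain wipe-out."""
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        pred = constraints[(x, y)]
        revised = [v for v in domains[x] if any(pred(v, w) for w in domains[y])]
        if len(revised) != len(domains[x]):
            domains[x] = revised
            if not revised:
                return False
            # re-examine arcs pointing at x
            queue.extend((z, t) for (z, t) in constraints if t == x and z != y)
    return True

def sac1(domains, constraints):
    """Remove every value whose singleton assignment makes the CSP arc
    inconsistent; repeat until a fixpoint. Returns False on a wipe-out."""
    changed = True
    while changed:
        changed = False
        for x in list(domains):
            for v in list(domains[x]):
                test = {y: list(d) for y, d in domains.items()}
                test[x] = [v]               # the singleton test: assert x = v
                if not ac3(test, constraints):
                    domains[x].remove(v)
                    changed = True
    return all(domains.values())

# Triangle of != over two values: AC alone detects nothing, SAC wipes it out.
doms = {'X': [0, 1], 'Y': [0, 1], 'Z': [0, 1]}
neq = lambda a, b: a != b
cons = {(a, b): neq for a in 'XYZ' for b in 'XYZ' if a != b}
sac_ok = sac1(doms, cons)
```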

Inverse consistency

  • (1, 2) consistency – Path Inverse consistency

  • Neighborhood Inverse – see previous

  • GIC – Global Inverse Consistency – ensure domain minimality

Special Constraints – AC

  • functional – A constraint C is functional with regard to D iff for all $v \in D$ there exists at most one $w \in D$ such that $C(v,w)$

  • monotonic constraints – a constraint $C$ is monotonic with regard to $D$ iff there exists a total ordering on $D$ such that, for all $v, w \in D$, $C(v,w)$ holds implies $C(v^{\prime}, w^{\prime})$ holds for all values $v^{\prime}, w^{\prime} \in D$ such that $v^{\prime} \leq v \land w^{\prime} \leq w$

Non-Binary Special constraints

  • almost all properties and algorithms discussed were restricted to binary constraints

  • consistency for non-binary is the topic of current research

    • in particular domain filtering that doesn't change topology or modify constraint definitions

    • relational $m$-consistency – adding constraints/changing constraint definitions

  • Domain Filtering

    • GAC

    • Singleton GAC

    • maxRPWC, rPIC, RPWC, etc

  • Relational consistency

    • Strong consistencies

    • Simple Tabular Reduction

Lecture 29: Extensions to BT Search

  • search with a bounded number of backtracks

  • search with bounded backtrack depth

  • limited discrepancy search

  • credit-based backtrack search

  • randomized backtrack search

Credit Based Search

  • start with a given credit, usually $n^3$

  • assign 1/2 credit to current assignment, 1/2 to the remaining

  • keep going in a depth-first manner, until credit is used up, chronologically backtrack from there

  • Used in conjunction with backtrack bounded search
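
The credit-splitting scheme above can be sketched as follows, assuming a toy CSP interface where `consistent(assignment, var, val)` checks a new assignment against the current one; everything here is a hypothetical illustration:

```python
def credit_search(variables, domains, consistent, assignment=None, credit=None):
    assignment = dict(assignment or {})
    if credit is None:
        credit = len(variables) ** 3        # the usual initial credit n^3
    if len(assignment) == len(variables):
        return assignment
    x = next(v for v in variables if v not in assignment)
    values = [v for v in domains[x] if consistent(assignment, x, v)]
    remaining = credit
    for i, v in enumerate(values):
        # half of the remaining credit funds this branch, the rest goes on
        branch = remaining // 2 if remaining > 1 else remaining
        remaining -= branch
        result = credit_search(variables, domains, consistent,
                               {**assignment, x: v}, branch)
        if result is not None:
            return result
        if remaining == 0 and i < len(values) - 1:
            break    # credit exhausted: the full scheme would switch to
                     # backtrack-bounded search here; this sketch just stops
    return None

# Hypothetical 2-variable all-different problem.
alldiff = lambda asg, x, v: all(v != w for w in asg.values())
found = credit_search(['X', 'Y'], {'X': [1, 2], 'Y': [1]}, alldiff)
```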

Limited Discrepancy Search

  • the heuristic may be ill-informed at shallow levels of search

  • disobeys the value-ordering heuristic a bounded number of times (the discrepancies)

  • Matthew Ginsberg
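
A sketch of the idea, assuming a toy CSP interface; a discrepancy here is taking anything but the heuristically first value, and the allowed number of discrepancies grows from 0 upward:

```python
def lds(variables, domains, consistent, max_discrepancies):
    def probe(assignment, rest, k):
        if not rest:
            return dict(assignment)
        x, *more = rest
        for i, v in enumerate(domains[x]):
            cost = 0 if i == 0 else 1       # non-first value = one discrepancy
            if cost > k or not consistent(assignment, x, v):
                continue
            result = probe({**assignment, x: v}, more, k - cost)
            if result is not None:
                return result
        return None
    # iteratively allow 0, 1, 2, ... discrepancies
    for k in range(max_discrepancies + 1):
        result = probe({}, list(variables), k)
        if result is not None:
            return result
    return None

# Hypothetical example: the heuristic tries 1 first, but the only solution
# needs X = 2; LDS finds it once one discrepancy is allowed.
alldiff = lambda asg, x, v: all(v != w for w in asg.values())
lds_sol = lds(['X', 'Y'], {'X': [1, 2], 'Y': [1]}, alldiff, max_discrepancies=2)
```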

Randomized BT

  • in systematic search, the ordering of variables and values determines which solutions are found first

  • randomization allows us to explore wider portion of search tree

  • when thrashing stalls BT search, interrupt it and restart

  • Pick random variables at instantiation

  • And random values during value selection

  • Use the restart schedule when thrashing

    • Fixed cutoff and universal – Luby

    • Randomization and Rapid Restarts (Gomes) – fixed optimal cutoff, prior knowledge of the cost distribution required

    • randomization and geometric restarts (Walsh) – $C_{i + 1} = r C_i$, $C_0 = n$

    • Randomization and dynamic geometric restarts (Guddeti) – $C_{i + 1} = \left\{\begin{matrix} rC_i & \text{when solution improved} \\ C_i & \text{otherwise}\end{matrix}\right.$

    • Bayesian approach (Kautz)
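
Two of the cutoff schedules above can be sketched directly: the Luby sequence used by fixed-and-universal schedules, and the geometric schedule $C_{i+1} = r C_i$ with $C_0 = n$ (the function names and parameters here are illustrative):

```python
def luby(i):
    """1-indexed Luby sequence: 1, 1, 2, 1, 1, 2, 4, 1, 1, 2, ..."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if (1 << k) - 1 == i:
        return 1 << (k - 1)                  # i sits at the end of a block
    return luby(i - (1 << (k - 1)) + 1)      # recurse into the repeated prefix

def geometric_cutoffs(n, r, count):
    """First `count` cutoffs of the geometric schedule starting at C_0 = n."""
    cutoffs, c = [], n
    for _ in range(count):
        cutoffs.append(c)
        c *= r
    return cutoffs

luby_prefix = [luby(i) for i in range(1, 8)]      # [1, 1, 2, 1, 1, 2, 4]
geo = geometric_cutoffs(100, 2, 4)                # [100, 200, 400, 800]
```

In practice each cutoff is multiplied by a base number of backtracks before a restart is triggered.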