Notes

Notes, etc. of Samuel Flint

Fundamentals of Constraint Processing

Lecture 2

Resources

  • check website – has many, many links

  • conferences

  • journals

    • include logic programming and CSP

    • Linear Programming, Integer Programming, Mixed Integer Programing, Dynamic Programming

  • journals are also pretty big – and tend to be easier to get into

  • many good ways to get jobs with this

Constraint Satisfaction 101

  • examples

    • generating optimal decisions

      • have unary, binary, n-ary constraints

      • often have many choices, but restrictions can make it easier to make the right decision

  • modeling constraints remains an art

  • sets of variables with sets of choices, and a set of constraints restricting combinations of values

  • applications

    • radio resource management

    • databases

    • temporal and spatial reasoning

    • allocation

    • design/config

    • graphics, visualization, interfaces

    • verification and engineering

    • HCI and decision support

    • molecular biology

    • analytical chemistry

    • robotics, machine vision, computational linguistics

    • transportation

    • qualitative and diagnostic reasoning

  • constraint processing is modeling, plus inference and search

    • modeling – variables, domains, constraints, query

    • inference – enforcing consistency (filtering) and propagation (AC, PC)

    • search – systematic (chronological backtracking, with or without intelligence, etc) or non-systematic (hill climbing, tabu search, etc)

  • Real Example

    • the Netherlands has a dense network

    • wanted a more robust, more effective schedule, increasing profit

    • used OR and AI

  • Singapore

    • large container hub

    • has a custom, special container-loading system built using CP

  • CP is:

    • solving decision problems

    • with arbitrary constraints

    • providing concise and high-level feedback about alternatives and conflicts

  • Very Practical

Defining a problem

  • has a given and a question

  • given is a set of objects, the relations, etc

  • question ex: find $x$ such that condition $y$ is satisfied

    • these can be simple or advanced

  • a problem must always, always, always have a question!

  • Given $\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$

    • $\mathcal{V} = \{V_1, V_2, \ldots, V_n\}$ Variables

    • $\mathcal{D} = \{D_{V_1}, D_{V_2}, \ldots, D_{V_n}\}$ Domains for Variables

    • Constraints $\mathcal{C} = \{C_1, C_2, \ldots, C_l\}$ where $C_{V_i,V_j,\ldots,V_k} = \{(x,y,\ldots,z)\} \subseteq D_{V_i} \times D_{V_j} \times \ldots \times D_{V_k}$

  • Query: can we find a value for each variable such that all constraints are satisfied? Queries can also include weights (making it a Constrained Optimization Problem)

  • Various levels of consistency

    • AC

    • GAC

    • SAC

    • SGAC

  • Sudoku can often be solved by consistency enforcement alone, though not every instance – some still require search

  • find solution – decisions

  • find the number of, or all, solutions – counting

  • find set of constraints that can be removed so that a solution exists – optimization

  • specials

    • binary CSPs have constraints over at most two variables

    • restricted to $\{0, 1\}$, boolean CSP

    • finite, discrete domains, enumeration can work

    • continuous domains require sophisticated algebraic techniques – consistency techniques are then used on the domain bounds
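The triple $\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$ above lends itself to a direct sketch. A minimal Python sketch, assuming binary constraints given in extension; the variable names and the particular "<"/"≠" relations are invented for illustration:

```python
# P = (V, D, C): variables, domains, and binary constraints in extension,
# each constraint stored as its set of allowed tuples (supports).
variables = ["V1", "V2", "V3"]
domains = {"V1": {1, 2, 3}, "V2": {1, 2, 3}, "V3": {1, 2, 3}}
constraints = {  # scope -> allowed tuples
    ("V1", "V2"): {(x, y) for x in range(1, 4) for y in range(1, 4) if x < y},
    ("V2", "V3"): {(x, y) for x in range(1, 4) for y in range(1, 4) if x != y},
}

def satisfies(assignment, constraints):
    """True iff a full assignment satisfies every constraint."""
    return all((assignment[vi], assignment[vj]) in allowed
               for (vi, vj), allowed in constraints.items())

print(satisfies({"V1": 1, "V2": 2, "V3": 3}, constraints))  # True
print(satisfies({"V1": 2, "V2": 2, "V3": 1}, constraints))  # False
```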

Lecture 3

  • always a given $\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$

  • Constraints are relations

    • relations are a subset of the cartesian product of $n \geq 2$ sets

    • relations can be represented as a table of allowed tuples, referred to as supports; such an enumerated constraint is a table constraint

    • extension: a set with all of its members enumerated

    • conflicts are the set of tuples that are no-goods

    • Constraints can also be given in intension – defined in terms of a logical function; equivalent to extension, but can be easier to understand

    • A check function should always be a simple implementation – easy to build with

  • Code must always be well structured!
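The extension/intension distinction above can be made concrete; a small sketch (the "<" relation is an arbitrary example) in which both representations drive the same simple check function:

```python
# The same relation given two ways: in extension (enumerated supports)
# and in intension (a logical function). A check() built on either must agree.
allowed = {(x, y) for x in range(5) for y in range(5) if x < y}  # extension

def check_extension(x, y):
    return (x, y) in allowed        # table lookup

def check_intension(x, y):
    return x < y                    # defining formula

# the two representations are equivalent on the whole Cartesian product
assert all(check_extension(x, y) == check_intension(x, y)
           for x in range(5) for y in range(5))
```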

Lecture 4

  • $P = (V, D, C)$

    • $V$ variables

    • $D$ domains

    • $C$ constraints

  • Constraints are defined as $C_1 = \langle\mathrm{scope}(C_1), \mathrm{rel}(C_1)\rangle$

    • Scope is the variables it discusses – the set of variables over which the constraint is defined

    • relation is an abstract relation – may be in extension (listed as either supports or conflicts) or intension (roughly as a function or boolean expression)

    • Also have arity, the cardinality of the scope

    • universal constraints allow any member of the relevant cartesian product, but may be induced so as to be something else – equivalent to no constraint, and is binary alone

  • VVP – Variable Value Pair

  • make sure to implement check function, and to do so in a very independent manner

  • keep track of a number of constraint checks, always incrementing for every check

  • will be writing abscom to parse

  • Graph representation

    • Macro representation

    • Variables are represented as nodes (vertices)

    • domains as node labels

    • constraints as arcs (edges) between nodes

    • hypergraphs allow for special constraint nodes, or use a bubble to connect them instead – the hyperedges

  • relations in intension are defined by set-builder notation

    • used when not practical/possible to list all tuples

    • define types/templates of common constraints

    • linear constraints

    • all-diff constraints

    • at most

    • TSP-constraints

    • cycle-constraints

  • constraints implemented

    • in extension

      • set of tuples (list/table)

      • binary matrix (bit-matrix)

      • multi-valued decision diagrams

    • in intension

      • constrained decision diagrams

      • predicate functions

Lecture 5

  • examples of modeling

    • can have temporal csp

    • graph or map coloring – planar maps require at most 4 colors

    • resource allocation – airline trip planning, list coloring

    • product configuration – allowed components and what they require/forbid

    • logic puzzles

  • Constraint types

    • algebraic constraints

    • constraints of bounded difference

    • coloring

    • mutual exclusion

    • difference constraints

    • arbitrary constraints, must be made explicit

  • Databases

    • Join in DB is a CSP

    • View materialization is a CSP

  • Interactive systems

    • data-flow

    • spreadsheets

    • graphical layout systems and animation

    • graphical user interfaces

  • Molecular biology

    • threading and similar

Formal characterization

  • macrostructures

    • networks/constraint graphs, $G = (V, E)$, may be hypergraphs

  • Micro-structure – includes every possible Variable Value Pair

    • VVPs (of different variables) are linked iff they are consistent, i.e., form a support

    • cliques of size $n$ are possible solutions

    • however, finding cliques of a given size is NP-complete

  • Co-microstructure – includes every possible VVP

    • VVPs are linked iff they are a no-good

    • cliques of size $n$ prevent a given solution

  • Complexity

    • decision problem

    • NP-complete by reduction from 3SAT

    • a solution can be verified in polynomial time, but no known algorithm generates one in polynomial time

Proof of CSP into SAT

  • CSP has Variables, Domains, Constraints

  • Every VVP is a Boolean variable in SAT

  • Requires, for each variable, a clause that is the disjunction of its VVPs (at least one value), plus, for each pair of distinct VVPs $a, b$ of the same variable, the clause $\neg a \vee \neg b$ (at most one value); all clauses are added to the conjunction

  • Constraints are then mapped to SAT
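The clause construction above can be sketched as follows; a minimal assumed encoding (variables and values invented for illustration), with constraints mapped from their forbidden tuples:

```python
# CSP-to-SAT sketch: one Boolean per VVP, an at-least-one clause per
# variable, pairwise at-most-one clauses, one clause per forbidden tuple.
from itertools import combinations

def csp_to_sat(domains, conflicts):
    """domains: var -> values; conflicts: (vi, vj) -> forbidden pairs.
    Clauses are lists of signed literals ('+'/'-', variable, value)."""
    clauses = []
    for v, dom in domains.items():
        clauses.append([("+", v, x) for x in dom])        # at least one value
        for x, y in combinations(dom, 2):                 # at most one value
            clauses.append([("-", v, x), ("-", v, y)])
    for (vi, vj), bad in conflicts.items():               # map the constraints
        for x, y in bad:
            clauses.append([("-", vi, x), ("-", vj, y)])
    return clauses

clauses = csp_to_sat({"A": [1, 2], "B": [1, 2]}, {("A", "B"): {(1, 1)}})
print(len(clauses))  # 5: 2 at-least-one + 2 at-most-one + 1 conflict clause
```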

Lecture 6

Solving a CSP

  • by search

    • constructive, systematic, exhaustive – involves enumerating all possible VVPs

    • iterative repair, local search – pick an assignment and change unsatisfied constraints until all constraints are satisfied

    • Before start

      • consider the model/formulation – it controls the size of the search space (defining a model is an art)

      • pre-process

        • i.e. constraint filtering/propagation, consistency checking – reduce the size of search space

  • Constraint Checking

    • arc consistency

      • removes unsupported values from the domains of the variables in the constraint

      • Revise Function (B, C):

        • $\forall x \in D_B$

          • if not support($x$, $D_C$)

            • $D_B \gets D_B \setminus \{x\}$

      • Support function: for each element of the other domain, if one supports $x$, return true; if none exists, return false

      • will not always find a solution, but will reduce problem size
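The Revise/support pair above, sketched in Python; the binary check predicate is an assumed argument, and the "<" relation below is an invented example:

```python
# Revise(B, C): prune from D_B every value with no support in D_C.
def supported(x, d_c, check):
    """True iff some y in d_c supports x under the binary check."""
    return any(check(x, y) for y in d_c)

def revise(d_b, d_c, check):
    """Remove unsupported values from d_b; True iff anything was removed."""
    removed = {x for x in d_b if not supported(x, d_c, check)}
    d_b -= removed
    return bool(removed)

d_b, d_c = {1, 2, 3}, {2, 3}
revise(d_b, d_c, lambda x, y: x < y)   # keep values of B below some value of C
print(sorted(d_b))  # [1, 2]
```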

    • k-consistency

      • consistency on $k\leq n$ variables

  • Systematic search

    • for every value of V_i, check for consistency with previous assignments (starting from the bottom), cuts off portions of the tree if false (using backchecking to prune a search tree)

    • back-checking expands a partial solution iff it is thus-far complete

    • chronological backtracking – DFS with only the current path, undoes the most recent variable assignment and tries a new one – can be used for only one solution

      • is complete – will always find a solution if it exists

      • is sound / correct – a solution it gives will always be correct – will not give a false positive

    • intelligent backtracking

      • jump back based on the reason for failure – generally the deepest

    • Ordering heuristics

      • deciding what to expand first

      • often most constraint and most promising values first

      • strategies for variable/value ordering could be static/dynamic

      • effects branching factor – best to branch as little as possible

      • generally variable ordering is dynamic, but value is static (lexicographic)

    • Keep in mind forward-checking – removing from future domains the values that aren't consistent with the current assignment; emptying a domain this way is domain annihilation – and it is more efficient

    • Real Full Lookahead is used to further restrict domains and search spaces more than plain forward-checking

    • balance between reasoning and search

Lecture 7

Arc Consistency

  • revise should only modify one variable's domain

  • requires support, takes VVP and domain of other

    • checks for VVP in X with every element of Y, returning true if found, false if none found

    • check always increments if there's a constraint

    • revise should check if there's a constraint to begin with

  • review reading!

  • VVPs can be represented as tuples, may use angle brackets

  • AC-1:

    • assume the CSP has $n$ variables, each with domain size $a$, and $e$ constraints

    • $Q$ is a set of directed edges to revise, from x to y, from y to x, size $2e$

    • repeat until revisedp stays false

      • set revisedp to false

      • for every edge in $Q$

        • if Revise removes a value, set revisedp to true

        • but, if a variable's domain becomes empty, break and report – this CSP has no solution

    • once a full pass makes no change, the CSP is arc-consistent

    • Make sure to sort list of edges!

  • Unary constraints should be done first!
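AC-1 as described above, sketched in Python; the "<" chain A < B < C is an invented example. The loop keeps sweeping the whole arc list until a full pass changes nothing:

```python
# AC-1: revise every directed arc each pass; repeat until a pass is quiet.
def ac1(domains, arcs, check):
    """arcs: directed pairs (b, c); check(b, c, x, y) tests a tuple.
    Returns False on domain wipe-out, True once arc-consistent."""
    changed = True
    while changed:
        changed = False
        for b, c in arcs:
            removed = {x for x in domains[b]
                       if not any(check(b, c, x, y) for y in domains[c])}
            if removed:
                domains[b] -= removed
                changed = True
                if not domains[b]:
                    return False          # wipe-out: no solution
    return True

rel = {("A", "B"), ("B", "C")}            # both are "<" constraints (example)
check = lambda b, c, x, y: x < y if (b, c) in rel else y < x
domains = {v: {1, 2, 3} for v in "ABC"}
arcs = [(b, c) for (i, j) in rel for (b, c) in ((i, j), (j, i))]
ac1(domains, arcs, check)
print(domains)  # {'A': {1}, 'B': {2}, 'C': {3}}
```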

Lecture 8

Property  Algorithm  Complexity
AC        AC-1       $O(na^3e) = O(n^3a^3)$
  • solutions found by

    • detecting components

    • NC

    • normalize constraints

    • NC, AC, other forms of AC

    • Search

    • Verification

    • Solution, Node Visits, Constraint Checks, CPU Time

  • when implementing AC1, terminate if domain wipe-out occurs

  • worst case complexity is seldom reached

  • remember, AC may discover a solution to a CSP, or may find the problem inconsistent (by domain annihilation)

  • AC3

    • for $i \gets 1$ to $n$, do $NC(i)$

    • $Q \gets \{(i, j) | (i, j) \in arcs, i \neq j\}$

    • while $Q$ is non-empty:

      • select and delete any arc $(k, m) \in Q$

      • if $\textsc{Revise}(k, m)$ then $Q \gets Q \cup \{(i, k) | (i, k) \in arcs(G), i \neq k, i \neq m\}$

    • Check for domain wipe-out!
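A sketch of the AC-3 pseudocode above, reusing the same invented "<" chain; the difference from AC-1 is that only arcs pointing at a revised variable are re-enqueued:

```python
# AC-3: worklist of directed arcs; when Revise(k, m) prunes D_k,
# re-enqueue only the arcs (i, k) with i != m.
from collections import deque

def ac3(domains, neighbors, check):
    q = deque((i, j) for i in neighbors for j in neighbors[i])
    while q:
        k, m = q.popleft()
        removed = {x for x in domains[k]
                   if not any(check(k, m, x, y) for y in domains[m])}
        if removed:
            domains[k] -= removed
            if not domains[k]:
                return False                  # domain wipe-out
            q.extend((i, k) for i in neighbors[k] if i != m)
    return True

rel = {("A", "B"), ("B", "C")}                # both "<" constraints (example)
check = lambda i, j, x, y: x < y if (i, j) in rel else y < x
domains = {v: {1, 2, 3} for v in "ABC"}
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
print(ac3(domains, neighbors, check), domains)
```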

Lecture 9

  • Remember to deal with Node Consistency – and don't include it in the check count

  • see lecture 8 notes

  • AC3 enforces the same property as AC-1, but more efficiently

  • the number of calls to Revise is at most $2e + a(2e - n)$

  • AC3 is $O(a^2 (2e + a(2e - n))) = O(a^3e) = O(a^3n^2)$

  • AC4 is even more efficient, but requires special bookkeeping, is $O(a^2n^2)$

  • A variant of AC-3, called AC-2001, requires cubic space; it has yet another variant, AC-3.1$^{m}$, which requires different bookkeeping

  • AC4

    • generate m, s, and counter

      • m and s have as many rows as there are VVPs in the problem, m is set to true, s to empty list

      • m is for alive

      • s is for a list of all VVPs that support a given VVP

      • counter: given a VVP, the number of supports provided to it by a given variable

    • prepare data structures

      • for every constraint all tuples:

        • when allowed, update s

          • increment counter

    • Then process all constraints

      • if counter is 0, remove from domain, and update $m$

      • since $x$ is removed, then for everything such that it was a support, decrement the counter

  • AC3 tends to be a bit better than AC4

Lecture 10

  • soundness – can trust

  • complete – will always find a solution if exists

  • more efficient AC variants are useful for use during search

  • but when enforcing AC during search, you must be able to keep track of what was removed

  • remember properties are not the same as the algorithms! there may be many algorithms to implement each property!

  • most consistency methods are local initially

    • node/arc remove inconsistent values

    • path removes inconsistent tuples

      • used when you have positive tables of supports

    • they get closer to the solution

    • reduce the size of the problem

    • are cheap (polynomial time)

    • choice of consistency properties helps to reduce time to solution

Intelligent Backtracking Algorithms

  • Read Prosser, CI 93. Hybrid Algorithms for the Constraint Satisfaction Problem

  • Consider Reading 5/6 of Dechter

  • Issues

    • Variable/Value ordering

    • Variable Instantiation

    • Current Path

    • Current Variable

    • Past Variables

    • Future Variables

    • Shallow/Deep Levels/Nodes

    • Search space/Search Tree

    • Back-checking

    • Backtracking

  • As you go, un-assigned variables are the future variables/future sub-problem

  • those that have been instantiated are the past variables

  • the current path is current instantiations

  • always check a new instantiation against past variables (back-checking)

  • algorithms

    • Vanilla: BT

    • Improved Backward: BJ (back-jumping), CBJ (conflict-directed BJ)

    • Improving Forward: BM (back-marking), FC (forward-checking)

  • D-Way branching is conceptually easier to understand than 2-way

  • algorithms from Prosser are iterative back-tracking and pluggable/modularized

  • Kondrak and van Beek looked at the efficiency of algorithms

Lecture 11

  • Variables $V_i$, $i \in [1, n]$

  • Domains $D_i = \{v_{i1}, v_{i2}, \ldots, v_{iM_i}\}$

  • Constraint between $V_i$ and $V_j$ : $C_{i,j}$

  • Constraint graph $G$

  • Arcs of $G$: $\mathrm{Arc}(G)$

  • Instantiation order is static or dynamic

  • lang primitives are from lisp

  • Data structures

    • $v$ is a 1 by $n$ array to store assignments, $v_0$ is the root, backtracking to this indicates insolvability, $v_i$ is the assignment to the $i$th variable

    • $domain$ is the original domain

    • $current-domain$ is the current domain, must be updated on back-tracking

    • check is as before

  • Generic form

    • takes n, status

    • set consistent to true

    • status to unknown

    • $i \gets 1$

    • while the status is unknown

      • if consistent

        • $i \gets \mathrm{label}(i, consistent)$

        • else $i \gets \mathrm{unlabel}(i, consistent)$

      • if $i > n$

        • set status to solution

        • else if $i = 0$, set status to impossible

  • label and unlabel functions are provided by the backtrack algorithm

  • label is forward move, unlabel is backward move

  • For BT

    • label

      • when $v_i$ is assigned a value from the $current-domain_i$, we perform back-checking against past variables

      • if back-checking succeeds

      • return $i + 1$

      • else, remove assigned value from current-domain[i], assign next value, etc

      • if no value exists, set consistent to false

      • function

        • takes i, consistent

        • set consistent to false

        • for v[i] in each element of the current domain while not consistent

          • set consistent to true

          • for h from 1 to i - 1 and while consistent

            • set consistent to check(i, h)

          • if not consistent

            • then remove i from current-domain

        • if consistent, return i+1, otherwise i

    • unlabel

      • current level is set to $i - 1$

      • for future variables $j$, reset the current-domain

      • if the domain of $h$ is not empty, set consistent to true

      • Function

        • takes i, consistent

        • set h to i - 1

        • set current-domain[i] to domain[i]

        • remove v[h] from current-domain[h]

        • set consistent to current-domain[h] not-equal empty-set

        • return h

    • label does the actual instantiation
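The generic driver plus BT's label/unlabel above fit together as follows; a Python sketch with 1-based arrays ($v[0]$ as the root) and tuple returns in place of the notes' in/out parameters. The triangle-coloring check is an invented example:

```python
# Iterative BT in the generic label/unlabel form of the notes.
def bt_search(n, domain, check):
    v = [None] * (n + 1)                                # v[0] is the root
    current = [list(domain[i]) for i in range(n + 1)]   # current-domain
    i, consistent, status = 1, True, "unknown"
    while status == "unknown":
        if consistent:
            i, consistent = label(i, v, current, check)
        else:
            i, consistent = unlabel(i, v, current, domain)
        if i > n:
            status = "solution"
        elif i == 0:
            status = "impossible"
    return status, v[1:]

def label(i, v, current, check):                        # forward move
    for value in list(current[i]):
        v[i] = value
        if all(check(i, h, v[i], v[h]) for h in range(1, i)):  # back-check
            return i + 1, True
        current[i].remove(value)          # value conflicts with the past
    return i, False

def unlabel(i, v, current, domain):                     # backward move
    h = i - 1
    current[i] = list(domain[i])          # reset the future domain
    if h >= 1:
        current[h].remove(v[h])           # discard the failed assignment
        return h, bool(current[h])
    return 0, False                       # backtracked past the root

domain = [[]] + [[1, 2, 3] for _ in range(3)]           # triangle, 3 colors
status, sol = bt_search(3, domain, lambda i, h, xi, xh: xi != xh)
print(status, sol)  # solution [1, 2, 3]
```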

Lecture 12: Phase Transitions

  • Cheeseman 1991

  • Order parameter

  • Probability solution exists for random problems is almost 1

  • there's a critical value whereby, after it, probability is almost 0

  • around the critical value, the probability is around 0.5

  • the cost of solving drops sharply to the right of the critical value, but rises steeply as problems approach it

  • around 0.5 probability, the cost of solving is high no matter the algorithm – this is referred to as a phase transition

  • conjecture regarding the characterization of NP complete problems

  • applies to detecting/implementing arc-consistency

  • random graphs are almost always easy to color – conjecture from a famous paper

    • graph coloring is a reduction operator

    • friends are vertices with the same neighborhood

    • this is the degree

  • For CSPs, it's either density or tightness

  • this currently affects how experiments are conducted – try to deal with the hardest problems

  • but be careful not to focus exclusively on the region around the critical value

  • Run on random CSPs – given statistical analysis

  • Vary params $\langle n, a, t, p \rangle$

    • $n$ – number of variables

    • $a$ – domain size

    • $t$ – tightness $\frac{|\mathrm{forbidden tuples}|}{|\mathrm{all tuples}|}$

    • $p$ – proportion of constraints $p = \frac{e}{e_{max} - e_{min}}$, also $\frac{2e}{n(n-1)}$

    • $t$ and $p$ will be between 0 and 1

    • Can have issue with uniformity, difficulty (phase transition), solvability

    • empirical studies can be done simply by varying one of the parameters

    • do it (Model B):

      1. generate $n$ nodes

      2. generate a list of $\frac{n(n-1)}{2}$ tuples of all combinations of 2 nodes.

      3. choose e elements from the above list as constraints to go between the $n$ nodes

      4. if the graph is not connected, throw it away and go back to step 3, otherwise proceed

      5. generate a list of $a^2$ tuples of all combinations of 2 values

      6. For each constraint, choose randomly a number of tuples from the list to guarantee tightness $t$ for the constraint.

    • Model B instances may be flawed, but it's rare
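Steps 1-6 above can be sketched as a Model B generator; parameter names follow $\langle n, a, t, p\rangle$, with $e$ passed directly, conflicts stored per constraint, and the connectivity retry implementing step 4:

```python
# Model B generator: n nodes, e random constraints over a connected graph,
# round(t * a^2) forbidden tuples per constraint.
import random
from itertools import combinations

def connected(n, edges):
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j); adj[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        for j in adj[stack.pop()]:
            if j not in seen:
                seen.add(j); stack.append(j)
    return len(seen) == n

def model_b(n, a, e, t, seed=0):
    rng = random.Random(seed)
    while True:                                              # steps 1-4
        edges = rng.sample(list(combinations(range(n), 2)), e)
        if connected(n, edges):
            break
    pairs = [(x, y) for x in range(a) for y in range(a)]     # step 5
    nogoods = round(t * a * a)                               # step 6
    return {edge: set(rng.sample(pairs, nogoods)) for edge in edges}

csp = model_b(n=6, a=4, e=8, t=0.25)
print(len(csp), all(len(bad) == 4 for bad in csp.values()))  # 8 True
```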

Lecture 13: Back-tracking: Continued

  • keep track of path – an array of the instantiations thus far and to be created

  • remember what unlabel does – it actually performs the backtracking

  • many different ways to order variables

  • when doing value-ordering, use lexicographic ordering!

  • eventually, variable ordering heuristics must be broken with lexicographic ordering

Back-Jumping

  • uses $max-check$, an array, and when check(i, h) succeeds, $max-check[i] \gets max(max-check[i], h)$

  • if the current domain of $h$ is empty, chronological back-tracking is performed from h

  • you go back to $max-check[i]$ to fix stuff when you have an empty variable, unless you've already back-jumped in the current path

  • BJ just modifies BT-label and BT-unlabel

    • BJ-label just updates $max-check$

    • bj-unlabel

      • backtrack to h = max-check[i]

      • resets max-check[j] = 0 for j in $h+1$ to $i$

Conflict-directed BJ

  • requires a new data structure, the conflict-set array

    • tracking at what levels conflicts occur

  • knows how to jump-back again, to the deepest level of conflict

  • can jump-back more than once!

  • useless if good variable ordering

  • conflict sets are initialized to $\{0\}$

  • At any point, conf-set is a subset of past variables that are in conflict with $i$

  • when check(i, h) fails add h to conf-set[i]

  • when domain is empty

    • jumps to deepest past variable in conf-set

      • update $conf-set[h] = conf-set[h] \cup conf-set[i] \setminus \{h\}$

  • To calc all solutions – don't use CBJ

Lecture 14: Back-tracking: Continued

CBJ Continued

  • max-check – deepest level checked against – if instantiated, will be $i - 1$

  • conflict set is used for holding what's conflicted

  • Finding all solutions with CBJ

    • use the conflict sets

    • or use CBF from Kondrak

    • keep track of number of solutions

  • make sure to check generated solutions with the solution checker!!

  • modify the algorithm so that, after a solution is found, it backtracks to every level at least once!

    • in actuality, each level must be backtracked to exactly once

  • look at Chris Thiel's response

    • when a solution is found, force the last var to conflict with everything before it. In turn, this forces some chrono backtracking as the conf-sets are propagated backward

  • Kondrak's response

    • cbf flag, vector

      • $\forall i, cbf[i] \gets 0$

      • when you find a solution, $\forall i, cbf[i] \gets 1$

      • In unlabel

        • if $cbf[i] = 1$

          • $h \gets i - 1$, $cbf[i] = 0$

          • else $h \gets max(conf-set[i])$

Lecture 14: Forward Steps

Backmarking

  • tries to reduce the amount of consistency checking

    • $v[i]$ to reassign to $k$

    • $v[i] \gets k$ last checked against $v[h] \gets g$

    • $v[h]$ has not been modified

  • not super useful – as consistency enforcement is more useful

  • this helps to reduce back checking

  • data structures, max back-check level, minimum backup level

    • $mcl$ max check level, $n \times m$ array, $mcl[i, k]$ stores deepest variable that $v[i] \gets k$ was checked positively against, finer-grained mcl

    • $mbl$ min. backup level, $n$ array, gives the shallowest past variable whose value has changed since $v[i]$ was the current variable

  • BM does not allow dynamic variable ordering

  • BM is aware that the deepest variable $v[i] \gets k$ was checked against is $v[j]$

  • values of variables in the past of $v[j]$ ($h<j$) have not changed

  • so we need to check $v[i] \gets k$ against the values of the variables between $v[j]$ and $v[i]$

  • we do not need to check against the values of the variables in the past of $v[j]$

  • when $mcl[i,k] < mbl[i]$: don't check – it fails again (the variable it failed against hasn't changed) – type A savings

  • when $mcl[i,k] \geq mbl[i]$: the checks against variables shallower than $mbl[i]$ still succeed, so only re-check from $mbl[i]$ on – type B savings
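The two savings rules can be stated as one decision function; a sketch, assuming $mcl[i,k]$ stores the deepest level $v[i] \gets k$ was checked against:

```python
# Backmarking decision: compare mcl[i][k] with mbl[i] before checking.
def bm_decision(mcl_ik, mbl_i):
    if mcl_ik < mbl_i:
        # type A: the level the value failed against has not changed since
        return "fail-without-checking"
    # type B: checks shallower than mbl[i] still hold; re-check from mbl[i] on
    return "recheck-from-mbl-only"

print(bm_decision(2, 4))  # fail-without-checking
print(bm_decision(4, 2))  # recheck-from-mbl-only
```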

Forward checking

  • Looks ahead from the current variable, considering all future variables, and clears from their domains the values that aren't consistent with the current partial solution

  • FC means more work in the future, but causes fewer nodes to expand

  • When it moves forward, the values of the current domains of future variables are all compatible with past assignments, saves backchecking

  • FC may cause domain wipe-out, but it discovers conflicts early on and will backtrack

  • goal is to find failures early – avoids expanding fruitless subtrees

  • is not truly forcing arc consistency, that's RFL

  • requires a DS to keep track of filtering, can be done in various ways

Lecture 15: Forward Steps, Continued

Forward Checking

  • look-ahead – revises future variables

  • makes more work at instantiation, but expands fewer nodes

  • as you move forward, values in future variables are consistent with past variables

  • can cause wipe-out and conflict discovery early on, backtracking chronologically

  • requires current domain, stack of variables removed by previous instantiations

  • future-fc is the subset of future variables that v[i] checks against

  • past-fc is past variables that checked against v[i]

  • requires functions

    • check-forward – called when instantiating v[i]

      • revise(j, i)

      • returns false if current-domain[j] is empty, true otherwise

      • values removed from current-domain[j] are pushed as a set onto reductions[j]

    • update-current-domain

      • $current-domain[i] \gets domain[i] \setminus reductions[i]$

    • fc-label

      • attempt to instantiate current-variable

      • filters domains of current

      • on wipe-out of future variable, uninstantiate and undo domain filtering

  • Suffers from thrashing, as it is based on BT
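The check-forward / undo pair above, sketched in Python; the argument convention check(j, i, y, x) and the per-call reduction frames (one pushed even when empty, so pops stay aligned) are implementation assumptions:

```python
# check-forward / undo pair: prune future domains against v[i],
# recording prunings in reductions so backtracking can restore them.
def check_forward(i, j, v, current, reductions, check):
    """Revise(j, i) against the assignment v[i]; False on wipe-out of j."""
    removed = [y for y in current[j] if not check(j, i, y, v[i])]
    for y in removed:
        current[j].remove(y)
    reductions[j].append(removed)      # push even if empty: one frame per call
    return bool(current[j])

def undo_reductions(i, future, current, reductions):
    for j in future:                   # pop this instantiation's prunings
        current[j].extend(reductions[j].pop())

v = [None, 2, None]                    # v[1] just assigned the value 2
current = [None, [2], [1, 2, 3]]
reductions = [None, [], []]
ok = check_forward(1, 2, v, current, reductions, lambda j, i, y, x: y > x)
print(ok, current[2])                  # only 3 survives "y > 2"
undo_reductions(1, [2], current, reductions)
print(sorted(current[2]))              # domain of 2 fully restored
```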

Variable Ordering

  • dom, least domain, smallest domain first

  • degree of variable (number of neighbors)

  • dom/deg is most common

  • Degree is difficult, as it changes very rapidly

    • can/should be implemented statically
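A dynamic dom/deg selection can be sketched in one line; the degrees here are static, as the notes suggest, and the example numbers are invented:

```python
# dom/deg ordering: pick the future variable minimizing
# |current-domain| / degree, breaking ties lexicographically.
def select_variable(future, current, degree):
    return min(future, key=lambda v: (len(current[v]) / degree[v], v))

current = {"A": {1, 2}, "B": {1, 2, 3}, "C": {1, 2}}
degree = {"A": 1, "B": 2, "C": 4}
print(select_variable(["A", "B", "C"], current, degree))  # C
```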

Consistency Check Summary

  • CBJ

    • uses back-jumping driven by the conf-set array

  • backmarking uses mcl and mbl

    • two types of consistency-checking savings

  • forward checking

    • works more at each instantiation, but expands fewer subtrees

    • uses reductions, future-fc, past-fc

  • FC-CBJ is the best. Period

  • Other Backtracking algos

    • Graph-based back-jumping – Dechter

    • Pseudo-trees [Freuder, 85]

  • Other lookahead – DAC, MAC – DAC seems to be pretty good as a compromise between MAC and FC

  • More empirical evals

    • now by generating random problems

  • theoretical comparisons

Implementation notes

  • Enforce NC

  • Normalize constraints

  • check for empty relations

  • Interrupt as soon as you detect domain wipe-out

  • Remember dynamic variable ordering includes the domino effect

    • a/k/a pick singleton domains first

Lecture 15: Intelligent Backtracking – Theoretical Eval

  • Kondrak and van Beek, IJCAI 95

  • Dechter 5, 6

  • most backtracking algorithms are evaluated empirically

  • performance depends heavily on problem instance

  • is therefore average case analysis

  • representativesness of the test problems

  • CSP is NP-complete – always possible to construct examples where BJ/CBJ are worse than BT

  • comparison criteria are different

  • paper gives a theoretical approach

    • compares by number of nodes visited

    • and by number of constraint checks

  • BT, CBJ perform the same amount of consistency checks

  • BM reduces consistency checks

  • BM does not affect the number of nodes visited

  • FC-CBJ may visit more nodes than CBJ

  • proves correctness of BJ/CBJ

  • determines partial order between algos

  • proves BJ, CBJ are correct

  • proves FC never visits more nodes than BJ

  • Improves BMJ and BM-CBJ to perform even fewer consistency checks

  • provides framework for characterizing future BT algorithms

  • BT extends partial solutions – a partial solution is consistent with a set of uninstantiated variables if it can be consistently extended to them – that is, extended by one variable into a new partial solution that is still consistent

  • Dead end – when all values of current variables are rejected

  • lower level are closer to the root

  • higher levels are closer to the fringe

  • 2 BT algos are equiv if on every csp they generate the same tree and perform the same consistency checks

  • Lemma 1: If BJ backtracks to variable $x_h$ from a dead end at $x_i$, then the partial solution $(a_1, \ldots, a_h)$ cannot be consistently extended to $x_i$

Lecture 16: Theoretical Eval, Continued

  • sufficient conditions

    • BT visits a node if its parent is consistent

    • BJ visits a node if its parent is consistent with all variables

    • CBJ visits a node if its parent is consistent with all sets of variables

    • FC visits a node if it is consistent and its parent is consistent with all variables

  • Necessary Conditions

    • BT visits a node only if its parent is consistent

    • BJ visits a node only if its parent is consistent with all variables

    • CBJ visits a node only if its parent is consistent with all sets of variables

    • FC visits a node only if it is consistent and its parent is consistent with all variables

  • BT visits all nodes FC visits

  • BJ visits all nodes CBJ visits

  • CBJ and FC are not comparable.

  • FC will visit no more nodes than BJ

  • BT and BM are equivalent, as are BJ and BMJ

  • proof of correctness – termination, soundness, completeness

  • extends for seeking all solutions

  • BM hybrids should be fixed, by changing MBL to an $n \times m$ array, creating BM2, BMJ2 and BM-CBJ2

Lecture 17: Search orders in CSPs

  • assumptions

    • finite domains

    • binary constraints

  • Chapter 6, Tsang is req'd reading

  • Dual viewpoint heuristics for binary constraint satisfaction problems is recommended reading

  • In BT, some orders have fewer backtracks

  • in lookahead, some orderings detect failure earlier

  • variable ordering is often the most constrained variable

  • value ordering is often the most promising value first

  • using dynamic ordering in lookahead algos often removes the need for intelligent backtracking

  • R\'egin and Bessi\`ere show that CBJ and similar aren't needed – but not so sure

  • Select values that don't cause constraint violation – do the most promising value first, avoiding constraint violation

  • Discover constraint violation quickly – select variables that don't delay discovery of constraint violation, go with the most constrained variable first

  • Notation: For a given future variable $V_j$, the assignment $V_i \gets x$ partitions the domain of $V_j$ into a set of values consistent with the assignment, of size $\mathrm{Left}(V_j | V_i \gets x)$, and a set of inconsistent values, of size $\mathrm{Lost}(V_j | V_i \gets x)$

  • Value selection of Lost and Left

  • Value Ordering to quickly find a solution

    • min-conflict – orders values according to the conflicts in which they are involved with the future variables – most popular (Minton), $Cost(V_i \gets x) = \sum_{V_j \neq V_i} \mathrm{Lost}(V_j | V_i \gets x)$

    • Cruciality – (Keng and Yu, 1989) $Cruciality(V_i \gets x) = \sum_{V_j \neq V_i} Lost(V_j | V_i\gets x) / |D_{V_j}|$

    • promise – most powerful (Geelen, 1992) $Promise(V_i \gets x) = \prod_{V_j \neq V_i} Left(V_j | V_i \gets x)$, the number of assignments that can be made such that no constraint on $V_i$ is broken.

    • Others are defined

  • Promise

    • recognizes domain wipe-out

    • product discriminates better than sum

    • semantics of promise gives an upper bound on number of solutions
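The Left/Lost bookkeeping behind min-conflict and promise, sketched for a single candidate value; the "≠" relation and the domains are invented:

```python
# Value-ordering scores: Left counts supported future values,
# promise multiplies them, min-conflict sums the Lost counts.
def left(x, d_j, check):
    return sum(1 for y in d_j if check(x, y))

def promise(x, future_domains, check):
    p = 1
    for d_j in future_domains:
        p *= left(x, d_j, check)     # a factor of 0 means x wipes out some D_j
    return p

def min_conflict_cost(x, future_domains, check):
    return sum(len(d_j) - left(x, d_j, check)    # Lost = |D_j| - Left
               for d_j in future_domains)

future = [{1, 2, 3}, {2, 3}]
neq = lambda x, y: x != y
print(promise(2, future, neq), min_conflict_cost(2, future, neq))  # 2 2
```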

  • FFP – the fail-first principle; the name is contested

    • goal is to recognize dead-ends asap to save search effort

    • choose ordering that reduces branching factor

    • Early ones

      • LD

      • maximal degree

      • minimal ratio of domain size over degree

      • promise for variables

      • Br\'elaz heuristic

      • weighted degree (Boussemart et al, ECAI 2004)

      • graph-based

        • minimal width

        • induced width ordering $w^*$

        • maximal cardinality ordering

        • minimal bandwidth ordering

      • remember to exploit the domino effect, especially under dynamic var ordering

      • always state how you are breaking ties – generally lexicographically

      • is often cheap and effective

      • in most cases, suitable for both static and dynamic ordering

    • Promise for variables – Sum of the promises of the values, minimize the number of assignments that can be done such that no constraint on $V_i$ is broken. Has upper bound on the number of solutions

Lecture 18: Ordering Heuristics

  • Least Domain

  • Max Domain

  • Domain/Degree

  • Promise is based on what has the most promise given the current state

  • Br\'elaz heuristic – originally used for graph coloring

  • Remember, variable ordering heuristics are cheap and effective

  • generally suitable for static and dynamic ordering

  • graph-based heuristics aren't often used anymore

  • Br\'elaz

    • originally designed for coloring

    • assigns most constrained nodes first

    • arrange vars in decreasing order of degrees

    • color a vertex with maximal degree with color 1

    • choose a vertex with a maximal saturation degree (number of different colors to which is it adjacent)

        • if there's an equality, choose any vertex of maximal degree among the uncolored

    • color chosen vertex, with lowest numbered color

    • if all vertices are colored, stop, otherwise, choose another vertex and continue
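
A compact sketch of the Brélaz/DSATUR procedure above, with `graph` mapping each vertex to its neighbor set (an assumed input format); ties here are broken by degree and then arbitrarily:

```python
# Sketch of Brélaz (DSATUR) greedy coloring: repeatedly color the vertex
# with maximal saturation degree, using the lowest-numbered feasible color.
def brelaz_coloring(graph):
    colors = {}
    while len(colors) < len(graph):
        uncolored = [v for v in graph if v not in colors]
        # saturation degree: number of distinct colors among neighbors
        def saturation(v):
            return len({colors[n] for n in graph[v] if n in colors})
        # max saturation, ties broken by degree (then arbitrarily)
        v = max(uncolored, key=lambda u: (saturation(u), len(graph[u])))
        # lowest-numbered color not used by any neighbor
        used = {colors[n] for n in graph[v] if n in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors
```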

  • wdeg

    • All constraints are assigned a weight, initially 1

    • every time a constraint is broken, during propagation with look-ahead (constraint causing domain wipe-out), weight is increased

    • the weight of an un-assigned variable is defined as the sum of the weights of the constraints that apply to the variable

    • the variable with the largest weight is chosen for instantiation

    • weights are not reinitialized at back-track

    • refined with dom/wdeg

    • inspired by breakout heuristic in local search

    • However, there may be places where dom/deg is more efficient than dom/wdeg

    • A decay function may help improve this
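
The wdeg bookkeeping can be sketched as below; constraint scopes as tuples of variable names and the wipe-out callback are illustrative assumptions about the data layout:

```python
# Sketch of the wdeg variable-selection rule: constraint weights start at 1,
# get bumped whenever a constraint causes a domain wipe-out during
# look-ahead, and a variable's weight is the sum over constraints on it.
class Wdeg:
    def __init__(self, constraints):
        # constraints: iterable of tuples of variable names (scopes)
        self.weight = {c: 1 for c in constraints}

    def on_wipeout(self, constraint):
        # called when propagation finds this constraint caused a wipe-out
        self.weight[constraint] += 1

    def var_weight(self, var):
        # weighted degree of an unassigned variable
        return sum(w for c, w in self.weight.items() if var in c)

    def pick(self, unassigned):
        # choose the variable with the largest weighted degree
        return max(unassigned, key=self.var_weight)
```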

  • Graph heuristics

    • ordering by elimination or instantiation

    • minimal width

      • given an ordering, the width of a node is the number of its parents

      • then the ordering's width is the largest width of any of the nodes

      • the sum of the widths of the nodes is always equal to the number of edges

      • take minimum of the orders, by width – this is the width of the graph

      • this can reduce chance of backtracking, can help to find minimum width ordering

      • Algo:

        • remove all nodes not connected to any others

        • $k \gets 0$

        • while there are nodes left in the graph

          • $k \gets k + 1$

          • while there are nodes not connected to more than k others

            • remove such nodes from graph, along with any edges connected to them

        • return $k$

        • minimal width ordering of the nodes is obtained by taking the nodes in the reverse order that they were removed
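
The minimal-width procedure above, sketched in Python (`graph` maps nodes to neighbor sets; an assumed representation):

```python
# Sketch of the minimal-width algorithm: repeatedly strip nodes of degree
# <= k, incrementing k when stuck; the reverse removal order is a
# minimal-width ordering of the graph.
def min_width(graph):
    adj = {v: set(ns) for v, ns in graph.items()}
    order = [v for v in adj if not adj[v]]      # isolated nodes first
    for v in order:
        del adj[v]
    k = 0
    while adj:
        k += 1
        low = [v for v in adj if len(adj[v]) <= k]
        while low:
            for v in low:
                for n in adj[v]:
                    adj[n].discard(v)           # drop incident edges
                del adj[v]
                order.append(v)
            low = [v for v in adj if len(adj[v]) <= k]
    # k is the width; instantiate in the reverse of removal order
    return k, list(reversed(order))
```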

    • Inducted Width Ordering

      • What happens when you reconnect neighbors of what you remove to each other

Guest Speaker: Philip Walton, Workday

  • From Workday, SaaS company for HR management

  • work has primarily been in Data Analysis and optimization

  • go from techniques and technologies to money

  • Could be a Linear Program

    • Measures, Items (index sets)

    • indices i, m

    • variables Quantity, 0 to N

    • model sets, limit, value, weight

    • maximize value and quantity

    • but must be within weight

  • big thing is assigning resources in a data center

  • truck scheduling

    • many truck loads that a company has to move

      • origin and destination specified

      • and time windows at origin and destination

    • many drivers, many trucks

    • some time horizon

    • DOT rules

    • operational rules

      • returning to base

      • equipment preventative maintenance

    • Taking a run at the whole thing is probably too much

      • run a route model to pick loads that go together

      • schedule loads – the big thing for SCPs

      • drivers pick the sets of loads they prefer

  • modeling

    • discrete time forms an integer domain

      • can be clumpy

    • use start time of the task as the domain

    • constraints like before and after are relatively straightforward

    • duration of driving/time windows are pretty easy to express

    • small disruptions are normally readily repaired by freeing up parts of the model and letting it re-solve

    • time windows tend to be pretty large

    • isn't always a notion of an optimal schedule

    • side constraints can be pretty easy to represent

      • work rules aren't always easy

      • drivers may have odd drive/sleep patterns

      • side resource use is tolerable

  • Robustness for many things has to be dealt with, but how that's done is very different

  • And remember how important interface is – they're both important, and very, very different

  • And even more important is understanding what's going on within the problem domain

  • good large software is built on top of amazing small software

Lecture 19: Variable Orderings, continued

Induced Width Ordering, $\omega^{*}$

  • requires designing induced constraints – can cause much, much less backtracking

  • these are called fill-in edges

  • this procedure moralizes the graph

  • computing min induced width is NP hard

  • MIN FILL heuristic is used

    • when removing a node, add an edge between every two nodes connected to the node, but not already connected to each other

    • remove the node whose elimination adds the smallest number of fill-in edges

    • Remember to remove the node with the smallest fill-in first!
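
A sketch of the MIN-FILL elimination ordering described above (neighbor-set representation assumed):

```python
# Sketch of the MIN-FILL heuristic: at each step eliminate the node whose
# elimination adds the fewest fill-in edges, connecting its remaining
# neighbors into a clique before removing it.
from itertools import combinations

def min_fill_ordering(graph):
    adj = {v: set(ns) for v, ns in graph.items()}
    order = []

    def fill_in(v):
        # number of edges needed to make v's neighborhood a clique
        return sum(1 for a, b in combinations(adj[v], 2) if b not in adj[a])

    while adj:
        v = min(adj, key=fill_in)              # fewest fill-in edges
        for a, b in combinations(adj[v], 2):
            if b not in adj[a]:                # add the fill-in edge
                adj[a].add(b)
                adj[b].add(a)
        for n in adj[v]:
            adj[n].discard(v)
        del adj[v]
        order.append(v)
    return order
```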

  • A tree is a graph with induced width 1

  • An ordering has a width, and so does a graph: the graph's width is the minimum over all orderings, and likewise for induced width. $\omega^{*}$ is called the induced width or tree width.

  • triangulated or chordal graphs are graphs in which every cycle of length 4 or more has a chord (an edge between two vertices that are not adjacent in the cycle)

    • Makes it easy to find largest clique, can actually be done in polynomial time

    • is a type of perfect graphs

    • interval graphs are a type of triangulated graph

    • finding minimum number of triangulating edges is also np-hard

Maximal Cardinality ordering

  • Choose an arbitrary node

  • among remaining nodes

    • choose the one connected to the maximum number of already chosen nodes

    • break ties arbitrarily

  • Repeat until done

  • Obtained ordering is instantiation ordering

  • reverse ordering yields a PEO

  • this is the perfect elimination ordering when reversed

  • If the graph does not need to be moralized, the graph is triangulated
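
The maximal-cardinality ordering is short enough to sketch directly (neighbor sets assumed; ties broken by iteration order rather than a stated rule):

```python
# Sketch of maximal-cardinality ordering: start anywhere, then repeatedly
# pick the node adjacent to the most already-chosen nodes. Reversing the
# result gives a perfect elimination ordering on a triangulated graph.
def max_cardinality_ordering(graph, start):
    chosen = [start]
    while len(chosen) < len(graph):
        rest = [v for v in graph if v not in chosen]
        v = max(rest, key=lambda u: len(graph[u] & set(chosen)))
        chosen.append(v)
    return chosen          # reverse for a perfect elimination ordering
```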

Lecture 20: Variable Orderings, continued

Max Cardinality, Continued

  • width of an induced ordering is same as graph

  • MWO can help to guarantee back-track free search

  • Lots of algos came from RDBMs

  • Computing the induced width is NP Hard!!!

  • Simplicial graph is such that looking at an elimination ordering, all parents of a node form a clique

  • moralized is triangulated

Minimal Bandwidth Ordering

  • bandwidth is

    • of a node – the largest distance (in the ordering) between the node and its neighbors

    • of an ordering – the largest bandwidth of the nodes in the ordering

    • of a graph – the smallest bandwidth across all possible orderings

  • heuristic localizes and confines backtracking

  • smaller bandwidth, sooner ability to backtrack

  • find minimum bandwidth ordering is np-hard

  • but finding an ordering of a given bandwidth is polynomial

  • Tsang 6.2.2, only one paper
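
The three bandwidth notions above reduce to one small computation once an ordering is fixed; a sketch (neighbor-set graphs assumed):

```python
# Sketch of the bandwidth of an ordering: the largest distance, in
# ordering positions, between any node and one of its neighbors.
def ordering_bandwidth(graph, ordering):
    pos = {v: i for i, v in enumerate(ordering)}
    return max(abs(pos[v] - pos[n]) for v in graph for n in graph[v])
```

The graph's bandwidth would be the minimum of this over all orderings, which (as noted above) is NP-hard to find.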

Ordering Heuristics

  • When

    • static var, val ordering

    • dynamic var, static val

    • dynamic var, dynamic val – dynamic vvp

  • at each given level

    • VVPs pertaining to the same variable across a given level, static-static

    • Vars are different, dyn-static

    • vvps are different at each and every level

  • Computing

    • sort vars at preprocessing for static

    • dynamic selects var during search, based on current domain, all uninstantiated neighbors, exploits the domino effect!!

Lecture 21: Filtering For Constraints of Difference

  • GAC – Generalized Arc Consistency, for non-binary CSPs

    • Operates on table constraints, by supports or by no-goods

    • $\pi_A(R)$ gives a new relation with only values of A

      • May be a Set, or a Bag

  • finite csps are linear in space, quadratic time when dealing with all-diff constraints

  • Sometimes, all diff may become many constraints that together are equivalent, but not always

  • GAC4 handles n-ary constraints

    • has good pruning power

    • but is very expensive – $\frac{d!}{(d-p)!}$

  • a value is consistent if there exist values for all other variables in the scope such that, together, they simultaneously satisfy the constraint

  • a constraint is consistent if all values for all variables are consistent

  • csp is arc-consistent if all constraints are consistent

  • csp is diff-arc-consistent iff all all-diff constraints are arc-consistent

  • value graphs are bipartite

    • vertices are $X_C$, $\cup_{x \in X_C}(D_x)$

    • edges $(x_i, a)$ iff $a \in D_{x_i}$

  • Space is $\mathcal{O}(nd)$

  • Matching a subset of edges in G with no vertex in common

  • max matching is the biggest possible

  • matching covers a set $X$ – every vertex in $X$ is an endpoint for an edge in matching

  • $M$ that covers $X_C$ is a max matching.

  • If every edge in $GV(C)$ is in a matching that covers $X_C$, $C$ is consistent.

  • CSP is diff-arc-consistent iff, for every all-diff $C \in \mathcal{C}$, for every edge $GV(C)$ belongs to a matching that covers $X_C$ in $GV(C)$

  • Build graph, compute max matching, if size of max matching is smaller than $X_C$, return false, otherwise remove edges from G given the max matching and return true

    • use Maximal Flow in Bipartite graph

    • Or Hopcroft/Karp algo
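
A minimal satisfiability check for one all-diff constraint via maximum bipartite matching; this uses a simple augmenting-path matching rather than maximal flow or Hopcroft-Karp, purely for illustration:

```python
# Sketch: an all-different constraint is satisfiable iff a maximum
# matching of the bipartite value graph covers every variable. This uses
# a basic augmenting-path matching (Kuhn's algorithm), not Hopcroft-Karp.
def all_diff_satisfiable(domains):
    match = {}                         # value -> variable holding it

    def augment(var, seen):
        # try to match var, reassigning earlier variables if needed
        for val in domains[var]:
            if val in seen:
                continue
            seen.add(val)
            if val not in match or augment(match[val], seen):
                match[val] = var
                return True
        return False

    return all(augment(v, set()) for v in domains)
```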

  • Then we have an all-dif constraint, value graph and a maximum covering

    • remove edges that aren't in a matching covering

  • given a Matching M

    • matching edge is an edge in M

    • free edge is an edge not in M

    • matched vertex is incident to a matching edge

    • free vertex is otherwise

    • alternating path (cycle) – a path (cycle) whose edges are alternatively matching and free

      • flip colors on even alternating path – gives another matching of same size

      • flip colors on alternating cycle – another matching

    • length of path – number of edges in path

    • vital edge belongs to every maximum matching

  • An edge belongs to some but not all maximum matchings iff, for an arbitrary maximum matching $M$, it belongs to either an even alternating cycle or an even alternating path that begins at a free vertex

  • Edges to remove are those in no maximum matching: not vital, not on an even alternating path starting at a free vertex, and not on an even alternating cycle.

  • every directed cycle in $G_0$ corresponds to an even alternating cycle of $G$

  • every directed simple path starting at a free vertex corresponds to an even alternating path of $G$ starting at a free vertex

  • make black point one way, red the other

  • allows for all-diff and other global constraints

Lecture 22: CSPs and Relational DB

  • connections

    • constraint databases and constraint logic programming

    • query processing and solving csps

  • shared

    • constraint DBs (deductive DBs, Datalog) and constraint logic programming share representation

    • relational databases and constraint satisfaction share computations

  • Constraint databases use LP – Prolog for DB, Datalog

  • Prolog was a bit of a failed experiment

  • CLP(R,F) is prolog with CSP techniques for either reals or finite domains

  • Relations

    • binary relations – given $D_a$ and $D_b$, a set of any 2-tuples $\langle x, y \rangle$ with $x \in D_a$ and $y \in D_b$, this defines a relation $R_{ab}$, $R_{ab} = \{(x, y)\} \subseteq D_a \times D_b$

    • Function – a binary relation such that there is at most one tuple $\langle x, ? \rangle \in R_{ab}$ for each $x$. $D_a$ is the domain, $D_b$ is the co-domain.

    • $k$-ary relations – are relations on $k$ elements

  • representation

    • 2-dimensional binary array, though k-dimension works

    • tables

    • linked-lists of pairs

    • hash-maps of sets

    • many other options

  • Terms

    • table, relation – constraint

    • arity of relation – arity of variable

    • attribute – variable

    • value of attribute – value of var

    • domain of attribute – domain of variable

    • tuple in table – tuple in constraint/allowed by constraint/consistent with constraint

  • Relational algebra

    • intersection

      • input two relations of same scope

      • output a new, more restrictive relation with the same scope made of tuples in all the input relations simultaneously

      • logical and

    • union

      • input 2 rel of same scope

      • less restrictive relation with same scope of tuples that are in any input relation

      • logical or

    • difference – same as set theory

    • selection

    • projection

    • join

    • composition (from CSP)

Lecture 23: CSPs and Relational DBs, continued

  • Selection and projection are unary operators

  • Selection

    • Input – a relation $R$ and a predicate on attributes of $R$

    • Output a relation $R^\prime$ with the same scope as $R$ but having only a subset of the tuples in $R$

    • Row selection

    • $\sigma_{p}(R)$

  • Projection

    • Input – a relation $R$ and a subset $s$ of the scope

    • Output – a relation $R^\prime$ of scope $s$ with the tuples rewritten such that positions not in $s$ are removed

    • column elimination, in CSP set semantics, in DB bag

    • $\pi_{\{x_1, x_2\}}(R)$

  • Join (natural join)

    • Input – $R$ and $R^\prime$

    • Output a relation $R^{\prime\prime}$ whose scope is the union of $R$ and $R^\prime$ and tuples satisfy both

      • if they have no common attribute, cartesian product

      • if they have an attribute in common, compute all possibilities

    • Computes all solutions to a csp

    • $R \bowtie R^\prime$

  • Composition of two relations

    • Input – two binary relations $R_{ab}$ and $R_{bc}$ with 1 var in common

    • Output – a new induced relation $R_{ac}$ to be combined by intersection to a pre-existsing relation between them

    • bit matrix op is matrix multiplication

    • Generalization as constraint synthesis

    • Produces induced relations a/k/a indirect, inferred, implicit

  • Given $V_1$ and $V_2$ and $C_1$ and $C_2$ between them, how do $C_1 \cap C_2$ and $C_1 \bowtie C_2$ relate? – They'll be the same in this situation

  • Given $V_1$, $V_2$, $V_3$, $C_{V_1, V_2}$ and $C_{V_2, V_3}$, write the induced $C_{V_1, V_3}$ in relational algebra. – $C_{V_1, V_3} = \pi_{V_1, V_3}(C_{V_1, V_2} \bowtie C_{V_2, V_3}) = C_{V_1, V_2} \circ C_{V_2, V_3}$

  • Given $V_1, V_2, V_3$, $C_{V_1, V_2}, C_{V_2, V_3}, C_{V_1, V_3}$, $C^{\prime}_{V_1, V_3} = (C_{V_1, V_2} \circ C_{V_2, V_3}) \cap C_{V_1, V_3}$
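
The composition-and-intersection identities above can be sketched with relations stored as plain sets of pairs (an illustrative representation):

```python
# Sketch of binary-relation operations: composition chains the shared
# variable, and tightening intersects the induced relation with the
# pre-existing one, i.e. C'_13 = (C_12 composed with C_23) intersect C_13.
def compose(r_ab, r_bc):
    """Relational composition of two binary relations sharing variable b."""
    return {(a, c) for a, b in r_ab for b2, c in r_bc if b == b2}

def tighten(c12, c23, c13):
    """Induced constraint intersected with the existing direct constraint."""
    return compose(c12, c23) & c13
```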

  • Natural join – synthesized constraint

  • left/right join – synth constraint with some inconsistent tuples

  • Differences

    • number of relations

      • db have few

      • csp have many

    • size of relations

      • dbs have high-arity

      • csps are mostly binary

    • bounds of relations

      • dbs are very tight

      • csps are loose

    • size of domains

      • db large

      • csp small

    • type of domain

      • almost always finite, but csp may be continuous

    • number of solutions sought

      • dbs looking for almost all

      • csp only looks for one in general

    • cost-model

      • DB is I/O, memory

      • CSP is computational

Lecture 24: Path Consistency and Global Consistency

  • consistency helps to reduce possible search-space

  • A path of length $m$ is consistent iff for any value $x \in D_{V_0}$ and for any value $y \in D_{V_m}$ that are consistent, there exists a sequence of values in the domains of the variables $V_1, V_2, \ldots, V_{m - 1}$ such that all constraints between them (along the path, not across) are satisfied

  • A CSP is path consistent if any path of any length is consistent

  • However, by Mackworth AIJ77, only paths of length 2 are necessary, $\mathcal{O}(n^3)$

  • algo looks at every vertex as a mid-point $k$, and makes $i$ and $j$ together consistent given $k$,

    • and $R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})$

    • Closure algorithm. $k$ is outer loop

    • Until fixed point

      • for k from 1 to n

        • for i from 1 to n

          • for j from 1 to n

            • $R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})$

    • Algo

      • $Y^n \gets R$

      • Until $Y^n = Y^0$

        • For k from 1 to n

          • For i from 1 to n

            • For j from 1 to n

              • See slide for equation

      • $Y \gets Y^n$
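
A sketch of PC-1 along the lines of the pseudo-code above, with relations stored as sets of pairs and $R_{ii}$ holding the identity relation on each domain (both representational assumptions):

```python
# Sketch of PC-1 on a complete binary network: composition is relational
# composition, intersection prunes the direct relation, and the triple
# loop repeats until a fixed point. Illustrative, not optimized.
def pc1(R, variables):
    """R maps ordered pairs (i, j) of variables to sets of allowed pairs;
    R[i, i] must hold the identity relation on variable i's domain."""
    def compose(r1, r2):
        return {(a, c) for a, b in r1 for b2, c in r2 if b == b2}
    changed = True
    while changed:                      # until fixed point
        changed = False
        for k in variables:
            for i in variables:
                for j in variables:
                    new = R[i, j] & compose(R[i, k], R[k, j])
                    if new != R[i, j]:
                        R[i, j] = new
                        changed = True
    return R
```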

  • Difference between k consistent and strong k consistency

  • PC1 terminates, is sound and complete for discrete CSPs - $\mathcal{O}(a^5n^5)$

  • In a complete graph, if every path of length 2 is consistent, the network is path consistent

    • PC1 has two operations, composition and intersection

    • proof is done by induction

  • Mohr and Henderson also developed PC-2, 3

  • and Han, Lee created PC4

  • PC5 from Singh uses support bookkeeping from AC6

  • PC8 uses domains, not constraints; a 2001 algorithm improves on PC8, but isn't well tested

  • PC is rarely used except for certain types of constraints

  • monotonic constraints – ordering based

Lecture 25: Path Consistency and Global Consistency – continued

  • completion allows you to use paths of length 2.

  • Strong PC will ensure both PC and AC

  • AC first, then if no inconsistency, PC (likely PC2)

  • Go from weaker first to stronger – making later steps likely cheaper

  • But PC is often unused – except some situations make it much more likely to find a solution

  • PC is also called 3 consistency

Partial Path consistency

  • same as PC, but no universal constraints, defined over cycles

  • graph is triangulated

  • run closure only over the triangles – $\mathcal{O}(n^3)$

  • must be careful for articulation points in the graph

  • PPC is much easier to implement

  • PPC is strictly weaker than PC as a property

  • PC completes graph, PPC will triangulate

  • PC filters more than PPC, edge-by-edge

  • PPC enforces PC on common edges

  • which means that you can find if something is not path consistent

  • From Bliek and Sam-Haroud 1999

  • Can PC detect unsatisfiability when PPC does not? – It seems not – PPC and PC are roughly equivalent, but there are some cases – and they were hand built

Global Consistency properties

  • minimality/decomposability

  • A minimal network – a network where all tuples in a given constraint appear in at least one solution

  • computing a minimal network is NP complete

  • Solving a minimal network is itself NP complete

Lecture 26: Consistency properties

  • Local Consistencies

    • AC

    • GAC

    • Path Consistency – modifies binary constraints, requires completed graph

    • Partial Path Consistency – triangulates graph, modifies binary relations

  • Global consistency properties

    • Consistency in general – it has a solution

    • minimality – constraints are as tight as possible: on binary constraints, elimination of a tuple eliminates a solution, and every tuple is a member of at least one solution; an idealized form of path consistency that guarantees consistency; important in interactive product configuration – NP complete in general (Gottlob, 2011)

      • determining whether or not a CSP is minimal is NP complete

      • Finding a solution of a minimal CSP is still NP complete

      • occasionally called globally consistent

    • decomposability

      • CSP is decomposable if any combination of assignments to any number of variables that is consistent given the constraints can be extended to a complete solution

      • guarantees back-track free search

      • strongly $n$-consistent

  • there are some islands of tractability

    • tree structure – AC ensures tractability

    • convex constraints – path consistency ensures minimality and decomposability

Lecture 27: Consistency properties, continued

  • minimality doesn't guarantee bt-free search

  • decomposability ensures no back tracking

  • PC approximates minimality

  • when constraints are convex, PPC on triangulated graph guarantees minimality and decomposability

  • when $\circ$ distributes over $\cap$, then PC1 guarantees that a CSP is minimal and decomposable, and that the outer loop can be removed

  • but this doesn't always hold

  • constraints of bounded difference and temporal reasoning

    • vars,

    • constraints of the form $a \leq Y - X \leq b$, i.e., $Y - X \in [a, b] = I$

    • composition and intersection from slides

Lecture 28: Consistency properties, continued

  • Remember, once decomposability is enforced, search is guaranteed backtrack free

  • When constraints are convex, use of PPC ensures minimality and decomposability, but triangles must be done in PEO

  • When composition distributes over intersection

    • PC1 generalizes Floyd-Warshall (all-pairs shortest path)

    • PC-1 generalizes Warshall (transitive closure)

  • Don't confuse properties with algorithms

  • Local consistency

    • remove inconsistent values (node, arc)

    • remove tuples (path consistency)

    • get closer to solution

    • reduce size of problem

    • are cheap

  • Aim for global consistency, but it's hard

  • Sometimes (given certain types of constraints or graphs), local consistency guarantees global consistency

    • distributivity in PC, row-convex constraints, special networks

Lecture 28: Local Search

  • Systematic/constructive search is all we've talked about

  • Starts at a solution that's not correct, try to repair

  • General method:

    • for each variable, choose a value

    • check if solution

      • You're done. Lucky.

      • pick a value that broke a constraint and change it (using min-conflict)

  • Local search is in worst case, $\mathcal{O}(d^n)$
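
The repair loop above, sketched with the min-conflict value choice; the `conflicts(var, val, assignment)` counting function is an assumed interface:

```python
# Sketch of min-conflicts local search: start from an arbitrary complete
# assignment, then repeatedly move a conflicted variable to its
# least-conflicting value until a solution is found or patience runs out.
import random

def min_conflicts(domains, conflicts, max_steps=1000, seed=0):
    rng = random.Random(seed)
    assign = {v: rng.choice(sorted(d)) for v, d in domains.items()}
    for _ in range(max_steps):
        bad = [v for v in assign if conflicts(v, assign[v], assign) > 0]
        if not bad:
            return assign               # solution found
        v = rng.choice(bad)             # pick a conflicted variable
        assign[v] = min(sorted(domains[v]),
                        key=lambda x: conflicts(v, x, assign))
    return None                          # ran out of patience
```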

  • States laid on a surface

  • quality/cost is height

  • optimum state maximizes solution quality while minimising cost

  • move around from state to state to find a local optimum

  • Gradient descent – will only ever find a local optimum

  • State is a complete assignment of values to variables

  • how do you move – CSP uses min-conflict

  • eval function rates the quality of a state, generally in number of broken constraints

  • Try

    • Start w/ full but arbitrary assignment of values

    • reduce inconsistencies by local repairs

    • repeat until

      • A solution is found (a global minimum)

      • The current solution cannot be repaired (local minimum)

      • You've gone through max tries (running out of patience)

  • Local repair for decision problems, local optimization for optimization problems

  • Greedy types are hill climbing, local beam

  • Local beam

    • start in parallel with a number of different points

    • choose among those another number of best moves

    • remove those without promise

    • and repeat as necessary

  • Stochastic methods

    • break out, helps to avoid getting stuck in a local minimum

    • when stuck in a local minimum, walk randomly, pick a random variable, change to a random value

    • tabu search

      • keep a list of tabu positions (those that aren't good), but only a certain number

      • when a new state is visited, added to tabu, and farthest back forgotten

      • if no better, go to the next state not tabu

      • repeat until a global minimum is found

    • sometimes you may randomly ignore a heuristic

  • Local search problems

    • get stuck in a local optimum

    • can have issues with plateaus (flat local maxima) – be ready to move to a state with the same quality

    • ridges can go from side to side with minimal progress

  • when stuck, do random walk

  • Simulated annealing

    • When stuck in a local optimum, allow steps towards less good neighbors to escape it

    • be more random at the start, be less random as you go

    • Start at a random state, start countdown and loop until over

      • pick a neighbor at random

      • $d$ = quality of neighbor - quality of current state

      • if $d > 0$

        • then move to neighbor and restart

        • otherwise, move to a neighbor with a transition probability $p<1$

    • Transition probability proportional to $e^{\frac{d}{t}}$

      • $d$ is negative

      • $t$ is countdown

      • As time passes, you're less and less likely to move towards bad neighbors
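
A sketch of the annealing loop above with a simple linear cooling schedule (the schedule, and keeping the best state seen, are illustrative choices):

```python
# Sketch of simulated annealing: always accept improving neighbors,
# accept worsening ones with probability e^(d/t) where d < 0, and lower
# the temperature t as the countdown proceeds.
import math
import random

def anneal(start, neighbors, quality, t0=10.0, steps=500, seed=0):
    rng = random.Random(seed)
    state = best = start
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9   # simple linear cooling
        nxt = rng.choice(neighbors(state))
        d = quality(nxt) - quality(state)
        if d > 0 or rng.random() < math.exp(d / t):
            state = nxt
            if quality(state) > quality(best):
                best = state                  # keep the best state seen
    return best
```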

  • But all forms of local search are:

    • non-exhaustive

    • non-systematic

    • liable to get stuck in a local optimum

    • non-deterministic – outcome depends on where you start

    • heavy-tailed – the longer you go, the less likely you are to find a solution

  • Is often somewhat probabilistic

Lecture 29: Local Search, Continued

  • starts from a complete assignment, and moves to a new assignment that's better

  • Very greedy approach, but can be parameterized very easily

  • Tends to be very greedy, but only allows improvements

  • others can be better, and allow for random walks/steps – stochastic algorithms

  • stochastic algorithms are not deterministic!

  • Tabu search moves to a state that may be worse, provided it's not in a list of previous states

  • break-out modifies weight of objective function

  • simulated annealing involves countdowns and loops, controlled by the quality of the neighbor and the time; with a sufficiently slow schedule, it is guaranteed to find an optimum

  • Have issues

    • non-systematic and non-exhaustive

    • have issues getting stuck in local optima

    • non-deterministic!!

    • generally the probability of improving a solution gets smaller, but never actually dies out – heavy-tailed

  • So, when you hit a point where it seems improvability drops

    • keep the most promising solution

    • and start someplace else

    • Random restart

  • Luby restart profile

    • uses the Luby sequence of cutoffs to decide how long to run before each restart
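
For reference, the Luby cutoff sequence (1, 1, 2, 1, 1, 2, 4, 1, 1, 2, ...) can be computed as follows; this is the standard definition, not anything specific to the slides:

```python
# Sketch of the Luby restart sequence: the i-th cutoff (1-based) is
# 2^(k-1) when i == 2^k - 1, and otherwise recurses into the repeated
# prefix. Cutoffs are usually scaled by a base number of search steps.
def luby(i):
    while True:
        k = 1
        while (1 << k) - 1 < i:        # smallest k with i <= 2^k - 1
            k += 1
        if (1 << k) - 1 == i:
            return 1 << (k - 1)
        i = i - (1 << (k - 1)) + 1     # recurse into the subsequence
```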

Genetic algos for local search

  • Combine two complete assignments to generate offspring

  • Start from an initial position

  • encode assignments in a compact manner

  • combine partial solutions to generate new solutions

  • requires initial generation of many random assignments

  • Choose solutions to combine based on quality

  • fitness is used to describe quality, assigns probability for solution

  • Select pairs depending on fitness

  • crossover point is chosen randomly, offspring are generated

  • mutation will randomly change a state

  • and this gets put back in the original population

ERA – Environment, Rules, Agents

  • Environ is $n \times a$ board

  • each var is an agent

  • each position is a value of a domain

  • agent moves in a row on the board

  • each position records the number of violations caused by the fact that agents are occupying other positions

  • agents try to occupy positions where no constraints are broken

  • agents move according to reactive/reaction rules

  • agents may themselves have preferences

  • move rules can be as simple or complex as possible

  • rules:

    • least-move – choose position with min violation

    • better-move – choose position with smaller violation

    • random-move – randomly choose a position

    • others – combinations of those above

  • No one talks to each other, but share a common context

  • everyone kicks each other out until everyone is happy

  • very effective in solving very tight, but solvable instances

  • unstable in over-constrained cases

    • agents keep kicking each other out (livelock)

    • livelocks can be used to identify bottlenecks

  • very stable on solvable instances

  • oscillates wildly on unsolvable instances

  • Agent motion can be categorized as variable, stable, constant

Lecture 30: Local Search, continued

ERA, Continued

  • Rules help to find a solution somewhat quickly

  • Has less of an issue with getting stuck in various local optima

  • May be very good for resource-allocation problems

  • more dynamic, as there's no global evaluation function

Nothing works, what to do

  • Random restart

    • when no progress is made, restart from a new randomly selected state

    • save the best results so far

  • repeat random restart

    • for a fixed number of iterations

    • until best results have not been improved on for a certain number of iterations – e.g., a geometric law

Evaluation of Local Search

  • Test on a given problem instance

    • an ensemble of problem instances

  • Experiment

    • running the algo thousands of times

    • measure some metric as a random variable – i.e., the time needed to find a solution

  • Then provide

    • probability distribution function approximated by the number of runs that took a given time to find a solution

    • the cumulative distribution function approximated by the number of runs that found a solution within a given time

  • Then compare performance with statistical tests

    • t-tests assume normal distribution of the metrics

    • nonparametric tests don't

    • Consult with a trained statistician

Lecture 30: Structure-based methods

  • CSPs represented by networks

    • for bCSPs, graph

      • constraint graph

      • microstructure

      • co-microstructure

    • for general CSPs

      • hypergraph

      • dual – allowed tuples connected to each other when shared variables match

      • primal graph – vertices in graph are original variables, and variables in a constraint form a clique

Lecture 31: Structure-based methods, continued

  Graph               Vertices     Edges
  Binary              Variables    Constraints
  Non-binary          Variables    Constraints
  Dual                Constraints  Equality constraints on shared variables
  Primal              Variables    Cliques over scopes of constraints
  $\mu$-structure     VVPs         Supports
  co-$\mu$-structure  VVPs         Conflicts
  • Incidence graph – Variables are one type of node, constraints another, edges connect variables to the constraints, is bipartite

Tree-structured CSPs

  • Tree-structured CSPs can be solved in polynomial time

    • apply $revise(V_i, V_j)$ for all nodes from leaves to root

    • instantiate variables from root to leaves

  • Generally

    • directional consistency from leaves to root

    • solutions built from root to leaves

  • This is often the case for configuration problems

  • The width of a tree is always 1

  • If a CSP graph has $w = 1$, it is guaranteed to be solved with 2 consistency

  • Freuder has a theorem that if a CSP's graph has width $w$, then strong $(w+1)$-consistency guarantees it can be solved backtrack-free.

    • May have issues w/ changing width of graph

    • There are ways to prevent this, but is not necessarily true

Cycle-cutset

  • identify a cycle cutset $S$ in the CSP

    • cutset – nodes when removed yield a tree

  • Decompose CSP into two partitions

    • $S$

    • $T$ the nodes forming a tree

  • Iterate until a solution is found

    • Solve the nodes in $S$

    • Try to extend the solution to $T$

  • $|S| < |V|$

  • if $|S| = c$, then $\mathcal{O}(d^c (n - c) d^2) = \mathcal{O}(n d^{c+2})$

  • But finding the smallest cutset is NP-hard

Lecture 32: Structure-based methods, continued

Dynamic Dangle Identification

  • graham reduction used to detect if schema is acyclic

    • join of acyclic schema is polynomial

    • remove vertices w/ degree 1

  • After each var assignment/lookahead

    • check connectivity

    • identify dangling trees using graph reduction

  • for each dangling root

    • do DAC from leaf to root

    • wipeout means unsolvability

  • Restrict search to nodes outside of the identified dangling trees

Directional Path Consistency

  • Given a binary CSP and a fixed variable ordering

  • $\mathcal{O}(n {w^*}^2 d^3)$

  • if $w^* \leq 2$

    • DPC determines consistency

    • guarantees BT-free search given the ordering

    • But this only works iff induced width is less than or equal to 2, you get strong consistency, minimality and tractability

Biconnected components

  • decomposition into separable graphs

  • consider a CSP whose graph has articulation nodes

  • assume that the largest biconnected component is $b$

  • build a tree whose nodes are the biconnected components, with each articulation node assigned to the parent component (or to both parent and child)

  • build an order using a pre-order traversal of the tree

  • complexity is then bound by the size of the largest biconnected component

Lecture 33: Structure-based methods, continued

Biconnected components

  • must have articulation nodes – nodes common to two or more biconnected components

  • complexity is $\mathcal{O}(d^b)$, where $d$ is the domain size and $b$ is the size of the largest component

  • Use a pre-order of the component tree, which bounds the BT effort by the size of the largest component

  • Originally from 1982, in Journal of the ACM
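
Articulation nodes can be found with a standard DFS low-link computation; a sketch, assuming a hypothetical adjacency-list encoding of the constraint graph:

```python
def articulation_points(adj):
    """Returns the set of articulation nodes of an undirected graph."""
    depth, low, cut = {}, {}, set()
    def dfs(u, parent, d):
        depth[u] = low[u] = d
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in depth:                      # back edge
                low[u] = min(low[u], depth[v])
            else:
                children += 1
                dfs(v, u, d + 1)
                low[u] = min(low[u], low[v])
                # no path from v's subtree climbs above u: u is a cut vertex
                if parent is not None and low[v] >= depth[u]:
                    cut.add(u)
        if parent is None and children > 1:     # root rule
            cut.add(u)
    for x in adj:
        if x not in depth:
            dfs(x, None, 0)
    return cut

# Two triangles sharing vertex C: C is the only articulation node.
g = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B', 'D', 'E'],
     'D': ['C', 'E'], 'E': ['C', 'D']}
cut_nodes = articulation_points(g)
```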

Tree Clustering

  • bi-connected components, but with more than one shared variable for each cluster

  • triangulates the graph

  • finds maximal cliques

  • create a tree structure from cliques

  • Then repeat:

    • solve a clique at each node in the tree

    • apply DAC from leaves to root

    • generates solutions in a BT-free manner

  • MaxCliques from Algorithmic Graph Theory and Perfect Graphs, Golumbic

  • Triangulation is $\mathcal{O}(n^2)$

  • Max clique is $\mathcal{O}(n + e)$ in triangulated graph

  • $\mathcal{O}(k^r)$ for solving clusters, $k$ is domain size, $r$ is size of largest clique ($t = k^r$)

  • generating a solution is $\mathcal{O}(nt\log t)$, where $t$ is the number of tuples in each cluster (the sorted domain of a 'super' variable – sorting speeds this up); in the best case we have $n$ cliques

  • complexity bounded by size of largest clique:

    • $\mathcal{O}(n^2) + \mathcal{O}(k^r) + \mathcal{O}(nt\log t) = \mathcal{O}(nk^r\log k^r) = \mathcal{O}(nrk^r)$
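
Triangulation is typically done heuristically; a min-fill sketch, assuming a hypothetical adjacency-set encoding: repeatedly eliminate the vertex whose neighbourhood needs the fewest fill edges, adding those edges to the graph as you go.

```python
def min_fill_triangulate(adjacency):
    """Returns (elimination_order, fill_edges)."""
    adj = {x: set(ns) for x, ns in adjacency.items()}
    order, fill = [], []
    def missing(x):
        ns = list(adj[x])
        return [(u, v) for i, u in enumerate(ns) for v in ns[i + 1:]
                if v not in adj[u]]          # neighbour pairs not yet adjacent
    while adj:
        x = min(adj, key=lambda y: len(missing(y)))
        for u, v in missing(x):              # make x's neighbourhood a clique
            adj[u].add(v)
            adj[v].add(u)
            fill.append((u, v))
        for y in adj[x]:
            adj[y].discard(x)
        del adj[x]
        order.append(x)
    return order, fill

# A 4-cycle A-B-C-D needs exactly one chord to become triangulated.
g = {'A': {'B', 'D'}, 'B': {'A', 'C'}, 'C': {'B', 'D'}, 'D': {'A', 'C'}}
order, fill = min_fill_triangulate(g)
```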

Lecture 34: Structure-based methods, continued

Bucket Elimination

  • Built on non-binary CSPs

  • Dechter 1997

  • Choose a variable ordering – PEO recommended

  • create buckets for each variable, placing each constraint in the bucket of the deepest variable in its scope

  • create tree node for each bucket

  • From bottom to top

  • join relations in bucket (including on the domain)

  • project out the bucket's variable from the join, and place the result in the deepest bucket sharing a variable with it

  • top-down creates solution backtrack free

  • Can handle cycles!

  • And is a pretty quick form of DAC, but for non-binary CSPs
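
A relational sketch of the bottom-up pass, assuming a toy encoding in which a constraint is a (scope, set-of-tuples) pair; the example problems are hypothetical:

```python
def join(r1, r2):
    """Natural join of two relations given as (scope, tuples)."""
    (s1, t1), (s2, t2) = r1, r2
    shared = [v for v in s1 if v in s2]
    scope = s1 + [v for v in s2 if v not in s1]
    out = set()
    for a in t1:
        for b in t2:
            row1, row2 = dict(zip(s1, a)), dict(zip(s2, b))
            if all(row1[v] == row2[v] for v in shared):
                merged = {**row1, **row2}
                out.add(tuple(merged[v] for v in scope))
    return scope, out

def project_out(rel, x):
    """Project variable x out of a relation."""
    scope, tuples = rel
    keep = [v for v in scope if v != x]
    idx = [scope.index(v) for v in keep]
    return keep, {tuple(t[i] for i in idx) for t in tuples}

def bucket_elimination(order, domains, constraints):
    """order: shallowest first. Returns True iff the problem is consistent;
    a full version would keep the messages to build a solution top-down."""
    buckets = {x: [] for x in order}
    for scope, tuples in constraints:
        buckets[max(scope, key=order.index)].append((scope, tuples))
    for x in reversed(order):                      # bottom-up pass
        rels = buckets[x] or [([x], {(v,) for v in domains[x]})]
        joined = rels[0]
        for r in rels[1:]:
            joined = join(joined, r)
        msg = project_out(joined, x)               # eliminate x
        if msg[0]:
            if not msg[1]:
                return False                       # empty message: inconsistent
            buckets[max(msg[0], key=order.index)].append(msg)
        elif not joined[1]:
            return False
    return True

# Hypothetical examples: X < Y < Z is satisfiable; adding Z < X is not.
doms = {'X': [1, 2, 3], 'Y': [1, 2, 3], 'Z': [1, 2, 3]}
lt = lambda s: (list(s), {(a, b) for a in doms[s[0]] for b in doms[s[1]] if a < b})
chain_ok = bucket_elimination(['X', 'Y', 'Z'], doms, [lt(['X', 'Y']), lt(['Y', 'Z'])])
cycle_ok = bucket_elimination(['X', 'Y', 'Z'], doms,
                              [lt(['X', 'Y']), lt(['Y', 'Z']), lt(['Z', 'X'])])
```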

Lecture 35: Structure-based methods, continued

Bucket Elimination/Tree Clustering

  • linear in $n$, exponential in induced width + 1

  • induced width is the number of vars in the largest component

  • gives fixed parameter complexity

Bucket Elimination

  • very much like Gaussian elimination

  • Useful for Bayesian networks – graphical models of all kinds

Tree Decomposition

  • Generalizes all of structure methods

  • comes from database theory

  • organizes any constraint problem into a tree

  • $(T, \chi, \Psi)$ of $P = (X, D, C)$

    • $T$ – $(V, E)$ the tree

    • $\chi(v) \subseteq X$ – the variables labeling node $v$

    • $\Psi(v) \subseteq C$ – the constraints labeling node $v$

    • Each constraint must appear on at least one node in the tree, and all its variables are in that node

    • the nodes in which any given variable appears induce a connected subtree (the connectedness property): every node on the path between two nodes containing a variable also contains that variable

  • join trees and acyclic DBs from database theory

  • organizes the constraint network into a tree structure, a join-tree

  • labels tree nodes with variables and constraints – labels must obey some rules

  • Solving

    • bottom-up – joins relations and projects on parent

    • top-down generates the solution

  • Complexity bound by width

    • tree width is the max number of variables in a node - 1

    • hyperwidth is the max number of constraints in a node

  • Group of variables are the variables in common to two nodes

  • However, selecting the tree decomposition is NP-hard, use approximation by min-fill

  • Quite a few different tree-decomposition techniques – Gottlob's HYPERTREE seems to be one of the best

  • parameters

    • tree width max number of variables in any node minus 1

    • hyperwidth – max number of constraints in any node

  • theory – characterizes tractable csps

    • complexity is bound by tree/hyper-width

    • fixed parameter complexity – independent of number of variables

  • practice – uses structure to solve csps

  • requires balancing time and space
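
The two labeling rules above (covering and connectedness) can be checked mechanically; a sketch, assuming a hypothetical encoding of the decomposition:

```python
def check_decomposition(tree_edges, chi, scopes):
    """tree_edges: (u, v) node pairs of T; chi: node -> set of variables;
    scopes: constraint scopes (sets of variables). True iff both rules hold."""
    # Covering: every constraint fits entirely inside at least one node.
    covered = all(any(set(s) <= chi[n] for n in chi) for s in scopes)
    # Connectedness: the nodes containing each variable form a subtree.
    def connected(nodes):
        nodes = set(nodes)
        if not nodes:
            return True
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            for a, b in tree_edges:
                if a == u and b in nodes:
                    stack.append(b)
                if b == u and a in nodes:
                    stack.append(a)
        return seen == nodes
    variables = set().union(*chi.values())
    running = all(connected([n for n in chi if x in chi[n]]) for x in variables)
    return covered and running

# Path 1-2-3 of nodes; in the good labeling each variable's nodes are
# contiguous, in the bad one variable B appears only in nodes 1 and 3.
good = check_decomposition([(1, 2), (2, 3)],
                           {1: {'A', 'B'}, 2: {'B', 'C'}, 3: {'C', 'D'}},
                           [{'A', 'B'}, {'B', 'C'}, {'C', 'D'}])
bad = check_decomposition([(1, 2), (2, 3)],
                          {1: {'A', 'B'}, 2: {'C'}, 3: {'B', 'D'}},
                          [{'A', 'B'}])
```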

Lecture 26: More on Constraint Consistency

Global Consistency Properties

  • minimality – relations join completely, $n$-consistency

  • decomposability – strong $n$-consistency

  • solvability

Local Properties

  • AC/GAC – Domains

  • Path Consistency – Constraints, add new constraints (constraint synthesis)

  • Partial Path Consistency – Constraints, add new constraints

  • Directional Path Consistency – Constraints, add new constraints

  • Types of constraint algorithms

    • Binary

    • Non-Binary

    • Global Constraints

  • For continuous domains, work on box-consistency

  • On interval, box consistency can occasionally also be used

  • weighted CSPs can also be dealt with, with a variant of AC

Info

  • arc consistency – $(1,1)$-consistency – every consistent assignment of one variable can be extended to one more variable

  • path consistency – $(2,1)$-consistency – every consistent assignment of two variables can be extended to one more variable

  • $(i,j)$-consistency – every consistent assignment of $i$ variables can be extended to $j$ more variables ($i + j \leq n$)

  • strong $k$-consistency is $k$-consistency down to $1$ consistency

Lecture 27: Still more on Constraint Consistency

  • domain minimality – constraints over arity 1, Gottlob 2011

  • however, even if domain minimality and constraint minimality hold, solving is still NP complete

  • global inverse consistency – given a value for one variable, it can be extended to an assignment of the remaining $n - 1$ variables

$(i, j)$-consistency

  • $j = 1$

    • 2 – AC

    • 3 – PC

    • k – $k$-consistency

  • $(1, n - 1)$ – domain minimality, global inverse consistency

  • $(1, k)$ – costs $\mathcal{O}(d^{k + 1})$; guarantees that a single assignment can be extended over $k$ more variables

  • NIC – Neighborhood Inverse Consistency, for every variable, look at its neighbors, and make consistent with those

Lecture 28: More Constraint Consistency

Singleton Arc Consistency

  • family of algos based on singleton test

  • the singleton test resembles a one-step search probe

  • A CSP is singleton arc consistent iff it remains AC after asserting any single VVP (variable-value pair)

  • Algo

    • Until No change

      • for each var

        • for each value

          • assign the val to the var

          • if CSP is AC, keep

          • else Remove

  • Sometimes, when you get lots of backtracks, trigger a high level consistency
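
The loop above can be sketched as SAC-1 with a plain AC-3 subroutine; the CSP encoding (a dict of directed binary constraints) is a hypothetical example:

```python
from collections import deque

def ac3(domains, constraints):
    """constraints: dict (x, y) -> predicate(vx, vy), both directions listed.
    Filters domains in place; returns False on a domain wipe-out."""
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        pred = constraints[(x, y)]
        revised = [v for v in domains[x] if any(pred(v, w) for w in domains[y])]
        if len(revised) != len(domains[x]):
            domains[x] = revised
            if not revised:
                return False
            # re-examine arcs pointing at x
            queue.extend((z, t) for (z, t) in constraints if t == x and z != y)
    return True

def sac1(domains, constraints):
    """Remove every value whose singleton assignment makes the CSP arc
    inconsistent; repeat until a fixpoint. Returns False on a wipe-out."""
    changed = True
    while changed:
        changed = False
        for x in list(domains):
            for v in list(domains[x]):
                test = {y: list(d) for y, d in domains.items()}
                test[x] = [v]               # the singleton test: assert x = v
                if not ac3(test, constraints):
                    domains[x].remove(v)
                    changed = True
    return all(domains.values())

# Triangle of != over two values: AC alone detects nothing, SAC wipes it out.
doms = {'X': [0, 1], 'Y': [0, 1], 'Z': [0, 1]}
neq = lambda a, b: a != b
cons = {(a, b): neq for a in 'XYZ' for b in 'XYZ' if a != b}
sac_ok = sac1(doms, cons)
```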

Inverse consistency

  • (1, 2) consistency – Path Inverse consistency

  • Neighborhood Inverse – see previous

  • GIC – Global Inverse Consistency – ensure domain minimality

Special Constraints – AC

  • functional – A constraint C is functional with regard to D iff for all $v \in D$ there exists at most one $w \in D$ such that $C(v,w)$

  • monotonic constraints – a constraint $C$ is monotonic with regard to $D$ iff there exists a total ordering on $D$ such that, for all $v, w \in D$, $C(v,w)$ holds implies $C(v^{\prime}, w^{\prime})$ holds for all values $v^{\prime}, w^{\prime} \in D$ such that $v^{\prime} \leq v \land w^{\prime} \leq w$

Non-Binary Special constraints

  • almost all properties and algorithms discussed were restricted to binary constraints

  • consistency for non-binary is the topic of current research

    • in particular domain filtering that doesn't change topology or modify constraint definitions

    • relational $m$-consistency – adding constraints/changing constraint definitions

  • Domain Filtering

    • GAC

    • Singleton GAC

    • maxRPWC, rPIC, RPWC, etc

  • Relational consistency

    • Strong consistencies

    • Simple Tabular Reduction

Lecture 29: Extensions to BT Search

  • search with a bounded number of backtracks

  • search with bounded backtrack depth

  • limited discrepancy search

  • credit-based backtrack search

  • randomized backtrack search

Credit Based Search

  • start with a given credit, usually $n^3$

  • assign 1/2 credit to current assignment, 1/2 to the remaining

  • keep going in a depth-first manner, until credit is used up, chronologically backtrack from there

  • Used in conjunction with backtrack bounded search
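
The credit-splitting scheme above can be sketched as follows, assuming a toy CSP interface where `consistent(assignment, var, val)` checks a new assignment against the current one; everything here is a hypothetical illustration:

```python
def credit_search(variables, domains, consistent, assignment=None, credit=None):
    assignment = dict(assignment or {})
    if credit is None:
        credit = len(variables) ** 3        # the usual initial credit n^3
    if len(assignment) == len(variables):
        return assignment
    x = next(v for v in variables if v not in assignment)
    values = [v for v in domains[x] if consistent(assignment, x, v)]
    remaining = credit
    for i, v in enumerate(values):
        # half of the remaining credit funds this branch, the rest goes on
        branch = remaining // 2 if remaining > 1 else remaining
        remaining -= branch
        result = credit_search(variables, domains, consistent,
                               {**assignment, x: v}, branch)
        if result is not None:
            return result
        if remaining == 0 and i < len(values) - 1:
            break    # credit exhausted: the full scheme would switch to
                     # backtrack-bounded search here; this sketch just stops
    return None

# Hypothetical 2-variable all-different problem.
alldiff = lambda asg, x, v: all(v != w for w in asg.values())
found = credit_search(['X', 'Y'], {'X': [1, 2], 'Y': [1]}, alldiff)
```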

Limited Discrepancy Search

  • the heuristic may be ill-informed at shallow levels of search

  • disobeys the value-ordering heuristic a bounded number of times (the discrepancies)

  • Matthew Ginsberg
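
A sketch of the idea, assuming a toy CSP interface; a discrepancy here is taking anything but the heuristically first value, and the allowed number of discrepancies grows from 0 upward:

```python
def lds(variables, domains, consistent, max_discrepancies):
    def probe(assignment, rest, k):
        if not rest:
            return dict(assignment)
        x, *more = rest
        for i, v in enumerate(domains[x]):
            cost = 0 if i == 0 else 1       # non-first value = one discrepancy
            if cost > k or not consistent(assignment, x, v):
                continue
            result = probe({**assignment, x: v}, more, k - cost)
            if result is not None:
                return result
        return None
    # iteratively allow 0, 1, 2, ... discrepancies
    for k in range(max_discrepancies + 1):
        result = probe({}, list(variables), k)
        if result is not None:
            return result
    return None

# Hypothetical example: the heuristic tries 1 first, but the only solution
# needs X = 2; LDS finds it once one discrepancy is allowed.
alldiff = lambda asg, x, v: all(v != w for w in asg.values())
lds_sol = lds(['X', 'Y'], {'X': [1, 2], 'Y': [1]}, alldiff, max_discrepancies=2)
```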

Randomized BT

  • in systematic search, the ordering of variables and values determines which solutions are found first

  • randomization allows us to explore wider portion of search tree

  • when thrashing stalls BT search, interrupt it and restart

  • Pick random variables at instantiation

  • And random values during value selection

  • Use the restart schedule when thrashing

    • Fixed cutoff and universal – Luby

    • Randomization and Rapid Restarts (Gomes) – fixed optimal cutoff, prior knowledge of the cost distribution required

    • randomization and geometric restarts (Walsh) – $C_{i + 1} = r C_i$, $C_0 = n$

    • Randomization and dynamic geometric restarts (Guddeti) – $C_{i + 1} = \left\{\begin{matrix} rC_i & \text{when solution improved} \\ C_i & \text{otherwise}\end{matrix}\right.$

    • Bayesian approach (Kautz)
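
Two of the cutoff schedules above can be sketched directly: the Luby sequence used by fixed-and-universal schedules, and the geometric schedule $C_{i+1} = r C_i$ with $C_0 = n$ (the function names and parameters here are illustrative):

```python
def luby(i):
    """1-indexed Luby sequence: 1, 1, 2, 1, 1, 2, 4, 1, 1, 2, ..."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if (1 << k) - 1 == i:
        return 1 << (k - 1)                  # i sits at the end of a block
    return luby(i - (1 << (k - 1)) + 1)      # recurse into the repeated prefix

def geometric_cutoffs(n, r, count):
    """First `count` cutoffs of the geometric schedule starting at C_0 = n."""
    cutoffs, c = [], n
    for _ in range(count):
        cutoffs.append(c)
        c *= r
    return cutoffs

luby_prefix = [luby(i) for i in range(1, 8)]      # [1, 1, 2, 1, 1, 2, 4]
geo = geometric_cutoffs(100, 2, 4)                # [100, 200, 400, 800]
```

In practice each cutoff is multiplied by a base number of backtracks before a restart is triggered.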