Fundamentals of Constraint Processing
Lecture 2
Resources
-
check website – has many, many links
-
conferences
-
journals
-
include logic programming and CSP
-
Linear Programming, Integer Programming, Mixed Integer Programming, Dynamic Programming
-
-
journals are also pretty big – and tend to be easier to get into
-
many good ways to get jobs with this
Constraint Satisfaction 101
-
examples
-
generating optimal decisions
-
have unary, binary, n-ary constraints
-
often have many choices, but restrictions can make it easier to make the right decision
-
-
-
modeling constraints remains an art
-
sets of variables with sets of choices, and a set of constraints restricting combinations of values
-
applications
-
radio resource management
-
databases
-
temporal and spatial reasoning
-
allocation
-
design/config
-
graphics, visualization, interfaces
-
verification and engineering
-
HCI and decision support
-
molecular biology
-
analytical chemistry
-
robotics, machine vision, computational linguistics
-
transportation
-
qualitative and diagnostic reasoning
-
-
constraint processing is modeling, plus inference and search
-
modeling – variables, domains, constraints, query
-
inference – enforcing consistency (filtering) and propagation (AC, PC)
-
search – systematic (chronological backtracking, with or without intelligence, etc.) or non-systematic (hill climbing, tabu, etc.)
-
-
Real Example
-
The Netherlands has a dense network
-
wanted a more robust, more effective schedule, increasing profit
-
used OR and AI
-
-
Singapore
-
large container hub
-
has a custom, special-purpose loading system built using CP
-
-
CP is:
-
solving decision problems
-
with arbitrary constraints
-
providing concise and high-level feedback about alternatives and conflicts
-
-
Very Practical
Defining a problem
-
has a given and a question
-
given is a set of objects, the relations, etc
-
question ex: find $x$ such that condition $y$ is satisfied
-
these can be simple or advanced
-
-
a problem must always, always, always have a question!
-
Given $\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$
-
$\mathcal{V} = \{V_1, V_2, \ldots, V_n\}$ Variables
-
$\mathcal{D} = \{D_{V_1}, D_{V_2}, \ldots, D_{V_n}\}$ Domains for Variables
-
Constraints $\mathcal{C} = \{C_1, C_2, \ldots, C_l\}$ where $C_{V_i,V_j,\ldots,V_k} = \{(x,y,\ldots,z)\} \subseteq D_{V_i} \times D_{V_j} \times \ldots \times D_{V_k}$
-
-
Query: can we find a value for each variable such that all constraints are satisfied? Queries can include weights (making it a Constrained Optimization Problem)
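A minimal sketch of this $(\mathcal{V}, \mathcal{D}, \mathcal{C})$ model in Python; the map-coloring instance, variable names, and representation of constraints as sets of allowed pairs are illustrative assumptions, not from the lecture:

```python
# A tiny CSP P = (V, D, C): 3-coloring a triangle of regions (illustrative example).
variables = ["A", "B", "C"]
domains = {v: {"red", "green", "blue"} for v in variables}

# Binary constraints in extension: the set of allowed (x, y) pairs for each scope.
def all_diff_pairs(da, db):
    return {(x, y) for x in da for y in db if x != y}

constraints = {
    ("A", "B"): all_diff_pairs(domains["A"], domains["B"]),
    ("B", "C"): all_diff_pairs(domains["B"], domains["C"]),
    ("A", "C"): all_diff_pairs(domains["A"], domains["C"]),
}

# Query: does an assignment of one value per variable satisfy all constraints?
def satisfies(assignment):
    return all((assignment[i], assignment[j]) in rel
               for (i, j), rel in constraints.items())

print(satisfies({"A": "red", "B": "green", "C": "blue"}))  # True
```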
-
Various levels of consistency
-
AC
-
GAC
-
SAC
-
SGAC
-
-
Sudoku can often be solved with consistency alone, though search may still be required for some instances
-
find solution – decisions
-
find the number of, or all, solutions – counting
-
find set of constraints that can be removed so that a solution exists – optimization
-
specials
-
binary CSPs have constraints over at most two variables
-
restricted to $\{0, 1\}$, boolean CSP
-
finite, discrete domains, enumeration can work
-
continuous domains require sophisticated algebraic techniques – consistency techniques are then used on domain bounds
-
Lecture 3
-
always a given $\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$
-
Constraints are relations
-
relations are a subset of the cartesian product of $n \geq 2$ sets
-
relations can be represented in extension as a table of allowed tuples, referred to as supports – this is a table constraint
-
in extension, the relation is a set with all of its members enumerated
-
conflicts are the set of tuples that are no good
-
Constraints can also be in intension – they're defined in terms of a logical function, equivalent to extension, but can be easier to understand
-
A check function should always be a simple implementation – something the rest of the code builds on
-
-
Code must always be well structured!
Lecture 4
-
$P = (V, D, C)$
-
$V$ variables
-
$D$ domains
-
$C$ constraints
-
-
Constraints are defined as $C_1 = \langle\mathrm{scope}(C_1), \mathrm{rel}(C_1)\rangle$
-
Scope is the variables it discusses – the set of variables over which the constraint is defined
-
relation is an abstract relation – may be in extension (listed as either supports or conflicts) or intension (roughly as a function or boolean expression)
-
Also have arity, the cardinality of the scope
-
universal constraints allow any member of the relevant Cartesian product, though induced constraints may tighten them into something else – equivalent to having no constraint at all (usually considered in the binary case)
-
-
VVP – Variable Value Pair
-
make sure to implement check function, and to do so in a very independent manner
-
keep track of the number of constraint checks, incrementing the counter on every check
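A hedged sketch of such a check function with a global constraint-check counter; the names and the constraint representation (scope plus a set of allowed tuples) are assumptions, and the course's actual interface may differ:

```python
# Global counter of constraint checks, incremented on every call to check().
num_checks = 0

def check(constraint, vvp1, vvp2):
    """Return True iff the two variable-value pairs satisfy the constraint.

    constraint: ((var_i, var_j), relation) with relation a set of allowed tuples.
    vvp1, vvp2: (variable, value) pairs covering the constraint's scope.
    """
    global num_checks
    num_checks += 1                      # count every check, success or failure
    (vi, vj), relation = constraint
    values = {vvp1[0]: vvp1[1], vvp2[0]: vvp2[1]}
    return (values[vi], values[vj]) in relation
```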
-
will be writing abscom to parse
-
Graph representation
-
Macro representation
-
Variables are represented as nodes (vertices)
-
domains as node labels
-
constraints as arcs (edges) between nodes
-
hypergraphs allow for special constraint nodes, or use a bubble to connect them instead – the hyperedges
-
-
relations in intension are defined by set-builder notation
-
used when not practical/possible to list all tuples
-
define types/templates of common constraints
-
linear constraints
-
all-diff constraints
-
at most
-
TSP-constraints
-
cycle-constraints
-
-
constraints implemented
-
in extension
-
set of tuples (list/table)
-
binary matrix (bit-matrix)
-
multi-valued decision diagrams
-
-
in intension
-
constrained decision diagrams
-
predicate functions
-
-
Lecture 5
-
examples of modeling
-
can have temporal csp
-
graph or map coloring – requires 4 colors for planar maps
-
resource allocation – airline trip planning, list coloring
-
product configuration – allowed components and what they require/forbid
-
logic puzzles
-
-
Constraint types
-
algebraic constraints
-
constraints of bounded difference
-
coloring
-
mutual exclusion
-
difference constraints
-
arbitrary constraints, must be made explicit
-
-
Databases
-
Join in DB is a CSP
-
View materialization is a CSP
-
-
Interactive systems
-
data-flow
-
spreadsheets
-
graphical layout systems and animation
-
graphical user interfaces
-
-
Molecular biology
-
threading and similar
-
Formal characterization
-
macrostructures
-
networks/constraint graphs, $G = (V, E)$, may be hypergraphs
-
-
Micro-structure – includes every possible Variable Value Pair
-
VVPs are linked iff they are a support, i.e., consistent
-
cliques of size $n$ are possible solutions
-
however, finding cliques is $NP$ complete
-
-
Co-microstructure – includes every possible VVP
-
VVPs are linked iff they are a no-good
-
cliques of size $n$ prevent a given solution
-
-
Complexity
-
decision problem
-
NP-complete by reduction from 3SAT
-
can be verified in polynomial time, but solution is not generated in polynomial time
-
Proof of CSP into SAT
-
CSP has Variables, Domains, Constraints
-
Every VVP is a Boolean variable in SAT
-
Requires, for each variable, a clause that is the disjunction of its VVPs (the variable takes at least one value), added to the conjunction of clauses, along with a clause $(\neg a \vee \neg b)$ for each pair of different VVPs of the same variable (at most one value)
-
Constraints are then mapped to SAT
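As a small worked example of this direct encoding (the variable $X$ and domain $\{1, 2, 3\}$ are illustrative): introduce Boolean variables $x_1, x_2, x_3$ for the VVPs of $X$; add the clause $(x_1 \vee x_2 \vee x_3)$ (at least one value) and the clauses $(\neg x_1 \vee \neg x_2)$, $(\neg x_1 \vee \neg x_3)$, $(\neg x_2 \vee \neg x_3)$ (at most one value); a binary no-good such as $(X = 1, Y = 2)$ then becomes the clause $(\neg x_1 \vee \neg y_2)$.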
Lecture 6
Solving a CSP
-
by search
-
constructive, systematic, exhaustive – involves enumerating all possible VVPs
-
iterative repair, local search – pick an assignment and change unsatisfied constraints until all constraints are satisfied
-
Before start
-
consider the model/formulation – controls the size of the search space (defining a model is an art)
-
pre-process
-
i.e. constraint filtering/propagation, consistency checking – reduce the size of search space
-
-
-
-
Constraint Checking
-
arc consistency
-
removes individual values from the domains of either variable
-
Revise Function (B, C):
-
$\forall x \in D_B$
-
if support($x$, $D_C$)
-
continue to the next $x$
-
-
else
-
$D_B \gets D_B \setminus \{x\}$
-
-
-
-
Support function: for every element of the other domain, if it supports $x$, return true; if none exists, return false
-
will not always find a solution, but will reduce problem size
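A minimal Python sketch of the Revise and support functions described above, assuming binary table constraints stored as sets of allowed (B-value, C-value) pairs; the names are illustrative:

```python
def supported(x, other_domain, relation):
    """True iff value x of the first variable has at least one support
    in the other variable's domain under the given relation (set of pairs)."""
    return any((x, y) in relation for y in other_domain)

def revise(domains, relation, b, c):
    """Remove from D_B every value with no support in D_C.
    Returns True iff D_B was changed."""
    removed = {x for x in domains[b] if not supported(x, domains[c], relation)}
    domains[b] -= removed
    return bool(removed)
```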
-
-
k-consistency
-
consistency on $k\leq n$ variables
-
-
-
Systematic search
-
for every value of $V_i$, check for consistency with previous assignments (starting from the bottom), cutting off portions of the tree if false (using back-checking to prune the search tree)
-
back-checking expands a partial solution iff it is thus-far consistent
-
chronological backtracking – DFS with only the current path, undoes the most recent variable assignment and tries a new one – can be used for only one solution
-
is complete – will always find a solution if it exists
-
is sound / correct – a solution it gives will always be correct – will not give a false positive
-
-
intelligent backtracking
-
jump back based on the reason for failure – generally the deepest
-
-
Ordering heuristics
-
deciding what to expand first
-
often the most constrained variable and the most promising value first
-
strategies for variable/value ordering could be static/dynamic
-
affects branching factor – best to branch as little as possible
-
generally variable ordering is dynamic, but value is static (lexicographic)
-
-
Keep in mind forward-checking – removing values from future domains that aren't consistent with the current assignment; it can detect domain annihilation (wipe-out) early – and is more efficient
-
Real Full Lookahead is used to further restrict domains and search spaces more than plain forward-checking
-
balance between reasoning and search
-
Lecture 7
Arc Consistency
-
revise should only modify one variable's domain
-
requires support, takes VVP and domain of other
-
checks for VVP in X with every element of Y, returning true if found, false if none found
-
check always increments if there's a constraint
-
revise should check if there's a constraint to begin with
-
-
review reading!
-
VVPs can be represented as tuples, may use angle brackets
-
AC-1:
-
assume CSP has $n$ variables, having domain size $d$, and $e$ constraints
-
$Q$ is a set of directed edges to revise, from x to y, from y to x, size $2e$
-
repeat until revisedp is false
-
set revisedp to false
-
for every edge in $Q$
-
call Revise on it; if a value was removed, set revisedp to true
-
But, if a variable's domain is empty, break and report – this CSP has no solution
-
-
-
This CSP is then arc-consistent
-
-
Make sure to sort list of edges!
-
-
Unary constraints should be done first!
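A rough AC-1 loop assembled from the steps above, reusing the `revise` sketch from the arc-consistency section and assuming constraints are stored per directed arc; this is a sketch, not the course's reference implementation:

```python
def ac1(domains, constraints):
    """constraints: dict mapping directed edge (x, y) -> set of allowed (x_val, y_val) pairs.
    Returns False on domain wipe-out, True once the CSP is arc-consistent."""
    queue = list(constraints.keys())          # both directions of every constraint: 2e arcs
    changed = True
    while changed:                            # repeat full passes until nothing is revised
        changed = False
        for (x, y) in queue:
            if revise(domains, constraints[(x, y)], x, y):
                changed = True
                if not domains[x]:            # wipe-out: this CSP has no solution
                    return False
    return True
```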
Lecture 8
Property | Algorithm | Complexity |
---|---|---|
AC | AC-1 | $O(na^3e) = O(n^3a^3)$ |
-
solutions found by
-
detecting components
-
NC
-
normalize constraints
-
NC, AC, other forms of AC
-
Search
-
Verification
-
Solution, Node Visits, Constraint Checks, CPU Time
-
-
when implementing AC1, terminate if domain wipe-out occurs
-
worst case complexity is seldom reached
-
remember, AC may discover a solution to a CSP, or may find an inconsistent problem (by domain annihilation)
-
AC3
-
for $i \gets 1$ to $n$, do $NC(i)$
-
$Q \gets \{(i, j) | (i, j) \in arcs, i \neq j\}$
-
while $Q$ is non-empty:
-
select and delete any arc $(k, m) \in Q$
-
if $\textsc{Revise}(k, m)$ then $Q \gets Q \cup \{(i, k) | (i, k) \in arcs(G), i \neq k, i \neq m\}$
-
-
Check for domain wipe-out!
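A corresponding AC-3 sketch with the queue of arcs, again reusing the `revise` sketch above; how constraints and neighborhoods are stored is an assumption:

```python
from collections import deque

def ac3(domains, constraints, neighbors):
    """constraints: (x, y) -> set of allowed (x_val, y_val) pairs;
    neighbors: x -> set of variables adjacent to x in the constraint graph."""
    queue = deque(constraints.keys())                 # all arcs (i, j), i != j
    while queue:
        (k, m) = queue.popleft()
        if revise(domains, constraints[(k, m)], k, m):
            if not domains[k]:                        # domain wipe-out
                return False
            # re-enqueue arcs (i, k) for every neighbor i of k other than m
            for i in neighbors[k]:
                if i != m:
                    queue.append((i, k))
    return True
```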
-
Lecture 9
-
Remember to deal with Node Consistency – and don't include it in the check count
-
see lecture 8 notes
-
AC-3 is not only more powerful, it is also more efficient
-
the number of arcs that can enter the queue is at most $2e + a(2e - n)$
-
AC3 is $O(a^2 (2e + a(2e - n))) = O(a^3e) = O(a^3n^2)$
-
AC4 is even more efficient, but requires special bookkeeping, is $O(a^2n^2)$
-
A variant of AC-3 called AC-2001 exists, but requires cubic space; yet another variant, AC-3.1$^{m}$, requires different bookkeeping
-
AC4
-
generate m, s, and counter
-
m and s have as many rows as there are VVPs in the problem, m is set to true, s to empty list
-
m is for alive
-
s is for a list of all VVPs that support a given VVP
-
given a VVP, the counter gives the number of supports provided to it by a given variable
-
-
prepare data structures
-
for every constraint and all of its tuples:
-
when allowed, update s
-
increment counter
-
-
-
-
Then process all constraints
-
if counter is 0, remove from domain, and update $m$
-
since $x$ is removed, for every VVP that it supported, decrement that VVP's counter
-
-
-
AC3 tends to be a bit better than AC4
Lecture 10
-
soundness – can trust
-
complete – will always find a solution if exists
-
more efficient AC variants are useful for use during search
-
but when enforcing AC during search, you must be able to keep track of what was removed
-
remember properties are not the same as the algorithms! there may be many algorithms to implement each property!
-
most consistency methods are local initially
-
node/arc remove inconsistent values
-
path removes inconsistent tuples
-
used when you have positive tables of supports
-
-
they get closer to the solution
-
reduce the size of the problem
-
are cheap (polynomial time)
-
choice of consistency properties helps to reduce time to solution
-
Intelligent Backtracking Algorithms
-
Read Prosser, CI 93. Hybrid Algorithms for the Constraint Satisfaction Problem
-
Consider Reading 5/6 of Dechter
-
Issues
-
Variable/Value ordering
-
Variable Instantiation
-
Current Path
-
Current Variable
-
Past Variables
-
Future Variables
-
Shallow/Deep Levels/Nodes
-
Search space/Search Tree
-
Back-checking
-
Backtracking
-
-
As you go, un-assigned variables are the future variables/future sub-problem
-
those that have been instantiated are the past variables
-
the current path is current instantiations
-
always check a new instantiation against past variables (back-checking)
-
algorithms
-
Vanilla: BT
-
Improved Backward: BJ (back-jumping), CBJ (conflict-directed BJ)
-
Improving Forward: BM (back-marking), FC (forward-checking)
-
-
D-Way branching is conceptually easier to understand than 2-way
-
algorithms from Prosser are iterative back-tracking and pluggable/modularized
-
Kondrak and van Beek looked at the efficiency of algorithms
Lecture 11
-
Variables $V_i$, $i \in [1, n]$
-
Domains $D_i = \{v_{i1}, v_{i2}, \ldots, v_{iM_i}\}$
-
Constraint between $V_i$ and $V_j$ : $C_{i,j}$
-
Constraint graph $G$
-
Arcs of $G$: $\mathrm{Arc}(G)$
-
Instantiation order is static or dynamic
-
lang primitives are from lisp
-
Data structures
-
$v$ is a 1 by $n$ array to store assignments, $v_0$ is the root, backtracking to this indicates insolvability, $v_i$ is the assignment to the $i$th variable
-
$domain$ is the original domain
-
$current-domain$ is the current domain, must be updated on back-tracking
-
check is as before
-
-
Generic form
-
takes n, status
-
set consistent to true
-
status to unknown
-
$i \gets 1$
-
while the status is unknown
-
if consistent
-
$i \gets \mathrm{label}(i, consistent)$
-
else $i \gets \mathrm{unlabel}(i, consistent)$
-
-
if $i > n$
-
set status to solution
-
else if $i = 0$, set status to impossible
-
-
-
-
label and unlabel functions are provided by the backtrack algorithm
-
label is forward move, unlabel is backward move
-
For BT
-
label
-
when $v_i$ is assigned a value from the $current-domain_i$, we perform back-checking against past variables
-
if back-checking succeeds
-
return $i + 1$
-
else, remove assigned value from current-domain[i], assign next value, etc
-
if no value exists, set consistent to false
-
function
-
takes i, consistent
-
set consistent to false
-
for v[i] in each element of the current domain while not consistent
-
set consistent to true
-
for h from 1 to i - 1 and while consistent
-
set consistent to check(i, h)
-
-
if not consistent
-
then remove v[i] from current-domain[i]
-
-
-
if consistent, return i+1, otherwise i
-
-
-
unlabel
-
current level is set to $i - 1$
-
for future variables $j$, reset the current-domain
-
if the domain of $h$ is not empty, set consistent to true
-
Function
-
takes i, consistent
-
set h to i - 1
-
set current-domain[i] to domain[i]
-
remove v[h] from current-domain[h]
-
set consistent to current-domain[h] not-equal empty-set
-
return h
-
-
-
label does the actual instantiation
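A compact Python rendering of the generic loop with BT's label/unlabel, following the Prosser-style data structures above (1-indexed variables, current-domain copies, lexicographic value ordering); this is a sketch under those assumptions, not the exact course code:

```python
def bcssp(n, domains, check):
    """Generic backtracking shell (Prosser-style).
    domains: dict {1..n: set of values}; check(i, h, v): True iff v[i] and v[h] are compatible."""
    v = {}
    current = {i: set(domains[i]) for i in domains}
    consistent, i, status = True, 1, "unknown"
    while status == "unknown":
        if consistent:
            i, consistent = bt_label(i, v, current, check)
        else:
            i, consistent = bt_unlabel(i, v, current, domains)
        if i > n:
            status = "solution"
        elif i == 0:
            status = "impossible"
    return v if status == "solution" else None

def bt_label(i, v, current, check):
    """Forward move: try values of variable i against all past variables (back-checking)."""
    for value in sorted(current[i]):            # lexicographic value ordering
        v[i] = value
        if all(check(i, h, v) for h in range(1, i)):
            return i + 1, True
        current[i].discard(value)               # value fails back-checking; drop it
    return i, False                             # no value left: move backward

def bt_unlabel(i, v, current, domains):
    """Backward move: undo variable i and reject the value currently tried at i-1."""
    h = i - 1
    if h >= 1:
        current[i] = set(domains[i])            # restore the full domain of i
        current[h].discard(v[h])                # the value at h led to failure
        return h, len(current[h]) > 0
    return 0, False                             # backtracked past the root: no solution
```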
-
Lecture 12: Phase Transitions
-
Cheeseman 1991
-
Order parameter
-
for small values of the order parameter, the probability that a solution exists for random problems is almost 1
-
there's a critical value whereby, after it, probability is almost 0
-
around the critical value, the probability is around 0.5
-
the cost of solving drops sharply to the right of the critical value, but is high approaching it
-
around probability 0.5, the cost of solving is high no matter the algorithm; this is referred to as a phase transition
-
conjecture regarding the characterization of NP complete problems
-
applies to detecting/implementing arc-consistency
-
random graphs are almost always easy to color – conjecture from a famous paper
-
graph coloring is a reduction operator
-
friends are vertices with the same neighborhood
-
this is the degree
-
-
For CSPs, it's either density or tightness
-
this currently affects the way experiments are conducted – try to deal with the hardest problems
-
but be careful not to focus exclusively on the region around the critical value
-
Run on random CSPs – given statistical analysis
-
Vary params $\langle n, a, t, p \rangle$
-
$n$ – number of variables
-
$a$ – domain size
-
$t$ – tightness $\frac{|\text{forbidden tuples}|}{|\text{all tuples}|}$
-
$p$ – proportion of constraints $p = \frac{e}{e_{max} - e_{min}}$, also $\frac{2e}{n(n-1)}$
-
$t$ and $p$ will be between 0 and 1
-
Can have issue with uniformity, difficulty (phase transition), solvability
-
empirical studies can be done simply by varying one of the parameters
-
do it (Model B):
-
generate $n$ nodes
-
generate a list of $\frac{n(n-1)}{2}$ tuples of all combinations of 2 nodes.
-
choose e elements from the above list as constraints to go between the $n$ nodes
-
if the graph is not connected, throw it away and go back to the constraint-selection step; otherwise proceed
-
generate a list of $a^2$ tuples of all combinations of 2 values
-
For each constraint, randomly choose a number of tuples from the list to guarantee tightness $t$ for the constraint.
-
-
Model B instances may be flawed, but it's rare
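A rough Model B generator following the steps above; the connectivity test, the use of forbidden tuples, and the rounding of $t \cdot a^2$ are assumptions about details the notes leave open:

```python
import random
from itertools import combinations

def model_b(n, a, t, e):
    """Random binary CSP: n variables, domain size a, e constraints, tightness t."""
    variables = list(range(n))
    domains = {x: set(range(a)) for x in variables}
    all_pairs = list(combinations(variables, 2))          # n(n-1)/2 candidate scopes
    while True:
        scopes = random.sample(all_pairs, e)               # choose e constraint scopes
        if is_connected(variables, scopes):                # otherwise throw away and retry
            break
    all_tuples = [(x, y) for x in range(a) for y in range(a)]   # a^2 value pairs
    num_forbidden = round(t * len(all_tuples))             # tightness t = forbidden / all
    constraints = {}
    for scope in scopes:
        forbidden = set(random.sample(all_tuples, num_forbidden))
        constraints[scope] = set(all_tuples) - forbidden   # store the allowed tuples
    return domains, constraints

def is_connected(nodes, edges):
    """Simple DFS connectivity check on the constraint graph."""
    adj = {x: set() for x in nodes}
    for x, y in edges:
        adj[x].add(y); adj[y].add(x)
    seen, stack = set(), [nodes[0]]
    while stack:
        x = stack.pop()
        if x not in seen:
            seen.add(x)
            stack.extend(adj[x] - seen)
    return len(seen) == len(nodes)
```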
-
Lecture 13: Back-tracking: Continued
-
keep track of path – an array of the instantiations thus far and to be created
-
remember what unlabel does – it actually performs the backtracking
-
many different ways to order variables
-
when doing value-ordering, use lexicographic ordering!
-
eventually, ties in variable ordering heuristics must be broken with lexicographic ordering
Back-Jumping
-
uses $max-check$, an array; when check(i, h) succeeds, $max-check[i] \gets \max(max-check[i], h)$
-
if the current domain of $h$ is empty, chronological back-tracking is performed from h
-
you go back to $max-check[i]$ to fix stuff when you have an empty variable, unless you've already back-jumped in the current path
-
BJ just modifies BT-label and BT-unlabel
-
BJ-label just updates $max-check$
-
bj-unlabel
-
backtrack to h = max-check[i]
-
resets max-check[j] = 0 for j in $h+1$ to $i$
-
-
Conflict-directed BJ
-
requires a new data structure, the conflict-set array
-
tracking at what levels conflicts occur
-
-
knows how to jump-back again, to the deepest level of conflict
-
can jump-back more than once!
-
useless if good variable ordering
-
conflict sets are initialized to $\{0\}$
-
At any point, conf-set is a subset of past variables that are in conflict with $i$
-
when check(i, h) fails add h to conf-set[i]
-
when domain is empty
-
jumps to deepest past variable in conf-set
-
update $conf-set[h] = conf-set[h] \cup conf-set[i] \setminus \{h\}$
-
-
-
To calc all solutions – don't use CBJ
Lecture 14: Back-tracking: Continued
CBJ Continued
-
max-check – deepest level checked against – if instantiated, will be $i - 1$
-
conflict set is used for holding what's conflicted
-
Finding all solutions with CBJ
-
use the conflict sets
-
or use CBF from Kondrak
-
keep track of number of solutions
-
-
make sure to check generated solutions with the solution checker!!
-
modify algo to after solution, back track to every level at least once!
-
in actuality, must be back tracked to exactly one
-
-
look at Chris Thiel's response
-
when a solution is found, force the last variable to conflict with everything before it. In turn, this forces some chronological backtracking as the conf-sets are propagated backward
-
-
Kondrak's response
-
cbf flag, vector
-
$\forall i, cbf[i] \gets 0$
-
when you find a solution, $\forall i, cbf[i] \gets 1$
-
In unlabel
-
if $cbf[i] = 1$
-
$h \gets i - 1$, $cbf[i] = 0$
-
else $h \gets max(conf-set[i])$
-
-
-
-
Lecture 14: Forward Steps
Backmarking
-
tries to reduce the amount of consistency checking
-
$v[i]$ to reassign to $k$
-
$v[i] \gets k$ last checked against $v[h] \gets g$
-
$v[h]$ has not been modified
-
-
not super useful – as consistency enforcement is more useful
-
this helps to reduce back checking
-
data structures, max back-check level, minimum backup level
-
$mcl$ max check level, $n \times m$ array, $mcl[i, k]$ stores deepest variable that $v[i] \gets k$ was checked positively against, finer-grained mcl
-
$mbl$ min. backup level, an $n$ array, gives the shallowest past variable whose value has changed since $v[i]$ was the current variable
-
-
BM does not allow dynamic variable ordering
-
suppose the deepest variable against which $v[i] \gets k$ was checked is $v[j]$
-
values of variables in the past of $v[j]$ ($h<j$) have not changed
-
so we need to check $v[i] \gets k$ against the values of the variables between $v[j]$ and $v[i]$
-
we do not need to check against the values of the variables in the past of $v[j]$
-
when $mcl[i,k] < mbl[i]$, don't check – the assignment is known to fail (type A savings)
-
when $mcl[i,k] \geq mbl[i]$, the checks against variables shallower than $mbl[i]$ are known to succeed and need not be repeated – type B savings
Forward checking
-
Looks ahead from the current variable, considering all future variables, and clears from their domains the values that aren't consistent with the current partial solution
-
FC means more work at each instantiation, but causes fewer nodes to be expanded
-
When it moves forward, the values of the current domains of future variables are all compatible with past assignments, saves backchecking
-
FC may cause wipe-out, but discovers conflicts early on and will backtrack
-
goal is to find failures early – avoids expanding fruitless subtrees
-
is not truly forcing arc consistency, that's RFL
-
requires a DS to keep track of filtering, can be done in various ways
Lecture 15: Forward Steps, Continued
Forward Checking
-
look-ahead – revises future variables
-
makes more work at instantiation, but expands fewer nodes
-
as you move forward, values in future variables are consistent with past variables
-
can cause wipe-out and conflict discovery early on, backtracking chronologically
-
requires current domain, stack of variables removed by previous instantiations
-
future-fc is the subset of future variables that v[i] checks against
-
past-fc is past variables that checked against v[i]
-
requires functions
-
check-forward – called when instantiating v[i]
-
revise(j, i)
-
returns false if current-domain[j] is empty, true otherwise
-
values removed from current-domain[j] are pushed as a set onto reductions[j]
-
-
update-current-domain
-
$current-domain[i] \gets domain[i] \setminus reductions[i]$
-
-
fc-label
-
attempt to instantiate current-variable
-
filters domains of current
-
on wipe-out of future variable, uninstantiate and undo domain filtering
-
-
-
Suffers from thrashing, as it is based on BT
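A hedged sketch of forward checking's check-forward step and the corresponding undo, with the reductions stack used as described above; the 4-argument check signature and the dict-based bookkeeping are assumptions, and Prosser's code differs in detail:

```python
def fc_check_forward(i, v, current, constraints, future, reductions, check):
    """Filter the current domains of future variables against v[i].
    constraints: set/dict of scopes (i, j); check(i, j, val_i, val_j) tests compatibility.
    Returns False if some future domain is wiped out."""
    for j in future:
        if (i, j) not in constraints:
            continue
        removed = {val for val in current[j] if not check(i, j, v[i], val)}
        if removed:
            current[j] -= removed
            reductions[j].append((i, removed))    # remember what v[i] removed from D_j
        if not current[j]:
            return False                           # wipe-out: v[i] must be undone
    return True

def fc_undo(i, current, reductions):
    """Restore every value that the instantiation of v[i] removed."""
    for j, stack in reductions.items():
        while stack and stack[-1][0] == i:
            _, removed = stack.pop()
            current[j] |= removed
```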
Variable Ordering
-
dom, least domain, smallest domain first
-
degree of variable (number of neighbors)
-
dom/deg is most common
-
Degree is difficult, as it changes very rapidly
-
can/should be implemented statically
-
Consistency Check Summary
-
CBJ
-
uses backtracking, no additional structures
-
-
backmarking uses mcl and mbl
-
two types of consistency-checking savings
-
-
forward checking
-
works more at each instantiation, but expands fewer subtrees
-
uses reductions, future-fc, past-fc
-
-
FC-CBJ is the best. Period
-
Other Backtracking algos
-
Graph-based back-jumping – Dechter
-
Pseudo-trees [Freuder, 85]
-
-
Other lookahead – DAC, MAC – DAC seems to be pretty good as a compromise between MAC and FC
-
More empirical evals
-
now by generating random problems
-
-
theoretical comparisons
Implementation notes
-
Enforce NC
-
Normalize constraints
-
check for empty relations
-
Interrupt as soon as you detect domain wipe-out
-
Remember dynamic variable ordering includes the domino effect
-
a/k/a pick singleton domains first
-
Lecture 15: Intelligent Backtracking – Theoretical Eval
-
Kondrak and Van Beek, IJCAI 95
-
Dechter 5, 6
-
most backtracking algorithms are evaluated empirically
-
performance depends heavily on problem instance
-
is therefore average case analysis
-
representativeness of the test problems
-
CSP is NP-complete – it is always possible to construct examples where BJ/CBJ are worse than BT
-
comparison criteria are different
-
the paper gives a theoretical approach
-
states by number of nodes visited
-
number of constraint checks
-
-
BT, CBJ perform the same amount of consistency checks
-
BM reduces the number of consistency checks
-
BM does not affect the number of nodes visited
-
FC-CBJ may visit more nodes than CBJ
-
proves correctness of BJ/CBJ
-
determines partial order between algos
-
proves BJ, CBJ are correct
-
proves FC never visits more nodes than BJ
-
Improves BMJ and BM-CBJ to perform even fewer consistency checks
-
provides framework for characterizing future BT algorithms
-
BT extends partial solutions – a partial solution is consistent with a set of uninstantiated variables if it can be consistently extended to those variables – that is, a new partial solution extended by one variable can be created such that it is still consistent
-
Dead end – when all values of current variables are rejected
-
lower level are closer to the root
-
higher levels are closer to the fringe
-
2 BT algos are equiv if on every csp they generate the same tree and perform the same consistency checks
-
Lemma 1: If BJ backtracks to variable $x_h$ from a dead end at $x_i$, then $(a_i, \ldots, a_h)$ is not a valid partial solution
Lecture 16: Theoretical Eval, Continued
-
sufficient conditions
-
BT visits a node if its parent is consistent
-
BJ visits a node if its parent is consistent with all variables
-
CBJ visits a node if its parent is consistent with all sets of variables
-
FC visits a node if it is consistent and its parent is consistent with all variables
-
-
Necessary Conditions
-
BT visits a node only if its parent is consistent
-
BJ visits a node only if its parent is consistent with all variables
-
CBJ visits a node only if its parent is consistent with all sets of variables
-
FC visits a node only if it is consistent and its parent is consistent with all variables
-
-
BT visits all nodes FC visits
-
BJ visits all nodes CBJ visits
-
CBJ and FC are not comparable.
-
FC will visit no more nodes than BJ
-
BT and BM are equivalent, as are BJ and BMJ
-
proof of correctness – termination, soundness, completeness
-
extends for seeking all solutions
-
BM hybrids should be fixed, by changing MBL to an $n \times m$ array, creating BM2, BMJ2 and BM-CBJ2
Lecture 17: Search orders in CSPs
-
assumptions
-
finite domains
-
binary constraints
-
-
Chapter 6, Tsang is req'd reading
-
Dual viewpoint heuristics for binary constraint satisfaction problems is recommended reading
-
In BT, some orders have fewer backtracks
-
in lookahead, some orderings detect failure earlier
-
variable ordering is often the most constrained variable
-
value ordering is often the most promising value first
-
using dynamic ordering in lookahead algos often removes the need for intelligent backtracking
-
Régin and Bessière show that CBJ and similar aren't needed – but not everyone is so sure
-
Select values that don't cause constraint violation – do the most promising value first, avoiding constraint violation
-
Discover constraint violation quickly – select variables that don't delay discovery of constraint violation, go with the most constrained variable first
-
Notation: the assignment $V_i \gets x$ partitions the domain of a future variable $V_j$ into a set of values consistent with the assignment, of size $\mathrm{Left}(V_j | V_i \gets x)$, and a set of inconsistent values, of size $\mathrm{Lost}(V_j | V_i \gets x)$
-
Value selection of Lost and Left
-
Value Ordering to quickly find a solution
-
min-conflict – orders values according to the conflicts in which they are involved with the future variables – most popular (Minton), $Cost(V_i \gets x) = \sum_{V_j \neq i} \mathrm{Lost}(V_j | V_i \gets x)$
-
Cruciality – (Keng and Yu, 1989) $\mathrm{Cruciality}(V_i \gets x) = \sum_{j \neq i} \mathrm{Lost}(V_j | V_i \gets x) / |D_{V_j}|$
-
promise – most powerful (Geelen, 1992) $Promise(V_i \gets x) = \prod_{V_j \neq i} Left(V_j | V_i \gets x)$, the number of assignments that can be done such that no constraint on $V_i$ is broken.
-
Others are defined
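A small sketch of computing Lost/Left and the three value-ordering scores listed above, for a binary CSP; the 4-argument check function and the restriction of $V_j$ to future neighbors of $V_i$ are assumptions:

```python
from math import prod

def left_lost(vi, x, vj, current, check):
    """Split the current domain of Vj into values consistent / inconsistent with Vi <- x."""
    left = sum(1 for y in current[vj] if check(vi, vj, x, y))
    return left, len(current[vj]) - left

def value_scores(vi, x, future_neighbors, current, check):
    lefts, losts, ratios = [], [], []
    for vj in future_neighbors:
        left, lost = left_lost(vi, x, vj, current, check)
        lefts.append(left)
        losts.append(lost)
        ratios.append(lost / len(current[vj]))
    return {
        "min_conflict": sum(losts),      # Minton: prefer the value with the smallest sum
        "cruciality":  sum(ratios),      # Keng & Yu: prefer the smallest sum of ratios
        "promise":     prod(lefts),      # Geelen: prefer the largest product
    }
```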
-
-
Promise
-
recognizes domain wipe-out
-
product discriminates better than sum
-
semantics of promise gives an upper bound on number of solutions
-
-
FFP name is contested
-
goal is to recognize dead-ends asap to save search effort
-
choose ordering that reduces branching factor
-
Early ones
-
LD
-
maximal degree
-
minimal ratio of domain size over degree
-
promise for variables
-
Br\'elaz heuristic
-
weighted degree (Boussemart et al, ECAI 2004)
-
graph-based
-
minimal width
-
induced width ordering $w^*$
-
maximal cardinality ordering
-
minimal bandwidth ordering
-
-
remember to exploit the domino effect, especially under dynamic var ordering
-
always state how you are breaking ties – generally lexicographically
-
is often cheap and effective
-
in most cases, suitable for both static and dynamic ordering
-
-
Promise for variables – Sum of the promises of the values, minimize the number of assignments that can be done such that no constraint on $V_i$ is broken. Has upper bound on the number of solutions
-
Lecture 18: Ordering Heuristics
-
Least Domain
-
Max Domain
-
Domain/Degree
-
Promise is based on what has the most promise given the current state
-
Br\'elaz heuristic – originally used for graph coloring
-
Remember, variable ordering heuristics are cheap and effective
-
generally suitable for static and dynamic ordering
-
graph-based heuristics aren't often used anymore
-
Br\'elaz
-
originally designed for coloring
-
assigns most constrained nodes first
-
arrange vars in decreasing order of degrees
-
color a vertex with maximal degree with color 1
-
choose a vertex with maximal saturation degree (number of different colors to which it is adjacent)
-
if there's a tie, choose any vertex of maximal degree among the uncolored
-
-
color chosen vertex, with lowest numbered color
-
if all vertices are colored, stop, otherwise, choose another vertex and continue
-
-
wdeg
-
All constraints are assigned a weight, initially 1
-
every time a constraint is broken, during propagation with look-ahead (constraint causing domain wipe-out), weight is increased
-
the weight of an un-assigned variable is defined as the sum of the weights of the constraints that apply to the variable
-
the variable with the largest weight is chosen for instantiation
-
weights are not reinitialized upon backtracking
-
refined with dom/wdeg
-
inspired by breakout heuristic in local search
-
However, there may be cases where dom/deg is more efficient than dom/wdeg
-
A decay function may help improve this
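A minimal sketch of the wdeg bookkeeping and dom/wdeg selection described above; exactly when the weight is bumped and how scopes are stored are assumptions (and the usual restriction to constraints with other unassigned variables is simplified away here):

```python
def bump_weight(weights, constraint):
    """Call when propagation through `constraint` causes a domain wipe-out."""
    weights[constraint] += 1          # weights start at 1 and are never reset

def pick_dom_wdeg(unassigned, current, weights, scope_of):
    """Choose the unassigned variable minimizing |current domain| / wdeg."""
    def wdeg(x):
        return sum(w for c, w in weights.items() if x in scope_of[c]) or 1
    return min(unassigned, key=lambda x: len(current[x]) / wdeg(x))
```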
-
-
Graph heuristics
-
ordering by elimination or instantiation
-
minimal width
-
given an ordering, the width of a node is the number of its parents
-
then the ordering's width is the largest width of any of the nodes
-
the sum of the degrees is always equal to twice the number of edges
-
take the minimum width over all orderings – this is the width of the graph
-
this can reduce chance of backtracking, can help to find minimum width ordering
-
Algo:
-
remove all nodes not connected to any others
-
$k \gets 0$
-
while there are nodes left in the graph
-
$k \gets k + 1$
-
while there are nodes not connected to more than k others
-
remove such nodes from graph, along with any edges connected to them
-
-
-
return $k$
-
minimal width ordering of the nodes is obtained by taking the nodes in the reverse order that they were removed
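A direct Python rendering of this procedure (adjacency stored as sets; the reverse of the removal order gives a minimal-width ordering); a sketch under those assumptions:

```python
def min_width_ordering(adjacency):
    """adjacency: dict node -> set of neighbors. Returns (graph width, min-width ordering)."""
    adj = {x: set(ns) for x, ns in adjacency.items()}   # work on a copy
    removed = [x for x in adj if not adj[x]]            # drop isolated nodes first
    for x in removed:
        del adj[x]
    k = 0
    while adj:
        k += 1
        while True:
            # nodes currently connected to at most k others
            low = [x for x in adj if len(adj[x]) <= k]
            if not low:
                break
            for x in low:
                for y in adj.pop(x):                    # remove x and its edges
                    if y in adj:
                        adj[y].discard(x)
                removed.append(x)
    return k, list(reversed(removed))                   # width of the graph, ordering
```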
-
-
-
Induced Width Ordering
-
What happens when you reconnect neighbors of what you remove to each other
-
-
Guest Speaker: Philip Walton, Workday
-
From Workday, SaaS company for HR management
-
work has primarily been in Data Analysis and optimization
-
go from techniques and technologies to money
-
Could be a Linear Program
-
Measures, Items (index sets)
-
indices i, m
-
variables Quantity, 0 to N
-
model sets, limit, value, weight
-
maximize value and quantity
-
but must be within weight
-
-
big thing is assigning resources in a data center
-
truck scheduling
-
many truck loads that a company has to move
-
origin and destination specified
-
and time windows at origin and destination
-
-
many drivers, many trucks
-
some time horizon
-
DOT rules
-
operational rules
-
returning to base
-
equipment preventative maintenance
-
-
Taking a run at the whole thing is probably too much
-
run a routing model to pick loads that go together
-
schedule loads – the big thing for SCPs
-
drivers pick the sets of loads they prefer
-
-
-
modeling
-
discrete time forms an integer domain
-
can be clumpy
-
-
use start time of the task as the domain
-
constraints like before and after are relatively straightforward
-
duration of driving/time windows are pretty easy to express
-
small disruptions are normally readily repaired by freeing up parts of the model and letting it re-solve
-
time windows tend to be pretty large
-
isn't always a notion of an optimal schedule
-
side constraints can be pretty easy to represent
-
work rules aren't always easy
-
drivers may have odd drive/sleep patterns
-
side resource use is tolerable
-
-
-
Robustness for many things has to be dealt with, but how that's done is very different
-
And remember how important interface is – they're both important, and very, very different
-
And even more important is understanding what's going on within the problem domain
-
good large software is built on top of amazing small software
Lecture 19: Variable Orderings, continued
Induced Width Ordering, $\omega^{*}$
-
requires designing induced constraints – can cause much, much less backtracking
-
these are called fill-in edges
-
this procedure moralizes the graph
-
computing min induced width is NP hard
-
MIN FILL heuristic is used
-
when removing a node, add an edge between every two nodes connected to the node but not already connected to each other
-
remove the node whose removal adds the smallest number of fill-in edges
-
Remember to remove the node with the smallest fill-in first!
-
-
A tree is a graph with induced width 1
-
An ordering has width, as does a graph. Graph width is not dependent on ordering, nor is induced width, $\omega^{*}$ is called induced width or tree width.
-
triangulated or chordal graphs are graphs in which every cycle of length 4 or more has a chord – an edge connecting two non-consecutive vertices of the cycle
-
Makes it easy to find largest clique, can actually be done in polynomial time
-
is a type of perfect graphs
-
interval graphs are a type of triangulated graph
-
finding minimum number of triangulating edges is also np-hard
-
Maximal Cardinality ordering
-
Choose an arbitrary node
-
among remaining nodes
-
choose the one connected to the maximum number of already chosen nodes
-
break ties arbitrarily
-
-
Repeat until done
-
Obtained ordering is instantiation ordering
-
reverse ordering yields a PEO
-
this is the perfect elimination ordering when reversed
-
If the graph does not need to be moralized, the graph is triangular
Lecture 20: Variable Orderings, continued
Max Cardinality, Continued
-
width of an induced ordering is same as graph
-
MWO can help to guarantee backtrack-free search
-
Lots of algos came from RDBMs
-
Induced width is NP Hard!!!
-
Simplicial graph is such that looking at an elimination ordering, all parents of a node form a clique
-
moralized is triangulated
Minimal Bandwidth Ordering
-
bandwidth of a node – the largest distance (in the ordering) between the node and its neighbors
-
bandwidth of an ordering – the largest bandwidth of the nodes in the ordering
-
bandwidth of a graph – the smallest bandwidth across all possible orderings
-
heuristic localizes and confines backtracking
-
smaller bandwidth, sooner ability to backtrack
-
finding a minimum bandwidth ordering is NP-hard
-
but finding an ordering of a given bandwidth is polynomial
-
Tsang 6.2.2, only one paper
Ordering Heuristics
-
When
-
static var, val ordering
-
dynamic var, static val
-
dynamic var, dynamic val – dynamic vvp
-
-
at each given level
-
VVPs pertaining to the same variable across a given level, static-static
-
Vars are different, dyn-static
-
vvps are different at each and every level
-
-
Computing
-
sort vars at preprocessing for static
-
dynamic selects the var during search, based on the current domain and all uninstantiated neighbors; exploits the domino effect!!
-
Lecture 21: Filtering For Constraints of Difference
-
GAC – Generalized Arc Consistency, for non-binary CSPs
-
Operates on table constraints, by supports or by no-goods
-
$\pi_A(R)$ gives a new relation with only values of A
-
May be a Set, or a Bag
-
-
-
finite CSPs are linear in space, quadratic in time when dealing with all-diff constraints
-
Sometimes, all diff may become many constraints that together are equivalent, but not always
-
GAC4 handles n-ary constraints
-
has good pruning power
-
but is very expensive – $\frac{d!}{(d-p)!}$
-
-
a value is consistent if there exist values for all the other variables in the constraint such that, taken together with it, they simultaneously satisfy the constraint
-
a constraint is consistent if all values of all its variables are consistent
-
csp is arc-consistent if all constraints are consistent
-
csp is diff-arc-consistent iff all all-diff constraints are arc-consistent
-
value graphs are bipartite
-
vertices are $X_C$, $\cup_{x \in X_C}(D_x)$
-
edges $(x_i, a)$ iff $a \in D_{x_i}$
-
-
Space is $\mathcal{O}(nd)$
-
Matching – a subset of edges in $G$ with no vertices in common
-
a maximum matching is the biggest possible matching
-
matching covers a set $X$ – every vertex in $X$ is an endpoint for an edge in matching
-
$M$ that covers $X_C$ is a max matching.
-
If every edge in $GV(C)$ is in a matching that covers $X_C$, $C$ is consistent.
-
CSP is diff-arc-consistent iff, for every all-diff $C \in \mathcal{C}$, for every edge $GV(C)$ belongs to a matching that covers $X_C$ in $GV(C)$
-
Build graph, compute max matching, if size of max matching is smaller than $X_C$, return false, otherwise remove edges from G given the max matching and return true
-
use Maximal Flow in Bipartite graph
-
Or Hopcroft/Karp algo
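A sketch of computing a maximum matching that covers $X_C$ using simple augmenting paths (Kuhn-style); Hopcroft-Karp or max-flow would be faster, and the representation of the value graph through current domains is an assumption:

```python
def max_matching(variables, current):
    """Match each variable in X_C to a distinct value from its current domain.
    Returns a dict var -> value, or None if the matching cannot cover X_C
    (in which case the all-diff constraint is inconsistent)."""
    match_val = {}                       # value -> variable currently matched to it

    def try_assign(x, seen):
        for val in current[x]:
            if val in seen:
                continue
            seen.add(val)
            # take the value if it is free, or if its current owner can be re-matched
            if val not in match_val or try_assign(match_val[val], seen):
                match_val[val] = x
                return True
        return False

    for x in variables:
        if not try_assign(x, set()):
            return None
    return {x: v for v, x in match_val.items()}
```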
-
-
Then we have an all-dif constraint, value graph and a maximum covering
-
remove edges that aren't in a matching covering
-
-
given a Matching M
-
matching edge is an edge in M
-
free edge is an edge not in M
-
matched vertex is incident to a matching edge
-
free vertex is otherwise
-
alternating path (cycle) – a path (cycle) whose edges are alternatively matching and free
-
flip colors on even alternating path – gives another matching of same size
-
flip colors on alternating cycle – another matching
-
-
length of path – number of edges in path
-
vital edge belongs to every maximum matching
-
-
An edge belongs to some but not all maximum matchings iff, for an arbitrary maximum matching $M$, it belongs to either an even alternating cycle or an even alternating path that begins at a free vertex
-
Edges to remove are those in no maximum matching: not in the chosen matching, not on an even alternating path starting at a free vertex, and not on an even alternating cycle.
-
every directed cycle in $G_0$ corresponds to an even alternating cycle of $G$
-
every directed simple path starting at a free vertex corresponds to an even alternating path of $G$ starting at a free vertex
-
make black point one way, red the other
-
allows for all-diff and other global constraints
Lecture 22: CSPs and Relational DB
-
connections
-
constraint databases and constraint logic programming
-
query processing and solving csps
-
-
shared
-
constraint DBs (deductive DB, Datalog) and constraint logic programming share representation
-
relational databases and constraint satisfaction share computations
-
-
Constraint databases use LP – Prolog for DB, Datalog
-
Prolog was a bit of a failed experiment
-
CLP(R,F) is prolog with CSP techniques for either reals or finite domains
-
Relations
-
binary relations – given $D_a$ and $D_b$, a set of any 2-tuples $\langle x, y \rangle$ with $x \in D_a$ and $y \in D_b$; this defines a relation $R_{ab} = \{(x, y)\} \subseteq D_a \times D_b$
-
Function – a binary relation such that there is at most one tuple $\langle x, ? \rangle \in R_{ab}$. $D_a$ is the domain, $D_b$ is the co-domain.
-
$k$-ary relations – are relations on $k$ elements
-
-
representation
-
2-dimensional binary array, though k-dimension works
-
tables
-
linked-lists of pairs
-
hash-maps of sets
-
many other options
-
-
Terms
-
table, relation – constraint
-
arity of relation – arity of constraint
-
attribute – variable
-
value of attribute – value of var
-
domain of attribute – domain of variable
-
tuple in table – tuple in constraint/allowed by constraint/consistent with constraint
-
-
Relational algebra
-
intersection
-
input two relations of same scope
-
output a new, more restrictive relation with the same scope made of tuples in all the input relations simultaneously
-
logical and
-
-
union
-
input 2 relations of the same scope
-
less restrictive relation with same scope of tuples that are in any input relation
-
logical or
-
-
difference – same as set theory
-
selection
-
projection
-
join
-
composition (from CSP)
-
Lecture 23: CSPs and Relational DBs, continued
-
Selection and projection are unary operators
-
Selection
-
Input – a relation $R$ and a predicate on attributes of $R$
-
Output a relation $R^\prime$ with the same scope as $R$ but having only a subset of the tuples in $R$
-
Row selection
-
$\sigma_{p}(R)$
-
-
Projection
-
Input – a relation $R$ and a subset $s$ of the scope
-
Output – a relation $R^\prime$ of scope $s$ with the tuples rewritten such that positions not in $s$ are removed
-
column elimination; set semantics in CSP, bag semantics in DB
-
$\pi_{\{x_1, x_2\}}(R)$
-
-
Join (natural join)
-
Input – $R$ and $R^\prime$
-
Output a relation $R^{\prime\prime}$ whose scope is the union of $R$ and $R^\prime$ and tuples satisfy both
-
if they have no common attribute, cartesian product
-
if they have an attribute in common, compute all possibilities
-
-
Computes all solutions to a csp
-
$R \bowtie R^\prime$
-
-
Composition of two relations
-
Input – two binary relations $R_{ab}$ and $R_{bc}$ with 1 var in common
-
Output – a new induced relation $R_{ac}$, to be combined by intersection with a pre-existing relation between them
-
bit matrix op is matrix multiplication
-
Generalization as constraint synthesis
-
Produces induced relations a/k/a indirect, inferred, implicit
-
-
Given $V_1$ and $V_2$ and $C_1$ and $C_2$ between them, how do $C_1 \cap C_2$ and $C_1 \bowtie C_2$ relate? – They'll be the same in this situation
-
Given $V_1$, $V_2$, $V_3$, $C_{V_1, V_2}$ and $C_{V_2, V_3}$, write the induced $C_{V_1, V_3}$ in relational algebra. – $C_{V_1, V_3} = \pi_{V_1, V_3}(C_{V_1, V_2} \bowtie C_{V_2, V_3}) = C_{V_1, V_2} \circ C_{V_2, V_3}$
-
Given $V_1, V_2, V_3$, $C_{V_1, V_2}, C_{V_2, V_3}, C_{V_1, V_3}$, $C^{\prime}_{V_1, V_3} = (C_{V_1, V_2} \circ C_{V_2, V_3}) \cap C_{V_1, V_3}$
-
Natural join – synthesized constraint
-
left/right join – synth constraint with some inconsistent tuples
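A toy illustration of composition and intersection on binary relations stored as sets of tuples (set semantics, as in the CSP view); the small relations and variable names are made up for the example:

```python
# Binary relations as sets of pairs; values are illustrative.
C12 = {(1, "a"), (2, "b")}          # constraint between V1 and V2
C23 = {("a", 10), ("b", 20)}        # constraint between V2 and V3
C13 = {(1, 10), (1, 20)}            # pre-existing constraint between V1 and V3

def compose(r_ab, r_bc):
    """R_ac = R_ab o R_bc: pairs (x, z) linked through a shared middle value."""
    return {(x, z) for (x, y) in r_ab for (y2, z) in r_bc if y == y2}

induced = compose(C12, C23)          # {(1, 10), (2, 20)}
tightened = induced & C13            # intersect with the existing constraint: {(1, 10)}
print(induced, tightened)
```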
-
Differences
-
number of relations
-
db have few
-
csp have many
-
-
size of relations
-
dbs have high-arity
-
csps are mostly binary
-
-
bounds of relations
-
dbs are very tight
-
csps are loose
-
-
size of domains
-
db large
-
csp small
-
-
type of domain
-
almost always finite, but csp may be continuous
-
-
number of solutions sought
-
dbs looking for almost all
-
csp only looks for one in general
-
-
cost-model
-
DB is I/O, memory
-
CSP is computational
-
-
Lecture 24: Path Consistency and Global Consistency
-
consistency helps to reduce possible search-space
-
A path of length $m$ is consistent iff for any value $x \in D_{V_0}$ and for any value $y \in D_{V_m}$ that are consistent, there exists a sequence of values in the domains of the variables $V_1, V_2, \ldots, V_{m - 1}$ such that all constraints between them (along the path, not across) are satisfied
-
A CSP is path consistent if any path of any length is consistent
-
However, by Mackworth AIJ77, only paths of length 2 are necessary, $\mathcal{O}(n^3)$
-
algo looks at every vertex as a mid-point $k$, and makes $i$ and $j$ consistent together given $k$,
-
and $R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})$
-
Closure algorithm. $k$ is outer loop
-
Until fixed point
-
for k from 1 to n
-
for i from 1 to n
-
for j from 1 to n
-
$R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})$
-
-
-
-
-
Algo
-
$Y^n \gets R$
-
Until $Y^n = Y^0$
-
For k from 1 to n
-
For i from 1 to n
-
For j from 1 to n
-
See slide for equation
-
-
-
-
-
$Y \gets Y^n$
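A minimal PC-1 sketch over binary relations stored as sets of pairs, using the simple update $R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})$ from the closure description above; whether the slide's equation also composes through $R_{kk}$ is left open here, so treat this as an assumption-laden sketch:

```python
def compose(r_ab, r_bc):
    """Relation composition, as in the relational-algebra sketch earlier."""
    return {(x, z) for (x, y) in r_ab for (y2, z) in r_bc if y == y2}

def pc1(n, rel):
    """rel: dict (i, j) -> set of allowed pairs, defined for every ordered pair i != j
    (use the universal relation where no explicit constraint exists).
    Repeats the k/i/j closure pass until a fixed point is reached."""
    changed = True
    while changed:
        changed = False
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    if len({i, j, k}) < 3:
                        continue
                    tightened = rel[(i, j)] & compose(rel[(i, k)], rel[(k, j)])
                    if tightened != rel[(i, j)]:
                        rel[(i, j)] = tightened
                        rel[(j, i)] = {(y, x) for (x, y) in tightened}  # keep reverse in sync
                        changed = True
    return rel
```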
-
-
-
Difference between k consistent and strong k consistency
-
PC1 terminates, is sound and complete for discrete CSPs - $\mathcal{O}(a^5n^5)$
-
In a complete graph, if every path of length 2 is consistent, the network is path consistent
-
PC1 has two operations, composition and intersection
-
proof is done by induction
-
-
Mohr and Henderson also developed PC-2, 3
-
and Han, Lee created PC4
-
PC5 from Singh uses support bookkeeping from AC6
-
PC8 uses domains, not constraints, 2001 improves on PC8, but isn't tested
-
PC is rarely used except for certain types of constraints
-
monotonic constraints – ordering based
Lecture 25: Path Consistency and Global Consistency – continued
-
completion allows you to use paths of length 2.
-
Strong PC will ensure both PC and AC
-
AC first, then if no inconsistency, PC (likely PC2)
-
Go from weaker first to stronger – making later steps likely cheaper
-
But PC is often unused – except some situations make it much more likely to find a solution
-
PC is also called 3 consistency
Partial Path consistency
-
same as PC, but no universal constraints, defined over cycles
-
graph is triangulated
-
run closure only over the triangles – $\mathcal{O}(n^3)$
-
must be careful for articulation points in the graph
-
PPC is much easier to implement
-
PPC is strictly weaker than PC as a property
-
PC completes graph, PPC will triangulate
-
PC filters more than PPC, edge-by-edge
-
PPC enforces PC on common edges
-
which means that you can find if something is not path consistent
-
From Bliek and Sam-Haroud 1999
-
Can PC detect unsatisfiability when PPC does not? – It seems not – PPC and PC are roughly equivalent, but there are some cases – and the known case was hand-built
Global Consistency properties
-
minimality/decomposability
-
A minimal network – a network where all tuples in a given constraint appear in at least one solution
-
computing a minimal network is NP complete
-
Solving a minimal network is itself NP complete
Lecture 26: Consistency properties
-
Local Consistencies
-
AC
-
GAC
-
Path Consistency – modifies binary constraints, requires a completed graph
-
Partial Path Consistency – triangulates graph, modifies binary relations
-
-
Global consistency properties
-
Consistency in general – it has a solution
-
minimality – constraints are as tight as possible; on binary constraints, elimination of a tuple eliminates a solution; every tuple is a member of at least one solution; an idealized form of path consistency; guarantees consistency; important in interactive product configuration – NP-complete in general (Gottlob, 2011)
-
determining whether or not a CSP is minimal is NP complete
-
Finding a solution of a minimal CSP is still NP complete
-
occasionally called globally consistent
-
-
decomposability
-
a CSP is decomposable if any combination of assignments to any number of variables that is consistent with the constraints can be extended to a complete solution
-
guarantees backtrack-free search
-
strongly $n$-consistent
-
-
-
there are some islands of tractability
-
tree structure – AC ensures tractability
-
convex constraints – path consistency ensures minimality and decomposability
-
Lecture 27: Consistency properties, continued
-
minimality doesn't guarantee BT-free search
-
decomposability ensures no back tracking
-
PC approximates minimality
-
when constraints are convex, PPC on triangulated graph guarantees minimality and decomposability
-
when $\circ$ distributes over $\cap$, then PC1 guarantees that a CSP is minimal and decomposable, and that the outer loop can be removed
-
but this doesn't always hold
-
constraints of bounded difference and temporal reasoning
-
vars,
-
constraints of the form $a \leq Y - X \leq b$, $Y - X = [a, b] = I$
-
composition and intersection from slides
-
Lecture 28: Consistency properties, continued
-
Remember, once decomposability is enforced, search is guaranteed backtrack free
-
When constraints are convex, use of PPC ensures minimality and decomposability, but triangles must be done in PEO
-
When composition distributes over intersection
-
PC1 generalizes Floyd-Warshall (all-pairs shortest path)
-
PC-1 generalizes Warshall (transitive closure)
-
-
Don't confuse properties with algorithms
-
Local consistency
-
remove inconsistent values (node, arc)
-
remove tuples (path consistency)
-
get closer to solution
-
reduce size of problem
-
are cheap
-
-
Aim for global consistency, but it's hard
-
Sometimes (given certain types of constraints or graphs), local consistency guarantees global consistency
-
distributivity in PC, row-convex constraints, special networks
-
Lecture 28: Local Search
-
Systematic/constructive search is all we've talked about
-
Start at an assignment that's not a solution and try to repair it
-
General method:
-
for each variable, choose a value
-
check if solution
-
You're done. Lucky.
-
pick a value that broke a constraint and change it (using min-conflict)
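A hedged min-conflicts sketch of this repair loop: random initial assignment, pick a variable involved in a broken constraint, move it to its least-conflicting value; the `conflicts` callback and `max_steps` parameter are illustrative assumptions:

```python
import random

def min_conflicts(variables, domains, conflicts, max_steps=10_000):
    """conflicts(x, val, assignment): number of constraints broken if x takes val."""
    assignment = {x: random.choice(list(domains[x])) for x in variables}
    for _ in range(max_steps):
        broken = [x for x in variables if conflicts(x, assignment[x], assignment) > 0]
        if not broken:
            return assignment                      # solution found: you're done, lucky
        x = random.choice(broken)                  # a variable in a violated constraint
        assignment[x] = min(domains[x],
                            key=lambda val: conflicts(x, val, assignment))
    return None                                    # ran out of patience (max steps)
```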
-
-
-
Local search is in worst case, $\mathcal{O}(d^n)$
-
States laid on a surface
-
quality/cost is height
-
optimum state maximizes solution quality while minimising cost
-
move around from state to state to find a local optimum
-
Gradient descent – will only ever find a local optimum
-
State is a complete assignment of values to variables
-
how do you move – CSP uses min-conflict
-
eval function rates the quality of a state, generally in number of broken constraints
-
Try
-
Start w/ a full but arbitrary assignment of values
-
reduce inconsistencies by local repairs
-
repeat until
-
A solution is found (a global minimum)
-
The current solution cannot be repaired (local minimum)
-
You've gone through max tries (running out of patience)
-
-
-
Local repair for decision problems, local optimization for optimization problems
-
Greedy types are hill climbing, local beam
-
Local beam
-
start in parallel with a number of different points
-
choose among those another number of best moves
-
remove those without promise
-
and repeat as necessary
-
-
Stochastic methods
-
break out, helps to avoid getting stuck in a local minimum
-
when stuck in a local minimum, walk randomly, pick a random variable, change to a random value
-
tabu search
-
keep a list of tabu positions (those that aren't good), but only a certain number
-
when a new state is visited, it is added to the tabu list, and the oldest entry is forgotten
-
if no better, go to the next state not tabu
-
repeat until a global minimum is found
-
-
sometimes you may randomly ignore a heuristic
-
-
Local search problems
-
get stuck in a local optimum
-
can have issues with plateaus (flat local maxima) – be ready to move to a state with the same quality
-
ridges can go from side to side with minimal progress
-
-
when stuck, do random walk
-
Simulated annealing
-
When stuck in a local optimum, allow steps towards less good neighbors to escape it
-
be more random at the start, be less random as you go
-
Start at a random state, start countdown and loop until over
-
pick a neighbor at random
-
$d$ = quality of neighbor - quality of current state
-
if $d > 0$
-
then move to neighbor and restart
-
otherwise, move to a neighbor with a transition probability $p<1$
-
-
-
Transition probability proportional to $e^{\frac{d}{t}}$
-
$d$ is negative
-
$t$ is countdown
-
As time passes, you're less and less likely to move towards bad neighbors
-
-
-
But all forms of local search are:
-
non-exhaustive
-
non-systematic
-
liable to get stuck in a local optimum
-
non-deterministic – outcome depends on where you start
-
heavy-tailed – longer you go, less likely you are to find a solution
-
-
Is often somewhat probabilistic
Lecture 29: Local Search, Continued
-
starts from a complete assignment, and moves to a new assignment that's better
-
Very greedy approach, but can be parameterized very easily
-
Tends to be very greedy, but only allows improvements
-
others can be better, and allow for random walks/steps – stochastic algorithms
-
stochastic algorithms are not deterministic!
-
Tabu search moves to a state that may be worse, provided it's not in a list of previous states
-
break-out modifies weight of objective function
-
simulated annealing involves countdowns and loops, controlled by the quality of the neighbor and the time; when significantly restricted, it is guaranteed to find an optimum
-
Have issues
-
non-systematic and non-exhaustive
-
have issues getting stuck in local optima
-
non-deterministic!!
-
generally the probability of improving a solution gets smaller, but never actually dies out – heavy-tailed
-
-
So, when you hit a point where it seems improvability drops
-
keep the most promising solution
-
and start someplace else
-
Random restart
-
-
Luby restart profile
-
uses a crazy geometric law to decide how/when/where to restart
-
Genetic algos for local search
-
Combine two complete assignments to generate offspring
-
Start from an initial position
-
encode assignments in a compact manner
-
combine partial solutions to generate new solutions
-
requires initial generation of many random assignments
-
Choose solutions to combine based on quality
-
fitness is used to describe quality, assigns probability for solution
-
Select pairs depending on fitness
-
crossover point is chosen randomly, offspring are generated
-
mutation will randomly change a state
-
and this gets put back in the original population
ERA – Environment, Rules, Agents
-
Environ is $n \times a$ board
-
each var is an agent
-
each position is a value of a domain
-
agent moves in a row on the board
-
each position records the number of violations caused by the fact that agents are occupying other positions
-
agents try to occupy positions where no constraints are broken
-
agents move according to reactive/reaction rules
-
agents may themselves have preferences
-
move rules can be as simple or complex as possible
-
rules:
- least-move
-
choose position with min violation
- better-move
-
choose position with smaller violation
- random-move
-
randomly choose a position
- others
-
combinations of those above
-
No one talks to each other, but share a common context
-
agents kick each other out until everyone is happy
-
very effective in solving very tight, but solvable instances
-
unstable in over-constrained cases
-
agents keep kicking each other out (livelock)
-
livelocks can be used to identify bottlenecks
-
-
very stable on solvable instances
-
oscillates wildly on unsolvable instances
-
Agent motion can be categorized as variable, stable, constant
Lecture 30: Local Search, continued
ERA, Continued
-
Rules help to find a solution somewhat quickly
-
Has less of an issue with getting stuck in various local optima
-
May be very good for resource-allocation problems
-
more dynamic, as there's no global evaluation function
Nothing works, what to do
-
Random restart
-
when no progress is made, restart from a new, randomly selected state
-
save the best results so far
-
-
repeat random restart
-
for a fixed number of iterations
-
until the best results have not been improved on for a certain number of iterations – e.g., a geometric law
-
Evaluation of Local Search
-
Test on a given problem instance
-
an ensemble of problem instances
-
-
Experiment
-
running the algo thousands of times
-
measure some metric as a random variable – i.e., the time needed to find a solution
-
-
Then provide
-
probability distribution function approximated by the number of runs that took a given time to find a solution
-
the cumulative distribution function approximated by the number of runs that found a solution within a given time
-
-
Then compare performance with statistical tests
-
t-tests assume normal distribution of the metrics
-
nonparametric tests don't
-
Consult with a trained statistician
-
Lecture 30: Structure-based methods
-
CSPs represented by networks
-
for bCSPs, graph
-
constraint graph
-
microstructure
-
co-microstructure
-
-
for general CSPs
-
hypergraph
-
dual – allowed tuples connected to each other when shared variables match
-
primal graph – vertices are the original variables; the variables in the scope of a constraint form a clique
-
-
Lecture 31: Structure-based methods, continued
| Graph | Vertices | Edges |
|---|---|---|
| Binary (constraint graph) | Variables | Constraints |
| Non-binary (hypergraph) | Variables | Constraints (hyperedges) |
| Dual | Constraints | Equality constraints on shared variables |
| Primal | Variables | Cliques over the scope of each constraint |
| $\mu$-structure | VVPs | Supports |
| co-$\mu$-structure | VVPs | Conflicts |
-
Incidence graph – variables are one type of node, constraints another; edges connect each constraint to the variables in its scope; the graph is bipartite
Tree-structured CSPs
-
Tree-structured CSPs can be solved in polynomial time
-
apply $revise(V_i, V_j)$ for each node $V_j$ from the leaves to the root, where $V_i$ is its parent (see the sketch below)
-
instantiate variables from root to leaves
-
-
Generally
-
directional consistency from leaves to root
-
solutions built from root to leaves
-
-
This is often the case for configuration problems
-
The width of a tree is always one
-
If the CSP's graph has width $w = 1$, enforcing 2-consistency (arc consistency) guarantees a backtrack-free solution
-
Freuder's theorem: if a CSP has width $w$ (under some ordering) and is strongly $(w+1)$-consistent, then search along that ordering is backtrack-free.
-
Caveat: enforcing higher levels of consistency can add constraints, which changes the graph and may increase its width
-
There are ways to limit this, but the guarantee does not necessarily carry over
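A sketch of the two-pass procedure for tree-structured binary CSPs (directional arc consistency from leaves to root, then backtrack-free instantiation from root to leaves). The tree/constraint interfaces and the choice of the first key of `tree` as the root are illustrative assumptions.

```python
def solve_tree_csp(tree, domains, compatible):
    """Solve a tree-structured binary CSP backtrack-free.
    `tree` maps each variable to its children, `domains` maps variables to lists of
    values, and `compatible(x, vx, y, vy)` tests the binary constraint between x and y."""
    root = next(iter(tree))                     # assume the first key is the root
    order, parent = [root], {root: None}
    for v in order:                             # parents before children
        for child in tree.get(v, []):
            parent[child] = v
            order.append(child)

    # 1) directional arc consistency: revise(parent, child) from the leaves to the root
    for child in reversed(order[1:]):
        p = parent[child]
        domains[p] = [vp for vp in domains[p]
                      if any(compatible(p, vp, child, vc) for vc in domains[child])]
        if not domains[p]:
            return None                         # domain wipe-out: no solution

    # 2) instantiate from the root to the leaves; DAC guarantees a support exists
    solution = {root: domains[root][0]}
    for v in order[1:]:
        p = parent[v]
        solution[v] = next(vv for vv in domains[v] if compatible(p, solution[p], v, vv))
    return solution
```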
-
Cycle-cutset
-
identify a cycle cutset $S$ in the CSP
-
cutset – a set of nodes whose removal breaks all cycles and yields a tree
-
-
Decompose CSP into two partitions
-
$S$
-
$T$ the nodes forming a tree
-
-
Iterate until a solution is found
-
Solve the nodes in $S$
-
Try to extend the solution to $T$
-
-
$|S| < |V|$
-
if $|S| = c$, then $\mathcal{O}(d^c \cdot (n - c) d^2) = \mathcal{O}(n d^{c + 2})$
-
But finding the smallest cutset is NP-hard
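A compact sketch of the cycle-cutset scheme: enumerate assignments to the cutset and try to extend each one with a backtrack-free tree procedure (e.g., the tree sketch above). The `tree_solver` interface is an illustrative assumption.

```python
from itertools import product

def cycle_cutset_solve(cutset, cutset_domains, tree_solver):
    """Enumerate the O(d^c) assignments of the cutset S and extend each to the tree part T.
    `tree_solver(assignment)` is assumed to condition the tree CSP on the cutset assignment
    and return a full solution or None."""
    for values in product(*(cutset_domains[v] for v in cutset)):
        assignment = dict(zip(cutset, values))
        extension = tree_solver(assignment)       # O((n - c) d^2) per attempt
        if extension is not None:
            return {**assignment, **extension}
    return None                                   # no cutset assignment extends to T
```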
Lecture 32: Structure-based methods, continued
Dynamic Dangle Identification
-
Graham reduction is used to detect whether a schema is acyclic
-
join of acyclic schema is polynomial
-
remove vertices w/ degree 1
-
-
After each var assignment/lookahead
-
check connectivity
-
identify dangling trees using graph reduction
-
-
for each dangling root
-
do DAC from leaf to root
-
wipeout means unsolvability
-
-
Restrict search to nodes outside of the identified dangling trees
Directional Path Consistency
-
Given a binary CSP and a fixed variable ordering
-
$\mathcal{O}(n (w^*)^2 d^3)$
-
if $w^* \leq 2$
-
DPC determines consistency
-
guarantees BT-free search given the ordering
-
But only when the induced width is at most 2 do you get strong consistency, minimality, and tractability
-
Biconnected components
-
decomposition into separable graphs
-
consider a CSP whose graph has articulation nodes
-
assume that the largest biconnected component is $b$
-
build a tree whose nodes are biconnected components considering that the articulation node belongs to the parent (or parent and child)
-
build an order using a pre-order traversal of the tree
-
complexity is then bound by the size of the largest biconnected component
Lecture 33: Structure-based methods, continued
Biconnected components
-
requires articulation nodes – nodes shared by two or more biconnected components
-
the complexity is $d^b$, where $d$ is the domain size and $b$ is the size of the largest component
-
Use a pre-order traversal of the component tree, so the backtracking effort is bounded by the size of the largest component
-
Originally from 1982, in Journal of the ACM
Tree Clustering
-
like biconnected components, but clusters may share more than one variable
-
triangulates the graph
-
finds maximal cliques
-
create a tree structure from cliques
-
Then repeat:
-
solve a clique at each node in the tree
-
apply DAC from leaves to root
-
generates solutions in a BT-free manner
-
-
MaxCliques from Algorithmic Graph Theory and Perfect Graphs, Golumbic
-
Triangulation is $\mathcal{O}(n^2)$
-
Max clique is $\mathcal{O}(n + e)$ in triangulated graph
-
$\mathcal{O}(k^r)$ for solving clusters, $k$ is domain size, $r$ is size of largest clique ($t = k^r$)
-
generating a solution is $\mathcal{O}(nt\log t)$, where $t$ is the number of tuples in each cluster (the sorted 'domain' of a super-variable; sorting speeds up the joins); in the best case there are $n$ cliques
-
complexity bounded by size of largest clique:
-
$\mathcal{O}(n^2) + \mathcal{O}(k^r) + \mathcal{O}(nt\log t) = \mathcal{O}(nk^r\log k^r) = \mathcal{O}(nrk^r)$
-
Lecture 34: Structure-based methods, continued
Bucket Elimination
-
Built on non-binary CSPs
-
Dechter 1997
-
Choose a variable ordering – a perfect elimination ordering (PEO) is recommended
-
create buckets for each variable, placing each constraint in the bucket of the deepest variable in its scope
-
create tree node for each bucket
-
From bottom to top
-
join the relations in the bucket (including the bucket variable's domain, viewed as a unary relation)
-
project the bucket's variable out of the join and place the result in the bucket of the deepest remaining variable in its scope
-
the top-down pass builds a solution backtrack-free
-
Can handle cycles!
-
and it amounts to an efficient form of directional consistency (like DAC) for non-binary CSPs
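A compact sketch of bucket elimination over table constraints, using naive join and projection; the data representation (a scope tuple plus a set of allowed tuples) and all names are illustrative assumptions, not Dechter's original pseudocode.

```python
def join(r1, r2):
    """Natural join of two relations given as (scope_tuple, set_of_tuples)."""
    s1, t1 = r1
    s2, t2 = r2
    scope = list(s1) + [v for v in s2 if v not in s1]
    shared = [v for v in s2 if v in s1]
    out = set()
    for a in t1:
        for b in t2:
            row1, row2 = dict(zip(s1, a)), dict(zip(s2, b))
            if all(row1[v] == row2[v] for v in shared):
                merged = {**row1, **row2}
                out.add(tuple(merged[v] for v in scope))
    return tuple(scope), out

def project_out(rel, var):
    """Project the variable `var` out of a relation."""
    scope, tuples = rel
    keep = [i for i, v in enumerate(scope) if v != var]
    return tuple(scope[i] for i in keep), {tuple(t[i] for i in keep) for t in tuples}

def bucket_elimination(order, domains, constraints):
    """`order` is the variable ordering (ideally a PEO); `constraints` is a list of
    (scope, set_of_allowed_tuples).  Returns one solution or None."""
    depth = {v: i for i, v in enumerate(order)}
    buckets = {v: [] for v in order}
    for scope, tuples in constraints:                 # bucket of the deepest variable in scope
        buckets[max(scope, key=depth.get)].append((tuple(scope), set(tuples)))
    for v in order:                                   # domains as unary relations
        buckets[v].append(((v,), {(val,) for val in domains[v]}))

    for v in reversed(order):                         # bottom-up: join, then project out v
        rel = buckets[v][0]
        for other in buckets[v][1:]:
            rel = join(rel, other)
        if not rel[1]:
            return None                               # empty relation: inconsistent
        reduced = project_out(rel, v)
        if reduced[0]:                                # pass result to the deepest remaining var
            buckets[max(reduced[0], key=depth.get)].append(reduced)
        buckets[v] = [rel]                            # keep the joined relation for the top-down pass

    solution = {}
    for v in order:                                   # top-down: backtrack-free value selection
        scope, tuples = buckets[v][0]
        for t in tuples:
            row = dict(zip(scope, t))
            if all(solution.get(u, row[u]) == row[u] for u in scope):
                solution[v] = row[v]
                break
        else:
            return None                               # should not happen after a clean bottom-up pass
    return solution
```

The bottom-up pass can generate relations whose size is exponential in the induced width, which is where the space cost of bucket elimination comes from.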
Lecture 35: Structure-based methods, continued
Bucket Elimination/Tree Clustering
-
linear in $n$, exponential in induced width + 1
-
the induced width (plus one) is the number of variables in the largest cluster
-
gives fixed parameter complexity
Bucket Elimination
-
very much like Gaussian elimination
-
Useful for Bayesian networks – graphical models of all kinds
Tree Decomposition
-
Generalizes all of structure methods
-
comes from database theory
-
organizes any constraint problem into a tree
-
$(T, \chi, \psi)$ of $P = (X, D, C)$
-
$T$ – $(V, E)$ the tree
-
$\chi(v) \subseteq X$ – maps each tree node $v$ to the set of variables it contains
-
$\psi(v) \subseteq C$ – maps each tree node $v$ to a set of constraints
-
Each constraint must appear on at least one node in the tree, and all its variables are in that node
-
the nodes in which any given variable appears induce a single connected subtree (connectedness property): every node on the path between two nodes containing the variable also contains that variable
-
-
join trees and acyclic DBs from database theory
-
organizes the constraint network into a tree structure, a join-tree
-
labels tree nodes with variables and constraints – labels must obey some rules
-
Solving
-
bottom-up – joins relations and projects on parent
-
top-down generates the solution
-
-
Complexity bound by width
-
tree width is the max number of variables in a node - 1
-
hyperwidth is the max number of constraints in a node
-
-
the separator of two adjacent nodes is the set of variables they have in common
-
However, finding an optimal tree decomposition is NP-hard; approximate it with heuristics such as min-fill (see the sketch at the end of this section)
-
Quite a few different tree-decomposition techniques – Gottlob's HYPERTREE seems to be one of the best
-
parameters
-
tree width – max number of variables in any node minus 1
-
hyperwidth – max number of constraints in any node
-
-
theory – characterizes tractable csps
-
complexity is bound by tree/hyper-width
-
fixed parameter complexity – independent of number of variables
-
-
practice – uses structure to solve csps
-
requires balancing time and space
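A sketch of the min-fill heuristic used to approximate a good elimination ordering (and hence a tree decomposition of small width); the adjacency-map representation of the primal graph is an illustrative assumption.

```python
def min_fill_order(neighbors):
    """Repeatedly eliminate the vertex whose elimination adds the fewest fill edges,
    connecting its remaining neighbors into a clique.  `neighbors` maps each variable
    to the set of adjacent variables in the primal graph.  Returns an elimination
    ordering and the induced width of that ordering."""
    adj = {v: set(ns) for v, ns in neighbors.items()}
    order, width = [], 0
    while adj:
        def fill(v):                              # missing edges among v's neighbors
            ns = list(adj[v])
            return sum(1 for i in range(len(ns)) for j in range(i + 1, len(ns))
                       if ns[j] not in adj[ns[i]])
        v = min(adj, key=fill)
        width = max(width, len(adj[v]))           # cluster size at elimination time
        for a in adj[v]:                          # connect v's neighbors into a clique
            for b in adj[v]:
                if a != b:
                    adj[a].add(b)
        for a in adj[v]:
            adj[a].discard(v)
        order.append(v)
        del adj[v]
    return order, width
```

The returned width bounds the exponent in the time/space complexities above for this particular ordering.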
Lecture 26: More on Constraint Consistency
Global Consistency Properties
-
minimality – every relation is exactly the projection of the solution set onto its scope (the relations 'join completely'); related to $n$-consistency
-
decomposability – strong $n$-consistency
-
solvability
Local Properties
-
AC/GAC – Domains
-
Path Consistency – Constraints, add new constraints (constraint synthesis)
-
Partial Path Consistency – Constraints, add new constraints
-
Directional Path Consistency – Constraints, add new constraints
-
Types of constraint algorithms
-
Binary
-
Non-Binary
-
Global Constraints
-
-
For continuous domains, work on box-consistency
-
on intervals, box consistency can occasionally be applied as well
-
weighted CSPs can also be dealt with, with a variant of AC
Info
-
arc consistency – $(1,1)$-consistency – every consistent assignment of one variable can be extended to a consistent assignment of one more variable
-
path consistency – $(2,1)$-consistency – every consistent assignment of two variables can be extended to a consistent assignment of one more variable
-
$(i,j)$-consistency – every consistent assignment of $i$ variables can be extended to $j$ additional variables; when $i + j = n$ this yields a global property
-
strong $k$-consistency is $k$-consistency down to $1$ consistency
Lecture 27: Still more on Constraint Consistency
-
domain minimality – constraints over arity 1, Gottlob 2011
-
however, even if domain minimality and constraint minimality hold, solving is still NP complete
-
global inverse consistency – any consistent value for one variable can be extended to a consistent assignment of the remaining $n - 1$ variables
$(i, j)$-consistency
-
$j = 1$
-
2 – AC
-
3 – PC
-
k – $k$-consistency
-
-
$(1, n - 1)$ – domain minimality, global inverse consistency
-
$(1, k)$ – costs roughly $d^{k + 1}$; guarantees that a consistent value can be extended over $k$ more variables
-
NIC – Neighborhood Inverse Consistency, for every variable, look at its neighbors, and make consistent with those
Lecture 28: More Constraint Consistency
Singleton Arc Consistency
-
family of algos based on singleton test
-
the singleton test resembles a single step of search (assign a value, then propagate)
-
A CSP is SAC iff it remains arc consistent after assigning any single variable-value pair (VVP) – see the sketch after this list
-
Algo
-
Until No change
-
for each var
-
for each value
-
assign the val to the var
-
if CSP is AC, keep
-
else Remove
-
-
-
-
-
In practice, when search hits a lot of backtracks, trigger a higher level of consistency
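A sketch of SAC-1, the simplest algorithm of the family: run the singleton test on every variable-value pair until a fixpoint. The `arc_consistent` routine is assumed to be any AC algorithm that returns the filtered domains, or None on a wipe-out; all names are illustrative.

```python
def sac1(variables, domains, arc_consistent):
    """Enforce singleton arc consistency on `domains` (dict of lists), in place."""
    changed = True
    while changed:
        changed = False
        for var in variables:
            for val in list(domains[var]):
                test = {v: list(d) for v, d in domains.items()}
                test[var] = [val]                      # singleton test: assign val to var
                if arc_consistent(test) is None:       # AC wipe-out under this assignment
                    domains[var].remove(val)           # val is not SAC: remove it
                    changed = True
                    if not domains[var]:
                        return None                    # the CSP itself is inconsistent
    return domains
```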
Inverse consistency
-
(1, 2) consistency – Path Inverse consistency
-
Neighborhood Inverse – see previous
-
GIC – Global Inverse Consistency – ensure domain minimality
Special Constraints – AC
-
functional – A constraint C is functional with regard to D iff for all $v \in D$ there exists at most one $w \in D$ such that $C(v,w)$
-
monotonic constraints – a constraint $C$ is monotonic with regard to $D$ iff there exists a total ordering on $D$ such that, for all $v, w \in D$, if $C(v,w)$ holds then $C(v^{\prime}, w^{\prime})$ holds for all $v^{\prime}, w^{\prime} \in D$ with $v^{\prime} \leq v$ and $w^{\prime} \leq w$
Non-Binary Special constraints
-
almost all properties and algorithms discussed were restricted to binary constraints
-
consistency for non-binary is the topic of current research
-
in particular domain filtering that doesn't change topology or modify constraint definitions
-
relational $m$ consistency – adding constraints/changing constraint definitions
-
-
Domain Filtering
-
GAC
-
Singleton GAC
-
maxRPWC, rPIC, RPWC, etc
-
-
Relational consistency
-
Strong consistencies
-
Simple Tabular Reduction
-
Lecture 29: Extensions to BT Search
-
search with a bounded number of backtracks
-
bounded backtrack depth search
-
limited discrepancy search
-
credit-based backtrack search
-
randomized backtrack search
Credit Based Search
-
start with a given credit, usually $n^3$
-
assign 1/2 credit to current assignment, 1/2 to the remaining
-
keep going in a depth-first manner, until credit is used up, chronologically backtrack from there
-
Used in conjunction with backtrack bounded search
Limited Discrepancy Search
-
value-ordering heuristics may be unreliable ('blind') at shallow levels of search
-
disobeys the heuristic at most a given number of times (the allowed discrepancies)
-
Matthew Ginsberg
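A sketch of limited discrepancy search in its "at most $k$ discrepancies" form; `children` (successors in heuristic order, best first) and `goal` are assumed interfaces, and the outer loop over $k$ is one common way to run it.

```python
def lds(root, children, goal, max_discrepancies):
    """Limited discrepancy search: a discrepancy means choosing any child other
    than the heuristically best one."""
    def probe(node, k):
        if goal(node):
            return node
        for rank, child in enumerate(children(node)):
            if rank > 0 and k == 0:
                break                       # out of discrepancies: follow the heuristic only
            result = probe(child, k - 1 if rank > 0 else k)
            if result is not None:
                return result
        return None

    # allow 0 discrepancies first, then 1, 2, ... (iterative deepening over discrepancies)
    for k in range(max_discrepancies + 1):
        solution = probe(root, k)
        if solution is not None:
            return solution
    return None
```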
Randomized BT
-
in systematic search, the ordering of variables/values determines which parts of the search space are explored first
-
randomization allows us to explore a wider portion of the search tree
-
thrashing causes stagnation of BT search, interrupt it, and restart
-
Pick random variables at instantiation
-
and random values when choosing values
-
Use the restart schedule when thrashing
-
Fixed cutoff and universal – Luby
-
Randomization and rapid restarts (Gomes) – fixed optimal cutoff; a priori knowledge of the cost distribution is required
-
randomization and geometric restarts (Walsh) – $C_{i + 1} = r C_i$, $C_0 = n$
-
Randomization and dynamic geometric restarts (Guddeti) – $C_{i + 1} = \begin{cases} rC_i & \text{when the solution improved} \\ C_i & \text{otherwise}\end{cases}$
-
Bayesian approach (Kautz)
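A small sketch of the geometric and dynamic-geometric cutoff schedules listed above; the values of $n$ and $r$ are illustrative.

```python
def geometric_cutoffs(n, r, runs):
    """Walsh-style geometric restart cutoffs: C_0 = n, C_{i+1} = r * C_i."""
    c, out = n, []
    for _ in range(runs):
        out.append(int(c))
        c *= r
    return out

def dynamic_geometric_cutoff(c, r, solution_improved):
    """Guddeti-style dynamic update: grow the cutoff only when the best
    solution found so far improved during the last run."""
    return r * c if solution_improved else c

# Illustrative: cutoffs for n = 50 variables and ratio r = 1.5
print(geometric_cutoffs(50, 1.5, 6))   # [50, 75, 112, 168, 253, 379]
```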
-