## Sunday, May 10, 2020

### [mwuzwpqq] Difficult (small) SAT

Consider a satisfiable Boolean formula in conjunctive normal form (CNF).  There may be multiple solutions, i.e., multiple interpretations that satisfy it.

Define a path to a solution as a sequence of following:

1. Pick a variable and assign a value to it.  The assigned value must be one that leads to a satisfying interpretation.
2. Do unit propagation to get more assignments to variables (and simplification of the formula).
3. Repeat steps 1 and 2 until all variables are assigned.

Define the cost of a path as follows.  Start at zero cost.  Each time we assign a variable in step 1, consider the variables that had already been assigned at that point, then, if the latest variable had only one possible correct value, add 1 to the cost.  On the other hand, if both true and false can lead to satisfying interpretations, then add 0.  That is, arbitrary choices are free.  (This tends to penalize formulae with multiple satisfying interpretations.)  The unit propagation of step 2 is also always free.

Find a formula whose minimum cost path to a satisfying interpretation has relatively high cost.

I suspect that if we ask to maximize minimum cost, it will result in formulae that have a very regular structure, and, knowing that structure, the problem will be very easy to solve.  So, aim for high cost but not too high.

Intended application is SAT puzzles for humans to solve.  Such puzzles are small enough that for a given puzzle, finding the minimum cost path with a computer is feasible (e.g., with A* search).

What is the next most powerful tool after unit propagation?  Probably eliminating a possible value for a variable because that value leads, via one round of unit propagation, to invalidity.  Consider adding this as a free step in the path as well.

Dual problem: given an invalid (unsatisfiable) formula, produce a short proof of its unsatisfiability.  We seek formulae whose shortest proofs are relatively long.  Unlike the satisfiable case, proofs are trees, not paths, so the cost is the size of the tree.  There will not be the "arbitrary" assignments which can be assigned zero cost, because both assignments always have to be considered.  There are probably other proof gadgets which ought to have zero cost, e.g., permutation of variables.