yuenseng

The Satisfiability Problem

5 Nov 2024

The satisfiability problem is the problem of determining whether a given Boolean formula has a satisfying assignment. The following language SAT is a formal representation of the problem:

SAT = { φ | φ is a Boolean formula that has a satisfying assignment }

A satisfying assignment is an assignment of 0/1 or True/False to the variables in the formula such that the formula evaluates to 1 or True. For instance,

φ = (x₁ ∨ x₂) ∧ (¬x₁ ∨ x₃)

is satisfiable because the formula evaluates to True when:

x₁ = 0;
x₂ = 1;
x₃ = 1.

SAT has a special place in the study of computational complexity because the problem is the first to be proven NP-complete. NP is the class of problems (or languages) that can be verified in polynomial time. Now, a language with a verifier that runs in polynomial time may or may not be decidable in polynomial time. That is to say, we can verify a given solution to the problem in polynomial time, but we may not be able to find a solution in polynomial time. This is in contrast to the class P, which consists of problems that can be both verified and solved in polynomial time. In other words:

an NP problem can be verified quickly, but may not be solved quickly;
a P problem can be both verified and solved quickly.

SAT is in NP because, given a satisfying assignment, we can verify in polynomial time whether the formula evaluates to True. But we don't know whether an algorithm exists that can find a satisfying assignment in polynomial time. Note that SAT is solvable, just that we don't know how to solve it efficiently. One naive solution is to try all possible assignments to the variables in the formula, and verify each, until we find one that is satisfying. But this algorithm runs in exponential time, i.e. it is very slow.

Not only is SAT in NP, it is also NP-complete. A language B is NP-complete if:

B is in NP;
for every language A in NP, there is a polynomial-time reduction from A to B.

Reduction is a technical term, but we can read term as "transforming" one problem into another. Reduction is a powerful technique because it allows us to use a solution to one problem to solve another, if we can show that the two problems are equivalent in some sense, and present some way to transform one into the other. A polynomial-time reduction is one that runs in polynomial time.

The theorem that SAT is NP-complete is known in the literature as the Cook-Levin theorem. The following is one leaky proof of the theorem.

For every language A in NP, there exists some non-deterministic Turing machine M that decides A in polynomial time. The computation of a particular nondeterministic branch of M on some input w can be represented as a sequence of configurations called a computation history. Now, the trick is to show that a given computation history can be represented as a Boolean formula, such that the formula is satisfiable if and only if the computation history is an accepting computation of M on w. What we are doing here effectively is reducing an instance of an NP problem into an instance of SAT, such that a solution to SAT, i.e. some satisfying assignment, is also a solution to the original NP problem, i.e. an accepting computation of M on w.

Let A be a language in NP. By definition, there exists a nondeterministic Turing machine N that decides A in n^k time for some constant k. When N is given an input w, the machine goes through a sequence of configurations. We can construct a computation history of N on w by stacking these configurations on top of each other to form a table with n^k rows:

#q₀w₁w₂...w_n# (starting confiig)
#....................# (starting config + 1)
#....................# (starting config + 2)
...
#....................# (accepting config)

A computation history is an accepting one if the last row in the table is an accepting configuration, i.e. the row contains some accepting state, q_accept. Note that a computation history represents a single nondeterministic branch of N on w. There may be many such branches, but we only need one to be accepting to say that N accepts w.

Next, we construct a Boolean formula φ that can represent a computation history. Furthermore, φ is satisfiable if and only if the computation history is an accepting one. φ will be a conjunction of four clauses:

φ = φ_cell ∧ φ_start ∧ φ_accept ∧ φ_move

Each clause introduces some constraints on the arrangements of symbol in the table. Consider the top-left cell, which we denote (1,1). (1,1) contains a #. Now, each cell in any table contains a symbol from the set of all possible symbols C = Q ∪ Σ ∪ {#}, but we want φ to constrain the symbol on the top-left cell to be a #. We can achieve this through the following clause:

φ_1,1 = φ_1,1-ATLEAST1 ∧ φ_{1,1-NOMORETHAN1}

where

φ_ATLEAST1 = x_1,1,c₁ ∨ x_1,1,c₂ ∨ ... ∨ x_{1,1,c_|C|}

We say that x_1,1,c = True if the symbol in the top-left cell contains a c. Thus, a disjunction of all possible symbols in C will be True if there is at least one symbol in the top-left cell. On the other hand, we want to ensure that there is no more than one symbol in the top-left cell, and we achieve this through φ_NOMORETHAN1, applying the same principle as above.

In general,

φ_cell = ∧_{1≤i,j≤n^k} ( φ_i,j )
φ_cell = ∧_{1≤i,j≤n^k} ( φ_i,j-ATLEAST1 ∧ φ_{i,j-NOMORETHAN1} )
φ_cell = ∧_{1≤i,j≤n^k} [(∨_s∈C ( x_i,j,s ) ) ∧ (∧_{s,t∈C, s≠t} ( ¬x_i,j,s ∨ ¬x_i,j,t ) )]

Next, we can constrain the symbols in the first and last row to be the starting configuration and accepting configuration, respectively:

φ_start = (x_1,1,#) ∧ (x_1,2,w₁) ∧ (x_1,3,w₂) ∧ ... ∧ (x_{1,n^k-1,w_n}) ∧ (x_1,n^k,#)
φ_accept = ∨_1≤j≤n^k (x_{n^k,j,q_accept})

Finally, we can constrain the symbols in the table such that they represent valid transitions between configurations. We achieve this by considering each 3x2 window in the table, and ensuring that the symbols in the window represent a legal transition, based on the transition function of N. For any cell (i, j), the window centered at (i, j) contains cells: (i, j-1), (i, j), (i, j+1), (i+1, j-1), (i+1, j), and (i+1, j+1). Thus, the set of legal windows centered at (i, j) can be written as:

φ_move-i,j = ∨_{p,q,r,s,t,u∈C} (x_i,j-1,p ∧ x_i,j,q ∧ x_i,j+1,r ∧ x_i+1,j-1,s ∧ x_i+1,j,t ∧ x_i+1,j+1,u)

for all p, q, r, s, t, u in C that represent valid transitions. In general,

φ_move = ∧_{1≤i<n^k,1<j<n^k}( φ_move-i,j )

To summarise:

φ = φ_cell ∧ φ_start ∧ φ_accept ∧ φ_move
φ_cell = ∧_{1≤i,j≤n^k} [(∨_s∈C ( x_i,j,s ) ) ∧ (∧_{s,t∈C, s≠t} ( ¬x_i,j,s ∨ ¬x_i,j,t ) )]
φ_start = (x_1,1,#) ∧ (x_1,2,w₁) ∧ (x_1,3,w₂) ∧ ... ∧ (x_{1,n^k-1,w_n}) ∧ (x_1,n^k,#)
φ_accept = ∨_1≤j≤n^k (x_{n^k,j,q_accept})
φ_move = ∧_{1≤i<n^k,1<j<n^k}( φ_move-i,j )
φ_move-i,j = ∨_{p,q,r,s,t,u∈C} (x_i,j-1,p ∧ x_i,j,q ∧ x_i,j+1,r ∧ x_i+1,j-1,s ∧ x_i+1,j,t ∧ x_i+1,j+1,u)

The construction of φ is complete. Moreover, it is correct, because the formula has been constrained by φ_move to allow only valid transitions between configurations, based on the transition function of N. Thus, φ is satisfiable if and only if the computation history is an accepting one. In other words, the problem of determining whether N accepts w has been reduced to the problem of determining whether φ has a satisfying assignment.

Next, we show that the construction of φ takes polynomial time. We can do this by considering the size of φ. The number of cells in the table is n^k × n^k = n^2k, and each cell is associated with |C| variables. Since |C| depends on the size of the input alphabet, which is a parameter of N, and not on the size of the input, n^k, the total number of variables in φ is O(n^2k).

Now, we consider each components of φ:

φ_cell contains a fixed size fragment for each cell in the table (i.e. it doesn't depend on the size of the input), and thus takes O(n^2k) time to construct;
φ_start has a fixed-size fragment for each cell in the first row of the table, and thus takes O(n^k) time to construct;
φ_accept has a fixed-size fragment for each cell in the last row of the table, and thus takes O(n^k) time to construct;
φ_move contains a fixed-size fragment for the 3x2 window centered at each cell in the table, and thus takes O(n^2k) time to construct.

Thus, the total size of φ is O(n^2k), and the construction of φ takes polynomial time. An alternative explanation that may be slightly more intuitive is that each component of φ contains nearly identical fragments that are repeated O(n^2k) times. The size of all fragments are, in some sense, parameterised by the size of the Turing machine N, and not by the size of the input w. Thus, we are able to apply a relatively simple algorithm to contruct each component of φ, though the algorithm may repeat the same steps many times.

SAT is one of many NP-complete problems (other NP-complete problems have been found since the discovery of SAT). The significance of SAT being NP-complete is that, if we can find a polynomial-time algorithm to solve SAT, then we can construct a polynomial-time algorithm to solve all problems in NP. Given a problem A in NP:

Reduce A to SAT in polynomial time;
Solve SAT in polynomial time.
The solution to SAT is also a solution to A.

This is a powerful result because most problems in NP are believed to be hard to solve. These NP-hard problems are encountered in many applications, such as route planning in logistics, dependency resolution in software engineering, protein folding in biology, and many others. SAT itself is used in the verification of electronic circuits in hardware design. Current methods for solving SAT, i.e. DPLL and CDCL, both run in exponential time in the worst case. Both algorithms are based on backtracking search, which is a brute-force method that tries all possible assignments to the variables in the formula, backtracking when an invalid assignment is found. If we can find a polynomial-time algorithm to solve SAT, then we can apply use that algorithm to solve many other problems in NP, which would have a significant impact on many applications.