PDF of Eric’s handwritten notes are here.

Boolean formula

Variables: x_1, x_2,\dots,x_n taking values True or False

Literals: x_1,\overline{x_1},x_2,\overline{x_2},\dots,x_n,\overline{x_n}.

Operators: \vee= OR, \wedge= AND.

Clause is an OR of several literals. Example: (\overline{x_3}\vee x_5\vee\overline{x_2}\vee x_1).

Boolean formula f is in conjunctive normal form (CNF) means that it is the AND of m such clauses. Example: (x_2)\wedge(\overline{x_3}\vee x_5\vee\overline{x_2}\vee x_1)\wedge(\overline{x_2}\vee\overline{x_1}).


Input: given formula f in CNF with n variables and m clauses.

Output: assignment satisfying f if one exists, and NO if no satisfying assignment exists.

k-SAT: same as SAT but the input f has clauses with \le k literals in each. We’ll see that SAT is NP-complete and k-SAT is NP-complete for each k\ge 3.

Claim: 2-SAT has a polynomial-time algorithm.

Take input f for 2-SAT. We’ll assume all clauses have size exactly 2. What about clauses of size 1?  A clause of size 1 is called a `unit clause’.

For f, take a unit clause, say literal a_i:

  • satisfy this clause by setting a_i=T
  • remove all clauses containing a_i (these are all satisfied)
  • drop any occurrences of \overline{a_i}.

Let f' be the resulting formula.
Observation: f is satisfiable \Leftrightarrow f' is satisfiable.

Thus we can repeat the above procedure until all unit clauses are eliminated.  Then we are left with a formula where all clauses are of size exactly 2, or we have some empty clauses (the clauses were never eliminated but the literals were dropped).  If we have an empty clause then the formula is not satisfiable, so we just output NO.

Take f with clauses of size exactly 2, and n variables and m clauses. Create a directed graph as follows:

  • 2n vertices corresponding to x_1, \overline{x_1},x_2,\overline{x_2},\dots,x_n,\overline{x_n}
  • 2m edges corresponding to 2 “implications” per clause
  • example: (x_i\vee\overline{x_j})
    • if x_i=F then we need to set x_j=F to satisfy this clause.
    • if x_j=T then we need x_i=T.
    • so we include edges \overline{x_i}\rightarrow\overline{x_j} and x_j\rightarrow x_i.
  • In general for clause (\alpha\vee\beta), add edges \overline{\alpha}\rightarrow\beta and \overline{\beta}\rightarrow\alpha.

Example: f=(\overline{x_1}\vee\overline{x_2})\wedge(x_2\vee x_3)\wedge(\overline{x_3}\vee\overline{x_1}). (see PDF for the graph). Note path x_1\rightarrow\overline{x_2}\rightarrow x_3\rightarrow\overline{x_1}. So if x_1=T then we need \overline{x_1}=T (i.e., x_1=F), contradiction. But if x_1=F then we can set x_2=T and x_3 be anything to satisfy f.

Note: a path x_i \leadsto x_j means setting x_i=T we must also set x_j=T. Thus if there’s a path x_i\leadsto\overline{x_i} then we can’t set x_i=T and satisfy f. Similarly if there’s a path \overline{x_i}\leadsto x_i. Therefore if x_i and \overline{x_i} are in the same SCC, then f is unsatisfiable.

Lemma: f is satisfiable  \Leftrightarrow\ \forall_i, x_i and \overline{x_i} are in different SCC’s.

We just argued that if \exists_i where x_i and \overline{x_i} are in the same SCC, then f is unsatisfiable. Thus we proved \Rightarrow. We now need to show \Leftarrow which we’ll do by giving an algorithm.

Key idea: take a sink SCC S and set S=T (by satisfying all literals in S). Note that S has no outgoing edges so no implications from setting S=T. But we’ve also set \overline{S}=F (does \overline{S} have incoming edges?)

Take a source SCC S' and set S'=F. Note that S' has no incoming edges so we’ll never be forced to set S' to T. But we’ve also set \overline{S'}=T (does \overline{S'} have outgoing edges?)

Lemma: S is a sink SCC \Leftrightarrow\ \overline{S} is a source SCC.

So we take a sink SCC S:

  • set S=T (and thus \overline{S}=F)
  • remove S and \overline{S}
  • repeat until empty graph

This is valid because no variable x_i has x_i and \overline{x_i} in the same SCC. Just need to prove the lemma now.

Claim: \alpha\leadsto\beta\Leftrightarrow\overline{\beta}\leadsto\overline{\alpha}. (\leadsto means a path)

Proof: for \Rightarrow: suppose path \alpha\leadsto\beta is \gamma_0\rightarrow\gamma_1\rightarrow\cdots\rightarrow\gamma_l where \gamma_0=\alpha and \gamma_l=\beta. Note that the edge \gamma_i\rightarrow\gamma_{i+1} comes form the clause (\overline{\gamma_i}\vee\gamma_{i+1}) and the other edge is \overline{\gamma_{i+1}}\rightarrow\overline{\gamma_i}. Thus we have the path \overline{\gamma_l}\rightarrow\overline{\gamma_{i-1}}\rightarrow\cdots\rightarrow\overline{\gamma_0} where \overline{\gamma_l}=\overline{\beta} and \overline{\gamma_0}=\overline{\alpha}.

For \Leftarrow, similarly we can construct a path \alpha\leadsto\beta from a path \overline{\beta}\leadsto\overline{\alpha}.  \Box

Take SCC S. For \alpha,\beta\in S, there are paths \alpha\leadsto\beta and \beta\leadsto\alpha. Thus there’s also paths \overline{\beta}\leadsto\overline{\alpha} and \overline{\alpha}\leadsto\overline{\beta}. So \overline{\alpha} and \overline{\beta} are in the same SCC S'. Therefore \overline{S} is a SCC. So S is a SCC \Leftrightarrow\ \overline{S} is a SCC.

Moreover, \overline{S} has no incoming edges (so source SCC) if and only if S has no outgoing edges (sink SCC). This proves the lemma.