Homework 8

Due by 8:00pm, Tuesday, April 29, 2025.

Make sure you follow all the homework policies.

All submissions should be done via Autolab.

The support page on SAT solvers could be useful for Question 3.

Note that by contrast, S is an independent set if there is no edges present among the vertices in S.

How are graphs represented in Q1 and Q2?

It does not matter: you can assume either the adjacency list representation or the adjacency matrix representation-- whatever is more covenient for you.

Question 1 (Clique problem) [50 points]

In this problem, we will consider a problem that is essentially the "complement" of the independent set problem. Given a graph $G=(V,E)$, a clique is a subset $S\subseteq V$ such that all $\binom{|S|}{2}$ edges between the vertices in $S$ exist.¹ As we have done in class, we consider the following decision version of the problem of finding the largest clique in a graph.

The problem

Given a graph $G$ and a number $k$, does $G$ contain a clique of size at least $k$? I.e. does there exists a subset $S\subseteq V$ such that $|S|=k$ and there are all possible $\binom{k}{2}$ edges present among the vertices in $S$?

Sample Input/Output pairs

For the sample inputs, let us consider the following graph, which we will call $G_0$:

Input: $G_0, 3$
Output: true (Consider the three blue nodes)

Input: $G_0, 4$
Output: false (Every subset of 4 nodes has at least 2 out of the 6 edges missing)

There are two parts to the problem:

Part (a): Show that the Clique problem is in $\mathcal{NP}$.
Part (b): Show that the Clique problem is $\mathcal{NP}$-complete.

Hint

It might be useful to consider the complement graph of $G$ and figure out what a clique in $G$ looks like in its complement. (The complement of $G=(V,E)$ is the graph $H=(V,E')$, where $(u,v)\in E'$ if and only if $(u,v)\not\in E$: i.e. if an edge is present in $G$, then it not present in $H$ and if an edge is not present in $G$, then it is present in $H$.)

Submission

Submit part (a) and (b) separately

You need to submit two (2) PDF files to Autolab: one for part (a) and one for part (b). While you can assume part (a) as a given for part (b), to get credit for part (a) you have to submit you solution for part (a) separately from part (b).

Make sure you submit the correct PDF to the correct submission link on Autolab. If you do not (e.g. if you submit Q1(a) PDF to Q2(a) or even Q1(b)), then you will lose ALL points.

We recommend that you typeset your solution but we will accept scans of handwritten solution-- you have to make sure that the scan is legible.

PDF only please

If Autolab cannot display your file, (irrespective of the reason) then you will get a zero (0) on the entire question.

Autolab might not be able to display files in formats other than PDF (e.g. Word cannot be displayed). Note that Autolab will "accept" your submission even if you submit non-PDF file, so it is YOUR responsibility to make sure you submit in the correct format.

Also the file size has to be at most 3MB.

Grading Guidelines

We will follow the usual grading guidelines for non-programming questions. Here is a high level grading rubric specific to part (a) of this problem:

Proof idea: 10 points. Nothing else is needed.

and here is the high level grading rubric for part (b):

Reduction idea: 8 points.
Reduction details: 8 points.
Proof idea: 8 points.
Proof details: 8 points. The proof details should contain the complete proof that the Clique problem is $\mathcal{NP}$-complete.
Runtime analysis of the reduction: 8 point.

Note

If you do not have labeled and separated out proof idea, reduction idea, proof details, reduction details and runtime analysis for part (b) you will get a zero(0) irrespective of the technical correctness of your solution.

Templates

Download LaTeX template Download Microsoft Word template

Note

You must explicitly list your sources and collaborators when you upload your submission to Autolab. Note that there are only five allowed sources. If you have used a source that is not allowed, please do not submit your homework. If you did not consult any source or did not collaborate with anyone just say None.

Question 2 (MaxClique problem) [25 points]

So far we have focused on Boolean problems for $\mathcal{NP}$-completeness. There is also a theory corresponding to when the output is not just {true,false}. In this problem, we will consider perhaps a more natural problem corresponding to the largest clique problem instead of the Clique problem we saw in Problem 1.

The problem

Given a graph $G$, output a largest clique of $G$, i.e., the output $S$ should be a clique and it must be the case that $G$ does not have a clique of size $|S|+1$.

Sample Input/Output pairs

Input: $G_0$ (from Question 1)
Output: The set of blue vertices.

In this problem we will see how one can reduce the MaxClique problem to the Clique problem (from Question 1). There are two parts to the problem.

Part (a): For this part, assume that we are interested in the simpler problem of outputting the size of a largest clique in $G$ (instead of actually outputting a largest clique). Further, assume that the Clique problem from Question 1 can be solved in time $T(n)$ on any graph of size $G$ (and any value of $k$).
- Sub-part (1): Show that one can compute the size of a largest clique in time $O(n\cdot T(n))$.
- Sub-part (2): Show that one can compute the size of a largest clique in time $O(\log{n}\cdot T(n))$.
Part (b): Assume that the Clique problem from Question 1 is in $\mathcal{P}$. Show that then the MaxClique problem is also in $\mathcal{P}$ (i.e. there exists a polynomial time algorithm to solve the MaxClique problem). In other words, you want to argue that the Clique problem from Question 1 is essentially "as hard as" the MaxClique problem.

Hint

Note that since Clique is in $\mathcal{P}$, there exists a polynomial time algorithm $\mathcal{A}$ to solve that problem. Think of how you can call $\mathcal{A}$ multiple times to solve the MaxClique problem.

Submission

Submit part (a) and (b) separately

Make sure you submit the correct PDF to the correct submission link on Autolab. If you do not (e.g. if you submit Q1(a) PDF to Q2(a) or even Q1(b)), then you will lose ALL points.

We recommend that you typeset your solution but we will accept scans of handwritten solution-- you have to make sure that the scan is legible.

PDF only please

If Autolab cannot display your file, (irrespective of the reason) then you will get a zero (0) on the entire question.

Also the file size has to be at most 3MB.

Grading Guidelines

We will follow the usual grading guidelines for non-programming questions. Here is a high level grading rubric specific to part (a) of this problem:

Sub-part 1 Algorithm idea: 6 points.
Sub-part 2 Algorithm idea: 4 points.

and here is the high level grading rubric for part (b):

Algorithm idea: 3 points.
Algorithm details: 3 points.
Proof idea (for correctness): 3 points.
Proof details: 3 points.
Runtime analysis: 3 point.

Note

If you do not have labeled and separated out sub-part 1 and sub-part 2 for part (a) and proof idea, algorithm idea, proof details, algorithm details and runtime analysis for part (b), you will get a zero(0) irrespective of the technical correctness of your solution for the corresponding part.

Templates

Download LaTeX template Download Microsoft Word template

Note

Question 3 (Programming Assignment) [25 points]

Note

This assignment can be solved in either Java, Python or C++ (you should pick the language you are most comfortable with). Please make sure to look at the supporting documentation and files for the language of your choosing.

The Problem

In this problem we will explore the problem of reducing the graph $k$-colorability problem to SAT and then use a SAT solver to solve the $k$-colorability problem.

Direction of reduction matters!

In this problem you are doing a reduction in the following direction. We reduce $k$-colorability to SAT and then use a SAT solver to solve a $k$-colorability problem.

In fact, in practice this is the "practical application" of $\mathcal{NP}$-completeness. You need to solve an $\mathcal{NP}$-complete problem: so you reduce it to a SAT formula and then run a SAT solver. While this cannot always work, this direction seems to work out surprisingly often for real-life problem instances.

The graph $k$-colorability problem is as follows: Given an integer $k$ and an undirected, unweighted graph $G$, can you color $G$ using $k$ colors such that no two adjacent nodes have the same color? You will first generate a CNF formula to represent the constraints of this problem using the data structure explained in the Output section below. This CNF formula will get converted by our grading code to the format accepted by the SAT solver, and finally fed as input to the solver.

The solver outputs a variable assignment from which we can deduce what color a node was assigned if the constraints you create can be satisfied, and UNSAT if not. The solver output format is explained in the Solver output section below. Note that you do not need to concern yourself with the solver output format, but it could be extremely useful if you decide to play around with the solver by feeding it CNF formulas of your own. Further details are provided in the directory structure section of your preferred language below.

What kind of SAT formulas can we generate?

While there are many ways to create a SAT formula to "encode" a $k$-colorability instance (including what variables that can be used), for this problem, we will constrain you to use a very specific set of variables. However you have the full freedom to use whatever clauses you think are relevant.

Also see the support page on SAT solvers for more on SAT solvers (and SAT formulas in general).

Input

The input file is given with the first line as the value of $k$ (the number of available colors) and the remainder of the file is an adjacency list for graph $G$ (we assume that the set of vertices is $\{0,1,\dots,n-1\}$). The adjacency list assumes that the current node is the index of the line under consideration. For instance line 0 of the input file (not including the value of $k$) has the list of all nodes adjacent to node 0.

                k		<- Value of k (No. of colors available)
                u₁ u₄ u₆		<- All nodes that share edges with u₀
            	u₃ u₄		<- All nodes that share edges with u₁
		u₀		<- All nodes that share edges with u₂
		.
		.
		.
		u₀ u₄ u₂ u₇ 	<- All nodes that share edges with u_n-1

For example:

            	2		<- k
	 	1 2		<- Node 0 shares an edge with nodes 1 & 2
		0    		<- Node 1 shares an edge with node 0
		0		<- Node 2 shares an edge with node 0

If we draw the graph using the above input, it will look like this:

The color set

Given an integer $k\ge 1$, the set of colors will be the set $\{1,\dots,k\}$. While this can potentially clash with a node ID, it will always be clear from context whether an integer refers to a node ID or a color. In places where there can be confusion we will clarify by explicitly saying "node $i$" or "color $j$" as appropriate.

Note that in this case, it is indeed possible to color the graph with two colors. Say you color node $0$ with color $1$, color nodes $1$ and $2$ with color $2$. Alternatively, color node $0$ with color $2$ and nodes $1$ and $2$ with color $1$. (Since nodes $1$ and $2$ are not connected by an edge, we can assign the same color to both of them.)

Let us now consider another example:

            	2		<- k
	 	1 2		<- Node 0 shares an edge with nodes 1 & 2
		0 2   		<- Node 1 shares an edge with nodes 0 & 2
		0 1		<- Node 2 shares an edge with nodes 0 & 1

If we draw the graph using the above input, it will look like this:

In this case, it is not possible to color the graph with two colors. Say you assign two different colors to nodes $0$ and $1$, which is necessary since they are connected by an edge, $2$ will have to be assigned a third color since it is connected to both $0$ and $1$. However, we only have two colors at hand. Changing the value of $k$ to $3$ will cause this to become satisfiable, however, since we can then color all the nodes with the three different colors.

Output

Your method is expected to return a list (outer) of tuples (outer) of two lists (inner) of two-element tuples (inner) of int (shudder).
The above list will represent the constraints of the k-colorability problem.

Format of Output

The general format of the output is as follows:

$[clause_1, clause_2, ...., clause_n]$
Each $clause_i$ is itself of the form: $([positive\_literals], [negative\_literals])$.
Lastly, each positive or negative literal is of the form: $(node_i, color_c)$.

For eg. consider

[([(0, 1), (0, 2)], [])]

This set of clauses has only one clause since the outer list only has one tuple. Note that the main/outer list contains as many outer tuples as you have clauses. Within a clause, you have two types of literals, positive and negative and hence an outer tuple will always contain exactly two (inner) lists, one for positive literals and one for negative literals. Each literal is a two element tuple of integers. The first element of the inner tuple is the node, while the second element is the color.

A Notation

Before we move any further, we define $v_{ij}$ for $0\le i <n$ and $1\le j\le k$ to be a variable (or literal) representing a particular a (node, color) tuple. For example, $v_{01}$ represents node 0 and color 1, $v_{12}$ represents node 1 and color 2 and so on.

Some examples

In the example clauses listed above in this paragraph, the CNF formula that this translates to is $v_{01} \vee v_{02}$, implying node 0 should be assigned color 1 or 2. Notice that literals inside the same inner list are ORed together. On the other hand,

[([], [(0, 1), (0, 2)])]

translates to the formula $\neg v_{01} \vee \neg v_{02}$.

A few more examples, just to get you comfortable:

```
[([(0, 1), []), ([], [(0, 2)])]
```
is the same as $\ v_{01} \wedge \neg v_{02}$. Notice that there are two tuples with a single positive and negative literal each. Since the literals are in different clauses, they are ANDed together.
```
[([(0, 1), (1, 1)], []), ([], [(0, 2)])]
```
is the same as $\ (v_{01} \vee v_{11}) \wedge \neg v_{02}$
```
[([(0, 1), (1, 1), (2, 2)], []), ([], [(0, 2)])]
```
is the same as $\ (v_{01} \vee v_{11} \vee v_{22}) \wedge \neg v_{02}$ and so on.

The output you generate above is fed into a SAT solver as explained in the Problem section. Then Autolab will check if the output of the SAT solver in your generated SAT formula correctly determines if the graph is not $k$-colorable and if it is, if the output gives a valid $k$-coloring.

You do NOT have to understand how the SAT solver works

For this problem, you do not need to understand how the SAT solver works and can use it as a black box. Though understanding how the SAT solver works could be useful in debugging your submissions. If interested, read on what the output of a SAT solver would look like. If you are not interested, you could directly skip to the "Hint" below.

Solver Output

The solver output is a list of integers which for this particular problem represent a particular flattened (node, color) pair formatted as your language of choice's array-type data structure (see below for the exact details). Note that nodes are 0-indexed while colors are 1-indexed.

            	[
                    v₀₁,                                         Node 0 is assigned color c₁ if v₀₁ > 0
                    v₀₂,                                         Node 0 is assigned color c₂ if v₀₂ > 0
                    ...                                          ...
                    v_0k,                                         Node 0 is assigned color c_k if v_0k > 0
                    v₁₁,                                         Node 1 is assigned color c₁ if v₁₁ > 0
                    ...                                          ...
                    v_(n-1)k                                       Node (n-1) is assigned color c_k if v_(n-1)k > 0
                ]

For example, using the first example input from above: (every pair represents a node)

                [
                    1, -2,                                      Node 0 is assigned color c₁
                    -3, 4,                                      Node 1 is assigned color c₂
                    -5, 6                                       Node 2 is assigned color c₂
                ]

                [
                    -1, 2,                                      Node 0 is assigned color c₂
                    3, -4,                                      Node 1 is assigned color c₁
                    5, -6                                       Node 2 is assigned color c₁
                ]

Note that the actual output (on Autolab) will look something like:

s SATISFIABLE                           <- Whether the given set of constraints is satisfiable
v -1 2 3 -4 5 -6 0                      <- Variable assignments, the 0 marks the end of the output (it is not a variable/literal).
                                           There will be multiple such lines if there are a lot of clauses.

For the second example input, the output will simply be:

s UNSATISFIABLE

Note

All the graphs used for this problem are connected.

Hint

The best possible algorithm for this problem that we are aware of runs in time $O((m+n)k)$ where $k$ is the number of available colors, and $m$ is the total number of edges. Note that this is the time for only convert the input for $k$-colorability to a SAT formula.

Of course we do not expect to able to solve the SAT formula in $O((m+n)k)$ time since that would show $\mathsf{P}=\mathcal{NP}$!

Note

Both the input and output parsers in each of the three languages are already written for you.
Note that you have to work with the input data structures provided (which will come pre-loaded with the data from an input file).

Addition is the only change you should make

Irrespective of what language you use, you will have to submit just one file. That file will come pre-populated with some stuff in it. You should not change any of those things because if you do you might break what the grader expects and end up with a zero on the question. You should of course add stuff to it (including helper functions and data structures as you see fit).

Download Java Skeleton Code

Directory Structure

                                    ├── src
                                    │   ├── ub
                                    │       ├── cse
                                    │           ├── algo
                                    │               ├── Driver.java
                                    │               ├── Helpers.java
                                    │               ├── Graph.java
                                    │               ├── HW10Utility.java
                                    │               ├── Solution.java
                                    │               ├── Literals.java
                                    │               ├── Pair.java
                                    ├── testcases/
                                    │   ├── input1.txt
                                    │   ├── input2.txt
                                    │   ├── input5.txt
                                    │   ├── output1.txt
                                    │   ├── output2.txt
                                    │   └── output5.txt

You are given seven coding files: Driver.java, Graph.java, HW10Utility.java, Helpers.java, Literals.java, Pair.java and Solution.java. Driver.java takes the input file, parses it with a new instance of HW10Utility and creates an instance of the class Solution and calls the outputClauses() method on it. It then prints the list of clauses in the format specified above.

Using the methods defined in Helpers.java, it converts this format to the one understood by the SAT solver, and writes to a file called formula.txt. You may submit this file to Autolab for the "assessment" Run SAT solver on any valid formula, which itself is under SAT Solver. The solver is run on the clauses you submitted using formula.txt, and the feedback will show the solver's output as explained in the Solver Output section above. This step is useful for debugging purposes and to better understand what the solver is doing (say if you wanted to submit a CNF formula of your choice), but is not necessary for submission.

Note

Autolab will just run the SAT solver on the formula encoded by formula.txt and print its output: we will unfortunately not be able to provide any support/help out if the feedback does not make sense to you.

You only need to update the Solution.java file and submit on Autolab. The Pair.java provides you with all the semantics of a 2-element tuple. Use the constructor to set the first and second elements directly, there are no setters. The Literals.java file is primarily meant to serve as an alias for ArrayList<Pair<Integer, Integer>> for readability. You need to use both these classes in your solution, look at the method stub for an idea. Modifying either Literals.java or Pair.java will likely give you a 0, and so will not using either of them.

You may write your own helper methods and data structures in Solution.java.

The testcases folder has 3 input files and their corresponding output files for your reference. Note that the output files may have clauses in a different order than yours, but as long as they are all there everything's peachy. We will use these three input files (and seven others) in our autograding.

Method you need to write:

                                    public ArrayList<Pair<Literals, Literals>> outputClauses() {
                                        /**
                                         * @return A list of clauses representing the constraints of the k-coloring problem
                                         */
                                        ArrayList<Pair<Literals, Literals>> result = new ArrayList<>();
                                        result.add(new Pair(new Literals(), new Literals())); // Adding an empty clause
                                        return result;
                                    }

The Solution class has 2 instance variables:

k which is the number of colors available to color the graph.
graph which is of type HashMap<Integer, ArrayList<Integer>> where the key is the node id and the value is an ArrayList of adjacent nodes.

The output is a list of clauses, of type ArrayList<Pair<Literals, Literals>>. To reiterate, Literals is a list of, quite literally (pun intended), literals. Each literal represents a specific (node, color) tuple (or Pair<Integer, Integer>).

Compiling and executing from command line:

Assuming you're in the same directory level as src. Run javac src/ub/cse/algo/*.java to compile.

To execute your code on input1.txt, run java -cp "src" ub.cse.algo.Driver testcases/input1.txt. The output array will be printed to stdout.

Submission

You only need to submit Solution.java to Autolab.

Download Python Skeleton Code

Directory Structure

                                    ├── Driver.py
                                    ├── Helpers.py
                                    ├── Graph.py
                                    ├── Solution.py
                                    ├── Utility.py
                                    ├── testcases/
                                    │   ├── input1.txt
                                    │   ├── input2.txt
                                    │   ├── input5.txt
                                    │   ├── output1.txt
                                    │   ├── output2.txt
                                    │   └── output5.txt

You are given five coding files: Driver.py,Graph.py,Solution.py, Helpers.pyand Utility.py. Driver.py takes the input file, parses it with a new instance of Utility and creates an instance of the class Solution and calls the output_clauses() method on it. It then prints the list of clauses in the format specified above.

It uses the methods defined in Helpers.py, it converts this format to the one understood by the SAT solver, and writes to a file called formula.txt. You may submit this file to Autolab for the "assessment" Run SAT solver on any valid formula, which itself is under SAT Solver. The solver is run on the clauses you submitted using formula.txt, and the feedback will show the solver's output as explained in the Solver Output section above. This step is useful for debugging purposes and to better understand what the solver is doing (say if you wanted to submit a CNF formula of your choice), but is not necessary for submission.

Note

You only need to update the Solution.py file and submit it to Autolab. On top of that, you may write your own helper methods and data structures in it.

Method you need to write:

                                def output_clauses(self):
                                    """
                                    :return: A list of clauses representing the constraints of the k-coloring problem
                                    """
                                    clauses = [([], [])]
                                    return clauses

The Solution class has 2 instance variables.

k which is the number of colors available to color the graph.
graph which is a dictionary where the key is a unique node id and the value is a list of nodes that share an edge with the key node

Your method is expected to return a list of tuples of twolists of two-elementtuples of int, which represents the list of clauses you wish to feed to the solver. Look at the output section above for an explanation.

Executing from command line:

Assuming you're in the same directory level as Driver.py and you want to run your code on the input1.txt. Run python Driver.py testcases/input1.txt.

Submission

You only need to submit Solution.py to Autolab.

Download C++ Skeleton Code

Directory Structure

                                    ├── Driver.cpp
                                    ├── HW10Helper.h
                                    ├── Graph.h
                                    ├── HW10Utility.h
                                    ├── Solution.cpp
                                    ├── Utility.h
                                    ├── testcases/
                                    │   ├── input1.txt
                                    │   ├── input2.txt
                                    │   ├── input5.txt
                                    │   ├── output1.txt
                                    │   ├── output2.txt
                                    │   └── output5.txt

You are given six coding files: Driver.cpp,Graph.h,HW10Utility.h, HW10Helper.h, Utility.h and Solution.cpp. Driver.cpp takes the input file, parses it with a new instance of Utility and creates an instance of the class Solution and calls the outputClauses() method on it. It then prints the list of clauses in the format specified above.

Using the methods defined in HW10Helper.h, it converts this format to the one understood by the SAT solver, and writes to a file called formula.txt. You may submit this file to Autolab for the "assessment" Run SAT solver on any valid formula, which itself is under SAT Solver. The solver is run on the clauses you submitted using formula.txt, and the feedback will show the solver's output as explained in the Solver Output section above. This step is useful for debugging purposes and to better understand what the solver is doing (say if you wanted to submit a CNF formula of your choice), but is not necessary for submission.

Note

You only need to update the Solution.cpp file and submit it to Autolab. On top of that, you may write your own helper methods and data structures in the Solution.cpp.

Method you need to write:

                                    vector<pair<literals, literals>> outputClauses() {
                                        /**
                                         * @return A list of clauses representing the constraints of the k-coloring problem
                                         */
                                        vector<pair<literals, literals>> clauses;

                                        pair<literals, literals> empty_clause;
                                        empty_clause.first = {};
                                        empty_clause.second = {};
                                        clauses.push_back(empty_clause);//Adding an empty clause
                                        return clauses;
                                    }

The Solution class has 2 member variables.

k which is the number of colors available to color the graph.
graph which is a map where each key is a unique node id as an int and the value is a corresponding list of adjacent nodes for that key as a vector

The output is a list of clauses, of type vector<pair<literals, literals>>. To reiterate, literals is a list of, quite literally (pun intended), literals. Each literal represents a specific (node, color) tuple (or pair<int, int>).

Compiling and executing from command line:

Assuming you're in the same directory level as Driver.cpp. Run g++ -std=c++11 Driver.cpp to compile.

To execute your code on input1.txt, run ./a.out testcases/input1.txt.

Submission

You only need to submit Solution.cpp to Autolab.

Grading Guidelines

We will follow the usual grading guidelines for programming questions.