{\huge News on Intermediate Problems}

The Minimum Circuit Size Problem goes front and center

\vspace{.5in}

Eric Allender, Bireswar Das, Cody Murray, and Ryan Williams have proved new results about problems in the range between $latex {\mathsf{P}}&fg=000000$ and $latex {\mathsf{NP}}&fg=000000$-complete. According to the wide majority view of complexity the range is vast, but it is populated by scant few natural computational problems. Only Factoring, Discrete Logarithm, Graph Isomorphism (GI), and the Minimum Circuit Size Problem (MCSP) regularly get prominent mention. There are related problems like group isomorphism and others in subjects such as lattice-based cryptosystems. We covered many of them some years back.

Today we are delighted to report recent progress on these problems.

MCSP is the problem: given a string $latex {x}&fg=000000$ of length $latex {n = 2^k}&fg=000000$ and a number $latex {s}&fg=000000$, is there a Boolean circuit $latex {C}&fg=000000$ with $latex {s}&fg=000000$ or fewer wires such that

$latex \displaystyle C(0^k)\cdot C(0^{k-1} 1)\cdot C(0^{k-2}10) \cdots C(1^{k-1} 0)\cdot C(1^k) = x? &fg=000000$

For $latex {x}&fg=000000$ of other lengths $latex {m}&fg=000000$, $latex {2^{k-1} < m < 2^k}&fg=000000$, we catenate the values of $latex {C}&fg=000000$ for the first $latex {m}&fg=000000$ strings in $latex {\{0,1\}^k}&fg=000000$ under the standard order. Since every $latex {k}&fg=000000$-ary Boolean function has circuits of size $latex {O(\frac{2^k}{k}) = O(\frac{n}{\log n})}&fg=000000$, MCSP belongs to $latex {\mathsf{NP}}&fg=000000$ with linear witness size.

Several Soviet mathematicians studied MCSP in the late 1950s and 1960s. Leonid Levin is said to have desired to prove it $latex {\mathsf{NP}}&fg=000000$-complete before publishing his work on $latex {\mathsf{NP}}&fg=000000$-completeness. MSCP seemed to stand aloof until Valentine Kabanets and Jin-Yi Cai connected it to Factoring and Discrete Log via the ``Natural Proofs'' theory of Alexander Razborov and Steven Rudich. Eric and Harry Buhrman and Michal Kouck\'y and Dieter van Melkebeek and Detlef Ronneburger improved their results in a 2006 paper to read:

Theorem 1 Discrete Log is in $latex {\mathsf{BPP}^{\mathrm{MCSP}}}&fg=000000$ and Factoring is in $latex {\mathsf{ZPP}^{\mathrm{MCSP}}}&fg=000000$.

Now Eric and Bireswar have completed the triad of relations to the other intermediate problems:

Theorem 2 Graph Isomorphism is in $latex {\mathsf{RP}^{\mathrm{MCSP}}}&fg=000000$. Moreover, every promise problem in $latex {\mathsf{SZK}}&fg=000000$ belongs to $latex {\mathsf{BPP}^{\mathrm{MCSP}}}&fg=000000$ as defined for promise problems.

Cody and Ryan show on the other hand that proving $latex {\mathsf{NP}}&fg=000000$-hardness of MCSP under various reductions would entail proving breakthrough lower bounds:

Theorem 3

If $latex {\mathrm{SAT} \leq_m^p \mathrm{MCSP}}&fg=000000$ then $latex {\mathsf{EXP} \not\subseteq \mathsf{NP} \cap \mathsf{P/poly}}&fg=000000$, so $latex {\mathsf{EXP \neq ZPP}}&fg=000000$.
If $latex {\mathrm{SAT} \leq_m^{\log} \mathrm{MCSP}}&fg=000000$ then $latex {\mathsf{PSPACE \neq ZPP}}&fg=000000$.
If $latex {\mathrm{SAT} \leq_m^{ac_0} \mathrm{MCSP}}&fg=000000$ then $latex {\mathsf{NP} \not\subset \mathsf{P/poly}}&fg=000000$ (so $latex {\mathsf{NP \neq P}}&fg=000000$), and also $latex {\mathsf{E}}&fg=000000$ has circuit lower bounds high enough to de-randomize $latex {\mathsf{BPP}}&fg=000000$.
In any many-one reduction $latex {f}&fg=000000$ from $latex {\mathrm{Parity}}&fg=000000$ (let alone $latex {\mathrm{SAT}}&fg=000000$) to $latex {\mathrm{MCSP}}&fg=000000$, no random-access machine can compute any desired bit $latex {j}&fg=000000$ of $latex {f(x)}&fg=000000$ in $latex {|x|^{1/2-\epsilon}}&fg=000000$ time.

The last result is significant because it is unconditional, and because most familiar $latex {\mathsf{NP}}&fg=000000$-completeness reductions $latex {f}&fg=000000$ are local in the sense that one can compute any desired bit $latex {j}&fg=000000$ of $latex {f(x)}&fg=000000$ in only $latex {(\log |x|)^{O(1)}}&fg=000000$ time (with random access to x).

Why MCSP is Hard to Harden

The genius of MCSP is that it connects two levels of scaling---input lengths $latex {k}&fg=000000$ and $latex {n}&fg=000000$---in the briefest way. The circuits $latex {C}&fg=000000$ can have exponential size from the standpoint of $latex {k}&fg=000000$. This interplay of scaling is basic to the theory of pseudorandom generators, most obviously to conditions under which they can stretch a seed of $latex {\mathsf{poly}(k)}&fg=000000$ bits into $latex {n}&fg=000000$ bits,and to generators of pseudorandom functions $latex {g: \{0,1\}^k \longrightarrow \{0,1\}^k}&fg=000000$.

An issue articulated especially by Cody and Ryan is that reductions $latex {f}&fg=000000$ to $latex {\mathrm{MCSP}}&fg=000000$ carry seeds of being self-defeating. The ones we know best how to design involve ``gadgets'' whose size scales as $latex {k}&fg=000000$ not $latex {n}&fg=000000$. For instance, in a reduction from $latex {\mathrm{3SAT}}&fg=000000$ we tend to design gadgets for individual clauses in the given 3CNF formula $latex {\phi}&fg=000000$---each of which has constant-many variables and $latex {O(\log n) = O(k)}&fg=000000$ encoded size. But if $latex {f}&fg=000000$ involves only $latex {\mathsf{poly}(k)}&fg=000000$-sized gadgets and the connections between gadgets need only $latex {\mathsf{poly}(k)}&fg=000000$ lookup, then when the reduction outputs $latex {f(\phi) = (y,s)}&fg=000000$, the string $latex {y}&fg=000000$ will be the graph of a $latex {\mathsf{poly}(k)}&fg=000000$-sized circuit. This means that:

if $latex {s > \mathsf{poly}(k)}&fg=000000$ then the answer is trivially ``yes'';
if $latex {s \leq \mathsf{poly}(k)}&fg=000000$ then the answer can be found in $latex {\mathsf{poly}(n)}&fg=000000$ time---or at worst quasipolynomial in $latex {n}&fg=000000$ time---by exhaustively trying all circuits of size $latex {s}&fg=000000$.

The two horns of this dilemma leave little room to make a non-trivial reduction to $latex {\mathrm{MCSP}}&fg=000000$. Log-space and $latex {\mathsf{AC^0}}&fg=000000$ reductions are (to different degrees) unable to avoid the problem. The kind of reduction that could avoid it might involve, say, $latex {n^{1/2}}&fg=000000$-many clauses per gadget in an indivisible manner. But doing this would seem to require obtaining substantial non-local knowledge about $latex {\phi}&fg=000000$ in the first place.

Stronger still, if the reduction is from a polynomially sparse language $latex {A \in \mathsf{NP}}&fg=000000$ in place of $latex {\mathrm{SAT}}&fg=000000$, then even this last option becomes unavailable. Certain relations among exponential-time classes imply the existence of hard sparse sets in $latex {\mathsf{NP}}&fg=000000$. The hypothesis that $latex {\mathrm{MCSP}}&fg=000000$ is hard for these sets impacts these relations, for instance yielding the $latex {\mathsf{EXP \neq ZPP}}&fg=000000$ conclusion.

A paradox that at first sight seems stranger emerges when the circuits $latex {C}&fg=000000$ are allowed oracle gates. Such gates may have any arity $latex {m}&fg=000000$ and output 1 if and only if the string $latex {u_1 u_2\cdots u_m}&fg=000000$ formed by the inputs belongs to the associated oracle set $latex {A}&fg=000000$. For any $latex {A}&fg=000000$ we can define $latex {\mathrm{MCSP}^A}&fg=000000$ to be the minimum size problem for such circuits relative to $latex {A}&fg=000000$. It might seem axiomatic that when $latex {A}&fg=000000$ is a powerful oracle such as $latex {\mathrm{QBF}}&fg=000000$ then $latex {\mathrm{MCSP}^{\mathrm{QBF}}}&fg=000000$ should likewise be $latex {\mathsf{PSPACE}}&fg=000000$-complete. However, giving $latex {C}&fg=000000$ such an oracle makes it easier to have small circuits for meaningful problems. This compresses the above dilemma even more. In a companion paper by Eric with Kabanets and Dhiraj Holden they show that $latex {\mathrm{MCSP}^{\mathrm{QBF}}}&fg=000000$ is not complete under logspace reductions, nor even hard for $latex {\mathsf{TC}^0}&fg=000000$ under uniform $latex {\mathsf{AC}^0}&fg=000000$ reductions. More strikingly, they show that if it is hard for $latex {\mathsf{P}}&fg=000000$ under logspace reductions, then $latex {\mathsf{EXP = PSPACE}}&fg=000000$.

Nevertheless, when it comes to various flavors of bounded-error randomized Turing reductions, $latex {\mathrm{MCSP}}&fg=000000$ packs enough hardness to solve Factoring and Discrete Log and GI. We say some more about how this works.

Randomized Reductions to MCSP

What $latex {\mathrm{MCSP}}&fg=000000$ does well is efficiently distinguish strings $latex {x \in \{0,1\}^n}&fg=000000$ having $latex {n^{\alpha}}&fg=000000$-sized circuits from the vast majority having no $latex {n^{\beta}}&fg=000000$-sized circuits, where $latex {0 < \alpha < \beta < 1}&fg=000000$. The dense latter set $latex {B}&fg=000000$ is a good distinguisher between pseudorandom and uniform distributions on $latex {x \in \{0,1\}^n}&fg=000000$. Since one-way functions suffice to construct pseudorandom generators, $latex {\mathrm{MCSP}}&fg=000000$ turns into an oracle for inverting functions to an extent codified in Eric's 2006 joint paper:

Theorem 4 Let $latex {B}&fg=000000$ be a dense language of strings having no $latex {n^{\beta}}&fg=000000$-sized circuits, and let $latex {f(x,y) = z}&fg=000000$ be computable in polynomial time with $latex {x,y,z}&fg=000000$ of polynomially-related lengths. Then we can find a polynomial-time probabilistic oracle TM $latex {M}&fg=000000$ and $latex {c > 0}&fg=000000$ such that for all $latex {n}&fg=000000$ and $latex {y}&fg=000000$,
$latex \displaystyle \Pr_{x,r}[M^B(y,f(x,y),r) = w \text{ such that } f(w,y) = f(x,y)] \geq \frac{1}{n^c}. &fg=000000$

Here $latex {x}&fg=000000$ is selected uniformly from $latex {\{0,1\}^n}&fg=000000$ and $latex {r}&fg=000000$ is uniform over the random bits of the machine. We have restricted $latex {f}&fg=000000$ and $latex {B}&fg=000000$ more than their result requires for ease of discussion.

To attack GI we set things up so that ``$latex {x}&fg=000000$'' and ``$latex {y}&fg=000000$'' represent a graph $latex {G}&fg=000000$ and a permutation $latex {\pi}&fg=000000$ of its vertices, respectively. More precisely ``$latex {G}&fg=000000$'' means a particular adjacency matrix, and we define $latex {f(\pi,G) = G'}&fg=000000$ to mean the adjacency matrix $latex {G'}&fg=000000$ obtained by permuting $latex {G}&fg=000000$ according to $latex {\pi}&fg=000000$. By Theorem~4, using the $latex {\mathrm{MCSP}}&fg=000000$ oracle to supply $latex {B}&fg=000000$, one obtains $latex {M}&fg=000000$ and $latex {c}&fg=000000$ such that for all $latex {n}&fg=000000$ and $latex {n}&fg=000000$-vertex graphs $latex {G}&fg=000000$,

$latex \displaystyle \Pr_{\pi,r}[M^{\mathrm{MCSP}}(G,f(\pi,G),r) = \rho \text{ such that } f(\rho,G) = f(\pi,G)] \geq \frac{1}{n^c}. &fg=000000$

Since $latex {f}&fg=000000$ is 1-to-1 we can simplify this while also tying ``$latex {G'}&fg=000000$'' symbolically to $latex {f(\pi,G)}&fg=000000$:

$latex \displaystyle \Pr_{\pi,r}[M^{\mathrm{MCSP}}(G,G',r) = \pi] \geq \frac{1}{n^c}. \ \ \ \ \ (1)&fg=000000$

Now given an instance $latex {(G,H)}&fg=000000$ of GI via adjacency matrices, do the following for some constant times $latex {n^c}&fg=000000$ independent trials:

Pick $latex {\pi}&fg=000000$ and $latex {r}&fg=000000$ uniformly at random and put $latex {G' = f(\pi,G)}&fg=000000$.
Run $latex {M^{\mathrm{MCSP}}(H,G',r)}&fg=000000$ to obtain a permutation $latex {\rho}&fg=000000$.
Accept if $latex {\rho(H) = G'}&fg=000000$, which means $latex {H = \rho^{-1}\pi(G)}&fg=000000$.

This algorithm has one-sided error since it will never accept if $latex {G}&fg=000000$ and $latex {H}&fg=000000$ are not isomorphic. If they are isomorphic, then $latex {G'}&fg=000000$ arises as $latex {\rho(H)}&fg=000000$ with the same distribution over permutations that it arises as $latex {G' = \pi(G)}&fg=000000$, so Equation (1) applies equally well with $latex {H}&fg=000000$ in place of $latex {G}&fg=000000$. Hence $latex {M^{\mathrm{MCSP}}(H,G',r)}&fg=000000$ finds the correct $latex {\rho}&fg=000000$ with probability at least $latex {\frac{1}{n^c}}&fg=000000$ on each trial, yielding the theorem $latex {\mathrm{GI} \in \mathsf{RP}^{\mathrm{MCSP}}}&fg=000000$.

The proof for $latex {\mathsf{SZK \subseteq BPP}^{\mathrm{MCSP}}}&fg=000000$ is more detailed but similar in using the above idea. There are many further results in the paper by Cody and Ryan and in the oracle-circuit paper.

Open Problems

These papers also leave a lot of open problems. Perhaps more importantly, they attest that these open problems are attackable. Can any kind of many-one reducibility stricter than $latex {\leq_m^p}&fg=000000$ reduce every language in $latex {\mathsf{P}}&fg=000000$ to $latex {\mathrm{MCSP}}&fg=000000$? Can we simply get $latex {\mathsf{EXP} \not\subset \mathsf{P/poly}}&fg=000000$ from the assumption $latex {\mathrm{SAT} \leq_m^p \mathrm{MCSP}}&fg=000000$? The most interesting holistic aspect is that we know new lower bounds follow if $latex {\mathrm{MCSP}}&fg=000000$ is easy, and now we know that new lower bounds follow if $latex {\mathrm{MCSP}}&fg=000000$ is hard. If we assume that $latex {\mathrm{MCSP}}&fg=000000$ stays intermediate, can we prove lower bounds that combine with the others to yield some non-trivial unconditional result?