Does Program Size Matter?

A problem about this forgotten complexity measure

Erik Winfree is one of the leaders in DNA computation, especially the theory of self-assembly using DNA. He won a MacArthur Fellowship in 2000, among many other awards for his terrific work. With Paul Rothemund he wrote a STOC 2000 paper titled ``The Program-Size Complexity of Self-Assembled Squares.''

Today Ken and I want to talk about program size: about how many instructions are needed to compute something.

A DNA self-assembly system consists of a set of tile types, rules of assembly, and theorems of the form: the number of tile types needed to create $latex {X}&fg=000000$ is at most $latex {\dots}&fg=000000$ The rationale for reducing tile types is simple: making duplicates of a tile is like printing, which is cheap; but making many types of tiles is expensive. Each tile type requires a separate printing. So there is a natural imperative to reduce the number of tile types.

Hence a major goal of their research, using their terminology, is to reduce the ``program complexity.'' Thus there are parts of computer science theory where program size is important, but this area uses a non-standard notion that may say nothing about programs for conventional computers, which are the kind we want to study.

If you are interested, consider going to the next DNA Conference, which was started by one of us, Dick, years ago. It is now in its twenty-first edition.

Let's look at the program size of machines, not tile systems, since it came up recently in some thoughts Ken and I had on $latex {\mathsf{P=NP}}&fg=000000$. We do think about that question sometimes$latex {\dots}&fg=000000$

Size of Code, Old and New

In the real world the size of a program is probably much more important than the time it takes or the amount of memory it requires. The size of code directly correlates with the cost of writing and maintaining the program, with its number of bugs, and with its security issues. Big programs are usually expensive in all ways. If we are creating the next great app that will make us all a fortune, we want the program to be a manageable size.

Yet complexity theory---for the most part---is focused almost entirely on the time a program takes or the space it uses. Not on its size. Hence we study time classes like $latex {\mathsf{DTIME}(n^{3})}&fg=000000$ and space classes like $latex {\mathsf{L}}&fg=000000$, logspace. And we prove results such as the hierarchy theorems. But few theorems say anything about the size of a program. In one of the first posts on the blog, I posed this challenge:

Prove there is no algorithm for $latex {\mathsf{SAT}}&fg=000000$ that runs in polynomial-time and can be written down on a single piece of paper. You can make the paper 8.5 by 11, if you wish. The writing has to be normal size. No tricks.

I went on to say that although ``conventional wisdom'' is confident in $latex {\mathsf{P \neq NP}}&fg=000000$, so that no such program exists, I still find it shocking that even this one-page challenge appears to be hopeless. Well, perhaps we can prove it equivalent after all, in the course of raising two questions. The first is about program size overall, and the second is about the size of ``new code'' that needs to be written to solve a new problem, versus using already-written routines. To further our real-world analogy, the latter have presumably already been debugged, and no one has to pay for writing them again.

Let's look at the main obstacle and then the two questions.

An Obstacle?

Here is the source of the questions we have recently been thinking about. Imagine, try to imagine if you can, that $latex {\mathsf{P=NP}}&fg=000000$. This implies that every nondeterministic Turing machine (NTM) that runs in linear time can be simulated by a deterministic TM (DTM) that runs in time $latex {n^{c}}&fg=000000$ for some constant $latex {c}&fg=000000$. The constant is absolute, but it could be small, large, or astronomical. Thus one reason that $latex {\mathsf{P=NP}}&fg=000000$ could still imply nothing about practical computing is that $latex {c}&fg=000000$ could be the obstacle. Even a modest value like $latex {c=5}&fg=000000$ would render many applications useless.

This is nothing new---we have pondered before that $latex {c}&fg=000000$ could be huge. We have also postulated that one reason no one has found such a deterministic algorithm is that most tools in algorithm design are not set up to create algorithms with running times of the form

$latex \displaystyle n^{2^{2^{2^{100}}}}. &fg=000000$

Dynamic programming, divide-and-conquer, clever data structures, random sampling, linear programming, semi-definite programming, and others do not yield such galactic running times.

Ken and I wonder if the obstacle could be not that $latex {c}&fg=000000$ is large, but that the program size hidden away in the statement

$latex \displaystyle \mathsf{NTIME}(n) \subset \mathsf{DTIME}(n^{c}) &fg=000000$

has a blowup. Suppose that $latex {c=1.1}&fg=000000$. That would sound great---too good to be true. But it could still be useless. What if, given an NTM program $latex {N}&fg=000000$ that runs in linear time, the corresponding DTM program $latex {M}&fg=000000$ that runs in time $latex {n^{c}}&fg=000000$ is itself huge? The fact that the program is deterministic and runs fast is really of no value if $latex {M}&fg=000000$ is exponentially larger than $latex {N}&fg=000000$: even though the running time is small, such a program is too big to ever use.

This is the obstacle that we are interested in. Our response will involve universal simulators, which can be small, even tiny. We will first wrap one inside a famous old construction by Leonid Levin, and then write a little more ``new code.'' But first, what questions might a complexity theorist ask about program size?

Two Simple Questions

For both questions, our general task will be taking a nondeterministic program $latex {N}&fg=000000$ of size $latex {s}&fg=000000$ that runs in some nondeterministic time $latex {t(n)}&fg=000000$, and simulating it by a deterministic program of size $latex {s'}&fg=000000$ in time $latex {t'(n)}&fg=000000$ that is at most polynomially slower: $latex {t'(n) = O(t(n)^c)}&fg=000000$. It's enough to suppose $latex {t(n)}&fg=000000$ is linear, so the goal is $latex {t'(n) = O(n^c)}&fg=000000$. We won't care if $latex {c}&fg=000000$ is huge, so long as it is fixed and $latex {s'}&fg=000000$ avoids being huge. Our questions are:

  1. Could it be that any polynomial-time deterministic program must be exponentially bigger, that is $latex {s' = \exp(s)}&fg=000000$?
  2. Can we arrange that $latex {s' = O(s)}&fg=000000$, or even $latex {s' = s + O(1)}&fg=000000$, where the constant in the $latex {O}&fg=000000$ is not only independent of $latex {N}&fg=000000$ (of course), but also small in an absolute sense?

Of course if $latex {\mathsf{P \neq NP}}&fg=000000$ these questions are generally moot. But if $latex {\mathsf{P=NP}}&fg=000000$, could the reason for the equality not being practical be that the program blows up in size, not that the running time blows up? Put another way, could we have $latex {\mathsf{P=NP}}&fg=000000$ but $latex {\mathsf{SAT}}&fg=000000$ requires a program so large that it cannot ever be used? Note, this says nothing about the running time of the $latex {\mathsf{SAT}}&fg=000000$ algorithm.

The short answer to the first question is no. This is because of reducibility and completeness for $latex {\mathsf{NP}}&fg=000000$. The code for the reduction from the language of $latex {N}&fg=000000$ to $latex {\mathsf{SAT}}&fg=000000$ is linear in the size $latex {s}&fg=000000$ of $latex {N}&fg=000000$. Call this code $latex {R}&fg=000000$. By $latex {\mathsf{P=NP}}&fg=000000$ there is a fixed deterministic polynomial-time program $latex {V}&fg=000000$ for $latex {\mathsf{SAT}}&fg=000000$. The program that takes an input $latex {x}&fg=000000$ and outputs $latex {V(R(x))}&fg=000000$ runs in poly-time and has size $latex {s' = O(s)}&fg=000000$. Moreover $latex {R}&fg=000000$ itself can be viewed as a fixed piece of code that takes the code of $latex {N}&fg=000000$ as a parameter, so the size $latex {s'}&fg=000000$ can be regarded as $latex {s + O(1)}&fg=000000$ or even just $latex {O(1)}&fg=000000$ in terms of ``new code.''
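To make the size accounting concrete, here is a minimal Python sketch of that composition, under our stated assumptions. The names `cook_levin_reduce` and `sat_solver` are hypothetical stand-ins of ours: the former is the fixed, known reduction code $latex {R}&fg=000000$, taking the description of $latex {N}&fg=000000$ as a parameter, and the latter is the poly-time $latex {\mathsf{SAT}}&fg=000000$ decider $latex {V}&fg=000000$ that $latex {\mathsf{P=NP}}&fg=000000$ would grant but whose size we do not know. The stubs are left unimplemented; the point is only where the $latex {s}&fg=000000$ bytes of $latex {N}&fg=000000$ and the $latex {O(1)}&fg=000000$ of ``new code'' live.

    # Hypothetical sketch only: neither stub is real code we possess.

    def cook_levin_reduce(ntm_code, x):
        """The fixed reduction R: build a formula phi that is satisfiable
        iff the linear-time NTM described by ntm_code accepts x.
        Known, small, and independent of which N we are simulating."""
        raise NotImplementedError("standard Cook-Levin-style construction")

    def sat_solver(phi):
        """The assumed poly-time SAT decider V.  It exists if P = NP,
        but its size (and its exponent c) is unknown and could be huge."""
        raise NotImplementedError("granted only by the assumption P = NP")

    def deterministic_simulator(ntm_code, x):
        # The only "new code": reduce, then decide.  Its size is
        # s (for ntm_code) plus a small constant -- but it hides V.
        return sat_solver(cook_levin_reduce(ntm_code, x))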

However, the ``$latex {O}&fg=000000$'' here includes the unknown size of the possibly-huge program $latex {V}&fg=000000$. Hence this doesn't fully answer the second question. That's our point: can we do better?

The Idea

The basic idea, Leonid Levin's ``universal search,'' is well known, and we covered one aspect of it here. We have not seen it addressed regarding program size, however.

Levin's universal machine $latex {U}&fg=000000$ accepts a Boolean formula $latex {\phi}&fg=000000$ only when it has found a satisfying assignment. Call this ``verified acceptance.'' It means that $latex {U}&fg=000000$ can possibly err only by rejecting a $latex {\phi}&fg=000000$ that is satisfiable after all. If $latex {\mathsf{P = NP}}&fg=000000$ is true, then Levin's code---which is known and small---will run in polynomial time, and accept $latex {\mathsf{SAT}}&fg=000000$ except for finitely many errors.

What $latex {U}&fg=000000$ does on a formula $latex {\phi}&fg=000000$ of size $latex {m}&fg=000000$ is first spend $latex {m}&fg=000000$ steps trying to prove that the first yea-many small polynomial-time programs don't recognize $latex {\mathsf{SAT}}&fg=000000$ with verified acceptance. Let $latex {P_k}&fg=000000$ be the least program that does not fail this test. $latex {U}&fg=000000$ then runs $latex {P_k(\phi)}&fg=000000$. If it gives a satisfying assignment, $latex {U}&fg=000000$ accepts $latex {\phi}&fg=000000$; else $latex {U}&fg=000000$ rejects. Then the code $latex {U(R(x))}&fg=000000$ has size $latex {s' = O(s)}&fg=000000$ with $latex {O(1)}&fg=000000$ ``new code'' where the constant in the $latex {O}&fg=000000$ is small and known, not huge.
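Here is a small, runnable Python sketch of this ``crossing-off'' behavior, using a toy CNF encoding of our own (clauses as lists of signed variable indices). The candidate programs are just a short illustrative list of Python callables that we supply; in the real construction they would be all programs enumerated in order of length and run under a fixed universal interpreter, and the budget would be the $latex {m}&fg=000000$ steps mentioned above.

    from itertools import product

    def satisfies(phi, assignment):
        """Verified acceptance: check that assignment satisfies CNF phi."""
        return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
                   for clause in phi)

    def brute_force(phi, num_vars):
        """Exhaustive search, used here only on tiny test formulas."""
        for bits in product([False, True], repeat=num_vars):
            a = dict(enumerate(bits, start=1))
            if satisfies(phi, a):
                return a
        return None

    def tiny_tests(limit):
        """Stand-in for enumerating the first few small CNFs (two variables)."""
        tests = [([[1], [2]], 2), ([[1], [-1]], 2),
                 ([[1, -2], [-1, 2]], 2), ([[-1], [-2]], 2)]
        return tests[:limit]

    def universal_search(phi, num_vars, candidates, budget):
        """Cross off candidates caught erring on tiny instances within the
        budget, then trust the first survivor P_k on the real input."""
        survivors = []
        for P in candidates:
            caught = False
            for psi, n in tiny_tests(budget):
                truth = brute_force(psi, n) is not None
                guess = P(psi, n)          # a candidate returns an assignment or None
                caught = ((guess is None and truth) or
                          (guess is not None and not satisfies(psi, guess)))
                if caught:
                    break                  # P made a verifiable mistake
            if not caught:
                survivors.append(P)
        if not survivors:
            return None                    # reject; may be wrong on small inputs
        guess = survivors[0](phi, num_vars)   # run the trusted P_k on the real formula
        return guess if guess is not None and satisfies(phi, guess) else None

For instance, a candidate that always rejects gets crossed off by the satisfiable test formulas, while one that is actually correct survives and is trusted on $latex {\phi}&fg=000000$; since the final answer is checked before being returned, the sketch can only err by rejecting.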

If $latex {\mathsf{P=NP}}&fg=000000$, then some $latex {P_k}&fg=000000$ runs in polynomial time $latex {m^c}&fg=000000$, and (by the self-reducibility of $latex {\mathsf{SAT}}&fg=000000$) recognizes $latex {\mathsf{SAT}}&fg=000000$ with verified acceptance. Let $latex {k}&fg=000000$ be least for this. Then each of the machines $latex {P_1,\dots,P_{k-1}}&fg=000000$ makes some mistake, and some finite time $latex {n_0}&fg=000000$ suffices for $latex {U}&fg=000000$ to cross off each one. Hence whenever $latex {m \geq n_0}&fg=000000$, $latex {U(\phi)}&fg=000000$ simulates $latex {P_k(\phi)}&fg=000000$ and gives the same answer in basically the same time, which becomes the overall polynomial runtime of $latex {U}&fg=000000$. This carries over to runtime $latex {O(n^c)}&fg=000000$ for our code $latex {U(R(x))}&fg=000000$ on $latex {x}&fg=000000$ of length $latex {n \geq n_0}&fg=000000$.

The issue is, what happens when $latex {|x| < n_0}&fg=000000$? Our code may err. We could ``patch'' it by adjoining a lookup table for the finitely many errors. But this table becomes part of the program size $latex {s'}&fg=000000$ and might be huge. Moreover a given table could be buggy---remember that the famous Pentium bug of 20 years ago came from a few bad lookup-table entries.

Patchless Assembly?

Levin originally stated his $latex {U}&fg=000000$ for factoring rather than $latex {\mathsf{SAT}}&fg=000000$. The point is that the language

$latex \displaystyle \mathsf{Fact} = \{(x,w): w \text{ is a prefix of the unique prime factorization of } x\} &fg=000000$

belongs to $latex {\mathsf{NP \cap co}\text{-}\mathsf{NP}}&fg=000000$, and so can be recognized with verified rejection as well as verified acceptance. Then with reference to the above description of $latex {U}&fg=000000$ in general, $latex {U}&fg=000000$ rejects only if $latex {P_k}&fg=000000$ outputs a verified rejection. If $latex {P_k}&fg=000000$ gives no verified answer, then $latex {U}&fg=000000$ takes as much time as it needs to get the correct answer. This $latex {U}&fg=000000$ still settles down to polynomial time for $latex {n \geq n_0}&fg=000000$.
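To see why both answers are certifiable, note that the complete prime factorization of $latex {x}&fg=000000$ certifies membership and non-membership alike. Here is a toy Python check of ours; trial division stands in for a real primality test, which also runs in polynomial time.

    def is_prime(p):
        """Toy trial-division primality check (a real poly-time test exists)."""
        if p < 2:
            return False
        d = 2
        while d * d <= p:
            if p % d == 0:
                return False
            d += 1
        return True

    def verified_answer(x, w, factors):
        """Given the claimed complete factorization of x (nondecreasing primes),
        return a certified True/False answer to "(x, w) in Fact", or None
        if the certificate itself is bad."""
        ok = (sorted(factors) == list(factors)
              and all(is_prime(p) for p in factors))
        prod = 1
        for p in factors:
            prod *= p
        if not ok or prod != x:
            return None                    # not a valid certificate
        return list(w) == list(factors)[:len(w)]   # is w a prefix?

    # Example: 60 = 2 * 2 * 3 * 5, so (60, [2, 2, 3]) is in Fact
    # and (60, [2, 3]) is certifiably not.
    print(verified_answer(60, [2, 2, 3], [2, 2, 3, 5]))   # True
    print(verified_answer(60, [2, 3], [2, 2, 3, 5]))      # False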

Thus it is not possible to have factoring in $latex {\mathsf{P}}&fg=000000$ while all polynomial-time programs for it are huge. To be sure, the running time can be huge---indeed $latex {P_k}&fg=000000$ itself may be huge and its size gets reflected in concrete running times---but the code for $latex {U}&fg=000000$ itself is small. And it is all known code that can be debugged in advance.

We could try to do the same for $latex {\mathsf{SAT}}&fg=000000$ by having $latex {U}&fg=000000$ solve $latex {\phi}&fg=000000$ exhaustively if $latex {P_k}&fg=000000$ fails to find a satisfying assignment. The problem then becomes not knowing when to stop and ``trust'' $latex {P_k}&fg=000000$ once the first good one has in fact been found. The nub is whether we can use the assumption of $latex {\mathsf{P = NP}}&fg=000000$ to ``bootstrap'' the verification of succinct proofs of unsatisfiability. The mere existence of succinct proofs of unsatisfiability that are verifiable without the assumption, however, would imply $latex {\mathsf{NP = co}\text{-}\mathsf{NP}}&fg=000000$.
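In code, the attempted fix is a small change to the sketch above, reusing its names: fall back to exhaustive search whenever the trusted survivor fails to produce an assignment. The answers are now always correct, but on unsatisfiable formulas we always pay exponential time, precisely because we cannot verify $latex {P_k}&fg=000000$'s rejections succinctly.

    def universal_search_patchless(phi, num_vars, candidates, budget):
        """Like universal_search above, but never errs: if the trusted
        candidate finds nothing, solve phi exhaustively instead."""
        guess = universal_search(phi, num_vars, candidates, budget)
        if guess is not None:
            return guess
        # The catch: every unsatisfiable phi (and every wrongful rejection)
        # lands here, so without succinct, verifiable proofs of
        # unsatisfiability the running time is no longer polynomial.
        return brute_force(phi, num_vars)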

Can we make the short code simulating $latex {N}&fg=000000$ perfect, without patches? We like the metaphor of $latex {U}&fg=000000$ as interacting and adapting in an environment of short programs, which it ``assembles'' into its own code before finding the one that works best.

We are not bothered about the conversion time $latex {n_0}&fg=000000$ being exponential in the size of the $latex {P_k}&fg=000000$ that is ultimately selected. After all, compared to the short timespan of recorded human history, the time it has taken us to evolve feels ``exponential.''

Open Problems

Can we write code today, on one sheet of paper, that accepts $latex {\mathsf{SAT}}&fg=000000$ exactly and runs in deterministic polynomial time if $latex {\mathsf{P = NP}}&fg=000000$? Incidentally, Ken has fit much of a 3-tape Turing machine that does universal simulation of assembly programs onto one sheet for Buffalo's theory courses.