{\huge No-Go Theorems}
How far does no-go go?
\vspace{.5in}
Simon Kochen is the Henry Burchard Fine Professor of Mathematics at Princeton University. He wrote a PhD thesis on ultraproducts at Princeton in 1959 under Alonzo Church, and supervised the 1985 thesis of Sam Buss on bounded arithmetic and complexity theory. He is also listed as faculty in Philosophy at Princeton, but may be best known for the impact of his work in physics. With the late Ernst Specker, he proved a theorem implying that quantum mechanics cannot be modeled under certain natural assumptions.
Today Ken and I want to discuss no-go theorems: results showing that one cannot prove something via a certain style of argument.
The term ``no-go'' is actually used more by physicists than by mathematicians: a no-go theorem refers in their world to a physical state or action that is impossible. For example, thermodynamics demonstrates that perpetual-motion machines as historically understood are no-go. Relativity rules out faster-than-light communication, and some other theorems have been inspired by this.
Although no-go theorems have negative implications, often they flow from a positive result, a construction. This may have implications beyond the original motivation. The new constructed objects may promote advances, while the no-go aspect provides guidance toward techniques that are not ruled out---that are capable of making progress toward the goal that the no-go theorem guards. Crucial in all this is determining exactly how far the barrier erected by a new no-go theorem extends: what presumptions it makes, and what workarounds may be available.
Something exactly like this situation is going on now in combinatorial optimization. I am quite confused about it, hence it seemed like a perfect idea for Ken and me to try to work it out. Let's start first by reviewing the whole idea of no-go theorems. In a Part 2 of this post we will turn to the specific case that is being discussed.
Kochen-Specker and SAT
The Kochen-Specker theorem in one form can be boiled down to this fact:
There exist 18 vectors in $latex {\mathbb{R}^4}&fg=000000$ from which one can form 9 orthogonal bases of $latex {\mathbb{R}^4}&fg=000000$, with each vector appearing in exactly two of the bases.
Our friends at Wikipedia give the construction; here we focus on the implications. The no-go conclusion itself comes from this simple fact:
One cannot assign a value 0 or 1 to each of the 18 vectors so that every basis contains exactly one 1. Counting basis by basis, such an assignment would contain 9 ones in total, an odd number; but since each vector appears in exactly two bases, the same count equals twice the number of vectors assigned 1, an even number.
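As a sanity check on the parity count, here is a small Python sketch. It does not copy the 18 actual vectors (Wikipedia has them); instead `BASES` below is an abstract stand-in with the one property the argument uses: 18 indices, 9 four-element bases, and each index lying in exactly two bases. A brute-force loop over all $latex {2^{18}}&fg=000000$ assignments (a few seconds) confirms that none gives exactly one 1 per basis.

```python
from itertools import product

# Abstract stand-in for the Kochen-Specker incidence structure: 9 "bases,"
# each a set of 4 "vector" indices, with every index in exactly two bases.
# (Only this counting property matters for the parity argument; the true
# 18-vector construction has its own specific incidence pattern.)
BASES = [[2 * i, 2 * i + 1, 2 * ((i - 1) % 9), 2 * ((i - 2) % 9) + 1]
         for i in range(9)]
assert all(sum(v in B for B in BASES) == 2 for v in range(18))

# Brute force: is there a 0/1 assignment giving exactly one 1 per basis?
found = any(all(sum(bits[v] for v in B) == 1 for B in BASES)
            for bits in product((0, 1), repeat=18))
print("satisfying assignment exists:", found)   # prints False
```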
If certain local/non-contextual hidden-variable theories of quantum mechanics were true, then Nature would act in a way that entails the existence of just such an impossible assignment. Along with the better-known Bell's Theorem and several related results, the theorem rules out such theories. However, we have encountered a common mistaken impression that the barrier applies to all ``hidden-variable'' theories. This is not true; for instance, not only is Bohmian mechanics a hidden-variable theory that survives, it is said to have inspired John Bell toward his theorem in the first place.
Thus far the ``no-go'' sounds philosophical, but it has concrete purely mathematical facets. Cases of the Kochen-Specker theorem like the form above can be encoded by Boolean formulas $latex {\phi}&fg=000000$, with the conclusion saying that $latex {\phi}&fg=000000$ is unsatisfiable. It is possible that such $latex {\phi}&fg=000000$ might be tricky enough to ``fool'' certain attempts to solve SAT, thus showing that certain kinds of algorithms---who is talking about quantum theories now?---are no go. Ken gleaned this idea from a talk by Richard Cleve while he was in Montreal in 2009, which has since developed into this revised paper by Cleve with Peter H{\o}yer, Ben Toner, and John Watrous (which in turn has spawned other work on ``quantum games''). Two other papers of interest are:
No-Go Theorems
Simply put, a no-go theorem says: ``You cannot do X by an argument of type Y.'' It is not easy to define exactly what such theorems are, but perhaps a few examples from mathematics and computer science will help.
$latex {\bullet}&fg=000000$ Unconstructables: Duplicating the cube, trisecting angles, and constructing regular polygons were classic tasks for the ancient Greek geometry of straightedge and compass. It took until the 1800's to prove that all these tasks are impossible in general. But note that some angles such as $latex {90^\circ}&fg=000000$ can be trisected, and some surprising polygons constructed. Moreover, if you are allowed to ``cheat'' by putting two marks on the straightedge to fix a distance, then the first two tasks and more of the third become possible.
$latex {\bullet }&fg=000000$ Set Theory Forcing: In set theory we know, thanks to the deep work of Paul Cohen, that the Axiom of Choice cannot be proved in ZF unless ZF itself is inconsistent. Cohen's forcing builds a model of ZF in which Choice fails, so a proof of Choice within ZF would imply that ZF is inconsistent, and that would be terrible: it would destroy the world of math as we know it.
$latex {\bullet }&fg=000000$ Oracle Results: In computational complexity theory we know, thanks to the brilliant work of Baker-Gill-Solovay, that any solution to the $latex {\mathsf{P} = \mathsf{NP}}&fg=000000$ question cannot ``relativize.'' This follows from one of their theorems:
Theorem 1 There are oracles $latex {A,B}&fg=000000$ such that $latex {\mathsf{P}^{A} = \mathsf{NP}^{A}}&fg=000000$ while $latex {\mathsf{P}^{B} \neq \mathsf{NP}^{B}}&fg=000000$.
Thus if you have proved---you think---that $latex {\mathsf{P}\neq \mathsf{NP}}&fg=000000$, or the opposite, then you had better check that your proof does not relativize.
Roughly this means that the proof cannot simply treat Turing machines as black boxes, with meters for their inputs, outputs, and running times. Instead the proof must get its hands dirty in the details of how machines actually compute. So far no one has figured out how to get inside the machines in this way and show that they cannot solve hard problems efficiently.
There are further ``barriers'' under the monikers natural proofs and algebrization. These are basically more conditional: not a flat no-go but rather
If you can do X by an argument of type Y, then you can also do Z which would be a surprise if not a shock...
No-Go Against Particular Algorithms
If we cannot rule out all possible polynomial-time algorithms for $latex {\mathsf{NP}}&fg=000000$-complete problems, perhaps we can erect a barrier against some broad kinds of algorithms. Our notion of ``progressive'' algorithms has this motivation. The barrier would say that certain kinds of ``soft'' algorithms cannot deal with a certain ``super-hard'' kind of hardness.
A natural idea for solving any hard computational problem is to encode it as a linear program of polynomial size, and then invoke the deep result that all linear programs of size $latex {S}&fg=000000$ can be solved in time polynomial in $latex {S}&fg=000000$. The great Aristotle would be happy:

1. My problem is a polynomial size linear program.
2. All polynomial size linear programs are easy.
3. My problem is easy.
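The happy case of this syllogism is real, if rare. Bipartite perfect matching (the assignment problem), for example, does have an exact polynomial-size LP: by the Birkhoff-von Neumann theorem the vertices of the doubly-stochastic polytope are exactly the permutation matrices. Here is a minimal Python sketch using scipy.optimize.linprog, with a made-up $latex {3 \times 3}&fg=000000$ cost matrix.

```python
import numpy as np
from scipy.optimize import linprog

cost = np.array([[4.0, 1.0, 3.0],   # made-up cost matrix
                 [2.0, 0.0, 5.0],
                 [3.0, 2.0, 2.0]])
n = cost.shape[0]

# Doubly-stochastic constraints: every row and every column sums to 1.
A_eq, b_eq = [], []
for i in range(n):                  # row i of the matching variable matrix
    row = np.zeros(n * n)
    row[i * n:(i + 1) * n] = 1.0
    A_eq.append(row); b_eq.append(1.0)
for j in range(n):                  # column j
    col = np.zeros(n * n)
    col[j::n] = 1.0
    A_eq.append(col); b_eq.append(1.0)

res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=[(0, 1)] * (n * n), method="highs")
print(res.x.reshape(n, n).round(2))  # an optimal permutation matrix
print(res.fun)                       # minimum total cost, here 5.0
```

The trouble, of course, is that for problems like the TSP no such small exact description exists, as the no-go theorem below makes precise.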
Thus the quest is for polynomial size encodings of hard problems, like the TSP or HC problem. The obvious ways to do this fail, and many have tried to find the right encoding so they could invoke the above syllogism. This has led over the years to the following kind of no-go theorem, which we covered last fall:
Theorem 2 A problem $latex {Y}&fg=000000$ cannot have a polynomial-sized linear program provided there is another problem $latex {X}&fg=000000$ so that (i) $latex {Y}&fg=000000$ is an extension of $latex {X}&fg=000000$, and (ii) $latex {X}&fg=000000$ not only does not have a polynomial size linear program, but is ``super-hard.''
As Aristotle might say:

1. Your problem extends my super-hard problem.
2. All problems that extend a super-hard problem are super-hard.
3. Your problem is super-hard.
Exactly what ``super-hard'' means is ingrained in the details of the proofs, but it applies to the standard TSP polytope $latex {X}&fg=000000$, and implies an exponential lower bound on the linear program size. The notion of ``extension'' $latex {Y}&fg=000000$ is called extended formulation (EF) and basically means that $latex {X}&fg=000000$ is a projection of $latex {Y}&fg=000000$. This argument can carry extra power because often the problem $latex {Y}&fg=000000$ may be easier to understand than the problem $latex {X}&fg=000000$.

But now comes an issue of whether it has too much power. Suppose $latex {Z}&fg=000000$ is any ``ordinary'' hard problem, to which $latex {X}&fg=000000$ reduces via an ordinary complexity-theoretic reduction function. Now we make a problem $latex {Y}&fg=000000$ that augments $latex {Z}&fg=000000$ to include the variables used to define $latex {X}&fg=000000$, in such a way that solutions to $latex {Y}&fg=000000$ map back to solutions to $latex {X}&fg=000000$, while $latex {Y}&fg=000000$ remains equivalent to $latex {Z}&fg=000000$. If $latex {Y}&fg=000000$ then always counts as an ``extension'' of $latex {X}&fg=000000$, then we get the following syllogism:
1. Your problem $latex {Z}&fg=000000$ is hard.
2. Your problem is equivalent to an extension $latex {Y}&fg=000000$ of my $latex {X}&fg=000000$ that is super-hard.
3. Your problem is super-hard.
However, the no-go theorem does not say that every hard problem is super-hard. Its barrier is stated only for extensions of $latex {X}&fg=000000$, yet now it sounds as if the barrier applies everywhere. Where does it stop?
In our particular case, we are worried about whether we are getting a syllogism like this:
1. Every extended formulation $latex {Y}&fg=000000$ of the standard polytope $latex {X}&fg=000000$ for the TSP must have exponentially many constraints.
2. Every linear-programming formulation $latex {Z}&fg=000000$ for TSP yields such a $latex {Y}&fg=000000$, with polynomial overhead.
3. Every linear program for TSP requires exponentially many constraints.
But conclusion 3 seems too strong. Because linear programming (LP) is complete for polynomial time, it sounds very close to saying $latex {\mathsf{NP}}&fg=000000$ has exponential-time lower bounds, which is stronger than what the no-go theorem says. So how far down does the barrier really go, and to what kinds of $latex {Z}&fg=000000$'s? That is what we wish to know.
Ways Around The EF No-Go?
Every no-go theorem is based on some assumptions. If you play by the assumptions---the rules---then you cannot succeed. But if you choose to violate the rules, then the no-go theorem says nothing. One way to think about a no-go theorem is precisely this: it shows you where to look for your proof. Stay away from any proof that is covered by the no-go theorem, and by doing this you at least have a chance to succeed.
There are several ways that the EF no-go theorem for TSP could be defeated. Here are two that come to mind. The first is a standard observation, while the second is one that I (Dick) have thought about and believe is new. In any event these two show that the ``no'' in the EF no-go theorem could be more of a yellow light than a red light.
$latex {\bullet }&fg=000000$ Big LP's may be okay: Suppose that you find a way to encode the TSP as an exponential-size LP. This means that the naive way of running the LP solver will fail. But thanks to an insight in the famous ellipsoid-method paper by Martin Grötschel, László Lovász, and Lex Schrijver, it is possible to solve an exponential-size LP in polynomial time, provided it has a good separation oracle: a procedure that, given an infeasible point $latex {\mathbf{x}}&fg=000000$, produces a violated constraint.
Theorem 3 Let $latex {\cal L}&fg=000000$ be an exponential-size LP with a polynomial-time computable separation oracle. Then there is an algorithm that solves $latex {\cal L}&fg=000000$ in time polynomial in the number of variables and the bit-size of its coefficients.
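To make the separation idea concrete, here is a rough Python sketch. It is a plain cutting-plane loop rather than the ellipsoid method itself, so no polynomial-time guarantee is claimed; the toy polytope, oracle, and objective are made up for illustration, and scipy.optimize.linprog does the individual LP solves.

```python
import numpy as np
from scipy.optimize import linprog

def solve_with_separation(c, oracle, n, max_rounds=100):
    """Maximize c.x over a polytope known only through a separation oracle,
    repeatedly solving an LP over the cuts discovered so far."""
    A, b = [], []                       # cuts a.x <= beta found so far
    bounds = [(0.0, 1.0)] * n           # start from the unit box
    for _ in range(max_rounds):
        res = linprog(-np.asarray(c),
                      A_ub=np.array(A) if A else None,
                      b_ub=np.array(b) if b else None,
                      bounds=bounds, method="highs")
        x = res.x
        cut = oracle(x)                 # (a, beta) with a.x > beta, or None
        if cut is None:
            return x                    # x satisfies every constraint
        A.append(cut[0]); b.append(cut[1])
    raise RuntimeError("no feasible optimum found within max_rounds")

# Toy polytope with exponentially many constraints: a.x <= 1 for every 0/1 vector a.
def toy_oracle(x):
    a = (x > 1e-9).astype(float)        # the most violated such constraint, if any
    return (a, 1.0) if a @ x > 1.0 + 1e-9 else None

print(solve_with_separation(c=[3.0, 1.0, 2.0], oracle=toy_oracle, n=3))
```

Each round asks the oracle for one violated constraint, so the solver never has to write down the exponentially many constraints at once.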
$latex {\bullet }&fg=000000$ Having more than one LP may be okay: Suppose that you find a set of polynomial-sized LP's whose feasible regions together cover that of the LP you wish to solve. Then this would allow you to get around the EF no-go theorem. In geometric terms, suppose that you want to solve an LP over the polytope $latex {X}&fg=000000$. You find a set of polynomially many polytopes
$latex \displaystyle Q_{1}, \dots, Q_{m}, &fg=000000$
so that the projections of the $latex {Q_{i}}&fg=000000$'s together cover the polytope $latex {X}&fg=000000$. Then you can optimize over $latex {X}&fg=000000$ by solving an LP over each $latex {Q_{i}}&fg=000000$ and taking the best answer. This seems to me possibly to avoid the EF no-go.
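In the same hedged spirit, here is a minimal Python sketch of the ``best of several small LPs'' idea. For simplicity the pieces live in the same space as $latex {X}&fg=000000$ rather than being projections, and the two toy polytopes are made up; their union is the unit square.

```python
import numpy as np
from scipy.optimize import linprog

def best_over_pieces(c, pieces):
    """Maximize c.x over a union of polytopes by solving one LP per piece
    and keeping the best feasible optimum."""
    best_val, best_x = -np.inf, None
    for A_ub, b_ub, bounds in pieces:
        res = linprog(-np.asarray(c), A_ub=A_ub, b_ub=b_ub,
                      bounds=bounds, method="highs")
        if res.success and -res.fun > best_val:
            best_val, best_x = -res.fun, res.x
    return best_val, best_x

pieces = [
    (np.array([[ 1.0,  1.0]]), np.array([ 1.0]), [(0, 1), (0, 1)]),   # x + y <= 1
    (np.array([[-1.0, -1.0]]), np.array([-1.0]), [(0, 1), (0, 1)]),   # x + y >= 1
]
print(best_over_pieces(c=[2.0, 1.0], pieces=pieces))   # (3.0, array([1., 1.]))
```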
Open Problems
How far does the EF no-go go? As we said above, we are preparing a ``Part 2'' post with a more detailed look at this question.
Note there is a danger in calling anything part I without having written part II. John McCarthy famously wrote the paper, ``Recursive Functions of Symbolic Expressions and their Computation by Machine (Part I),'' and alas never wrote part II. Kurt Gödel's famous paper on undecidability was also a ``Part I,'' but by his telling he had a long manuscript of the ``part II'' which he would have published had people felt they needed to see the extra details. Hopefully we are closer to Gödel than to McCarthy here, but staking a claim on newly broken ground is always uncertain.