{\huge The Shapes of Computations}
Or rather, what can the shapes of proofs tell us about them?
\vspace{.25in}
Juris Hartmanis did much to lay the landscape of computational complexity beginning in the 1960s. His seminal paper with Richard Stearns, ``On the Computational Complexity of Algorithms,'' was published 50 years ago this month. It is a great achievement to open a new world, but all the more mysterious that after 50 years so much of the landscape remains unknown.
Today we ask how much some recent large-data discoveries may reveal of the unseen contours.
The idea for this post arose from a possibly phantom memory that Juris (co-)wrote a short draft survey on ``Shapes of Computations'' sometime in 1986--1989 when I was at Cornell. I recalled the specific phrase ``long and skinny'' to describe space-bounded computations. An $latex {\ell = O(\log n)}&fg=000000$ space-bounded computation can explore all of a polynomial $latex {p(n)}&fg=000000$-sized undirected graph by saving just the current node and some $latex {O(\log n)}&fg=000000$ auxiliary information by which to choose a neighbor for the next step. The trace of this computation becomes an $latex {\exp(\ell)}&fg=000000$-length sequence of $latex {\ell}&fg=000000$-bit strings. A polynomial-space computation doing an exponential search of a game tree has the same long-and-skinny ``shape'' even though the scale is greater with regard to the input length $latex {n}&fg=000000$. Polynomial time-bounded computations, however, can use polynomial space, whereupon they become ``short and fat.'' Breadth-first search of a graph is a canonical algorithm that hogs space for its relatively short duration.
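To make the contrast concrete, here is a small Python sketch (our own illustration, with hypothetical function names and a toy cycle graph, not anything from Juris's survey): a walk that remembers only the current node plus a step counter, next to breadth-first search whose queue and visited set can swell to hold a constant fraction of the graph at once.

\begin{verbatim}
import random
from collections import deque

def walk_explore(adj, start, steps):
    """'Long and skinny': the walker stores only the current node, which
    takes O(log n) bits, plus a step counter; its trace is a very long
    sequence of very small configurations.  On a connected undirected
    graph, polynomially many random steps visit every node with high
    probability (Reingold's deterministic logspace traversal is subtler
    but has the same skinny trace)."""
    seen = {start}                  # kept only to report coverage below
    node = start
    for _ in range(steps):
        node = random.choice(adj[node])
        seen.add(node)
    return seen

def bfs_explore(adj, start):
    """'Short and fat': breadth-first search finishes in O(n + m) steps,
    but its queue and visited set can hold a constant fraction of all
    nodes at once."""
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in visited:
                visited.add(nbr)
                queue.append(nbr)
    return visited

if __name__ == "__main__":
    n = 8                           # a small cycle graph as the test case
    adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
    print(sorted(walk_explore(adj, 0, steps=n ** 3)))
    print(sorted(bfs_explore(adj, 0)))
\end{verbatim}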
Which computations fall between these ``Laurel and Hardy'' extremes? For SAT and the other $latex {\mathsf{NP}}&fg=000000$-complete problems, this is the great question. The surest way to separate $latex {\mathsf{NP}}&fg=000000$ from $latex {\mathsf{P}}&fg=000000$ and from $latex {\mathsf{PSPACE}}&fg=000000$ would be to characterize these problems by a third distinct shape of computation. But we have not even separated $latex {\mathsf{P}}&fg=000000$ from $latex {\mathsf{PSPACE}}&fg=000000$, nor logspace from $latex {\mathsf{NP}}&fg=000000$, so what can we pretend to know?
Shapes of Proofs
My memory has probably been transferred from a column Juris wrote with his students Richard Chang, Desh Ranjan, and Pankaj Rohatgi for the May 1990 issue of the Bulletin of the EATCS. It has this nice diagram:
\includegraphics[width=3.5in]{ProofShapes.png}
The analogy between computations and proofs has been instrumental since the early days of Kurt Gödel, Alonzo Church, and Alan Turing. Proofs do, however, give nondeterminism ``for free'': $latex {\mathsf{NP}}&fg=000000$ is treated the same as $latex {\mathsf{P}}&fg=000000$ in their diagram, $latex {\mathsf{NEXP}}&fg=000000$ the same as $latex {\mathsf{EXP}}&fg=000000$, while ``nondeterministic polynomial space'' equals $latex {\mathsf{PSPACE}}&fg=000000$. Hence I've regarded ``proofs'' as secondary to ``computations'' in this regard. However:
The EATCS column defines a notion of width of a proof, characterizes $latex {\mathsf{PSPACE}}&fg=000000$ via polynomial-width proofs, and marvels at how the classic interactive protocol for $latex {\mathsf{IP = PSPACE}}&fg=000000$ retains the ``skinny'' shape with less length. Indeed, in cases where the verifier is able to check directly evaluations of the unique multilinear extension $latex {\hat{f}}&fg=000000$ of the arithmetization $latex {f}&fg=000000$ of a quantified Boolean formula, every proof step involves just two terms $latex {a_i x_i + b_i}&fg=000000$ for the rounds $latex {i = 1}&fg=000000$ to $latex {n}&fg=000000$. A related form of skinniness had been brought out by Jin-Yi Cai and Merrick Furst in their 1987 paper, ``$latex {\mathsf{PSPACE}}&fg=000000$ Survives Three-Bit Bottlenecks.'' The column goes on to emphasize that the $latex {\mathsf{IP}}&fg=000000$ form of the proof lends itself to quicker probabilistic verification. This aspect was soon codified in the definition of probabilistically checkable proof, which most readily characterizes $latex {\mathsf{NP}}&fg=000000$ and $latex {\mathsf{NEXP}}&fg=000000$.
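To make the ``two terms $latex {a_i x_i + b_i}&fg=000000$'' concrete, here is a bare-bones sumcheck round structure in Python over a prime field. It is only an illustrative sketch under simplifying assumptions of our own: the polynomial is the multilinear extension of a table of values on the Boolean cube, the prover is honest, and the final evaluation is done by folding the table rather than by arithmetizing a quantified formula.

\begin{verbatim}
import random

P = 2**31 - 1  # prime modulus for the arithmetic (an illustrative choice)

def fold(table, r):
    """Fix the leading variable of a multilinear evaluation table to r."""
    half = len(table) // 2
    return [((1 - r) * table[j] + r * table[half + j]) % P
            for j in range(half)]

def sumcheck(table):
    """Honest prover/verifier exchange certifying  sum_{x in {0,1}^n} f(x).
    `table` lists f on the Boolean cube, leading variable as the high bit.
    Since the multilinear extension f-hat has degree 1 in each variable,
    the prover's message in round i is just the pair (a_i, b_i) encoding
    g_i(X) = a_i*X + b_i."""
    claim = sum(table) % P              # the claimed value of the full sum
    while len(table) > 1:
        half = len(table) // 2
        g0 = sum(table[:half]) % P      # g_i(0)
        g1 = sum(table[half:]) % P      # g_i(1)
        a, b = (g1 - g0) % P, g0        # so g_i(X) = a*X + b
        assert (b + (a + b)) % P == claim, "round consistency check failed"
        r = random.randrange(P)         # verifier's random challenge r_i
        claim = (a * r + b) % P         # new claim: g_i(r_i)
        table = fold(table, r)          # restrict f-hat to X_i = r_i
    # Final check: the single remaining entry is f-hat(r_1, ..., r_n);
    # in the real protocol the verifier computes this value on its own.
    assert table[0] % P == claim
    return True

if __name__ == "__main__":
    print(sumcheck([3, 1, 4, 1, 5, 9, 2, 6]))   # f on {0,1}^3
\end{verbatim}

Each round's message is just the pair $latex {(a_i, b_i)}&fg=000000$, which is the sense in which the proof stays skinny even while an exponential-size sum is being certified.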
Three Kinds of Proofs
Amid all this development on ``long and skinny'' proofs, what can we say about ``short and fat'' ones? Intuitively, such proofs have lots of cases, but that is not their full story. The network of logical dependence matters too. Hence we think there are most helpfully three basic kinds of proofs:

1. Proofs that run as a single chain of deductions, each step resting on the one just before, with essentially no case splits.

2. Proofs that branch into cases, and sub-cases within cases, so that the logical dependence forms a tree.

3. Proofs whose lemmas and sub-cases are invoked many times from different places, so that the dependence forms a web rather than a tree.
Direct evaluations of quantified Boolean formulas in $latex {\mathsf{PSPACE}}&fg=000000$ have type 2, while the interactive proof with polynomials gives the ``feel'' of type 1 to both the prover and the recipient of the proof.
Chess problems prefer type 1 or a limited form of type 2 for esthetics. The website ChessBase.com recently republished the longest branching-free ``Mate-In-N'' problem ever created, by Walther Jørgensen in 1976. It is a mate-in-203 with no alternative move allowed to the winner, and virtually no case analysis of alternate defensive tries by the loser either.
\includegraphics[width=2in]{JorgensenMate203.png}
However, a chess search often has type 3. Often there will be different starting sequences of moves that come to the same position $latex {P}&fg=000000$. The value of $latex {P}&fg=000000$ that was computed the first time is stored in a hash table so that the later sequences are immediately shown that value, cutting off their need for any further work. This resembles breadth-first search insofar as marked nodes may be touched later along other paths. The dependencies of values become web-like. Injured values from hash collisions can cause huge ripple effects, as I covered in a post three years ago.
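As a toy model of this web-like structure (our own illustration with a simple take-away game standing in for real chess code), here is a memoized search in Python: a position reached by different move orders is evaluated only once, and later paths reuse the stored value in just the way a proof reuses a lemma.

\begin{verbatim}
from functools import lru_cache

# Toy two-player game: from n stones a player removes 1, 2, or 3 stones;
# whoever cannot move (n == 0) loses.  Different move orders reach the
# same position n, so the search graph is a web (a DAG), not a tree.

CALLS = 0

@lru_cache(maxsize=None)            # the "hash table" of stored values
def wins(n):
    """True iff the player to move at position n can force a win."""
    global CALLS
    CALLS += 1
    return any(not wins(n - take) for take in (1, 2, 3) if take <= n)

if __name__ == "__main__":
    print(wins(30), "- distinct positions actually evaluated:", CALLS)
    # Without the cache, every transposing move order would re-search the
    # same positions, blowing the web back up into a much larger tree.
\end{verbatim}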
The stored-hash tactic is much the same as a lemma in a proof that is used multiple times. We suspect that last year's 13-gigabyte computer-generated proof of discrepancy $latex { > 2}&fg=000000$ in the Erd\H{o}s Discrepancy Conjecture has many such lemmas, and hence is more type 3 than 2. The provers Boris Konev and Alexei Lisitsa have an updated page with further computations. They do not link the whole impossibility proof for a sequence of length 1,161 or more with discrepancy 2, but they do give some of the ``DRUP'' certificates of unsatisfiability. DRUP stands for reverse unit propagation with clause deletion, and that propagation strikes us as largely composed of cases and lemmas. The subcases might be codable at a high level via predicates $latex {P(m,\cdots)}&fg=000000$ for $latex {m < 1,161}&fg=000000$ expressing the unavailability of length-$latex {m}&fg=000000$ subsequences fulfilling some extra conditions $latex {(\cdots)}&fg=000000$, with such predicates being copiously re-used.
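For the flavor of such a check, here is a bare-bones reverse-unit-propagation test in Python. It is only a sketch of the RUP idea under our own tiny CNF example, not the checker Konev and Lisitsa used, and it ignores the deletion information that the ``D'' in DRUP supplies: to certify a lemma clause, assume the negations of its literals and see whether unit propagation over the existing clauses runs into a conflict.

\begin{verbatim}
def unit_propagate(clauses, assignment):
    """Repeatedly assign forced literals; return False on a conflict
    (a fully falsified clause), True if propagation reaches a fixed point."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unfalsified = [l for l in clause if -l not in assignment]
            if not unfalsified:
                return False                   # conflict: clause falsified
            if any(l in assignment for l in clause):
                continue                       # clause already satisfied
            if len(unfalsified) == 1:
                assignment.add(unfalsified[0]) # forced unit literal
                changed = True
    return True

def rup_check(clauses, lemma):
    """A lemma clause has the RUP property if asserting the negation of
    each of its literals lets unit propagation derive a conflict."""
    assignment = {-l for l in lemma}
    return not unit_propagate(clauses, assignment)

if __name__ == "__main__":
    # Clauses in DIMACS-style signed-integer form.
    cnf = [[1, 2], [-1, 2], [1, -2]]
    print(rup_check(cnf, [1]))   # True: the unit clause (x1) follows by RUP
    print(rup_check(cnf, [2]))   # True: so does (x2)
\end{verbatim}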
One finds an explosion of stored sub-cases in chess endgame tables. However, a vast majority of these stored positions are there to prove wins that a chess master would see as ``trivially true'' in a few seconds. In other cases an alternative by the loser might simply jump to a later stage of the main trunk line, thus merely accelerating the same loss rather than posing a new case. (Similarly, an alternative by the winner might merely allow the defender to wind back to an initial stage, without much need for separate verification.) We wonder how far this pertains to the longest endgame win discovered in the new 7-piece endgame tables--mate in 545 moves. That is, how large is the part of the proof tree that needs to be specified, so that a chess program given values for positions in the tree could verify the rest via its local search?
Computations Short-Cutting Proofs?
This last eventuality prompts our new speculation: Perhaps we can rigorously develop a science of when large sections of proofs can be effectively handwaved. This would appeal to the computational style most often postulated for our human brains: shallow but broad and with great power of heuristic association, analogous to the complexity class $latex {\mathsf{TC^0}}&fg=000000$ of poly-size constant-depth threshold circuits. Even when yoked with computers, the added value of our grey matter is not to be discounted, as attested in section 6 of my joint paper last summer, which shows how human-computer tandems performed better at chess in 2005--08 than computers alone.
We have recently twice covered a conjecture by Freeman Dyson that one feels should lend itself to this kind of treatment. Many other open conjectures in number theory are felt to be ``probably'' true, where ``probably'' has a technical sense that might be developed further into some kind of dependent structure: if this handwave is valid, then all the more certainly so is this one. The idea could be helped by an enumeration of exceptions that, once handled, enable the rest of the proof to be executed with a broad brush.
This could push many proofs of type 3 toward types 2 or 1. Note that in the way interaction and randomness combine to move type 2 toward type 1, we are already agreeing to tolerate a chance of error. It is the nature of the kind of error involved in converting instances of type 3 into type 2 that needs further consideration. We wonder whether current developments such as homotopy type theory are embracing not just exact patterns but also heuristics for when a search is overwhelmingly likely to succeed---or to fail.
This still leaves our original question of shapes of computations. In the past both Dick and I have undertaken some exploration of conditions under which computations might be similarly ``self-improved.'' That idea will have to be for another time.
Open Problems
Can we assign particular ``shapes'' of computations canonically to specific computational problems? Is doing so no more than tantamount to solving the big open questions to begin with?
Again we congratulate Juris and Richard on this 50th anniversary of their achievement. We also tip our hat to a comment by John Sidles in our ``Case Against Cases'' post which partly prompted this further essay.