{\huge TOC In The Future}
Results of the panel at the Theory Fest
Géraud Sénizergues proved in 1997 that equivalence of deterministic pushdown automata (DPDAs) is decidable. Solving this decades-open problem won him the 2002 Gödel Prize.
Today Ken and I want to ponder how theory of computing (TOC) has changed over the years and where it is headed.
Of course we have some idea of how it has changed over the years, since we both have worked in TOC for decades, but the future is a bit more difficult to tell. Actually the future is also safer: people may feel left out and disagree about the past, but the future is yet to happen so who could be left out?
For example, we might represent the past by the following table of basic decision problems involving automata such as one might teach in an intro theory course. The result by Sénizergues filled in what had been the last unknown box:
| Problem/machine | DFA | NFA | DPDA | NPDA | DLBA | DTM |
|---|---|---|---|---|---|---|
| Does $latex {M}&fg=000000$ accept $latex {x}&fg=000000$? | In P | In P | In P | In P | PSPC | Undec. |
| Is $latex {L(M) = \emptyset}&fg=000000$? | In P | In P | In P | In P | Undec. | Undec. |
| Is $latex {L(M_1) \cap L(M_2) = \emptyset}&fg=000000$? | In P | In P | Undec. | Undec. | Undec. | Undec. |
| Is $latex {L(M)= \Sigma^*}&fg=000000$? | In P | PSPC | In P | Undec. | Undec. | Undec. |
| Is $latex {L(M_1) = L(M_2)}&fg=000000$? | In P | PSPC | Decidable | Undec. | Undec. | Undec. |
Here `PSPC' means $latex {\mathsf{PSPACE}}&fg=000000$-complete. This table is central but leaves out whole fields of important theory.
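To make the DFA column concrete, here is a minimal sketch, in Python with a made-up tuple representation of DFAs, of deciding whether $latex {L(M_1) = L(M_2)}&fg=000000$ by breadth-first search over the product automaton: the languages differ exactly when some reachable pair of states is accepting in one machine but not the other.

```python
from collections import deque

def dfa_equivalent(dfa1, dfa2, alphabet):
    """Decide L(M1) == L(M2) for complete DFAs.

    Each DFA is a triple (start, accepting_set, delta), where delta is a
    dict mapping (state, symbol) -> state.  We BFS over the product
    automaton; a reachable pair accepted by exactly one DFA witnesses a
    distinguishing string.
    """
    start1, acc1, d1 = dfa1
    start2, acc2, d2 = dfa2
    seen = {(start1, start2)}
    queue = deque([(start1, start2)])
    while queue:
        q1, q2 = queue.popleft()
        if (q1 in acc1) != (q2 in acc2):
            return False          # a distinguishing string exists
        for a in alphabet:
            nxt = (d1[(q1, a)], d2[(q2, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Example: both DFAs accept strings with an even number of 'a's.
d1 = (0, {0}, {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1})
d2 = (0, {0}, {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1})
print(dfa_equivalent(d1, d2, 'ab'))  # True
```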
At the Theory Fest this June---which we mentioned here---there will be a panel on the future of TOC. We will try to guess what they will say.
Hints of the Panel
Of course we don't know what the panel will say. They don't necessarily give statements ahead of time like in some Senate hearings. But we can get a hint from the subjects and titles of some of the invited plenary talks, which are the last afternoon session each day:
There is also a keynote by Orna Kupferman titled, ``Examining classical graph-theory problems from the viewpoint of formal-verification methods.'' And there is one by Avi Wigderson titled ``On the Nature and Future of ToC''---which is the subject of this post.
We can get a fix on the present by looking at the regular papers in the conference program. But like Avi we want to try to gauge the future. One clear message of the above range of talks is that it will be diverse. But to say more about how theory is changing we take another look at the past.
TOC: Problems
We can divide the changes in TOC into two parts. One is the change in the kinds of problems we work on, and the other is the change in what we regard as solutions to them.
Years ago, most of the questions we considered were basic questions about strings and other fundamental objects of computing. A classic example was a favorite problem of my mentor, Zeke Zalcstein: the star height problem. You probably do not know this---I knew it once and still had to look it up. Here is a definition: the star height of a regular expression is the maximum depth of nesting of Kleene stars in it, and the star height of a regular language is the minimum star height over all regular expressions denoting it.
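The language-level question is hard, but the syntactic notion is easy to compute. Here is a small sketch, assuming a hypothetical nested-tuple encoding of regular expressions, that returns the star height of an expression as the deepest nesting of stars.

```python
def star_height(expr):
    """Star height of a regular expression given as a nested tuple.

    Conventions (assumed for this sketch):
      a plain string   -- a literal symbol or the empty word
      ('+', r, s)      -- union
      ('.', r, s)      -- concatenation
      ('*', r)         -- Kleene star
    """
    if isinstance(expr, str):
        return 0
    op = expr[0]
    if op == '*':
        return 1 + star_height(expr[1])
    return max(star_height(sub) for sub in expr[1:])

# (a + b*(ab)*)* has star height 2: an outer star over inner ones.
r = ('*', ('+', 'a', ('.', ('*', 'b'), ('*', ('.', 'a', 'b')))))
print(star_height(r))  # 2
```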
Lawrence Eggan seems to have been the first to raise the following questions formally, in 1963: first, is there a fixed $latex {k}&fg=000000$ such that every regular language has star height at most $latex {k}&fg=000000$? And second, is there an algorithm that computes the star height of a given regular language?
Regarding the first question, at first it wasn't even known whether $latex {k}&fg=000000$ needed to be greater than $latex {1}&fg=000000$. There are contexts in which one level of nesting suffices, most notably the theorem that one while-loop suffices for any program. Eggan proved, however, that $latex {k}&fg=000000$ is unbounded, and in 1966 François Dejean and Marcel Schützenberger showed this for languages over a binary alphabet.
The second question became a noted open problem until Kosaburo Hashiguchi proved it decidable in 1988. His algorithm was not even elementary---that is, its time was not bounded by any fixed stack of exponentials in $latex {n}&fg=000000$---but Daniel Kirsten in 2005 improved it to double exponential space, hence at worst triple exponential time. It is known to be $latex {\mathsf{PSPACE}}&fg=000000$-hard, so we might hope only faintly for a runnable algorithm, but special cases (including ones involving groups that interested Zeke) may be tractable. Narrowing the gap is open and interesting but likely to be difficult.
Do you wish you could travel back to the early 1960s to work on the original problems? Well, basically you can: Just add a complementation operator $latex {\sim\!\!r}&fg=000000$ and define it to leave star-height unchanged. Then the resulting generalized star-height problem is wide open, even regarding whether $latex {k=1}&fg=000000$ suffices. To see why it is trickier, note that over the alphabet $latex {\{a,b\}}&fg=000000$,
$latex \displaystyle \begin{array}{rcl} \mathbf{a}^* &=& \mathbf{\sim((\sim \emptyset)b(\sim\emptyset))},\\ (\mathbf{ab})^* &=& \mathbf{\sim(b(\sim\emptyset) + (\sim\emptyset)a + (\sim \emptyset)(aa+bb)(\sim\emptyset))}, \end{array} &fg=000000$
so those languages have generalized star-height $latex {0}&fg=000000$. By contrast, $latex {(\mathbf{aa})^*}&fg=000000$ does not---it needs the one star. See this 1992 paper and these recent slides for more.
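Identities like the two displayed above can be sanity-checked by brute force on short strings. The following sketch, with made-up helper names, compares each star-free expression against the intended language on all binary strings up to length 10.

```python
from itertools import product

def strings_up_to(alphabet, n):
    """All strings over the alphabet of length at most n."""
    for k in range(n + 1):
        for tup in product(alphabet, repeat=k):
            yield ''.join(tup)

def in_a_star(w):
    return all(c == 'a' for c in w)

def in_ab_star(w):
    return len(w) % 2 == 0 and all(
        w[i] == ('a' if i % 2 == 0 else 'b') for i in range(len(w)))

def in_complement_expr_1(w):
    # ~((~0)b(~0)): strings that contain no 'b'
    return 'b' not in w

def in_complement_expr_2(w):
    # ~( b(~0) + (~0)a + (~0)(aa+bb)(~0) ): does not start with b,
    # does not end with a, and contains neither 'aa' nor 'bb'
    return not (w.startswith('b') or w.endswith('a')
                or 'aa' in w or 'bb' in w)

ok = all(in_a_star(w) == in_complement_expr_1(w) and
         in_ab_star(w) == in_complement_expr_2(w)
         for w in strings_up_to('ab', 10))
print(ok)  # True
```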
TOC: Solutions
Diversifying areas are certainly giving us new domains of questions to attack. Often the new problem is an old problem with a new application. For instance, Google's PageRank algorithm derives from the theory of random walks on graphs, as we noted here.
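As a reminder of that connection, here is a minimal power-iteration sketch of PageRank on a tiny made-up link graph; it approximates the stationary distribution of the damped random-surfer walk.

```python
def pagerank(links, damping=0.85, iters=100):
    """PageRank by power iteration on the random-surfer walk.

    links maps each page to the list of pages it links to.
    Pages with no outgoing links are treated as linking to every page.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links[p] if links[p] else pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new[q] += share
        rank = new
    return rank

# Tiny hypothetical graph: A and B link to each other, C links to A.
print(pagerank({'A': ['B'], 'B': ['A'], 'C': ['A']}))
```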
The novelty we find it most fruitful to realize, however, comes from changes in what we regard as solutions---the second point at the head of the last section. We used to demand exact solutions and measure worst-case complexity. Now we allow various grades of approximation. Answers may be contingent on conjectures. For example, edit distance cannot be computed in strongly subquadratic time unless the Strong Exponential Time Hypothesis is false---but some approximations to it run in nearly linear time. We have talked at length about such contingencies in crypto.
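For reference, the exact quadratic-time dynamic program that the SETH-based lower bound says we are essentially stuck with is short; here is the standard sketch.

```python
def edit_distance(s, t):
    """Classic O(|s|*|t|) dynamic program for Levenshtein distance."""
    m, n = len(s), len(t)
    prev = list(range(n + 1))            # distances from s[:0] to prefixes of t
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # delete from s
                         cur[j - 1] + 1,      # insert into s
                         prev[j - 1] + cost)  # substitute or match
        prev = cur
    return prev[n]

print(edit_distance("kitten", "sitting"))  # 3
```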
A nice survey in Nature by Ashley Montanaro shows the progression within the limited field of quantum algorithms. In the classical worst-case sense, it is said that there aren't many quantum algorithms. For a long time the ``big three'' were the algorithms by Peter Shor and Lov Grover and the ability of quantum computers to simulate quantum $latex {N}&fg=000000$-body problems and quantum physics in general. Quantum walks became a fourth and linear algebra a fifth, but as Montanaro notes, the latter needs changing what we consider a solution to a linear system $latex {\mathbf{A}x = \mathbf{b}}&fg=000000$ where $latex {\mathbf{A}}&fg=000000$ is $latex {N \times N}&fg=000000$. You don't get $latex {x}&fg=000000$, rather you get a quantum state $latex {\phi}&fg=000000$ that approximates $latex {\sum_{i=1}^N x_i e_i}&fg=000000$ over a space of $latex {n = \lceil \log_2 N \rceil}&fg=000000$ qubits. The approximation is good enough to answer some predicates with high probability, such as whether the same $latex {x}&fg=000000$ solves another system $latex {\mathbf{A}' x = \mathbf{b}'}&fg=000000$. You lose exactness but what you gain is running time that is polynomial in $latex {n}&fg=000000$ rather than in $latex {N}&fg=000000$, which might be huge.
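The flavor of that trade-off can be imitated classically. The sketch below, using NumPy on made-up matrices, solves one system and then only tests a yes/no predicate about the solution against a second system. That is the kind of question the approximate quantum state is good for, even though the quantum algorithm never writes down $latex {x}&fg=000000$ in full the way this classical stand-in does.

```python
import numpy as np

def solves_both(A, b, A2, b2, tol=1e-8):
    """Solve A x = b, then only *test* whether the same x solves A2 x = b2.

    This mimics, classically, the kind of yes/no predicate about the
    solution vector that the quantum linear-systems algorithm answers.
    """
    x = np.linalg.solve(A, b)
    return bool(np.linalg.norm(A2 @ x - b2) <= tol)

A  = np.array([[2.0, 1.0], [1.0, 3.0]])
b  = np.array([3.0, 5.0])
A2 = np.array([[1.0, 0.0], [0.0, 1.0]])   # the identity system
x  = np.linalg.solve(A, b)
print(solves_both(A, b, A2, x))  # True: x trivially solves I x = x
print(solves_both(A, b, A2, b))  # False unless x happens to equal b
```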
The survey goes on to problems with restricted numbers of true qubits, even zero. These problems seem important today because it has been so hard to build real quantum computers with more than a handful of qubits. Beyond the survey there are quantum versions of online algorithms and approximations of those.
If we are willing to change what we consider to be an answer, it follows that we are primed to handle fuzzier questions and goals. Online auctions are a major recent topic, and we have talked about them a little. There are many design goals: fairness, truthfulness, minimizing regret, profitability for one side or the other. Again we note that old classic problems are often best adaptable to the new contexts, such as stable-marriage graph problems with various new types of constraints.
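Speaking of stable marriage, the classic Gale-Shapley proposal algorithm that keeps being adapted to new constraints fits in a few lines; here is a compact sketch on made-up preference lists.

```python
def gale_shapley(proposer_prefs, responder_prefs):
    """Gale-Shapley stable matching.

    proposer_prefs / responder_prefs map each agent to a preference list
    (most preferred first).  Returns a dict: proposer -> matched responder.
    """
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in responder_prefs.items()}
    free = list(proposer_prefs)
    next_choice = {p: 0 for p in proposer_prefs}
    engaged_to = {}                      # responder -> proposer
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in engaged_to:
            engaged_to[r] = p
        elif rank[r][p] < rank[r][engaged_to[r]]:
            free.append(engaged_to[r])   # displaced proposer becomes free
            engaged_to[r] = p
        else:
            free.append(p)               # rejected, will propose again
    return {p: r for r, p in engaged_to.items()}

prop = {'x': ['A', 'B'], 'y': ['A', 'B']}
resp = {'A': ['y', 'x'], 'B': ['x', 'y']}
print(gale_shapley(prop, resp))  # {'x': 'B', 'y': 'A'} is stable
```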
The old classic problems never go away. What may determine how much they are worked on, however, is how well we can modify what counts as a solution or at least some progress. It seems hard to imagine partial or approximate answers to questions such as, ``is logspace equal to polynomial time?''
The problem we began with about equivalence of DPDAs may be a good test case. Sénizergues gave a simple yes-answer to a definite question, but as with star-height, his algorithm is completely hopeless. Now (D)PDAs and grammars have become integral to compression schemes and their analysis---see this or this, for instance. Will that lead to important new cases and forms of the classic problems we started with?
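As a taste of how grammars enter compression, here is a toy Re-Pair-style sketch: it repeatedly replaces the most frequent adjacent pair of symbols with a fresh nonterminal, yielding a small straight-line grammar for the input string.

```python
from collections import Counter

def repair_compress(text):
    """Toy Re-Pair-style grammar compression.

    Repeatedly replaces the most frequent adjacent pair of symbols by a
    new nonterminal, returning the final sequence and the grammar rules.
    """
    seq = list(text)
    rules = {}
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        name = f"R{next_id}"
        next_id += 1
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

# "abababab" compresses to R1 R1 with R0 -> ab and R1 -> R0 R0.
print(repair_compress("abababab"))
```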
Open Problems
What is your sense of the future of ToC?