{\huge A Proof Of The Halting Theorem}

Toward the idea of teaching computability and complexity simultaneously \vspace{.5in}

Wilhelm Ackermann was a mathematician best known for work in constructive aspects of logic. He has a function named after him: the Ackermann function, which is used in complexity theory and in data structure theory. That is a pretty neat combination.

I would like today to talk about a proof of the famous Halting Theorem.

I am teaching CS4510 which at GIT is the introduction to complexity theory. We usually study general Turing machines and then use the famous Cantor Diagonal method to show that the Halting Problem is not computable. My students over the years have always had trouble with this proof. We have discussed this method multiple times: see here and here and here and in motion pictures here.

This leads always to the question, what really is a proof? The formal answer is that it is a derivation of a theorem statement in a sound and appropriate system of logic. But as reflected in our last two posts, such a proof might not help human understanding. The original meaning of ``proof'' in Latin was the same as ``probe''---to test and explore. I mean ``proof of the Halting Theorem'' in this sense. We think the best proofs are those that show a relationship between concepts that one might not have thought to juxtapose.

Diagonal and Impossible

The question is how best to convince students that there is no way to compute a halting function. We can define Turing machines $latex {M_x}&fg=000000$ in a particular way---or define other kinds of machines. Then we get the particular definition

$latex \displaystyle H(x,y) = \begin{cases} 1 & \text{if the machine } M_x \text{ halts on input } y, \\ 0 & \text{otherwise.} \end{cases} &fg=000000$

How can we prove that $latex {H(x,y)}&fg=000000$ is not computable? We want to convey not only that this particular $latex {H(x,y)}&fg=000000$ is uncomputable, but also that no function like it is computable.

Trying the diagonal method means first defining the set

$latex \displaystyle D = \{x: M_x \text{ does not accept } x\}. &fg=000000$

We need to have already defined what ``accept'' means. OK, we show that there is no machine $latex {M_d}&fg=000000$ whose set of accepted strings equals $latex {D}&fg=000000$. Then what? We can say that the complementary language $latex {K = \{x: M_x \text{ \emph{does} accept } x\}}&fg=000000$ is not decidable, but we still need another step to conclude that $latex {H}&fg=000000$ is uncomputable. And when you trace back the reason, you have to fall back on the diagonal contradiction---which feels disconnected and ad hoc, tied to the particular way $latex {D}&fg=000000$ and $latex {H}&fg=000000$ are defined.
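For readers who want the diagonal contradiction spelled out operationally, here is a minimal sketch in Python. The procedure accepts(x, w) is hypothetical---it stands for a total decider for ``does $latex {M_x}&fg=000000$ accept $latex {w}&fg=000000$?'', which is exactly what cannot exist:

def accepts(x: int, w: int) -> bool:
    """Hypothetical total decider: does machine M_x accept input w?
    Assumed to exist only for the sake of contradiction."""
    raise NotImplementedError

def in_D(x: int) -> bool:
    # x is in D exactly when M_x does not accept its own index.
    return not accepts(x, x)

# If some machine M_d accepted exactly the set D, then on its own index:
#   accepts(d, d) is True  iff  d is in D  iff  not accepts(d, d),
# a contradiction. So no M_d accepts exactly D.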

Ken in his classes goes the $latex {D}&fg=000000$ route first, but Sipser's and several other common textbooks try to hit $latex {H}&fg=000000$ directly. The targeted reason is one that anyone can grasp:

It is impossible for a function $latex {f}&fg=000000$ to satisfy $latex {f(n) = f(n) + 1}&fg=000000$---or to equal any greater value.

Implementations of this, however, resort to double loops to define $latex {f(n)}&fg=000000$. Or, like Sipser's, they embed the ``$latex {D}&fg=000000$'' idea in the proof anyway, which strikes us as making it harder than keeping it separate. We want the cleanest way.

The Proof

Here is the plan. As usual we need to say that $latex {M_{x}(y)}&fg=000000$ represents a computation. If the computation halts then it returns a result. We allow the machine $latex {M_{x}}&fg=000000$ to return a natural number, not just accept or reject. If the machine does not halt then we can let this value be undefined; our point will be that by ``short-circuit'' reasoning the question of an undefined value won't even enter.

Now let $latex {H(x,y)}&fg=000000$ be defined as the halting function above.

Theorem 1 The function $latex {H(x,y)}&fg=000000$ is not computable.

Proof: Define the function $latex {f(y)}&fg=000000$ as follows:

$latex \displaystyle f(y) = 1 + \sum_{x=0}^{y} H(x,y) \cdot M_{x}(y). &fg=000000$

Suppose that $latex {H}&fg=000000$ is computable. Then so is $latex {f(y)}&fg=000000$. This is easy to see: just do the summation and when computing $latex {H(x,y) \cdot M_{x}(y)}&fg=000000$ compute the $latex {H}&fg=000000$ part first. If it is $latex {0}&fg=000000$ then the term adds nothing to the summation, so it ``short-circuits'' $latex {M_{x}(y)}&fg=000000$ and we move on to the next $latex {x}&fg=000000$. If it is $latex {1}&fg=000000$ then we compute $latex {M_{x}(y)}&fg=000000$---which is guaranteed to halt---and add its value to the summation. Let $latex {A_y = \sum_{x=0}^{y - 1} H(x,y) \cdot M_{x}(y)}&fg=000000$ stand for the summation before the last $latex {x = y}&fg=000000$ term; since every term is a product of natural numbers, $latex {A_y \geq 0}&fg=000000$.
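Here is a minimal sketch of that computation in Python, under the supposition that $latex {H}&fg=000000$ is computable. The helpers H(x, y) and run(x, y) are hypothetical stand-ins for the halting function and for simulating $latex {M_x}&fg=000000$ on $latex {y}&fg=000000$; the point is only the ordering that keeps any non-halting computation from ever being started:

def H(x: int, y: int) -> int:
    """Hypothetical halting function: 1 if M_x halts on input y, else 0.
    The theorem shows no real implementation can exist."""
    raise NotImplementedError

def run(x: int, y: int) -> int:
    """Simulate M_x on y and return its natural-number output.
    Only ever called when H guarantees the simulation halts."""
    raise NotImplementedError

def f(y: int) -> int:
    # f(y) = 1 + sum over x = 0..y of H(x, y) * M_x(y), short-circuited.
    total = 1
    for x in range(y + 1):
        if H(x, y) == 1:        # compute the H part of the term first
            total += run(x, y)  # guaranteed to halt, so safe to simulate
        # when H(x, y) == 0 the term contributes 0 and M_x(y) is never run
    return total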

Now if the theorem is false, then there must be some $latex {n}&fg=000000$ such that the machine $latex {M_n}&fg=000000$ computes $latex {f(y)}&fg=000000$. Since $latex {f}&fg=000000$ is total, $latex {M_n}&fg=000000$ halts on every input, so $latex {H(n,n) = 1}&fg=000000$. But then

$latex \displaystyle f(n) = 1 + \sum_{x=0}^n H(x,n)M_{x}(n) = 1 + A_n + M_{n}(n) \ge 1 + M_{n}(n) = 1 + f(n). &fg=000000$

This is impossible and the theorem follows. $latex \Box&fg=000000$

Extending Simplicity

What Ken and I are really after is relating this to hierarchies in complexity classes. When the $latex {M_{x}}&fg=000000$ are the machines of a complexity class $latex {\mathsf{C}}&fg=000000$, the functions $latex {H}&fg=000000$ and $latex {f}&fg=000000$ are computable---every machine in $latex {\mathsf{C}}&fg=000000$ halts, so $latex {H}&fg=000000$ is trivially $latex {1}&fg=000000$ and the simulations terminate. It follows that $latex {f}&fg=000000$ is not computed by any $latex {M_n}&fg=000000$ and so does not belong to $latex {\mathsf{C}}&fg=000000$. What we want is to find similar functions $latex {f'}&fg=000000$ that are natural.

Ackermann's famous function $latex {A(n)}&fg=000000$ does this when $latex {\mathsf{C}}&fg=000000$ is the class of primitive recursive functions. There are various ways to define the machines $latex {M_{x}}&fg=000000$, for instance by programs with counted loops only. The $latex {f(n)}&fg=000000$ that tumbles out is not primitive recursive---indeed it out-grows all primitive recursive functions. Showing that $latex {A(n)}&fg=000000$ does likewise takes a little more formal work.
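For concreteness, here is the standard two-argument Ackermann--Péter function in Python; the one-argument $latex {A(n)}&fg=000000$ above can be taken as its diagonal $latex {A(n,n)}&fg=000000$. This is the textbook definition, independent of any particular machine model:

def ackermann(m: int, n: int) -> int:
    """The Ackermann-Peter function; its diagonal ackermann(n, n)
    outgrows every primitive recursive function."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))

# Even tiny inputs explode: ackermann(3, 3) == 61, while ackermann(4, 2)
# equals 2**65536 - 3, a number with 19,729 decimal digits---far beyond
# naive recursion.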

In complexity theory we have various time and space hierarchy theorems, say where $latex {\mathsf{C}}&fg=000000$ is $latex {\mathsf{DTIME}[O(n^2)]}&fg=000000$. For any time constructible $latex {t(n) = \omega(n^2 \log n)}&fg=000000$, we can separate $latex {\mathsf{C}}&fg=000000$ from $latex {\mathsf{C}' = \mathsf{DTIME}[t(n)]}&fg=000000$ by a ``slowed'' diagonalization. The $latex {f'}&fg=000000$ obtained this way, however, needs knowledge of $latex {t(n)}&fg=000000$ and its constructibility to define it. By further ``padding'' and ``translation'' steps, one can make it work for $latex {t(n) = \omega(n^2 (\log n)^{\epsilon})}&fg=000000$, for any fixed $latex {\epsilon > 0}&fg=000000$, and a similar theorem for deterministic space needs no log factors at all. This is all technical in text and lectures.
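As a sketch of how the ``slowed'' diagonalization goes, here is a Python outline. The clocked simulator run_for(x, w, steps) is hypothetical---it runs $latex {M_x}&fg=000000$ on $latex {w}&fg=000000$ for at most the given number of steps and reports whether it halted and what it output; its simulation overhead is where the log factor comes from:

def t(n: int) -> int:
    # The target time bound; assumed time constructible, e.g. roughly n**3.
    return n ** 3

def run_for(x: int, w: str, steps: int):
    """Hypothetical clocked simulator: run M_x on w for at most `steps`
    steps; return (halted, output)."""
    raise NotImplementedError

def f_prime(w: str) -> int:
    # Diagonalize against every O(n^2)-time machine within time about t(n).
    x = int.from_bytes(w.encode(), "big")  # read the input as a machine index
    halted, output = run_for(x, w, t(len(w)))
    return output + 1 if halted else 0     # differ from M_x(w) whenever it finishes

Then f_prime is computable within roughly $latex {t(n)}&fg=000000$ time but disagrees with every $latex {O(n^2)}&fg=000000$-time machine on inputs encoding its own index.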

Suppose we're happy with $latex {\mathsf{C}' = \mathsf{DTIME}[n^3]}&fg=000000$, that is, with a non-``tight'' hierarchy. Can we simply find a natural $latex {f'}&fg=000000$ that works for $latex {\mathsf{C}'}&fg=000000$? Or suppose $latex {\mathsf{C}}&fg=000000$ is a combined time and space class, say machines that run in $latex {O(n^2)}&fg=000000$ time and $latex {O(n)}&fg=000000$ space simultaneously. Can we possibly get a natural $latex {f'}&fg=000000$ that is different from what we get by considering time or space separately?

We'd like the ``non-tight'' proofs to be simple enough to combine with the above proof for halting. This leads into another change we'd like to see. Most textbooks define computability several chapters ahead of complexity, so the latter feels like a completely different topic. Why should this be so? It is easy to define the running time and space usage of a computation in the same breath. Even when finite automata are included in the syllabus, why not present them as special cases of Turing machines and say they run in linear time, indeed time $latex {n+1}&fg=000000$?
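That is, a deterministic finite automaton is just a Turing machine that sweeps right once and halts: one step per input symbol plus a final accept/reject move, $latex {n+1}&fg=000000$ steps in all. A small Python illustration of the step count---the particular automaton is our own toy example:

def dfa_run(delta, start, accepting, w):
    """Run a DFA as a one-pass machine, counting steps: one per input
    symbol plus one final halting move, i.e. time n + 1."""
    state, steps = start, 0
    for symbol in w:
        state = delta[(state, symbol)]
        steps += 1
    steps += 1  # the final accept/reject move
    return state in accepting, steps

# Toy example: strings over {0,1} with an even number of 1s.
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
print(dfa_run(delta, "even", {"even"}, "10101"))  # (False, 6): 5 symbols + 1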

Open Problems

Is the above Halting Problem proof clearer than the usual ones? Or is it harder to follow?

What suggestions would you make for updating and tightening theory courses? Note some discussion in the comments to two other recent posts.