{\huge Twin Primes Can Be Useful}
Why the recent breakthrough is important \vspace{.5in}
Yitang Zhang, of the University of New Hampshire, has apparently proved a finite approximation to the famous Twin Prime Conjecture. This is a result of the first order. After ten days of progressively more detailed news, including today's article in the New York Times, Zhang's 56-page preprint has just been released on the Annals of Mathematics journal website. This is a revision of the original submission, which was said in a story in Nature last week to have needed ``a few minor tweaks.''
Today I want to explain why the Twin Prime Conjecture is really important.
Recall that the Twin Prime Conjecture states that there are an infinite number of primes $latex {p}&fg=000000$ and $latex {q}&fg=000000$ that are as close as possible: $latex {p-q = 2}&fg=000000$. Well $latex {3}&fg=000000$ and $latex {2}&fg=000000$ are closer, but that can only happen once, so the best one can hope for is primes that are within two of each other.
Zhang's beautiful result is that there are an infinite number of primes $latex {p}&fg=000000$ and $latex {q}&fg=000000$ so that $latex {p-q = 2N}&fg=000000$ and $latex {N}&fg=000000$ is bounded by an absolute constant. The constant is large---in the tens of millions---but it is a constant. Perhaps we should call these ``Cousin Primes,'' since they are not within two; I will leave the naming of them to the number theorists. Whatever you call them, his result is a huge improvement on what was previously proved, and is a singular advance.
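To make the definitions concrete, here is a minimal sketch we put together that lists the twin prime pairs below a small cutoff; the sieve helper and the cutoff of $latex {100}&fg=000000$ are our illustrative choices, not anything from Zhang's paper.

```python
# Minimal sketch: list the twin prime pairs (p, p+2) below a cutoff.
# The cutoff and helper name are illustrative choices, not from Zhang's work.

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * ((n - i * i) // i + 1)
    return [i for i, is_p in enumerate(sieve) if is_p]

ps = set(primes_up_to(100))
twins = sorted((p, p + 2) for p in ps if p + 2 in ps)
print(twins)  # [(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), (71, 73)]
```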
The proof is long, which is not unexpected. We saw the news of the paper's release earlier today on Terry Tao's Google+ page about the work, which gives some idea of how the proof goes. There are many links and comments in a post by Peter Woit that also mentions a recently announced proof by Harald Helfgott of the ``ternary Goldbach conjecture'' that every odd number above $latex {5}&fg=000000$ is the sum of three primes.
So what can we possibly add to the discussion about Zhang's breakthrough? Nothing on the proof. Something, however, on why the Twin Prime Conjecture is really important. This is all from a personal point of view, but one that I hope you will enjoy. Let's first take a quick look at what was known before his work, and then discuss what it may be useful for.
Before Zhang
One measure of the density of the primes is that the summation
$latex \displaystyle \sum_{p} \frac{1}{p} &fg=000000$
does not converge. That is, the sum

$latex \displaystyle \sum_{p<x} \frac{1}{p} &fg=000000$
tends to infinity as $latex {x}&fg=000000$ tends to infinity. The growth is slow, but the sum does diverge. In 1915, Viggo Brun used sieve methods to prove that Twin Primes are rarer in a precise sense: the summation over twin primes

$latex \displaystyle \sum_{p:\, p+2 \text{ prime}} \frac{1}{p} &fg=000000$
converges. Indeed his result can be improved to show that the number of Twin Primes less than $latex {x}&fg=000000$ is bounded above by

$latex \displaystyle O\left(\frac{x}{(\log x)^{2}}\right). &fg=000000$
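Here is a quick numerical illustration of the contrast (ours, not part of Brun's argument), with the cutoff $latex {x = 10^{6}}&fg=000000$ chosen just for demonstration and both members of each twin pair counted:

```python
# A numerical look, not part of Brun's proof: the sum of 1/p over all primes
# below x creeps upward like log log x, while the sum restricted to twin primes
# stays bounded (it converges to Brun's constant, about 1.902).
from sympy import primerange  # assumes SymPy is available

x = 10 ** 6
ps = list(primerange(2, x))
pset = set(ps)

all_primes_sum = sum(1 / p for p in ps)
twin_sum = sum(1 / p + 1 / (p + 2) for p in ps if p + 2 in pset)

print(f"sum of 1/p over primes < {x}: {all_primes_sum:.4f}")  # keeps growing with x
print(f"same sum over twin primes:   {twin_sum:.4f}")          # levels off
```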
Using heuristic arguments, Godfrey Hardy and John Littlewood conjectured not only that there are an infinite number of Twin Primes, but that their density is close to what a ``random'' model would predict. Let $latex {\pi_{2}(x)}&fg=000000$ be the number of Twin Primes less than $latex {x}&fg=000000$---recall $latex {\pi(x)}&fg=000000$ is used to denote the number of primes less than $latex {x}&fg=000000$---then the Hardy-Littlewood Conjecture is that
$latex \displaystyle \pi_{2}(x) \approx 2C\left(\frac{x}{(\log x)^{2}}\right), &fg=000000$

where $latex {C}&fg=000000$ is the explicit twin prime constant

$latex \displaystyle C = 0.6601618158468695739278121100145\dots &fg=000000$
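As a back-of-the-envelope check of ours, we can compare the actual count against the first-order prediction at $latex {x = 10^{6}}&fg=000000$; keep in mind that the sharper form of the conjecture uses the integral $latex {\int_{2}^{x} dt/(\log t)^{2}}&fg=000000$, so the simple $latex {x/(\log x)^{2}}&fg=000000$ shape undershoots at this range.

```python
# A back-of-the-envelope check of the Hardy-Littlewood prediction; the cutoff
# x = 10**6 is our illustrative choice.
import math
from sympy import primerange  # assumes SymPy is available

C = 0.6601618158468696  # leading digits of the twin prime constant

x = 10 ** 6
ps = list(primerange(2, x))
pset = set(ps)

pi2 = sum(1 for p in ps if p + 2 in pset)   # actual count of twin pairs below x
guess = 2 * C * x / math.log(x) ** 2        # first-order Hardy-Littlewood guess

# The integral form 2*C*Integral_2^x dt/(log t)^2 is sharper; the plain
# x/(log x)^2 shape undershoots at this range.
print(f"pi_2({x}) = {pi2}, prediction ~ {guess:.0f}")
```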
Tao is on record as saying that certain approaches based on sieve theory cannot resolve the Twin Prime Conjecture---see this for a short discussion. Of course Zhang used a different method of proof. Mark Lewko, however, in a comment on a MathOverflow thread about Zhang's paper, indicates that its mechanism alone cannot reduce the gap below $latex {16}&fg=000000$, and that it does not circumvent a more general obstacle to gaps below $latex {6}&fg=000000$.
Another Problem
Years ago I worked on a problem that had nothing directly to do with the Twin Prime Conjecture. The question is a fundamental one about the complexity of Boolean functions. It is not classic, has not been open for a very long time, and could be trivial. But like many problems about Boolean functions it turned out to fight back hard, and the best we could do was to make a small dent in the problem.
The work was joint with Mihail Kolountzakis, Evangelos Markakis, Aranyak Mehta, and Nisheeth Vishnoi. See the paper for details. Indeed while I helped start the research with Markakis, Mehta, and Vishnoi, without Kolountzakis the work would have never been completed. Our result was then greatly improved by Amir Shpilka and Avishay Tal, as we covered here.
The problem is concretely about Boolean functions $latex {f}&fg=000000$ of $latex {k}&fg=000000$ variables, and seems not to involve prime numbers at all. For any subset $latex {S}&fg=000000$ of the coordinates, the corresponding Fourier coefficient is given by
$latex \displaystyle \hat{f}(S) = \frac{1}{2^k} \sum_{x \in \mathbb{Z}_2^k} f(x)\chi_S(x)&fg=000000$
where $latex {\chi_S(x)}&fg=000000$ is $latex {-1}&fg=000000$ if $latex {\sum_{i \in S} x_i}&fg=000000$ is odd, and $latex {+1}&fg=000000$ otherwise. The problem is:

What is the smallest value $latex {\tau(k)}&fg=000000$ such that for every symmetric 0-1 function $latex {f(x_1,\dots,x_k)}&fg=000000$ on $latex {\mathbb{Z}_2^k}&fg=000000$ that is not affine linear---by symmetry, this means neither constant nor a parity function---some $latex {\hat{f}(S)}&fg=000000$ with $latex {1 \leq |S| \leq \tau(k)}&fg=000000$ does not vanish?
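For intuition, here is a brute-force computation of $latex {\tau(k)}&fg=000000$ for small $latex {k}&fg=000000$ that we wrote straight from the definition above; it uses the fact that for symmetric $latex {f}&fg=000000$ the coefficient $latex {\hat{f}(S)}&fg=000000$ depends only on $latex {|S|}&fg=000000$, so it suffices to test $latex {S = \{1,\dots,s\}}&fg=000000$.

```python
# Brute force over all symmetric 0-1 functions on k bits, excluding the affine
# ones (constants, parity, and complemented parity).  Exponential in k, so this
# is only for small k.
from itertools import product

def tau(k):
    worst = 0
    # a symmetric f is determined by its values v[0..k] on the Hamming weights
    for v in product([0, 1], repeat=k + 1):
        if len(set(v)) == 1:                                  # constant
            continue
        if all(v[w] == w % 2 for w in range(k + 1)):          # parity
            continue
        if all(v[w] == (w + 1) % 2 for w in range(k + 1)):    # complemented parity
            continue
        for s in range(1, k + 1):
            # f_hat({1,...,s}), omitting the harmless 1/2^k factor
            coeff = sum((-1) ** (sum(x[:s]) % 2) * v[sum(x)]
                        for x in product([0, 1], repeat=k))
            if coeff != 0:
                worst = max(worst, s)
                break
    return worst

for k in range(2, 9):
    print(k, tau(k))
```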
The Prime Gap Trick
A heuristic that I have used before is this: When trying to prove some theorem in number theory, assume any reasonable property that you need about primes. Either you will at least get a conditional theorem, or you might later be able to weaken the assumption you made. The latter is what happened to us. We reduced the problem to showing that a certain integer-valued polynomial is constant over the set $latex {\{0,1,\dots,k\}}&fg=000000$. Then we expressed the connection in the paper in these terms:
First, we show that $latex {P}&fg=000000$ is constant over the union of two small intervals $latex {\{0, ..., t\} \cup \{k - t, ..., k\}}&fg=000000$. This is obtained by looking at $latex {P}&fg=000000$ modulo carefully chosen prime numbers. One way to prove this (at least infinitely often) would be to assume the twin primes conjecture... We manage to replace [this] by choosing four different primes in a more involved manner...
Well, we actually did something else besides use twin primes, but this is how we got the idea. Moreover, Shpilka and Tal used gaps between consecutive primes in a different way, obtaining bounds in terms of the largest such gap in the interval $latex {1\dots k}&fg=000000$, which is known to be $latex {O(k^{0.525})}&fg=000000$. If we can get our hands on enough cases where the gaps are small, maybe we can improve the estimates further. Why is it useful to have primes with small gaps?
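Before answering, here is a quick empirical look (ours, not from their paper) at the quantity Shpilka and Tal use: the largest gap between consecutive primes up to $latex {k}&fg=000000$, set against the $latex {k^{0.525}}&fg=000000$ shape of the Baker-Harman-Pintz bound.

```python
# Largest gap between consecutive primes up to k, next to the k**0.525 shape of
# the Baker-Harman-Pintz bound (an asymptotic statement, so only the trend matters).
from sympy import primerange  # assumes SymPy is available

for k in [10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6]:
    ps = list(primerange(2, k + 1))
    max_gap = max(b - a for a, b in zip(ps, ps[1:]))
    print(f"k = {k:>7}: max gap = {max_gap:>4}, k**0.525 = {k ** 0.525:8.1f}")
```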
We have covered the use of the Chinese Remainder Theorem for analytical results before. Usually for complexity-theoretic applications such as this one, we want the primes themselves to be small---and don't mind having a bunch of primes. For logspace we can work with polynomially-many primes in a sequential manner, so long as each needs only $latex {O(\log n)}&fg=000000$ bits to store. When we don't need size to be so constrained, however, it can be more useful to have the gap between primes $latex {p,p+c}&fg=000000$ in the representation be small. Then if we know in advance that values $latex {m = P(x)}&fg=000000$ are below $latex {p^2/c}&fg=000000$, we not only have the full congruence information from $latex {m \pmod{p}}&fg=000000$ and $latex {m \pmod{p+c}}&fg=000000$, we also control the values of the quotients. For higher $latex {m}&fg=000000$ we retain a lot of this control. The larger $latex {p}&fg=000000$ and the smaller $latex {c}&fg=000000$, the bigger the effective range.
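Here is a small demonstration of this effect with the twin pair $latex {(101, 103)}&fg=000000$, our illustrative choice: for any $latex {m < p^{2}/c}&fg=000000$, the difference of the two residues pins down the quotient exactly, so $latex {m}&fg=000000$ is recovered outright.

```python
# Recovering m < p*p/c from its residues modulo a close prime pair (p, p+c).
# Writing m = q*(p+c) + r2, we get q*c = (r1 - r2) (mod p), and the bound
# m < p*p/c forces 0 <= q*c < p, so the congruence determines q outright.
p, c = 101, 2  # the twin pair (101, 103), our illustrative choice

def recover(r1, r2, p, c):
    """Given r1 = m % p and r2 = m % (p + c) for some 0 <= m < p*p/c, return m."""
    q = ((r1 - r2) % p) // c   # exact division: q*c equals (r1 - r2) mod p
    return q * (p + c) + r2

for m in [0, 1, 57, 1234, 5000]:          # all below p*p/c = 5100.5
    assert recover(m % p, m % (p + c), p, c) == m
print("every m recovered from its residues mod", p, "and", p + c)
```

In general the two residues only determine $latex {m}&fg=000000$ modulo $latex {p(p+c)}&fg=000000$; the advance knowledge that $latex {m < p^{2}/c}&fg=000000$ is what upgrades this to exact recovery of the quotient.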
We will hence be further interested in how dense the pairs of ``cousin'' primes must be, and how efficiently they can be generated. Anytime there is a breakthrough, it is time to revisit old ideas and see whether they too can profit.
Open Problems
How far can the gap between consecutive primes, for infinitely many such pairs, be reduced? Do this and Helfgott's result on Goldbach herald a more general breakthrough in number theory?