{\huge There Are Many Primes}

Various senses of 'many' from proofs of the infinitude of primes

\vspace{.2in}

Hillel Furstenberg is entering his 50th years as a professor of mathematics at Hebrew University in Jerusalem. He shared the 2006-07 Wolf Prize with Stephen Smale. He has shown many connections between analysis and combinatorics. One was showing how ergodic theory can prove Endre Szemerédi's theorem that every positive-density subset of the integers includes arbitrarily-long arithmetic progressions $latex {[a + bn]}&fg=000000$. This led to a new multidimensional formulation with Yitzhak Katznelson, and the two later proved positive-density versions of the Hales-Jewett theorem.

Today Ken and I wish to discuss proofs of the infinitude of primes, and what they begin to say about analysis and combinatorics.

We thought of this after viewing an old StackExchange thread on different proofs of the infinitude of primes. The oldest proof is ascribed to Euclid of Alexandria around 300 BCE. The newest one mentioned there was conceived by Furstenberg as an undergraduate in 1955. The visions accompanying these proofs interest us.

The Prime Number theorem is of course a deeper statement. It was proved first by analysis, and its later proof by ``elementary'' methods was considered a major achievement. Something similar on a vast collaborative scale has recently happened with the density Hales-Jewett theorem. The primes do not have positive density, but Ben Green and Terence Tao still proved they have the property of Szemerédi's theorem.

As remarked by Pete Clark at the end of his own notes on proofs of infinitely many primes, the frontier has moved to what can be proved generically about sets of density similar to that of the primes. What density gives ``many'' enough, and what if any extra structure needs to be postulated rather than derived? Let's try to pick up these issues from the beginning.

Euclidean Proofs

Euclid's proof appears in his Elements\/, Book IX, Proposition 20. We all know it, I believe. But here is the main idea. Suppose that

$latex \displaystyle p_{1},\dots,p_{m} &fg=000000$

are distinct primes. Look at

$latex \displaystyle N=p_{1}p_{2} \cdots p_{m} + 1. &fg=000000$

Clearly $latex {N>1}&fg=000000$ and so must have some prime factor, say $latex {q}&fg=000000$. But $latex {q}&fg=000000$ cannot be any $latex {p_{i}}&fg=000000$ and therefore is a new prime. This shows that there are an infinite number of primes.

There are many variations on this theme. Ernst Kummer in 1878 used

$latex \displaystyle N=p_{1}p_{2} \cdots p_{m} - 1. &fg=000000$

This affords some small, with all due respect, technical advantage over Euclid's proof.

The power of this idea is manifest in that it can prove much more than just there are an infinite number of primes. It can prove special cases of the famous theorem by Peter Dirichlet's on arithmetic progressions. Let $latex {a,b}&fg=000000$ be relatively prime positive numbers. Then the progression

$latex \displaystyle a,\ a+n,\ a+2n,\ a+3n,\ \dots,\ &fg=000000$

contains an infinite number of primes. The theorem is so famous that it has been re-proved and discussed many times. It was even recently uploaded to the ArXiv here---in English. I guess if your paper is super important eventually it gets put there even if you are no longer around. Pretty neat.

For example, Euclid's method can be easily modified to prove:

Theorem 1 There are infinitely many prime numbers of the form $latex {4n-1}&fg=000000$.

Let $latex {p_{1},\dots,p_{m}}&fg=000000$ be all the primes of this form. The key is to define

$latex \displaystyle N = 4 \prod_{i=1}^{m}p_{i} \ -1. &fg=000000$

The rest of the argument is almost the same as before: no $latex {p_i}&fg=000000$ divides $latex {N}&fg=000000$, and if all other primes were congruent to $latex {+1}&fg=000000$ mod $latex {4}&fg=000000$, then $latex {N}&fg=000000$ as a product of them would be too. Note the trick was in defining $latex {N}&fg=000000$. Many congruences cases can be handled by just defining the ``right'' integer $latex {N}&fg=000000$, but some seem not to work. Can we prove that?

Counting Proofs

There are several of what I would call counting proofs. They show that if there are a finite number of primes, then there are not enough ``names'' for all integers. Clark's notes have some examples, and the one by Paul Erd\H{o}s is also in notes by John Cook. It goes like this: Consider the integers

$latex \displaystyle 1,\dots,x. &fg=000000$

Suppose that there are only $latex {m}&fg=000000$ primes. Every integer in this range can be written as $latex {r^{2}s}&fg=000000$ where $latex {s}&fg=000000$ is a square-free number. But $latex {r}&fg=000000$ must be at most $latex {\sqrt x}&fg=000000$ and clearly there are only $latex {2^{m}}&fg=000000$ possible values for $latex {s}&fg=000000$. This shows that

$latex \displaystyle \sqrt{x}2^{m} \ge x. &fg=000000$

But this is impossible for $latex {x}&fg=000000$ large enough. Very neat.

Proofs By Higher Knowledge

One proof, by Lawrence Washington and mentioned by Clark, is just a finite computation that rests on higher knowledge:

$latex \displaystyle (1 + \sqrt{-5})\cdot(1 - \sqrt{-5}) = 6 = 2\cdot 3. &fg=000000$

But if the ring $latex {\mathbb{Z}}&fg=000000$ had only finitely many prime ideals then every extension such as $latex {\mathbb{Z}[\sqrt{-5}]}&fg=000000$ would be a unique factorization domain---contradiction, Q.E.D.

Well not so fast. The ``then'' part rests on theory about number fields and integer closures and principal ideal domains proved by Richard Dedekind and others.

A common feature of all these proofs is that they rest on properties of factorization\/ that are regarded as intuitive. You have a number $latex {N}&fg=000000$ where adding or subtracting 1 (or something else) has destroyed the previous product structure, so you have to make inferences about its unknown factors de novo. Is this somehow cheating? Washington's proof has no ``$latex {N}&fg=000000$'' but it would take much care to convince us that the Dedekind theory wasn't somehow using the infinitude of primes to begin with.

What strikes us as remarkable about Furstenberg's proof next is that it sublimates the notion of factoring into something else. Arithmetical progressions are defined by common factors but those factors are known.

Furstenberg's Topological Proof

Clark's notes reproduce Furstenberg's one-paragraph proof as published in the American Mathematical Monthly; we elide one sentence that made a side remark:

``In this note we would like to offer an elementary `topological' proof of the infinitude of the prime numbers. We introduce a topology into the space of integers $latex {S}&fg=000000$, by using the arithmetic progressions (from $latex {-\infty}&fg=000000$ to $latex {+\infty}&fg=000000$) as a basis. It is not difficult to verify that this actually yields a topological space. Each arithmetic progression is closed as well as open, since its complement is the union of other arithmetic progressions (having the same difference). As a result the union of any finite number of arithmetic progressions is closed. Consider now the set $latex {A = \cup A_p}&fg=000000$ where $latex {A_p}&fg=000000$ consists of all multiples of $latex {p}&fg=000000$, and $latex {p}&fg=000000$ runs though the set of primes $latex {\geq 2}&fg=000000$. The only numbers not belonging to $latex {A}&fg=000000$ are $latex {−1}&fg=000000$ and $latex {1}&fg=000000$, and since the set $latex {\{−1, 1\}}&fg=000000$ is clearly not an open set, $latex {A}&fg=000000$ cannot be closed. Hence $latex {A}&fg=000000$ is not a finite union of closed sets which proves that there are an infinity of primes.''

This is so compact as to make one wonder about ``cheating.'' However, as noted in an April 2009 Monthly article by Idris Mercer, one can dispense with topological language and ground the proof even more simply in set theory.

Mercer's Set-Theory Rendition

Let $latex {U}&fg=000000$ be an infinite universe and $latex {\cal A}&fg=000000$ a family of infinite subsets of $latex {U}&fg=000000$ such that $latex {{\cal A}' = {\cal A} \cup \{\emptyset\}}&fg=000000$ is closed under intersection. Also let us suppose $latex {\cal A}&fg=000000$ is such that for every $latex {A \in {\cal A}}&fg=000000$, the set $latex {U \setminus A}&fg=000000$ is a union of members of $latex {\cal A}&fg=000000$. (We will in fact not care whether this union is finite, though it is when $latex {\cal A}&fg=000000$ is the collection of all arithmetical progressions.)

Fix a nonempty finite subset $latex {F}&fg=000000$ of $latex {U}&fg=000000$. Then, for any $latex {A_{1},\dots,A_{m}}&fg=000000$ in the family $latex {\cal A}&fg=000000$,

$latex \displaystyle A_{1} \cup \cdots \cup A_{m} = U \setminus F&fg=000000$

is impossible. For if it were possible, then by taking complements, we would have

$latex \displaystyle {\bar A_{1}} \cap \dots \cap {\bar A_{m}} = F. &fg=000000$

By hypothesis, each $latex {{\bar A_{k}}}&fg=000000$ is a union of sets from the family. By the distributive law of $latex {\cup}&fg=000000$ and $latex {\cap}&fg=000000$, we may bring the entire union to the outside, getting

$latex \displaystyle F = \bigcup_{\lambda} A_\lambda,\quad\mbox{where}\quad A_{\lambda} = A_{\lambda,1} \cap \cdots \cap A_{\lambda,m},\mbox{~each &fg=000000$

By the closure hypothesis on $latex {\cal A}&fg=000000$, each $latex {A_{\lambda}}&fg=000000$ is either $latex {\emptyset}&fg=000000$ or infinite. Hence the whole union cannot be finite, which is a contradiction.

Finally, if there were finitely many primes $latex {p_1,\dots,p_m}&fg=000000$, then we would have such an impossible representation with $latex {F = \{-1,1\}}&fg=000000$ and $latex {A_j = [0 + p_j n]}&fg=000000$. $latex {\Box}&fg=000000$

\bigskip How elementary is this? Although distributing an infinite union gets into aspects of set theory that are formally more advanced, in fact we don't need it since $latex {\bar{A_{j}}}&fg=000000$ equals the finite union $latex {[1 + p_j n] \cup \cdots \cup [(p_j - 1) + p_j n]}&fg=000000$. That the intersection of two arithmetical progressions $latex {[a + bn]}&fg=000000$ and $latex {[c + dn]}&fg=000000$ is either empty or infinite needs only the finitistic reasoning that if $latex {x}&fg=000000$ belongs to the intersection then so does $latex {x + bd}&fg=000000$ (not to mention $latex {x + \text{lcm}(b,d)}&fg=000000$). Hence the lone open-ended inference seems to be that $latex {F}&fg=000000$ is the complement of the union of the $latex {A_j}&fg=000000$. This strikes us as involving only the notion of ``prime,'' in a way separate from the other uses of ``factoring.''

Open Problems

Does our interpretation of Furstenberg's proof as ``more elementary'' hold water? Is it a valid hint as to how concrete number-theoretic properties that matter to complexity might be obtained from a ``rising sea'' of knowledge at the juncture of set theory, topology, and abstract algebra?

Can you find some other proofs? On the ``silly'' or ``cheating'' side, having finitely many primes would give a polynomial time algorithm for factoring (actually linear time), contradicting the usual cryptography assumption that factoring is hard. We wish there were a more serious connection that something known in complexity theory would be false---the trouble with the above proof is that the cryptography assumption is unproved.