### Weirdness in the Primes

#### Posted by John Baez

What percent of primes end in a 7? I mean when you write them out in base ten.

Well, if you look at the first hundred million primes, the answer is 25.000401%. That’s very close to 1/4. And that makes sense, because there are just 4 digits that a prime can end in, unless it’s really small: 1, 3, 7 and 9.
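If you want to check this sort of thing yourself, here is a minimal sketch in Python. The function names `primes_up_to` and `last_digit_shares` are my own, and the cutoff is far smaller than the first hundred million primes, so the percentages it prints will not match the figure above exactly.

```python
from collections import Counter
from math import isqrt


def primes_up_to(limit):
    """Return a list of all primes <= limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out the multiples of p, starting at p*p.
            multiples = range(p * p, limit + 1, p)
            is_prime[p * p :: p] = bytearray(len(multiples))
    return [n for n, flag in enumerate(is_prime) if flag]


def last_digit_shares(primes):
    """Map each last digit (base ten) to the fraction of primes ending in it."""
    counts = Counter(p % 10 for p in primes)
    total = len(primes)
    return {digit: counts[digit] / total for digit in sorted(counts)}


if __name__ == "__main__":
    # The cutoff 10**6 is an arbitrary choice for a quick run.
    for digit, share in last_digit_shares(primes_up_to(10**6)).items():
        print(f"ends in {digit}: {100 * share:.3f}%")
```

The digits 2 and 5 show up with tiny shares, since each is the last digit of exactly one prime.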

So, you might think the endings of prime numbers are random, or very close to it. But 3 days ago two mathematicians shocked the world with a paper that asked some other questions, like this:

*If you have a prime that ends in a 7, what’s the probability that the next prime ends in a 7?*

I would have expected the answer to be close to 25%. But these mathematicians, Robert Lemke Oliver and Kannan Soundararajan, actually looked. And they found that among the first hundred million primes, the answer is just 17.757%.

So if a prime ends in a 7, it seems to somehow tell the next prime *“I’d rather you didn’t end in a 7. I just did that.”*
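The experiment is easy to repeat on a small scale. The sketch below is my own illustration, not the authors' program: it takes the primes up to some cutoff and, for each prime ending in a given digit, records the last digit of the next prime. The names `primes_up_to` and `next_digit_given` are assumptions, and with a small cutoff the numbers will differ from the 17.757% quoted above.

```python
from math import isqrt


def primes_up_to(limit):
    """Return a list of all primes <= limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            multiples = range(p * p, limit + 1, p)
            is_prime[p * p :: p] = bytearray(len(multiples))
    return [n for n, flag in enumerate(is_prime) if flag]


def next_digit_given(primes, digit=7):
    """Among primes ending in `digit`, return the fraction whose
    successor in the list ends in each last digit."""
    counts = {}
    total = 0
    # Pair each prime with the one after it; the final prime has no successor.
    for p, q in zip(primes, primes[1:]):
        if p % 10 == digit:
            counts[q % 10] = counts.get(q % 10, 0) + 1
            total += 1
    return {d: c / total for d, c in sorted(counts.items())}


if __name__ == "__main__":
    # The cutoff 10**6 is an arbitrary choice for a quick run.
    for d, share in next_digit_given(primes_up_to(10**6), digit=7).items():
        print(f"7 followed by {d}: {100 * share:.3f}%")
```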

This strikes me as weird. And apparently it’s not just because I don’t know enough number theory. Ken Ono is a real expert on number theory, and when he learned about this, he said:

> I was floored. I thought, “For sure, your program’s not working.”

Needless to say, it’s not magic. There will be an explanation. In fact, Lemke Oliver and Soundararajan have conjectured a formula that says exactly how much of a discrepancy to expect, and they’ve checked it, and it seems to work. It works in every base, not just base ten. But we still need a proof that it really works.

By the way, their formula says the discrepancy gets smaller and smaller when we look at more and more primes. If we look at primes less than $N$, the discrepancy is on the order of

$\frac{\log(\log(N))}{\log(N)}$

This goes to zero as $N \to \infty$. But this discrepancy is *huge* compared to the discrepancy for the simpler question, “what percentage of primes ends in a given digit?” For that, the discrepancy, called the Chebyshev bias, is on the order of

$\frac{1}{\log(N) \sqrt{N}}$
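To get a feel for how different these two scales are, here is a rough numerical comparison. It evaluates only the two expressions above, with all constants ignored, so it shows orders of magnitude and not actual discrepancies. The function names are mine.

```python
from math import log, sqrt


def consecutive_bias_scale(n):
    """log(log(N)) / log(N): the scale quoted for consecutive primes."""
    return log(log(n)) / log(n)


def chebyshev_bias_scale(n):
    """1 / (log(N) * sqrt(N)): the scale quoted for the Chebyshev bias."""
    return 1 / (log(n) * sqrt(n))


if __name__ == "__main__":
    for exponent in (3, 6, 9, 12):
        n = 10**exponent
        big = consecutive_bias_scale(n)
        small = chebyshev_bias_scale(n)
        print(f"N = 10^{exponent}: {big:.4f} vs {small:.3e} (ratio {big / small:.3e})")
```

The ratio of the two is $\sqrt{N} \log(\log(N))$, which grows without bound.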

Of course, what’s really surprising is not this huge correlation between the last digits of consecutive primes, but that number theorists hadn’t thought to look for it until now!

Any amateur with decent programming skills could have spotted this and won everlasting fame, if they’d thought to look. What other patterns are hiding in the primes?

For more, read this:

- Erica Klarreich, Mathematicians discover prime conspiracy,
*Quanta*, March 13, 2016.

and this:

- Terry Tao, Biases between consecutive primes,
*What’s New*, March 14, 2016.

and of course the actual paper:

- Robert J. Lemke Oliver and Kannan Soundararajan, Unexpected biases in the distribution of consecutive primes, March 11, 2016.

Their work involves a variant of the Hardy–Littlewood $k$-tuple conjecture, which is a conjectured formula for the density of ‘constellations’ of primes of a given ‘shape’—that is, $k$-tuples of primes that are of the form

$(a_1 + n , \dots, a_k + n)$

for some given ‘shape’ $(a_1, \dots, a_k)$.
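For example, the shape $(0, 2)$ gives twin primes. Here is a small sketch, under my own naming, that counts the constellations of a given shape with every entry at most some cutoff. It assumes the shape has nonnegative entries, and it does not require the primes in a constellation to be consecutive. It only counts; it does not evaluate the conjectured density.

```python
from math import isqrt


def primes_up_to(limit):
    """Return a list of all primes <= limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            multiples = range(p * p, limit + 1, p)
            is_prime[p * p :: p] = bytearray(len(multiples))
    return [n for n, flag in enumerate(is_prime) if flag]


def count_constellations(shape, limit):
    """Count n >= 0 such that every a + n (a in shape) is a prime <= limit.

    Assumes all entries of `shape` are nonnegative integers.
    """
    prime_set = set(primes_up_to(limit))
    return sum(
        1
        for n in range(limit + 1)
        if all((n + a) in prime_set for a in shape)
    )


if __name__ == "__main__":
    print(count_constellations((0, 2), 100))     # twin primes up to 100
    print(count_constellations((0, 2, 6), 100))  # triples (p, p+2, p+6) up to 100
```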

I just noticed something funny. It seems that the Hardy–Littlewood $k$-tuple conjecture is also called the ‘first Hardy–Littlewood conjecture’. The ‘second Hardy–Littlewood conjecture’ says that

$\pi(M + N) \le \pi(M) + \pi(N)$

whenever $M,N \ge 2$, where $\pi(N)$ is the number of primes $\le N$.
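You can test this inequality by brute force for small values. The sketch below, with names of my own choosing, builds a table of $\pi(n)$ and lists every pair $2 \le M \le N \le$ `bound` that violates it. An empty list for a small bound says nothing about what happens for very large $M$ and $N$.

```python
from math import isqrt


def prime_pi_table(limit):
    """Return a list pi with pi[n] = number of primes <= n, for 0 <= n <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = 0
    if limit >= 1:
        is_prime[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            multiples = range(p * p, limit + 1, p)
            is_prime[p * p :: p] = bytearray(len(multiples))
    pi = [0] * (limit + 1)
    running = 0
    for n in range(limit + 1):
        running += is_prime[n]
        pi[n] = running
    return pi


def second_hl_violations(bound):
    """List pairs (M, N) with 2 <= M <= N <= bound and pi(M+N) > pi(M) + pi(N)."""
    pi = prime_pi_table(2 * bound)
    return [
        (m, n)
        for m in range(2, bound + 1)
        for n in range(m, bound + 1)
        if pi[m + n] > pi[m] + pi[n]
    ]


if __name__ == "__main__":
    # The bound 500 is an arbitrary small choice.
    print(second_hl_violations(500))
```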

What’s funny is what Wikipedia says about the second Hardy–Littlewood conjecture! It says:

> This is probably false in general as it is inconsistent with the more likely first Hardy–Littlewood conjecture on prime $k$-tuples, but the first violation is likely to occur for very large values of $M$.

Is this true? If so, did Hardy and Littlewood *notice* that their two conjectures contradicted each other? Isn’t there some rule against this? Otherwise you could just conjecture $P$ and also $\neg P$, disguising $\neg P$ in some very different language, and be sure that *one* of your conjectures was true!

(Unless, of course, you’re an intuitionist.)

## Re: Unexpected Biases in the Distribution of Consecutive Primes

Has anyone checked the function field analog? Does anyone know if it is easier to prove?