# Gaps in prime-like sets

This post was inspired by a question of Trevor Wooley after a talk of James Maynard at MSRI. He asked what was known for lower bounds of large gaps between integers that have a representation as the sum of two squares. We assume some familiarity with the large gaps between primes problem.

A theorem of Fermat asserts that an integer, $n$ is represented by the sum of two squares if and only if all the prime factors of $n$ equivalent to 3 modulo 4 appear with an even power.

For technical simplicity, we consider the set of integers that simply have no prime factors equivalent to 3 modulo 4, but one may extend the considerations below to the original question.

The size of such a set is proportional to $x / \log^{1/2} x$. Thus the average gap is $\asymp \log^{1/2} x$ and so the average gap is at least this big.

Surprisingly, there is an easy argument that yields a gap of size $\gg \log x$. We sketch it here. For each prime $p \leq \log x$ equivalent to 3 modulo 4, choose a residue class $a_p \pmod p$. Using a greedy algorithm to choose the $a_p$, we may sieve out the integers in $\{1 , \ldots , y\}$ for $y \ll \log x$. By the Chinese remainder theorem, there is a $t \leq x$ such that $t = - a_p \pmod p$ for all the relevant $p$. Thus $t+1 , \ldots , t+y$ all contain a prime factor equivalent to 3 modulo 4, as desired.

What should one expect? Adopting the Cramer model, we expect that there should be gaps of size at least $\asymp \log^{3/2 - \epsilon} x$. We give a rough sketch as to why one should expect this. Let $S = \{ n \leq x : p|n \implies p \neq 3 \pmod 4\}$. We suppose that an integer is chosen random with probability $q = 1 / \log^{1/2} x$. Then the probability that $n+1 , \ldots , n_h \notin S$ is $(1- q)^h$ which is well approximated by $e^{-qh}$. If $h \asymp (\log x) / q$, then this is around the number of choices for $n$. One can see the Cramer model worked out more carefully here.

Fix $q$ and $a$ coprime to $q$. Now we consider the set $S = \{n \leq x : p|n \implies p \equiv a \mod q\}$. The Cramer model, as above, suggests that there should be gaps between elements of $S$ that are $\leq x$ of size $\log^{1 + 1/\phi (q) - \epsilon } x$. Nevertheless we may utilize the trivial bound above to obtain gaps of size $\log x$. Thus, for these $S$, we can come arbitrarily close (i.e. an arbitrarily small power of log) to the lower bounds associated to Cramer’s conjecture.

Note for primes, we are nowhere near Cramer’s conjecture. Ford, Green, Konyagin, Maynard, and Tao have the best lower bound. Things look worse for upper bounds, as no one knows how to improve upon a power of $x$ for the primes or any of the sets mentioned in this post (even if one assumes the Riemann hypothesis).

# A lower bound for the least prime in an arithmetic progression

Junxian Li, Kyle Pratt, and I recently uploaded our paper A lower bound for the least prime in an arithmetic progression to the arXiv.

Here is a file where the heuristics considered in section 2 of the paper are developed in a slightly simpler situation.

Given a positive integer $k$ and a $\ell$ coprime to $k$, define $p(k , \ell)$ to be the smallest prime equivalent to $\ell$ modulo $k$. We are interested in the worst case behavior, that is $P(k) := \max_{(\ell , k) = 1} p(k , \ell).$ Thus $P(5) = 19$ and $P(12) = 13$. In particular we are interested in lower bounds for $P(k)$ for large $k$. An elementary observation, due to Pomerance, in Theorem 1 shows roughly that to find lower bounds for $P(k)$, it is enough to find lower bounds for the Jacobsthal function (the roughly will be explained below). For an integer $m$, the Jacobsthal function, $g(m)$, is the largest difference between consecutive integers coprime to $m$.

In recent work on Long gaps between primes by Ford, Green, Konyagin, Maynard, and Tao, they improve upon lower bounds for $g(m)$ where $m$ is the product of the first $u$ primes (they also mention the connection to the least prime problem; indeed it was Kevin Ford who originally introduced us to the problem). The key difference in the current problem is that we seek lower bounds for $g(m)$ where $m$ is the product of the first $u$ primes that are coprime to $k$. Our main new idea is to modify these sieve weights of Maynard used in the work of Ford, Green, Konyagin, Maynard, and Tao. We outline our approach in section 4 of our paper.

We finish by taking some time here to discuss smooth number estimates, which is perhaps the most important input to our work as well as all previous work on large gaps between primes (Westzynthius, in 1931, was the first to realize this connection). For $x \geq y \geq 1$, let $\Psi(x,y)$ be the number of integers $\leq x$ whose prime factors are all $\leq y$. Thus $\Psi(x,2)$ is the number of powers of $2$ that are at most $x$ and $\Psi(x , 3)$ is the number of integers of the form $2^a 3^b \leq x$. Estimating $\Psi(x,2)$ is straightforward and for $y$ is fixed, one can obtain an asymptotic for $\Psi(x,y)$ by counting lattice points in a simplex, as I describe in this previous blog post.

For our current problem, it is crucial that we are allowed to let $y$ depend on $x$. The important fact is that $\Psi(x,y)$ is much smaller than expected (by sieve theory heuristics). Rankin, in 1938, in his work on gaps between primes (see also: these set of notes) improved upon smooth number estimates to obtain better lower bounds for large gaps between primes. Westzynthius’ strategy, along with Rankin’s estimates, are still the starting points for current methods of constructing large gaps between primes.