Category Archives: Number Theory

Breaking the Square Root Barrier via the Burgess Bound

Thanks to Kyle Pratt for enlightening discussions leading to this post.

Recall the Burgess bound, which states that for any interval {I \subset \mathbb{F}_p} of size at least {p^{1/4 + \epsilon}}, one has

\displaystyle \left| \sum_{x \in I} \chi(x) \right| \ll \frac{|I|}{p^{\delta}}, \ \ \ \ |I| \geq p^{1/4+ \epsilon}\ \ \ \ \ (1)

where {\chi} is the Legendre symbol modulo {p}.

Conjecture 1: For any {\alpha > 0}, (1) holds for intervals {I} of size {p^{\alpha}}. { \ \ \ \spadesuit}

Conjecture 1 is one way to quantify the belief that {\chi} behaves like a random sign pattern. Indeed, (1) almost surely holds for a random {\pm 1}–valued function on {\mathbb{F}_p}.
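Here is a quick numerical sanity check of this heuristic, no substitute for the proofs (the prime, the starting point, and the interval length below are arbitrary choices of mine):

```python
import random

# Quick sanity check: a character sum over a short interval vs. a sum of
# random signs (the prime p, the starting point, and the length are arbitrary).
p = 999983  # a prime

def legendre(x, p):
    """Legendre symbol (x/p) via Euler's criterion, for p an odd prime."""
    s = pow(x % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

length = int(p ** 0.45)  # within the Burgess range, above p^{1/4}
start = 1234
chi_sum = sum(legendre(x, p) for x in range(start, start + length))
rand_sum = sum(random.choice([-1, 1]) for _ in range(length))
print(length, chi_sum, rand_sum)  # both sums are typically O(sqrt(length))
```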

Character sums are central objects of study in analytic number theory, and such a result would have wide applications. For instance, it would solve Vinogradov's least quadratic nonresidue problem. It would also have implications for sub–convexity bounds of {L}–functions, via

\displaystyle L(1/2 , \chi) = \sum_{n \leq \sqrt{p}} \chi( n) / \sqrt{n} + O(\log p).

For instance, summation by parts and (1) give the sub–convexity bound

\displaystyle L(1/2 , \chi) \ll p^{1/4 - \delta} .
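In slightly more detail, splitting into dyadic blocks (and absorbing the weight {1/\sqrt{n}} on each block by partial summation), a rough version of the computation is

\displaystyle \sum_{n \leq \sqrt{p}} \frac{\chi(n)}{\sqrt{n}} \ll \sum_{\substack{N = 2^j \leq \sqrt{p} \\ N \leq p^{1/4+\epsilon}}} \sqrt{N} + \sum_{\substack{N = 2^j \leq \sqrt{p} \\ N > p^{1/4+\epsilon}}} \frac{\sqrt{N}}{p^{\delta}} \ll p^{1/8 + \epsilon/2} + p^{1/4 - \delta},

where the short blocks are bounded trivially and the long blocks via (1).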

The key obstacle is the famous square root barrier. In {N} coin flips, one can change {\sqrt{N}} of the outcomes without affecting the global statistics, for instance in the central limit theorem. Thus, when applying global statistics to problems in short intervals, one naturally arrives at a square root barrier for the size of the interval.

In light of this, it is no surprise that the Pólya–Vinogradov inequality, that is, Conjecture 1 with {\alpha = 1/2 + \epsilon}, was proved about forty years before the Burgess bound. The Pólya–Vinogradov theorem can be proved using only the global statistics of {\chi}, for instance the fact that for all {\xi \in \mathbb{F}_p},

\displaystyle \left| \sum_{x \in \mathbb{F}_p} \chi(x) e_p(\xi x) \right| \leq \sqrt{p} , \ \ \ e_p(\theta) : = e^{2 \pi i \theta / p }.
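This complete sum is easy to inspect numerically; in fact, for the Legendre symbol its absolute value is exactly {\sqrt{p}} for every nonzero {\xi}. A minimal check (with the arbitrary choice {p = 101}):

```python
import cmath

# Check the complete (Gauss) sum bound: for the Legendre symbol, the
# absolute value equals sqrt(p) for every nonzero frequency xi.
p = 101

def legendre(x, p):
    s = pow(x % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

def gauss_sum(xi, p):
    return sum(legendre(x, p) * cmath.exp(2j * cmath.pi * xi * x / p)
               for x in range(p))

print(max(abs(gauss_sum(xi, p)) for xi in range(1, p)))  # ~ sqrt(101) = 10.0499...
```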

On the other hand, the Burgess bound breaks the square root barrier, so one needs additional ideas, which we now discuss. We first use {I+x \approx I} for small {x}. Then we use the multiplicativity of {\chi} to amplify an interval of size {p^{1/4 + \epsilon}} to a set of size {p^{1/2 + \epsilon}}. Finally, we use completion and apply a global statistic of {\chi} (Hasse-Weil bound). Without the amplification step, this method would not work. Let us sketch the proof.

By the near translation invariance of {I}, one has

\displaystyle \sum_{x \in I} \chi(x) =\sum_{x \in I} \chi(x + yt) + O(p^{1/4 + \epsilon / 2}),

for all {y \in J = \{1 , \ldots , \lfloor p^{1/4} \rfloor \}} and {t \in T = \{1 , \ldots ,\lfloor p^{\epsilon / 2} \rfloor \}}. But by the multiplicativity of {\chi},

\displaystyle \left| \sum_{y \in J ,t \in T} \sum_{x \in I} \chi(x + yt) \right| \leq \sum_{x \in I , y \in J} \left| \sum_{t \in T} \chi(y^{-1} x + t) \right| .\ \ \ \ \ (2)

Now the number of ways to write an element of {\mathbb{F}_p} as {x y^{-1}} with {x \in I} and {y \in J} is roughly one on average, on a set of size {p^{1/2 + \epsilon}}, so (2) is approximately

\displaystyle \sum_{u \in S} \left| \sum_{t \in T} \chi(u+ t) \right| ,

for some {S} of size {p^{1/2 + \epsilon}}. Since {S} is so large, we can now efficiently complete the sum after an application of Hölder's inequality, arriving at

\displaystyle \sum_{u \in \mathbb{F}_p} \left| \sum_{t \in T} \chi(u+ t) \right|^{2r} .
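Concretely, for any positive integer {r}, Hölder's inequality and the positivity of the summands give

\displaystyle \sum_{u \in S} \Big| \sum_{t \in T} \chi(u+t) \Big| \leq |S|^{1 - \frac{1}{2r}} \Big( \sum_{u \in \mathbb{F}_p} \Big| \sum_{t \in T} \chi(u+t) \Big|^{2r} \Big)^{\frac{1}{2r}} ,

so it suffices to bound the complete {2r}-th moment.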

This may be bounded by the following global statistic. We remark that in our case there is a short proof using the polynomial method (the proof in the note is only for elliptic curves, but it is easily generalized to our situation).

Lemma 1: Let {f} be a polynomial over {\mathbb{F}_p} which is not a constant multiple of a perfect square. Then

\displaystyle \sum_{x \in \mathbb{F}_p} \chi (f(x)) = O_{{\rm deg}(f)} (\sqrt{p}).

In particular,

\displaystyle \sum_{u \in \mathbb{F}_p} \left| \sum_{t \in T} \chi(u+ t) \right|^{2r} \ll_r p^{1/2} |T|^{2r} + p |T|^r. \ \ \spadesuit
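As a quick numerical check of Lemma 1, one can take the cubic {f(x) = x(x+1)(x+2)}, where the sum counts points on an elliptic curve and Hasse's bound gives the constant 2 (the script below is only an illustration, with an arbitrary small prime):

```python
# Check of Lemma 1 for f(x) = x(x+1)(x+2): the complete character sum
# should have absolute value at most 2*sqrt(p) by Hasse's bound.
p = 1009

def legendre(x, p):
    s = pow(x % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

total = sum(legendre(x * (x + 1) * (x + 2), p) for x in range(p))
print(total, 2 * p ** 0.5)  # |total| should be at most 2*sqrt(1009) ~ 63.5
```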

Putting this all together, as done beautifully in Chang’s survey, one arrives at the Burgess bound of the form (1), with {\delta = \epsilon^2 / 4}.

Another successful instance of breaking the square root barrier is in this paper of Hanson. One result he proved was for any {\delta > 0},

\displaystyle \sum_{a \in A , b \in B , c \in C} \chi(a + b + c) = o_{\delta} (|A||B||C|),

for arbitrary {A, B , C} of size at least {\delta \sqrt{p}}. He takes advantage of the additive structure of the problem, as was pioneered in this paper of Chang using Freiman’s theorem from additive combinatorics. This has the same flavor as the Burgess bound in (1), as one replaces the very additively structured interval {I} with a sumset. The main problem along these lines is the following.

Conjecture 2: For any {A, B} of size at least {p^{\alpha}}, there is a {\delta >0} such that

\displaystyle \sum_{a \in A , b \in B} \chi (a + b) \ll |A| |B| p^{-\delta}. \ \ \spadesuit

This would have an application to sparse signal recovery. One can see Bourgain’s work and the references within for the dual problem, where the multiplicative character is replaced by an additive character and the sumset is replaced by a product set.

Granville and Soundararajan (Theorem 2), building upon work of Montgomery and Vaughan, showed that, assuming GRH, one has

\displaystyle \sum_{n \leq x} \chi (n) \sim \sum_{n \leq x} \chi(n) \Psi_y(n) , \ \ \ y \geq \log^5 q ,

where {\Psi_y} is the characteristic function of {y}-smooth numbers. Thus the sum in (1) (with {I = [1 , \ldots , x]}) is determined solely by the values of {\chi} at rather small primes. Using smooth number estimates, this implies Conjecture 1 in the case {I = [1, \ldots , x]}.
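As an illustration only (the theorem is asymptotic and conditional, and the parameters below are far too small for it to bite), one can compare the two sums numerically:

```python
# Toy comparison of the full character sum with its restriction to
# y-smooth integers; p, x, y are arbitrary small choices, so only rough
# agreement should be expected.
p = 10007
x, y = 2000, 30

def legendre(n, p):
    s = pow(n % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

def smooth(n, y):
    for d in range(2, y + 1):
        while n % d == 0:
            n //= d
    return n == 1

full = sum(legendre(n, p) for n in range(1, x + 1))
restricted = sum(legendre(n, p) for n in range(1, x + 1) if smooth(n, y))
print(full, restricted)
```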

Finally, we remark that it is known that GRH implies Conjecture 1.

Fractal solutions of dispersive partial differential equations on the torus

Burak Erdoğan and I just put “Fractal solutions of dispersive partial differential equations on the torus” on the arXiv. We study cancellation in exponential sums and how this leads to bounds for the fractal dimension of solutions to certain PDE, the ultimate “square root cancellation” implying exact knowledge of the dimension.

In this post, we consider the case of Schrödinger’s equation with initial data {\chi := \chi_{[0 , \pi]}} for simplicity. Here,

\displaystyle i f_t + f_{xx} = 0 , \ \ t \in \mathbb{R}, \ \ x \in \mathbb{R} / 2 \pi \mathbb{Z} , \ \ f(0,x) = \chi(x) ,

and the solution is given by

\displaystyle f(t,x) = \sum_{n \in \mathbb{Z}} \widehat{\chi}(n) e^{i t n^2 + i n x}.

Note the solution is periodic in both {x} and {t}.

For a line {\mathcal{L} \subset \left( \mathbb{R} / 2 \pi \mathbb{Z} \right)^2}, we are interested in the fractal dimension of the real and imaginary parts of the graph of {f |_{\mathcal{L}} (t,x)}. This dimension must lie in the interval {[1,2]}.

An important case to consider is {t = a/q}, which leads to so–called quantization and is related to the Talbot effect. In this case, it was shown by Berry and Klein that

\displaystyle f(t,x) = \sum_{j=0}^{q-1} c_j \chi_{[0 , 1/q] + j/q} (x), \ \ \ \ \ (1)

for some {c_j \in \mathbb{C}} that are Gauss sums. Thus {f} is a linear combination of indicators of at most {q} intervals and so has fractal dimension 1. See page 4 of this paper of Chen and Olver for some pictures of this phenomenon.
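A numerical sketch of this quantization (a toy script of mine, under stated assumptions: I take the time to be {2\pi a/q} so that {e^{itn^2}} is {q}-periodic in {n}, use the Fourier coefficients {\widehat{\chi}(0) = 1/2}, {\widehat{\chi}(n) = (1-(-1)^n)/(2\pi i n)}, and truncate the series at {|n| \leq N}):

```python
import numpy as np

# Talbot effect sketch: at time 2*pi*a/q the truncated solution should be
# (approximately) piecewise constant on O(q) intervals.
def chi_hat(n):
    if n == 0:
        return 0.5
    if n % 2 == 0:
        return 0.0
    return 1 / (np.pi * 1j * n)  # Fourier coefficient of the indicator of [0, pi]

def f(t, x, N=2000):
    ns = np.arange(-N, N + 1)
    coeffs = np.array([chi_hat(n) for n in ns])
    phases = np.exp(1j * t * ns ** 2 + 1j * np.outer(x, ns))
    return phases @ coeffs

a, q = 1, 3
x = (np.arange(12) + 0.5) * (2 * np.pi / 12)  # sample away from the jumps
for xi, v in zip(x, f(2 * np.pi * a / q, x)):
    print(f"x = {xi:4.2f}  f = {v.real:+.3f}{v.imag:+.3f}i")
```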

The story is entirely different for irrational {t}. To see why, observe that for a sequence {\frac{a_n}{q_n} \rightarrow t}, the functions {f(\frac{a_n}{q_n} , x)} given by (1) increase in complexity as {q_n} increases.

To make this precise, we use the theory of Besov spaces and show that bounds for the {L^p} norm of

\displaystyle H_N(t,x) : = \sum_{N \leq n < 2N} e^{itn^2 + i nx}, \ \ (t,x) \in \mathcal{L},

imply fractal dimension bounds for the real and imaginary parts of {f_{\mathcal{L}}(t,x)}.

Remark: The starting point of the theory of Besov spaces that we use can be illustrated by the following basic facts from analysis:

  • The graph of a real valued {C^{\gamma}} function has fractal dimension {\leq 2 - \gamma},
  • For {g \in C^{\gamma}}, one has {\widehat{g}(n) \ll n^{-\gamma}}.

Thus we see that both fractal dimension and Fourier decay are related to the smoothness of {g}. In our situation, we have more information than the conclusion of the second bullet; that is, we know that

\displaystyle \sum_{N \leq n < 2N} \widehat{g}(n),

exhibits cancellation and so we are able to use more involved theory to roughly reverse the implication in the second bullet as well as provide lower bounds. ♠

Let us first consider the case of horizontal lines in space–time, that is {t} fixed. Then by orthogonality, Weyl’s inequality, and the divisor bound, one has for almost every {t},

\displaystyle ||H_N||_{L_x^2} = N^{1/2} , \ ||H_N||_{L_x^{\infty}} \ll N^{1/2 + \epsilon}, \ ||H_N||_{L_x^4} \ll N^{1/2 + \epsilon}.

By a previous blog post, this implies

\displaystyle ||H_N||_{L_x^1} \gg N^{1/2 - \epsilon}.
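These bounds are easy to probe numerically; a minimal sketch, where {t = \sqrt{2}} is an arbitrary stand-in for a generic time and the norms are with respect to normalized measure on {[0, 2\pi)}:

```python
import numpy as np

# Numerical probe of the L^p norms of H_N(t, x) over x for a fixed
# "generic" time t; the L^2 norm is exactly sqrt(N) by orthogonality.
N, t = 512, np.sqrt(2)
x = np.linspace(0, 2 * np.pi, 4096, endpoint=False)
ns = np.arange(N, 2 * N)
H = np.exp(1j * (t * ns ** 2)[None, :] + 1j * np.outer(x, ns)).sum(axis=1)

for p_exp in (1, 2, 4):
    print(p_exp, np.mean(np.abs(H) ** p_exp) ** (1 / p_exp))
print("sup:", np.abs(H).max(), " sqrt(N):", N ** 0.5)
```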

One can use this to show that the fractal dimension of {f_{\mathcal{L}}(t,x)} is equal to {3/2} for almost every horizontal line, recovering a theorem of Rodnianski. In one part of our paper, we adapt the above strategy to show the following.

Theorem (oblique): The fractal dimension of the real and imaginary parts of the solution to the Schrödinger equation above, restricted to almost every line

\displaystyle \mathcal{L} = \{(t,x) : a t + bx = c\},

is in the interval

\displaystyle [7/4 , 19/10]. ♠

Note that the lower bound 7/4 is bigger than the 3/2 for horizontal lines. To show the upper bound in Theorem (oblique), we study exponential sums such as

\displaystyle \sup_x |\sum_{n \leq N} e^{i(c+x)n^2 + i nx}| , \ \ c \in \mathbb{R} / 2 \pi \mathbb{Z}.

In this case, square root cancellation would imply the fractal dimension is exactly {7/4}.

In our paper, we study a variety of dispersive PDE, for instance the Airy, KdV, non–linear Schrödinger, free water boundary, Boussinesq, and gravity–capillary wave equations. To handle non–linear equations, we use smoothing estimates, which roughly state that the solution to a nonlinear PDE is the linear evolution plus a smoother perturbation.

Gaps in prime-like sets

This post was inspired by a question of Trevor Wooley after a talk of James Maynard at MSRI. He asked what was known for lower bounds of large gaps between integers that have a representation as the sum of two squares. We assume some familiarity with the large gaps between primes problem.

A theorem of Fermat asserts that an integer n is represented as a sum of two squares if and only if every prime factor of n congruent to 3 modulo 4 appears with an even power.

For technical simplicity, we consider the set of integers that simply have no prime factors congruent to 3 modulo 4, but one may extend the considerations below to the original question.

The size of such a set is proportional to x / \log^{1/2} x. Thus the average gap is \asymp \log^{1/2} x, and so the largest gap is at least this big.

Surprisingly, there is an easy argument that yields a gap of size \gg \log x. We sketch it here. For each prime p \leq \log x congruent to 3 modulo 4, choose a residue class a_p \pmod p. Using a greedy algorithm to choose the a_p, we may sieve out all the integers in \{1 , \ldots , y\} for any y \ll \log x. By the Chinese remainder theorem, there is a t \leq x such that t \equiv - a_p \pmod p for all the relevant p. Thus t+1 , \ldots , t+y all have a prime factor congruent to 3 modulo 4, as desired.
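The greedy sieve and the CRT step are concrete enough to run; here is a minimal sketch with toy parameters (z = 200 and y = 30 are illustrative choices of mine, not the quantitative ones from the argument):

```python
# Greedy covering of {1,...,y} by residue classes mod primes p = 3 (mod 4),
# followed by CRT to produce a run t+1,...,t+y avoiding the target set.
def primes_up_to(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(sieve[i * i :: i]))
    return [i for i in range(n + 1) if sieve[i]]

z, y = 200, 30
ps = [p for p in primes_up_to(z) if p % 4 == 3]

uncovered, choices = set(range(1, y + 1)), {}
for p in ps:
    if not uncovered:
        break
    # Greedily pick the residue class mod p hitting the most uncovered n.
    a = max(range(p), key=lambda r: sum(n % p == r for n in uncovered))
    choices[p] = a
    uncovered -= {n for n in uncovered if n % p == a}
assert not uncovered, "increase z or decrease y"

# CRT: find t with t = -a_p (mod p) for every prime used.
t, M = 0, 1
for p, a in choices.items():
    t += M * ((-a - t) * pow(M, -1, p) % p)
    M *= p
t %= M

# Every element of t+1, ..., t+y has a prime factor = 3 (mod 4).
assert all(any((t + n) % p == 0 for p in choices) for n in range(1, y + 1))
print("t =", t, " gap length y =", y)
```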

What should one expect? Adopting the Cramer model, we expect that there should be gaps of size \gg \log^{3/2 - \epsilon} x. We give a rough sketch as to why. Let S = \{ n \leq x : p|n \implies p \not\equiv 3 \pmod 4\}. Model S by including each integer n \leq x independently at random with probability q = 1 / \log^{1/2} x. Then the probability that n+1 , \ldots , n+h \notin S is (1- q)^h, which is well approximated by e^{-qh}. If h \asymp (\log x) / q = \log^{3/2} x, then e^{-qh} is comparable to the reciprocal of the number of choices for n, so we expect such a gap to occur. One can see the Cramer model worked out more carefully here.
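This prediction is cheap to simulate; the script below (with arbitrary parameters) includes each n \leq x with probability q and reports the largest gap against \log^{3/2} x:

```python
import random, math

# Toy simulation of the Cramer model: include each n <= x independently
# with probability q = 1/sqrt(log x) and record the largest gap.
random.seed(0)
x = 10 ** 6
q = 1 / math.sqrt(math.log(x))
biggest, last = 0, 0
for n in range(1, x + 1):
    if random.random() < q:
        biggest, last = max(biggest, n - last), n
print(biggest, math.log(x) ** 1.5)  # the two should be roughly comparable
```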

Fix q and a coprime to q. Now we consider the set S = \{n \leq x : p|n \implies p \not\equiv a \pmod q\}. The Cramer model, as above, suggests that there should be gaps between elements of S that are \leq x of size \log^{1 + 1/\phi (q) - \epsilon } x. Nevertheless, we may utilize the trivial bound above to obtain gaps of size \gg \log x. Thus, for these S, we can come arbitrarily close (i.e. within an arbitrarily small power of \log) to the lower bounds associated to Cramer’s conjecture.

Note that for the primes, we are nowhere near Cramer’s conjecture. Ford, Green, Konyagin, Maynard, and Tao have the best known lower bound. Things look worse for upper bounds, as no one knows how to improve upon a power of x for the primes or for any of the sets mentioned in this post (even assuming the Riemann hypothesis).

An introduction to Vinogradov’s mean value theorem

VMT-Intro: pdf file containing the post

A link to a talk I gave on the topic. The talk is much more elementary than the blog post.


Vinogradov’s mean value theorem (recently proved by Bourgain, Demeter and Guth) is a beautiful problem in the intersection of number theory and harmonic analysis.

As a problem in number theory, the mean value theorem asks one to show that most of the solutions to a certain system of Diophantine equations of the form f_j(\vec{x}) = f_j(\vec{y}), for some integral polynomials f_1 , \ldots , f_k, are diagonal, that is, of the form \vec{x} = \vec{y}. As a problem in harmonic analysis, the mean value theorem asks for an upper bound for the L^p norm of a certain function. These two problems turn out to be equivalent, as explained in the pdf linked at the top of the post, thanks to Fourier analytic identities.
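For concreteness, in the classical case the system reads

x_1^j + \cdots + x_s^j = y_1^j + \cdots + y_s^j, \qquad 1 \leq j \leq k, \quad 1 \leq x_i , y_i \leq N,

and the theorem of Bourgain, Demeter, and Guth asserts that the number of solutions is \ll N^{s + \epsilon} + N^{2s - k(k+1)/2 + \epsilon}; in particular, for s \leq k(k+1)/2 the diagonal solutions account for essentially all solutions, up to a factor of N^{\epsilon}.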

One goal in the above pdf is to understand the nature of the so-called “critical exponent.” Interpolation reveals that Vinogradov’s mean value theorem follows from an L^{k(k+1)} bound on a certain function. While a first step to understanding this critical exponent is interpolation, consideration of the major arcs gives proper insight into why k(k+1) appears.

In the final section, I attempt to explain how the mean value theorem can be interpreted as a stronger form of orthogonality of certain complex exponentials. For a vector n \in \mathbb{Z}^k, we define e_n : \mathbb{R}^k / \mathbb{Z}^k \to \mathbb{C} via e_n(\alpha) : = e^{2 \pi i n \cdot \alpha}. Then Vinogradov’s mean value theorem can be interpreted as showing that \{e_{(j , \ldots , j^k)}\}_{1 \leq j \leq N} are “stronger than orthogonal” (for k \geq 2; this fails for k = 1). We make this somewhat precise in the above pdf, adopting a probabilistic perspective.

I’d like to thank Chris Gartland, a fellow graduate student, for helping me formulate the ideas in the last section. For instance, it was his idea to utilize equation 5.

I’d also like to thank Junxian Li, a fellow graduate student in number theory, for very useful discussions regarding the major arcs.

Lastly, anyone at UIUC interested in hearing more about Vinogradov’s mean value theorem (for instance Bourgain, Demeter and Guth’s recent result or classical number theoretic methods), please get in touch with me. My email can be found here, or you can visit my office in Altgeld 169.


A lower bound for the least prime in an arithmetic progression

Junxian Li, Kyle Pratt, and I recently uploaded our paper A lower bound for the least prime in an arithmetic progression to the arXiv.

Here is a file where the heuristics considered in section 2 of the paper are developed in a slightly simpler situation.

Given a positive integer k and an integer \ell coprime to k, define p(k , \ell) to be the smallest prime congruent to \ell modulo k. We are interested in the worst case behavior, that is, P(k) := \max_{(\ell , k) = 1} p(k , \ell). Thus P(5) = 19 and P(12) = 13. In particular, we are interested in lower bounds for P(k) for large k. An elementary observation, due to Pomerance, in Theorem 1 shows roughly that to find lower bounds for P(k), it is enough to find lower bounds for the Jacobsthal function (the “roughly” will be explained below). For an integer m, the Jacobsthal function g(m) is the largest difference between consecutive integers coprime to m.
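Both quantities are easy to compute by brute force for tiny arguments; a minimal sketch (checking the stated values P(5) = 19 and P(12) = 13):

```python
from math import gcd, isqrt

# Brute-force P(k) (worst-case least prime in a progression) and the
# Jacobsthal function g(m), for very small arguments only.
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

def p(k, l):
    n = l
    while not is_prime(n):
        n += k
    return n

def P(k):
    return max(p(k, l) for l in range(1, k) if gcd(l, k) == 1)

def g(m):
    coprime = [n for n in range(1, 2 * m + 2) if gcd(n, m) == 1]
    return max(b - a for a, b in zip(coprime, coprime[1:]))

print(P(5), P(12))   # 19 13
print(g(2 * 3 * 5))  # largest gap between integers coprime to 30, namely 6
```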

In recent work on long gaps between primes, Ford, Green, Konyagin, Maynard, and Tao improve upon lower bounds for g(m) where m is the product of the first u primes (they also mention the connection to the least prime problem; indeed, it was Kevin Ford who originally introduced us to the problem). The key difference in the current problem is that we seek lower bounds for g(m) where m is the product of the first u primes that are coprime to k. Our main new idea is to modify the sieve weights of Maynard used in that work. We outline our approach in section 4 of our paper.

We finish by taking some time here to discuss smooth number estimates, which are perhaps the most important input to our work, as well as to all previous work on large gaps between primes (Westzynthius, in 1931, was the first to realize this connection). For x \geq y \geq 1, let \Psi(x,y) be the number of integers \leq x whose prime factors are all \leq y. Thus \Psi(x,2) is the number of powers of 2 that are at most x, and \Psi(x , 3) is the number of integers of the form 2^a 3^b \leq x. Estimating \Psi(x,2) is straightforward, and for fixed y one can obtain an asymptotic for \Psi(x,y) by counting lattice points in a simplex, as I describe in this previous blog post.
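For toy ranges, \Psi(x,y) can be computed directly; a minimal brute-force sketch (the function names are mine):

```python
# Brute-force count of y-smooth numbers up to x (fine for toy ranges only).
def psi(x, y):
    def smooth(n):
        for d in range(2, y + 1):
            while n % d == 0:
                n //= d
        return n == 1
    return sum(1 for n in range(1, x + 1) if smooth(n))

print(psi(1000, 2))  # powers of 2 up to 1000: 1, 2, 4, ..., 512 -> 10
print(psi(1000, 3))  # numbers of the form 2^a 3^b up to 1000
```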

For our current problem, it is crucial that we are allowed to let y depend on x. The important fact is that \Psi(x,y) is much smaller than expected (by sieve theory heuristics). Rankin, in 1938, in his work on gaps between primes (see also this set of notes), improved upon smooth number estimates to obtain better lower bounds for large gaps between primes. Westzynthius’ strategy, along with Rankin’s estimates, is still the starting point for current methods of constructing large gaps between primes.