Breaking the 6/5 threshold for sums and products modulo a prime

Ilya Shkredov and I just released our preprint, “Breaking the 6/5 threshold for sums and products modulo a prime.”

Since this paper of Roche–Newton, Rudnev, and Shkredov, the best bound for the sum–product problem in {\mathbb{F}_p} has been the following.

Theorem (Sum–Product): Let {A \subset \mathbb{F}_p} be of size at most {p^{5/8}}. Then

\displaystyle |A+A| + |AA| \gg |A|^{6/5}. \ \ \spadesuit

The major breakthrough was a point–plane incidence theorem of Rudnev. Let us recall their proof, which has been simplified since the original argument.

We consider the set of points and lines

\displaystyle P = (A+A) \times (AA), \ \ \ \mathcal{L} = \{(u,v) \in \mathbb{F}_p^2 : v = a(u - c)\}_{a , c \in A}.

Note that each line contains the points {(b + c , ab)} for every {b \in A}, and so there are at least {|A|^3} incidences. On the other hand, one may apply a point–line incidence theorem, for instance the Cartesian version found in Theorem 4 of this paper of Stevens and de Zeeuw, to obtain an upper bound for the number of incidences between {P} and {\mathcal{L}}. Combining these two bounds gives Theorem (Sum–Product).
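This construction is small enough to check by brute force; here is a minimal sketch (my code, not from the paper, with arbitrary choices of {p} and {A}) verifying the lower bound on incidences.

```python
from itertools import product

# Brute-force check of the incidence count for the system above,
# with small arbitrary choices of p and A.
p = 31
A = [1, 2, 3, 5]

sums = {(a + b) % p for a, b in product(A, A)}    # A + A
prods = {(a * b) % p for a, b in product(A, A)}   # AA
points = set(product(sums, prods))                # P = (A+A) x (AA)

incidences = 0
for a, c in product(A, A):                        # one line per pair (a, c)
    incidences += sum(1 for u, v in points if v == (a * (u - c)) % p)

# each line contains (b + c, a*b) for every b in A, so I(P, L) >= |A|^3
assert incidences >= len(A) ** 3
```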

We are able to make the following improvement.

Theorem 1: Let {A \subset \mathbb{F}_p} be of size at most {p^{3/5}}. Then

\displaystyle |A+A| + |AA| \gg |A|^{6/5 + c}, \ \ \ c = 4/305. \ \ \spadesuit

We can improve this application of a Szemerédi–Trotter type bound in the specific case of the sum–product problem. The basic idea is to run the above proof so as to obtain a bound for the {\tau}–rich lines, instead of for incidences. We then obtain a bound for the higher order energy:

\displaystyle d_4^+(A) \lesssim |AA|^2 |A|^{-2} , \ \ d_4^+(A) := \sup_{B \neq \emptyset} E_4^+(A,B) |A|^{-1} |B|^{-3}.\ \ \ \ \ (1)

A simple application of Cauchy–Schwarz gives

\displaystyle |A|^{4/3} d_4^+(A)^{-1/3} \leq |A+A|. \ \ \ \ \ (2)

Combining (1) and (2) recovers the exponent {6/5} of Theorem (Sum–Product). We are able to make an improvement to (2) and in turn to the sum–product problem.
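The Hölder step behind (2) can be checked directly in the form {|A|^8 \leq \left(\sum_x r_{A+A}(x)^4\right)|A+A|^3}, with the fourth moment of {r_{A+A}} standing in for the energy. A quick sketch (my code, arbitrary sample set):

```python
from collections import Counter
from itertools import product

# The Hölder step behind (2) in direct form: since
# |A|^2 = sum_x r_{A+A}(x) <= (sum_x r_{A+A}(x)^4)^{1/4} |A+A|^{3/4},
# we must have |A|^8 <= (sum_x r_{A+A}(x)^4) * |A+A|^3.
A = [1, 2, 4, 8, 13, 21, 34]
r = Counter(a + b for a, b in product(A, A))   # r_{A+A}
lhs = len(A) ** 8
rhs = sum(v ** 4 for v in r.values()) * len(r) ** 3
assert lhs <= rhs
```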


Covering sumsets and Plünnecke’s inequality

Recall Plünnecke’s inequality, which states the following. We write {mA} for the {m}–fold sumset of {A}.

Theorem (Plun): Let {A} be a finite subset of an abelian group. Let {K = |A+A| |A|^{-1}} be the doubling constant. Then

\displaystyle |mA | \leq K^{m}|A|. \ \ \ \spadesuit

We describe one natural attempt to prove Theorem (Plun). Suppose that

\displaystyle A+A \subset A+X,

for some small {X}. Then

\displaystyle A+A+A \subset A+X+X,\ \ \ \ \ (1)

and so

\displaystyle |A+A+A| \leq |A| |X|^2.

This motivates the following definition.

Definition (Cov): Let {A} be a finite subset of an abelian group. We let

\displaystyle C(A) = \min \{|X| : A+A \subset A+X , \ X \subset A\}. \ \ \ \spadesuit

Clearly {C(A) \leq |A|}, since we may choose {X = A}. Note that

\displaystyle C(A) \geq |A+A| |A|^{-1},\ \ \ \ \ (2)

since if {A+A \subset A+X}, then {|A+A| \leq |A| |X|}. One natural question is to what extent one can reverse (2). By the argument in (1),

\displaystyle |mA| \leq C(A)^{m-1} |A|,

and so this would have implications for Plünnecke’s inequality.
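As an illustration, one can brute-force {C(A)} for a small set and confirm the covering bound {|mA| \leq C(A)^{m-1}|A|}. This is my own sketch with an arbitrary test set:

```python
from itertools import combinations

# Brute force of Definition (Cov): C(A) = min{|X| : A+A in A+X, X in A}
# for a small arbitrary A, plus the covering bound |mA| <= C(A)^(m-1)*|A|.
A = [0, 1, 2, 4, 7]

def sumset(S, T):
    return {s + t for s in S for t in T}

AA = sumset(A, A)

def cover_number(A):
    for size in range(1, len(A) + 1):
        for X in combinations(A, size):
            if AA <= sumset(A, X):
                return size

c = cover_number(A)          # here c = 4, while |A+A| / |A| = 12/5
mA = set(A)
for m in range(2, 6):
    mA = sumset(mA, A)       # now mA is the m-fold sumset
    assert len(mA) <= c ** (m - 1) * len(A)
```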

The truth is one cannot reverse (2). We give an example due to Ruzsa.

Example: We construct a set {Y} in {\mathbb{Z}^3} via

\displaystyle Y := \left( [n] e_1 + [n] e_2 + [n] e_3 \right) \cup [n^2] e_1 \cup [n^2] e_2 \cup [n^2] e_3.


We have the following properties of {Y}:

\displaystyle |Y| \asymp n^3 , \ \ |Y+Y| \asymp n^4 = n |Y| , \ \ C(Y) \asymp n^2 \asymp n |Y+Y| |Y|^{-1}.

Note that {|Y+Y|} comes from the sum of a line of size {n^2} with the box, while {C(Y)} comes from the sum of two perpendicular lines of size {n^2}. The point is that one cannot efficiently cover a two dimensional object with either a one dimensional object or a three dimensional object, even in the presence of additive structure. {\spadesuit}

We do have {C(A) \lesssim |A+A| |A|^{-1}} statistically in the following sense.

Proposition (Stat): Let

\displaystyle S = \{x \in A+A : r_{A+A}(x) \geq |A|^2 |A+A|^{-1}\}.


Then there is an {X \subset A} with

\displaystyle S \subset A+X, \ \ \ |X| \leq 3|A+A| |A|^{-1} \log |A|. \ \ \spadesuit

This should be compared to this post of Tao, where he talks about an entropy version of Plünnecke’s inequality.

Proof: We form {X \subset A} randomly by including each element of {A} independently with probability {p}, so that {X} has expected size {p |A|}. Let {d = |A|^2 |A+A|^{-1}}. Then for {s \in S},

\displaystyle \mathbb{P}(s \notin A+X) = (1-p)^{r_{A+A}(s)} \leq (1-p)^{d} \leq \exp{(-pd)}.

We choose {p = 3d^{-1} \log |A| =3 |A+A| |A|^{-2} \log |A|}. If {p > 1}, the Proposition follows from taking {X=A} and so we assume the opposite. Let {Z} be the random variable that denotes the number of {s \in S} not covered by {A+X}. By the union bound,

\displaystyle \mathbb{E}[Z] \leq |A+A| \exp{(-pd)} \leq |A|^{-1}.

Thus there is a choice of {X} so that {Z = 0} and thus {S \subset A+X}. {\spadesuit}
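This argument is easy to simulate. Below is my own sketch (an arbitrary arithmetic progression as the structured test set, and a retry loop standing in for "there is a choice of {X}"):

```python
import math
import random
from collections import Counter
from itertools import product

# Simulation of the proof of Proposition (Stat): include each element of A
# in X independently with probability p0 = 3 log|A| * |A+A| / |A|^2, then
# check that the popular sums S are covered by A + X.
random.seed(0)
A = list(range(50))                            # an AP, so |A+A| is small
r = Counter(a + b for a, b in product(A, A))   # r_{A+A}
S = {s for s, cnt in r.items() if cnt >= len(A) ** 2 / len(r)}

p0 = min(1.0, 3 * math.log(len(A)) * len(r) / len(A) ** 2)
for attempt in range(10):   # retry: a random X works with high probability
    X = [a for a in A if random.random() < p0]
    covered = {a + x for a, x in product(A, X)}
    if S <= covered:
        break
assert S <= covered
```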

On the other hand, we can bound {C(A)} in terms of the doubling constant if we are allowed to take {X} to be a subset of the smoother set {A+A-A}.

Proposition (Smooth): Let {A} be a finite subset of an abelian group. Then {A+A \subset A+X} where {X \subset A+A-A} and {|X| \ll |A+A-A||A|^{-1} \log |A|}. Thus {A+A} can be covered by {\ll K^3 \log |A|} translates of {A}.

The proof of Proposition (Smooth) is similar to that of Proposition (Stat), so we omit it.

This is related to the following annoying question. We let {m \cdot A} be the dilate of {A} by {m}. We know from Plünnecke’s inequality that

\displaystyle |A+ 2 \cdot A| \leq |A+A+A| \leq K^3 |A| , \ \ K = |A+A| |A|^{-1}.\ \ \ \ \ (3)

This inequality came up in Lemma 5.1 of the almost periodicity paper of Croot and Sisask.

Question (aplus2a): Does there exist a {\delta < 3} such that

\displaystyle |A+2 \cdot A| \leq K^{ \delta} |A|? \ \ \ \spadesuit

This was asked in a paper of Bukh and appears on his website. Note that both of the inequalities in (3) can be basically tight. For the first one, consider an arithmetic progression, and for the second, the example given above.

We now give a construction that shows one cannot take {\delta} too small in Question (aplus2a), even for large {A}, using the tensor power trick.

Example: Consider {A = \{0,1\}}. Then {|A+A| = 3} and {|A+2 \cdot A| = 4}. This example shows that one cannot take {\delta} smaller than

\displaystyle \log (2) / \log (3/2) \approx 1.71.

To make large examples, consider {A = \{0,1\}^d} for some large {d}.
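This example is small enough to verify by machine; the following check (my code) confirms {|A+A| = 3^d} and {|A+2\cdot A| = 4^d} for {A = \{0,1\}^d}.

```python
import math
from itertools import product

# Machine check of the tensor-power example: for A = {0,1}^d,
# |A+A| = 3^d and |A + 2·A| = 4^d.
for d in range(1, 5):
    A = list(product([0, 1], repeat=d))
    sums = {tuple(x + y for x, y in zip(u, v)) for u in A for v in A}
    dilated = {tuple(x + 2 * y for x, y in zip(u, v)) for u in A for v in A}
    assert len(sums) == 3 ** d
    assert len(dilated) == 4 ** d

delta = math.log(2) / math.log(3 / 2)   # the resulting lower bound, about 1.7095
assert 1.70 < delta < 1.71
```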

Monochromatic Hilbert cubes and arithmetic progressions

József Balogh, Misha Lavrov, Adam Wagner, and I recently uploaded our paper “Monochromatic Hilbert cubes and arithmetic progressions” to the arXiv.

Van der Waerden’s theorem is a central result in arithmetic Ramsey theory. The goal is to find bounds for {W(k,r)}, the least integer such that any {r}–coloring of {1 , \ldots , W(k,r)} contains a monochromatic {k}–term arithmetic progression. The current records are of the form

\displaystyle c\cdot r^{k-1} \leq W(k,r) \leq 2^{2^{r^{2^{2^{k+9}}}}}.

The upper bound is due to Gowers, while the lower bound is due to Kozik and Shabanov (see also Blankenship, Cummings, and Taranchuk for the prime case).

A Hilbert cube, or affine {k}–cube, is a subset of the integers of the form

\displaystyle \left\{x_0 + \sum_{b \in B} b : B \subseteq A\right\} =: x_0 + \Sigma^* A,

for some {A} with {|A|=k} and {x_0 \in \mathbb{Z}}. Hilbert cubes are an important substructure in arithmetic Ramsey theory, coming up in the Gowers uniformity norms, the density Hales–Jewett theorem, and Szemerédi’s original proof of his theorem. We set {h(k,r)} to be the least integer such that any {r}–coloring of {1 , \ldots , h(k,r)} contains a monochromatic affine {k}–cube. The current records are of the form

\displaystyle r^{ck^2} \leq h(k,r) \leq O(r)^k.

The lower bound is due to Conlon, Fox, and Sudakov, while the upper bound was originally due to Szemerédi and is the so–called Szemerédi cube lemma. Note that the gap in bounds for Hilbert cubes is much less than the gap in Van der Waerden’s problem.

Improving the upper bound would be of interest, and there are a plethora of proofs for it. For instance, one nice geometric proof can be found in this paper of Graham and Solymosi, and a standard averaging argument can be found in these notes of de Zeeuw.

We instead consider improving the lower bound. A first try is to randomly {2}–color the integers {1 ,\ldots , n} and show that with high probability there are no monochromatic affine {k}–cubes. An affine {k}–cube of the form {x_0 + \Sigma^* A} is monochromatic with probability

\displaystyle \frac{2}{2^{|\Sigma^* A|}}.

It turns out that if every {A} were dissociated, that is, having {|\Sigma^* A| = 2^k}, then we could complete this argument and greatly improve upon the Conlon–Fox–Sudakov bound. The problem is with sets of the form {x_0 + \Sigma^* A}, where {\Sigma^* A} is small. The question is how small. It turns out that {|\Sigma^* A| \geq {k +1 \choose 2}}. To make improvements, we must consider the case {|\Sigma^* A| \leq k^{2 + \delta}}. But then, in light of Littlewood–Offord theory, one might expect {A} to look a lot like an arithmetic progression, and thus {\Sigma^* A} should also be like an arithmetic progression. Thus we are led back to Van der Waerden’s problem. Our main tools for making this precise are this paper of Nguyen and Vu and this paper of Szemerédi and Vu.
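The two extremes for {|\Sigma^* A|} can be checked directly: powers of {2} form a dissociated set attaining {2^k}, while the progression {A = \{1,\ldots,k\}} attains only {k(k+1)/2 + 1}, essentially the minimum {{k+1 \choose 2}}. A quick sketch (my code):

```python
# Subset-sum counts: a dissociated set (powers of 2) attains |S*A| = 2^k,
# while the AP A = {1,...,k} attains only k(k+1)/2 + 1.
def subset_sums(A):
    sums = {0}
    for a in A:
        sums |= {s + a for s in sums}
    return sums

k = 8
dissociated = [2 ** i for i in range(k)]
assert len(subset_sums(dissociated)) == 2 ** k
assert len(subset_sums(range(1, k + 1))) == k * (k + 1) // 2 + 1
```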

Our main theorem is the following.

Theorem 1: Let {k \geq 3} be an integer. Then for every {\epsilon >0}, there is a {c > 0} such that

\displaystyle h(k,4) \ge \min\{W(\lfloor c k^2 \rfloor, 2), 2^{k^{2.5-\epsilon}}\}.

Theorem 1 asserts that either

  • (i) the best lower bound for {W(k,2)} is far from sharp and {h(k,4)} is larger than the Conlon, Fox, Sudakov bound,
  • (ii) the best lower bound for {W(k,2)} is nearly sharp and so the Hilbert cube number is very close to the Van der Waerden number.

Graham conjectured that {W(k,2)} should be closer to the lower bound as opposed to the tower type upper bound. One dream would be to use this to prove a $1000 conjecture of Graham, by showing a bound of the form {h(k,4) \leq 2^{k^{2.5 - \delta}}}. Needless to say, this is likely hard.

A short proof of the Hardy-Littlewood maximal inequality

Here is a short post to advertise a proof of the weak {L^1} bound for the Hardy–Littlewood maximal function. The proof was told to me by Terry Harris, a fellow graduate student at UIUC, and can be found on his webpage. In short, he replaces the use of Vitali’s covering lemma with a clever greedy algorithm. Incidentally, his proof gives the better constant {2^d}, though this is well known; see for instance Exercise 42 in these notes of Tao. One cute related geometric question: can one improve the constant in Vitali’s covering lemma to {2^d}? This is open for {d = 2}.

Kevin Hughes pointed out to me that this proof basically appears in this paper of Carlsson.

Convex sets have many differences

Thanks to Sophie Stevens for helpful discussions at the Georgia discrete analysis conference leading to this post. We show that if {A} has small (slightly generalized) third order energy, then {A} has a large difference set. We present an argument due to Shkredov and Schoen. This has the advantage of being drastically simpler (and slightly stronger) than the analog for sumsets, which was discussed in this previous blog post. We say {b \gtrsim a} if {a = O(b \log^c |A|)} for some {c >0}.

Let {A, B \subset \mathbb{R}} be finite. We define the sumset and product set via

\displaystyle A+B := \{a + b : a \in A , b \in B\} , \ \ \ \ \ \ AB := \{ab : a \in A , b \in B\}.

We have the following difference–product conjecture.

Conjecture 1: Let {\delta \leq 1}. Then for any finite {A \subset \mathbb{Z}} one has

\displaystyle |A-A| + |A A| \gtrsim |A|^{1 + \delta}. \ \ \spadesuit

We say a finite {A = \{a_1 < \ldots < a_k\} \subset \mathbb{R}} is convex if

\displaystyle a_{i+1} - a_i > a_i - a_{i-1}, \ \ \ 2 \leq i \leq k-1.

The following conjecture is due to Erdős.

Conjecture (Convex): Suppose {A \subset \mathbb{R}} is convex. Then

\displaystyle |A-A| \gtrsim |A|^2. \ \ \spadesuit

The quantity {d^+(A)}, which we define now, has played a crucial role in recent progress on both conjectures. For {A \subset \mathbb{R}} finite, we define

\displaystyle d^{+}(A) := \sup_{B \neq \emptyset}\frac{ E_3^+(A,B) }{|A| |B|^2},


where

\displaystyle E_3^+(A,B) = \sum_x r_{A- B}(x)^3 , \ \ \ \ r_{A-B}(x) = \#\{(a, b ) \in A \times B : x = a - b\}.

See my recent paper on Conjecture 1 for a lengthy introduction to {d^+(A)}. We mention here that {1 \leq d^+(A) \leq |A|} and intuitively the larger {d^+(A)} is the more additive structure {A} has. For instance, we have

\displaystyle E_3^+(A) \leq d^+(A) |A|^3. \ \ \ \ \ (1)

Indeed {d^+(A)} is a higher order analog of the more common additive energy of a set, where we have some flexibility in choosing {B} (for instance, we choose {B = A-A} below).

An application of Cauchy–Schwarz, as explained in equation 2 of this paper, reveals

\displaystyle |A|^{3/2} \leq |A-A| d^+(A)^{1/2}.\ \ \ \ \ (2)

Thus if {d^+(A) \lesssim 1}, we have that {|A-A| \gtrsim |A|^{3/2}}. We conjecture that this can be improved.
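The Hölder step behind (2) can be verified numerically in its direct form {|A|^6 \leq E_3^+(A)|A-A|^2}. A short sketch (my code, arbitrary sample set):

```python
from collections import Counter
from itertools import product

# Direct form of the Hölder step behind (2):
# |A|^2 = sum_{x in A-A} r_{A-A}(x) <= E_3^+(A)^{1/3} * |A-A|^{2/3},
# i.e. |A|^6 <= E_3^+(A) * |A-A|^2.
A = [0, 1, 3, 7, 12, 20, 30, 44]
r = Counter(a - b for a, b in product(A, A))   # r_{A-A}
E3 = sum(v ** 3 for v in r.values())           # E_3^+(A)
assert len(A) ** 6 <= E3 * len(r) ** 2
```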

Conjecture (dplus): Let {A \subset \mathbb{R}} be finite. Then

\displaystyle |A-A| \gtrsim |A|^2 d^+(A)^{-1}. \ \ \ \ \spadesuit

This would imply Conjecture (Convex) and Conjecture 1 for {\delta = 1/2}. The goal of the rest of this post is to prove the current state of the art progress due to Shkredov and Schoen. The proof of Theorem 1 is simple enough that we compute the constant.

Theorem 1: Let {A} be a subset of any additive group. Then

\displaystyle |A-A| > \frac{|A|^{8/5}}{4d^+(A)^{3/5}}. \ \ \ \spadesuit

We will use

\displaystyle E_3^+(A) = \sum_x E^+(A, A_x) ,\ \ \ \ \ (3)

as is explained using a simple geometric argument in equation 9 of this paper of Murphy, Rudnev, Shkredov, and Shteinikov. For {x \in A-A}, let

\displaystyle A_x := A \cap (A + x).


Note that

\displaystyle |A_x| = r_{A-A}(x).

Let


\displaystyle P =\{x \in A-A : r_{A-A}(x) \geq \frac{|A|^2}{2|A-A|}\},

be the set of popular differences. Thus

\displaystyle \sum_{x \in P} r_{A-A}(x) \geq (1/2) |A|^2.\ \ \ \ \ (4)

By two applications of Cauchy–Schwarz,

\displaystyle \sum_{x \in P} |A_x| |A| = \sum_{x \in P } \sum_y r_{A - A_x}(y) \leq \sum_{x\in P} E^+(A,A_x)^{1/2} |A - A_x|^{1/2} \leq \left(\sum_x E^+(A,A_x) \sum_{x \in P} |A- A_x| \right)^{1/2} .

Combining this with (3) and (4), we find

\displaystyle \frac{|A|^6}{4E_3^+(A)} \leq \sum_{x \in P} |A - A_x|.

Now we use the Katz–Koester inclusion of the form

\displaystyle A- A_x \subset (A-A) \cap (A - A -x),

to find

\displaystyle \frac{|A|^6}{4E_3^+(A)} \leq \sum_{x \in P} |D \cap (D - x) |, \ \ \ D := A- A.

Since {x \in P}, we have {(1/2) |A|^2 |A-A|^{-1} \leq r_{A-A}(x)}, and so 

\displaystyle \frac{|A|^6}{4E_3^+(A)} \leq \frac{2 |A-A|}{|A|^2} \sum_{x \in P} r_{D-D}(x)r_{A-A}(x) \leq \frac{2|A-A|}{|A|^2} E^+(A,D).\ \ \ \ \ (5)

By Cauchy–Schwarz (interpolating {\ell_2} between {\ell_3} and {\ell_1}), we have

\displaystyle E^+(A,D)^2 \leq E_3^+(A,D) |A||D|,

and so by the definition of {d^+(A)}, (1), and (5), we arrive at

\displaystyle \frac{|A|^{15}}{64|D|^{3}} \leq E_3^+(A)^2 E_3^+(A,D) \leq d^+(A)^3 |A|^7 |D|^2.

Simplifying gives the (slightly stronger) desired

\displaystyle \frac{|A|^8}{64 d^+(A)^3} \leq |A-A|^5. \ \ \ \spadesuit

Developing a fourth moment analog of the above result would be worthwhile, as it would have applications to the finite field sum–product problem.

Breaking the Square root Barrier via the Burgess Bound

Thanks to Kyle Pratt for enlightening discussions leading to this post.

Recall the Burgess bound, which states that for any interval {I \subset \mathbb{F}_p} of size at least {p^{1/4 + \epsilon}}, one has

\displaystyle \left| \sum_{x \in I} \chi(x) \right| \ll \frac{|I|}{p^{\delta}}, \ \ \ \ |I| \geq p^{1/4+ \epsilon}\ \ \ \ \ (1)

where {\chi} is the Legendre symbol modulo {p} and {\delta = \delta(\epsilon) > 0}.

Conjecture 1: For any {\alpha > 0}, (1) holds for intervals {I} of size {p^{\alpha}}. { \ \ \ \spadesuit}

Conjecture 1 is one way to quantify that {\chi} behaves like a random sign pattern. Indeed, almost surely, (1) holds for a random {\pm 1}–valued function on {\mathbb{F}_p}.

Character sums are central objects of study in analytic number theory, and such a result would have wide applications. For instance, it would solve Vinogradov’s least quadratic nonresidue problem. It would also have implications for sub–convexity bounds of {L}–functions, using

\displaystyle L(1/2 , \chi) = \sum_{n \leq \sqrt{p}} \chi( n) / \sqrt{n} + O(\log p).

For instance, summation by parts and (1) gives the sub–convexity bound

\displaystyle L(1/2 , \chi) \ll p^{1/4 - \delta} .

The key obstacle is the famous square root barrier. In {N} coin flips, one can change {\sqrt{N}} of the outcomes without affecting the global statistics, for instance in the central limit theorem. Thus when applying global statistics to problems in short intervals, one naturally arrives at a square root barrier for the size of the interval.

To this end, it is no surprise that the Pólya–Vinogradov inequality, that is, Conjecture 1 with {\alpha = 1/2 + \epsilon}, was proved about forty years before the Burgess bound. The Pólya–Vinogradov theorem can be proved using only the global statistics of {\chi}, for instance that for all {\xi \in \mathbb{F}_p},

\displaystyle \left| \sum_{x \in \mathbb{F}_p} \chi(x) e_p(\xi x) \right| \leq \sqrt{p} , \ \ \ e_p(\theta) : = e^{2 \pi i \theta / p }.
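This completed-sum bound is easy to verify numerically. The sketch below (my code, arbitrary small prime) computes the Legendre symbol via Euler's criterion and checks that every nonzero frequency has magnitude exactly {\sqrt{p}}.

```python
import cmath
import math

# For the Legendre symbol chi mod p, the Gauss sum satisfies
# |sum_x chi(x) e_p(xi * x)| = sqrt(p) for every xi != 0.
p = 43

def chi(x):
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1  # Euler's criterion

for xi in range(1, p):
    g = sum(chi(x) * cmath.exp(2j * math.pi * xi * x / p) for x in range(p))
    assert abs(abs(g) - math.sqrt(p)) < 1e-6
```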

On the other hand, the Burgess bound breaks the square root barrier, so one needs additional ideas, which we now discuss. We first use {I+x \approx I} for small {x}. Then we use the multiplicativity of {\chi} to amplify an interval of size {p^{1/4 + \epsilon}} to a set of size {p^{1/2 + \epsilon}}. Finally, we use completion and apply a global statistic of {\chi} (Hasse-Weil bound). Without the amplification step, this method would not work. Let us sketch the proof.

By the near translation invariance of {I}, one has

\displaystyle \sum_{x \in I} \chi(x) =\sum_{x \in I} \chi(x + yt) + O(p^{1/4 + \epsilon / 2}),

for all {y \in J = \{1 , \ldots , \lfloor p^{1/4} \rfloor \}} and {t \in T = \{1 , \ldots ,\lfloor p^{\epsilon / 2} \rfloor \}}. But by the multiplicativity of {\chi},

\displaystyle \left| \sum_{y \in J ,t \in T} \sum_{x \in I} \chi(x + yt) \right| \leq \sum_{x \in I , y \in J} \left| \sum_{t \in T} \chi(y^{-1} x + t) \right| .\ \ \ \ \ (2)

Now the number of ways to write an integer as {xy^{-1}} is roughly one on average on a set of size {p^{1/2 + \epsilon}}, so (2) is approximately

\displaystyle \sum_{u \in S} \left| \sum_{t \in T} \chi(u+ t) \right| ,

for some {S} of size {p^{1/2 + \epsilon}}. Since {S} is so large, we can now efficiently complete the sum after an application of Hölder’s inequality to arrive at

\displaystyle \sum_{u \in \mathbb{F}_p} \left| \sum_{t \in T} \chi(u+ t) \right|^{2r} .

This may be bounded by the following global statistic. We remark that in our case, there is a short proof using the polynomial method (the proof in the note is only for elliptic curves, but it is easily generalized to our situation).

Lemma 1: Let {f} be a polynomial which is not a multiple of a perfect square. Then

\displaystyle \sum_{x \in \mathbb{F}_p} \chi (f(x)) = O_{{\rm deg}(f)} (\sqrt{p}).

In particular,

\displaystyle \sum_{u \in \mathbb{F}_p} \left| \sum_{t \in T} \chi(u+ t) \right|^{2r} \ll_r p^{1/2} |T|^{2r} + p |T|^r. \ \ \spadesuit
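Lemma 1 can be sanity-checked numerically. Below (my code) the squarefree cubic {f(x) = x^3 + x + 1} is tested against the Weil bound {({\rm deg}(f) - 1)\sqrt{p}} for a few primes.

```python
import math

# Numerical check of Lemma 1 via Weil's bound: for the squarefree
# cubic f(x) = x^3 + x + 1, |sum_x chi(f(x))| <= (deg f - 1) * sqrt(p).
def legendre(x, p):
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1

for p in [101, 103, 107, 109]:
    s = sum(legendre(x ** 3 + x + 1, p) for x in range(p))
    assert abs(s) <= 2 * math.sqrt(p)
```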

Putting this all together, as is done beautifully in Chang’s survey, one arrives at the Burgess bound of the form (1), for {\delta = \epsilon^2 / 4}.

Another successful instance of breaking the square root barrier is in this paper of Hanson. One result he proved was for any {\delta > 0},

\displaystyle \sum_{a \in A , b \in B , c \in C} \chi(a + b + c) = o_{\delta} (|A||B||C|),

for arbitrary {A, B , C} of size at least {\delta \sqrt{p}}. He takes advantage of the additive structure of the problem, as was pioneered in this paper of Chang using Freiman’s theorem from additive combinatorics. This has the same flavor as the Burgess bound in (1), as one replaces the very additively structured interval {I} with a sumset. The main problem along these lines is the following.

Conjecture 2: For any {A, B} of size at least {p^{\alpha}}, there is a {\delta >0} such that

\displaystyle \sum_{a \in A , b \in B} \chi (a + b) \ll |A| |B| p^{-\delta}. \ \ \spadesuit

This would have an application to sparse signal recovery. One can see Bourgain’s work and the references within for the dual problem, where multiplicative character is replaced by additive character and sumset is replaced by product set.

Granville and Soundararajan (Theorem 2) building upon work of Montgomery and Vaughan showed that by assuming GRH, one has

\displaystyle \sum_{n \leq x} \chi (n) \sim \sum_{n \leq x} \chi(n) \Psi_y(n) , \ \ \ y \geq \log^5 q ,

where {\Psi_y} is the characteristic function of {y}-smooth numbers. Thus the sum in (1) (with {I = [1 , \ldots , x]}) is determined solely by the values of {\chi} at rather small primes. Using smooth number estimates, this implies Conjecture 1 in the case {I = [1, \ldots , x]}.

Finally, we remark that it is known that GRH implies Conjecture 1.

Convex sumsets and sum-product estimates

Thanks to Junxian Li for carefully reading and providing useful suggestions.

Here we advertise a conjecture that would have consequences for two major problems in additive combinatorics and provide a proof of the state of the art progress towards them. In this post, we say {b \gtrsim a} if {a = O(b \log^c |A|)} for some {c >0}. Let {A, B \subset \mathbb{R}} be finite. We define the sumset and product set via

\displaystyle A+B := \{a + b : a \in A , b \in B\} , \ \ \ \ \ \ AB := \{ab : a \in A , b \in B\}.

We have the beautiful sum–product conjecture of Erdős and Szemerédi.

Conjecture (ESconj): Let {\delta < 1}. Then for any finite {A \subset \mathbb{Z}} one has

\displaystyle |A+A| + |A A| \gtrsim |A|^{1 + \delta}. \ \ \spadesuit

We say a finite {A = \{a_1 < \ldots < a_k\} \subset \mathbb{R}} is convex if

\displaystyle a_{i+1} - a_i > a_i - a_{i-1}, \ \ \ 2 \leq i \leq k-1.

The following conjecture is due to Erdős.

Conjecture (Convex): Suppose {A \subset \mathbb{R}} is convex. Then

\displaystyle |A+A| \gtrsim |A|^2. \ \ \spadesuit

The quantity {d^+(A)}, which we define now, has played a crucial role in recent progress on both conjectures. For {A \subset \mathbb{R}} finite, we define

\displaystyle d^{+}(A) := \sup_{B \neq \emptyset}\frac{ E_3^+(A,B) }{|A| |B|^2},


where

\displaystyle E_3^+(A,B) = \sum_x r_{A- B}(x)^3 , \ \ \ \ r_{A-B}(x) = \#\{(a, b ) \in A \times B : x = a - b\}.

See my recent paper on Conjecture (ESconj) for an introduction to {d^+(A)}. We mention here that {1 \leq d^+(A) \leq |A|} and intuitively the larger {d^+(A)} is, the more additive structure {A} has. Indeed {d^+(A)} is a higher order analog of the more common additive energy of a set. An application of Cauchy–Schwarz, as explained in equation (2) of this paper, reveals

\displaystyle |A|^{3/2} \leq |A+A| d^+(A)^{1/2}.\ \ \ \ \ (1)

Thus if {d^+(A) \lesssim 1}, we have that {|A+A| \gtrsim |A|^{3/2}}. We conjecture that this can be improved.

Conjecture (dplus): Let {A \subset \mathbb{R}} be finite. Then

\displaystyle |A+A| \gtrsim |A|^2 d^+(A)^{-1}. \ \ \ \ \spadesuit

This would imply Conjecture (Convex) and Conjecture (ESconj) for {\delta = 1/2}. The goal of the rest of this post is to prove the current state of the art progress due to Shkredov. We remark that the rest of the post is technical in nature. Please see this survey of de Zeeuw and the introduction of this paper of mine for background. Also, one can see this paper of Murphy, Rudnev, Shkredov, and Shteinikov for another perspective (thanks to Sophie Stevens for pointing this reference out to me).

Theorem 1: (Shkredov) Let {A \subset \mathbb{R}} be finite. Then

\displaystyle |A+A| \gtrsim |A|^{58/37} d^+(A)^{-21/37} . \ \ \spadesuit

His idea is to analyze spectral and combinatorial properties of an associated matrix: the spectral properties depend on {|A+A|}, while the combinatorial properties depend on {d^+(A)}. One of the simplest versions of this idea is that the number of {k}–walks in a graph is the sum of the {k^{\rm th}} powers of the eigenvalues of the graph, which plays a crucial role in the theory of expander graphs.

For a function {g : \mathbb{R} \rightarrow \mathbb{C}}, we define the following {|A| \times |A|} matrices

\displaystyle T_A^g(x,y) : = 1_A(x)1_A(y) g(x-y), \ \ S_A^g(x,y) : = 1_A(x)1_A(y) g(x+y).

We set

\displaystyle E^+(A , B) := \sum_x r_{A-B}(x)^2 , \ \ \ E^+(A) : = E^+(A,A),

\displaystyle E_3^+(A, B , C) := \sum_x r_{A-A}(x) r_{B-B}(x) r_{C-C}(x) , \ \ E_3^+(A) : = E_3^+(A, A , A) .

Theorem 1 follows from the following two propositions.

Proposition 1: Let

\displaystyle S = \{x \in A+A : r_{A+A}(x) \geq \frac{ |A|^2 }{2|A+A|}\},

be the set of popular sums of {A}, and let {g = 1_S}. Let {\mu_1 , \ldots , \mu_{|A|}} be the eigenvalues of {S_A^g} with associated orthonormal eigenvectors {f_j : A \rightarrow \mathbb{R}}. Then

\displaystyle \frac{|A|^5}{2^5 |S|} \leq \frac{\mu_1^5}{||g||_2^2 ||g||_{\infty}} \leq \mu_1^2 \langle T_A^{r_{A-A}} f_1 , f_1 \rangle \leq \sum_{j=1}^{|A|} \mu_j^2 \langle T_A^{r_{A-A}} f_j , f_j \rangle \leq d^+(A)^{1/2} |A|^{3/2}E_3^+(A,A,S)^{1/2} . \ \ \ \spadesuit

The second and third inequality hold for general {g}. The right hand side arises from combinatorial properties of {T_A^{r_{A-A}}}, while the left hand side comes from spectral properties. Before proving Proposition 1, we note that in order to apply Proposition 1, we need a bound for {E_3^+(A,A,S)}, which is slightly different than what is considered in the definition of {d^+(A)}. This is handled by the following.

Proposition 2: Let {A \subset \mathbb{R}} be finite and

\displaystyle S = \{x \in A+A : r_{A+A}(x) \geq \frac{ |A|^2 }{2|A+A|}\}.


Then

\displaystyle E_3^+(A,A,S) \lesssim d^+(A)^{11/10} |A|^{6/5} |A+A|^{17/10}. \ \ \ \spadesuit

Theorem 1 follows immediately from combining Proposition 1 and Proposition 2. We start with the proof of Proposition 2.

Proof of Proposition 2: The idea is to bound {E_3^+(A,A,S)} by various second moment estimates and then use the fact that {a - a' = b - b'} if and only if {a + b' = b + a'}. By a dyadic decomposition, there is a set {Q \subset A-A} such that

\displaystyle E_3^+(A,A,S) \gtrsim \sum_{x \in Q} r_{A-A}(x)^2 r_{S-S}(x) , \ \ Q = \{x : q \leq r_{A-A}(x) \leq 2q \}.

We have

\displaystyle E_3^+(A,A,S) \lesssim q E^+(A,S).\ \ \ \ \ (2)

On the other hand,

\displaystyle E_3^+(A,A,S) \lesssim q^2 \sum_x 1_Q(x) \sum_y 1_S(y) 1_S(x+ y)

\displaystyle \lesssim \frac{|A+A|}{|A|^2} q^2 \sum_y r_{A+A}(y)\sum_x 1_Q(x) 1_S(x+y)

\displaystyle = \frac{|A+A|}{|A|^2} q^2 \sum_y r_{A+A}(y)r_{S-Q}(y) \leq \frac{|A+A|}{|A|^2} q^2 E^+(A,S)^{1/2} E^+(A,Q)^{1/2}.\ \ \ \ \ (3)

By the asymmetric version of (1), we have

\displaystyle E^+(A,Q) \leq d^+(A)^{1/2} |A| |Q|^{3/2},

and we also have

\displaystyle |Q| \leq \#\{x : r_{A-A} (x) \geq q\} \leq E_3^+(A) q^{-3} \leq d^+(A) |A|^3 q^{-3}.

Combining these two, we find

\displaystyle E^+(A,Q) \leq \frac{d^+(A)^2 |A|^{11/2}}{q^{9/2}}.

Using (3), we find

\displaystyle E_3^+(A,A,S) \lesssim E^+(A,S)^{1/2} |A+A| d^+(A) |A|^{3/4} q^{-1/4}. \ \ \ \ \ (4)

Combining (2) and (4) yields

\displaystyle E_3^+(A,A,S) \lesssim E^+(A,S)^{1/2} \min \{ q E^+(A,S)^{1/2} , |A+A| d^+(A) |A|^{3/4} q^{-1/4} \}.

Since the first term in the minimum is increasing in {q} and the second is decreasing, the bound is optimized when

\displaystyle q^{5/4} = \frac{|A+A| d^+(A) |A|^{3/4}}{E^+(A,S)^{1/2}},

and so

\displaystyle E_3^+(A,A,S) \lesssim E^+(A,S)^{3/5} \left( |A+A| d^+(A) |A|^{3/4} \right)^{4/5}.

The result follows from

\displaystyle E^+(A,S) \leq d^+(A)^{1/2} |A| |S|^{3/2}, \ \ |S| \leq |A+A|,

and simplification. {\spadesuit}

Proof of Proposition 1: Note that {T_A^{r_{A-A}}} is a symmetric, positive–semidefinite matrix. To see the second point, note

\displaystyle T_A^{r_{A-A}}(x,y) = \langle 1_{A - x} , 1_{A - y} \rangle.

Thus {\langle T_A^{r_{A-A}} h , h \rangle \geq 0} for any {h : A \rightarrow \mathbb{C}}. This proves the third inequality.
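The Gram-matrix identity above is concrete enough to test. The following sketch (my code, arbitrary {A}) confirms {T_A^{r_{A-A}}(x,y) = \langle 1_{A-x} , 1_{A-y} \rangle = (MM^T)(x,y)} for {M(x,w) = 1_A(w+x)}, which is exactly why {\langle T h , h \rangle = \|M^T h\|_2^2 \geq 0}.

```python
# Check of the Gram identity: with M(x, w) = 1_A(w + x),
# T(x, y) = #{w : w + x in A and w + y in A} = (M M^T)(x, y),
# so T is positive semidefinite.
A = [0, 1, 3, 7, 12]
W = list(range(-15, 16))   # shift window large enough to contain A - A

def T(x, y):
    return sum(1 for w in W if (w + x in A) and (w + y in A))

M = [[1 if (w + x) in A else 0 for w in W] for x in A]
for i, x in enumerate(A):
    for j, y in enumerate(A):
        gram = sum(M[i][k] * M[j][k] for k in range(len(W)))
        assert T(x, y) == gram
```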

Now, we show the first inequality. We have {||1_S||_2^2 = |S|} and by a popularity argument

\displaystyle \langle S_A^{1_S} 1 , 1 \rangle = \sum_{x \in S} r_{A+A}(x) \geq |A|^2 / 2,

and so {\mu_1 \geq |A| / 2}, since

\displaystyle \mu_1 = \sup_{f \neq 0} \frac{\langle S_A^{1_S} f , f \rangle}{||f||_2^2}.

We now prove the fourth inequality. Since {f_1 , \ldots , f_{|A|}} are an orthonormal basis of eigenvectors for {S_A^{1_S}},

\displaystyle \sum_{j=1}^{|A|} \mu_j^2 \langle T_A^{r_{A-A}} f_j , f_j \rangle = \sum_{x,y , j } T_A^{r_{A-A}}(x,y) \mu_j f_j(x) \mu_j f_j(y) = \sum_{x,y , j , i} T_A^{r_{A-A}}(x,y) \mu_i f_i(x) \mu_j f_j(y) \sum_z f_j(z) f_i(z)

\displaystyle = \sum_{x,y , z } T_A^{r_{A-A}}(x,y) \sum_i \mu_i f_i(x) f_i(z) \sum_j \mu_j f_j(y) f_j(z) = \sum_{x, y , z} T_A^{r_{A-A}}(x,y) S_A^{1_S} (x,z) S_A^{1_S} (y , z),

where in the final equality, we used that

\displaystyle S_A^{1_S} = \sum_j \mu_j f_j f_j^T,

from the spectral theorem for self–adjoint matrices. Now we have

\displaystyle \sum_{x, y , z} T_A^{r_{A-A}}(x,y) S_A^{1_S} (x,z) S_A^{1_S} (y , z) = \sum_{x, y , z \in A} T_A^{r_{A-A}}(x,y) 1_S(x+z) 1_S(y+z).

Using the change of variables {\alpha = x+z} and {\beta = y+z}, we have this is equal to

\displaystyle \sum_{\alpha , \beta} r_{A-A}(\alpha - \beta ) 1_S(\alpha) 1_S (\beta) \sum_z 1_A (z) 1_A(\alpha - z) 1_A( \beta - z).

By Cauchy–Schwarz and simplification, we get this is

\displaystyle \leq \left( \left( \sum_{\alpha , \beta} r_{A-A} (\alpha - \beta)^2 1_S(\alpha) 1_S(\beta) \right) \sum_{\alpha , \beta} |\sum_z 1_A(z) 1_A(\alpha - z) 1_A( \beta - z) |^2 \right)^{1/2}

\displaystyle = E_3^+(A)^{1/2} E_3^+(A, A , S)^{1/2}.

The inequality follows from

\displaystyle E_3^+(A) \leq d^+(A) |A|^3.

We now prove the second inequality. It is enough to show

\displaystyle \mu_1 \leq ||g||_{\infty}\left( \sum_z f_1(z) \right)^2, \ \ \ \left( \sum_z \mu_1 f_1(z) \right)^2 \leq ||g||_2^2 \langle T_A^{r_{A-A}} f_1 , f_1 \rangle .

We start with the first inequality. Since {f_1} is a unit vector and {S_A^g f_1 = \mu_1 f_1},

\displaystyle \mu_1 = \sum_x f_1(x) \mu_1 f_1(x) = \sum_{x,y} f_1(x) g(x +y) f_1(y) \leq ||g||_{\infty} \left( \sum_{x} f_1(x) \right)^2.

In the last inequality, we used that we may choose {f_1 \geq 0} entry–wise by the Perron–Frobenius theorem. We move on to the second inequality. Using the change of variables {w = x+y}, we find

\displaystyle \sum_{x \in A} \mu_1 f_1(x) = \sum_{x,y} g(x+y) f_1(y) 1_A(x) = \sum_{w , x} g(w) f_1(w-x) 1_A(x) .

By Cauchy–Schwarz and a modest computation, we find this is

\displaystyle \leq ||g||_2 \left( \sum_w \left( \sum_x f_1(w-x) 1_A(x) \right)^2 \right)^{1/2} = ||g||_2 \langle T_A^{r_{A-A}} f_1 , f_1 \rangle^{1/2}. \ \ \spadesuit

We end by mentioning that

\displaystyle |A-A| \gtrsim |A|^{8/5} d^+(A)^{-3/5},

can be inferred from this paper of Schoen and Shkredov, which is stronger than the plus version discussed in this blog post (see this post). Any improvement to either bound would be of interest.