A Berry-Esseen type inequality for convex bodies with an unconditional basis
Bo'az Klartag
Introduction
Equivalently, the random vector has the same distribution as for any choice of signs.
We prove the following Berry-Esseen type theorem:
The bound in (1) is optimal, up to the precise value of the constant, as shown by the example of being independent random variables, with each distributed, say, uniformly in a symmetric interval (see, e.g., [14, Vol. II, Section XVI.4]). A central element in the proof of Theorem 2 is the sharp estimate
Previous techniques for obtaining thin spherical shell estimates under convexity assumptions relied almost entirely on concentration of measure ideas, either on the sphere (see ), or on the orthogonal group (see ). The quantitative estimates that these techniques have yielded so far are sub-optimal. Inequality (3) was previously known to hold with the bound in place of , where the exponent is slightly smaller than , see . The latter result is applicable for all isotropically-normalized random vectors with a log-concave density.
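To get a feeling for the scale involved in a thin spherical shell estimate, the simplest unconditional example can be computed by hand. The sketch below is only an illustration under our own assumptions: X is uniform in the cube [-sqrt(3), sqrt(3)]^n, so that its coordinates are independent with unit variance, and the width of the shell is measured by E(|X| - sqrt(n))^2. The function names are hypothetical, and the precise normalization of inequality (3) is not reproduced here.

```python
import math
import random


def var_norm_squared_cube(n):
    """Exact Var(|X|^2) for X uniform in the cube [-sqrt(3), sqrt(3)]^n."""
    second_moment = 1.0        # E X_i^2 for Uniform[-sqrt(3), sqrt(3)]
    fourth_moment = 9.0 / 5.0  # E X_i^4 = a^4 / 5 with a = sqrt(3)
    # Coordinates are independent, so the variances of X_i^2 add up.
    return n * (fourth_moment - second_moment ** 2)


def shell_width_mc(n, samples=5000, seed=0):
    """Monte Carlo estimate of E(|X| - sqrt(n))^2 for the same cube."""
    rng = random.Random(seed)
    a = math.sqrt(3.0)
    root_n = math.sqrt(n)
    total = 0.0
    for _ in range(samples):
        norm_sq = sum(rng.uniform(-a, a) ** 2 for _ in range(n))
        total += (math.sqrt(norm_sq) - root_n) ** 2
    return total / samples


if __name__ == "__main__":
    for n in (10, 40, 160):
        print(n, var_norm_squared_cube(n), shell_width_mc(n))
```

Since (|X| - sqrt(n))^2 is at most (|X|^2 - n)^2 / n, the exact variance 0.8 n already shows that E(|X| - sqrt(n))^2 stays below 0.8 in every dimension for the cube; the Monte Carlo estimate only serves to confirm that the width does not grow with n.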
In this article we suggest a different approach. Rather than employing concentration of measure inequalities, our proof of the optimal inequality (3) is based on analysis of the Neumann Laplacian on convex domains, the so-called -method in convexity, going back to Hörmander and to Helffer and Sjöstrand . The argument is further simplified by using the theory of optimal transportation of measures. We expect this technique to be useful also in the study of other problems in convex geometry, such as central limit theorems for convex bodies with various types of symmetries. The argument leading to the thin shell estimate occupies Section 2, Section 3 and Section 5. In Section 6 we apply these estimates and complete the proof of Theorem 2.
Readers who are interested only in the proof of inequality (3) and Theorem 2 may skip Section 4. This section is devoted to several results, obtained as by-products, regarding the first non-zero eigenvalue and the corresponding eigenfunctions of the Neumann Laplacian on n-dimensional convex bodies. In particular, we show that the eigenfunctions are all “biased” towards some direction in space. This rules out, for instance, the possibility of an even eigenfunction.
Acknowledgement. We would like to express our gratitude to Sasha Sodin for his kind help with the analysis related to the classical central limit theorem, to Tom Spencer for illuminating explanations regarding the work of Helffer and Sjöstrand, and to Dario Cordero-Erausquin, Leonid Friedlander, Robert McCann, Emanuel Milman, Vitali Milman and Elias Stein for valuable discussions on related topics. Thanks also to the referee for useful comments and suggestions.
Convexity and the Neumann Laplacian
with . The main result of this section reads as follows:
and for . For instance, we may select . Note that for any , the vector is the outer unit normal to at .
The following lemma is a standard Bochner-Weitzenböck type integration by parts formula, going back at least to Lichnerowicz, to Hörmander and to Kadlec. We write for the Hessian matrix of the function .
Let and denote . Then,
Proof: The function vanishes on . Since is tangential to , the derivative of the function in the direction of vanishes on . That is,
The boundary term vanishes, since on . We conclude from (7) and from an additional application of Stokes' theorem that
Note that the integrand in the integral over is exactly . Hence, from (6),
The convexity of will be used next. Recall that is a convex function, and hence its Hessian is a positive semi-definite matrix for any . Therefore, Lemma 5 implies that for any ,
where . Lemma 4 will be proven by dualizing inequality (8), in a way which is very much related to the approach taken by Hörmander and by Helffer and Sjöstrand .
Proof of Lemma 4: We are given and we would like to prove (4). We may assume that (otherwise, subtract from the function ).
Since and , there exists with
The existence of such is a consequence of the classical existence and regularity theory of the Neumann problem for the Laplacian on domains with a -smooth boundary (see, e.g., Folland’s book [16, chapter 7]). Stokes' theorem yields
where the boundary term vanishes since . From the definition of the -norm and the Cauchy-Schwarz inequality,
Transportation of Measure
This definition fits with the one given in Section 2; we have where denotes the restriction of the Lebesgue measure to .
The next theorem is an extension of a remark by Yann Brenier that we learned from Robert McCann. For the convenience of the reader, we provide in the appendix a detailed exposition of the elegant proof from Villani [40, Section 7.6].
For a sufficiently small , let be the measure whose density with respect to is the non-negative function . Then,
the line segment from to . See Figure 1.
For consider the projection
For a sufficiently small denote by the measure whose density with respect to is . Then,
Proof: Without loss of generality, assume that . For a sufficiently small , the function is positive on , and hence is a non-negative measure. Fix such a sufficiently small .
Consequently, the densities and have an equal amount of mass on the interval . We consider the monotone transportation between these two densities. That is, we define a map by requiring that for any ,
The unique map that satisfies (11) transports the measure whose density is on to the Lebesgue measure on . We deduce from (11) that for ,
with bounded by a constant depending only on and (and in particular, independent of or ). We now let vary, and we write
Therefore the map transports to . According to (3),
with smaller than a constant depending only on and , and in particular independent of . To complete the proof, let tend to zero.
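The monotone transportation used in the proof above is easy to visualize in one dimension. The following sketch is a numerical illustration under our own assumptions, not the construction of the proof itself: a density on an interval [a, b] with total mass b - a is pushed forward to the Lebesgue measure on the same interval by matching cumulative masses, and the helper name `monotone_map_to_lebesgue` is hypothetical.

```python
def monotone_map_to_lebesgue(density, a, b, steps=20000):
    """Return the non-decreasing map T on [a, b] determined by

        integral_a^x density(s) ds = T(x) - a,

    which transports `density` to the Lebesgue measure on [a, b].
    Assumes density >= 0 and that its total mass on [a, b] equals b - a.
    The integral is approximated by the midpoint rule.
    """
    h = (b - a) / steps
    cumulative = [0.0]
    for i in range(steps):
        cumulative.append(cumulative[-1] + density(a + (i + 0.5) * h) * h)

    def transport(x):
        pos = (x - a) / h
        i = min(max(int(pos), 0), steps - 1)
        frac = pos - i
        # Linear interpolation of the cumulative mass between grid points.
        return a + cumulative[i] + frac * (cumulative[i + 1] - cumulative[i])

    return transport


if __name__ == "__main__":
    # Density 2x on [0, 1] has total mass 1; the monotone map is T(x) = x^2.
    T = monotone_map_to_lebesgue(lambda x: 2.0 * x, 0.0, 1.0)
    for x in (0.0, 0.3, 0.5, 1.0):
        print(x, T(x))
```

For the density 2x on [0, 1] the map is T(x) = x^2, so the displacement |T(x) - x| is explicit; this is the kind of quantity that controls the transportation cost between the two measures.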
A digression: Neumann eigenvalues and eigenfunctions
This section presents some additional relations between convexity and the Neumann Laplacian. We retain the setup and notation of Section 2. We write for the Hilbert space that is the completion of with respect to the norm
The operator , acting on the subspace , is a symmetric, positive semi-definite operator. The classical theory implies that has a complete system of orthonormal Neumann eigenfunctions and Neumann eigenvalues (see, e.g., [16, Chapter 7]). The first eigenvalue is , with the eigenfunction being constant. It is well-known that when is convex (see, e.g., . It is actually enough to assume that is connected, see, e.g., [11, Theorem 1]). We refer to as the first non-zero Neumann eigenvalue of . It is well-known that for any -smooth function with ,
Equality in (13) holds if and only if is an eigenfunction corresponding to the eigenvalue .
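The variational characterization of the first non-zero Neumann eigenvalue can be checked numerically in the simplest case. The sketch below assumes, for illustration only, that the domain is the interval [-1, 1], whose first non-zero Neumann eigenvalue is (pi/2)^2 with eigenfunction sin(pi x / 2); the helper `rayleigh_quotient` is a hypothetical name and uses a plain midpoint rule.

```python
import math


def rayleigh_quotient(f, df, a=-1.0, b=1.0, steps=20000):
    """Approximate  int |f'|^2 / int (f - mean f)^2  over [a, b]."""
    h = (b - a) / steps
    xs = [a + (i + 0.5) * h for i in range(steps)]
    mean = sum(f(x) for x in xs) * h / (b - a)
    energy = sum(df(x) ** 2 for x in xs) * h
    mass = sum((f(x) - mean) ** 2 for x in xs) * h
    return energy / mass


# First non-zero Neumann eigenvalue of the interval [-1, 1].
LAMBDA_1 = (math.pi / 2.0) ** 2

# Odd eigenfunction sin(pi x / 2): the quotient equals LAMBDA_1.
q_odd = rayleigh_quotient(lambda x: math.sin(math.pi * x / 2.0),
                          lambda x: (math.pi / 2.0) * math.cos(math.pi * x / 2.0))

# Linear test function x: mean zero, quotient 3 > LAMBDA_1.
q_linear = rayleigh_quotient(lambda x: x, lambda x: 1.0)

# Even Neumann eigenfunction cos(pi x): quotient pi^2 = 4 * LAMBDA_1.
q_even = rayleigh_quotient(lambda x: math.cos(math.pi * x),
                           lambda x: -math.pi * math.sin(math.pi * x))

if __name__ == "__main__":
    print(LAMBDA_1, q_odd, q_linear, q_even)
```

In this one-dimensional example the minimizer is odd, and the first even eigenfunction only appears at the larger eigenvalue pi^2.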
We say that the boundary of is uniformly strictly convex if is a positive definite matrix for any . Equivalently, is uniformly strictly convex if the principal curvatures are all positive – and not merely non-negative – everywhere on the boundary. Our next corollary claims, loosely speaking, that any non-trivial eigenfunction corresponding to cannot be “spatially isotropic”, but must have “preference” for a certain direction in space.
Consequently, the multiplicity of the first non-zero Neumann eigenvalue is at most .
We write for the first non-zero eigenvalue, i.e., . Since , inequality (8) gives
From (15) we know that for all . Thus (16) and (13) yield
Therefore, there must be equality in all steps and hence are all Neumann eigenfunctions with eigenvalue . We necessarily have equality also in (16). According to Lemma 5 this means that
Since the integrand is non-negative and continuous, necessarily
So far we have only used the convexity of . The uniform strict convexity of means that on . Equation (17) has the consequence that on , and therefore
This is well-known to be impossible for a Neumann eigenfunction corresponding to the first non-zero eigenvalue. We sketch the standard argument, see, e.g., for more information. Denote
Remark. Leonid Friedlander explained to us how to eliminate the uniform strict convexity requirement from Corollary 1. His idea is to observe that since are all eigenfunctions, then the restriction of to the boundary is actually an eigenfunction of the Laplacian associated with the Riemannian manifold . However, (17) entails that is constant in some open set in , which is known to be impossible for an eigenfunction. We omit the details.
i.e., we flip the sign of the coordinate. For a function , we write . Our next corollary exploits the well-known relationship between the eigenfunctions and symmetry. Similar arguments appear, e.g., in .
If is unconditional, then there exist and an eigenfunction , such that
If is centrally-symmetric (i.e., ), then there exists an eigenfunction , such that
Proof: Begin with the proof of (i). We are given the unconditional convex body . Since is unconditional, then implies for . Begin with any non-zero eigenfunction , and recursively define
Then . If there exists such that then we are done: Suppose is the minimal such index. Then with , and we found our desired eigenfunction.
It remains to deal with the case where is a non-zero eigenfunction. Note that and hence
In the proof of Corollary 1 (the first part, which did not use the uniform strict convexity) we observed that (20) implies that . Since , there exists with . We see from (19) that is the eigenfunction we are looking for. This completes the proof of the first part of the lemma.
The proof of the second part is similar. Begin with any and set . If , then is an odd function and we are done. Otherwise, is an even function, hence . As before, this implies that are all odd eigenfunctions corresponding to the same eigenvalue .
Corollary 1 and Corollary 2 seem very much expected. Notably, Nadirashvili has proved that in two dimensions, the multiplicity of the first non-zero Neumann eigenvalue is at most for any simply-connected domain. Our simple proof of Corollary 1 is not applicable in such generality. Corollary 1 is related to the “hot spots” problem, see, e.g., Burdzy , Jerison and Nadirashvili and references therein. A proof of Corollary 2 for the two-dimensional case – under much more general assumptions than convexity – can be found in [2, Theorem 4.3]. However, the proofs of the two-dimensional results mentioned do not seem to admit easy generalization to higher dimensions. As observed by Payne and Weinberger , Corollary 2 leads to the following comparison principle:
Denote by the first non-zero Neumann eigenvalue of . Then,
Equality holds when , an -dimensional cube.
According to Corollary 2(i), there exists an index and a non-zero eigenfunction corresponding to such that . By Fubini’s theorem and (21),
hence .
Corollary 3 shows that the cube satisfies a certain domain monotonicity principle for the Neumann Laplacian, at least in the category of unconditional, convex bodies. The Euclidean ball, for instance, does not satisfy a corresponding principle.
where is the first non-zero Neumann eigenvalue of , and is a universal constant. To establish (22), consider
Use Corollary 3 to deduce the bound . The body is a good approximation to the body : It is easily proven that
We may thus apply E. Milman’s result [27, Theorem 1.7], which builds upon the Sternberg-Zumbrun concavity principle , to conclude that and the bound (22) follows. See for a conjectural better bound, without the logarithmic factor.
Unconditional convex bodies
We begin this section with a corollary to the theorems of Section 2 and Section 3.
Proof: Begin with (i). By approximation, we may assume that has a -smooth boundary, and that is a -smooth function. Lemma 4 states that
Fix . We may apply Theorem 2 for since , as implied by the symmetries of . We may apply Lemma 3, since clearly for any . Theorem 2 and Lemma 3 entail the inequality
This proves (i). To deduce (ii), denote . Observe that is unconditional and that for any ,
We will use the following simple identities:
According to Corollary 4(i), it suffices to prove that for any ,
Fix . We will prove (25) by Fubini’s theorem. Fix a point
and denote . In order to prove (25), it is enough to show that
The equality we need is exactly the content of (23). The proof of (i) is thus complete, in the case where is distributed uniformly in a convex body. The proof of (ii) is almost entirely identical. By approximation, we may assume that are continuous. According to Corollary 4(ii), it is sufficient to prove that
This follows by Fubini’s theorem and (24). The lemma is thus proven, in the case where is distributed uniformly in an unconditional convex body.
where is the volume of the -dimensional Euclidean unit ball. Suppose that is a random vector that is distributed uniformly in . According to the case already considered, conclusions (i) and (ii) hold when the are replaced by . However, the random vector has the same distribution as . Thus (i) and (ii) hold also in the case where the density is -concave.
Finally, an approximation argument eliminates the requirement that the density of be -concave: Write for the unconditional, log-concave density of . Then, for any , the function
Lemma 4 may be viewed as a substitute for the sub-independent coordinates idea of Anttila, Ball and Perissinaki : Note the absence of cross terms from the right-hand side of Lemma 4(i). Suppose is a real-valued random variable with an even, log-concave density. A classical inequality (see, e.g., , or [3, Theorem 12] and references therein) states that for any ,
The following corollary contains a few obvious consequences of Lemma 4.
where is a universal constant. Consequently,
with , a positive universal constant. Moreover, for any ,
where is a constant depending only on .
Proof: According to the Prékopa-Leindler inequality (see, e.g., the first pages of ), the random variable has an even, log-concave density for all . From Lemma 4(i) and (26) we see that
This proves (i). By setting in (5), we deduce that
where is a constant depending solely on . This completes the proof.
where are universal constants. Another large-deviations estimate that was proved by Bobkov and Nazarov is that
with, say, and (see ).
Cordero-Erausquin, Fradelizi and Maurey have recently proved the so-called (B)-conjecture in the unconditional case. This entails the following improvement over the Brunn-Minkowski theory:
(The Prékopa-Leindler inequality leads to the weaker statement in which the is replaced by ). Corollary 5(ii) and the Markov-Chebyshev inequality yield
After some simple manipulations, we deduce the inequality
follows by combining Corollary 5(ii) with the distribution inequalities of Nazarov, Sodin and Volberg . We omit the details.
Berry-Esseen type bounds
In previous sections we established sharp thin shell estimates for unconditional, log-concave densities. In the present section we complete the proof of Theorem 2. The argument we present is quite technical and is very much related to classical treatments of the central limit theorem for independent random variables. The reader may refer to, e.g., [14, Vol. II, Chapter XVI] for background on the rate of convergence in the classical central limit theorem. We are indebted to Sasha Sodin for many discussions, suggestions and simplifications that have led to the proofs we present below.
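The rate of convergence in the classical, independent case can be observed directly. The sketch below is a numerical illustration under our own assumptions: the coordinates are independent and uniform in [-sqrt(3), sqrt(3)], the weights are all equal to 1/sqrt(n), and the Kolmogorov distance to the standard Gaussian is used as a proxy for the quantity bounded in (1). The function names are hypothetical; the distribution function of the sum is the classical Irwin-Hall formula.

```python
import math


def irwin_hall_cdf(x, n):
    """Distribution function of a sum of n independent Uniform(0, 1) variables."""
    if x <= 0.0:
        return 0.0
    if x >= n:
        return 1.0
    total = 0.0
    for k in range(int(math.floor(x)) + 1):
        total += (-1) ** k * math.comb(n, k) * (x - k) ** n
    return total / math.factorial(n)


def normal_cdf(t):
    """Standard Gaussian distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def kolmogorov_distance(n, grid=2001):
    """sup_t |P(S_n <= t) - Phi(t)| on a grid of [-4, 4], where S_n is the
    normalized sum of n independent uniform variables with unit variance."""
    sd = math.sqrt(n / 12.0)  # standard deviation of a sum of n Uniform(0, 1)
    worst = 0.0
    for j in range(grid):
        t = -4.0 + 8.0 * j / (grid - 1)
        x = n / 2.0 + t * sd
        worst = max(worst, abs(irwin_hall_cdf(x, n) - normal_cdf(t)))
    return worst


if __name__ == "__main__":
    for n in (4, 8, 16):
        d = kolmogorov_distance(n)
        print(n, d, n * d)
```

With equal weights the sum of the fourth powers of the weights equals 1/n, so the product n * d printed above should remain bounded as n grows; floating-point cancellation in the Irwin-Hall formula limits this sketch to moderate values of n.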
Before proceeding to the actual proof, let us describe the general idea. Introduce independent, symmetric Bernoulli variables . That is,
These Bernoulli variables are also assumed to be independent of . Write
where the last inequality holds only for “typical” values of . Since is strongly concentrated around , as we learn from (3), we may substitute the term in (33) by . Observe that since is unconditional, the random variables
have exactly the same distribution. Hence, by considering the expectation over in (33), we deduce a weaker version of (1) where the is replaced with . In order to arrive at the optimal bound, we need to apply a smoothing technique: The estimate (33) will be replaced with a much better Berry-Esseen inequality which is available for the random variable , for an appropriate “small” random variable . The details will be described next.
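The need for smoothing can be seen already for a pure sum of signs. The sketch below is an illustration under our own assumptions, not part of the argument: it takes equal weights 1/sqrt(n), so that the sum of the Bernoulli variables is a rescaled binomial, and computes its exact Kolmogorov distance to the standard Gaussian. The function names are hypothetical.

```python
import math


def normal_cdf(t):
    """Standard Gaussian distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def central_atom(n):
    """P(eps_1 + ... + eps_n = 0) for even n."""
    return math.comb(n, n // 2) / 2.0 ** n


def rademacher_kolmogorov(n):
    """Exact sup_t |P((eps_1 + ... + eps_n) / sqrt(n) <= t) - Phi(t)|.

    The distribution function is a step function and Phi is increasing,
    so the supremum is attained at a jump, either as a left limit or as
    the value at the atom itself.
    """
    worst = 0.0
    cdf = 0.0
    for k in range(n + 1):  # k is the number of +1 signs
        atom = math.comb(n, k) / 2.0 ** n
        phi = normal_cdf((2 * k - n) / math.sqrt(n))
        worst = max(worst, abs(cdf - phi))  # left limit at the atom
        cdf += atom
        worst = max(worst, abs(cdf - phi))  # value at the atom
    return worst


if __name__ == "__main__":
    for n in (100, 400):
        d = rademacher_kolmogorov(n)
        print(n, d, d * math.sqrt(n), central_atom(n) / 2.0)
```

The distance is at least half the largest atom, which is of order 1/sqrt(n) here, so no estimate of order 1/n is possible for the sign sum alone; adding a small independent random variable with a smooth density removes the atoms.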
For instance, may be the random variable whose density is
for appropriate universal constants . (For this specific choice, is the -fold convolution of the characteristic function of an interval.) We shall use the standard -notation in this section. The notation , for some expression , is an abbreviation for some complicated quantity with the property that
for some universal constant . All constants hidden in the -notation in our proof are in principle explicit. The following lemma seems rather standard (see [14, Vol. II, Chapter XVI] for similar statements). For lack of a precise reference, we provide its proof.
Remark. Note that when for all , the error term in Lemma 5 is . The addition of allows us to deduce a better bound than the guaranteed by the Berry-Esseen inequality.
Thus, from the Fourier inversion formula (see, e.g., [14, Vol. II, Chapter XVI]),
Denote . To prove the lemma, it suffices to bound the absolute value of the integral in (38) by . We express the integral in (38) as where is the integral over , is the integral over (when , we set ) and is the integral over .
Begin with estimating . We use the elementary inequality
Since for all , then for ,
Combine (39) with (35) to deduce that for ,
Next we estimate , in the case where (in the complementary case, ). Denote . Then, by (36),
We will use the elementary inequality for . According to (40), whenever ,
Apply the well-known bound for , to deduce
The bound for is easy. From (34) we have for . Hence,
The lemma follows by combining the above bound for with the bound (41) for and the bound (6) for .
Denote . Clearly,
The lemma follows from (42) and (43).
We may apply Lemma 5 for and for , and conclude that,
where we used the estimates for and the bounds (48) and (49). This completes the proof of (47). The lemma is proven.
Our next goal is to eliminate the “” term from the conclusion of Lemma 7. The following short computational lemma serves this purpose. We shall use the standard estimate
for any (see, e.g., [14, Vol. I, Section VII.1]).
Let and denote . Then,
.
.
Suppose satisfies . Then .
Here, and are universal constants.
Proof: We have according to (50). Hence,
Note also that . Consequently, for any ,
Proof: By approximation, we may assume that the density of is -smooth and everywhere positive (e.g., convolve with a very small Gaussian). We may also assume that for a small universal constant . The function
To prove the lemma, it suffices to show that .
Step 1: Suppose first that , for being the universal constant from Lemma 8. Then by (51),
Consequently, since ,
The desired estimate (52) is therefore proven, in the case where .
Step 2: It remains to deal with the case where satisfies . Denote . Note that
under the legitimate assumption that is smaller than a given universal constant. From Lemma 8(i) we have , hence by (51),
A similar argument, using Lemma 8(ii) in place of Lemma 8(i), shows that
We conclude that for any ,
Consequently, when ,
We conclude from (55) that for any ,
Equivalently, in the interval . Hence,
for being the universal constant from Lemma 8. Recall from (53) that . Lemma 8(iii) thus implies that
with . Returning to (56), we finally deduce the bound
Through Taylor’s theorem, the latter bound entails that
where is the constant from (57). The crucial observation is that is an odd function, hence its integral on a symmetric interval about the origin vanishes. By (57) and (58),
where is the constant from (57). We apply (51) and conclude that
Since , the proof of the lemma is complete.
with some universal constant . The random variable has an even, log-concave density by Prékopa-Leindler. We may thus apply Lemma 9, and conclude from (59) that
With Cédric Villani’s permission, we reproduce below the proof of Theorem 2 from his book [40, Section 7.6] with a few minor changes.
By taking the infimum over all couplings of and , we obtain
with depending only on . We may assume that ; otherwise, there is nothing to prove. Consequently,
Hence by letting tend to zero in (62), we deduce (60). The proof is complete.