A Berry-Esseen type inequality for convex bodies with an unconditional basis

Bo'az Klartag

Introduction

Equivalently, the random vector $(X_{1},\ldots,X_{n})$ has the same distribution as $(\pm X_{1},\ldots,\pm X_{n})$ for any choice of signs.

We prove the following Berry-Esseen type theorem:

The bound in (1) is optimal, up to the precise value of the constant, as shown by the example of $X_{1},\ldots,X_{n}$ being independent random variables, with each $X_{i}$ distributed, say, uniformly in a symmetric interval (see, e.g., [14, Vol. II, Section XVI.4]). A central element in the proof of Theorem 2 is the sharp estimate

Previous techniques for obtaining thin spherical shell estimates under convexity assumptions relied almost entirely on concentration of measure ideas, either on the sphere (see ), or on the orthogonal group (see ). The quantitative estimates that these techniques have yielded so far are sub-optimal. Inequality (3) was previously known to hold with the bound $C/n^{\kappa}$ in place of $C/n$, where the exponent $\kappa$ is slightly smaller than $1/5$, see . The latter result is applicable for all isotropically-normalized random vectors with a log-concave density.

In this article we suggest a different approach. Rather than employing concentration of measure inequalities, our proof of the optimal inequality (3) is based on analysis of the Neumann Laplacian on convex domains, the so-called $L^{2}$-method in convexity, going back to Hörmander and to Helffer and Sjöstrand . The argument is further simplified by using the theory of optimal transportation of measures. We expect this technique to be useful also in the study of other problems in convex geometry, such as central limit theorems for convex bodies with various types of symmetries. The argument leading to the thin shell estimate occupies Section 2, Section 3 and Section 5. In Section 6 we apply these estimates and complete the proof of Theorem 2.

Readers who are interested only in the proof of inequality (3) and Theorem 2 may skip Section 4. That section is devoted to several results, obtained as by-products, regarding the first non-zero eigenvalue and the corresponding eigenfunctions of the Neumann Laplacian on $n$-dimensional convex bodies. In particular, we show that the eigenfunctions are all “biased” towards some direction in space. This rules out, for instance, the possibility of an even eigenfunction.

Acknowledgement. We would like to express our gratitude to Sasha Sodin for his kind help with the analysis related to the classical central limit theorem, to Tom Spencer for illuminating explanations regarding the work of Helffer and Sjöstrand, and to Dario Cordero-Erausquin, Leonid Friedlander, Robert McCann, Emanuel Milman, Vitali Milman and Elias Stein for valuable discussions on related topics. Thanks also to the referee for useful comments and suggestions.

Convexity and the Neumann Laplacian

with $E=Vol_{n}(K)^{-1}\int_{K}f$. The main result of this section reads as follows:

and $\rho(x)\leq 0$ for $x\in K$. For instance, we may select $\rho(x)=-d(x,\partial K)=-\inf_{y\in\partial K}|x-y|$. Note that for any $x\in\partial K$, the vector $\nabla\rho(x)$ is the outer unit normal to $\partial K$ at $x$.
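As a small sanity check of the last remark, the following sketch takes $K$ to be the Euclidean unit ball (our choice for illustration only; the text treats general smooth convex bodies), where $-d(x,\partial K)=|x|-1$ inside $K$, and compares a finite-difference gradient of $\rho$ at a boundary point with the outer unit normal, which for the ball is the point itself. The function names are hypothetical.

```python
import math

def rho(x):
    """For the unit ball: rho(x) = |x| - 1, which equals -d(x, boundary) inside K."""
    return math.sqrt(sum(t * t for t in x)) - 1.0

def grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

boundary_point = [0.6, 0.0, 0.8]      # a point with |x| = 1
normal = grad(rho, boundary_point)     # should be close to the point itself
print(normal)
```

For a general body one would replace `rho` by any smooth defining function with the stated properties; the ball is used here only because its distance function is explicit.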

The following lemma is a standard Bochner-Weitzenböck type integration by parts formula, going back at least to Lichnerowicz , to Hörmander and to Kadlec . We write $\nabla^{2}u$ for the Hessian matrix of the function $u$.

Let $u\in\mathcal{D}$ and denote $f=-\triangle u$. Then,

Proof: The function $x\mapsto\nabla u(x)\cdot\nabla\rho(x)$ vanishes on $\partial K$. Since $\nabla u$ is tangential to $\partial K$, the derivative of the function $x\mapsto\nabla u(x)\cdot\nabla\rho(x)$ in the direction of $\nabla u$ vanishes on $\partial K$. That is,

The boundary term vanishes, since $\nabla u\cdot\nabla\rho=0$ on $\partial K$. We conclude from (7) and from an additional application of Stokes’ theorem that

Note that the integrand in the integral over $\partial K$ is exactly $\nabla^{2}u(\nabla\rho)\cdot\nabla u$. Hence, from (6),

The convexity of $K$ will be used next. Recall that $\rho$ is a convex function, and hence its Hessian $\nabla^{2}\rho(x)$ is a positive semi-definite matrix for any $x\in\partial K$. Therefore, Lemma 5 implies that for any $u\in\mathcal{D}$,

where $f=-\triangle u$. Lemma 4 will be proven by dualizing inequality (8), in a way which is very much related to the approach taken by Hörmander and by Helffer and Sjöstrand .

Proof of Lemma 4: We are given $f\in C^{\infty}(K)$ and we would like to prove (4). We may assume that $\int_{K}f=0$ (otherwise, subtract $\frac{1}{Vol_{n}(K)}\int_{K}f$ from the function $f$).

Since $f\in C^{\infty}(K)$ and $\int_{K}f=0$, there exists $u\in\mathcal{D}$ with

The existence of such $u\in\mathcal{D}$ is a consequence of the classical existence and regularity theory of the Neumann problem for the Laplacian on domains with a $C^{\infty}$-smooth boundary (see, e.g., Folland’s book [16, Chapter 7]). Stokes’ theorem yields

where the boundary term vanishes since $u\in\mathcal{D}$. From the definition of the $H^{-1}(K)$-norm and the Cauchy-Schwarz inequality,

Transportation of Measure

This definition fits with the one given in Section 2: we have $\|u\|_{H^{-1}(\lambda_{K})}=\|u\|_{H^{-1}(K)}$, where $\lambda_{K}$ denotes the restriction of the Lebesgue measure to $K$.

The next theorem is an extension of a remark by Yann Brenier that we learned from Robert McCann. For the convenience of the reader, we provide in the appendix a detailed exposition of the elegant proof from Villani [40, Section 7.6].

For a sufficiently small $\varepsilon>0$, let $\mu_{\varepsilon}$ be the measure whose density with respect to $\mu$ is the non-negative function $1+\varepsilon h$. Then,

the line segment from $\mathcal{B}_{i}^{-}(x)$ to $\mathcal{B}_{i}^{+}(x)$. See Figure 1.

For $i=1,\ldots,n$ consider the projection

For a sufficiently small $\varepsilon>0$ denote by $\mu_{\varepsilon}$ the measure whose density with respect to $\mu$ is $1+\varepsilon\partial^{i}\Psi$. Then,

Proof: Without loss of generality, assume that $i=1$. For a sufficiently small $\varepsilon>0$, the function $1+\varepsilon\partial^{1}\Psi$ is positive on $K$, and hence $\mu_{\varepsilon}$ is a non-negative measure. Fix such a sufficiently small $\varepsilon>0$.

Consequently, the densities $t\mapsto 1$ and $t\mapsto 1+\varepsilon\partial^{1}\Psi(t,y)$ have an equal amount of mass on the interval $[p,q]$. We consider the monotone transportation between these two densities. That is, we define a map $T=T^{y}:[p,q]\rightarrow[p,q]$ by requiring that for any $x_{1}\in[p,q]$,

The unique map $T:[p,q]\rightarrow[p,q]$ that satisfies (11) transports the measure whose density is $1+\varepsilon\partial^{1}\Psi(t,y)$ on $[p,q]$ to the Lebesgue measure on $[p,q]$. We deduce from (11) that for $x_{1}\in[p,q]$,

with $|R|$ bounded by a constant depending only on $\Psi$ and $K$ (and in particular, independent of $\varepsilon$ or $y$). We now let $y\in\pi_{1}(K)$ vary, and we write

Therefore the map $S$ transports $\mu_{\varepsilon}$ to $\mu$. According to (3),

with $|R^{\prime}|$ smaller than a constant depending only on $K$ and $\Psi$, and in particular independent of $\varepsilon$. To complete the proof, let $\varepsilon$ tend to zero. $\square$
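The one-dimensional monotone transportation used in the proof above can be sketched numerically. The sketch assumes that the defining relation for $T$ is mass matching, i.e. the mass of $1+\varepsilon\partial^{1}\Psi$ on $[p,x_{1}]$ equals $T(x_{1})-p$, which gives $T(x_{1})=x_{1}+\varepsilon(\Psi(x_{1})-\Psi(p))$. The profile `psi`, the interval $[0,1]$ and the value of `EPS` are hypothetical choices made only for illustration; the variable $y$ is suppressed.

```python
import math

P, Q = 0.0, 1.0      # the interval [p, q] (illustrative choice)
EPS = 0.05           # small enough that the density below stays positive

def psi(t):
    """Hypothetical profile with psi(P) == psi(Q), so both densities have equal mass."""
    return math.cos(2.0 * math.pi * t)

def dpsi(t):
    return -2.0 * math.pi * math.sin(2.0 * math.pi * t)

def density(t):
    return 1.0 + EPS * dpsi(t)

def T(x):
    """Monotone map: T(x) - P is the mass of `density` on [P, x]."""
    return x + EPS * (psi(x) - psi(P))

def integrate(f, a, b, n=20000):
    """Composite midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def g(u):
    return u * u     # an arbitrary test function

# Push-forward check: integrating g(T(x)) against the perturbed density
# should agree with integrating g against Lebesgue measure on [P, Q].
lhs = integrate(lambda x: g(T(x)) * density(x), P, Q)
rhs = integrate(g, P, Q)

# Quadratic transport cost of T; it is of order EPS**2.
cost = integrate(lambda x: (T(x) - x) ** 2 * density(x), P, Q)
print(lhs, rhs, cost)
```

In this toy case the displacement $T(x)-x$ is exactly $\varepsilon(\Psi(x)-\Psi(p))$, so the cost is $\varepsilon^{2}$ times a quantity depending only on $\Psi$, in line with the way $\varepsilon$ enters the estimates of the proof.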

A digression: Neumann eigenvalues and eigenfunctions

This section presents some additional relations between convexity and the Neumann Laplacian. We retain the setup and notation of Section 2. We write $L^{2}(K)$ for the Hilbert space that is the completion of $C^{\infty}(K)$ with respect to the norm

The operator $-\triangle$, acting on the subspace $\mathcal{D}\subset L^{2}(K)$, is a symmetric, positive semi-definite operator. The classical theory implies that $-\triangle$ has a complete system of orthonormal Neumann eigenfunctions $\varphi_{0},\varphi_{1},\ldots\in\mathcal{D}$ and Neumann eigenvalues $0\leq\lambda_{0}\leq\lambda_{1}\leq\ldots$ (see, e.g., [16, Chapter 7]). The first eigenvalue is $\lambda_{0}=0$, with the eigenfunction $\varphi_{0}$ being constant. It is well-known that $\lambda_{1}>0$ when $K$ is convex (see, e.g., . It is actually enough to assume that $K$ is connected, see, e.g., [11, Theorem 1]). We refer to $\lambda_{1}$ as the first non-zero Neumann eigenvalue of $K$. It is well-known that for any $C^{\infty}(K)$-smooth function $u$ with $\int_{K}u=0$,

Equality in (13) holds if and only if $u$ is an eigenfunction corresponding to the eigenvalue $\lambda_{1}$.
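The variational characterization of $\lambda_{1}$ can be illustrated in the simplest case. The sketch below assumes that (13) is the standard Poincaré-type inequality $\lambda_{1}\int_{K}u^{2}\leq\int_{K}|\nabla u|^{2}$ for mean-zero $u$, and takes $K=[0,1]$ (a one-dimensional choice made for illustration), where the Neumann eigenfunctions are $\cos(k\pi x)$ and $\lambda_{1}=\pi^{2}$. The helper names are hypothetical.

```python
import math

def integrate(f, a=0.0, b=1.0, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def rayleigh(u, du):
    """Ratio of the integral of u'^2 to the integral of u^2, for mean-zero u on [0, 1]."""
    assert abs(integrate(u)) < 1e-9, "test function must have mean zero"
    return integrate(lambda x: du(x) ** 2) / integrate(lambda x: u(x) ** 2)

lam1 = math.pi ** 2   # first non-zero Neumann eigenvalue of [0, 1]

# The first eigenfunction attains equality.
q_eig = rayleigh(lambda x: math.cos(math.pi * x),
                 lambda x: -math.pi * math.sin(math.pi * x))

# Other mean-zero functions give a strictly larger quotient.
q_lin = rayleigh(lambda x: x - 0.5, lambda x: 1.0)
q_high = rayleigh(lambda x: math.cos(2.0 * math.pi * x),
                  lambda x: -2.0 * math.pi * math.sin(2.0 * math.pi * x))
print(q_eig, q_lin, q_high)
```

Note that the linear test function does not satisfy the Neumann boundary condition; the inequality is nevertheless stated for all smooth mean-zero functions, and only the eigenfunction attains equality.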

We say that the boundary of $K$ is uniformly strictly convex if $\nabla^{2}\rho(x)$ is a positive definite matrix for any $x\in\partial K$. Equivalently, $\partial K$ is uniformly strictly convex if the principal curvatures are all positive, and not merely non-negative, everywhere on the boundary. Our next corollary claims, loosely speaking, that any non-trivial eigenfunction corresponding to $\lambda_{1}$ cannot be “spatially isotropic”, but must have “preference” for a certain direction in space.

Consequently, the multiplicity of the first non-zero Neumann eigenvalue is at most $n$.

We write $\lambda_{1}$ for the first non-zero eigenvalue, i.e., $\triangle\varphi=-\lambda_{1}\varphi$. Since $\varphi\in\mathcal{D}$, inequality (8) gives

From (15) we know that $\int_{K}\partial^{i}\varphi=0$ for all $i$. Thus (16) and (13) yield

Therefore, there must be equality in all steps and hence $\partial^{1}\varphi,\ldots,\partial^{n}\varphi$ are all Neumann eigenfunctions with eigenvalue $\lambda_{1}$. We necessarily have equality also in (16). According to Lemma 5 this means that

Since the integrand is non-negative and continuous, necessarily

So far we have only used the convexity of $K$. The uniform strict convexity of $\partial K$ means that $\nabla^{2}\rho>0$ on $\partial K$. Equation (17) has the consequence that $\nabla\varphi=0$ on $\partial K$, and therefore

This is well-known to be impossible for a Neumann eigenfunction corresponding to the first non-zero eigenvalue. We sketch the standard argument, see, e.g., for more information. Denote

Remark. Leonid Friedlander explained to us how to eliminate the uniform strict convexity requirement from Corollary 1. His idea is to observe that since $\partial^{1}\varphi,\ldots,\partial^{n}\varphi$ are all eigenfunctions, the restriction of $\varphi$ to the boundary $\partial K$ is actually an eigenfunction of the Laplacian associated with the Riemannian manifold $\partial K$. However, (17) entails that $\varphi$ is constant in some open set in $\partial K$, which is known to be impossible for an eigenfunction. We omit the details.

i.e., we flip the sign of the $i^{th}$ coordinate. For a function $f$, we write $\sigma_{i}(f)(x)=f(\sigma_{i}(x))$. Our next corollary exploits the well-known relationship between the eigenfunctions and symmetry. Similar arguments appear, e.g., in .

If $K$ is unconditional, then there exist $i=1,\ldots,n$ and an eigenfunction $0\not\equiv\varphi\in E_{\lambda_{1}}$, such that

If $K$ is centrally-symmetric (i.e., $K=-K$), then there exists an eigenfunction $0\not\equiv\varphi\in E_{\lambda_{1}}$, such that

Proof: Begin with the proof of (i). We are given the unconditional convex body $K$. Since $K$ is unconditional, $f\in E_{\lambda_{1}}$ implies $\sigma_{i}(f)\in E_{\lambda_{1}}$ for $i=1,\ldots,n$. Begin with any non-zero eigenfunction $f_{0}\in E_{\lambda_{1}}$, and recursively define

Then $f_{0},f_{1},\ldots,f_{n}\in E_{\lambda_{1}}$. If there exists $i=1,\ldots,n$ such that $f_{i}\equiv 0$ then we are done: Suppose $i$ is the minimal such index. Then $0\not\equiv f_{i-1}\in E_{\lambda_{1}}$ with $\sigma_{i-1}(f_{i-1})=-f_{i-1}$, and we found our desired eigenfunction.

It remains to deal with the case where $\psi=f_{n}$ is a non-zero eigenfunction. Note that $\sigma_{i}(\psi)=\psi$ and hence

In the proof of Corollary 1 (the first part, which did not use the uniform strict convexity) we observed that (20) implies that $\partial^{1}\psi,\ldots,\partial^{n}\psi\in E_{\lambda_{1}}$. Since $\int_{K}|\nabla\psi|^{2}>0$, there exists $i=1,\ldots,n$ with $\partial^{i}\psi\not\equiv 0$. We see from (19) that $\partial^{i}\psi\in E_{\lambda_{1}}$ is the eigenfunction we are looking for. This completes the proof of the first part of the lemma.

The proof of the second part is similar. Begin with any $0\not\equiv f\in E_{\lambda_{1}}$ and set $\psi(x)=f(x)+f(-x)$. If $\psi\equiv 0$, then $f$ is an odd function and we are done. Otherwise, $\psi$ is an even function, hence $\int_{K}\nabla\psi=0$. As before, this implies that $\partial^{1}\psi,\ldots,\partial^{n}\psi$ are all odd eigenfunctions corresponding to the same eigenvalue $\lambda_{1}$. $\square$

Corollary 1 and Corollary 2 seem very much expected. Notably, Nadirashvili has proved that in two dimensions, the multiplicity of the first non-zero Neumann eigenvalue is at most $2$ for any simply-connected domain. Our simple proof of Corollary 1 is not applicable in such generality. Corollary 1 is related to the “hot spots” problem, see, e.g., Burdzy , Jerison and Nadirashvili and references therein. A proof of Corollary 2 for the two-dimensional case, under much more general assumptions than convexity, can be found in [2, Theorem 4.3]. However, the proofs of the two-dimensional results mentioned do not seem to admit easy generalization to higher dimensions. As observed by Payne and Weinberger , Corollary 2 leads to the following comparison principle:

Denote by $\lambda_{1}>0$ the first non-zero Neumann eigenvalue of $K$. Then,

Equality holds when $K=[-R,R]^{n}$, an $n$-dimensional cube.

According to Corollary 2(i), there exists an index $1\leq i\leq n$ and a non-zero eigenfunction $\varphi$ corresponding to $\lambda_{1}$ such that $\sigma_{i}(\varphi)=-\varphi$. By Fubini’s theorem and (21),

hence $\lambda_{1}\geq\pi^{2}/R^{2}$. $\square$

Corollary 3 shows that the cube satisfies a certain domain monotonicity principle for the Neumann Laplacian, at least in the category of unconditional, convex bodies. The Euclidean ball, for instance, does not satisfy a corresponding principle.

where $\lambda_{1}(K)>0$ is the first non-zero Neumann eigenvalue of $K$, and $c>0$ is a universal constant. To establish (22), consider

Use Corollary 3 to deduce the bound $\lambda_{1}(K^{\prime})>c/\log^{2}(n+1)$. The body $K^{\prime}$ is a good approximation to the body $K$: It is easily proven that

We may thus apply E. Milman’s result [27, Theorem 1.7], which builds upon the Sternberg-Zumbrun concavity principle , to conclude that $\lambda_{1}(K)\geq c\lambda_{1}(K^{\prime})$ and the bound (22) follows. See for a conjectural better bound, without the logarithmic factor.

Unconditional convex bodies

We begin this section with a corollary to the theorems of Section 2 and Section 3.

Proof: Begin with (i). By approximation, we may assume that $K$ has a $C^{\infty}$-smooth boundary, and that $\Psi$ is a $C^{\infty}(K)$-smooth function. Lemma 4 states that

Fix $i=1,\ldots,n$. We may apply Theorem 2 for $h=\partial^{i}\Psi$ since $\int_{K}\partial^{i}\Psi=0$, as implied by the symmetries of $\Psi$. We may apply Lemma 3, since clearly $\Psi\left(\mathcal{B}_{i}^{+}(x)\right)=\Psi\left(\mathcal{B}_{i}^{-}(x)\right)$ for any $x\in K$. Theorem 2 and Lemma 3 entail the inequality

This proves (i). To deduce (ii), denote $\Psi_{i}(x_{1},\ldots,x_{n})=f_{i}(x_{i})$. Observe that $\Psi(x)=\sum_{i=1}^{n}\Psi_{i}(x)$ is unconditional and that for any $x\in K$, $i=1,\ldots,n$,

We will use the following simple identities:

According to Corollary 4(i), it suffices to prove that for any $i=1,\ldots,n$,

Fix $i=1,\ldots,n$. We will prove (25) by Fubini’s theorem. Fix a point

and denote $r=q_{i}^{+}(x^{\prime})\geq 0$. In order to prove (25), it is enough to show that

The equality we need is exactly the content of (23). The proof of (i) is thus complete, in the case where $X$ is distributed uniformly in a convex body. The proof of (ii) is almost entirely identical. By approximation, we may assume that $f_{1},\ldots,f_{n}$ are continuous. According to Corollary 4(ii), it is sufficient to prove that

This follows by Fubini’s theorem and (24). The lemma is thus proven, in the case where $X$ is distributed uniformly in an unconditional convex body.

where $\kappa_{s}=\pi^{s/2}/\Gamma(s/2+1)$ is the volume of the $s$-dimensional Euclidean unit ball. Suppose that $Z=(Z_{1},\ldots,Z_{N})$ is a random vector that is distributed uniformly in $K$. According to the case already considered, conclusions (i) and (ii) hold when the $X_{1},\ldots,X_{n}$ are replaced by $Z_{1},\ldots,Z_{n}$. However, the random vector $(Z_{1},\ldots,Z_{n})$ has the same distribution as $X=(X_{1},\ldots,X_{n})$. Thus (i) and (ii) hold also in the case where the density $f$ is $s$-concave.
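For readers who wish to check the normalizing constant, the formula for $\kappa_{s}$ is easy to evaluate. The sketch below only evaluates the stated formula and compares it with the familiar values in dimensions one, two and three; the function name is hypothetical.

```python
import math

def unit_ball_volume(s):
    """kappa_s = pi^(s/2) / Gamma(s/2 + 1), the volume of the s-dimensional unit ball."""
    return math.pi ** (s / 2.0) / math.gamma(s / 2.0 + 1.0)

for s in (1, 2, 3, 10):
    print(s, unit_ball_volume(s))
```

The values for $s=1,2,3$ are the length of $[-1,1]$, the area of the unit disc and the volume of the unit ball in space, respectively.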

Finally, an approximation argument eliminates the requirement that the density $f$ be $s$-concave: Write $f=e^{-\psi}$ for the unconditional, log-concave density of $X$. Then, for any $s>0$, the function

Lemma 4 may be viewed as a substitute for the sub-independent coordinates idea of Anttila, Ball and Perissinaki : Note the absence of cross terms from the right-hand side of Lemma 4(i). Suppose $X$ is a real-valued random variable with an even, log-concave density. A classical inequality (see, e.g., , or [3, Theorem 12] and references therein) states that for any $p\geq 2$,

The following corollary contains a few obvious consequences of Lemma 4.

where $C^{\prime}\leq 16$ is a universal constant. Consequently,

with $C\leq 4$, a positive universal constant. Moreover, for any $p\geq 1$,

where $C_{p}>0$ is a constant depending only on $p$.

Proof: According to the Prékopa-Leindler inequality (see, e.g., the first pages of ), the random variable $X_{i}$ has an even, log-concave density for all $i$. From Lemma 4(i) and (26) we see that

This proves (i). By setting $a_{i}=1\ (i=1,\ldots,n)$ in (5), we deduce that

where $C_{p}$ is a constant depending solely on $p\geq 1$. This completes the proof. $\square$

where $c,C>0$ are universal constants. Another large-deviations estimate that was proved by Bobkov and Nazarov is that

with, say, $\alpha=0.33$ and $\beta=3.33$ (see ).

Cordero-Erausquin, Fradelizi and Maurey have recently proved the so-called (B)-conjecture in the unconditional case. This entails the following improvement over the Brunn-Minkowski theory:

(The Prékopa-Leindler inequality leads to the weaker statement in which the $e^{t}$ is replaced by $t$). Corollary 5(ii) and the Markov-Chebyshev inequality yield

After some simple manipulations, we deduce the inequality

follows by combining Corollary 5(ii) with the distribution inequalities of Nazarov, Sodin and Volberg . We omit the details.

Berry-Esseen type bounds

In previous sections we established sharp thin shell estimates for unconditional, log-concave densities. In the present section we complete the proof of Theorem 2. The argument we present is quite technical and is very much related to classical treatments of the central limit theorem for independent random variables. The reader may refer to, e.g., [14, Vol. II, Chapter XVI] for background on the rate of convergence in the classical central limit theorem. We are indebted to Sasha Sodin for many discussions, suggestions and simplifications that have led to the proofs we present below.

Before proceeding to the actual proof, let us describe the general idea. Introduce independent, symmetric Bernoulli variables $\Delta_{1},\ldots,\Delta_{n}$. That is,

These Bernoulli variables are also assumed to be independent of $X$. Write

where the last inequality holds only for “typical” values of $X$. Since $|X|/\sqrt{n}$ is strongly concentrated around $1$, as we learn from (3), we may replace the $\Phi\left(t\sqrt{n}/|X|\right)$ term in (33) by $\Phi(t)$. Observe that since $X$ is unconditional, the random variables

have exactly the same distribution. Hence, by considering the expectation over $X$ in (33), we deduce a weaker version of (1) where the $C/n$ is replaced with $C/\sqrt{n}$. In order to arrive at the optimal bound, we need to apply a smoothing technique: The estimate (33) will be replaced with a much better Berry-Esseen inequality which is available for the random variable $\Gamma+\left(\sum_{i}\Delta_{i}X_{i}\right)\left/\sqrt{n}\right.$, for an appropriate “small” random variable $\Gamma$. The details will be described next.
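The gap between the two rates can be seen in two classical examples. The sketch below is an illustration and not part of the argument: it computes the Kolmogorov distance to the Gaussian for a normalized sum of symmetric Bernoulli signs (exactly, at the atoms), where the lattice structure forces an error of order $1/\sqrt{n}$, and for a normalized sum of independent variables uniform on $[-\sqrt{3},\sqrt{3}]$ (approximately, on a grid, via the Irwin-Hall formula), where the error is of order $1/n$. We take $\Phi$ to be the standard Gaussian distribution function; the function names, the grid, and the choice of equal weights are our assumptions.

```python
import math

def Phi(t):
    """Standard Gaussian distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def rademacher_distance(n):
    """sup_t |P(S <= t) - Phi(t)| for S = (sum of n random signs) / sqrt(n).

    Between atoms the law of S has a constant distribution function, so the
    supremum is attained at an atom, from the left or from the right.
    """
    worst, cdf = 0.0, 0.0
    for k in range(n + 1):                   # k = number of +1 signs
        t = (2 * k - n) / math.sqrt(n)
        left = cdf                           # P(S < t)
        cdf += math.comb(n, k) / 2 ** n      # P(S <= t)
        worst = max(worst, abs(left - Phi(t)), abs(cdf - Phi(t)))
    return worst

def irwin_hall_cdf(x, n):
    """P(U_1 + ... + U_n <= x) for independent U_i uniform on [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= n:
        return 1.0
    total = sum((-1) ** k * math.comb(n, k) * (x - k) ** n
                for k in range(math.floor(x) + 1))
    return total / math.factorial(n)

def uniform_distance(n, grid=2001):
    """Grid approximation of sup_t |P(S <= t) - Phi(t)| for
    S = (X_1 + ... + X_n) / sqrt(n), X_i uniform on [-sqrt(3), sqrt(3)]."""
    worst = 0.0
    for j in range(grid):
        t = -4.0 + 8.0 * j / (grid - 1)
        # X_i = sqrt(3) * (2 U_i - 1), so S <= t iff sum U_i <= x below.
        x = 0.5 * (n + t * math.sqrt(n / 3.0))
        worst = max(worst, abs(irwin_hall_cdf(x, n) - Phi(t)))
    return worst

for n in (4, 16):
    print(n, rademacher_distance(n), uniform_distance(n))
```

The alternating Irwin-Hall sum loses accuracy in floating point for large `n`, so the second function is meant only for small values such as those printed here. The contrast between the two columns is one way to see why a smoothing variable such as $\Gamma$ is needed before passing from Bernoulli sums to the optimal rate.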

For instance, $\Gamma$ may be the random variable whose density is

for appropriate universal constants $\kappa_{1},\kappa_{2}$. (For this specific choice, $\gamma$ is the $8$-fold convolution of the characteristic function of an interval.) We shall use the standard $O$-notation in this section. The notation $O(x)$, for some expression $x$, is an abbreviation for some complicated quantity $y$ with the property that

for some universal constant $C>0$. All constants hidden in the $O$-notation in our proof are in principle explicit. The following lemma seems rather standard (see [14, Vol. II, Chapter XVI] for similar statements). For lack of a precise reference, we provide its proof.

Remark. Note that when $\theta_{i}=1/\sqrt{n}=\sigma$ for all $i$, the error term in Lemma 5 is $O(1/n)$. The addition of $\Gamma/\sqrt{n}$ allows us to deduce a better bound than the $O(1/\sqrt{n})$ guaranteed by the Berry-Esseen inequality.

Thus, from the Fourier inversion formula (see, e.g., [14, Vol. II, Chapter XVI]),

Denote $\varepsilon=\sqrt{\sum_{i}\theta_{i}^{4}}$. To prove the lemma, it suffices to bound the absolute value of the integral in (38) by $C^{\prime}(\varepsilon^{2}+\sigma^{2})$. We express the integral in (38) as $I_{1}+I_{2}+I_{3}$ where $I_{1}$ is the integral over $\xi\in[-\varepsilon^{-1/2},\varepsilon^{-1/2}]$, $I_{2}$ is the integral over $\varepsilon^{-1/2}\leq|\xi|\leq\sigma^{-1}$ (when $\varepsilon^{-1/2}>\sigma^{-1}$, we set $I_{2}=0$) and $I_{3}$ is the integral over $|\xi|\geq\max\{\sigma^{-1},\varepsilon^{-1/2}\}$.

Begin with estimating $I_{1}$. We use the elementary inequality

Since $|\theta_{i}|\leq\varepsilon^{1/2}$ for all $i$, then for $|\xi|\leq\varepsilon^{-1/2}$,

Combine (39) with (35) to deduce that for $|\xi|\leq\varepsilon^{-1/2}$,

Next we estimate $I_{2}$, in the case where $\varepsilon^{-1/2}\leq\sigma^{-1}$ (in the complementary case, $I_{2}=0$). Denote $\mathcal{I}=\left\{1\leq i\leq n\,;\,|\theta_{i}|\leq\sigma\right\}$. Then, by (36),

We will use the elementary inequality $|\cos s|\leq e^{-cs^{2}}$ for $|s|\leq 1$. According to (40), whenever $|\xi|\leq\sigma^{-1}$,

Apply the well-known bound $\int_{s}^{\infty}e^{-u^{2}/2}\,du\leq Ce^{-cs^{2}}$ for $s\geq 0$, to deduce

The bound for $I_{3}$ is easy. From (34) we have $\gamma(\sigma\xi)=0$ for $|\xi|\geq\sigma^{-1}$. Hence,

The lemma follows by combining the above bound for $|I_{3}|$ with the bound (41) for $|I_{2}|$ and the bound (6) for $|I_{1}|$. $\square$
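Two of the elementary bounds quoted in the proof above can be checked numerically with explicit constants. The sketch below uses $c=1/2$ in $|\cos s|\leq e^{-cs^{2}}$ for $|s|\leq 1$, and $C=\sqrt{\pi/2}$, $c=1/2$ in the Gaussian tail bound; these particular constants are our choice for illustration, and the proof only needs the existence of some universal constants. The grids and function names are likewise hypothetical.

```python
import math

C_COS = 0.5                        # assumed admissible constant in |cos s| <= exp(-c s^2)
C_TAIL = math.sqrt(math.pi / 2.0)  # assumed admissible constant C in the tail bound
TOL = 1e-15                        # slack for floating-point rounding at equality points

def cos_bound_holds(s):
    """Check |cos s| <= exp(-s^2 / 2) at a single point s."""
    return abs(math.cos(s)) <= math.exp(-C_COS * s * s) + TOL

def gaussian_tail(s):
    """Integral of exp(-u^2 / 2) over [s, infinity), expressed through erfc."""
    return math.sqrt(math.pi / 2.0) * math.erfc(s / math.sqrt(2.0))

def tail_bound_holds(s):
    """Check the tail bound with C = sqrt(pi/2) and c = 1/2 at a single point s."""
    return gaussian_tail(s) <= C_TAIL * math.exp(-0.5 * s * s) + TOL

cos_ok = all(cos_bound_holds(-1.0 + k / 1000.0) for k in range(2001))
tail_ok = all(tail_bound_holds(k / 100.0) for k in range(1001))
print(cos_ok, tail_ok)
```

A grid check of this kind is of course no proof; both inequalities follow from standard calculus arguments, and the code only confirms that the constants chosen here are plausible.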

Denote $Y=\sum_{i;|\theta_{i}X_{i}|\geq\varepsilon}\theta_{i}^{2}X_{i}^{2}$. Clearly,

The lemma follows from (42) and (43). $\square$

We may apply Lemma 5 for $(\theta_{1}x_{1},\ldots,\theta_{n}x_{n})$ and for $\sigma=\varepsilon$, and conclude that,

where we used the estimates for $F^{\prime},F^{\prime\prime}$ and the bounds (48) and (49). This completes the proof of (47). The lemma is proven. $\square$

Our next goal is to eliminate the “$\varepsilon\Gamma$” term from the conclusion of Lemma 7. The following short computational lemma serves this purpose. We shall use the standard estimate

for any $t_{0}\geq 0$ (see, e.g., [14, Vol. I, Section VII.1]).

Let $t_{0}\geq 0$ and denote $\delta=\Phi(t_{0})$. Then,

(i) $\displaystyle\Phi\left(t_{0}+2\delta^{1/4}\right)\geq C_{1}^{-1}\delta$.

(ii) $\displaystyle 1-\Phi\left(t_{0}-2\delta^{1/4}\right)\geq 1-\Phi(-2)\geq C_{1}^{-1}\geq C_{1}^{-1}\delta$.

(iii) Suppose $x>0$ satisfies $\displaystyle\left|\frac{1}{x}-\frac{1}{\varphi(t_{0})}\right|\leq c_{2}\delta^{-3/4}$. Then $\displaystyle x^{2}\leq C_{1}\delta$.

Here, $C_{1}>1$ and $0<c_{2}<1$ are universal constants.

Proof: We have $t_{0}\delta^{1/4}\leq Ct_{0}(\varphi(t_{0}))^{1/4}\leq C^{\prime}$ according to (50). Hence,

Note also that $\varphi(t_{0})\leq C/(t_{0}+1)$. Consequently, for any $x>0$,

Proof: By approximation, we may assume that the density of $X$ is $C^{1}$-smooth and everywhere positive (e.g., convolve $X$ with a very small Gaussian). We may also assume that $\varepsilon\leq c$ for a small universal constant $c>0$. The function

To prove the lemma, it suffices to show that $\max_{t}E(t)=E(t_{0})\leq CA\varepsilon^{2}$.

Step 1: Suppose first that $\Phi(t_{0})\leq 2C_{1}A\varepsilon^{2}$, for $C_{1}$ being the universal constant from Lemma 8. Then by (51),

Consequently, since $\Phi(t_{0})\leq 2C_{1}A\varepsilon^{2}$,

The desired estimate (52) is therefore proven, in the case where $\Phi(t_{0})\leq 2C_{1}A\varepsilon^{2}$.

Step 2: It remains to deal with the case where $t_{0}\geq 0$ satisfies $\Phi(t_{0})>2C_{1}A\varepsilon^{2}$. Denote $\delta=\Phi(t_{0})\geq 2C_{1}A\varepsilon^{2}\geq A\varepsilon^{2}$. Note that

under the legitimate assumption that $\varepsilon$ is smaller than a given universal constant. From Lemma 8(i) we have $\Phi\left(t_{0}+2\delta^{1/4}\right)\geq\delta/C_{1}$, hence by (51),

A similar argument, using Lemma 8(ii) in place of Lemma 8(i), shows that

We conclude that for any $t\in[t_{0}-\delta^{1/4},t_{0}+\delta^{1/4}]$,

Consequently, when $f^{\prime}(x_{0})\neq 0$,

We conclude from (55) that for any $t\in[t_{0}-\delta^{1/4},t_{0}+\delta^{1/4}]$,

Equivalently, $|(1/f)^{\prime}|\leq 4C_{1}\delta^{-1}$ in the interval $[t_{0}-\delta^{1/4},t_{0}+\delta^{1/4}]$. Hence,

for $c_{2}>0$ being the universal constant from Lemma 8. Recall from (53) that $f(t_{0})=\varphi(t_{0})$. Lemma 8(iii) thus implies that

with $c=c_{2}/(4C_{1})$. Returning to (56), we finally deduce the bound

Through Taylor’s theorem, the latter bound entails that

where $\hat{c}>0$ is the constant from (57). The crucial observation is that $s\mapsto f(t_{0})s\eta(s)$ is an odd function, hence its integral on a symmetric interval about the origin vanishes. By (57) and (58),

where $\hat{c}>0$ is the constant from (57). We apply (51) and conclude that

Since $E(t_{0})=\max_{t}E(t)$, the proof of the lemma is complete. $\square$

with some universal constant $C\geq 1$. The random variable $Y$ has an even, log-concave density by Prékopa-Leindler. We may thus apply Lemma 9, and conclude from (59) that

With Cédric Villani’s permission, we reproduce below the proof of Theorem 2 from his book [40, Section 7.6] with a few minor changes.

By taking the infimum over all couplings $\gamma$ of $\mu$ and $\mu_{\varepsilon}$, we obtain

with $R$ depending only on $\varphi$. We may assume that $\liminf_{\varepsilon\rightarrow 0^{+}}W_{2}(\mu,\mu_{\varepsilon})/\varepsilon<\infty$; otherwise, there is nothing to prove. Consequently,

Hence by letting $\varepsilon$ tend to zero in (62), we deduce (60). The proof is complete. $\square$

References