A Berry-Esseen type inequality for convex bodies with an unconditional basis

Bo'az Klartag

Introduction

Equivalently, the random vector $(X_{1},\ldots,X_{n})$ has the same distribution as $(\pm X_{1},\ldots,\pm X_{n})$ for any choice of signs.

We prove the following Berry-Esseen type theorem:

The bound in (1) is optimal, up to the precise value of the constant, as shown by the example of $X_{1},\ldots,X_{n}$ being independent random variables, with each $X_{i}$ distributed, say, uniformly in a symmetric interval (see, e.g., [14, Vol. II, Section XVI.4]). A central element in the proof of Theorem 2 is the sharp estimate

Previous techniques for obtaining thin spherical shell estimates under convexity assumptions relied almost entirely on concentration of measure ideas, either on the sphere (see ), or on the orthogonal group (see ). The quantitative estimates that these techniques have yielded so far are sub-optimal. Inequality (3) was previously known to hold with the bound $C/n^{\kappa}$ in place of $C/n$ , where the exponent $\kappa$ is slightly smaller than $1/5$ , see . The latter result is applicable for all isotropically-normalized random vectors with a log-concave density.
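The optimality example mentioned for the bound (1) can be checked numerically in its simplest instance. The following is a minimal sketch, assuming equal coefficients $1/\sqrt{n}$ and variables $X_{i}$ uniform on $[-\sqrt{3},\sqrt{3}]$ (so that each has variance one); the helper names are ours and purely illustrative. It evaluates the exact distribution function of the normalized sum through the classical Irwin-Hall formula and records its largest deviation from the standard normal distribution function on a grid. If the rate $1/n$ is sharp, then $n$ times this deviation should stay roughly constant.

```python
import math

def irwin_hall_cdf(x, n):
    """Distribution function of the sum of n independent Uniform[0,1] variables."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    total = 0.0
    for k in range(int(math.floor(x)) + 1):
        total += (-1) ** k * math.comb(n, k) * (x - k) ** n
    return total / math.factorial(n)

def normal_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def sup_error(n, grid=4001, t_max=4.0):
    """Largest gap, over a grid of t, between the normalized uniform sum and the normal law."""
    worst = 0.0
    for j in range(grid):
        t = -t_max + 2.0 * t_max * j / (grid - 1)
        # The sum of n Uniform[0,1] variables has mean n/2 and variance n/12.
        x = n / 2.0 + t * math.sqrt(n / 12.0)
        worst = max(worst, abs(irwin_hall_cdf(x, n) - normal_cdf(t)))
    return worst

for n in (4, 8, 16):
    print(n, n * sup_error(n))
```

The values of $n$ are kept small because the alternating sum in the Irwin-Hall formula loses floating-point accuracy as $n$ grows.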

In this article we suggest a different approach. Rather than employing concentration of measure inequalities, our proof of the optimal inequality (3) is based on analysis of the Neumann Laplacian on convex domains, the so-called $L^{2}$ -method in convexity, going back to Hörmander and to Helffer and Sjöstrand . The argument is further simplified by using the theory of optimal transportation of measures. We expect this technique to be useful also in the study of other problems in convex geometry, such as central limit theorems for convex bodies with various types of symmetries. The argument leading to the thin shell estimate occupies Section 2, Section 3 and Section 5. In Section 6 we apply these estimates and complete the proof of Theorem 2.

Readers who are interested only in the proof of inequality (3) and Theorem 2 may skip Section 4. That section is devoted to several by-product results concerning the first non-zero eigenvalue and the corresponding eigenfunctions of the Neumann Laplacian on $n$-dimensional convex bodies. In particular, we show that the eigenfunctions are all “biased” towards some direction in space. This rules out, for instance, the possibility of an even eigenfunction.

Acknowledgement. We would like to express our gratitude to Sasha Sodin for his kind help with the analysis related to the classical central limit theorem, to Tom Spencer for illuminating explanations regarding the work of Helffer and Sjöstrand, and to Dario Cordero-Erausquin, Leonid Friedlander, Robert McCann, Emanuel Milman, Vitali Milman and Elias Stein for valuable discussions on related topics. Thanks also to the referee for useful comments and suggestions.

Convexity and the Neumann Laplacian

with $E=Vol_{n}(K)^{-1}\int_{K}f$ . The main result of this section reads as follows:

and $\rho(x)\leq 0$ for $x\in K$ . For instance, we may select $\rho(x)=-d(x,\partial K)=-\inf_{y\in\partial K}|x-y|$ . Note that for any $x\in\partial K$ , the vector $\nabla\rho(x)$ is the outer unit normal to $\partial K$ at $x$ .
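For a concrete instance (an illustration that is not taken from the text), let $K$ be the Euclidean unit ball. Then, away from the origin,

$\displaystyle \rho(x)=-d(x,\partial K)=|x|-1,\qquad\nabla\rho(x)=\frac{x}{|x|},\qquad\nabla^{2}\rho(x)=\frac{1}{|x|}\left(\mathrm{Id}-\frac{x\otimes x}{|x|^{2}}\right).$

On $\partial K$ the gradient is the outer unit normal $x$, and the Hessian is the orthogonal projection onto the tangent space $x^{\perp}$, which is positive semi-definite.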

The following lemma is a standard Bochner-Weitzenböck type integration by parts formula, going back at least to Lichnerowicz , to Hörmander and to Kadlec . We write $\nabla^{2}u$ for the Hessian matrix of the function $u$.

Let $u\in\mathcal{D}$ and denote $f=-\triangle u$ . Then,

Proof: The function $x\mapsto\nabla u(x)\cdot\nabla\rho(x)$ vanishes on $\partial K$ . Since $\nabla u$ is tangential to $\partial K$ , the derivative of the function $x\mapsto\nabla u(x)\cdot\nabla\rho(x)$ in the direction of $\nabla u$ vanishes on $\partial K$ . That is,

The boundary term vanishes, since $\nabla u\cdot\nabla\rho=0$ on $\partial K$. We conclude from (7) and from an additional application of Stokes' theorem that

Note that the integrand in the integral over $\partial K$ is exactly $\nabla^{2}u(\nabla\rho)\cdot\nabla u$ . Hence, from (6),

The convexity of $K$ will be used next. Recall that $\rho$ is a convex function, and hence its Hessian $\nabla^{2}\rho(x)$ is a positive semi-definite matrix for any $x\in\partial K$. Therefore, Lemma 5 implies that for any $u\in\mathcal{D}$,

where $f=-\triangle u$. Lemma 4 will be proven by dualizing inequality (8), in a way that is closely related to the approach taken by Hörmander and by Helffer and Sjöstrand .

Proof of Lemma 4: We are given $f\in C^{\infty}(K)$ and we would like to prove (4). We may assume that $\int_{K}f=0$ (otherwise, subtract $\frac{1}{Vol_{n}(K)}\int_{K}f$ from the function $f$ ).

Since $f\in C^{\infty}(K)$ and $\int_{K}f=0$ , there exists $u\in\mathcal{D}$ with

The existence of such $u\in\mathcal{D}$ is a consequence of the classical existence and regularity theory of the Neumann problem for the Laplacian on domains with a $C^{\infty}$-smooth boundary (see, e.g., Folland’s book [16, Chapter 7]). Stokes' theorem yields

where the boundary term vanishes since $u\in\mathcal{D}$. From the definition of the $H^{-1}(K)$-norm and the Cauchy-Schwarz inequality,

Transportation of Measure

This definition agrees with the one given in Section 2; we have $\|u\|_{H^{-1}(\lambda_{K})}=\|u\|_{H^{-1}(K)}$, where $\lambda_{K}$ denotes the restriction of the Lebesgue measure to $K$.

The next theorem is an extension of a remark by Yann Brenier that we learned from Robert McCann. For the convenience of the reader, we provide in the appendix a detailed exposition of the elegant proof from Villani [40, Section 7.6].

For a sufficiently small $\varepsilon>0$ , let $\mu_{\varepsilon}$ be the measure whose density with respect to $\mu$ is the non-negative function $1+\varepsilon h$ . Then,

the line segment from $\mathcal{B}_{i}^{-}(x)$ to $\mathcal{B}_{i}^{+}(x)$ . See Figure 1.

For $i=1,\ldots,n$ consider the projection

For a sufficiently small $\varepsilon>0$ denote by $\mu_{\varepsilon}$ the measure whose density with respect to $\mu$ is $1+\varepsilon\partial^{i}\Psi$ . Then,

Proof: Without loss of generality, assume that $i=1$ . For a sufficiently small $\varepsilon>0$ , the function $1+\varepsilon\partial^{1}\Psi$ is positive on $K$ , and hence $\mu_{\varepsilon}$ is a non-negative measure. Fix such a sufficiently small $\varepsilon>0$ .

Consequently, the densities $t\mapsto 1$ and $t\mapsto 1+\varepsilon\partial^{1}\Psi(t,y)$ have an equal amount of mass on the interval $[p,q]$ . We consider the monotone transportation between these two densities. That is, we define a map $T=T^{y}:[p,q]\rightarrow[p,q]$ by requiring that for any $x_{1}\in[p,q]$ ,

The unique map $T:[p,q]\rightarrow[p,q]$ that satisfies (11) transports the measure whose density is $1+\varepsilon\partial^{1}\Psi(t,y)$ on $[p,q]$ to the Lebesgue measure on $[p,q]$ . We deduce from (11) that for $x_{1}\in[p,q]$ ,

with $|R|$ bounded by a constant depending only on $\Psi$ and $K$ (and in particular, independent of $\varepsilon$ or $y$ ). We now let $y\in\pi_{1}(K)$ vary, and we write

Therefore the map $S$ transports $\mu_{\varepsilon}$ to $\mu$ . According to (3),

with $|R^{\prime}|$ smaller than a constant depending only on $K$ and $\Psi$ , and in particular independent of $\varepsilon$ . To complete the proof, let $\varepsilon$ tend to zero. $\square$
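The one-dimensional monotone transportation used in the proof above can be sketched numerically. The following is a minimal illustration, assuming the convention that $T$ is determined by matching cumulative mass, $T(x)-p=\int_{p}^{x}(1+\varepsilon\,\partial\Psi)$, so that $T$ pushes the density $1+\varepsilon\,\partial\Psi$ forward to the Lebesgue measure on $[p,q]$. The function `psi`, the interval and the value of `eps` are hypothetical choices; the only property used is that `psi` takes equal values at the two endpoints. Under this convention the displacement $T(x)-x$ equals $\varepsilon\left(\Psi(x)-\Psi(p)\right)$, which is what the script checks.

```python
import math

def monotone_map(density, p, q, steps=2000):
    """Grid points x_j in [p, q] with T(x_j) = p + integral of density over [p, x_j]."""
    h = (q - p) / steps
    xs, ts = [p], [p]
    mass = 0.0
    for j in range(steps):
        mass += density(p + (j + 0.5) * h) * h  # midpoint rule on [x_j, x_{j+1}]
        xs.append(p + (j + 1) * h)
        ts.append(p + mass)
    return xs, ts

# Hypothetical data: psi has equal values at the endpoints, so its
# derivative integrates to zero over [p, q] and total mass is preserved.
p, q, eps = -1.0, 1.0, 0.05
psi = lambda t: math.cos(math.pi * t)
dpsi = lambda t: -math.pi * math.sin(math.pi * t)

xs, ts = monotone_map(lambda t: 1.0 + eps * dpsi(t), p, q)

# Largest deviation of the displacement T(x) - x from eps * (psi(x) - psi(p)).
gap = max(abs((t - x) - eps * (psi(x) - psi(p))) for x, t in zip(xs, ts))
print(ts[-1], gap)
```

The displacement is of order $\varepsilon$, which is the source of the factor $\varepsilon$ in the transportation cost once the map is integrated over all fibers.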

A digression: Neumann eigenvalues and eigenfunctions

This section presents some additional relations between convexity and the Neumann Laplacian. We retain the setup and notation of Section 2. We write $L^{2}(K)$ for the Hilbert space that is the completion of $C^{\infty}(K)$ with respect to the norm

The operator $-\triangle$, acting on the subspace $\mathcal{D}\subset L^{2}(K)$, is a symmetric, positive semi-definite operator. The classical theory implies that $-\triangle$ has a complete system of orthonormal Neumann eigenfunctions $\varphi_{0},\varphi_{1},\ldots\in\mathcal{D}$ and Neumann eigenvalues $0\leq\lambda_{0}\leq\lambda_{1}\leq\ldots$ (see, e.g., [16, Chapter 7]). The first eigenvalue is $\lambda_{0}=0$, with the eigenfunction $\varphi_{0}$ being constant. It is well-known that $\lambda_{1}>0$ when $K$ is convex (see, e.g., . It is actually enough to assume that $K$ is connected, see e.g., [11, Theorem 1]). We refer to $\lambda_{1}$ as the first non-zero Neumann eigenvalue of $K$. It is well-known that for any $C^{\infty}(K)$-smooth function $u$ with $\int_{K}u=0$,

Equality in (13) holds if and only if $u$ is an eigenfunction corresponding to the eigenvalue $\lambda_{1}$ .

We say that the boundary of $K$ is uniformly strictly convex if $\nabla^{2}\rho(x)$ is a positive definite matrix for any $x\in\partial K$ . Equivalently, $\partial K$ is uniformly strictly convex if the principal curvatures are all positive – and not merely non-negative – everywhere on the boundary. Our next corollary claims, loosely speaking, that any non-trivial eigenfunction corresponding to $\lambda_{1}$ cannot be “spatially isotropic”, but must have “preference” for a certain direction in space.

Consequently, the multiplicity of the first non-zero Neumann eigenvalue is at most $n$ .

We write $\lambda_{1}$ for the first non-zero eigenvalue, i.e., $\triangle\varphi=-\lambda_{1}\varphi$ . Since $\varphi\in\mathcal{D}$ , inequality (8) gives

From (15) we know that $\int_{K}\partial^{i}\varphi=0$ for all $i$ . Thus (16) and (13) yield

Therefore, there must be equality in all steps and hence $\partial^{1}\varphi,\ldots,\partial^{n}\varphi$ are all Neumann eigenfunctions with eigenvalue $\lambda_{1}$ . We necessarily have equality also in (16). According to Lemma 5 this means that

Since the integrand is non-negative and continuous, necessarily

So far we have only used the convexity of $K$ . The uniform strict convexity of $\partial K$ means that $\nabla^{2}\rho>0$ on $\partial K$ . Equation (17) has the consequence that $\nabla\varphi=0$ on $\partial K$ , and therefore

This is well-known to be impossible for a Neumann eigenfunction corresponding to the first non-zero eigenvalue. We sketch the standard argument, see, e.g., for more information. Denote

Remark. Leonid Friedlander explained to us how to eliminate the uniform strict convexity requirement from Corollary 1. His idea is to observe that since $\partial^{1}\varphi,\ldots,\partial^{n}\varphi$ are all eigenfunctions, the restriction of $\varphi$ to the boundary $\partial K$ is actually an eigenfunction of the Laplacian associated with the Riemannian manifold $\partial K$. However, (17) entails that $\varphi$ is constant in some open set in $\partial K$, which is known to be impossible for an eigenfunction. We omit the details.

i.e., we flip the sign of the $i^{th}$ coordinate. For a function $f$ , we write $\sigma_{i}(f)(x)=f(\sigma_{i}(x))$ . Our next corollary exploits the well-known relationship between the eigenfunctions and symmetry. Similar arguments appear, e.g., in .

If $K$ is unconditional, then there exist $i=1,\ldots,n$ and an eigenfunction $0\not\equiv\varphi\in E_{\lambda_{1}}$ , such that

If $K$ is centrally-symmetric (i.e., $K=-K$ ), then there exists an eigenfunction $0\not\equiv\varphi\in E_{\lambda_{1}}$ , such that

Proof: Begin with the proof of (i). We are given the unconditional convex body $K$. Since $K$ is unconditional, $f\in E_{\lambda_{1}}$ implies $\sigma_{i}(f)\in E_{\lambda_{1}}$ for $i=1,\ldots,n$. Begin with any non-zero eigenfunction $f_{0}\in E_{\lambda_{1}}$, and recursively define

Then $f_{0},f_{1},\ldots,f_{n}\in E_{\lambda_{1}}$. If there exists $i\in\{1,\ldots,n\}$ such that $f_{i}\equiv 0$ then we are done: suppose $i$ is the minimal such index. Then $0\not\equiv f_{i-1}\in E_{\lambda_{1}}$ with $\sigma_{i}(f_{i-1})=-f_{i-1}$, and we have found our desired eigenfunction.

It remains to deal with the case where $\psi=f_{n}$ is a non-zero eigenfunction. Note that $\sigma_{i}(\psi)=\psi$ and hence

In the proof of Corollary 1 (the first part, which did not use the uniform strict convexity) we observed that (20) implies that $\partial^{1}\psi,\ldots,\partial^{n}\psi\in E_{\lambda_{1}}$ . Since $\int_{K}|\nabla\psi|^{2}>0$ , there exists $i=1,\ldots,n$ with $\partial^{i}\psi\not\equiv 0$ . We see from (19) that $\partial^{i}\psi\in E_{\lambda_{1}}$ is the eigenfunction we are looking for. This completes the proof of the first part of the lemma.

The proof of the second part is similar. Begin with any $0\not\equiv f\in E_{\lambda_{1}}$ and set $\psi(x)=f(x)+f(-x)$. If $\psi\equiv 0$, then $f$ is an odd function and we are done. Otherwise, $\psi$ is an even function, hence $\int_{K}\nabla\psi=0$. As before, this implies that $\partial^{1}\psi,\ldots,\partial^{n}\psi$ are all odd eigenfunctions corresponding to the same eigenvalue $\lambda_{1}$. $\square$

Corollary 1 and Corollary 2 seem very much expected. Notably, Nadirashvili has proved that in two dimensions, the multiplicity of the first non-zero Neumann eigenvalue is at most $2$ for any simply-connected domain. Our simple proof of Corollary 1 is not applicable in such generality. Corollary 1 is related to the “hot spots” problem, see, e.g., Burdzy , Jerison and Nadirashvili and references therein. A proof of Corollary 2 for the two-dimensional case – under much more general assumptions than convexity – can be found in [2, Theorem 4.3]. However, the proofs of the two-dimensional results mentioned do not seem to admit easy generalization to higher dimensions. As observed by Payne and Weinberger , Corollary 2 leads to the following comparison principle:

Denote by $\lambda_{1}>0$ the first non-zero Neumann eigenvalue of $K$ . Then,

Equality holds when $K=[-R,R]^{n}$ , an $n$ -dimensional cube.

According to Corollary 2(i), there exists an index $1\leq i\leq n$ and a non-zero eigenfunction $\varphi$ corresponding to $\lambda_{1}$ such that $\sigma_{i}(\varphi)=-\varphi$ . By Fubini’s theorem and (21),

hence $\lambda_{1}\geq\pi^{2}/(4R^{2})$. $\square$

Corollary 3 shows that the cube satisfies a certain domain monotonicity principle for the Neumann Laplacian, at least in the category of unconditional, convex bodies. The Euclidean ball, for instance, does not satisfy a corresponding principle.
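The extremal role of the cube can be seen in a short numerical sketch. Assuming $K=[-R,R]^{n}$ and the test function $u(x)=\sin\left(\pi x_{1}/(2R)\right)$ (our choice, for illustration only), $u$ is odd in $x_{1}$, its normal derivative vanishes on the boundary, and by Fubini's theorem its Rayleigh quotient $\int_{K}|\nabla u|^{2}/\int_{K}u^{2}$ reduces to a one-dimensional computation whose value is $\pi^{2}/(4R^{2})$. The value of `R` below is arbitrary.

```python
import math

def rayleigh_quotient_1d(u, du, a, b, steps=20000):
    """Midpoint-rule approximation of (integral of u'^2) / (integral of u^2) over [a, b]."""
    h = (b - a) / steps
    num = den = 0.0
    for j in range(steps):
        x = a + (j + 0.5) * h
        num += du(x) ** 2
        den += u(x) ** 2
    return num / den

R = 1.5                      # arbitrary half-side of the cube
k = math.pi / (2.0 * R)
u = lambda x: math.sin(k * x)        # odd, with u'(-R) = u'(R) = 0
du = lambda x: k * math.cos(k * x)

quotient = rayleigh_quotient_1d(u, du, -R, R)
print(quotient, math.pi ** 2 / (4.0 * R ** 2))
```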

where $\lambda_{1}(K)>0$ is the first non-zero Neumann eigenvalue of $K$ , and $c>0$ is a universal constant. To establish (22), consider

Use Corollary 3 to deduce the bound $\lambda_{1}(K^{\prime})>c/\log^{2}(n+1)$ . The body $K^{\prime}$ is a good approximation to the body $K$ : It is easily proven that

We may thus apply E. Milman’s result [27, Theorem 1.7], which builds upon the Sternberg-Zumbrun concavity principle , to conclude that $\lambda_{1}(K)\geq c\lambda_{1}(K^{\prime})$ and the bound (22) follows. See for a conjectural better bound, without the logarithmic factor.

Unconditional convex bodies

We begin this section with a corollary to the theorems of Section 2 and Section 3.

Proof: Begin with (i). By approximation, we may assume that $K$ has a $C^{\infty}$ -smooth boundary, and that $\Psi$ is a $C^{\infty}(K)$ -smooth function. Lemma 4 states that

Fix $i=1,\ldots,n$ . We may apply Theorem 2 for $h=\partial^{i}\Psi$ since $\int_{K}\partial^{i}\Psi=0$ , as implied by the symmetries of $\Psi$ . We may apply Lemma 3, since clearly $\Psi\left(\mathcal{B}_{i}^{+}(x)\right)=\Psi\left(\mathcal{B}_{i}^{-}(x)\right)$ for any $x\in K$ . Theorem 2 and Lemma 3 entail the inequality

This proves (i). To deduce (ii), denote $\Psi_{i}(x_{1},\ldots,x_{n})=f_{i}(x_{i})$. Observe that $\Psi(x)=\sum_{i=1}^{n}\Psi_{i}(x)$ is unconditional and that for any $x\in K$ and $i=1,\ldots,n$,

We will use the following simple identities:

According to Corollary 4(i), it suffices to prove that for any $i=1,\ldots,n$ ,

Fix $i=1,\ldots,n$ . We will prove (25) by Fubini’s theorem. Fix a point

and denote $r=q_{i}^{+}(x^{\prime})\geq 0$ . In order to prove (25), it is enough to show that

The equality we need is exactly the content of (23). The proof of (i) is thus complete, in the case where $X$ is distributed uniformly in a convex body. The proof of (ii) is almost entirely identical. By approximation, we may assume that $f_{1},\ldots,f_{n}$ are continuous. According to Corollary 4(ii), it is sufficient to prove that

This follows by Fubini’s theorem and (24). The lemma is thus proven, in the case where $X$ is distributed uniformly in an unconditional convex body.

where $\kappa_{s}=\pi^{s/2}/\Gamma(s/2+1)$ is the volume of the $s$ -dimensional Euclidean unit ball. Suppose that $Z=(Z_{1},\ldots,Z_{N})$ is a random vector that is distributed uniformly in $K$ . According to the case already considered, conclusions (i) and (ii) hold when the $X_{1},\ldots,X_{n}$ are replaced by $Z_{1},\ldots,Z_{n}$ . However, the random vector $(Z_{1},\ldots,Z_{n})$ has the same distribution as $X=(X_{1},\ldots,X_{n})$ . Thus (i) and (ii) hold also in the case where the density $f$ is $s$ -concave.
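The constant $\kappa_{s}$ is easy to evaluate directly from the formula above. A minimal sketch follows; the function name is ours.

```python
import math

def unit_ball_volume(s):
    """kappa_s = pi^(s/2) / Gamma(s/2 + 1), the volume of the s-dimensional unit ball."""
    return math.pi ** (s / 2.0) / math.gamma(s / 2.0 + 1.0)

# Familiar low-dimensional values: length 2, area pi, volume 4*pi/3.
for s in (1, 2, 3):
    print(s, unit_ball_volume(s))
```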

Finally, an approximation argument eliminates the requirement that the density $f$ be $s$-concave: Write $f=e^{-\psi}$ for the unconditional, log-concave density of $X$. Then, for any $s>0$, the function

Lemma 4 may be viewed as a substitute for the sub-independent coordinates idea of Anttila, Ball and Perissinaki : Note the absence of cross terms from the right-hand side of Lemma 4(i). Suppose $X$ is a real-valued random variable with an even, log-concave density. A classical inequality (see, e.g., , or [3, Theorem 12] and references therein) states that for any $p\geq 2$ ,

The following corollary contains a few obvious consequences of Lemma 4.

where $C^{\prime}\leq 16$ is a universal constant. Consequently,

with $C\leq 4$ , a positive universal constant. Moreover, for any $p\geq 1$ ,

where $C_{p}>0$ is a constant depending only on $p$ .

Proof: According to the Prékopa-Leindler inequality (see, e.g., the first pages of ), the random variable $X_{i}$ has an even, log-concave density for all $i$ . From Lemma 4(i) and (26) we see that

This proves (i). By setting $a_{i}=1\ (i=1,\ldots,n)$ in (5), we deduce that

where $C_{p}$ is a constant depending solely on $p\geq 1$ . This completes the proof. $\square$

where $c,C>0$ are universal constants. Another large-deviations estimate that was proved by Bobkov and Nazarov is that

with, say, $\alpha=0.33$ and $\beta=3.33$ (see ).

Cordero-Erausquin, Fradelizi and Maurey have recently proved the so-called (B)-conjecture in the unconditional case. This entails the following improvement over the Brunn-Minkowski theory:

(The Prékopa-Leindler inequality leads to the weaker statement in which the $e^{t}$ is replaced by $t$.) Corollary 5(ii) and the Markov-Chebyshev inequality yield

After some simple manipulations, we deduce the inequality

follows by combining Corollary 5(ii) with the distribution inequalities of Nazarov, Sodin and Volberg . We omit the details.

Berry-Esseen type bounds

In previous sections we established sharp thin shell estimates for unconditional, log-concave densities. In the present section we complete the proof of Theorem 2. The argument we present is quite technical and is closely related to classical treatments of the central limit theorem for independent random variables. The reader may refer to, e.g., [14, Vol. II, Chapter XVI] for background on the rate of convergence in the classical central limit theorem. We are indebted to Sasha Sodin for many discussions, suggestions and simplifications that have led to the proofs we present below.

Before proceeding to the actual proof, let us describe the general idea. Introduce independent, symmetric Bernoulli variables $\Delta_{1},\ldots,\Delta_{n}$ . That is,

These Bernoulli variables are also assumed to be independent of $X$ . Write

where the last inequality holds only for “typical” values of $X$ . Since $|X|/\sqrt{n}$ is strongly concentrated around $1$ , as we learn from (3), we may substitute the $\Phi\left(t\sqrt{n}/|X|\right)$ term in (33) by $\Phi(t)$ . Observe that since $X$ is unconditional, the random variables

have exactly the same distribution. Hence, by considering the expectation over $X$ in (33), we deduce a weaker version of (1) where the $C/n$ is replaced with $C/\sqrt{n}$ . In order to arrive at the optimal bound, we need to apply a smoothing technique: The estimate (33) will be replaced with a much better Berry-Esseen inequality which is available for the random variable $\Gamma+\left(\sum_{i}\Delta_{i}X_{i}\right)\left/\sqrt{n}\right.$ , for an appropriate “small” random variable $\Gamma$ . The details will be described next.

For instance, $\Gamma$ may be the random variable whose density is

for appropriate universal constants $\kappa_{1},\kappa_{2}$ . (For this specific choice, $\gamma$ is the $8$ -fold convolution of the characteristic function of an interval.) We shall use the standard $O$ -notation in this section. The notation $O(x)$ , for some expression $x$ , is an abbreviation for some complicated quantity $y$ with the property that

for some universal constant $C>0$ . All constants hidden in the $O$ -notation in our proof are in principle explicit. The following lemma seems rather standard (see [14, Vol. II, Chapter XVI] for similar statements). For lack of a precise reference, we provide its proof.

Remark. Note that when $\theta_{i}=1/\sqrt{n}=\sigma$ for all $i$ , the error term in Lemma 5 is $O(1/n)$ . The addition of $\Gamma/\sqrt{n}$ allows us to deduce a better bound than the $O(1/\sqrt{n})$ guaranteed by the Berry-Esseen inequality.
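The need for the smoothing term can be illustrated numerically. The following is a minimal sketch, assuming the model case of independent symmetric signs $\Delta_{1},\ldots,\Delta_{n}$ with equal coefficients $1/\sqrt{n}$ and no smoothing at all; the function names are ours. Since the sum is then supported on a lattice, its distribution function has a jump of order $1/\sqrt{n}$ at the origin, so the largest deviation from the normal law decays only like $1/\sqrt{n}$.

```python
import math

def normal_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def sup_error_signs(n):
    """sup over t of |P(sum_i Delta_i / sqrt(n) <= t) - normal_cdf(t)| for independent signs."""
    worst = 0.0
    cdf = 0.0
    for k in range(n + 1):                 # k = number of +1 signs, the sum equals 2k - n
        t = (2 * k - n) / math.sqrt(n)
        target = normal_cdf(t)
        worst = max(worst, abs(cdf - target))   # left limit at the atom
        cdf += math.comb(n, k) / 2 ** n
        worst = max(worst, abs(cdf - target))   # value at the atom
    return worst

for n in (16, 64, 256):
    print(n, math.sqrt(n) * sup_error_signs(n))
```

Because the distribution function is a step function and the normal distribution function is increasing, it suffices to compare the two at the atoms, from the left and from the right.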

Thus, from the Fourier inversion formula (see, e.g., [14, Vol. II, Chapter XVI]),

Denote $\varepsilon=\sqrt{\sum_{i}\theta_{i}^{4}}$ . To prove the lemma, it suffices to bound the absolute value of the integral in (38) by $C^{\prime}(\varepsilon^{2}+\sigma^{2})$ . We express the integral in (38) as $I_{1}+I_{2}+I_{3}$ where $I_{1}$ is the integral over $\xi\in[-\varepsilon^{-1/2},\varepsilon^{-1/2}]$ , $I_{2}$ is the integral over $\varepsilon^{-1/2}\leq|\xi|\leq\sigma^{-1}$ (when $\varepsilon^{-1/2}>\sigma^{-1}$ , we set $I_{2}=0$ ) and $I_{3}$ is the integral over $|\xi|\geq\max\{\sigma^{-1},\varepsilon^{-1/2}\}$ .
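For orientation, here is a worked instance of this decomposition in the model case $\theta_{i}=1/\sqrt{n}=\sigma$ considered in the remark above:

$\displaystyle \varepsilon=\sqrt{\sum_{i=1}^{n}\theta_{i}^{4}}=\sqrt{n\cdot n^{-2}}=n^{-1/2},\qquad\varepsilon^{-1/2}=n^{1/4}\leq n^{1/2}=\sigma^{-1}.$

The three ranges are therefore $|\xi|\leq n^{1/4}$, $n^{1/4}\leq|\xi|\leq n^{1/2}$ and $|\xi|\geq n^{1/2}$, and the target bound is $C^{\prime}(\varepsilon^{2}+\sigma^{2})=2C^{\prime}/n$, in agreement with the $O(1/n)$ error term noted in that remark.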

Begin with estimating $I_{1}$ . We use the elementary inequality

Since $|\theta_{i}|\leq\varepsilon^{1/2}$ for all $i$ , then for $|\xi|\leq\varepsilon^{-1/2}$ ,

Combine (39) with (35) to deduce that for $|\xi|\leq\varepsilon^{-1/2}$ ,

Next we estimate $I_{2}$ , in the case where $\varepsilon^{-1/2}\leq\sigma^{-1}$ (in the complementary case, $I_{2}=0$ ). Denote $\mathcal{I}=\left\{1\leq i\leq n\,;\,|\theta_{i}|\leq\sigma\right\}$ . Then, by (36),

We will use the elementary inequality $|\cos s|\leq e^{-cs^{2}}$ for $|s|\leq 1$ . According to (40), whenever $|\xi|\leq\sigma^{-1}$ ,

Apply the well-known bound $\int_{s}^{\infty}e^{-u^{2}/2}\leq Ce^{-cs^{2}}$ for $s\geq 0$ , to deduce

The bound for $I_{3}$ is easy. From (34) we have $\gamma(\sigma\xi)=0$ for $|\xi|\geq\sigma^{-1}$ . Hence,

The lemma follows by combining the above bound for $|I_{3}|$ with the bound (41) for $|I_{2}|$ and the bound (6) for $|I_{1}|$ . $\square$

Denote $Y=\sum_{i;|\theta_{i}X_{i}|\geq\varepsilon}\theta_{i}^{2}X_{i}^{2}$ . Clearly,

The lemma follows from (42) and (43). $\square$

We may apply Lemma 5 for $(\theta_{1}x_{1},\ldots,\theta_{n}x_{n})$ and for $\sigma=\varepsilon$ , and conclude that,

where we used the estimates for $F^{\prime},F^{\prime\prime}$ and the bounds (48) and (49). This completes the proof of (47). The lemma is proven. $\square$

Our next goal is to eliminate the “ $\varepsilon\Gamma$ ” term from the conclusion of Lemma 7. The following short computational lemma serves this purpose. We shall use the standard estimate

for any $t_{0}\geq 0$ (see, e.g., [14, Vol. I, Section VII.1]).

Let $t_{0}\geq 0$ and denote $\delta=\Phi(t_{0})$ . Then,

$\displaystyle\Phi\left(t_{0}+2\delta^{1/4}\right)\geq C_{1}^{-1}\delta$ .

$\displaystyle 1-\Phi\left(t_{0}-2\delta^{1/4}\right)\geq 1-\Phi(-2)\geq C_{1}^{-1}\geq C_{1}^{-1}\delta$ .

Suppose $x>0$ satisfies $\displaystyle\left|\frac{1}{x}-\frac{1}{\varphi(t_{0})}\right|\leq c_{2}\delta^{-3/4}$ . Then $\displaystyle x^{2}\leq C_{1}\delta$ .

Here, $C_{1}>1$ and $0<c_{2}<1$ are universal constants.

Proof: We have $t_{0}\delta^{1/4}\leq Ct_{0}(\varphi(t_{0}))^{1/4}\leq C^{\prime}$ according to (50). Hence,

Note also that $\varphi(t_{0})\leq C/(t_{0}+1)$ . Consequently, for any $x>0$ ,

Proof: By approximation, we may assume that the density of $X$ is $C^{1}$-smooth and everywhere positive (e.g., convolve the density of $X$ with a very narrow Gaussian). We may also assume that $\varepsilon\leq c$ for a small universal constant $c>0$. The function

To prove the lemma, it suffices to show that $\max_{t}E(t)=E(t_{0})\leq CA\varepsilon^{2}$ .

Step 1: Suppose first that $\Phi(t_{0})\leq 2C_{1}A\varepsilon^{2}$ , for $C_{1}$ being the universal constant from Lemma 8. Then by (51),

Consequently, since $\Phi(t_{0})\leq 2C_{1}A\varepsilon^{2}$ ,

The desired estimate (52) is therefore proven, in the case where $\Phi(t_{0})\leq 2C_{1}A\varepsilon^{2}$ .

Step 2: It remains to deal with the case where $t_{0}\geq 0$ satisfies $\Phi(t_{0})>2C_{1}A\varepsilon^{2}$ . Denote $\delta=\Phi(t_{0})\geq 2C_{1}A\varepsilon^{2}\geq A\varepsilon^{2}$ . Note that

under the legitimate assumption that $\varepsilon$ is smaller than a given universal constant. From Lemma 8(i) we have $\Phi\left(t_{0}+2\delta^{1/4}\right)\geq\delta/C_{1}$ , hence by (51),

A similar argument, using Lemma 8(ii) in place of Lemma 8(i), shows that

We conclude that for any $t\in[t_{0}-\delta^{1/4},t_{0}+\delta^{1/4}]$ ,

Consequently, when $f^{\prime}(x_{0})\neq 0$ ,

We conclude from (55) that for any $t\in[t_{0}-\delta^{1/4},t_{0}+\delta^{1/4}]$ ,

Equivalently, $|(1/f)^{\prime}|\leq 4C_{1}\delta^{-1}$ in the interval $[t_{0}-\delta^{1/4},t_{0}+\delta^{1/4}]$ . Hence,

for $c_{2}>0$ being the universal constant from Lemma 8. Recall from (53) that $f(t_{0})=\varphi(t_{0})$ . Lemma 8(iii) thus implies that

with $c=c_{2}/4C_{1}$ . Returning to (56), we finally deduce the bound

Through Taylor’s theorem, the latter bound entails that

where $\hat{c}>0$ is the constant from (57). The crucial observation is that $s\mapsto f(t_{0})s\eta(s)$ is an odd function, hence its integral on a symmetric interval about the origin vanishes. By (57) and (58),

where $\hat{c}>0$ is the constant from (57). We apply (51) and conclude that

Since $E(t_{0})=\max_{t}E(t)$ , the proof of the lemma is complete. $\square$

with some universal constant $C\geq 1$ . The random variable $Y$ has an even, log-concave density by Prékopa-Leindler. We may thus apply Lemma 9, and conclude from (59) that

With Cédric Villani’s permission, we reproduce below the proof of Theorem 2 from his book [40, Section 7.6] with a few minor changes.

By taking the infimum over all couplings $\gamma$ of $\mu$ and $\mu_{\varepsilon}$ , we obtain

with $R$ depending only on $\varphi$ . We may assume that $\liminf_{\varepsilon\rightarrow 0^{+}}W_{2}(\mu,\mu_{\varepsilon})/\varepsilon<\infty$ ; otherwise, there is nothing to prove. Consequently,