Dimensionality and the stability of the Brunn-Minkowski inequality
Ronen Eldan, Bo'az Klartag
Introduction
The Brunn-Minkowski inequality states, in one of its normalizations, that
The literature contains various stability estimates for the Brunn-Minkowski inequality, which imply that when there is almost-equality in (1), then and are almost-translates of each other. Such estimates appear in Diskant, in Groemer, and in Figalli, Maggi and Pratelli. We recommend Osserman for a general survey on the stability of geometric inequalities.
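To make the equality case concrete, the following minimal sketch evaluates the Brunn-Minkowski deficit for axis-parallel boxes. It assumes the multiplicative normalization Vol((K+T)/2) >= sqrt(Vol(K) Vol(T)), which may differ from the normalization in (1), and the function name `brunn_minkowski_ratio` is hypothetical. For boxes the half-sum is again a box, so the ratio reduces to a product of arithmetic-geometric mean quotients.

```python
import math


def brunn_minkowski_ratio(sides_k, sides_t):
    """Return Vol((K+T)/2) / sqrt(Vol(K) * Vol(T)) for axis-parallel boxes.

    K and T are boxes with the given positive side lengths. Their half-sum
    (K+T)/2 is a box with side lengths (a_i + b_i) / 2. Logarithms are used
    so that the product stays stable in high dimensions.
    """
    if len(sides_k) != len(sides_t):
        raise ValueError("both boxes must have the same dimension")
    log_ratio = 0.0
    for a, b in zip(sides_k, sides_t):
        log_ratio += math.log((a + b) / 2.0) - 0.5 * (math.log(a) + math.log(b))
    return math.exp(log_ratio)


if __name__ == "__main__":
    # Equal-volume boxes: stretch half of the coordinates by 1.2 and shrink
    # the other half by the same factor, then compare with the unit cube.
    for half_dim in (1, 10, 50):
        distorted = [1.2] * half_dim + [1.0 / 1.2] * half_dim
        cube = [1.0] * (2 * half_dim)
        print(2 * half_dim, brunn_minkowski_ratio(distorted, cube))
```

In this toy family the ratio equals 1 exactly when the side lengths agree, and a fixed coordinate-wise distortion is amplified as the dimension grows. This only illustrates the phenomenon and does not replace the estimates discussed in the text.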
The present stability estimates do not seem to imply much about the proximity of to a translate of under the assumption (2). Only if the constant “” in (2) is replaced by something like or so, then the results of Figalli, Maggi and Pratelli can yield meaningful information. The goal of this note is to raise the possibility that the stability of the Brunn-Minkowski inequality actually improves as the dimension increases. In particular, we would like to deduce from (2) that
for a family of non-negative functions , when the dimension is high. Here, and denote the barycenters of and respectively. Furthermore, in some non-trivial cases we may conclude (3) even when the constant “” in (2) is replaced by an expression that grows with the dimension, such as or for a small universal constant .
In this note we take the first steps towards a dimension-sensitive stability theory of the Brunn-Minkowski inequality. First, let us focus on the simplest case in which in (3) is a quadratic polynomial. In fact, we are interested mainly in expressions related to the quadratic form
Let be the inertia form of defined in (4) and (5). Then,
Here are universal constants, and are the barycenters of respectively.
See Theorem 4.5 below for explicit bounds on the universal constants from Theorem 1.1. Our interest in the inertia form stems from the central limit theorem for convex sets, see for background reading. As we shall explain in Proposition 6.4 below, Theorem 1.1 implies the bound
where is the thin shell parameter from , is a universal constant and is the constant from Theorem 1.1. In fact, Theorem 4.5 and (51) below show that the inequality (7) is essentially an equivalence. Consequently, the universal constant from Theorem 1.1 is intimately connected with the thin shell parameter . The question of whether is bounded by a universal constant is currently one of the central problems in high-dimensional convex geometry.
In fact, the assumption that is -Lipschitz may typically be weakened. For instance, when is convex or concave, it is well-known that
(in order to use (8) we also need a crude estimate for , hence we applied Corollary 2.4 to obtain such an estimate). In view of (11) and Proposition 6.4 below, we match (up to logarithmic factors) the best bounds for the width of the thin spherical shell for unconditional convex bodies proven in .
The remainder of this note is organized as follows. In the next section we establish some well-known facts about one-dimensional log-concave measures. In Section 3 we prove Theorem 1.1 and in Section 4 we prove Theorem 1.2. Section 5 is devoted to deriving some inequalities related to one-dimensional transportation of measure. In Section 6, using these inequalities, we prove Theorem 1.3.
Background about log-concave densities on the line
A nice characterization of log-concavity that we learned from Bobkov is that is log-concave if and only if the function
is a concave function. This characterization lies at the heart of the proof of the following Poincaré-type inequality which appears as Corollary 4.3 in Bobkov :
Let be a log-concave probability measure on the real line, and set
for the variance of . Then for any smooth function with ,
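The concavity characterization of log-concavity recalled above can be checked numerically on a few classical densities. The sketch below assumes that the function in question is t -> f(F^{-1}(t)) on (0,1), where f is the density and F its distribution function; all function names are hypothetical. The Laplace and logistic densities are log-concave, while the Cauchy density is not, so the grid test should accept the first two and reject the third.

```python
import math


def is_concave_on_grid(func, num_points=999, tol=1e-12):
    """Check that all second differences of func on a uniform grid in (0,1)
    are non-positive, up to a small floating-point tolerance."""
    step = 1.0 / (num_points + 1)
    values = [func(step * (k + 1)) for k in range(num_points)]
    return all(
        values[k - 1] + values[k + 1] - 2.0 * values[k] <= tol
        for k in range(1, num_points - 1)
    )


def laplace_profile(t):
    """f(F^{-1}(t)) for the Laplace density f(x) = exp(-|x|) / 2."""
    quantile = math.log(2.0 * t) if t < 0.5 else -math.log(2.0 * (1.0 - t))
    return math.exp(-abs(quantile)) / 2.0


def logistic_profile(t):
    """f(F^{-1}(t)) for the logistic density f(x) = e^{-x} / (1 + e^{-x})^2."""
    quantile = math.log(t / (1.0 - t))
    return math.exp(-quantile) / (1.0 + math.exp(-quantile)) ** 2


def cauchy_profile(t):
    """f(F^{-1}(t)) for the Cauchy density f(x) = 1 / (pi * (1 + x^2))."""
    quantile = math.tan(math.pi * (t - 0.5))
    return 1.0 / (math.pi * (1.0 + quantile * quantile))


if __name__ == "__main__":
    print("Laplace :", is_concave_on_grid(laplace_profile))
    print("logistic:", is_concave_on_grid(logistic_profile))
    print("Cauchy  :", is_concave_on_grid(cauchy_profile))
```

In closed form the three profiles are min(t, 1-t), t(1-t) and sin^2(pi t)/pi respectively, which explains the outcome of the grid test.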
Further information about log-concave densities on the line is provided by the following standard lemma.
; and
If , then .
Proof: Part (a) is the content of Lemma 3.2 in Bobkov . In order to prove (b), we show that for some ,
with , where here are the constants from part (a). Indeed, if there is no such , then from (a),
in contradiction to Grünbaum’s inequality (see, e.g., [4, Lemma 3.3]). By symmetry, there exists some with
From log-concavity, for , and (b) is proven since .
The following lemma is essentially a one-dimensional, functional version of Theorem 1.1. The lemma states, roughly, that if the supremum-convolution of two log-concave probability densities has a bounded integral, then their respective variances cannot be too far from each other.
Let be random variables with corresponding densities and variances . Assume that and are log-concave. Define
a supremum-convolution of and . Then,
Proof: The function is clearly measurable (it is even log-concave). It follows from Lemma 2.2(b) that there exist intervals such that
Combining this with (13), we learn that there exists an interval with such that,
In order to prove (14), it suffices to show that
Denote the respective densities of by . The Prékopa-Leindler theorem (see, e.g., the first pages of Pisier ) implies that and are log-concave. Furthermore, using the Prékopa-Leindler theorem again we derive,
Plugging this into Lemma 2.3 we deduce (15).
Next, for a measure and measurable sets with define
Thus the probability measure is the conditioning of to the set . Clearly, for a log-concave measure and an interval , the measure remains log-concave.
Proof: It is enough to prove the lemma for being rays. Denote by the interior of the support of , and by the density of . Abbreviate and set
To prove the lemma, it suffices to show that for any , or equivalently, that
Deriving a stability estimate from the central limit theorem for convex sets
A second ingredient will be a calculation which shows that the integral of the supremum-convolution of two Gaussian densities, each with a covariance matrix that is a multiple of the identity, becomes very large when the two covariances are not close to one another. This will imply that when is not large, the covariance matrices of both marginals are roughly the same multiple of the identity. Therefore the inertia forms of and must have had roughly the same trace, since the trace of the matrix determines the multiple of the identity.
for all with . Here, are universal constants.
It can be seen directly from the proof in that the constants in Theorem 3.1 may be selected to be . Other constants would imply different universal constants in Theorem 1.1. We shall need the following elementary lemma:
for and also for , where is a universal constant.
Proof: First we prove the lemma for . Note that for ,
The case where follows as .
The following lemma is the second ingredient in our proof of Theorem 1.1 described above. The essence of the lemma is that the integral of the supremum-convolution of two spherically-symmetric Gaussian densities must be quite large when the covariances are not close to each other.
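The Gaussian computation described above can be illustrated in one dimension. The sketch below assumes the normalization h(z) = sup over x of sqrt(f(x) g(2z - x)) for the supremum-convolution, which may differ from the one used in the lemma; the function names are hypothetical. For centered Gaussians with standard deviations s1 and s2, a direct optimization gives h(z) = (2 pi s1 s2)^{-1/2} exp(-z^2 / (s1^2 + s2^2)), so the integral of h equals sqrt((s1^2 + s2^2) / (2 s1 s2)). Since spherical Gaussians factorize over coordinates, the n-dimensional integral is the n-th power of this quantity.

```python
import math


def gaussian_density(x, sigma):
    """Density of the centered Gaussian with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


def sup_convolution_integral(sigma1, sigma2, half_width=12.0, dz=0.1, dx=0.01):
    """Brute-force Riemann sum for the integral of
    h(z) = sup_x sqrt(f(x) * g(2z - x)) over [-half_width, half_width],
    with the supremum taken over a grid of x values."""
    num_z = int(round(2.0 * half_width / dz)) + 1
    num_x = int(round(2.0 * half_width / dx)) + 1
    xs = [-half_width + i * dx for i in range(num_x)]
    f_values = [gaussian_density(x, sigma1) for x in xs]
    total = 0.0
    for k in range(num_z):
        z = -half_width + k * dz
        best = max(
            f * gaussian_density(2.0 * z - x, sigma2)
            for x, f in zip(xs, f_values)
        )
        total += math.sqrt(best) * dz
    return total


def closed_form_integral(sigma1, sigma2, dimension=1):
    """Closed form ((s1^2 + s2^2) / (2 s1 s2))^(n/2) for spherical Gaussians."""
    ratio = (sigma1 ** 2 + sigma2 ** 2) / (2.0 * sigma1 * sigma2)
    return ratio ** (dimension / 2.0)


if __name__ == "__main__":
    print(sup_convolution_integral(1.0, 2.0), closed_form_integral(1.0, 2.0))
    # The same mismatch of variances in dimension 100:
    print(closed_form_integral(1.0, 2.0, dimension=100))
```

The closed form equals 1 only when s1 = s2 and grows exponentially in the dimension otherwise, which is the mechanism the lemma exploits.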
whenever . Assume that is measurable. Then,
We would like to find which maximizes the right-hand side in (20). We select and verify that when we have and . We conclude that for any ,
for some universal constants .
Proof: Clearly, we may assume that the sequence is non-decreasing. Translating , we may assume that the barycenter of is at the origin. Let and be random vectors that are distributed according to the laws , respectively. Fix . Consider the subspace spanned by , where is an orthonormal basis of eigenvectors corresponding to the eigenvalues . Denote and assume that . Since the ’s are in increasing order, the subspace has the form,
for some . Write and . Now, fix . Define,
Inspect the function . We have and . By continuity, there exists a certain for which
Equation (21) and the fact that are orthonormal eigenvectors imply that for every , one has . Moreover, . We now apply Theorem 3.1 which claims that if , then there exists a subspace with such that
On the other hand, we may use the Prékopa-Leindler inequality as in (16) above, and deduce that
Consequently, under the assumption that ,
Since , we conclude
By repeating the argument, with the subspace replacing the subspace , we conclude the proof.
Proof of Theorem 1.1: By applying affine transformations to both and , we can assume that both bodies have the origin as their barycenter, and that while . By Lemma 3.4,
for any . Since for all , as follows from Corollary 2.4, then
where are universal constants. To obtain (6), note that
Remark: When in Theorem 1.1 is isotropic, we actually prove in (24) that
where is the square of the Hilbert-Schmidt norm of the matrix .
Obtaining stability estimates using a transportation argument
the covariance matrix. Finally, we normalize this density by defining
A theorem of Brenier asserts that a convex solution to the above equation on the domain exists. The regularity theory developed by Caffarelli implies that the convex function is smooth. For precise definitions and properties, see . The map pushes forward the measure whose density is to the measure whose density is , and is referred to as the Brenier map between the two measures. The matrix is positive-definite since it has a positive determinant and it is the Hessian matrix of a convex function.
Remark. The Knothe map, used in Section 6, is in some sense a limiting case of the Brenier map. See .
The following lemma contains the central idea of this section.
where and is a random variable distributed uniformly in $$.
Proof: By a standard approximation argument we may assume that and are sufficiently smooth. Denote and . Furthermore, define,
Using the fact that is log-concave, we obtain
A simple calculation shows that the Jacobian of is
By changing variables using and applying (30) and (31), we calculate
Applying the change of variables completes the proof.
Combining this with the above lemma yields
In view of (33), we would like to have a lower bound for in terms of and in terms of . The following lemma serves this purpose.
and . Then hence . Consequently,
Combining (35), (36) and (37) completes the proof.
Proof of Theorem 1.2: Write and . Substituting the result of Lemma 4.2 into (33) yields
Let be the random vectors whose densities are respectively. By the definition of the transportation distance,
where the transportation distance between random vectors is defined to be the distance between the corresponding distribution measures. The fact that and have barycenters at the origin implies
The Cauchy-Schwarz inequality together with (38), (39) and (40) yield,
where is the operator norm of . From the remark to Corollary 2.4 we conclude that
The function is log-concave and it is bounded from below by one, according to the Prékopa-Leindler inequality. Therefore,
The rest of this section aims at a better understanding of the exponents in Theorem 1.1. The next lemma exploits the second summand in our basic estimate (41).
Proof: We use the notation of the proof of Theorem 1.2. In order to establish (42), we fix , and assume that
Consequently, in order to establish (42), it suffices to show that for some universal constant ,
In view of (41), the last inequality will be concluded if we only manage to show,
The above fact follows from an application of Lemma 3.4 with and from the assumption that . Equation (42) is established, and the proof of (43) is analogous. The proof of the lemma is thus complete.
so that . Note that the thin-shell conjecture implies that and . We apply the estimate from the previous lemma for various marginals of our -dimensional measures, and obtain:
where are some universal constants.
Proof: The bound (46) follows directly from the remark to Corollary 2.4. In order to establish (47), denote by the orthonormal basis of eigenvectors corresponding to the eigenvalues . Define
Let be the subspace with the larger dimension among these two subspaces. Then . Denote by the maximal for which . Then . According to our assumption, , and hence we may apply Lemma 4.3 in the subspace . Denote by and the marginals of and to the subspace . Using (42) and (43) for and we obtain
where we used the fact that as well as the Prékopa-Leindler inequality which implies that for any .
The next theorem demonstrates that the exponent in Theorem 1.1 may be made arbitrarily close to , thus complementing the inequality (7) which goes in the opposite direction. This provides yet another piece of evidence for the close relationship between the thin shell problem and the stability of the Brunn-Minkowski inequality in high dimensions.
where is the identity matrix. Consequently,
Proof: We may clearly assume that is a diagonal matrix whose diagonal is , where the sequence is non-increasing. Since our measures are log-concave, then we may use Lemma 4.4 and calculate
The bound (50) follows. In order to deduce (51) from (50), argue as in (25) above. The proof is complete.
Transportation in one dimension
For $j=1,2$, the map $\Phi_{j}^{-1}$ pushes forward the uniform measure on $[0,1]$ to $\mu_{j}$. The monotone transportation map between $\mu_{1}$ and $\mu_{2}$ is the continuous, non-decreasing function
defined for . Observe that
Furthermore, is differentiable in and
Additionally, it is well-known (see, e.g., Villani’s book ) that
where is the monotone transportation map between and and is a universal constant.
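The quantile description of one-dimensional transport recalled above can be tried out on Gaussians. The sketch below is an illustration under stated assumptions: it takes the transportation cost to be quadratic, uses the standard fact that in one dimension this cost equals the integral over (0,1) of the squared difference of the two quantile functions, and relies on `statistics.NormalDist` from the Python standard library. The function names are hypothetical.

```python
from statistics import NormalDist


def monotone_map(source, target, x):
    """Monotone transportation map: compose the distribution function of the
    source with the quantile function of the target."""
    return target.inv_cdf(source.cdf(x))


def quadratic_transport_cost(source, target, num_points=20000):
    """Midpoint-rule approximation of the integral over (0,1) of
    (F_source^{-1}(t) - F_target^{-1}(t))^2."""
    total = 0.0
    for k in range(num_points):
        t = (k + 0.5) / num_points
        difference = source.inv_cdf(t) - target.inv_cdf(t)
        total += difference * difference
    return total / num_points


if __name__ == "__main__":
    mu1 = NormalDist(mu=0.0, sigma=1.0)
    mu2 = NormalDist(mu=1.0, sigma=2.0)
    # For Gaussians the monotone map is affine: x -> m2 + (s2 / s1) * (x - m1).
    print(monotone_map(mu1, mu2, 0.5))
    # For Gaussians the quadratic cost is (m1 - m2)^2 + (s1 - s2)^2.
    print(quadratic_transport_cost(mu1, mu2))
```

The midpoint rule converges slowly near the endpoints, where the Gaussian quantile function has a logarithmic singularity, so the numerical value should only be compared with the closed form up to a modest tolerance.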
We begin the proof of Proposition 5.1 with the following crude lemma.
Let and be probability measures on the real line.
If and are even, then
If are supported on and respectively, and have non-increasing densities, then
Proof: Denote by the Dirac measure at the origin. Assume that and are even. By the triangle inequality for the transportation metric,
Let be the Dirac measures supported on respectively. By the triangle inequality,
Therefore, by using ,
Proof of Proposition 5.1: Use (52), the definition of , and the fact that $\Phi_{1}^{-1}$ pushes forward the uniform measure on $[0,1]$ to $\mu_{1}$, in order to obtain
Recall that when is a log-concave measure, the function is concave on $[0,1]$. Set $I_{j}(t)=\rho_{j}(\Phi_{j}^{-1}(t))$ for $j=1,2$. Then $I_{1}$ and $I_{2}$ are concave and satisfy $I_{j}(t)=I_{j}(1-t)$ for $t\in(0,1)$; in particular, they are non-decreasing on $[0,1/2]$ and non-increasing on $[1/2,1]$. Let $\varepsilon>0$ be such that
Suppose first that . In this case, from part (i) of Lemma 5.2,
So whenever , the inequality (54) holds trivially for a sufficiently large universal constant .
From now on, we restrict attention to the case where . We divide the rest of the proof into several steps.
Step 1: Let us prove that there exists a universal constant such that
Once we prove (57), the desired bound (56) follows from (55). We thus focus on the proof of (57). Suppose that satisfies . We will show that in this case
If for all , then according to (55). Thus (58) holds true in this case. Otherwise, there exists with . Let be the supremum over all such . Since and are continuous and non-decreasing on , then
Since is concave, non-decreasing and non-negative on , then necessarily . We conclude that for any . From (55) it follows that . Therefore (58) is proven in all cases. By symmetry, we conclude (57), and the proof of (56) is complete.
Step 2: For any we have
where the last inequality is the content of Step 1. Denote , an even log-concave probability measure. According to Lemma 2.5, we have . Note that the function is odd, hence its -average is zero. Using the Poincaré-type inequality in Lemma 2.1, we see that for any ,
Step 3: Let and let . We use (59) and conclude that there exists with
Denote and . These are log-concave probability densities with . Note that we have, owing to (59),
In order to prove the lemma it remains to show that But in view of (60), the latter is a direct consequence of part (ii) in Lemma 5.2: Since , then the log-concave densities of and are non-increasing. This completes the proof.
We thus view the function as a refined variant of the supremum-convolution of and . The following proposition is a stability estimate for the Prékopa-Leindler inequality in one dimension. It may be viewed as the transportation-metric version of the -stability estimates from Ball and Böröczky .
where the function is defined via (61) and is a universal constant.
Proof: Multiplying the functions and by positive constants, if necessary, we may assume that . Indeed, neither the left-hand side nor the right-hand side of (54) is changed under such normalization. Let be the monotone transportation map between and and as before, for . Applying the change of variables we see that
According to (52), we have for any in the support of . Since is log-concave, it does not vanish in , and hence for any . Therefore,
where we used Lemma 3.2(ii) in the last passage. Since , then
We may thus apply Proposition 5.1 and deduce that
Unconditional Convex Bodies
where is a universal constant and are the probability measures with densities respectively.
The main tool in the proof of Theorem 6.1 is the Knothe map from , which we define next. Let be as in Theorem 6.1. Then the support of is a convex set, and does not vanish in . The Knothe map between and is the continuous function for which
For any , the function actually depends only on the variables . We may thus speak of .
For any and for any fixed , the function is non-decreasing in .
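The triangular structure required of the Knothe map can be seen explicitly in a simple case that is not treated in the text: centered Gaussian measures. This is an assumption made purely for illustration, and such Gaussians need not be unconditional. If the two covariance matrices have Cholesky factorizations S1 = L1 L1^T and S2 = L2 L2^T with lower-triangular factors, then the linear map x -> L2 L1^{-1} x is lower triangular with positive diagonal and pushes the first Gaussian forward to the second, so it has the listed properties. The helper names below are hypothetical.

```python
import math


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def invert_lower(lower):
    """Inverse of a lower-triangular matrix by forward substitution."""
    n = len(lower)
    inverse = [[0.0] * n for _ in range(n)]
    for j in range(n):
        inverse[j][j] = 1.0 / lower[j][j]
        for i in range(j + 1, n):
            partial = sum(lower[i][k] * inverse[k][j] for k in range(j, i))
            inverse[i][j] = -partial / lower[i][i]
    return inverse


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def gaussian_knothe_matrix(cov_source, cov_target):
    """Matrix A = L2 * L1^{-1} of the triangular map between centered Gaussians."""
    return matmul(cholesky(cov_target), invert_lower(cholesky(cov_source)))


if __name__ == "__main__":
    source = [[2.0, 1.0], [1.0, 2.0]]
    target = [[1.0, 0.5], [0.5, 3.0]]
    a = gaussian_knothe_matrix(source, target)
    print(a)
    # A * S1 * A^T should reproduce the target covariance.
    print(matmul(matmul(a, source), transpose(a)))
```

Row j of the matrix has non-zero entries only in columns 1 through j, which corresponds to the j-th component depending only on the first j variables, and the positive diagonal entry corresponds to monotonicity in the j-th variable.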
It may be proven by induction on (see ) that the Knothe map between and exists, and that in fact, the three requirements above determine the function completely. Denoting , it follows from property (b) that
for any , where is the Jacobian of the map . Below we will also use the fact that the map , defined for , is one-to-one, as follows from properties (b) and (c). Set
and let be the densities of the probability measures , respectively. Then and are unconditional and log-concave. Write for the Knothe map between and , and set
Then is the Knothe map between and . Observe that for fixed , the map
is the monotone transportation map between the probability densities proportional to
for . For we set
which is a one-to-one, continuous function, defined for when and for when . According to (65) and to property (b), the Jacobian of the map satisfies
Since is one-to-one, then is a well-defined function on a subset of . We extend to the entire by setting it to be zero outside its original domain of definition.
Let be a measurable function. Then,
Proof: We use (65) for the Knothe map to conclude that
where we used (66) and (67) in the last passage. The map is one-to-one in the support of . Changing variables we obtain
The following lemma will serve as the induction step in the proof of Theorem 6.1.
where is a universal constant (in fact, it is the same constant as in Proposition 5.3).
In order to prove the lemma, it therefore suffices to show that
Recall that is the monotone transportation map between the even, log-concave probability measures supported on , whose densities are proportional to and . The variance of an even measure supported on cannot exceed . We may therefore use Proposition 5.3, together with (53), to conclude that for any ,
In particular, the right-hand side of (70) is non-negative. We use the definition (67) and integrate with respect to . This yields:
where the last passage is legal according to Lemma 6.2. The desired estimate (69) follows, and the proof is complete.
Proof of Theorem 6.1: We will prove by induction on the dimension that
where is the constant from Lemma 6.3. The case follows from Proposition 5.3 and from the fact that the variance of an even measure supported on cannot exceed . We assume that (71) is proven for dimension and proceed with the proof for dimension . Apply the induction hypothesis for the unconditional, log-concave probability densities and conclude that
and (71) is proven for dimension , hence for all dimensions. Using (71) and the fact that , the theorem follows by the definition of transportation distance.
The uniform measure on a convex body is a prime example of a log-concave measure. Consequently, we may deduce Theorem 1.3 from Theorem 6.1 by using a crude “cut with a big cube” argument. The logarithmic factor of Theorem 1.3 may be an artifact of this clumsy procedure.
Proof of Theorem 1.3: Let be a parameter to be specified later on. For we denote
According to Corollary 2.4, we have . Using Lemma 2.2 and a union bound, we deduce that
We now select and so that
Denote by the uniform probability measure on and similarly for . By elementary properties of the transportation metric , it follows that
where is the diameter of . It is well-known (see ) that and therefore,
Note that and satisfy the requirements of Theorem 6.1 with . Denote . Then,
From Theorem 6.1 and (75) we conclude that
All that remains is to select . In the case where , we choose
and deduce the desired bound (10) from (76). In the case where , we select and still deduce (10). The theorem is thus proven for all cases.
for any with . Then,
Proof: Standard bounds on the distribution of polynomials on high-dimensional convex sets (see Bourgain or Nazarov, Sodin and Volberg ) reduce the desired inequality (78) to the estimate
In order to prove (79), select such that . From (77),
For the upper bound, let be such that and . Then, from (77),
Hence, , or equivalently,
It is now clear that (79) follows from (80) and (81).