On the infimum convolution inequality
Rafał Latała, Jakub Onufry Wojtaszczyk
Introduction and Notation
In the seminal paper , B. Maurey introduced the so called property for a probability measure with a cost function (see Definition 2.1 below) and established a very elegant and simple proof of Talagrand’s two level concentration for the product exponential distribution using for this distribution and an appropriate cost function .
It is natural to ask what other pairs have property ? As any satisfies with , one will rather ask how big a cost function can one take. In this paper we study the probability measures that have property with respect to the largest (up to a multiplicative factor) possible convex cost function . This bound comes from checking property for linear functions. We say a measure satisfies the infimum convolution inequality ( for short) if the pair satisfies .
It turns out that such an optimal infimum convolution inequality has very strong consequences. It gives the best possible concentration behaviour, governed by the so–called -centroid bodies (Corollary 3.10). This, in turn, implies in particular a weak–strong moment comparison (Proposition 3.12), the Central Limit Theorem of Klartag and the tail estimates estimates of Paouris (Proposition 3.15). We believe that holds for any log–concave probability measure, which is the main motivation for this paper.
Organization of the paper. This section, apart from the above introduction, defines the notation used throughout the paper. The second section is devoted to studying the general properties of the inequality . In subsection 2.1 we recall the definition of property and its ties to concentration from . In subsection 2.2 we study the opposite implication — what additional assumptions one needs to infer from concentration inequalities. In subsection 2.3 we show that is indeed the largest possible cost function and define the inequality . In subsection 2.4 we show that product log–concave measures satisfy .
In the third section we give more attention to the concentration inequalities tied to . In subsection 3.1 we show the connection to bodies. In subsection 3.2 we continue in this vein with the additional assumption our measure is –regular. In subsection 3.3 we show how implies a comparison of weak and strong moments and the results of and .
In the fourth section we give a modification of the two–level concentration for the exponential measure, in which for sets lying far away from the origin only an enlargement by is used. This will be used in the fifth section, which focuses on the uniform measure on the ball. In subsection 5.1 we define and study two rather standard transports of measure used further on. In subsection 5.2 we use these transports along with the concentration from section 4 and a Cheeger inequality from to give a proof of for . In section 5.3 we show a proof of for and a proof of the log–Sobolev inequality for .
We conclude with a few possible extensions of the results of the paper in the sixth section.
The letters denote absolute numerical constants, which may change from line to line. By we mean constants dependent on (or, formally, a family of absolute constants indexed by ), these also may change from line to line. Other letters, in particular greek letters, denote constants fixed for a given proof or section. For any sets of positive real numbers and , , by we mean there exist absolute numerical constants such that for any . Similarly for collections of sets and by we mean for any , where again are absolute numerical constants. By we mean the constants above can depend on .
Infimum convolution inequality
denotes the infimum convolution of and .
The following two easy observations are almost immediate (c.f. ):
If pairs , have property and , then the couple also has property .
Then the pair has property .
Maurey noticed that property implies where
We will need a slight modification of this estimate.
Property for implies for any Borel set and ,
Thus from property for we have
from which, extracting the condition upon by direct calculation, we get (2).
Let , notice that is increasing in and for ,
hence and (3) follows. Moreover for
Let and . Previous calculations show that for , if or . Since and , we get that for all , hence (2) implies (5). ∎
The main theorem of states that satisfies () with a sufficiently chosen cost function.
Let for and otherwise. Then the pair has property .
Theorem 2.5 together with Proposition 2.4 immediately gives the following two-level concentration:
that was first established (with different universal, rather large constants) by Talagrand .
2 From concentration to property (τ)𝜏(\tau)
Proposition 2.4 shows that property implies concentration, the next result presents the first approach to the converse implication.
Then the pair has property . In particular if is convex, symmetric and then (7) implies property for .
To finish the proof of the first assertion, by Theorem 2.5 it is enough to show that
Since the set is a halfline, it is enough to prove that
Let us fix and with and take with . Put , then . By the definition of it easily follows that , hence by (7), Since
The last part of the statement immediately follows since any symmetric convex function is radius-wise nondecreasing and if additionally , then for any . ∎
The next proposition shows that inequalities (3) and (4) are strongly related.
The following two conditions are equivalent for any Borel set and ,
and we get the contradiction with (10). ∎
Let us fix the set with . Notice that . If , then . If , Proposition 2.7 gives
Finally, if , we get , hence by the previous case,
Corollary 2.8 shows that if the cost function is symmetric and convex, condition (7) (with instead of ) for is implied by the following:
To treat the case we will need Cheeger’s version of the Poincaré inequality.
It is not hard to check that Cheeger’s inequality (cf. [6, Theorem 2.1]) implies
Finally, we may summarize this section with the following statement.
Suppose that the cost function is convex, symmetric with and for all . If the measure satisfies Cheeger’s inequality with the constant and the condition (11) is satisfied for all and then has property with the constant .
Notice that for all , hence Cheeger’s inequality implies that condition (7) holds for with . Therefore (7) holds for all with and the assertion follows by Corollary 2.6. ∎
3 Optimal cost functions
A natural question arises: what other pairs have property ()? First we have to choose the right cost function. To do this let us recall the following definitions.
The Legendre transform of any function is a convex function. If is convex and lower semi-continuous, then , and otherwise . In general, if , then . The Legendre transform satisfies and if , then . For this and other properties of , cf. . The Legendre transform has been previously used in the context of convex geometry, see for instance and .
The function plays a crucial role in the theory of large deviations cf. .
Take . Then
where the last equality uses the fact that is symmetric. Thus by taking the logarithm we get , and by applying the Legendre transform we obtain The inequality follows by the convexity of . ∎
The above remark motivates the following definition.
which substituted into (13) gives the thesis. ∎
Direct calculation shows that for and
Since for , we get . Finally
The last statement follows by Theorem 2.5, since . ∎
4 Logaritmically concave product measures
It is easy to check that for any measure with a full–dimensional support there exists a linear map such that is isotropic.
The next theorem (with a different universal, but rather large constant) may be deduced from the results of Gozlan . We give the following, relatively short proof for the sake of completeness.
then for and for . Therefore
Notice that , thus by concavity of , for . Moreover for , .
where is as in Theorem 2.5. We have two cases. i) , then
We expect that in fact a more general fact holds.
Concentration inequalities.
Sets for , when is the uniform disribution on the convex body are called -centroid bodies of , their properties were investigated in .
Let us take , we need to show that , that is
i) . Then, since ,
Hence and . Therefore
Using the symmetry and isotropicity of , we get
where to get the last inequality we used for . ∎
2 α𝛼\alpha-regular measures.
To establish inlusions opposite to those in the previous subsection, we introduce the following property:
If is -regular for some , then for any ,
Take any , then we may find such that and obtain
If is symmetric, isotropic -regular for some , then
We have by the symmetry, isotropicity and regularity of ,
so for . Thus for all . ∎
We always have for , and . If the measure is -regular, then and for . Moreover for any symmetric measure , , hence by the convexity of , for all .
Symmetric log–concave measures are 1-regular.
so it is enough to show that the function is nonincreasing on . Binet’s form of the Stirling formula (cf. [1, Theorem 1.6.3]) gives
where is decreasing function. Thus
is indeed nonincreasing on . ∎
This definition is motivated by the following Corollary:
By Remark 3.7, Proposition 2.4 and the definition of we have
so the first part of the statement immediately follows by Proposition 3.5.
By Proposition 2.7 this implies property (11). Additionally is convex, symmetric and . Finally, from Proposition 3.3 we have . Thus, from Proposition 2.9 we get the second part of the statement.
By Proposition 2.7 in the definition 3.9 we could use the equivalent condition . The next proposition shows that for log-concave measures these conditions are satisfied for large and for small sets.
Using a standard volumetric estimate for any we may choose with such that . Then for ,
Let , we will consider two cases.
i) . Then by Remark 3.7,
so , hence and
The previous facts motivate the following.
Proposition 3.10 shows that Conjecture 1 implies Conjecture 2. Both hypotheses would be equivalent provided that the following conjecture of Kannan, Lovász and Simonovits holds.
There exists an absolute constant such that any symmetric isotropic log–concave probability measure satisfies Cheeger’s inequality with constant .
3 Comparison of weak and strong moments
where denotes the norm dual to .
hence for . Thus for ,
Under the assumptions of Proposition 3.12 by the triangle inequality we get for ,
Notice that and . Hence i) follows directly from (19) with . Moreover (19) with implies
by the -regularity and isotropicity of . ∎
Property i) plays the crucial role in the Klartag proof of the central limit theorem for convex bodies . Paouris showed that moments of the Euclidean norm for symmetric isotropic log-concave measures are bounded by . Thus Conjecture 4 would imply both Klartag CLT (with the optimal speed of convergence) and Paouris concentration.
We conclude this section with the estimate that shows comparison of weak and strong moments for any probability measure and .
As in the proof of Proposition 3.11 we can find with , such that for all . Then
Modified Talagrand concentration for exponential measure
In this section we show that for a set lying far from the origin Talagrand’s two level concentration for the exponential measure may be somewhat improved, namely (for sufficiently large ) it is enough to enlarge the set by instead of .
If then for any we have
Obviously we may assume that and . Let and . From the definition of and we have . On the other hand , so . Thus
Then we easily check that . Since we get and . Now we can use the Brunn-Minkowski inequality to get
Notice that , so we obtain
A similar result (although with a constant multiplicative factor) can be obtained using the same technique and more calculations for instead of for .
If then for any we have
and also . From Lemma 4.1 we have
Let . By Lemma 4.3 we get for any and any :
To get the assertion it is enough to take the sum over all and notice that the function is convex on , hence
Then , hence
From Proposition 4.4 applied for we have
In particular holds if .
Let denote for . If for any we have , the thesis is proved. Thus we assume otherwise. Let . From Lemma 4.5 we have
By a simple induction we get for any . Thus we get
where the last part follows from Stirling’s formula. Thus .
where for and for .
We shall use the facts proved in Section 3 to approximate . Note that is log-concave (as its density is log-concave) and symmetric. It is 1–regular from Proposition 3.8. Also
Thus for , so by Propositions 3.2 and 3.5, . Hence, for all we have , so . As is symmetric, the proof is finished. ∎
For we use Propositions 3.3 and 3.6. Both and are symmetric, log–concave measures, and both can be rescaled as in Proposition 5.1 to be isotropic, thus .
Lemma 6 from gives (after rescaling by ),
It is not hard to verify that for .
We are now going to investigate two transports of measure. They will combine to transport a measure with known concentration properties ( or , that is the exponential or Gaussian measure) to the uniform measure . We will investigate the contractive properties of these transports with respect to various norms. Our motivation is the following:
Let us prove the first statement, the second proof is almost identical. Suppose . Then there exists such that From the assumption we have , which means , and . ∎
Let us first show the following simple estimate.
Then and for . ∎
Now we are ready to state the basic properties of .
i) The map transports the probability measure onto the measure . ii) For all we have and . iii) For any , . iv) The function is decreasing on and for any ,
The definition of directly implies i). Differentiation of (22) gives
which, when the -th root is taken, give the first part of ii).
For the second part of ii) we use (23) and the estimate above to get
To show iii) first notice that by (23) and ii),
thus . Moreover by ii), , so we may assume that . By (22) and Lemma 5.7 we obtain
Thus using again (23) and part ii) we get
By iii) we get , which proves the first part of iv). For the second part suppose that , then
The next Proposition may be also deduced (with different constant) from the more general fact proved in .
Assume , we apply Proposition 5.8 and get
Let and , we use Proposition 5.8 as in the proof of Proposition 5.9, and the Hölder inequality,
The second transport we will use is a simple product transport which transports the measure onto . We shall be particularly interested in the cases and , but most of the results can be stated in the more general setting.
Note that and . Differentiating equality (24) we get
We will prove that behaves very much like for large , and is more or less linear for small . We begin with the bound for .
For we have i) and for , ii) and for , iii) .
Note that . We have for ,
and for , since for , we get
Notice that by (25), , hence we may estimate using the just derived bounds on .
The lower bound on yields for . The same estimate holds for , since is odd. Finally for we have
i) For , and . ii) For ,
Since the function is odd, we may and will assume that .
i) We have by the monotonicity of on ,
thus and . Formula (25) gives .
ii) We begin by the following Gaussian tail estimate for :
We have equality when , and direct calculation shows the derivative of the left–hand–side is no larger than the derivative of the right–hand–side.
Let , we will now show that for all and ,
Suppose on the contrary that for some and . Note that by i) we have . Thus is equal to the second part of the maximum. This in particular implies that , since for we have
Therefore . Now by (26), (24) and (27),
After simplifying this gives . Hence
which is impossible. This condratiction shows that (28) holds.
Thus we have and by (25) we obtain
By taking for sufficiently large and estimating carefully one may arrive at the bound . One cannot, however, receive a bound of the order of .
Property i) follows from the definition of and . Since we get ii) by Lemma 5.13 i). Property iii) is a direct consequence of ii).
The above inequality together with ii) gives iv) and iv) yields v). ∎
Now we define a transport from the exponential measure to for :
This transport satisfies the following bound:
Let . By direct calculation we get
Since , Proposition 5.13 i) implies , while by Proposition 5.8 we have . Thus the first summand can be bounded by .
For the second summand note that by Proposition 5.8 iii),
Moreover, by the Hölder inequality and
Let for . Note that , and , hence
Let . By vector–valued integration and (29) we get
As in the proof of Proposition 5.17 we show that
To deal with the sum of ’s we notice that, since and ,
If for some , then for all , .
Let us fix with . By Proposition 5.15 iv),
By Hölder’s inequality for .
Assume now that . Let be such that and . Then by Proposition 5.17 and by Proposition 5.15 v). Thus by Proposition 5.18,
Hence . ∎
The last function we define transports the Gaussian measure to for .
The first summand is bounded by 2 as in the proof of Proposition 5.17. Since we get by Lemma 5.13 ii)
We start with the version of Theorem 4.6 for .
or
{\nu_{p}^{n}}\big{(}(A+20t^{1/p}B_{p}^{n})\cap 100\sqrt{n}B_{2}^{n}\big{)}\geq\frac{1}{2}{\nu_{p}^{n}}(A).
We will use the transport from to . Proposition 5.15 v) gives By Remark 5.5 this means that . Let us fix and apply Theorem 4.6 to and . If the second case occurs, we have
If the first case of Theorem 4.6 occurs, then due to Proposition 5.15 iii) we have , so for any . Thus
Corollary 5.2 gives for . By Corollary 2.19, satisfies , which, due to Proposition 2.4 implies for any Borel set . Thus we have
where in the last step we use the Stirling approximation and as always denotes a universal constant. Thus it is enough to take . ∎
By Propositions 2.7, 3.11, 3.5 and 5.3 it is enough to show
for and .
Recall that denotes the map transporting to . Apply Lemma 5.22 to and . If the first case occurs, we have
Proposition 5.9 gives , thus by Remark 5.5,
Hence we may assume that the second case of Lemma 5.22 holds, that is
In particular . Let
We apply Lemma 5.23 for and to get
Proposition 5.9, Remark 5.5 and the definitions of and yield
Putting the four estimates together, we can write
which gives (33) in the second case and ends the proof. ∎
A recent result of S. Sodin ([19, Theorem 1]) states (after rescaling from to ) that
3 The easy case – p≥2𝑝2p\geq 2
This case will follow easily from the exponential case and the facts from subsection 5.1.
By Propositions 2.7, 3.11, 3.5 and 5.3, Theorem 5.27 yield the following.
For any and the measure satisfies Cheeger’s inequality (12) with the constant .
Again we shall transport this result from the exponential measure. By Cheeger’s inequality holds for with the constant , thus by Proposition 5.17 satisfies (12) with the constant . ∎
As in the proof of Corollary 5.26 we show that Theorem 5.29 and Corollary 5.28 imply infimum convolution inequality for , . Adding the two results together we get
We conclude this section with the proof of logaritmic Sobolev–type inequality for .
In particular there exists a universal constant such that
Concluding Remarks
Following the proof of Proposition 3.12 we also get for all ,