On the role of Convexity in Functional and Isoperimetric Inequalities
Emanuel Milman
Introduction
Let denote a metric probability space. More precisely, we assume that is a separable metric space and that is a Borel probability measure on which is not a unit mass at a point. Although it is not essential for the ensuing discussion, it will be more convenient to specialize to the case where is a complete smooth oriented -dimensional Riemannian manifold , is the induced geodesic distance, and is an absolutely continuous measure with respect to the Riemannian volume form on . This work continues the study of interplay between the metric and the measure initiated in . There are various different ways to measure this relationship, which may be typically arranged according to strength, forming a hierarchy. In this work, we will be primarily concerned with two such different ways.
The first way is by means of an isoperimetric inequality. Recall that Minkowski’s (exterior) boundary measure of a Borel set , which we denote here by , is defined as:
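For orientation, the standard form of Minkowski's exterior boundary measure can be written as follows. The symbols $(\Omega,d,\mu)$ for the space, $A$ for the Borel set, $\mu^+(A)$ for its boundary measure and $A_\varepsilon$ for its open $\varepsilon$-neighbourhood are notational assumptions of this sketch.

```latex
\[
\mu^+(A) := \liminf_{\varepsilon \to 0^+} \frac{\mu(A_{\varepsilon}) - \mu(A)}{\varepsilon},
\qquad
A_{\varepsilon} := \{ x \in \Omega \;;\; \exists\, y \in A,\ d(x,y) < \varepsilon \}.
\]
```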
We will say that the space satisfies an Orlicz-Sobolev inequality () if:
A similar (yet different) definition was given by Roberto and Zegarlinski in the case following the work of Maz’ya [40, p. 112]. Our preference to use the median in our definition (in place of the more standard expectation ) is immaterial whenever is a convex function (see Lemma 2.1).
When , in which case is just the usual norm, we will refer to the inequality (1.2) as a Poincaré inequality. If in addition in (1.2) is replaced by , the case is then just the classical Poincaré inequality, and we denote the best constant in this inequality by . Similarly, the case corresponds to the Gagliardo–Nirenberg–Sobolev inequality, and a limiting case when tends to infinity is the so-called log-Sobolev inequality. More generally, we say that our space satisfies a -log-Sobolev inequality (), if there exists a constant so that:
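As a hedged illustration, a log-Sobolev inequality of order $q$ is usually phrased through the entropy functional. The sketch below follows that convention; the name $\mathrm{Ent}_\mu$, the constant $D$ and its placement on the left hand side are assumptions made here to match the normalization used for the other inequalities in this section, and the case $q = 2$ is the classical log-Sobolev inequality.

```latex
\[
\mathrm{Ent}_\mu(g) := \int g \log g \, d\mu - \int g \, d\mu \, \log\Big( \int g \, d\mu \Big)
\quad (g \geq 0),
\]
\[
D \, \Big( \mathrm{Ent}_\mu\big( |f|^q \big) \Big)^{1/q} \leq \big\| |\nabla f| \big\|_{L_q(\mu)} .
\]
```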
The best possible constant above is denoted by . Although these inequalities do not precisely fit into our announced framework, it follows from the work of Bobkov and Zegarlinski that they are in fact equivalent to some corresponding Orlicz-Sobolev inequalities (see Section 4). Various other functional inequalities admit an equivalent (up to universal constants) formulation using an appropriate Orlicz norm on the left hand side of (1.2). We refer the reader to the recent paper of Barthe and Kolesnikov and the references therein for an account of several other types of functional inequalities.
It is well known that various isoperimetric inequalities imply their functional “counterparts”. It was shown by Maz’ya and independently by Cheeger , to whom this is usually attributed, that Cheeger’s isoperimetric inequality implies Poincaré’s inequality: (Cheeger’s inequality). It was first observed by M. Ledoux that a Gaussian isoperimetric inequality implies a -log-Sobolev inequality: , for some universal constant . This was later refined by Beckner (see ) using an equivalent functional form of the Gaussian isoperimetric inequality due to S. Bobkov (see also ): . The constants and above are known to be optimal.
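To make the first of these implications concrete, we recall the classical statement of Cheeger's inequality. The symbols $h$ (Cheeger constant) and $\lambda_1$ (spectral gap) are our own notation for this sketch and may be normalized differently from the constants used in the text.

```latex
\[
h := \inf_{0 < \mu(A) < 1} \frac{\mu^+(A)}{\min\big( \mu(A), 1 - \mu(A) \big)},
\qquad
\lambda_1 := \inf_{f \ \text{non-constant}} \frac{\int |\nabla f|^2 \, d\mu}{\mathrm{Var}_\mu(f)},
\]
\[
\lambda_1 \geq \frac{h^2}{4}
\quad \Longleftrightarrow \quad
\sqrt{\lambda_1} \geq \frac{h}{2} .
\]
```

Written in the second form, the inequality compares two constants that both scale like an inverse length, which is the scaling used for the constants throughout this work.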
2 Reversing the Hierarchy
In both cases, we will say that “our convexity assumptions are fulfilled”. More generally, we recall the following definition from :
We will say that our smooth convexity assumptions are fulfilled if:
denotes the induced geodesic distance on .
, , and as tensor fields on :
We will say that our convexity assumptions are fulfilled if can be approximated in total-variation by measures so that satisfy our smooth convexity assumptions.
The condition (1.4) is the well-known Curvature-Dimension condition , introduced by Bakry and Émery in their celebrated paper (in the more abstract framework of diffusion generators). Here denotes the Ricci curvature tensor and denotes the second covariant derivative.
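For readers less familiar with the Bakry–Émery condition, the following sketch records its usual form with a vanishing curvature lower bound and no dimension constraint. The symbols $\psi$ for the potential of the density and $\mathrm{vol}_M$ for the Riemannian volume are assumptions of this sketch.

```latex
\[
\mu = \exp(-\psi) \, d\mathrm{vol}_M, \qquad \psi \in C^2(M), \qquad
\mathrm{Ric}_g + \nabla^2 \psi \geq 0 \quad \text{as tensor fields on } M .
\]
% Example: on Euclidean space, Ric_g = 0, so the condition reads
% \nabla^2 \psi \geq 0, i.e. \psi is convex and \mu is log-concave.
```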
It is known that under our convexity assumptions, the implications stated in the previous subsection can be reversed: and , for some universal constants . That Cheeger’s inequality can be reversed was first shown by Buser when is uniform on a closed manifold with , and was recently strengthened and generalized by Ledoux to the Bakry–Émery abstract setting, assuming our smooth convexity assumptions. That a -log-Sobolev inequality implies a Gaussian isoperimetric inequality under these assumptions was first shown by Bakry and Ledoux [2, Section 4] (see also Ledoux ).
3 The Results
In this work, we generalize all of the above mentioned implications following Ledoux’s diffusion semi-group approach to a more general framework. Such a program was initiated in our previous work , where it was first shown how to use the condition via Ledoux’s semi-group gradient estimates to deduce isoperimetric inequalities from Poincaré inequalities. Contrary to previous approaches, which could only deduce isoperimetric information from functional inequalities with a term with (see [8, p. 3] and the references therein), it was shown in how to handle arbitrary . In the case of Poincaré inequalities, an easy reduction step in fact enables one to handle arbitrary . In this work, we show how to deduce isoperimetric inequalities from very general Orlicz-Sobolev inequalities in the entire range .
The easier case of is handled in Section 2, by generalizing our argument for Poincaré inequalities from . Extending our results to the case (which is very important for applications) requires additional work, to which end we employ the notion of capacity. Capacity inequalities are certain functional formulations of isoperimetric inequalities, which were introduced around 1960 by Maz’ya , Federer and Fleming , and used by Bobkov and Houdré in . Maz’ya’s notion of -capacity for has recently been extended to the metric probability space setting by Barthe, Cattiaux and Roberto in (after being introduced in ), where it was used to deduce isoperimetric inequalities, and has subsequently appeared in other works as well (e.g. ). We recall the appropriate definitions in Section 3, and show that -capacity inequalities are equivalent in full generality to an appropriate weak-type variant of these Orlicz-Sobolev inequalities (in the same sense that is the weak-type quasi-norm). We also give a very general condition for capacity inequalities to be equivalent to the usual (non-weak) Orlicz-Sobolev inequalities, which we require for the sequel. This extends a more restrictive (and partly implicit) condition obtained for in , following a similar condition for general in .
In Section 4 we use capacities to extend our results to the whole range . We also demonstrate that our estimates are sharp, by showing that the isoperimetric inequalities we obtain are in fact equivalent (up to universal constants) to the functional inequalities used to derive them. To give a taste of the type of results we obtain, we state the following theorem (see Theorem 4.13 for more details and a slightly stronger version):
Let and let denote a Young function, so that:
Then under our convexity assumptions, the following statements are equivalent:
where the best constants above satisfy:
with universal constants and depending explicitly on . In fact, the convexity assumptions are not needed for the direction , and the assumptions (1.5) are not needed if for the direction .
When , the direction reduces (up to constants) to Cheeger’s inequality, and the direction to its reversed form due to Buser–Ledoux. In addition, using and a result of Bobkov and Zegarlinski (generalizing a previous result of Bobkov and Götze ), a variant of Theorem 1.1 implies (see Corollary 4.8) the following:
Under our convexity assumptions, the -log-Sobolev inequality (1.3) (for ) is equivalent to the isoperimetric inequality:
with the best constants satisfying for some universal constants , uniformly on .
That the latter implies the former was previously shown by Bobkov and Zegarlinski without any convexity assumptions (we prove a more general result in Section 4). That the former implies the latter for is precisely the statement that under our convexity assumptions, recovering the previously mentioned result of Bakry–Ledoux and Ledoux .
Theorem 1.1 coupled with the equivalence between Orlicz-Sobolev inequalities and capacity inequalities, enables us to directly infer isoperimetric inequalities from their -capacity counterparts under our convexity assumptions. Previous works of Barthe–Roberto , Barthe–Cattiaux–Roberto and Roberto–Zegarlinski have shown that -capacity inequalities are often equivalent to certain other types of functional inequalities, such as the Latała–Oleszkiewicz inequality (or more general Beckner-type inequalities) and additive -Sobolev inequalities. The advantage of these inequalities compared to the Orlicz-Sobolev inequalities lies in the fact that they admit tensorization. To further demonstrate the usefulness of the framework we develop, we easily deduce in Section 5 as a by-product of our methods the dimension-free tensorization results of . In fact, we prove the following natural extension of these results. By the Central-Limit Theorem, one cannot expect a dimension-free result for isoperimetric profiles which are better than the one for the Gaussian measure (and even in this case some badly behaved examples due to Franck Barthe are known ), so some condition needs to be imposed (we refer to Section 5 for more details):
Then without any additional convexity assumptions, there exists a constant depending only on , such that for any :
As already mentioned, our convexity assumptions throughout this work are used via the semi-group argument described in Section 2. More precisely, in that section we assume that our smooth convexity assumptions are fulfilled. To justify the passage to the limit and conclude that our results are valid under arbitrary convexity assumptions, we develop a careful approximation argument in Section 6. We emphasize that this is not just a technical matter: in general, it is simply not true that Orlicz-Sobolev inequalities on the spaces are stable under taking the limit of in the total-variation norm (see Section 6), so the convexity assumptions will need to be exploited one last time. To the best of our knowledge, with the exception of the tensorization results above, none of the previously known results mentioned here addressed this point, and these results were deduced under the additional smoothness assumptions.
Acknowledgements. I would like to thank Professor Jean Bourgain and the Institute for Advanced Study for providing the perfect research environment. Most especially, I would like to thank Sasha Sodin for his invaluable help - acquainting me with capacities, suggesting to look at Ledoux’s semi-group argument, countless other references, many informative conversations and comments on this manuscript. I am also thankful to Professors Franck Barthe and Michel Ledoux for their remarks on earlier versions of this manuscript.
The Semi-Group Argument
In this section, we prove the direction of Theorem 1.1 for . Our proof is an adaptation of the semi-group argument used in our earlier work , which in turn closely follows Ledoux’s proof of [34, Theorem 5.2].
Let denote an Orlicz norm associated to the Young function . Then:
This lemma implies that we can pass back and forth between using the median and the expectation when excluding constant functions in our functional inequalities, at the expense of losing a universal constant.
is always convex, but unfortunately it may attain the value of , so it will not be a Young function according to our definition. To avoid this minor issue, it will be more convenient to work with the dual norm to :
We denote by the dual norm to , given by:
Although this will not be used, we comment that it is a nice exercise (e.g. ) to show that when is a Young function then:
The second inequality is usually called Young’s inequality.
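The role of Young's inequality here can be sketched as follows, assuming that the Orlicz norm is the Luxemburg norm and that $N^*$ denotes the Legendre conjugate of $N$. All symbols are notational assumptions of this sketch, and we assume that both norms are finite and positive.

```latex
\begin{align*}
N^*(b) &:= \sup_{a \geq 0} \big( ab - N(a) \big),
\qquad
ab \leq N(a) + N^*(b) \quad (a, b \geq 0), \\
\| f \|_{N(\mu)} &:= \inf \Big\{ v > 0 \;;\; \int N\big( |f| / v \big) \, d\mu \leq 1 \Big\}, \\
\int |f g| \, d\mu
&\leq \| f \|_{N(\mu)} \, \| g \|_{N^*(\mu)}
\int \Big[ N\Big( \tfrac{|f|}{\| f \|_{N(\mu)}} \Big)
         + N^*\Big( \tfrac{|g|}{\| g \|_{N^*(\mu)}} \Big) \Big] d\mu \\
&\leq 2 \, \| f \|_{N(\mu)} \, \| g \|_{N^*(\mu)} .
\end{align*}
```

The factor $2$ arises because each of the two integrals in the bracket is at most $1$ by the definition of the Luxemburg norm.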
Let denote a Young function. Then for any Borel set with :
On one hand, denoting , since we have:
On the other hand, by Jensen’s inequality, for any with , we have:
2 Semi-Group Gradient Estimates
where is the usual Laplace-Beltrami operator on . acts on , the space of bounded smooth real-valued functions on . Let denote the semi-group associated to the diffusion process with infinitesimal generator (cf. ), characterized by the following system of second order differential equations:
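For concreteness, the usual description of such a semi-group on a weighted manifold is sketched here, assuming the measure has density $\exp(-\psi)$ with respect to the Riemannian volume. The symbols $L$, $\psi$ and $P_t$ are notational assumptions of this sketch.

```latex
\[
L := \Delta - \langle \nabla \psi, \nabla \rangle,
\qquad
\frac{d}{dt} P_t f = L (P_t f), \qquad P_0 f = f .
\]
% L is symmetric with respect to \mu: for smooth, compactly supported f, g,
\[
\int g \, L f \, d\mu = - \int \langle \nabla f, \nabla g \rangle \, d\mu .
\]
```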
For each , is a bounded linear operator and its action naturally extends to the entire spaces (). We collect several elementary properties of these operators:
for all .
The following crucial dimension-free reverse Poincaré inequality was shown by Bakry and Ledoux in [2, Lemma 4.2], extending Ledoux’s approach for proving Buser’s Theorem (see also [2, Lemma 2.4], [34, Lemma 5.1]):
Assume that the following Bakry-Émery Curvature-Dimension condition holds on :
Then for any and , we have:
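The well-known shape of this reverse Poincaré inequality is sketched below, under the assumption that the curvature lower bound is $-K$ with $K \geq 0$. The symbols $K$, $\psi$, $c(t)$ and $P_t$ are notational assumptions, and the inequality is understood pointwise for bounded smooth $f$ and $t \geq 0$.

```latex
\[
\mathrm{Ric}_g + \nabla^2 \psi \geq -K g
\quad \Longrightarrow \quad
c(t) \, |\nabla P_t f|^2 \leq P_t(f^2) - (P_t f)^2,
\qquad
c(t) := \frac{1 - \exp(-2Kt)}{K} .
\]
% In the limiting case K = 0 one has c(t) = 2t, so that
\[
2t \, |\nabla P_t f|^2 \leq P_t(f^2) - (P_t f)^2 .
\]
```

Note that the constant $c(t)$ does not depend on the dimension of the manifold, which is what makes the estimate dimension-free.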
Our convexity assumptions are that in Lemma 2.3, and this is what we will henceforth assume. It is clear that our results in this section as well as Section 4 may be extended to the case of , but we do not pursue this direction in this work.
From Lemma 2.3, it is immediate that for any :
and using , Ledoux easily deduces the following dual statement [34, (5.5)]:
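The two estimates of this type can be derived along the following lines. This is a sketch in the case $K = 0$, assuming the pointwise bound $2t\,|\nabla P_t f|^2 \leq P_t(f^2)$, the invariance $\int P_t g \, d\mu = \int g \, d\mu$ and the symmetry of the generator $L$ with respect to $\mu$; the exact range of exponents and the numbering used in the text may differ.

```latex
% Gradient bound in L_q for q >= 2: Jensen's inequality for the Markov
% operator P_t (with exponent q/2 >= 1), followed by invariance of \mu.
\[
|\nabla P_t f|^q \leq (2t)^{-q/2} \big( P_t(f^2) \big)^{q/2} \leq (2t)^{-q/2} P_t\big( |f|^q \big)
\quad \Longrightarrow \quad
\big\| |\nabla P_t f| \big\|_{L_q(\mu)} \leq \frac{1}{\sqrt{2t}} \, \| f \|_{L_q(\mu)} .
\]
% Dual statement in L_1: test against g with \|g\|_{L_\infty} \leq 1.
\begin{align*}
\int g \, (f - P_t f) \, d\mu
&= - \int_0^t \int g \, L P_s f \, d\mu \, ds
 = \int_0^t \int \langle \nabla P_s g, \nabla f \rangle \, d\mu \, ds \\
&\leq \big\| |\nabla f| \big\|_{L_1(\mu)} \int_0^t \frac{\| g \|_{L_\infty(\mu)}}{\sqrt{2s}} \, ds
 = \sqrt{2t} \, \| g \|_{L_\infty(\mu)} \, \big\| |\nabla f| \big\|_{L_1(\mu)} ,
\end{align*}
\[
\text{hence} \qquad
\| f - P_t f \|_{L_1(\mu)} \leq \sqrt{2t} \, \big\| |\nabla f| \big\|_{L_1(\mu)} .
\]
```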
3 Orlicz-Sobolev implies Isoperimetry for q ≥ 2
Let and let denote a Young function. Then under our convexity assumptions, the statement:
with , a universal constant.
We will see how to relax the assumption that to as well as the requirement that is convex in Theorem 4.5, in which case we will get a different lower bound on which will depend on and .
Since is a Young function, we may replace in (2.5) by using Lemma 2.1 at the expense of an additional universal constant in the final conclusion.
Let denote an arbitrary Borel set in , and let denote a continuous approximation in to the characteristic function of . Clearly:
Applying Corollary 2.4 to functions in which approximate (in say ) and passing to the limit inferior as , it follows that:
We start by rewriting the right hand side above as:
To estimate the right-most expression, we use the definition of the dual norm:
Note that we could have also used Young’s inequality, yielding instead of above, but this would lead to slightly worse numeric estimates. Using our assumption (2.5) with replaced by , we get:
Using (2.3) (recall that ) to estimate , we conclude that:
Using Lemma 2.2, we estimate :
We also have the following rough estimate (for ):
It remains to optimize on . Evaluating (2.7) at time:
As evident from the proof, the definition of smooth convexity assumptions given in the Introduction may be extended to encompass the more general case treated in this section. Consequently, the same remark applies to all of the subsequent results which employ our convexity assumptions.
Capacities
As already mentioned in the Introduction, -capacity inequalities are certain functional formulations of isoperimetric inequalities. We conform to the definition given in , which is a variation on the definition introduced by Maz’ya (for general ) and extended by Barthe, Cattiaux and Roberto (with ) in (after being introduced in ). In this section, we introduce a coherent unified framework which provides an equivalence between capacity inequalities and weak-type Orlicz-Sobolev functional inequalities (introduced below), and a general sufficient condition for an equivalence to Orlicz-Sobolev inequalities. We also provide an argument for handling general metric probability spaces. There is essentially no novel content in some parts of this section, and these are provided here for completeness.
Given a metric probability space , and , we denote:
where the infimum is on all which are Lipschitz-on-balls.
Both Maz’ya’s definition for general and the definition of Barthe–Cattiaux–Roberto for the case use instead of our normalized . Our definition seems more convenient, as witnessed by the formulation of our results below.
The use of the metric induced by the geodesic distance on was essential for applying the (linear) semi-group argument of the previous section. Throughout this section, as well as the relevant parts of Sections 4 and 5, such a restriction no longer exists, and one may use an arbitrary metric . In this case, we interpret for any as the following Borel function:
(and we define it as 0 if is an isolated point - see [16, pp. 184,189] for more details).
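On a general metric space, the modulus of the gradient is customarily defined as a local Lipschitz constant. The following is a sketch of that standard definition; the symbols $f$, $x$, $y$ and $d$ are notational assumptions.

```latex
\[
|\nabla f|(x) := \limsup_{d(y,x) \to 0^+} \frac{|f(y) - f(x)|}{d(x,y)} .
\]
```

When $f$ is a smooth function on a Riemannian manifold and $d$ is the geodesic distance, this quantity coincides with the Riemannian length of the gradient of $f$.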
A remark which will be useful for dealing with general metric probability spaces is that in the definition of capacity, we may always assume that , for any , even though we may have . The argument is as follows.
Denote the discrete countable set of atoms of under , and write , , with (and set if and if ). Denote , and set for :
Clearly and , so is a valid approximation. Since is Lipschitz-on-balls and has the same set of atoms as , it is immediate to verify that for every :
But the integral on the right hand side of (3.1) is 0 since .
The following proposition (see , , , [50, Proposition A]) encapsulates the connection between capacity and the isoperimetric profile (we refer to for a careful proof).
Since obviously , we have the following useful corollary:
Note that the operation is an involution on , and that for .
is non-decreasing iff is non-increasing ().
It is enough to prove the “only if” direction for by Remark 3.6. Our assumption is that for all :
Let be given. Using , above (which is legitimate since is increasing), we deduce:
We denote by the weak quasi-norm, defined as:
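For reference, the standard weak $L_p$ quasi-norm and its comparison with the usual $L_p$ norm via the Markov-Chebyshev inequality can be sketched as follows. The notation $L_{p,\infty}(\mu)$ is an assumption of this sketch.

```latex
\[
\| f \|_{L_{p,\infty}(\mu)} := \sup_{t > 0} \; t \, \mu\{ |f| \geq t \}^{1/p} ,
\]
\[
t^p \, \mu\{ |f| \geq t \} \leq \int_{\{ |f| \geq t \}} |f|^p \, d\mu \leq \| f \|_{L_p(\mu)}^p
\quad \Longrightarrow \quad
\| f \|_{L_{p,\infty}(\mu)} \leq \| f \|_{L_p(\mu)} .
\]
```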
We now extend the definition of the weak quasi-norm to Orlicz quasi-norms , using the adjoint function :
Given , define the weak quasi-norm as:
This definition is consistent with the one for , and satisfies:
as easily checked using the Markov-Chebyshev inequality. Also note that by a simple union-bound:
The motivation for the definition of stems from the immediate observation that for any Borel set :
For this reason, the expression already appears in the works of Maz’ya [40, p. 112] and Roberto–Zegarlinski .
is called a weak-type Orlicz-Sobolev inequality.
The weak-type Orlicz-Sobolev inequality (3.4) implies:
Apply (3.4) to , where is any Lipschitz-on-balls function so that and . Since , it follows that:
Taking the infimum over all as above, the assertion is verified. ∎
2 Equivalences
Let , then the following statements are equivalent:
and the best constants above satisfy .
Given a non-negative function as above ( hence ), and , define and . Then:
Taking supremum on , the assertion follows. ∎
and the best constants above satisfy .
As already mentioned in the Introduction, we call an inequality of the form (3.6) an Orlicz-Sobolev inequality (even though may not be convex).
One may show (see e.g. the proof of [46, Theorem 1]) that when is convex (so in particular is non-decreasing), Proposition 3.11 is equivalent to a theorem of Maz’ya [40, p. 112], but there the condition on is hidden. Such a stronger assumption is too restrictive for our purposes. Under this stronger assumption, the statement of this proposition was used in the case in and for in .
The last inequality follows from the fact that is non-decreasing, so denoting , we indeed verify that:
We will first assume that is bounded. Given a bounded non-negative function as above ( and ), we may assume by homogeneity that . For , denote , , and set . Also denote . Now:
It remains to show that . Indeed:
where in the last inequality we have used the fact that is non-decreasing, hence is non-decreasing, and therefore:
whenever , which is indeed the case for us.
For a non-bounded with , we may define so that and (just for safety) . It then follows by what was proved for bounded functions that:
where all limits exist since they are non-decreasing. To conclude, , since is continuous, so by the Monotone Convergence Theorem:
We immediately deduce from Propositions 3.10 and 3.11 the following peculiar corollary on the equivalence of the weak and usual Orlicz norms for some functional inequalities:
and the best constants above satisfy .
This corollary seems useful, even in the case of and , where this amounts to an equivalent characterization of the classical Poincaré inequality, using the weak quasi-norm on the left hand side. We do not know whether this characterization was previously noticed.
Another useful fact which follows from Propositions 3.10 and 3.11 is that the behavior of at a neighborhood of 0 is simply irrelevant as far as Orlicz-Sobolev inequalities are concerned:
Then the following statements are equivalent:
and the best constants above satisfy .
Note that still satisfies that is non-decreasing and that on . Using Proposition 3.10 to pass from the Orlicz-Sobolev inequality to a capacity inequality, we can then exchange between and , and use Proposition 3.11 to pass back to the other Orlicz-Sobolev inequality. ∎
The General Theorem
Note that the assumption was needed for the proof of Theorem 2.5 in order to use the estimate (2.3), and the convexity of was needed to employ Lemma 2.2. In order to relax these assumptions, as well as to deduce the direction in Theorem 1.1, we will need some additional observations, which are most naturally formulated in the language of capacities.
In the following proposition, the case is due to Maz’ya [40, p. 105]. Motivated by the method used in our joint work with Sodin in , we provide an independent proof, which generalizes to the case of an arbitrary metric probability space and . We denote the conjugate exponent to by .
Let and set . Then for all :
Let be given, and let be a function in such that and . As usual (see Remark 3.3), by approximating , we may assume that for all . Let denote the discrete set of atoms of under , set and denote .
We now choose , so that denoting for , , and setting , we have , where will be chosen later. Denote in addition , . Applying Hölder’s inequality twice, we estimate:
Since and is non-decreasing in , we continue to estimate as follows:
where we have used that , , and in the last inequality the fact that is non-decreasing in . The assertion now follows by taking supremum on all as above, and choosing the optimal . ∎
Let , and let so that is non-decreasing for some (in particular this holds with when is a Young function). Then for any :
Let us evaluate the integral on and separately:
On the other hand, since is non-increasing:
Summing these two expressions, the assertion follows. ∎
We do not optimize on the dependence on here, since in our applications . In this case, note that in (4.1) and in (4.2) conveniently satisfy:
where is some universal constant. This will be used in the proof of Theorem 4.5 below.
Let , and let satisfy:
is non-decreasing.
If then is a convex (hence Young) function.
Note that since , it is almost everywhere differentiable. Also note that our integrability conditions (4.3) together with ensure that . We will assume that , the case follows by taking limit.
For the first part, it is equivalent to show that is non-increasing, which in turn is equivalent to checking that , defined below, is non-decreasing:
and the integrability condition (4.3) ensures that . We will show that is non-increasing, from which it will follow that , hence , as claimed. Indeed, for almost all :
The last expression is indeed non-positive, since is non-decreasing, hence is non-increasing, and by differentiating the latter expression one verifies that .
For the second part, let us substitute the definitions of in (4.4) and perform the change of variables . This amounts to:
Taking the derivative, we obtain that for almost every :
Multiplying by the denominator on the left hand side and taking the derivative once again yields that for almost every :
In particular, is twice differentiable for almost every , and it is clear that if almost everywhere. The latter amounts to checking that for almost all :
When , this follows from the stronger statement:
which indeed holds for almost all , as verified by differentiating , which by assumption is non-decreasing.
2 Orlicz-Sobolev implies Isoperimetry for q ≥ 1
We can now prove the following extension of Theorem 2.5:
The assumption that in Theorem 2.5 can be relaxed to , and the assumption that is a Young function omitted, if we assume in addition that:
In this case, under our convexity assumptions, (2.5) implies (2.6) with:
where is a universal constant, and:
Estimating the expression in (4.5) is connected to Hardy-type inequalities. We do not proceed in this direction in this work, since for our applications the bounds are easy to deduce directly. We remark that whenever is non-decreasing for some , is non-increasing, and so:
Using this estimate, it is immediate to show that the expression on the right hand side of (4.5) is bounded from above by a universal constant whenever , even if the infimum in (4.5) is replaced by a supremum. In particular, this obviously applies to all Young functions (with ).
First, note that whatever the value of , we have:
By Corollary 3.16, we can always assume that on , so that on , and therefore:
Using the assumption that is non-decreasing, hence is non-increasing, it follows that:
The assumption (2.5) implies by Proposition 3.10 that:
We start with the case . Using Proposition 4.1 (with ) to pass from to , together with Lemma 4.2 (with ) and Remark 4.3, we obtain that:
for some universal constant , where is a function so that:
Since is non-decreasing and the integrability conditions (4.7), (4.8) are fulfilled, we can apply Lemma 4.4 with , , and conclude that is a Young function and that is non-decreasing. Proposition 3.11 then implies that:
We can now apply Theorem 2.5, and conclude that:
with a universal constant. The value of in (4.5) ensures that this implies:
as required. This concludes the proof when .
When , we use a similar argument. Let denote the function so that:
Again, by Lemma 4.4 with , , we know that is a Young function and that is non-decreasing. Recalling Remark 4.6, the assumption (4.9) implies that:
for some universal . Proposition 3.11 then implies that:
We can now apply Theorem 2.5, and using the definition of in (4.5), conclude that:
Let , , and assume that:
with as in . Then under our convexity assumptions, the assumption (2.5) implies the conclusion (2.6) with:
whenever . Applying Theorem 4.5 and using (4.12), it is straightforward to obtain a lower bound on the expression in (4.5), which yields the bound in (4.11). ∎
It was shown by Bobkov and Zegarlinski [18, Proposition 3.1] (generalizing the case due to Bobkov and Götze [15, Proposition 4.1]) that the following -log-Sobolev inequality (with ):
where , and uniformly on . Using Lemma 2.1, we can replace in (4.14) by , at the expense of an additional universal constant. Using Corollary 4.7 with and in the range , we can easily show that the -log-Sobolev inequality (4.13) implies a corresponding isoperimetric inequality. However, to handle the entire range uniformly, we will need to turn to Theorem 4.5.
Under our convexity assumptions, the -log-Sobolev inequality (4.13) for implies the following isoperimetric inequality:
is non-decreasing, so using Lemma 2.1 and Corollary 3.16, (4.13) implies that:
Note that is still non-decreasing. Clearly:
uniformly on . Hence, using Theorem 4.5 to deduce the isoperimetric inequality (4.15), it remains to bound the expression in (4.5) from below uniformly in . This amounts to showing that:
where and is a universal constant. First, we bound the tail of this integral using (4.16):
which is bounded by a universal constant for . Next, we use (4.16) and the change of variables to bound:
We see that this is also bounded in the range , and this concludes the proof.
The case was previously shown by Bakry–Ledoux and Ledoux . For general , the reverse direction without any convexity assumptions was shown by Bobkov and Zegarlinski , and given a different proof by Sodin and the author . We will see a general argument for this in the next theorem.
3 Isoperimetry implies Orlicz-Sobolev
Let , and set . Let , so that is non-decreasing. Then:
We rewrite (4.17) using Corollary 3.5 as:
Using Proposition 4.1 (with ) to pass from to , we obtain that:
where is defined on as:
Incidentally, if we replace in the upper range of the above integral by , by Lemma 4.4 with , , we would have that is non-increasing, but this will not be used. The estimate in (4.19) ensures that:
Using that is non-decreasing, Proposition 3.11 then implies (4.18), as asserted. ∎
Note that the assumption (4.17) implies (4.20) without assuming that is non-decreasing.
Let and set . Assume that:
with some . Then the assumption (4.17) implies the conclusion (4.18) with:
Exactly as in the proof of Corollary 4.7. ∎
Using this for and , we see that as already noted in Remark 4.9, the isoperimetric inequality implies without any further assumptions the -log-Sobolev inequalities (4.14) and (4.13).
4 Summary
To conclude this section, we provide a slightly stronger version of Theorem 1.1 from the Introduction, on the equivalence of isoperimetric and Orlicz-Sobolev functional inequalities under our convexity assumptions. Our results in this section are more general, but this theorem summarizes the most useful cases given by Theorem 2.5 and Corollaries 4.7 and 4.12, and generalizes the results from (which dealt with the case below).
Let , and let . Assume that:
Then the following statements are equivalent:
where the best constants above satisfy:
with universal constants and:
In fact, among the assumptions (4.22), (4.23), (4.24), (4.25):
For the direction only (4.24) is needed.
For the direction with only (4.22) and one of (4.23) or (4.24) are needed, and if (4.23) is used then can be chosen to be .
For the direction with (4.23) is not needed.
The direction was proved in Theorem 2.5 and Corollary 4.7. The direction was proved in Corollary 4.12. ∎
Tensorization
As mentioned in the Introduction, the results of Section 4 coupled with the results of Section 3 on the equivalence of capacity inequalities and Orlicz-Sobolev inequalities, allow us to directly infer isoperimetric inequalities from capacity inequalities (under convexity assumptions of course).
Let , and let . If:
then the following statements are equivalent:
where the best constants above satisfy:
with universal constants and as in (4.26). In fact, among the assumptions (5.1), (5.2), (5.3), (5.4):
For the direction only (5.3) is needed.
For the direction with only (5.1) and one of (5.2) or (5.3) are needed, and if (5.2) is used then in (4.26) can be chosen to be .
For the direction with (5.2) is not needed.
The direction follows from Proposition 3.11 coupled with Theorem 2.5 and Corollary 4.7. The direction follows from Theorem 4.10, Remark 4.11 and Corollary 4.12 (note that we indeed do not need the assumption that is non-decreasing). ∎
It has been established in recent years that several other types of functional inequalities are equivalent to -capacity inequalities. These include Beckner-type inequalities (including the Latała–Oleszkiewicz inequality as in ) and additive -Sobolev inequalities . The advantage of these inequalities compared to the Orlicz-Sobolev inequalities lies in the fact that they admit tensorization. This easily allows us to deduce an extension of the dimension-free tensorization results of Barthe–Cattiaux–Roberto . We demonstrate this with Beckner-type inequalities (5.5), using Theorem 9 and Lemma 8 in as cited in (with a trivial change of notation):
with the best constants above satisfying .
It is known (e.g. ) that Beckner-type inequalities (5.5) admit tensorization, in the sense that if they hold for then they also hold for the Riemannian product for any . To obtain the most general result, we will also need the following remarkable observation of Franck Barthe [4, Theorem 10] (which in fact holds for very general metric probability spaces, but for simplicity we quote it in less general form; see also Ros ):
Then (without any additional convexity assumptions) there exists a constant depending only on , such that for any :
Our formulation of Theorem 5.4 using the condition (5.7), without referring to an auxiliary profile where is some 1-dimensional density, seems more natural than previous requirements, and this will also be evident in the proof. As mentioned in the Introduction, it seems possible to produce a proof of this theorem using the approach of , but the main obstacle would be to pass from the isoperimetric inequality to the appropriate -capacity inequality, using only and without passing via the auxiliary density (compare with Theorem 7 and Propositions 9, 13 in ), which would otherwise result in requiring some additional technical assumptions and in the constant depending on . On the other hand, without any further technical assumptions, Theorem 5.4 basically follows from the argument used to derive Theorem 5.1 coupled with Theorems 5.2 and 5.3. To make this precise we will need to be slightly more careful.
Denote for . Clearly , is non-decreasing and on . Now denote:
Next, let denote the function so that:
Indeed, Fact 3 implies that , and together with the linear growth of at infinity, this means that and hence . We now apply Lemma 4.4 with and . The appeal to Lemma 4.4 is legitimate since and since is non-decreasing by Fact 2. We deduce that:
In addition, Facts 2 and 3 provide the following estimates:
and an elementary computation provided in Lemma 5.5 below implies that:
with meaning that the bounds depend on .
Proposition 4.1 (with ) implies that for all :
is essentially non-decreasing (with the same constant) on . The latter follows from (5.11) and Fact 3 on .
This concludes the proof, up to the proof of Lemma 5.5 below. ∎
for some constant , which will conclude the proof.
The lower bound is immediate from the lower bound in (5.10). For the upper bound, we decompose the integral into two parts:
with the first one interpreted as if . By Fact 1 and the definition of , the second integral can be estimated for by:
To estimate the first integral for , we use the upper bound in (5.10) and the change of variables :
In fact, by inspecting the bound given by Theorem 9 in more carefully, one can repeat our argument for an arbitrary isoperimetric profile (perhaps violating the Central-Limit obstruction), and study what happens to the profile under tensorization. We leave this for another note.
Approximation Argument
Recall that is said to converge to in total-variation if:
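One standard way to state this condition is sketched below. The names $\mu_m$ and $\mu$ for the approximating and limiting measures follow the usage later in this section; the densities $\rho_m$, $\rho$ and the reference measure $\nu$ are assumed notation, and the normalization (with or without a factor $1/2$) does not affect the notion of convergence:

```latex
% Total-variation convergence of Borel probability measures \mu_m to \mu on \Omega.
\[
  \sup_{A \subset \Omega \ \text{Borel}} \left| \mu_m(A) - \mu(A) \right|
  \longrightarrow 0 \qquad (m \to \infty) .
\]
% Equivalent formulation when \mu_m = \rho_m \, d\nu and \mu = \rho \, d\nu
% for a common reference measure \nu (assumed notation):
\[
  \int_\Omega \left| \rho_m - \rho \right| d\nu \longrightarrow 0
  \qquad (m \to \infty) .
\]
```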
In this section, we provide a careful approximation argument for deducing that our results from Section 2 hold under arbitrary convexity assumptions, without requiring any further smoothness conditions (as defined in the Introduction or more generally in Section 2 and Remark 2.7). We recall that at this point, the proof of Theorem 2.5 is only valid under the additional smoothness conditions. We emphasize that this is not just a technical matter, and that our convexity assumptions will need to be invoked once again. To explain this better, let us describe a naive approximation approach which completely fails. Suppose that satisfies an Orlicz-Sobolev inequality as in the assumption of Theorem 2.5, and we would like to deduce from this the conclusion of this theorem, assuming that satisfies our convexity assumptions. By definition, we know that there exists a sequence which approximates in total-variation, such that satisfy our smooth convexity assumptions, and so the proof of Theorem 2.5 applies to these spaces. One may hope that since approximate , the spaces will also satisfy the Orlicz-Sobolev inequality (perhaps with a worse constant), allowing us to apply Theorem 2.5. Unfortunately, this is completely false in general. For instance, consider the measures which are uniform on the set , and converge to , the uniform measure on . Clearly, $(\Omega,d,\mu_{m})$ for $m\geq 3$ does not satisfy any $(N,q)$ Orlicz-Sobolev inequality, whereas $(\Omega,d,\mu)$ will satisfy any reasonable inequality (Poincaré, log-Sobolev, etc.). We conclude that a different approach is needed.
Our strategy in this section will be to show that the semi-group estimates of Section 2 can be transferred to a setting without any smoothness assumptions. Our original argument, which at first relied on a method of weak-convergence due to Williams and Zheng (see also Burdzy and Chen ), has been replaced by an elementary argument which we provide below. We continue with the notations used in Section 2, and recall the following definition:
A domain is said to be locally convex, if all geodesics in tangent to are locally outside of . By a result of Bishop , in case that has boundary, this is equivalent to requiring that the second fundamental form of with respect to the normal pointing into be positive semi-definite on all of .
One may always choose a version of which is locally Lipschitz on .
Let denote the Borel subset of points for which the sequence converges to (in the wide sense). We know that . We will show that for each , there exists a neighborhood and a constant , so that:
Consequently, it will follow that one may extend by continuity from to the entire , defining for as . The estimate (6.1) will imply that this limit is well defined and that the resulting satisfies the same locally Lipschitz condition.
It is known that for any , the geodesic open ball for small enough is convex embedded in , in the sense that it is both geodesically convex and that the exponential map is a diffeomorphism between and . We will therefore choose our neighborhood to be a convex embedded ball , so that in addition is contained in . Since converge to almost everywhere, it is clear that if then must be contained in for all (we could add “apart from a subset of zero measure” for safety, but this is in fact not necessary due to the convexity of the domains).
Choosing small enough, it is known (e.g. [28, p. 643], [3, p. 311]) that is a function on whose Riemannian Hessian satisfies on for some . Denoting:
we define .
Since for all , on , it follows from the above construction that on . Since , it is known ([3, p. 310]) that this is equivalent to being geodesically convex in . We now employ [47, Theorem 10.8], whose proof easily passes to the Riemannian setting (taking into account a slight modification provided in [3, Lemma 2.1] of a Euclidean argument). This theorem asserts that if a sequence of geodesically convex functions converges pointwise on a dense subset of a geodesically convex open set (to a finite value at each point of the subset), then the pointwise limit in fact exists for each , and the function is finite and geodesically convex on . Since converges to the finite function on , it follows that in fact converges to a geodesically convex function on the entire . Writing this function as , we realize that coincides with on . The argument is therefore complete. ∎
In fact, we have shown that converge pointwise on all of , and that the limit is a semi-convex function (in the sense that in a small enough geodesically convex neighborhood, we may add to it a smooth function to obtain a geodesically convex function in that neighborhood).
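The mechanism behind this semi-convexity can be illustrated by a one-line Hessian computation. The sketch below uses notation that is assumed rather than taken from the text: $B$ is a convex embedded ball centered at $x_0$, $\rho(x) = d(x,x_0)^2$ satisfies $\mathrm{Hess}\,\rho \geq \delta\, g$ on $B$ for some $\delta > 0$, and $f_m$ are smooth functions on $B$ with a uniform lower bound $\mathrm{Hess}\, f_m \geq -R\, g$ for a hypothetical constant $R \geq 0$:

```latex
% Assumptions (hypothetical names): Hess f_m >= -R g and Hess rho >= delta g on B.
% Adding a suitable multiple of rho compensates for the negative part of Hess f_m.
\[
  \mathrm{Hess}\Big( f_m + \frac{R}{\delta}\, \rho \Big)
  \;\geq\; -R\, g + \frac{R}{\delta}\, \delta\, g
  \;=\; 0
  \qquad \text{on } B .
\]
```

Under these assumptions each $f_m + (R/\delta)\rho$ is geodesically convex on $B$, so any pointwise limit is geodesically convex as well, and subtracting the smooth function $(R/\delta)\rho$ exhibits the limit of $f_m$ as a semi-convex function on $B$.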
We will henceforth choose as constructed in Lemma 6.1. This implies by Rademacher’s theorem that:
The differential exists almost everywhere on .
is bounded almost everywhere on compact subsets of .
Let denote the diffusion process on with reflection on the boundary generated by , defined on the probability space and some fixed . Let for denote the semi-group associated to . We will assume that the initial distribution of is given by the stationary measure , so that is a stationary process.
Using a forward-backward martingale decomposition due to Lyons and Zheng and a tightness criterion for stochastic processes with continuous paths, it can be shown as in [56, p. 472] (see also [21, p. 31], [27, pp. 248-257]) with a minor adaptation to the Riemannian setting, that there exists a subsequence which converges weakly (as measures on endowed with the locally uniform topology) to some process . By passing to a subsequence, let us assume that converges weakly to . In particular, for any fixed , the law of weakly converges to that of , and hence is also stationary with stationary measure . Since is by definition distributed according to , there is a one-to-one correspondence between the spaces and , where is the -field generated by and is defined on the space with probability measure . Consequently, we can define for any the following (bounded) linear operator on :
In fact, as in , it should be possible to show that is a continuous Markov process, that is a strongly continuous semi-group associated to it, and that the associated Dirichlet form is exactly given by:
However, we will not require all this information. We will only use the weak convergence (through a subsequence) of the processes , defined on the 2-point set , to . We provide an elementary argument to deduce the tightness of this sequence (from which the former statement follows by Prokhorov’s Theorem). Fixing a point , since the process is stationary:
which is easily seen to hold, since this holds for the (probability) measure , and converge to in total-variation.
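The tightness claim can be spelled out as follows. This is a sketch in assumed notation: $X^m$ denotes the stationary process with stationary measure $\mu_m$, $K \subset \Omega$ is a compact set, $t > 0$ is fixed, and $\varepsilon > 0$ is arbitrary.

```latex
% Union bound plus stationarity: X^m_0 and X^m_t both have law \mu_m.
\[
  \mathbb{P}\big( (X^m_0, X^m_t) \notin K \times K \big)
  \;\leq\; \mathbb{P}\big( X^m_0 \notin K \big) + \mathbb{P}\big( X^m_t \notin K \big)
  \;=\; 2\, \mu_m(\Omega \setminus K) .
\]
% Total-variation convergence controls the right-hand side in terms of \mu:
\[
  \mu_m(\Omega \setminus K)
  \;\leq\; \mu(\Omega \setminus K)
  + \sup_{A \subset \Omega \ \text{Borel}} \left| \mu_m(A) - \mu(A) \right| .
\]
```

Choosing $K$ with $\mu(\Omega \setminus K) \leq \varepsilon/4$, and $m$ large enough that the supremum is at most $\varepsilon/4$, bounds the first probability by $\varepsilon$ for all large $m$. The finitely many remaining indices can be handled by enlarging $K$, since each individual Borel probability measure on a complete separable metric space is tight.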
We conclude that by passing to a subsequence, we may assume that converges weakly to . In other words, for any continuous and bounded on :
where is the linear operator defined in (6.2). Clearly, this also extends to hold for all and .
Using (6.3), we can now transfer the known estimates (2.3) and Corollary 2.4 on the semi-groups to the operators . Indeed, given a bounded and smooth function on , we need to show that:
using the same estimates (6.4) and (6.5) for and replacing and , respectively (the latter are known to be true as described in Section 2, after approximating with functions in ). Note that we interpret the second estimate (6.5) regarding in the sense of distributions as described below, since this is all that is needed for the applications of Section 2. Writing:
the estimate (6.4) immediately follows from the same estimate for and and the weak convergence (6.3). The estimate (6.5) is harder to handle, since it involves the distributional gradient of . Setting , we interpret:
and is interpreted in the distributional sense (using integration by parts). Let denote such a vector field as above. Since converges to in total-variation, it remains to show the first equality in:
Since for all , the convergence of to in total-variation implies that the second term converges to 0 as . To handle the first term, we integrate by parts:
It follows from Lemma 6.1 and Remark 6.3 that is a bounded function. This implies that the first term converges to 0 by , and since , the convergence of to in total-variation implies that the second term converges to 0 as well.
We conclude that the estimates (6.4) and (6.5) hold for the linear operators defined using the limiting process , and we may use these operators in place of the diffusion semi-group in the relevant parts of the proof of Theorem 2.5 (and consequently Theorems 4.5, 4.13 and 5.1), so that the conclusion of these theorems remains valid for as above.