On the role of Convexity in Functional and Isoperimetric Inequalities

Emanuel Milman

Introduction

Let $(\Omega,d,\mu)$ denote a metric probability space. More precisely, we assume that $(\Omega,d)$ is a separable metric space and that $\mu$ is a Borel probability measure on $(\Omega,d)$ which is not a unit mass at a point. Although it is not essential for the ensuing discussion, it will be more convenient to specialize to the case where $\Omega$ is a complete smooth oriented $n$-dimensional Riemannian manifold $(M,g)$, $d$ is the induced geodesic distance, and $\mu$ is an absolutely continuous measure with respect to the Riemannian volume form $vol_{M}$ on $M$. This work continues the study of the interplay between the metric $d$ and the measure $\mu$ initiated in . There are various ways to measure this relationship, which may typically be arranged according to strength, forming a hierarchy. In this work, we will be primarily concerned with two such ways.

The first way is by means of an isoperimetric inequality. Recall that Minkowski’s (exterior) boundary measure of a Borel set $A\subset\Omega$, which we denote here by $\mu^{+}(A)$, is defined as:
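For reference, the standard form of this definition can be written out as follows. This is a reconstruction under the usual conventions; the notation $A^{d}_{\varepsilon}$ for the open $\varepsilon$-neighbourhood of $A$ is an assumption of this sketch.

```latex
% Minkowski (exterior) boundary measure, standard form:
\mu^{+}(A) := \liminf_{\varepsilon \to 0} \frac{\mu(A^{d}_{\varepsilon}) - \mu(A)}{\varepsilon},
\qquad
% assumed notation for the open epsilon-neighbourhood of A with respect to d:
A^{d}_{\varepsilon} := \left\{ x \in \Omega \;;\; \exists\, y \in A ,\ d(x,y) < \varepsilon \right\}.
```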

We will say that the space $(\Omega,d,\mu)$ satisfies an $(N,q)$ Orlicz-Sobolev inequality ($N\in\mathcal{N}$, $q\geq 1$) if:
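A sketch of the inequality in question, reconstructed from the way (1.2) is used later in the text (the median $M_{\mu}f$ on the left, an $L_{q}$ norm of the gradient on the right). The Luxemburg form of the Orlicz norm and the placement of the constant $D$ are assumptions of this sketch.

```latex
% Assumed (Luxemburg) Orlicz norm associated with N:
\left\| f \right\|_{N(\mu)} := \inf\left\{ v > 0 \;;\; \int_{\Omega} N\!\left( |f| / v \right) d\mu \leq 1 \right\}.

% (N,q) Orlicz-Sobolev inequality, for all admissible f (the class \mathcal{F}):
D \left\| f - M_{\mu} f \right\|_{N(\mu)} \leq \left\| \left| \nabla f \right| \right\|_{L_{q}(\mu)} .

% Sanity check for N(t) = t^p:
%   \int (|f|/v)^p d\mu \le 1   <=>   v \ge \|f\|_{L_p(\mu)},
% so in this case \|f\|_{N(\mu)} = \|f\|_{L_p(\mu)}.
```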

A similar (yet different) definition was given by Roberto and Zegarlinski in the case $q=2$ following the work of Maz’ya [40, p. 112]. Our preference to use the median $M_{\mu}$ in our definition (in place of the more standard expectation $E_{\mu}$) is immaterial whenever $N$ is a convex function (see Lemma 2.1).

When $N(t)=t^{p}$, in which case $N(\mu)$ is just the usual $L_{p}(\mu)$ norm, we will refer to the inequality (1.2) as a $(p,q)$ Poincaré inequality. If in addition $M_{\mu}$ in (1.2) is replaced by $E_{\mu}$, the case $p=q=2$ is then just the classical Poincaré inequality, and we denote the best constant in this inequality by $D_{Poin}$. Similarly, the case $q=1$, $p=\frac{n}{n-1}$ corresponds to the Gagliardo–Nirenberg–Sobolev inequality, and a limiting case when $n$ tends to infinity is the so-called log-Sobolev inequality. More generally, we say that our space satisfies a $q$-log-Sobolev inequality ($q\in$), if there exists a constant $D>0$ so that:
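One common way to write such an inequality, in the style of Bobkov and Zegarlinski, is sketched below. The normalization (the power $1/q$ and the placement of $D$) is an assumption chosen to match the shape of (1.2), and may differ from the displayed inequality (1.3) by such conventions.

```latex
% q-log-Sobolev inequality (assumed normalization), for all admissible f:
D \left( \int |f|^{q} \log |f|^{q} \, d\mu
       - \int |f|^{q} \, d\mu \, \log\!\left( \int |f|^{q} \, d\mu \right) \right)^{1/q}
\leq \left\| \left| \nabla f \right| \right\|_{L_{q}(\mu)} .

% The bracket is the entropy of |f|^q with respect to \mu; it is non-negative by
% Jensen's inequality applied to the convex function x \mapsto x \log x.
```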

The best possible constant $D$ above is denoted by $D_{LS_{q}}=D_{LS_{q}}(\Omega,d,\mu)$. Although these inequalities do not precisely fit into our announced framework, it follows from the work of Bobkov and Zegarlinski that they are in fact equivalent to some corresponding Orlicz-Sobolev inequalities (see Section 4). Various other functional inequalities admit an equivalent (up to universal constants) formulation using an appropriate Orlicz norm $N(\mu)$ on the left hand side of (1.2). We refer the reader to the recent paper of Barthe and Kolesnikov and the references therein for an account of several other types of functional inequalities.

It is well known that various isoperimetric inequalities imply their functional “counterparts”. It was shown by Maz’ya and independently by Cheeger, to whom this is usually attributed, that Cheeger’s isoperimetric inequality implies Poincaré’s inequality: $D_{Poin}\geq D_{Che}/2$ (Cheeger’s inequality). It was first observed by M. Ledoux that a Gaussian isoperimetric inequality implies a $2$-log-Sobolev inequality: $D_{LS_{2}}\geq cD_{Gau}$, for some universal constant $c>0$. This was later refined by Beckner (see ) using an equivalent functional form of the Gaussian isoperimetric inequality due to S. Bobkov (see also ): $D_{LS_{2}}\geq D_{Gau}/\sqrt{2}$. The constants $2$ and $\sqrt{2}$ above are known to be optimal.
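The isoperimetric constants $D_{Che}$ and $D_{Gau}$ used above are usually defined through the isoperimetric profile. The following is a sketch under the standard normalizations; the symbols $\varphi_{\gamma}$ and $\Phi_{\gamma}$ for the one-dimensional Gaussian density and distribution function are notation introduced here for illustration.

```latex
% Isoperimetric profile (assumed standard definition):
I(t) := \inf\left\{ \mu^{+}(A) \;;\; \mu(A) = t \right\}, \qquad t \in [0,1].

% Cheeger-type (linear) isoperimetric inequality:
I(t) \geq D_{Che} \, \min(t, 1-t), \qquad t \in [0,1].

% Gaussian-type isoperimetric inequality, where
% \varphi_{\gamma}(s) = (2\pi)^{-1/2} e^{-s^2/2} and \Phi_{\gamma}(s) = \int_{-\infty}^{s} \varphi_{\gamma}(u)\, du:
I(t) \geq D_{Gau} \, \varphi_{\gamma}\!\left( \Phi_{\gamma}^{-1}(t) \right), \qquad t \in [0,1].
```

With this normalization, the standard Gaussian measure on $\mathbb{R}$ satisfies the last inequality with equality for $D_{Gau}=1$.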

1.2 Reversing the Hierarchy

In both cases, we will say that “our convexity assumptions are fulfilled”. More generally, we recall the following definition from :

We will say that our smooth convexity assumptions are fulfilled if:

$d$ denotes the induced geodesic distance on $(M,g)$.

$d\mu=\exp(-\psi)dvol_{M}$, $\psi\in C^{2}(M)$, and as tensor fields on $M$:

We will say that our convexity assumptions are fulfilled if $\mu$ can be approximated in total-variation by measures $\left\{\mu_{m}\right\}$ so that $(\Omega,d,\mu_{m})$ satisfy our smooth convexity assumptions.

The condition (1.4) is the well-known Curvature-Dimension condition $CD(0,\infty)$, introduced by Bakry and Émery in their celebrated paper (in the more abstract framework of diffusion generators). Here $Ric_{g}$ denotes the Ricci curvature tensor and $Hess_{g}$ denotes the second covariant derivative.
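For concreteness, the $CD(0,\infty)$ condition referred to above is usually written as below, followed by two familiar examples. This is a sketch of the standard Bakry–Émery form; the examples are illustrations added here.

```latex
% CD(0,\infty) for d\mu = \exp(-\psi) dvol_M, as symmetric 2-tensors on M:
Ric_{g} + Hess_{g}\,\psi \geq 0 .

% Example 1: M = \mathbb{R}^n with the Euclidean metric, so Ric_g = 0.
%   The condition reduces to Hess \psi \ge 0, i.e. \psi is convex and \mu is log-concave.
% Example 2: the standard Gaussian measure on \mathbb{R}^n,
%   \psi(x) = |x|^2/2 + (n/2)\log(2\pi),  so  Hess \psi = Id \ge 0.
```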

It is known that under our convexity assumptions, the implications stated in the previous subsection can be reversed: $D_{Che}\geq c_{1}D_{Poin}$ and $D_{Gau}\geq c_{2}D_{LS_{2}}$, for some universal constants $c_{1},c_{2}>0$. That Cheeger’s inequality can be reversed was first shown by Buser when $\mu$ is uniform on a closed manifold with $Ric_{g}\geq 0$, and was recently strengthened and generalized by Ledoux to the Bakry–Émery abstract setting, assuming our smooth convexity assumptions. That a $2$-log-Sobolev inequality implies a Gaussian isoperimetric inequality under these assumptions was first shown by Bakry and Ledoux [2, Section 4] (see also Ledoux ).

1.3 The Results

In this work, we generalize all of the above mentioned implications following Ledoux’s diffusion semi-group approach to a more general framework. Such a program was initiated in our previous work , where it was first shown how to use the $CD(0,\infty)$ condition via Ledoux’s semi-group gradient estimates to deduce isoperimetric inequalities from $(p,q)$ Poincaré inequalities. Contrary to previous approaches, which could only deduce isoperimetric information from functional inequalities with a $\left\|\left|\nabla f\right|\right\|_{L_{q}(\mu)}$ term with $q=2$ (see [8, p. 3] and the references therein), it was shown in how to handle arbitrary $q\geq 2$. In the case of $(p,q)$ Poincaré inequalities, an easy reduction step in fact enables one to handle arbitrary $q\geq 1$. In this work, we show how to deduce isoperimetric inequalities from very general Orlicz-Sobolev inequalities in the entire range $q\geq 1$.

The easier case of $q\geq 2$ is handled in Section 2, by generalizing our argument for $(p,q)$ Poincaré inequalities from . Extending our results to the case $q\geq 1$ (which is very important for applications) requires additional work, to which end we employ the notion of capacity. Capacity inequalities are certain functional formulations of isoperimetric inequalities, which were introduced around 1960 by Maz’ya, Federer and Fleming, and used by Bobkov and Houdré in . Maz’ya’s notion of $q$-capacity for $q=2$ has recently been extended to the metric probability space setting by Barthe, Cattiaux and Roberto in (after being introduced in ), where it was used to deduce isoperimetric inequalities, and has subsequently appeared in other works as well (e.g. ). We recall the appropriate definitions in Section 3, and show that $q$-capacity inequalities are equivalent in full generality to an appropriate weak-type variant of these Orlicz-Sobolev inequalities (in the same sense that $L_{p,\infty}$ is the weak-type $L_{p}$ quasi-norm). We also give a very general condition for capacity inequalities to be equivalent to the usual (non-weak) Orlicz-Sobolev inequalities, which we require for the sequel. This extends a more restrictive (and partly implicit) condition obtained for $q=2$ in , following a similar condition for general $q$ in .

In Section 4 we use capacities to extend our results to the whole range $q\geq 1$. We also demonstrate that our estimates are sharp, by showing that the isoperimetric inequalities we obtain are in fact equivalent (up to universal constants) to the functional inequalities used to derive them. To give a taste of the type of results we obtain, we state the following theorem (see Theorem 4.13 for more details and a slightly stronger version):

Let $1\leq q\leq\infty$ and let $N$ denote a Young function, so that:

Then under our convexity assumptions, the following statements are equivalent:

where the best constants $D_{1},D_{2}$ above satisfy:

with $c_{1},c_{2}>0$ universal constants and $B_{\alpha,q},C_{\alpha,q}$ depending explicitly on $\alpha,q$. In fact, the convexity assumptions are not needed for the direction $(2)\Rightarrow(1)$, and the assumptions (1.5) are not needed if $q\geq 2$ for the direction $(1)\Rightarrow(2)$.

When $N(t)=t^{2}$, $q=2$, the direction $(2)\Rightarrow(1)$ reduces (up to constants) to Cheeger’s inequality, and the direction $(1)\Rightarrow(2)$ to its reversed form due to Buser–Ledoux. In addition, using $N(t)=t^{q}\log(1+t^{q})$ and a result of Bobkov and Zegarlinski (generalizing a previous result of Bobkov and Götze ), a variant of Theorem 1.1 implies (see Corollary 4.8) the following:

Under our convexity assumptions, the $q$-log-Sobolev inequality (1.3) (for $q\in$) is equivalent to the isoperimetric inequality:

with the best constants $D_{LS_{q}},D_{I_{q}}$ satisfying $C_{1}D_{LS_{q}}\leq D_{I_{q}}\leq C_{2}D_{LS_{q}}$ for some universal constants $C_{1},C_{2}>0$, uniformly on $q\in$.

That the latter implies the former was previously shown by Bobkov and Zegarlinski without any convexity assumptions (we prove a more general result in Section 4). That the former implies the latter for $q=2$ is precisely the statement that $D_{Gau}\geq cD_{LS_{2}}$ under our convexity assumptions, recovering the previously mentioned result of Bakry–Ledoux and Ledoux .

Theorem 1.1, coupled with the equivalence between Orlicz-Sobolev inequalities and capacity inequalities, enables us to directly infer isoperimetric inequalities from their $q$-capacity counterparts under our convexity assumptions. Previous works of Barthe–Roberto , Barthe–Cattiaux–Roberto and Roberto–Zegarlinski have shown that $2$-capacity inequalities are often equivalent to certain other types of functional inequalities, such as the Latała–Oleszkiewicz inequality (or more general Beckner-type inequalities) and additive $\Phi$-Sobolev inequalities. The advantage of these inequalities compared to the Orlicz-Sobolev inequalities lies in the fact that they admit tensorization. To further demonstrate the usefulness of the framework we develop, we easily deduce in Section 5 as a by-product of our methods the dimension-free tensorization results of . In fact, we prove the following natural extension of these results. By the Central-Limit Theorem, one cannot expect a dimension-free result for isoperimetric profiles which are better than the one for the Gaussian measure (and even in this case some badly behaved examples due to Franck Barthe are known ), so some condition needs to be imposed (we refer to Section 5 for more details):

Then without any additional convexity assumptions, there exists a constant $c_{D}>0$ depending only on $D$, such that for any $k\geq 1$:

As already mentioned, our convexity assumptions throughout this work are used via the semi-group argument described in Section 2. More precisely, in that section we assume that our smooth convexity assumptions are fulfilled. To justify the passage to the limit and conclude that our results are valid under arbitrary convexity assumptions, we develop a careful approximation argument in Section 6. We emphasize that this is not just a technical matter: in general it is simply not true that $(N,q)$ Orlicz-Sobolev inequalities on the spaces $(\Omega,d,\mu_{m})$ are stable under taking the limit of $\mu_{m}$ in the total-variation norm (see Section 6), so the convexity assumptions will need to be exploited one last time. To the best of our knowledge, with the exclusion of the tensorization results above, all the previously known results which were mentioned did not address this point, and these results were deduced under the additional smoothness assumptions.

Acknowledgements. I would like to thank Professor Jean Bourgain and the Institute for Advanced Study for providing the perfect research environment. Most especially, I would like to thank Sasha Sodin for his invaluable help - acquainting me with capacities, suggesting to look at Ledoux’s semi-group argument, countless other references, many informative conversations and comments on this manuscript. I am also thankful to Professors Franck Barthe and Michel Ledoux for their remarks on earlier versions of this manuscript.

The Semi-Group Argument

In this section, we prove the direction $(1)\Rightarrow(2)$ of Theorem 1.1 for $q\geq 2$. Our proof is an adaptation of the semi-group argument used in our earlier work , which in turn closely follows Ledoux’s proof of [34, Theorem 5.2].

Let $N(\mu)$ denote an Orlicz norm associated to the Young function $N$. Then:

This lemma implies that we can pass back and forth between using the median $M_{\mu}$ and the expectation $E_{\mu}$ when excluding constant functions in our functional inequalities, at the expense of losing a universal constant.

$N^{*}$ is always convex, but unfortunately it may attain the value of $+\infty$, so it will not be a Young function according to our definition. To avoid this minor issue, it will be more convenient to work with the dual norm to $N(\mu)$:

We denote by $N(\mu)^{*}$ the dual norm to $N(\mu)$, given by:
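The usual way of writing a dual norm of this kind is sketched below, together with the Hölder-type inequality it gives for free. It is a reconstruction under the standard convention.

```latex
% Dual norm to N(\mu) (standard form):
\left\| f \right\|_{N(\mu)^{*}} := \sup\left\{ \int_{\Omega} f g \, d\mu \;;\; \left\| g \right\|_{N(\mu)} \leq 1 \right\}.

% Immediate consequence, by homogeneity in g:
\int_{\Omega} f g \, d\mu \leq \left\| f \right\|_{N(\mu)^{*}} \left\| g \right\|_{N(\mu)} .
```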

Although this will not be used, we comment that it is a nice exercise (e.g. ) to show that when $N$ is a Young function then:

The second inequality is usually called Young’s inequality.

Let $N$ denote a Young function. Then for any Borel set $A$ with $\mu(A)>0$:

On one hand, denoting $g_{0}:=N^{-1}(1/\mu(A))\chi_{A}$, since $\left\|g_{0}\right\|_{N(\mu)}=1$ we have:

On the other hand, by Jensen’s inequality, for any $g$ with $\left\|g\right\|_{N(\mu)}\leq 1$, we have:
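The two steps just described can be spelled out as follows. This is a sketch that assumes the Luxemburg form of $\left\|\cdot\right\|_{N(\mu)}$ and the dual norm defined as a supremum of $\int fg\,d\mu$ over the unit ball of $N(\mu)$; the identity on the first line is the reconstructed conclusion of the lemma.

```latex
% Reconstructed identity: for a Young function N and \mu(A) > 0,
\left\| \chi_{A} \right\|_{N(\mu)^{*}} = \mu(A)\, N^{-1}\!\left( 1/\mu(A) \right).

% Lower bound: \int N(g_0)\, d\mu = \mu(A) \cdot (1/\mu(A)) = 1, so g_0 is admissible and
\left\| \chi_{A} \right\|_{N(\mu)^{*}} \geq \int \chi_{A}\, g_{0}\, d\mu = \mu(A)\, N^{-1}\!\left( 1/\mu(A) \right).

% Upper bound: \|g\|_{N(\mu)} \le 1 gives \int N(|g|)\, d\mu \le 1, and by Jensen (N convex)
N\!\left( \frac{1}{\mu(A)} \int_{A} |g| \, d\mu \right)
  \leq \frac{1}{\mu(A)} \int_{A} N(|g|)\, d\mu \leq \frac{1}{\mu(A)} ,
% so that, applying the increasing function N^{-1},
\int \chi_{A}\, g \, d\mu \leq \int_{A} |g| \, d\mu \leq \mu(A)\, N^{-1}\!\left( 1/\mu(A) \right).

% Sanity check: N(t) = t^p gives \mu(A)^{1 - 1/p}, the L_{p^*}(\mu) norm of \chi_A.
```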

2.2 Semi-Group Gradient Estimates

where $\Delta_{\Omega}$ is the usual Laplace-Beltrami operator on $\Omega$. $\Delta_{(\Omega,\mu)}$ acts on $\mathcal{B}(\Omega)$, the space of bounded smooth real-valued functions on $\Omega$. Let $(P_{t})_{t\geq 0}$ denote the semi-group associated to the diffusion process with infinitesimal generator $\Delta_{(\Omega,\mu)}$ (cf. ), characterized by the following system of second order differential equations:
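For orientation, the generator and the evolution equation described above are usually written as follows when $d\mu=\exp(-\psi)dvol_{M}$. This is a sketch of the standard form, stated formally for sufficiently regular $f,g$.

```latex
% Generator associated with d\mu = \exp(-\psi) dvol_M (assumed standard form):
\Delta_{(\Omega,\mu)} f := \Delta_{\Omega} f - \left\langle \nabla \psi , \nabla f \right\rangle .

% Evolution equation characterizing the semi-group:
\frac{d}{dt} P_{t}(f) = \Delta_{(\Omega,\mu)} \left( P_{t}(f) \right), \qquad P_{0}(f) = f .

% Integration by parts with respect to \mu (symmetry of the generator):
\int f \, \Delta_{(\Omega,\mu)} g \, d\mu = - \int \left\langle \nabla f , \nabla g \right\rangle d\mu .
```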

For each $t\geq 0$, $P_{t}:\mathcal{B}(\Omega)\rightarrow\mathcal{B}(\Omega)$ is a bounded linear operator and its action naturally extends to the entire $L_{p}(\mu)$ spaces ($p\geq 1$). We collect several elementary properties of these operators:

$\left|P_{t}(f)\right|^{p}\leq P_{t}(\left|f\right|^{p})$ for all $p\geq 1$.

The following crucial dimension-free reverse Poincaré inequality was shown by Bakry and Ledoux in [2, Lemma 4.2], extending Ledoux’s approach for proving Buser’s Theorem (see also [2, Lemma 2.4], [34, Lemma 5.1]):

Assume that the following Bakry-Émery Curvature-Dimension condition holds on $\Omega$:

Then for any $t\geq 0$ and $f\in\mathcal{B}(\Omega)$, we have:

Our convexity assumptions are that $K=0$ in Lemma 2.3, and this is what we will henceforth assume. It is clear that our results in this section as well as Section 4 may be extended to the case of $K>0$, but we do not pursue this direction in this work.

From Lemma 2.3, it is immediate that for any $2\leq q\leq\infty$:

and using $q=\infty$, Ledoux easily deduces the following dual statement [34, (5.5)]:
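The chain of estimates behind these two statements can be sketched as follows in the case $K=0$. The first line is the form the Bakry–Ledoux reverse Poincaré inequality is assumed to take here; the remaining lines use only the Jensen-type property of $P_{t}$ and the invariance of $\mu$.

```latex
% Assumed K = 0 form of the reverse Poincare inequality, for t \ge 0 and f \in \mathcal{B}(\Omega):
2t \left| \nabla P_{t} f \right|^{2} \leq P_{t}(f^{2}) - (P_{t} f)^{2} \leq P_{t}(f^{2}).

% Raise to the power q/2 \ge 1 and use |P_t(h)|^{p} \le P_t(|h|^{p}) with h = f^2, p = q/2:
\left| \nabla P_{t} f \right|^{q} \leq (2t)^{-q/2} \, P_{t}(f^{2})^{q/2} \leq (2t)^{-q/2} \, P_{t}(|f|^{q}).

% Integrate against \mu and use \int P_t(h)\, d\mu = \int h \, d\mu
% (for q = \infty take suprema instead of integrating):
\left\| \left| \nabla P_{t} f \right| \right\|_{L_{q}(\mu)} \leq \frac{1}{\sqrt{2t}} \left\| f \right\|_{L_{q}(\mu)},
\qquad 2 \leq q \leq \infty .

% Dual statement: for \|g\|_{L_\infty} \le 1, by symmetry and the case q = \infty,
% \int g (f - P_t f)\, d\mu = \int_0^t \int \langle \nabla P_s g , \nabla f \rangle \, d\mu \, ds
%   \le \| |\nabla f| \|_{L_1(\mu)} \int_0^t (2s)^{-1/2} ds , and \int_0^t (2s)^{-1/2} ds = \sqrt{2t}:
\left\| f - P_{t} f \right\|_{L_{1}(\mu)} \leq \sqrt{2t} \, \left\| \left| \nabla f \right| \right\|_{L_{1}(\mu)} .
```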

2.3 Orlicz-Sobolev implies Isoperimetry for $q\geq 2$

Let $2\leq q\leq\infty$ and let $N$ denote a Young function. Then under our convexity assumptions, the statement:

with $C_{N,q}\geq c>0$, a universal constant.

We will see how to relax the assumption that $q\geq 2$ to $q\geq 1$ as well as the requirement that $N$ is convex in Theorem 4.5, in which case we will get a different lower bound on $C_{N,q}$ which will depend on $N$ and $q$.

Since $N$ is a Young function, we may replace $M_{\mu}f$ in (2.5) by $E_{\mu}f$ using Lemma 2.1 at the expense of an additional universal constant in the final conclusion.

Let $A$ denote an arbitrary Borel set in $\Omega$, and let $\chi_{A,\varepsilon}(x):=(1-\frac{1}{\varepsilon}d_{g}(x,A))\vee 0$ denote a continuous approximation in $\Omega$ to the characteristic function $\chi_{A}$ of $A$. Clearly:

Applying Corollary 2.4 to functions in $\mathcal{B}(\Omega)$ which approximate $\chi_{A,\varepsilon}$ (in say $W^{1,1}(\Omega,\mu)$) and passing to the limit inferior as $\varepsilon\rightarrow 0$, it follows that:

We start by rewriting the right hand side above as:

To estimate the right-most expression, we use the definition of the dual norm:

Note that we could have also used Young’s inequality, yielding $2\left\|g\right\|_{N^{*}(\mu)}$ instead of $\left\|g\right\|_{N(\mu)^{*}}$ above, but this would lead to slightly worse numeric estimates. Using our assumption (2.5) with $M_{\mu}$ replaced by $E_{\mu}$, we get:

Using (2.3) (recall that $q\geq 2$) to estimate $\left\|\left|\nabla P_{t}\chi_{A}\right|\right\|_{L_{q}(\mu)}$, we conclude that:

Using Lemma 2.2, we estimate $\left\|\chi_{A}-\mu(A)\right\|_{N(\mu)^{*}}$:

We also have the following rough estimate (for $q\geq 2$):

It remains to optimize on $t$. Evaluating (2.7) at time:

As evident from the proof, the definition of smooth convexity assumptions given in the Introduction may be extended to encompass the more general case treated in this section. Consequently, the same remark applies to all of the subsequent results which employ our convexity assumptions.

Capacities

As already mentioned in the Introduction, $q$-capacity inequalities are certain functional formulations of isoperimetric inequalities. We conform to the definition given in , which is a variation on the definition introduced by Maz’ya (for general $q$) and extended by Barthe, Cattiaux and Roberto (with $q=2$) in (after being introduced in ). In this section, we introduce a coherent unified framework which provides an equivalence between capacity inequalities and weak-type Orlicz-Sobolev functional inequalities (introduced below), and a general sufficient condition for an equivalence to Orlicz-Sobolev inequalities. We also provide an argument for handling general metric probability spaces. There is essentially no novel content in some parts of this section, and these are provided here for completeness.

Given a metric probability space $(\Omega,d,\mu)$, $1\leq q<\infty$ and $0\leq a\leq b\leq 1$, we denote:

where the infimum is on all $\Phi:\Omega\rightarrow[0,1]$ which are Lipschitz-on-balls.
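A sketch of the quantity being defined, reconstructed from the constraints $\mu\left\{\Phi=1\right\}\geq a$ and $\mu\left\{\Phi=0\right\}\geq 1-b$ that are imposed on $\Phi$ when capacities are used later in the text. The exact typesetting of the original display is an assumption.

```latex
% q-capacity (reconstruction), infimum over Lipschitz-on-balls \Phi : \Omega \to [0,1]:
Cap_{q}(a,b) := \inf\left\{ \left\| \left| \nabla \Phi \right| \right\|_{L_{q}(\mu)} \;;\;
   \mu\left\{ \Phi = 1 \right\} \geq a , \ \mu\left\{ \Phi = 0 \right\} \geq 1 - b \right\}.

% Monotonicity read off from the constraints: increasing a, or decreasing b,
% shrinks the admissible class, so Cap_q(a,b) is non-decreasing in a
% and non-increasing in b.
```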

Both Maz’ya’s definition for general $q$ and the definition of Barthe–Cattiaux–Roberto for the case $q=2$ use $\int\left|\nabla\Phi\right|^{q}d\mu$ instead of our normalized $\left\|\left|\nabla\Phi\right|\right\|_{L_{q}(\mu)}$. Our definition seems more convenient, as witnessed by the formulation of our results below.

The use of the metric $d$ induced by the geodesic distance on $(M,g)$ was essential for applying the (linear) semi-group argument of the previous section. Throughout this section, as well as the relevant parts of Sections 4 and 5, such a restriction no longer exists, and one may use an arbitrary metric $d$. In this case, we interpret $\left|\nabla f\right|$ for any $f\in\mathcal{F}$ as the following Borel function:

(and we define it as $0$ if $x$ is an isolated point - see [16, pp. 184,189] for more details).
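The usual metric-space interpretation of the gradient norm is the local Lipschitz constant, sketched below with a one-dimensional example. The formula is the standard one and is assumed to coincide with the display intended here.

```latex
% Local Lipschitz constant (standard form):
\left| \nabla f \right|(x) := \limsup_{d(y,x) \to 0+} \frac{\left| f(y) - f(x) \right|}{d(x,y)} .

% Example: on \mathbb{R} with the usual distance, f(x) = |x| gives |\nabla f|(x) = 1
% for every x, including x = 0 where f is not differentiable.
```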

A remark which will be useful for dealing with general metric probability spaces, is that in the definition of capacity, we may always assume that $\int_{\left\{\Phi=t\right\}}\left|\nabla\Phi\right|^{q}d\mu=0$, for any $t\in(0,1)$, even though we may have $\mu\left\{\Phi=t\right\}>0$. The argument is as follows.

Denote by $\Gamma:=\left\{t\in(0,1);\mu\left\{\Phi=t\right\}>0\right\}$ the discrete countable set of atoms of $\Phi$ under $\mu$, and write $\Gamma=\left\{\gamma_{i}\right\}_{i=-A,\ldots,B}$, $A,B\in\left\{0,1,\ldots,\infty\right\}$, with $\gamma_{i}<\gamma_{i+1}$ (and set $\gamma_{-(A+1)}=0$ if $A<\infty$ and $\gamma_{B+1}=1$ if $B<\infty$). Denote $\beta_{i}=(\gamma_{i}+\gamma_{i+1})/2$, and set for $\varepsilon>0$:

Clearly $\Phi_{\varepsilon}\in\mathcal{F}$ and $\left\|\left|\nabla\Phi_{\varepsilon}\right|\right\|_{L_{q}(\mu)}\leq(1+\varepsilon)\left\|\left|\nabla\Phi\right|\right\|_{L_{q}(\mu)}$, so $\Phi_{\varepsilon}$ is a valid approximation. Since $\Phi$ is Lipschitz-on-balls and $\Phi_{\varepsilon}$ has the same set of atoms $\Gamma$ as $\Phi$, it is immediate to verify that for every $\gamma_{i}\in\Gamma$:

But the integral on the right hand side of (3.1) is $0$ since $\mu_{i},\nu_{i}\notin\Gamma$.

The following proposition (see , , , [50, Proposition A]) encapsulates the connection between capacity and the isoperimetric profile $I=I_{(\Omega,d,\mu)}$ (we refer to for a careful proof).

Since obviously $Cap_{1}(a,b)=Cap_{1}(1-b,1-a)$, we have the following useful corollary:

Note that the operation $N\rightarrow N^{\wedge}$ is an involution on $\mathcal{N}$, and that $N(\cdot^{\alpha})^{\wedge}=(N^{\wedge})^{1/\alpha}$ for $\alpha>0$.
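Both claims can be checked by a short computation. The sketch below assumes that the adjoint function is $N^{\wedge}(t)=1/N^{-1}(1/t)$, which is the expression quoted later in this section; the auxiliary name $M$ is introduced only for this check.

```latex
% Assumed definition of the adjoint function:
N^{\wedge}(t) := \frac{1}{N^{-1}(1/t)} , \qquad t > 0 .

% Involution: solving s = 1/N^{-1}(1/t) for t gives (N^{\wedge})^{-1}(s) = 1/N(1/s), hence
(N^{\wedge})^{\wedge}(t) = \frac{1}{(N^{\wedge})^{-1}(1/t)} = N(t) .

% Power rule: M(t) := N(t^{\alpha}) has M^{-1}(s) = N^{-1}(s)^{1/\alpha}, hence
M^{\wedge}(t) = \frac{1}{N^{-1}(1/t)^{1/\alpha}} = N^{\wedge}(t)^{1/\alpha} .

% Example: N(t) = t^{p} gives N^{\wedge}(t) = t^{1/p}.
```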

$N(t^{\alpha})/t$ is non-decreasing iff $N^{\wedge}(t)^{1/\alpha}/t$ is non-increasing ($\alpha>0$).

It is enough to prove the “only if” direction for $\alpha=1$ by Remark 3.6. Our assumption is that for all $0<t_{1}\leq t_{2}$:

Let $s_{1}\geq s_{2}>0$ be given. Using $t_{i}=N^{-1}(1/s_{i})$, $i=1,2$ above (which is legitimate since $N$ is increasing), we deduce:

We denote by $L_{s,\infty}(\mu)$ the weak $L_{s}$ quasi-norm, defined as:

We now extend the definition of the weak $L_{s}$ quasi-norm to Orlicz quasi-norms $N(\mu)$, using the adjoint function $N^{\wedge}$:

Given $N\in\mathcal{N}$, define the weak $N(\mu)$ quasi-norm as:

This definition is consistent with the one for $L_{s,\infty}$, and satisfies:

as easily checked using the Markov-Chebyshev inequality. Also note that by a simple union-bound:

The motivation for the definition of $N^{\wedge}$ stems from the immediate observation that for any Borel set $A$:

For this reason, the expression $1/N^{-1}(1/t)$ already appears in the works of Maz’ya [40, p. 112] and Roberto–Zegarlinski.
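The definitions and observations of this passage can be summarized in one place as follows. The weak $L_{s}$ quasi-norm is standard; the form of the weak $N(\mu)$ quasi-norm is an assumption, chosen so that it reduces to $L_{s,\infty}$ when $N(t)=t^{s}$, and the Luxemburg form of $\left\|\cdot\right\|_{N(\mu)}$ is assumed as well.

```latex
% Weak L_s quasi-norm (standard):
\left\| f \right\|_{L_{s,\infty}(\mu)} := \sup_{t > 0} \; t \, \mu\left\{ |f| \geq t \right\}^{1/s} .

% Weak N(\mu) quasi-norm (assumed form), with N^{\wedge}(u) = 1/N^{-1}(1/u):
\left\| f \right\|_{N(\mu),\infty} := \sup_{t > 0} \; t \, N^{\wedge}\!\left( \mu\left\{ |f| \geq t \right\} \right) .

% Consistency: N(t) = t^{s} gives N^{\wedge}(u) = u^{1/s}, which recovers L_{s,\infty}.

% Characteristic functions:
%   \int N(\chi_A / v)\, d\mu = \mu(A)\, N(1/v) \le 1   <=>   v \ge 1/N^{-1}(1/\mu(A)),  so
\left\| \chi_{A} \right\|_{N(\mu)} = \frac{1}{N^{-1}(1/\mu(A))} = N^{\wedge}(\mu(A)) .

% Markov-Chebyshev: with v = \|f\|_{N(\mu)}, \mu\{|f| \ge t\}\, N(t/v) \le 1, hence
%   N^{\wedge}(\mu\{|f| \ge t\}) \le N^{\wedge}(1/N(t/v)) = v/t,  which gives
\left\| f \right\|_{N(\mu),\infty} \leq \left\| f \right\|_{N(\mu)} .
```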

is called a weak-type Orlicz-Sobolev inequality.

The weak-type Orlicz-Sobolev inequality (3.4) implies:

Apply (3.4) to $f=\Phi$, where $\Phi:\Omega\rightarrow[0,1]$ is any Lipschitz-on-balls function so that $\mu\left\{\Phi=1\right\}\geq t$ and $\mu\left\{\Phi=0\right\}\geq 1/2$. Since $M_{\mu}\Phi=0$, it follows that:

Taking the infimum over all $\Phi$ as above, the assertion is verified. ∎
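The argument above can be written out as the following sketch. It assumes that (3.4) has the form $D\left\|f-M_{\mu}f\right\|_{N(\mu),\infty}\leq\left\|\left|\nabla f\right|\right\|_{L_{q}(\mu)}$, with the weak quasi-norm given by $\sup_{s>0}sN^{\wedge}(\mu\left\{|f|\geq s\right\})$; the resulting capacity inequality is a reconstruction of the assertion being proved.

```latex
% Take the level s = 1 in the supremum defining the weak quasi-norm;
% since \mu\{\Phi \ge 1\} \ge \mu\{\Phi = 1\} \ge t and N^{\wedge} is increasing:
\left\| \Phi \right\|_{N(\mu),\infty} \geq 1 \cdot N^{\wedge}\!\left( \mu\left\{ \Phi \geq 1 \right\} \right) \geq N^{\wedge}(t) .

% Combine with the assumed weak-type inequality and M_\mu \Phi = 0:
D \, N^{\wedge}(t) \leq D \left\| \Phi \right\|_{N(\mu),\infty} \leq \left\| \left| \nabla \Phi \right| \right\|_{L_{q}(\mu)} .

% Infimum over all admissible \Phi:
Cap_{q}(t, 1/2) \geq D \, N^{\wedge}(t) , \qquad t \in [0,1/2] .
```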

3.2 Equivalences

Let $1\leq q<\infty$, then the following statements are equivalent:

and the best constants $D_{1},D_{2}$ above satisfy $D_{1}\leq D_{2}\leq 4D_{1}$.

Given a non-negative function $f$ as above ($\mu\left\{f=0\right\}\geq 1/2$ hence $M_{\mu}f=0$), and $t>0$, define $\Omega_{t}=\left\{f\leq t\right\}$ and $f_{t}:=f/t\wedge 1$. Then:

Taking supremum on $t>0$, the assertion follows. ∎

and the best constants $D_{1},D_{2}$ above satisfy $D_{1}\leq D_{2}\leq 4D_{1}$.

As already mentioned in the Introduction, we call an inequality of the form (3.6) an Orlicz-Sobolev inequality (even though $N$ may not be convex).

One may show (see e.g. the proof of [46, Theorem 1]) that when $N(t^{1/q})$ is convex (so in particular $N(t)^{1/q}/t$ is non-decreasing), Proposition 3.11 is equivalent to a theorem of Maz’ya [40, p. 112], but there the condition on $N$ is hidden. Such a stronger assumption is too restrictive for our purposes. Under this stronger assumption, the statement of this proposition was used in the case $q=2$ in and for $N(t)=t^{2}$, $q=2$ in .

The last inequality follows from the fact that $N^{1/q}(t)/t$ is non-decreasing, so denoting $v_{\pm}=\left\|f_{\pm}\right\|_{N(\mu)}$, we indeed verify that:

We will first assume that $f$ is bounded. Given a bounded non-negative function $f$ as above ($M_{\mu}f=0$ and $\mu\left\{f=0\right\}\geq 1/2$), we may assume by homogeneity that $\left\|f\right\|_{L_{\infty}}=1$. For $i\geq 1$, denote $\Omega_{i}=\left\{1/2^{i}\leq f\leq 1/2^{i-1}\right\}$, $m_{i}=\mu(\Omega_{i})$, $f_{i}=2^{i}(f-1/2^{i})\vee 0\wedge 1$ and set $m_{0}=0$. Also denote $J:=N^{\wedge}$. Now:

It remains to show that $\left\|f\right\|_{N(\mu)}\leq V$. Indeed:

where in the last inequality we have used the fact that $N(t)^{1/q}/t$ is non-decreasing, hence $(J^{-1})^{1/q}(t)/t$ is non-decreasing, and therefore:

whenever $x/y\leq 1$, which is indeed the case for us.

For a non-bounded $f\in\mathcal{F}$ with $\mu\left\{f=0\right\}\geq 1/2$, we may define $f_{m}=f\wedge b_{m}$ so that $\mu\left\{f>b_{m}\right\}\leq 1/m$ and (just for safety) $\mu\left\{f=b_{m}\right\}=0$. It then follows by what was proved for bounded functions that:

where all limits exist since they are non-decreasing. To conclude, $Z\geq\left\|f\right\|_{N(\mu)}$, since $N$ is continuous, so by the Monotone Convergence Theorem:

We immediately deduce from Propositions 3.10 and 3.11 the following peculiar corollary on the equivalence of the weak and usual Orlicz norms for some functional inequalities:

and the best constants $D_{1},D_{2}$ above satisfy $D_{1}\leq D_{2}\leq 4D_{1}$.

This corollary seems useful, even in the case of $F(t)=t^{2}$ and $q=2$, where this amounts to an equivalent characterization of the classical Poincaré inequality, using the weak $L_{2,\infty}$ quasi-norm on the left hand side. We do not know whether this characterization was previously noticed.

Another useful fact which follows from Propositions 3.10 and 3.11 is that the behavior of $N$ at a neighborhood of $0$ is simply irrelevant as far as Orlicz-Sobolev inequalities are concerned:

Then the following statements are equivalent:

and the best constants $D_{1},D_{2}$ above satisfy $\frac{1}{4}D_{1}\leq D_{2}\leq 4D_{1}$.

Note that $N_{0}$ still satisfies that $N_{0}(t)^{1/q}/t$ is non-decreasing and that $N^{\wedge}(t)=N_{0}^{\wedge}(t)$ on $t\in[0,1/2]$. Using Proposition 3.10 to pass from the Orlicz-Sobolev inequality to a capacity inequality, we can then exchange between $N$ and $N_{0}$, and use Proposition 3.11 to pass back to the other Orlicz-Sobolev inequality. ∎

The General Theorem

Note that the assumption $q\geq 2$ was needed for the proof of Theorem 2.5 in order to use the estimate (2.3), and the convexity of $N$ was needed to employ Lemma 2.2. In order to relax these assumptions, as well as to deduce the direction $(2)\Rightarrow(1)$ in Theorem 1.1, we will need some additional observations, which are most naturally formulated in the language of capacities.

In the following proposition, the case $q_{0}=1$ is due to Maz’ya [40, p. 105]. Motivated by the method used in our joint work with Sodin in , we provide an independent proof, which generalizes to the case of an arbitrary metric probability space and $q_{0}>1$. We denote the conjugate exponent to $q\in[1,\infty]$ by $q^{*}=q/(q-1)$.

Let $1\leq q_{0}\leq q<\infty$ and set $p_{0}=q_{0}^{*}$, $p=q^{*}$. Then for all $0<a<b<1$:

Let $0<a<b<1$ be given, and let $\Phi:\Omega\rightarrow[0,1]$ be a function in $\mathcal{F}$ such that $a^{\prime}:=\mu\left\{\Phi=1\right\}\geq a$ and $1-b^{\prime}:=\mu\left\{\Phi=0\right\}\geq 1-b$. As usual (see Remark 3.3), by approximating $\Phi$, we may assume that $\int_{\left\{\Phi=t\right\}}\left|\nabla\Phi\right|^{q}d\mu=0$ for all $t\in(0,1)$. Let $C:=\left\{t\in(0,1);\mu\left\{\Phi=t\right\}>0\right\}$ denote the discrete set of atoms of $\Phi$ under $\mu$, set $\Gamma:=\left\{\Phi\in C\right\}$ and denote $\gamma=\mu(\Gamma)$.

We now choose $t_{0}=0<t_{1}<t_{2}<\ldots<1$, so that denoting for $i\geq 1$, $\Omega_{i}=\left\{t_{i-1}\leq\Phi\leq t_{i}\right\}$, and setting $m_{i}=\mu(\Omega_{i}\setminus\Gamma)$, we have $m_{i}=(b^{\prime}-a^{\prime}-\gamma)\alpha^{i-1}(1-\alpha)$, where $0\leq\alpha\leq 1$ will be chosen later. Denote in addition $\Phi_{i}=\left(\frac{\Phi-t_{i-1}}{t_{i}-t_{i-1}}\vee 0\right)\wedge 1$, $N_{i}=\sum_{j>i}m_{j}$. Applying Hölder’s inequality twice, we estimate:

Since $\mu\left\{\Phi\geq t_{i}\right\}\geq a^{\prime}+N_{i}$ and $Cap_{q_{0}}(s,b)$ is non-decreasing in $s$, we continue to estimate as follows:

where we have used that $m_{i+1}=\alpha m_{i}$, $m_{i}=\frac{1-\alpha}{\alpha}N_{i}$, and in the last inequality the fact that $Cap_{q_{0}}(s,b)$ is non-decreasing in $s$. The assertion now follows by taking supremum on all $\Phi$ as above, and choosing the optimal $\alpha=1-p/p_{0}$. ∎

Let $1\leq p\leq p_{0}\leq\infty$, and let $N\in\mathcal{N}$ so that $N(t)^{1/\alpha}/t$ is non-decreasing for some $\alpha>0$ (in particular this holds with $\alpha=1$ when $N$ is a Young function). Then for any $t>0$:

Let us evaluate the integral on $[t,2t]$ and $[2t,\infty)$ separately:

On the other hand, since $N^{\wedge}(t^{\alpha})/t$ is non-increasing:

Summing these two expressions, the assertion follows. ∎

We do not optimize on the dependence on $\alpha$ here, since in our applications $\alpha\geq 1$. In this case, note that $\gamma_{p,p_{0}}$ in (4.1) and $\delta_{p,p_{0},\alpha}$ in (4.2) conveniently satisfy:

where $C>0$ is some universal constant. This will be used in the proof of Theorem 4.5 below.

Let $p_{1},p_{2},p_{3}\in[1,\infty]$, and let $N_{1}\in\mathcal{N}$ satisfy:

$N_{2}(t)^{1/p_{3}}/t$ is non-decreasing.

If $p_{2}\leq p_{3}$ then $N_{2}$ is a convex (hence Young) function.

Note that since $N_{1}\in\mathcal{N}$, it is almost everywhere differentiable. Also note that our integrability conditions (4.3) together with $N_{1}\in\mathcal{N}$ ensure that $N_{2}\in\mathcal{N}$. We will assume that $p_{2}<\infty$; the case $p_{2}=\infty$ follows by taking a limit.

For the first part, it is equivalent to show that $N_{2}^{\wedge}(t^{p_{3}})/t$ is non-increasing, which in turn is equivalent to checking that $F(t)$, defined below, is non-decreasing:

and the integrability condition (4.3) ensures that $\limsup_{t\rightarrow\infty}G(t)=0$. We will show that $G(t)$ is non-increasing, from which it will follow that $G(t)\geq 0$, hence $F^{\prime}(t)\geq 0$, as claimed. Indeed, for almost all $t>0$:

The last expression is indeed non-positive, since $N_{1}(t)^{1/p_{3}+1/p_{2}-1/p_{1}}/t$ is non-decreasing, hence $N_{1}^{\wedge}(t)^{1/(1/p_{3}+1/p_{2}-1/p_{1})}/t$ is non-increasing, and by differentiating the latter expression one verifies that $(N_{1}^{\wedge})^{\prime}(t)\leq(1/p_{3}+1/p_{2}-1/p_{1})N_{1}^{\wedge}(t)/t$.

For the second part, let us substitute the definitions of $N_{1}^{\wedge},N_{2}^{\wedge}$ in (4.4) and perform the change of variables $z=1/s$. This amounts to:

Taking the derivative, we obtain that for almost every $t>0$:

Multiplying by the denominator on the left hand side and taking the derivative once again yields that for almost every $t>0$:

In particular, $N_{2}$ is twice differentiable for almost every $t>0$, and it is clear that $N_{2}^{\prime\prime}\geq 0$ if $T\leq 0$ almost everywhere. The latter amounts to checking that for almost all $z>0$:

When $p_{2}\leq p_{3}$, this follows from the stronger statement:

which indeed holds for almost all $z>0$, as verified by differentiating $\frac{N_{1}(z)^{1/p_{3}+1/p_{2}-1/p_{1}}}{z}$, which by assumption is non-decreasing.

2 Orlicz-Sobolev implies Isoperimetry for $q\geq 1$

We can now prove the following extension of Theorem 2.5:

The assumption that $q\geq 2$ in Theorem 2.5 can be relaxed to $q\geq 1$, and the assumption that $N$ is a Young function omitted, if we assume in addition that:

In this case, under our convexity assumptions, (2.5) implies (2.6) with:

where $c>0$ is a universal constant, and:

Estimating the expression in (4.5) is connected to Hardy-type inequalities. We do not proceed in this direction in this work, since for our applications the bounds are easy to deduce directly. We remark that whenever $N(t)^{\alpha}/t$ is non-decreasing for some $\alpha>0$, $N^{\wedge}(t)^{1/\alpha}/t$ is non-increasing, and so:

Using this estimate, it is immediate to show that the expression on the right hand side of (4.5) is bounded from above by a universal constant whenever $1/q\leq\alpha\leq 1$, even if the infimum in (4.5) is replaced by a supremum. In particular, this obviously applies to all Young functions $N$ (with $\alpha=1$).
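The passage from the monotonicity of $N(t)^{\alpha}/t$ to that of $N^{\wedge}(t)^{1/\alpha}/t$ can be checked by a change of variables. The sketch below assumes the convention $N^{\wedge}(t)=1/N^{-1}(1/t)$, which is not restated in this section, and that $N$ is increasing and invertible; the auxiliary variable $s$ and the constant $c$ are introduced only for illustration.

```latex
% Assumption: N^{\wedge}(t) := 1/N^{-1}(1/t), with N increasing and invertible.
% Substitute s = N^{-1}(1/t), so that N(s) = 1/t and N^{\wedge}(t) = 1/s.
\begin{align*}
\frac{N^{\wedge}(t)^{1/\alpha}}{t}
  \;=\; s^{-1/\alpha}\,N(s)
  \;=\; \left(\frac{N(s)^{\alpha}}{s}\right)^{1/\alpha} .
\end{align*}
% As t increases, s = N^{-1}(1/t) decreases, so the right-hand side does not
% increase whenever N(s)^{\alpha}/s is non-decreasing in s.
%
% Sanity check with N(t) = c\,t^{q}, c > 0, for which N^{-1}(u) = (u/c)^{1/q}:
\begin{align*}
N^{\wedge}(t) \;=\; c^{1/q}\,t^{1/q},
\qquad
\frac{N(t)^{1/q}}{t} \;=\; c^{1/q},
\qquad
\frac{N^{\wedge}(t)^{q}}{t} \;=\; c .
\end{align*}
```

In the power case both ratios are constant, so $\alpha=1/q$ is exactly the borderline exponent for which the two monotonicity statements hold simultaneously.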

First, note that whatever the value of $q$, we have:

By Corollary 3.16, we can always assume that $N(t)=2(t/N^{-1}(2))^{q}$ on $t\in[0,N^{-1}(2)]$, so that $N^{\wedge}(t)=\frac{2^{1/q}}{N^{-1}(2)}t^{1/q}$ on $t\in[1/2,\infty)$, and therefore:

Using the assumption that $N(t)^{1/q}/t$ is non-decreasing, hence $N^{\wedge}(t^{q})/t$ is non-increasing, it follows that:

The assumption (2.5) implies by Proposition 3.10 that:

We start with the case $q<2$. Using Proposition 4.1 (with $q_{0}=q,q=2$) to pass from $Cap_{q}$ to $Cap_{2}$, together with Lemma 4.2 (with $\alpha=q$) and Remark 4.3, we obtain that:

for some universal constant $c>0$, where $N_{2}$ is a function so that:

Since $N(t)^{1/q}/t$ is non-decreasing and the integrability conditions (4.7), (4.8) are fulfilled, we can apply Lemma 4.4 with $N_{1}=N$, $p_{1}=p,p_{2}=2,p_{3}=2$, and conclude that $N_{2}$ is a Young function and that $N_{2}(t)^{1/2}/t$ is non-decreasing. Proposition 3.11 then implies that:

We can now apply Theorem 2.5, and conclude that:

with $c^{\prime}>0$ a universal constant. The value of $C_{N,q}$ in (4.5) ensures that this implies:

as required. This concludes the proof when $q<2$.

When $q\geq 2$, we use a similar argument. Let $N_{q}$ denote the function so that:

Again, by Lemma 4.4 with $N_{1}=N$, $p_{1}=p_{2}=p_{3}=q$, we know that $N_{q}$ is a Young function and that $N_{q}(t)^{1/q}/t$ is non-decreasing. Recalling Remark 4.6, the assumption (4.9) implies that:

for some universal $c>0$. Proposition 3.11 then implies that:

We can now apply Theorem 2.5, and using the definition of $C_{N,q}$ in (4.5), conclude that:

Let $1\leq q<\infty$, $N\in\mathcal{N}$, and assume that:

with $r,p$ as in (\ref{eq:r-p}). Then under our convexity assumptions, the assumption (2.5) implies the conclusion (2.6) with:

whenever $s\geq t$. Applying Theorem 4.5 and using (4.12), it is straightforward to obtain a lower bound on the expression in (4.5), which yields the bound in (4.11). ∎

It was shown by Bobkov and Zegarlinski [18, Proposition 3.1] (generalizing the case $q=2$ due to Bobkov and Götze [15, Proposition 4.1]) that the following $q$-log-Sobolev inequality (with $1\leq q\leq 2$):

where $\varphi_{q}(t)=t^{q}\log(1+t^{q})$, and $D_{1}\simeq D_{2}$ uniformly on $q\in[1,2]$. Using Lemma 2.1, we can replace $E_{\mu}f$ in (4.14) by $M_{\mu}f$, at the expense of an additional universal constant. Using Corollary 4.7 with $N=\varphi_{q}$ and $\alpha=\frac{1}{2q}>1/q-1/2$ in the range $q\in(1,2]$, we can easily show that the $q$-log-Sobolev inequality (4.13) implies a corresponding isoperimetric inequality. However, to handle the entire range $q\in[1,2]$ uniformly, we will need to turn to Theorem 4.5.
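For orientation, the monotonicity property of $\varphi_{q}$ that is needed in order to invoke Theorem 4.5 with $N=\varphi_{q}$ (namely that $N(t)^{1/q}/t$ be non-decreasing, as used in the proof of that theorem) can be verified in one line. This is only a direct computation from the definition of $\varphi_{q}$ given above.

```latex
\begin{align*}
\frac{\varphi_{q}(t)^{1/q}}{t}
  \;=\; \frac{\left(t^{q}\log(1+t^{q})\right)^{1/q}}{t}
  \;=\; \log(1+t^{q})^{1/q} ,
\end{align*}
% which is increasing in t > 0 for every q \geq 1, since t \mapsto \log(1+t^{q})
% is increasing and u \mapsto u^{1/q} is increasing on [0,\infty).
```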

Under our convexity assumptions, the $q$-log-Sobolev inequality (4.13) for $1\leq q\leq 2$ implies the following isoperimetric inequality:

$\varphi_{q}(t)^{1/q}/t$ is non-decreasing, so using Lemma 2.1 and Corollary 3.16, (4.13) implies that:

Note that $N_{q}(t)^{1/q}/t$ is still non-decreasing. Clearly:

uniformly on $q\in[1,2]$. Hence, using Theorem 4.5 to deduce the isoperimetric inequality (4.15), it remains to bound the expression in (4.5) from below uniformly in $q\in[1,2]$. This amounts to showing that:

where $p=q^{*}$ and $C_{1}>0$ is a universal constant. First, we bound the tail of this integral using (4.16):

which is bounded by a universal constant for $t\in[0,1/2]$. Next, we use (4.16) and the change of variables $v=\log(1+1/s)$ to bound:

We see that this is also bounded in the range $t\in[0,1/2]$, and this concludes the proof.

The case $q=2$ was previously shown by Bakry–Ledoux and Ledoux. For general $1\leq q\leq 2$, the reverse direction without any convexity assumptions was shown by Bobkov and Zegarlinski, and given a different proof by Sodin and the author. We will see a general argument for this in the next theorem.

3 Isoperimetry implies Orlicz-Sobolev

Let $1\leq q<\infty$, and set $p=q^{*}$. Let $N\in\mathcal{N}$, so that $N(t)^{1/q}/t$ is non-decreasing. Then:

We rewrite (4.17) using Corollary 3.5 as:

Using Proposition 4.1 (with $q_{0}=1,q=q$) to pass from $Cap_{1}$ to $Cap_{q}$, we obtain that:

where $G_{p}$ is defined on $[0,1/2]$ as:

Incidentally, if we replace $1/2$ in the upper range of the above integral by $\infty$, by Lemma 4.4 with $N_{1}=N$, $p_{1}=p,p_{2}=p,p_{3}=q$, we would have that $G_{p}(t^{q})/t$ is non-increasing, but this will not be used. The estimate in (4.19) ensures that:

Using that $N(t)^{1/q}/t$ is non-decreasing, Proposition 3.11 then implies (4.18), as asserted. ∎

Note that the assumption (4.17) implies (4.20) without assuming that $N(t)^{1/q}/t$ is non-decreasing.

Let $1\leq q<\infty$ and set $p=q^{*}$. Assume that:

with some $\alpha>0$. Then the assumption (4.17) implies the conclusion (4.18) with:

Exactly as in the proof of Corollary 4.7. ∎

Using this for $N=\varphi_{q}$ and $\alpha=\frac{1}{2q}$, we see that, as already noted in Remark 4.9, the isoperimetric inequality (4.15) implies without any further assumptions the $q$-log-Sobolev inequalities (4.14) and (4.13).

4 Summary

To conclude this section, we provide a slightly stronger version of Theorem 1.1 from the Introduction, on the equivalence of isoperimetric and Orlicz-Sobolev functional inequalities under our convexity assumptions. Our results in this section are more general, but this theorem summarizes the most useful cases given by Theorem 2.5 and Corollaries 4.7 and 4.12, and generalizes the results from (which dealt with the case $N(t)=t^{p}$ below).

Let $1\leq q\leq\infty$, and let $N\in\mathcal{N}$. Assume that:

Then the following statements are equivalent:

where the best constants $D_{1},D_{2}$ above satisfy:

with $c_{1},c_{2}>0$ universal constants and:

In fact, among the assumptions (4.22), (4.23), (4.24), (4.25):

For the direction $(2)\Rightarrow(1)$, only (4.24) is needed.

For the direction $(1)\Rightarrow(2)$ with $q\geq 2$, only (4.22) and one of (4.23) or (4.24) are needed, and if (4.23) is used then $C_{\alpha,q}$ can be chosen to be $1$.

For the direction $(1)\Rightarrow(2)$ with $q<2$, (4.23) is not needed.

The direction $(1)\Rightarrow(2)$ was proved in Theorem 2.5 and Corollary 4.7. The direction $(2)\Rightarrow(1)$ was proved in Corollary 4.12. ∎

Tensorization

As mentioned in the Introduction, the results of Section 4 coupled with the results of Section 3 on the equivalence of capacity inequalities and Orlicz-Sobolev inequalities, allow us to directly infer isoperimetric inequalities from capacity inequalities (under convexity assumptions of course).

Let $1\leq q<\infty$, and let $N\in\mathcal{N}$. If:

then the following statements are equivalent:

where the best constants $D_{1},D_{2}$ above satisfy:

with $c_{1},c_{2}>0$ universal constants and $B_{\alpha,q},C_{\alpha,q}$ as in (4.26). In fact, among the assumptions (5.1), (5.2), (5.3), (5.4):

For the direction $(2)\Rightarrow(1)$, only (5.3) is needed.

For the direction $(1)\Rightarrow(2)$ with $q\geq 2$, only (5.1) and one of (5.2) or (5.3) are needed, and if (5.2) is used then $C_{\alpha,q}$ in (4.26) can be chosen to be $1$.

For the direction $(1)\Rightarrow(2)$ with $q<2$, (5.2) is not needed.

The direction $(1)\Rightarrow(2)$ follows from Proposition 3.11 coupled with Theorem 2.5 and Corollary 4.7. The direction $(2)\Rightarrow(1)$ follows from Theorem 4.10, Remark 4.11 and Corollary 4.12 (note that we indeed do not need the assumption that $N(t)^{1/q}/t$ is non-decreasing). ∎

It has been established in recent years that several other types of functional inequalities are equivalent to $2$-capacity inequalities. These include Beckner-type inequalities (including the Latała–Oleszkiewicz inequality as in ) and additive $\Phi$-Sobolev inequalities . The advantage of these inequalities compared to the Orlicz-Sobolev inequalities lies in the fact that they admit tensorization. This easily allows us to deduce an extension of the dimension-free tensorization results of Barthe–Cattiaux–Roberto . We demonstrate this with Beckner-type inequalities (5.5), using Theorem 9 and Lemma 8 in as cited in (with a trivial change of notation):

with the best constants $D_{1},D_{2}$ above satisfying $D_{1}/\sqrt{6}\leq D_{2}\leq\sqrt{20}D_{1}$.

It is known (e.g. ) that Beckner-type inequalities (5.5) admit tensorization, in the sense that if they hold for $(M,g,\mu)$ then they also hold for the Riemannian product $(M^{\times k},g^{\otimes k},\mu^{\otimes k})$ for any $k\geq 1$. To obtain the most general result, we will also need the following remarkable observation of Franck Barthe [4, Theorem 10] (which in fact holds for very general metric probability spaces, but for simplicity we quote it in less general form; see also Ros ):

Then (without any additional convexity assumptions) there exists a constant $c_{D}>0$ depending only on $D$, such that for any $k\geq 1$:

Our formulation of Theorem 5.4 using the condition (5.7), without referring to an auxiliary profile $I_{\nu}$ where $\nu$ is some 1-dimensional density, seems more natural than previous requirements, and this will also be evident in the proof. As mentioned in the Introduction, it seems possible to produce a proof of this theorem using the approach of , but the main obstacle would be to pass from the isoperimetric inequality $I(t)\geq J(t)$ to the appropriate $2$-capacity inequality, using only $J$ and without passing via the auxiliary density $\nu$ (compare with Theorem 7 and Propositions 9,13 in ), which would otherwise result in requiring some additional technical assumptions and in the constant $c_{D}$ to depend on $J$. On the other hand, without any further technical assumptions, Theorem 5.4 basically follows from the argument used to derive Theorem 5.1 coupled with Theorems 5.2 and 5.3. To make this precise we will need to be slightly more careful.

Denote $g(t)=\min_{s\in[t,1/2]}J(s)/I_{0}(s)$ for $t\in[0,1/2]$. Clearly $g(1/2)=J(1/2)/I_{0}(1/2)$, $g$ is non-decreasing and $g\leq J/I_{0}\leq Dg$ on $[0,1/2]$. Now denote:

Next, let $N\in\mathcal{N}$ denote the function so that:

Indeed, Fact 3 implies that $N^{\wedge}(0)=0$, and together with the linear growth of $J_{1}$ at infinity, this means that $N^{\wedge}\in\mathcal{N}$ and hence $N\in\mathcal{N}$. We now apply Lemma 4.4 with $N_{1}=J_{1}^{\wedge}$ and $p_{1}=\infty,p_{2}=p_{3}=2$. The appeal to Lemma 4.4 is legitimate since $N^{\wedge}\in\mathcal{N}$ and since $J_{1}^{\wedge}(t)/t$ is non-decreasing by Fact 2. We deduce that:

In addition, Facts 2 and 3 provide the following estimates:

and an elementary computation provided in Lemma 5.5 below implies that:

with $\simeq_{D}$ meaning that the bounds depend on $D$.

Proposition 4.1 (with $q_{0}=1,q=q$) implies that for all $t\in[0,1/2]$:

is essentially non-decreasing (with the same constant) on $[0,1/(e-1)]$. The latter follows from (5.11) and Fact 3 on $J_{0}$.

This concludes the proof, up to the proof of Lemma 5.5 below. ∎

for some constant $C>0$, which will conclude the proof.

The lower bound is immediate from the lower bound in (5.10). For the upper bound, we decompose the integral into two parts:

with the first one interpreted as $0$ if $t>1/2$. By Fact 1 and the definition of $J_{1}$, the second integral can be estimated for $t\in$ by:

To estimate the first integral for $t\in[0,1/2]$, we use the upper bound in (5.10) and the change of variables $v=\log(1+1/s)$:

In fact, by inspecting the bound given by Theorem 9 in more carefully, one can repeat our argument for an arbitrary isoperimetric profile $J$ (perhaps violating the Central-Limit obstruction), and study what happens to the profile under tensorization. We leave this for another note.

Approximation Argument

Recall that $\mu_{m}$ is said to converge to $\mu$ in total-variation if:

In this section, we provide a careful approximation argument for deducing that our results from Section 2 hold under arbitrary convexity assumptions, without requiring any further smoothness conditions (as defined in the Introduction or more generally in Section 2 and Remark 2.7). We recall that at this point, the proof of Theorem 2.5 is only valid under the additional smoothness conditions. We emphasize that this is not just a technical matter, and that our convexity assumptions will need to be invoked once again.

To explain this better, let us describe a naive approximation approach which completely fails. Suppose that $(\Omega,d,\mu)$ satisfies an $(N,q)$ Orlicz-Sobolev inequality as in the assumption of Theorem 2.5, and we would like to deduce from this the conclusion of this theorem, assuming that $(\Omega,d,\mu)$ satisfies our convexity assumptions. By definition, we know that there exists a sequence $\left\{\mu_{m}\right\}$ which approximates $\mu$ in total-variation, such that $(\Omega,d,\mu_{m})$ satisfy our smooth convexity assumptions, and so the proof of Theorem 2.5 applies to these spaces. One may hope that since $\mu_{m}$ approximate $\mu$, the spaces $(\Omega,d,\mu_{m})$ will also satisfy the $(N,q)$ Orlicz-Sobolev inequality (perhaps with a worse constant), allowing us to apply Theorem 2.5. Unfortunately, this is completely false in general. For instance, consider the measures $\mu_{m}$ which are uniform on the set $[0,1]\setminus[1/2-1/m,1/2+1/m]$, and converge to $\mu$, the uniform measure on $[0,1]$. Clearly, the spaces $(\Omega,d,\mu_{m})$ ($m\geq 3$) do not satisfy any $(N,q)$ Orlicz-Sobolev inequality, whereas in the limit the space $(\Omega,d,\mu)$ will satisfy any reasonable inequality (Poincaré, log-Sobolev, etc.). We conclude that a different approach is needed.
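The failure in this example can be made explicit with a single test function. The sketch below reads the example as taking place on the interval $[0,1]$ with the Euclidean distance, and assumes that an $(N,q)$ Orlicz-Sobolev inequality has the form $D\left\|f-M_{\mu}f\right\|_{N(\mu)}\leq\left\||\nabla f|\right\|_{L_{q}(\mu)}$ for Lipschitz $f$; the function $f_{m}$ is introduced here only for illustration.

```latex
% Assumed test function: a Lipschitz ramp supported on the removed gap.
\begin{align*}
f_{m}(x) \;=\; \min\!\left(1,\ \max\!\left(0,\ \tfrac{m}{2}\left(x-\tfrac12+\tfrac1m\right)\right)\right),
\qquad m\geq 3 .
\end{align*}
% f_m = 0 on [0, 1/2 - 1/m], f_m = 1 on [1/2 + 1/m, 1], and |\nabla f_m| = m/2
% only inside the gap (1/2 - 1/m, 1/2 + 1/m), which has \mu_m-measure zero. Hence
\begin{align*}
\left\| |\nabla f_{m}| \right\|_{L_{q}(\mu_{m})} \;=\; 0,
\qquad
\mu_{m}\{f_{m}=0\} \;=\; \mu_{m}\{f_{m}=1\} \;=\; \tfrac12 .
\end{align*}
% For any choice of median M, f_m - M is non-zero on a set of \mu_m-measure
% at least 1/2, so \| f_m - M \|_{N(\mu_m)} > 0 and no constant D > 0 can work.
% Meanwhile the removed gap has \mu-measure 2/m, which tends to 0 as m grows.
```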

Our strategy in this section will be to show that the semi-group estimates of Section 2 can be transferred to a setting without any smoothness assumptions. Our original argument, which at first relied on a method of weak-convergence due to Williams and Zheng (see also Burdzy and Chen ), has been replaced by an elementary argument which we provide below. We continue with the notations used in Section 2, and recall the following definition:

A domain $\Omega\subset(M,g)$ is said to be locally convex, if all geodesics in $M$ tangent to $\partial\Omega$ are locally outside of $\Omega$. By a result of Bishop , in case that $\Omega$ has $C^{2}$ boundary, this is equivalent to requiring that the second fundamental form of $\partial\Omega$ with respect to the normal pointing into $\Omega$ be positive semi-definite on all of $\partial\Omega$.

One may always choose a version of $\psi$ which is locally Lipschitz on $\Omega$.

Let $A\subset M$ denote the Borel subset of points $x\in M$ for which the sequence $\psi_{m}(x)$ converges to $\psi(x)$ (in the wide sense). We know that $vol_{M}(\Omega\setminus A)=0$. We will show that for each $x_{0}\in\Omega$, there exists a neighborhood $N_{x_{0}}\subset\Omega$ and a constant $C_{x_{0}}>0$, so that:

Consequently, it will follow that one may extend $\psi$ by continuity from $A\cap\Omega$ to the entire $\Omega$, defining $\psi(z_{0})$ for $z_{0}\in\Omega\setminus A$ as $\psi(z_{0})=\lim_{z\rightarrow z_{0},z\in A}\psi(z)$. The estimate (6.1) will imply that this limit is well defined and that the resulting $\psi$ satisfies the same locally Lipschitz condition.

It is known that for any $x_{0}\in M$, the geodesic open ball $B(x_{0},r)$ for small enough $r>0$ is convex embedded in $M$, in the sense that it is both geodesically convex and that the exponential map $\exp_{x_{0}}:B_{T_{x_{0}}M}(0,r)\rightarrow B(x_{0},r)$ is a diffeomorphism between $B_{T_{x_{0}}M}(0,r)\subset T_{x_{0}}M$ and $B(x_{0},r)\subset M$. We will therefore choose our neighborhood $N_{x_{0}}$ to be a convex embedded ball $B(x_{0},r)$, so that in addition $\overline{B(x_{0},r)}$ is contained in $\Omega$. Since $\psi_{m}$ converge to $\psi$ almost everywhere, it is clear that if $\overline{B(x_{0},r)}\subset\Omega$ then $B(x_{0},r)$ must be contained in $\Omega_{m}$ for all $m\geq m_{x_{0}}$ (we could add “apart from a subset of zero measure” for safety, but this is in fact not necessary due to the convexity of the domains).

Choosing $r>0$ small enough, it is known (e.g. [28, p. 643], [3, p. 311]) that $f_{x_{0}}:=d(x_{0},\cdot)^{2}$ is a $C^{\infty}$ function on $B(x_{0},r)$ whose Riemannian Hessian satisfies $Hess_{g}f_{x_{0}}\geq A_{x_{0}}g$ on $B(x_{0},r)$ for some $A_{x_{0}}>0$. Denoting:

we define $h_{x_{0}}=\frac{\max(R_{x_{0},r},0)}{A_{x_{0}}}f_{x_{0}}$.

Since for all $m\geq m_{x_{0}}$, $Ric_{g}+Hess_{g}\psi_{m}\geq 0$ on $B(x_{0},r)$, it follows from the above construction that $Hess_{g}(\psi_{m}+h_{x_{0}})\geq 0$ on $B(x_{0},r)$. Since $\psi_{m}+h_{x_{0}}\in C^{2}(B(x_{0},r))$, it is known ([3, p. 310]) that this is equivalent to being geodesically convex in $B(x_{0},r)$. We now employ [47, Theorem 10.8], whose proof easily passes to the Riemannian setting (taking into account a slight modification provided in [3, Lemma 2.1] of a Euclidean argument). This theorem asserts that if a sequence of geodesically convex functions $\left\{f_{i}\right\}$ pointwise converges on a dense subset of a geodesically convex open set $N$ (to a finite value in each point of the subset), then the pointwise limit in fact exists for each $x\in N$, and the function $f(x):=\lim_{i\rightarrow\infty}f_{i}(x)$ is finite and geodesically convex on $N$. Since $\psi_{m}+h_{x_{0}}$ converges to the finite function $\psi+h_{x_{0}}$ on $B(x_{0},r)\cap A$, it follows that in fact $\psi_{m}+h_{x_{0}}$ converges to a geodesically convex function on the entire $B(x_{0},r)$. Writing this function as $\psi_{0}+h_{x_{0}}$, we realize that $\psi_{0}$ coincides with $\psi$ on $B(x_{0},r)\cap A$. The argument is therefore complete. ∎

In fact, we have shown that $\psi_{m}$ converge pointwise on all of $\Omega$, and that the limit $\psi=\lim_{m\rightarrow\infty}\psi_{m}$ is a semi-convex function (in the sense that in a small enough geodesically convex neighborhood, we may add to it a smooth function to obtain a geodesically convex function in that neighborhood).

We will henceforth choose $\psi$ as constructed in Lemma 6.1. This implies by Rademacher’s theorem that:

The differential $\nabla\psi$ exists almost everywhere on $\Omega$.

$\left|\nabla\psi\right|$ is bounded almost everywhere on compact subsets of $\Omega$.

Let $X^{(m)}:\Lambda_{m}\times[0,T]\rightarrow M$ denote the diffusion process on $\overline{\Omega}_{m}$ with reflection on the boundary generated by $\Delta_{(\Omega_{m},\mu_{m})}$, defined on the probability space $(\Lambda_{m},\mathcal{F}_{m},\mathcal{P}_{m})$ and some fixed $T>0$. Let $P^{(m)}_{t}$ for $t\in[0,T]$ denote the semi-group associated to $X^{(m)}$. We will assume that the initial distribution of $X^{(m)}(0)$ is given by the stationary measure $\mu_{m}$, so that $X^{(m)}$ is a stationary process.

Using a forward-backward martingale decomposition due to Lyons and Zheng and a tightness criterion for stochastic processes with continuous paths, it can be shown as in [56, p. 472] (see also [21, p. 31], [27, pp. 248-257]) with a minor adaptation to the Riemannian setting, that there exists a subsequence $X^{(m_{k})}$ which converges weakly (as measures on $[0,T]\times M$ endowed with the locally uniform topology) to some process $X$. By passing to a subsequence, let us assume that $X^{(m)}$ converges weakly to $X$. In particular, for any fixed $t\in[0,T]$, the law of $X^{(m)}(t)$ weakly converges to that of $X(t)$, and hence $X$ is also stationary with stationary measure $\mu$. Since $X(0)$ is by definition distributed according to $\mu$, there is a one-to-one correspondence between the spaces $L_{2}(\mu):=L_{2}(\Omega,\mathcal{B}(\Omega),\mu)$ and $L_{2}(\Lambda,\sigma(X(0)),\mathcal{P})$, where $\sigma(X(0))$ is the $\sigma$-field generated by $X(0)$ and $X$ is defined on the space $\Lambda$ with probability measure $\mathcal{P}$. Consequently, we can define for any $t\in[0,T]$ the following (bounded) linear operator $P_{t}$ on $L_{2}(\mu)$:

In fact, as in , it should be possible to show that $X$ is a continuous Markov process, that $P_{t}$ is a strongly continuous semi-group associated to it, and that the associated Dirichlet form is exactly given by:

However, we will not require all this information. We will only use the weak convergence (through a subsequence) of the processes $\left\{X^{(m)}(0),X^{(m)}(t)\right\}$, defined on the 2-point set $\left\{0,t\right\}$, to $\left\{X(0),X(t)\right\}$. We provide an elementary argument to deduce the tightness of this sequence (from which the former statement follows by Prokhorov’s Theorem). Fixing a point $x_{0}\in M$, since the process $X^{(m)}$ is stationary:

which is easily seen to hold, since this holds for the (probability) measure $\mu$, and $\mu_{m}$ converge to $\mu$ in total-variation.
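One way to organize this tightness argument is via a union bound. The display below is a sketch under the assumption that $B(x_{0},R)$ denotes the geodesic ball of radius $R$ about $x_{0}$ and that total-variation convergence controls $\sup_{A}\left|\mu_{m}(A)-\mu(A)\right|$ over Borel sets $A$; it is not claimed to reproduce the exact inequality used in the text.

```latex
% Union bound, followed by stationarity: X^{(m)}(0) and X^{(m)}(t) both have law \mu_m.
\begin{align*}
\mathcal{P}_{m}\!\left( X^{(m)}(0)\notin B(x_{0},R)\ \text{or}\ X^{(m)}(t)\notin B(x_{0},R) \right)
 &\leq \mathcal{P}_{m}\!\left( X^{(m)}(0)\notin B(x_{0},R) \right)
     + \mathcal{P}_{m}\!\left( X^{(m)}(t)\notin B(x_{0},R) \right) \\
 &= 2\,\mu_{m}\!\left( M\setminus B(x_{0},R) \right) \\
 &\leq 2\,\mu\!\left( M\setminus B(x_{0},R) \right)
     + 2\sup_{A}\left|\mu_{m}(A)-\mu(A)\right| .
\end{align*}
% The first term tends to 0 as R \to \infty because \mu is a probability measure;
% the second is small for all large m, and the finitely many remaining m can be
% handled one at a time by enlarging R.
```

Since $(M,g)$ is complete, closed geodesic balls are compact, so a bound of this type that is uniform in $m$ gives tightness of the pairs on $M\times M$.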

We conclude that by passing to a subsequence, we may assume that $\left\{X^{(m)}(0),X^{(m)}(t)\right\}$ converges weakly to $\left\{X(0),X(t)\right\}$. In other words, for any $f,g$ continuous and bounded on $M$:

where $P_{t}$ is the linear operator defined in (6.2). Clearly, this also extends to hold for all $f\in L_{\infty}(\mu)$ and $g\in L_{1}(\mu)$.

Using (6.3), we can now transfer the known estimates (2.3) and Corollary 2.4 on the semi-groups $P^{(m)}_{t}$ to the operators $P_{t}$. Indeed, given a bounded and smooth function $f$ on $\Omega$, we need to show that:

using the same estimates (6.4) and (6.5) for $P^{(m)}_{t}$ and $\mu_{m}$ replacing $P_{t}$ and $\mu$, respectively (the latter are known to be true as described in Section 2, after approximating $f$ with functions in $\mathcal{B}(\Omega_{m})$). Note that we interpret the second estimate (6.5) regarding $\nabla P_{t}f$ in the sense of distributions as described below, since this is all that is needed for the applications of Section 2. Writing:

the estimate (6.4) immediately follows from the same estimate for $P^{(m)}_{t}$ and $\mu_{m}$ and the weak convergence (6.3). The estimate (6.5) is harder to handle, since it involves the distributional gradient of $P_{t}(f)$. Setting $p=q^{*}$, we interpret:

and $\int\left\langle\nabla P_{t}f,g\right\rangle d\mu$ is interpreted in the distributional sense (using integration by parts). Let $g$ denote such a vector field as above. Since $\mu_{m}$ converges to $\mu$ in total-variation, it remains to show the first equality in:

Since $\||\nabla P^{(m)}_{t}f|\|_{L_{\infty}}\leq\frac{1}{\sqrt{2t}}\left\|f\right\|_{L_{\infty}}$ for all $m$, the convergence of $\mu_{m}$ to $\mu$ in total-variation implies that the second term converges to $0$ as $m\rightarrow\infty$. To handle the first term, we integrate by parts:

It follows from Lemma 6.1 and Remark 6.3 that $\left\langle g,\nabla\psi\right\rangle$ is a bounded function. This implies that the first term converges to $0$ by (6.3), and since $\|P^{(m)}_{t}f\|_{L_{\infty}}\leq\left\|f\right\|_{L_{\infty}}$, the convergence of $\mu_{m}$ to $\mu$ in total-variation implies that the second term converges to $0$ as well.

We conclude that the estimates (6.4) and (6.5) hold for the linear operators $P_{t}$ defined using the limiting process $X$, and we may use these operators in place of the diffusion semi-group in the relevant parts of the proof of Theorem 2.5 (and consequently Theorems 4.5, 4.13 and 5.1), so that the conclusion of these theorems remains valid for $\mu$ as above.

References