On the role of Convexity in Functional and Isoperimetric Inequalities

Emanuel Milman

Introduction

Let $(\Omega,d,\mu)$ denote a metric probability space. More precisely, we assume that $(\Omega,d)$ is a separable metric space and that $\mu$ is a Borel probability measure on $(\Omega,d)$ which is not a unit mass at a point. Although it is not essential for the ensuing discussion, it will be more convenient to specialize to the case where $\Omega$ is a complete smooth oriented $n$ -dimensional Riemannian manifold $(M,g)$ , $d$ is the induced geodesic distance, and $\mu$ is an absolutely continuous measure with respect to the Riemannian volume form $vol_{M}$ on $M$ . This work continues the study of the interplay between the metric $d$ and the measure $\mu$ initiated in . There are various ways to measure this relationship, which may typically be arranged according to strength, forming a hierarchy. In this work, we will be primarily concerned with two such ways.

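The first way is by means of an isoperimetric inequality. Recall that Minkowski’s (exterior) boundary measure of a Borel set $A\subset\Omega$ , which we denote here by $\mu^{+}(A)$ , is defined as $\mu^{+}(A):=\liminf_{\varepsilon\to 0}\frac{\mu(A_{\varepsilon})-\mu(A)}{\varepsilon}$ , where $A_{\varepsilon}:=\left\{x\in\Omega;\exists y\in A,\ d(x,y)<\varepsilon\right\}$ denotes the $\varepsilon$ -neighborhood of $A$ with respect to the metric $d$ .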

We will say that the space $(\Omega,d,\mu)$ satisfies an $(N,q)$ Orlicz-Sobolev inequality ( $N\in\mathcal{N},q\geq 1$ ) if:

A similar (yet different) definition was given by Roberto and Zegarlinski in the case $q=2$ , following the work of Maz’ya [40, p. 112]. Our preference for the median $M_{\mu}$ in our definition (in place of the more standard expectation $E_{\mu}$ ) is immaterial whenever $N$ is a convex function (see Lemma 2.1).

When $N(t)=t^{p}$ , in which case $N(\mu)$ is just the usual $L_{p}(\mu)$ norm, we will refer to the inequality (1.2) as a $(p,q)$ Poincaré inequality. If in addition $M_{\mu}$ in (1.2) is replaced by $E_{\mu}$ , the case $p=q=2$ is then just the classical Poincaré inequality, and we denote the best constant in this inequality by $D_{Poin}$ . Similarly, the case $q=1,p=\frac{n}{n-1}$ corresponds to the Gagliardo–Nirenberg–Sobolev inequality, and a limiting case when $n$ tends to infinity is the so-called log-Sobolev inequality. More generally, we say that our space satisfies a $q$ -log-Sobolev inequality ( $q\in$ ), if there exists a constant $D>0$ so that:

The best possible constant $D$ above is denoted by $D_{LS_{q}}=D_{LS_{q}}(\Omega,d,\mu)$ . Although these inequalities do not precisely fit into our announced framework, it follows from the work of Bobkov and Zegarlinski that they are in fact equivalent to some corresponding Orlicz-Sobolev inequalities (see Section 4). Various other functional inequalities admit an equivalent (up to universal constants) formulation using an appropriate Orlicz norm $N(\mu)$ on the left hand side of (1.2). We refer the reader to the recent paper of Barthe and Kolesnikov and the references therein for an account of several other types of functional inequalities.

It is well known that various isoperimetric inequalities imply their functional “counterparts”. It was shown by Maz’ya and independently by Cheeger , to whom this is usually attributed, that Cheeger’s isoperimetric inequality implies Poincaré’s inequality: $D_{Poin}\geq D_{Che}/2$ (Cheeger’s inequality). It was first observed by M. Ledoux that a Gaussian isoperimetric inequality implies a $2$ -log-Sobolev inequality: $D_{LS_{2}}\geq cD_{Gau}$ , for some universal constant $c>0$ . This was later refined by Beckner (see ) using an equivalent functional form of the Gaussian isoperimetric inequality due to S. Bobkov (see also ): $D_{LS_{2}}\geq D_{Gau}/\sqrt{2}$ . The constants $2$ and $\sqrt{2}$ above are known to be optimal.

2 Reversing the Hierarchy

In both cases, we will say that “our convexity assumptions are fulfilled”. More generally, we recall the following definition from :

We will say that our smooth convexity assumptions are fulfilled if:

$d$ denotes the induced geodesic distance on $(M,g)$ .

$d\mu=\exp(-\psi)dvol_{M}$ , $\psi\in C^{2}(M)$ , and as tensor fields on $M$ :

We will say that our convexity assumptions are fulfilled if $\mu$ can be approximated in total-variation by measures $\left\{\mu_{m}\right\}$ so that $(\Omega,d,\mu_{m})$ satisfy our smooth convexity assumptions.

The condition (1.4) is the well-known Curvature-Dimension condition $CD(0,\infty)$ , introduced by Bakry and Émery in their celebrated paper (in the more abstract framework of diffusion generators). Here $Ric_{g}$ denotes the Ricci curvature tensor and $Hess_{g}$ denotes the second covariant derivative.

It is known that under our convexity assumptions, the implications stated in the previous subsection can be reversed: $D_{Che}\geq c_{1}D_{Poin}$ and $D_{Gau}\geq c_{2}D_{LS_{2}}$ , for some universal constants $c_{1},c_{2}>0$ . That Cheeger’s inequality can be reversed was first shown by Buser when $\mu$ is uniform on a closed manifold with $Ric_{g}\geq 0$ , and was recently strengthened and generalized by Ledoux to the Bakry–Émery abstract setting, assuming our smooth convexity assumptions. That a $2$ -log-Sobolev inequality implies a Gaussian isoperimetric inequality under these assumptions was first shown by Bakry and Ledoux [2, Section 4] (see also Ledoux ).

3 The Results

In this work, we generalize all of the above mentioned implications following Ledoux’s diffusion semi-group approach to a more general framework. Such a program was initiated in our previous work , where it was first shown how to use the $CD(0,\infty)$ condition via Ledoux’s semi-group gradient estimates to deduce isoperimetric inequalities from $(p,q)$ Poincaré inequalities. Contrary to previous approaches, which could only deduce isoperimetric information from functional inequalities with a $\left\|\left|\nabla f\right|\right\|_{L_{q}(\mu)}$ term with $q=2$ (see [8, p. 3] and the references therein), it was shown in how to handle arbitrary $q\geq 2$ . In the case of $(p,q)$ Poincaré inequalities, an easy reduction step in fact enables one to handle arbitrary $q\geq 1$ . In this work, we show how to deduce isoperimetric inequalities from very general Orlicz-Sobolev inequalities in the entire range $q\geq 1$ .

The easier case of $q\geq 2$ is handled in Section 2, by generalizing our argument for $(p,q)$ Poincaré inequalities from . Extending our results to the case $q\geq 1$ (which is very important for applications) requires additional work, to which end we employ the notion of capacity. Capacity inequalities are certain functional formulations of isoperimetric inequalities, which were introduced around 1960 by Maz’ya , Federer and Fleming , and used by Bobkov and Houdré in . Maz’ya’s notion of $q$ -capacity for $q=2$ has recently been extended to the metric probability space setting by Barthe, Cattiaux and Roberto in (after being introduced in ), where it was used to deduce isoperimetric inequalities, and has subsequently appeared in other works as well (e.g. ). We recall the appropriate definitions in Section 3, and show that $q$ -capacity inequalities are equivalent in full generality to an appropriate weak-type variant of these Orlicz-Sobolev inequalities (in the same sense that $L_{p,\infty}$ is the weak-type $L_{p}$ quasi-norm). We also give a very general condition for capacity inequalities to be equivalent to the usual (non-weak) Orlicz-Sobolev inequalities, which we require for the sequel. This extends a more restrictive (and partly implicit) condition obtained for $q=2$ in , following a similar condition for general $q$ in .

In Section 4 we use capacities to extend our results to the whole range $q\geq 1$ . We also demonstrate that our estimates are sharp, by showing that the isoperimetric inequalities we obtain are in fact equivalent (up to universal constants) to the functional inequalities used to derive them. To give a taste of the type of results we obtain, we state the following theorem (see Theorem 4.13 for more details and a slightly stronger version):

Let $1\leq q\leq\infty$ and let $N$ denote a Young function, so that:

Then under our convexity assumptions, the following statements are equivalent:

where the best constants $D_{1},D_{2}$ above satisfy:

with $c_{1},c_{2}>0$ universal constants and $B_{\alpha,q},C_{\alpha,q}$ depending explicitly on $\alpha,q$ . In fact, the convexity assumptions are not needed for the direction $(2)\Rightarrow(1)$ , and the assumptions (1.5) are not needed if $q\geq 2$ for the direction $(1)\Rightarrow(2)$ .

When $N(t)=t^{2},q=2$ , the direction $(2)\Rightarrow(1)$ reduces (up to constants) to Cheeger’s inequality, and the direction $(1)\Rightarrow(2)$ to its reversed form due to Buser–Ledoux. In addition, using $N(t)=t^{q}\log(1+t^{q})$ and a result of Bobkov and Zegarlinski (generalizing a previous result of Bobkov and Götze ), a variant of Theorem 1.1 implies (see Corollary 4.8) the following:

Under our convexity assumptions, the $q$ -log-Sobolev inequality (1.3) (for $q\in$ ) is equivalent to the isoperimetric inequality:

with the best constants $D_{LS_{q}},D_{I_{q}}$ satisfying $C_{1}D_{LS_{q}}\leq D_{I_{q}}\leq C_{2}D_{LS_{q}}$ for some universal constants $C_{1},C_{2}>0$ , uniformly on $q\in$ .

That the latter implies the former was previously shown by Bobkov and Zegarlinski without any convexity assumptions (we prove a more general result in Section 4). That the former implies the latter for $q=2$ is precisely the statement that $D_{Gau}\geq cD_{LS_{2}}$ under our convexity assumptions, recovering the previously mentioned result of Bakry–Ledoux and Ledoux .

Theorem 1.1, coupled with the equivalence between Orlicz-Sobolev inequalities and capacity inequalities, enables us to directly infer isoperimetric inequalities from their $q$ -capacity counterparts under our convexity assumptions. Previous works of Barthe–Roberto , Barthe–Cattiaux–Roberto and Roberto–Zegarlinski have shown that $2$ -capacity inequalities are often equivalent to certain other types of functional inequalities, such as the Latała–Oleszkiewicz inequality (or more general Beckner-type inequalities) and additive $\Phi$ -Sobolev inequalities. The advantage of these inequalities compared to the Orlicz-Sobolev inequalities lies in the fact that they admit tensorization. To further demonstrate the usefulness of the framework we develop, we easily deduce in Section 5, as a by-product of our methods, the dimension-free tensorization results of . In fact, we prove the following natural extension of these results. By the Central-Limit Theorem, one cannot expect a dimension-free result for isoperimetric profiles which are better than the one for the Gaussian measure (and even in this case some badly behaved examples due to Franck Barthe are known ), so some condition needs to be imposed (we refer to Section 5 for more details):

Then without any additional convexity assumptions, there exists a constant $c_{D}>0$ depending only on $D$ , such that for any $k\geq 1$ :

As already mentioned, our convexity assumptions throughout this work are used via the semi-group argument described in Section 2. More precisely, in that section we assume that our smooth convexity assumptions are fulfilled. To justify the passage to the limit and conclude that our results are valid under arbitrary convexity assumptions, we develop a careful approximation argument in Section 6. We emphasize that this is not just a technical matter: in general it is simply not true that $(N,q)$ Orlicz-Sobolev inequalities on the spaces $(\Omega,d,\mu_{m})$ are stable under taking limits of $\mu_{m}$ in the total-variation norm (see Section 6), so the convexity assumptions will need to be exploited one last time. To the best of our knowledge, with the exception of the tensorization results above, none of the previously known results mentioned above addressed this point, and these results were deduced under the additional smoothness assumptions.

Acknowledgements. I would like to thank Professor Jean Bourgain and the Institute for Advanced Study for providing the perfect research environment. Most especially, I would like to thank Sasha Sodin for his invaluable help: acquainting me with capacities, suggesting that I look at Ledoux’s semi-group argument, providing countless other references, and offering many informative conversations and comments on this manuscript. I am also thankful to Professors Franck Barthe and Michel Ledoux for their remarks on earlier versions of this manuscript.

The Semi-Group Argument

In this section, we prove the direction $(1)\Rightarrow(2)$ of Theorem 1.1 for $q\geq 2$ . Our proof is an adaptation of the semi-group argument used in our earlier work , which in turn closely follows Ledoux’s proof of [34, Theorem 5.2].

Let $N(\mu)$ denote an Orlicz norm associated to the Young function $N$ . Then:

This lemma implies that we can pass back and forth between using the median $M_{\mu}$ and the expectation $E_{\mu}$ when excluding constant functions in our functional inequalities, at the expense of losing a universal constant.

$N^{*}$ is always convex, but unfortunately it may attain the value of $+\infty$ , so it will not be a Young function according to our definition. To avoid this minor issue, it will be more convenient to work with the dual norm to $N(\mu)$ :

We denote by $N(\mu)^{*}$ the dual norm to $N(\mu)$ , given by:

Although this will not be used, we comment that it is a nice exercise (e.g. ) to show that when $N$ is a Young function then:

The second inequality is usually called Young’s inequality.

Let $N$ denote a Young function. Then for any Borel set $A$ with $\mu(A)>0$ :

On one hand, denoting $g_{0}:=N^{-1}(1/\mu(A))\chi_{A}$ , since $\left\|g_{0}\right\|_{N(\mu)}=1$ we have:

On the other hand, by Jensen’s inequality, for any $g$ with $\left\|g\right\|_{N(\mu)}\leq 1$ , we have:

2 Semi-Group Gradient Estimates

where $\Delta_{\Omega}$ is the usual Laplace-Beltrami operator on $\Omega$ . $\Delta_{(\Omega,\mu)}$ acts on $\mathcal{B}(\Omega)$ , the space of bounded smooth real-valued functions on $\Omega$ . Let $(P_{t})_{t\geq 0}$ denote the semi-group associated to the diffusion process with infinitesimal generator $\Delta_{(\Omega,\mu)}$ (cf. ), characterized by the following system of second order differential equations:

For each $t\geq 0$ , $P_{t}:\mathcal{B}(\Omega)\rightarrow\mathcal{B}(\Omega)$ is a bounded linear operator and its action naturally extends to the entire $L_{p}(\mu)$ spaces ( $p\geq 1$ ). We collect several elementary properties of these operators:

$\left|P_{t}(f)\right|^{p}\leq P_{t}(\left|f\right|^{p})$ for all $p\geq 1$ .
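As an illustration only, the Jensen-type property above can be checked on a finite-state stand-in for $P_{t}$ : a row-stochastic matrix acting on functions on three points. The matrix `P`, the function `f` and the helpers `apply_markov` and `jensen_gap` are hypothetical and are not the diffusion semi-group of the text; the sketch only uses the fact that each row is a probability vector.

```python
def apply_markov(P, f):
    """Apply a row-stochastic matrix P to a function f given as a list of values."""
    return [sum(p * x for p, x in zip(row, f)) for row in P]


def jensen_gap(P, f, p):
    """Return the pointwise differences P(|f|^p) - |P f|^p, expected to be >= 0."""
    lhs = [abs(v) ** p for v in apply_markov(P, f)]
    rhs = apply_markov(P, [abs(x) ** p for x in f])
    return [r - l for l, r in zip(lhs, rhs)]


# Hypothetical 3-state example: every row is a probability vector.
P = [[0.5, 0.3, 0.2],
     [0.1, 0.8, 0.1],
     [0.25, 0.25, 0.5]]
f = [-2.0, 0.5, 3.0]

for p in (1.0, 1.5, 2.0, 4.0):
    # Smallest pointwise gap for this exponent; non-negative up to rounding.
    print(p, min(jensen_gap(P, f, p)))
```

The inequality here is just Jensen's inequality for the convex function $t\mapsto\left|t\right|^{p}$ applied row by row, which is the same mechanism behind the property for $P_{t}$ .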

The following crucial dimension-free reverse Poincaré inequality was shown by Bakry and Ledoux in [2, Lemma 4.2], extending Ledoux’s approach for proving Buser’s Theorem (see also [2, Lemma 2.4], [34, Lemma 5.1]):

Assume that the following Bakry-Émery Curvature-Dimension condition holds on $\Omega$ :

Then for any $t\geq 0$ and $f\in\mathcal{B}(\Omega)$ , we have:

Our convexity assumptions are that $K=0$ in Lemma 2.3, and this is what we will henceforth assume. It is clear that our results in this section as well as Section 4 may be extended to the case of $K>0$ , but we do not pursue this direction in this work.

From Lemma 2.3, it is immediate that for any $2\leq q\leq\infty$ :

and using $q=\infty$ , Ledoux easily deduces the following dual statement [34, (5.5)]:

3 Orlicz-Sobolev implies Isoperimetry for $q\geq 2$

Let $2\leq q\leq\infty$ and let $N$ denote a Young function. Then under our convexity assumptions, the statement:

with $C_{N,q}\geq c>0$ , a universal constant.

We will see how to relax the assumption that $q\geq 2$ to $q\geq 1$ as well as the requirement that $N$ is convex in Theorem 4.5, in which case we will get a different lower bound on $C_{N,q}$ which will depend on $N$ and $q$ .

Since $N$ is a Young function, we may replace $M_{\mu}f$ in (2.5) by $E_{\mu}f$ using Lemma 2.1 at the expense of an additional universal constant in the final conclusion.

Let $A$ denote an arbitrary Borel set in $\Omega$ , and let $\chi_{A,\varepsilon}(x):=(1-\frac{1}{\varepsilon}d_{g}(x,A))\vee 0$ denote a continuous approximation in $\Omega$ to the characteristic function $\chi_{A}$ of $A$ . Clearly:

Applying Corollary 2.4 to functions in $\mathcal{B}(\Omega)$ which approximate $\chi_{A,\varepsilon}$ (in say $W^{1,1}(\Omega,\mu)$ ) and passing to the limit inferior as $\varepsilon\rightarrow 0$ , it follows that:

We start by rewriting the right hand side above as:

To estimate the right-most expression, we use the definition of the dual norm:

Note that we could have also used Young’s inequality, yielding $2\left\|g\right\|_{N^{*}(\mu)}$ instead of $\left\|g\right\|_{N(\mu)^{*}}$ above, but this would lead to slightly worse numeric estimates. Using our assumption (2.5) with $M_{\mu}$ replaced by $E_{\mu}$ , we get:

Using (2.3) (recall that $q\geq 2$ ) to estimate $\left\|\left|\nabla P_{t}\chi_{A}\right|\right\|_{L_{q}(\mu)}$ , we conclude that:

Using Lemma 2.2, we estimate $\left\|\chi_{A}-\mu(A)\right\|_{N(\mu)^{*}}$ :

We also have the following rough estimate (for $q\geq 2$ ):

It remains to optimize on $t$ . Evaluating (2.7) at time:

As evident from the proof, the definition of smooth convexity assumptions given in the Introduction may be extended to encompass the more general case treated in this section. Consequently, the same remark applies to all of the subsequent results which employ our convexity assumptions.

Capacities

As already mentioned in the Introduction, $q$ -capacity inequalities are certain functional formulations of isoperimetric inequalities. We conform to the definition given in , which is a variation on the definition introduced by Maz’ya (for general $q$ ) and extended by Barthe, Cattiaux and Roberto (with $q=2$ ) in (after being introduced in ). In this section, we introduce a coherent unified framework which provides an equivalence between capacity inequalities and weak-type Orlicz-Sobolev functional inequalities (introduced below), and a general sufficient condition for an equivalence to Orlicz-Sobolev inequalities. We also provide an argument for handling general metric probability spaces. There is essentially no novel content in some parts of this section, and these are provided here for completeness.

Given a metric probability space $(\Omega,d,\mu)$ , $1\leq q<\infty$ and $0\leq a\leq b\leq 1$ , we denote:

where the infimum is over all $\Phi:\Omega\rightarrow[0,1]$ which are Lipschitz-on-balls.

Both Maz’ya’s definition for general $q$ and the definition of Barthe–Cattiaux–Roberto for the case $q=2$ use $\int\left|\nabla\Phi\right|^{q}d\mu$ instead of our normalized $\left\|\left|\nabla\Phi\right|\right\|_{L_{q}(\mu)}$ . Our definition seems more convenient, as witnessed by the formulation of our results below.

The use of the metric $d$ induced by the geodesic distance on $(M,g)$ was essential for applying the (linear) semi-group argument of the previous section. Throughout this section, as well as the relevant parts of Sections 4 and 5, such a restriction no longer exists, and one may use an arbitrary metric $d$ . In this case, we interpret $\left|\nabla f\right|$ for any $f\in\mathcal{F}$ as the following Borel function:

(and we define it as 0 if $x$ is an isolated point - see [16, pp. 184,189] for more details).
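A minimal numerical sketch of this notion on the real line, assuming that the display in question is the usual local Lipschitz constant $\limsup_{d(y,x)\rightarrow 0+}\left|f(y)-f(x)\right|/d(x,y)$ (an assumption, since the formula is not reproduced here). The helpers `local_slope` and `cutoff` are hypothetical names; the limsup is replaced by a maximum over finitely many nearby points, so this is only a crude approximation.

```python
def local_slope(f, x, radius=1e-5, samples=50):
    """Approximate limsup_{y -> x} |f(y) - f(x)| / |y - x| by a max over nearby points."""
    ys = [x + radius * k / samples for k in range(-samples, samples + 1) if k != 0]
    return max(abs(f(y) - f(x)) / abs(y - x) for y in ys)


def cutoff(eps, a=0.0, b=1.0):
    """Lipschitz cutoff x -> max(1 - dist(x, [a, b]) / eps, 0), in the spirit of chi_{A,eps}."""
    def chi(x):
        dist = max(a - x, x - b, 0.0)  # distance from x to the interval [a, b]
        return max(1.0 - dist / eps, 0.0)
    return chi


print(local_slope(abs, 0.0))               # slope of |x| at its kink
print(local_slope(lambda t: t * t, 1.0))   # close to the derivative 2t at t = 1
print(local_slope(cutoff(0.1), 1.05))      # close to 1/eps inside the transition layer
print(local_slope(cutoff(0.1), 0.5))       # zero inside A = [0, 1]
```

The last two calls illustrate why cutoffs of the form $\chi_{A,\varepsilon}$ have gradient of size $1/\varepsilon$ supported in the $\varepsilon$ -layer around $A$ .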

A remark which will be useful for dealing with general metric probability spaces is that in the definition of capacity, we may always assume that $\int_{\left\{\Phi=t\right\}}\left|\nabla\Phi\right|^{q}d\mu=0$ , for any $t\in(0,1)$ , even though we may have $\mu\left\{\Phi=t\right\}>0$ . The argument is as follows.

Denote by $\Gamma:=\left\{t\in(0,1);\mu\left\{\Phi=t\right\}>0\right\}$ the discrete countable set of atoms of $\Phi$ under $\mu$ , and write $\Gamma=\left\{\gamma_{i}\right\}_{i=-A,\ldots,B}$ , $A,B\in\left\{0,1,\ldots,\infty\right\}$ , with $\gamma_{i}<\gamma_{i+1}$ (and set $\gamma_{-(A+1)}=0$ if $A<\infty$ and $\gamma_{B+1}=1$ if $B<\infty$ ). Denote $\beta_{i}=(\gamma_{i}+\gamma_{i+1})/2$ , and set for $\varepsilon>0$ :

Clearly $\Phi_{\varepsilon}\in\mathcal{F}$ and $\left\|\left|\nabla\Phi_{\varepsilon}\right|\right\|_{L_{q}(\mu)}\leq(1+\varepsilon)\left\|\left|\nabla\Phi\right|\right\|_{L_{q}(\mu)}$ , so $\Phi_{\varepsilon}$ is a valid approximation. Since $\Phi$ is Lipschitz-on-balls and $\Phi_{\varepsilon}$ has the same set of atoms $\Gamma$ as $\Phi$ , it is immediate to verify that for every $\gamma_{i}\in\Gamma$ :

But the integral on the right hand side of (3.1) is 0 since $\mu_{i},\nu_{i}\notin\Gamma$ .

The following proposition (see , , , [50, Proposition A]) encapsulates the connection between capacity and the isoperimetric profile $I=I_{(\Omega,d,\mu)}$ (we refer to for a careful proof).

Since obviously $Cap_{1}(a,b)=Cap_{1}(1-b,1-a)$ , we have the following useful corollary:

Note that the operation $N\rightarrow N^{\wedge}$ is an involution on $\mathcal{N}$ , and that $N(\cdot^{\alpha})^{\wedge}=(N^{\wedge})^{1/\alpha}$ for $\alpha>0$ .
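The two identities in this remark can be sanity-checked numerically. The sketch below assumes that the adjoint is $N^{\wedge}(t)=1/N^{-1}(1/t)$ (the expression quoted later in this section) and uses, purely as an example, $N(t)=t^{2}\log(1+t^{2})$ ; the helpers `inverse` and `adjoint` are hypothetical names, and inversion is done by plain bisection.

```python
import math


def inverse(f, y, iters=100):
    """Invert an increasing bijection f of [0, inf) at y > 0 by bisection."""
    lo, hi = 0.0, 1.0
    while f(hi) < y:          # grow the bracket until it contains the solution
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def adjoint(N):
    """Return the function t -> 1 / N^{-1}(1 / t), defined for t > 0."""
    return lambda t: 1.0 / inverse(N, 1.0 / t)


def N(t):
    """Example function (an assumption of this sketch): N(t) = t^2 log(1 + t^2)."""
    return t * t * math.log1p(t * t)


alpha = 3.0
for t in (0.3, 0.7, 2.0):
    # Involution: applying the adjoint twice should return N.
    print(t, N(t), adjoint(adjoint(N))(t))
    # Power rule: (N(.^alpha))^ should agree with (N^)^(1/alpha).
    print(t, adjoint(lambda s: N(s ** alpha))(t), adjoint(N)(t) ** (1.0 / alpha))
```

Both pairs of printed values should agree up to rounding, which is consistent with the direct computation $(N^{\wedge})^{-1}(1/t)=1/N(t)$ .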

$N(t^{\alpha})/t$ is non-decreasing iff $N^{\wedge}(t)^{1/\alpha}/t$ is non-increasing ( $\alpha>0$ ).

It is enough to prove the “only if” direction for $\alpha=1$ by Remark 3.6. Our assumption is that for all $0<t_{1}\leq t_{2}$ :

Let $s_{1}\geq s_{2}>0$ be given. Using $t_{i}=N^{-1}(1/s_{i})$ , $i=1,2$ above (which is legitimate since $N$ is increasing), we deduce:

We denote by $L_{s,\infty}(\mu)$ the weak $L_{s}$ quasi-norm, defined as:

We now extend the definition of the weak $L_{s}$ quasi-norm to Orlicz quasi-norms $N(\mu)$ , using the adjoint function $N^{\wedge}$ :

Given $N\in\mathcal{N}$ , define the weak $N(\mu)$ quasi-norm as:

This definition is consistent with the one for $L_{s,\infty}$ , and satisfies:

as easily checked using the Markov-Chebyshev inequality. Also note that by a simple union-bound:

The motivation for the definition of $N^{\wedge}$ stems from the immediate observation that for any Borel set $A$ :

For this reason, the expression $1/N^{-1}(1/t)$ already appears in the works of Maz’ya [40, p. 112] and Roberto–Zegarlinski .
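To make the role of $1/N^{-1}(1/t)$ concrete, the following sketch computes both norms on a finite probability space. It assumes the Luxemburg form $\left\|f\right\|_{N(\mu)}=\inf\left\{v>0;\int N(\left|f\right|/v)d\mu\leq 1\right\}$ and the weak form $\sup_{t>0}t\,N^{\wedge}(\mu\left\{\left|f\right|\geq t\right\})$ ; both formulas are assumptions of this sketch, since the corresponding displays are not reproduced above. The example $N(t)=e^{t}-1$ and the helper names are hypothetical.

```python
import math


def luxemburg_norm(values, weights, N, iters=200):
    """inf{v > 0 : sum_i w_i * N(|f_i| / v) <= 1}, computed by bisection."""
    def modular(v):
        return sum(w * N(abs(x) / v) for x, w in zip(values, weights))

    lo, hi = 0.0, 1.0
    while modular(hi) > 1.0:   # enlarge hi until it is admissible
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if modular(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi


def weak_norm(values, weights, N_inverse):
    """sup_t t / N_inverse(1 / mu{|f| >= t}); on a finite space the sup is
    attained at one of the positive values |f_i|."""
    best = 0.0
    for level in {abs(x) for x in values if x != 0}:
        mass = sum(w for x, w in zip(values, weights) if abs(x) >= level)
        best = max(best, level / N_inverse(1.0 / mass))
    return best


weights = [0.2, 0.3, 0.5]          # a probability measure on three points
chi_A = [1.0, 1.0, 0.0]            # indicator of a set A with mu(A) = 0.5
f = [3.0, 1.0, 0.5]                # an arbitrary non-negative function

# For N(t) = e^t - 1 the inverse is log(1 + s).
print(luxemburg_norm(chi_A, weights, math.expm1), 1.0 / math.log1p(1.0 / 0.5))
print(weak_norm(chi_A, weights, math.log1p))
print(weak_norm(f, weights, math.log1p), luxemburg_norm(f, weights, math.expm1))
```

For the indicator, both norms agree with $1/N^{-1}(1/\mu(A))$ , while for a general function the weak quantity does not exceed the Luxemburg norm, in line with the Markov-Chebyshev argument mentioned above.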

is called a weak-type Orlicz-Sobolev inequality.

The weak-type Orlicz-Sobolev inequality (3.4) implies:

Apply (3.4) to $f=\Phi$ , where $\Phi:\Omega\rightarrow[0,1]$ is any Lipschitz-on-balls function so that $\mu\left\{\Phi=1\right\}\geq t$ and $\mu\left\{\Phi=0\right\}\geq 1/2$ . Since $M_{\mu}\Phi=0$ , it follows that:

Taking the infimum over all $\Phi$ as above, the assertion is verified. ∎

2 Equivalences

Let $1\leq q<\infty$ , then the following statements are equivalent:

and the best constants $D_{1},D_{2}$ above satisfy $D_{1}\leq D_{2}\leq 4D_{1}$ .

Given a non-negative function $f$ as above ( $\mu\left\{f=0\right\}\geq 1/2$ hence $M_{\mu}f=0$ ), and $t>0$ , define $\Omega_{t}=\left\{f\leq t\right\}$ and $f_{t}:=f/t\wedge 1$ . Then:

Taking supremum on $t>0$ , the assertion follows. ∎

and the best constants $D_{1},D_{2}$ above satisfy $D_{1}\leq D_{2}\leq 4D_{1}$ .

As already mentioned in the Introduction, we call an inequality of the form (3.6) an Orlicz-Sobolev inequality (even though $N$ may not be convex).

One may show (see e.g. the proof of [46, Theorem 1]) that when $N(t^{1/q})$ is convex (so in particular $N(t)^{1/q}/t$ is non-decreasing), Proposition 3.11 is equivalent to a theorem of Maz’ya [40, p. 112], but there the condition on $N$ is hidden. Such a stronger assumption is too restrictive for our purposes. Under this stronger assumption, the statement of this proposition was used in the case $q=2$ in and for $N(t)=t^{2},q=2$ in .

The last inequality follows from the fact that $N^{1/q}(t)/t$ is non-decreasing, so denoting $v_{\pm}=\left\|f_{\pm}\right\|_{N(\mu)}$ , we indeed verify that:

We will first assume that $f$ is bounded. Given a bounded non-negative function $f$ as above ( $M_{\mu}f=0$ and $\mu\left\{f=0\right\}\geq 1/2$ ), we may assume by homogeneity that $\left\|f\right\|_{L_{\infty}}=1$ . For $i\geq 1$ , denote $\Omega_{i}=\left\{1/2^{i}\leq f\leq 1/2^{i-1}\right\}$ , $m_{i}=\mu(\Omega_{i})$ , $f_{i}=2^{i}(f-1/2^{i})\vee 0\wedge 1$ and set $m_{0}=0$ . Also denote $J:=N^{\wedge}$ . Now:

It remains to show that $\left\|f\right\|_{N(\mu)}\leq V$ . Indeed:

where in the last inequality we have used the fact that $N(t)^{1/q}/t$ is non-decreasing, hence $(J^{-1})^{1/q}(t)/t$ is non-decreasing, and therefore:

whenever $x/y\leq 1$ , which is indeed the case for us.

For a non-bounded $f\in\mathcal{F}$ with $\mu\left\{f=0\right\}\geq 1/2$ , we may define $f_{m}=f\wedge b_{m}$ so that $\mu\left\{f>b_{m}\right\}\leq 1/m$ and (just for safety) $\mu\left\{f=b_{m}\right\}=0$ . It then follows by what was proved for bounded functions that:

where all limits exist since they are non-decreasing. To conclude, $Z\geq\left\|f\right\|_{N(\mu)}$ , since $N$ is continuous, so by the Monotone Convergence Theorem:

We immediately deduce from Propositions 3.10 and 3.11 the following peculiar corollary on the equivalence of the weak and usual Orlicz norms for some functional inequalities:

and the best constants $D_{1},D_{2}$ above satisfy $D_{1}\leq D_{2}\leq 4D_{1}$ .

This corollary seems useful, even in the case of $N(t)=t^{2}$ and $q=2$ , where this amounts to an equivalent characterization of the classical Poincaré inequality, using the weak $L_{2,\infty}$ quasi-norm on the left hand side. We do not know whether this characterization was previously noticed.

Another useful fact which follows from Propositions 3.10 and 3.11 is that the behavior of $N$ at a neighborhood of 0 is simply irrelevant as far as Orlicz-Sobolev inequalities are concerned:

Then the following statements are equivalent:

and the best constants $D_{1},D_{2}$ above satisfy $\frac{1}{4}D_{1}\leq D_{2}\leq 4D_{1}$ .

Note that $N_{0}$ still satisfies that $N_{0}(t)^{1/q}/t$ is non-decreasing and that $N^{\wedge}(t)=N_{0}^{\wedge}(t)$ on $t\in[0,1/2]$ . Using Proposition 3.10 to pass from the Orlicz-Sobolev inequality to a capacity inequality, we can then exchange between $N$ and $N_{0}$ , and use Proposition 3.11 to pass back to the other Orlicz-Sobolev inequality. ∎

The General Theorem

Note that the assumption $q\geq 2$ was needed for the proof of Theorem 2.5 in order to use the estimate (2.3), and the convexity of $N$ was needed to employ Lemma 2.2. In order to relax these assumptions, as well as to deduce the direction $(2)\Rightarrow(1)$ in Theorem 1.1, we will need some additional observations, which are most naturally formulated in the language of capacities.

In the following proposition, the case $q_{0}=1$ is due to Maz’ya [40, p. 105]. Motivated by the method used in our joint work with Sodin in , we provide an independent proof, which generalizes to the case of an arbitrary metric probability space and $q_{0}>1$ . We denote the conjugate exponent to $q\in[1,\infty]$ by $q^{*}=q/(q-1)$ .

Let $1\leq q_{0}\leq q<\infty$ and set $p_{0}=q_{0}^{*},p=q^{*}$ . Then for all $0<a<b<1$ :

Let $0<a<b<1$ be given, and let $\Phi:\Omega\rightarrow[0,1]$ be a function in $\mathcal{F}$ such that $a^{\prime}:=\mu\left\{\Phi=1\right\}\geq a$ and $1-b^{\prime}:=\mu\left\{\Phi=0\right\}\geq 1-b$ . As usual (see Remark 3.3), by approximating $\Phi$ , we may assume that $\int_{\left\{\Phi=t\right\}}\left|\nabla\Phi\right|^{q}d\mu=0$ for all $t\in(0,1)$ . Let $C:=\left\{t\in(0,1);\mu\left\{\Phi=t\right\}>0\right\}$ denote the discrete set of atoms of $\Phi$ under $\mu$ , set $\Gamma:=\left\{\Phi\in C\right\}$ and denote $\gamma=\mu(\Gamma)$ .

We now choose $t_{0}=0<t_{1}<t_{2}<\ldots<1$ , so that denoting for $i\geq 1$ , $\Omega_{i}=\left\{t_{i-1}\leq\Phi\leq t_{i}\right\}$ , and setting $m_{i}=\mu(\Omega_{i}\setminus\Gamma)$ , we have $m_{i}=(b^{\prime}-a^{\prime}-\gamma)\alpha^{i-1}(1-\alpha)$ , where $0\leq\alpha\leq 1$ will be chosen later. Denote in addition $\Phi_{i}=\left(\frac{\Phi-t_{i-1}}{t_{i}-t_{i-1}}\vee 0\right)\wedge 1$ , $N_{i}=\sum_{j>i}m_{j}$ . Applying Hölder’s inequality twice, we estimate:

Since $\mu\left\{\Phi\geq t_{i}\right\}\geq a^{\prime}+N_{i}$ and $Cap_{q_{0}}(s,b)$ is non-decreasing in $s$ , we continue to estimate as follows:

where we have used that $m_{i+1}=\alpha m_{i}$ , $m_{i}=\frac{1-\alpha}{\alpha}N_{i}$ , and in the last inequality the fact that $Cap_{q_{0}}(s,b)$ is non-decreasing in $s$ . The assertion now follows by taking supremum on all $\Phi$ as above, and choosing the optimal $\alpha=1-p/p_{0}$ . ∎
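The bookkeeping identities used in this proof, $m_{i+1}=\alpha m_{i}$ and $m_{i}=\frac{1-\alpha}{\alpha}N_{i}$ , follow from summing a geometric series, and can be verified in exact arithmetic. In the sketch below, `total` stands for $b^{\prime}-a^{\prime}-\gamma$ ; its value, the value of `alpha` and the helper names are hypothetical, and $0<\alpha<1$ is assumed so that the ratio $\frac{1-\alpha}{\alpha}$ makes sense.

```python
from fractions import Fraction


def layer_masses(total, alpha, count):
    """Masses m_i = total * alpha**(i - 1) * (1 - alpha) for i = 1..count."""
    return [total * alpha ** (i - 1) * (1 - alpha) for i in range(1, count + 1)]


def tail(total, alpha, i):
    """Closed form of N_i = sum_{j > i} m_j, namely total * alpha**i."""
    return total * alpha ** i


total, alpha, count = Fraction(3, 10), Fraction(2, 5), 12
m = layer_masses(total, alpha, count)   # m[k] stores m_{k+1}

for i in range(1, count):
    partial_tail = sum(m[i:])           # m_{i+1} + ... + m_count
    # Geometric decay of consecutive layers.
    assert m[i] == alpha * m[i - 1]
    # Each layer is a fixed multiple of the mass lying above it.
    assert m[i - 1] == (1 - alpha) / alpha * tail(total, alpha, i)
    # The truncated tail plus the remainder equals the closed form.
    assert partial_tail + tail(total, alpha, count) == tail(total, alpha, i)

print(sum(m) + tail(total, alpha, count) == total)
```

Using `Fraction` avoids rounding, so the assertions check the identities exactly rather than up to a tolerance.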

Let $1\leq p\leq p_{0}\leq\infty$ , and let $N\in\mathcal{N}$ so that $N(t)^{1/\alpha}/t$ is non-decreasing for some $\alpha>0$ (in particular this holds with $\alpha=1$ when $N$ is a Young function). Then for any $t>0$ :

Let us evaluate the integral on $[t,2t]$ and $[2t,\infty)$ separately:

On the other hand, since $N^{\wedge}(t^{\alpha})/t$ is non-increasing:

Summing these two expressions, the assertion follows. ∎

We do not optimize on the dependence on $\alpha$ here, since in our applications $\alpha\geq 1$ . In this case, note that $\gamma_{p,p_{0}}$ in (4.1) and $\delta_{p,p_{0},\alpha}$ in (4.2) conveniently satisfy:

where $C>0$ is some universal constant. This will be used in the proof of Theorem 4.5 below.

Let $p_{1},p_{2},p_{3}\in[1,\infty]$ , and let $N_{1}\in\mathcal{N}$ satisfy:

$N_{2}(t)^{1/p_{3}}/t$ is non-decreasing.

If $p_{2}\leq p_{3}$ then $N_{2}$ is a convex (hence Young) function.

Note that since $N_{1}\in\mathcal{N}$ , it is almost everywhere differentiable. Also note that our integrability conditions (4.3) together with $N_{1}\in\mathcal{N}$ ensure that $N_{2}\in\mathcal{N}$ . We will assume that $p_{2}<\infty$ ; the case $p_{2}=\infty$ follows by taking the limit.

For the first part, it is equivalent to show that $N_{2}^{\wedge}(t^{p_{3}})/t$ is non-increasing, which in turn is equivalent to checking that $F(t)$ , defined below, is non-decreasing:

and the integrability condition (4.3) ensures that $\limsup_{t\rightarrow\infty}G(t)=0$ . We will show that $G(t)$ is non-increasing, from which it will follow that $G(t)\geq 0$ , hence $F^{\prime}(t)\geq 0$ , as claimed. Indeed, for almost all $t>0$ :

The last expression is indeed non-positive, since $N_{1}(t)^{1/p_{3}+1/p_{2}-1/p_{1}}/t$ is non-decreasing, hence $N_{1}^{\wedge}(t)^{1/(1/p_{3}+1/p_{2}-1/p_{1})}/t$ is non-increasing, and by differentiating the latter expression one verifies that $(N_{1}^{\wedge})^{\prime}(t)\leq(1/p_{3}+1/p_{2}-1/p_{1})N_{1}^{\wedge}(t)/t$ .

For the second part, let us substitute the definitions of $N_{1}^{\wedge},N_{2}^{\wedge}$ in (4.4) and perform the change of variables $z=1/s$ . This amounts to:

Taking the derivative, we obtain that for almost every $t>0$ :

Multiplying by the denominator on the left hand side and taking the derivative once again yields that for almost every $t>0$ :

In particular, $N_{2}$ is twice differentiable for almost every $t>0$ , and it is clear that $N_{2}^{\prime\prime}\geq 0$ if $T\leq 0$ almost everywhere. The latter amounts to checking that for almost all $z>0$ :

When $p_{2}\leq p_{3}$ , this follows from the stronger statement:

which indeed holds for almost all $z>0$ , as verified by differentiating $\frac{N_{1}(z)^{1/p_{3}+1/p_{2}-1/p_{1}}}{z}$ , which by assumption is non-decreasing.

2 Orlicz-Sobolev implies Isoperimetry for $q\geq 1$

We can now prove the following extension of Theorem 2.5:

The assumption that $q\geq 2$ in Theorem 2.5 can be relaxed to $q\geq 1$ , and the assumption that $N$ is a Young function omitted, if we assume in addition that:

In this case, under our convexity assumptions, (2.5) implies (2.6) with:

where $c>0$ is a universal constant, and:

Estimating the expression in (4.5) is connected to Hardy-type inequalities. We do not proceed in this direction in this work, since for our applications the bounds are easy to deduce directly. We remark that whenever $N(t)^{\alpha}/t$ is non-decreasing for some $\alpha>0$ , $N^{\wedge}(t)^{1/\alpha}/t$ is non-increasing, and so:

Using this estimate, it is immediate to show that the expression on the right hand side of (4.5) is bounded from above by a universal constant whenever $1/q\leq\alpha\leq 1$ , even if the infimum in (4.5) is replaced by a supremum. In particular, this obviously applies to all Young functions $N$ (with $\alpha=1$ ).

First, note that whatever the value of $q$ , we have:

By Corollary 3.16, we can always assume that $N(t)=2(t/N^{-1}(2))^{q}$ on $t\in[0,N^{-1}(2)]$ , so that $N^{\wedge}(t)=\frac{2^{1/q}}{N^{-1}(2)}t^{1/q}$ on $t\in[1/2,\infty)$ , and therefore:

Using the assumption that $N(t)^{1/q}/t$ is non-decreasing, hence $N^{\wedge}(t^{q})/t$ is non-increasing, it follows that:

The assumption (2.5) implies by Proposition 3.10 that:

We start with the case $q<2$ . Using Proposition 4.1 (with $q_{0}=q,q=2$ ) to pass from $Cap_{q}$ to $Cap_{2}$ , together with Lemma 4.2 (with $\alpha=q$ ) and Remark 4.3, we obtain that:

for some universal constant $c>0$ , where $N_{2}$ is a function so that:

Since $N(t)^{1/q}/t$ is non-decreasing and the integrability conditions (4.7), (4.8) are fulfilled, we can apply Lemma 4.4 with $N_{1}=N$ , $p_{1}=p,p_{2}=2,p_{3}=2$ , and conclude that $N_{2}$ is a Young function and that $N_{2}(t)^{1/2}/t$ is non-decreasing. Proposition 3.11 then implies that:

We can now apply Theorem 2.5, and conclude that:

with $c^{\prime}>0$ a universal constant. The value of $C_{N,q}$ in (4.5) ensures that this implies:

as required. This concludes the proof when $q<2$ .
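
As a sanity check on the application of Lemma 4.4 in the case $q<2$ above, the exponent appearing in its hypothesis can be computed explicitly. We assume here that $p=q^{*}$ denotes the conjugate exponent of $q$, as elsewhere in this section.

```latex
% Assumption: p = q^{*}, i.e. 1/p + 1/q = 1.
% Parameters used above: p_1 = p, p_2 = 2, p_3 = 2.
\[
\frac{1}{p_{3}} + \frac{1}{p_{2}} - \frac{1}{p_{1}}
  \;=\; \frac{1}{2} + \frac{1}{2} - \frac{1}{q^{*}}
  \;=\; 1 - \Bigl(1 - \frac{1}{q}\Bigr)
  \;=\; \frac{1}{q} ,
\]
% so the monotonicity hypothesis of Lemma 4.4 reads:
% N(t)^{1/q}/t is non-decreasing, which is exactly what is assumed.
```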

When $q\geq 2$ , we use a similar argument. Let $N_{q}$ denote the function so that:

Again, by Lemma 4.4 with $N_{1}=N$ , $p_{1}=p_{2}=p_{3}=q$ , we know that $N_{q}$ is a Young function and that $N_{q}(t)^{1/q}/t$ is non-decreasing. Recalling Remark 4.6, the assumption (4.9) implies that:

for some universal $c>0$ . Proposition 3.11 then implies that:

We can now apply Theorem 2.5, and using the definition of $C_{N,q}$ in (4.5), conclude that:

Let $1\leq q<\infty$ , $N\in\mathcal{N}$ , and assume that:

with $r,p$ as in $(\ref{eq:r-p})$ . Then under our convexity assumptions, the assumption (2.5) implies the conclusion (2.6) with:

whenever $s\geq t$ . Applying Theorem 4.5 and using (4.12), it is straightforward to obtain a lower bound on the expression in (4.5), which yields the bound in (4.11). ∎

It was shown by Bobkov and Zegarlinski [18, Proposition 3.1] (generalizing the case $q=2$ due to Bobkov and Götze [15, Proposition 4.1]) that the following $q$ -log-Sobolev inequality (with $1\leq q\leq 2$ ):

where $\varphi_{q}(t)=t^{q}\log(1+t^{q})$ , and $D_{1}\simeq D_{2}$ uniformly on $q\in[1,2]$ . Using Lemma 2.1, we can replace $E_{\mu}f$ in (4.14) by $M_{\mu}f$ , at the expense of an additional universal constant. Using Corollary 4.7 with $N=\varphi_{q}$ and $\alpha=\frac{1}{2q}>1/q-1/2$ in the range $q\in(1,2]$ , we can easily show that the $q$ -log-Sobolev inequality (4.13) implies a corresponding isoperimetric inequality. However, to handle the entire range $q\in[1,2]$ uniformly, we will need to turn to Theorem 4.5.
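
Two elementary facts about this choice of $N=\varphi_{q}$ and $\alpha$ are easy to verify by hand; the following short computation is included only as a check. It also shows why the endpoint $q=1$ is excluded when this particular $\alpha$ is used.

```latex
% (a) The constraint on \alpha holds exactly when q > 1:
\[
\frac{1}{2q} > \frac{1}{q} - \frac{1}{2}
\;\Longleftrightarrow\;
\frac{1}{2} > \frac{1}{2q}
\;\Longleftrightarrow\;
q > 1 .
\]
% (b) With \varphi_q(t) = t^q \log(1 + t^q):
\[
\frac{\varphi_{q}(t)^{1/q}}{t} \;=\; \bigl( \log(1+t^{q}) \bigr)^{1/q} ,
\]
% which is increasing in t > 0.
```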

Under our convexity assumptions, the $q$ -log-Sobolev inequality (4.13) for $1\leq q\leq 2$ implies the following isoperimetric inequality:

$\varphi_{q}(t)^{1/q}/t$ is non-decreasing, so using Lemma 2.1 and Corollary 3.16, (4.13) implies that:

Note that $N_{q}(t)^{1/q}/t$ is still non-decreasing. Clearly:

uniformly on $q\in[1,2]$ . Hence, using Theorem 4.5 to deduce the isoperimetric inequality (4.15), it remains to bound the expression in (4.5) from below uniformly in $q\in[1,2]$ . This amounts to showing that:

where $p=q^{*}$ and $C_{1}>0$ is a universal constant. First, we bound the tail of this integral using (4.16):

which is bounded by a universal constant for $t\in[0,1/2]$ . Next, we use (4.16) and the change of variables $v=\log(1+1/s)$ to bound:

We see that this is also bounded in the range $t\in[0,1/2]$ , and this concludes the proof.
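
The change of variables $v=\log(1+1/s)$ used in the proof can be unpacked as follows. The integrand is left generic (the function $F$ below is a placeholder of ours, not a quantity from the proof), so this only records the substitution itself.

```latex
% v = \log(1 + 1/s)  <=>  s = 1/(e^{v} - 1),  for s > 0, v > 0.
\[
ds \;=\; -\frac{e^{v}}{(e^{v}-1)^{2}}\,dv ,
\qquad
\frac{ds}{s} \;=\; -\frac{e^{v}}{e^{v}-1}\,dv .
\]
% Hence, for a generic integrand F (placeholder) and 0 < a < b:
\[
\int_{a}^{b} F\bigl(\log(1+1/s)\bigr)\,\frac{ds}{s}
  \;=\; \int_{\log(1+1/b)}^{\log(1+1/a)} F(v)\,\frac{e^{v}}{e^{v}-1}\,dv .
\]
% On s \in (0,1/2], i.e. v \ge \log 3, the factor e^{v}/(e^{v}-1) lies in (1, 3/2].
```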

The case $q=2$ was previously shown by Bakry–Ledoux and Ledoux . For general $1\leq q\leq 2$ , the reverse direction without any convexity assumptions was shown by Bobkov and Zegarlinski , and given a different proof by Sodin and the author . We will see a general argument for this in the next theorem.

3 Isoperimetry implies Orlicz-Sobolev

Let $1\leq q<\infty$ , and set $p=q^{*}$ . Let $N\in\mathcal{N}$ , so that $N(t)^{1/q}/t$ is non-decreasing. Then:

We rewrite (4.17) using Corollary 3.5 as:

Using Proposition 4.1 (with $q_{0}=1,q=q$ ) to pass from $Cap_{1}$ to $Cap_{q}$ , we obtain that:

where $G_{p}$ is defined on $[0,1/2]$ as:

Incidentally, if we replace $1/2$ in the upper range of the above integral by $\infty$ , by Lemma 4.4 with $N_{1}=N$ , $p_{1}=p,p_{2}=p,p_{3}=q$ , we would have that $G_{p}(t^{q})/t$ is non-increasing, but this will not be used. The estimate in (4.19) ensures that:

Using that $N(t)^{1/q}/t$ is non-decreasing, Proposition 3.11 then implies (4.18), as asserted. ∎

Note that the assumption (4.17) implies (4.20) without assuming that $N(t)^{1/q}/t$ is non-decreasing.

Let $1\leq q<\infty$ and set $p=q^{*}$ . Assume that:

with some $\alpha>0$ . Then the assumption (4.17) implies the conclusion (4.18) with:

Exactly as in the proof of Corollary 4.7. ∎

Using this for $N=\varphi_{q}$ and $\alpha=\frac{1}{2q}$ , we see that as already noted in Remark 4.9, the isoperimetric inequality $(\ref{eq:q-log-Sob-Iso})$ implies without any further assumptions the $q$ -log-Sobolev inequalities (4.14) and (4.13).

4 Summary

To conclude this section, we provide a slightly stronger version of Theorem 1.1 from the Introduction, on the equivalence of isoperimetric and Orlicz-Sobolev functional inequalities under our convexity assumptions. Our results in this section are more general, but this theorem summarizes the most useful cases given by Theorem 2.5 and Corollaries 4.7 and 4.12, and generalizes the results from (which dealt with the case $N(t)=t^{p}$ below).

Let $1\leq q\leq\infty$ , and let $N\in\mathcal{N}$ . Assume that:

Then the following statements are equivalent:

where the best constants $D_{1},D_{2}$ above satisfy:

with $c_{1},c_{2}>0$ universal constants and:

In fact, among the assumptions (4.22), (4.23), (4.24), (4.25):

For the direction $(2)\Rightarrow(1)$ only (4.24) is needed.

For the direction $(1)\Rightarrow(2)$ with $q\geq 2$ only (4.22) and one of (4.23) or (4.24) are needed, and if (4.23) is used then $C_{\alpha,q}$ can be chosen to be $1$ .

For the direction $(1)\Rightarrow(2)$ with $q<2$ (4.23) is not needed.

The direction $(1)\Rightarrow(2)$ was proved in Theorem 2.5 and Corollary 4.7. The direction $(2)\Rightarrow(1)$ was proved in Corollary 4.12. ∎

Tensorization

As mentioned in the Introduction, the results of Section 4 coupled with the results of Section 3 on the equivalence of capacity inequalities and Orlicz-Sobolev inequalities, allow us to directly infer isoperimetric inequalities from capacity inequalities (under convexity assumptions of course).

Let $1\leq q<\infty$ , and let $N\in\mathcal{N}$ . If:

then the following statements are equivalent:

where the best constants $D_{1},D_{2}$ above satisfy:

with $c_{1},c_{2}>0$ universal constants and $B_{\alpha,q},C_{\alpha,q}$ as in (4.26). In fact, among the assumptions (5.1), (5.2), (5.3), (5.4):

For the direction $(2)\Rightarrow(1)$ only (5.3) is needed.

For the direction $(1)\Rightarrow(2)$ with $q\geq 2$ only (5.1) and one of (5.2) or (5.3) are needed, and if (5.2) is used then $C_{\alpha,q}$ in (4.26) can be chosen to be $1$ .

For the direction $(1)\Rightarrow(2)$ with $q<2$ (5.2) is not needed.

The direction $(1)\Rightarrow(2)$ follows from Proposition 3.11 coupled with Theorem 2.5 and Corollary 4.7. The direction $(2)\Rightarrow(1)$ follows from Theorem 4.10, Remark 4.11 and Corollary 4.12 (note that we indeed do not need the assumption that $N(t)^{1/q}/t$ is non-decreasing). ∎

It has been established in recent years that several other types of functional inequalities are equivalent to $2$ -capacity inequalities. These include Beckner-type inequalities (including the Latała–Oleszkiewicz inequality as in ) and additive $\Phi$ -Sobolev inequalities . The advantage of these inequalities compared to the Orlicz-Sobolev inequalities lies in the fact that they admit tensorization. This easily allows us to deduce an extension of the dimension-free tensorization results of Barthe–Cattiaux–Roberto . We demonstrate this with Beckner-type inequalities (5.5), using Theorem 9 and Lemma 8 in as cited in (with a trivial change of notation):

with the best constants $D_{1},D_{2}$ above satisfying $D_{1}/\sqrt{6}\leq D_{2}\leq\sqrt{20}D_{1}$ .

It is known (e.g. ) that Beckner-type inequalities (5.5) admit tensorization, in the sense that if they hold for $(M,g,\mu)$ then they also hold for the Riemannian product $(M^{\times k},g^{\otimes k},\mu^{\otimes k})$ for any $k\geq 1$ . To obtain the most general result, we will also need the following remarkable observation of Franck Barthe [4, Theorem 10] (which in fact holds for very general metric probability spaces, but for simplicity we quote it in less general form; see also Ros ):

Then (without any additional convexity assumptions) there exists a constant $c_{D}>0$ depending only on $D$ , such that for any $k\geq 1$ :

Our formulation of Theorem 5.4 using the condition (5.7), without referring to an auxiliary profile $I_{\nu}$ where $\nu$ is some 1-dimensional density, seems more natural than previous requirements, and this will also be evident in the proof. As mentioned in the Introduction, it seems possible to produce a proof of this theorem using the approach of , but the main obstacle would be to pass from the isoperimetric inequality $I(t)\geq J(t)$ to the appropriate $2$ -capacity inequality, using only $J$ and without passing via the auxiliary density $\nu$ (compare with Theorem 7 and Propositions 9,13 in ), which would otherwise result in requiring some additional technical assumptions and in the constant $c_{D}$ to depend on $J$ . On the other hand, without any further technical assumptions, Theorem 5.4 basically follows from the argument used to derive Theorem 5.1 coupled with Theorems 5.2 and 5.3. To make this precise we will need to be slightly more careful.

Denote $g(t)=\min_{s\in[t,1/2]}J(s)/I_{0}(s)$ for $t\in[0,1/2]$ . Clearly $g(1/2)=J(1/2)/I_{0}(1/2)$ , $g$ is non-decreasing and $g\leq J/I_{0}\leq Dg$ on $[0,1/2]$ . Now denote:

Next, let $N\in\mathcal{N}$ denote the function so that:

Indeed, Fact 3 implies that $N^{\wedge}(0)=0$ , and together with the linear growth of $J_{1}$ at infinity, this means that $N^{\wedge}\in\mathcal{N}$ and hence $N\in\mathcal{N}$ . We now apply Lemma 4.4 with $N_{1}=J_{1}^{\wedge}$ and $p_{1}=\infty,p_{2}=p_{3}=2$ . The appeal to Lemma 4.4 is legitimate since $N^{\wedge}\in\mathcal{N}$ and since $J_{1}^{\wedge}(t)/t$ is non-decreasing by Fact 2. We deduce that:

In addition, Facts 2 and 3 provide the following estimates:

and an elementary computation provided in Lemma 5.5 below implies that:

with $\simeq_{D}$ meaning that the bounds depend on $D$ .

Proposition 4.1 (with $q_{0}=1,q=q$ ) implies that for all $t\in[0,1/2]$ :

is essentially non-decreasing (with the same constant) on $[0,1/(e-1)]$ . The latter follows from (5.11) and Fact 3 on $J_{0}$ .

This concludes the proof, up to the proof of Lemma 5.5 below. ∎

for some constant $C>0$ , which will conclude the proof.

The lower bound is immediate from the lower bound in (5.10). For the upper bound, we decompose the integral into two parts:

with the first one interpreted as $0$ if $t>1/2$ . By Fact 1 and the definition of $J_{1}$ , the second integral can be estimated for $t\in$ by:

To estimate the first integral for $t\in[0,1/2]$ , we use the upper bound in (5.10) and the change of variables $v=\log(1+1/s)$ :

In fact, by inspecting the bound given by Theorem 9 in more carefully, one can repeat our argument for an arbitrary isoperimetric profile $J$ (perhaps violating the Central-Limit obstruction), and study what happens to the profile under tensorization. We leave this for another note.

Approximation Argument

Recall that $\mu_{m}$ is said to converge to $\mu$ in total-variation if:

In this section, we provide a careful approximation argument for deducing that our results from Section 2 hold under arbitrary convexity assumptions, without requiring any further smoothness conditions (as defined in the Introduction or more generally in Section 2 and Remark 2.7). We recall that at this point, the proof of Theorem 2.5 is only valid under the additional smoothness conditions. We emphasize that this is not just a technical matter, and that our convexity assumptions will need to be invoked once again. To explain this better, let us describe a naive approximation approach which completely fails. Suppose that $(\Omega,d,\mu)$ satisfies an $(N,q)$ Orlicz-Sobolev inequality as in the assumption of Theorem 2.5, and we would like to deduce from this the conclusion of this theorem, assuming that $(\Omega,d,\mu)$ satisfies our convexity assumptions. By definition, we know that there exists a sequence $\left\{\mu_{m}\right\}$ which approximates $\mu$ in total-variation, such that $(\Omega,d,\mu_{m})$ satisfy our smooth convexity assumptions, and so the proof of Theorem 2.5 applies to these spaces. One may hope that since $\mu_{m}$ approximate $\mu$ , the spaces $(\Omega,d,\mu_{m})$ will also satisfy the $(N,q)$ Orlicz-Sobolev inequality (perhaps with a worse constant), allowing us to apply Theorem 2.5. Unfortunately, this is completely false in general. For instance, consider the measures $\mu_{m}$ which are uniform on the set $[0,1]\setminus[1/2-1/m,1/2+1/m]$ , and converge to $\mu$ , the uniform measure on $[0,1]$ . Clearly, the spaces $(\Omega,d,\mu_{m})$ ( $m\geq 3$ ) do not satisfy any $(N,q)$ Orlicz-Sobolev inequality, whereas in the limit the space $(\Omega,d,\mu)$ will satisfy any reasonable inequality (Poincaré, log-Sobolev, etc.). We conclude that a different approach is needed.
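
To see concretely why the spaces $(\Omega,d,\mu_{m})$ in this example fail every such inequality, one can test against a function which is constant on each of the two components of the support. The sketch below assumes that the $(N,q)$ Orlicz-Sobolev inequality has the form $D\|f-M_{\mu}f\|_{N(\mu)}\leq\||\nabla f|\|_{L_{q}(\mu)}$ for locally Lipschitz $f$ (the displayed definition is not reproduced here), and that $\Omega=[0,1]$ with its usual metric; the test function $f_{m}$ is ours.

```latex
% Test function (ours): 0 on the left piece, 1 on the right piece,
% linear across the gap, hence Lipschitz on [0,1].
\[
f_{m}(x) =
\begin{cases}
0, & 0 \le x \le \tfrac12 - \tfrac1m, \\[2pt]
\tfrac{m}{2}\bigl(x - \tfrac12 + \tfrac1m\bigr), & \tfrac12 - \tfrac1m < x < \tfrac12 + \tfrac1m, \\[2pt]
1, & \tfrac12 + \tfrac1m \le x \le 1 .
\end{cases}
\]
% |\nabla f_m| vanishes \mu_m-almost everywhere (the gap carries no mass), so
\[
\bigl\| \, |\nabla f_{m}| \, \bigr\|_{L_{q}(\mu_{m})} = 0 ,
\qquad\text{while}\qquad
\mu_{m}\{f_{m}=0\} = \mu_{m}\{f_{m}=1\} = \tfrac12 .
\]
% Hence f_m - c is not \mu_m-a.e. zero for any constant c (in particular for
% any median), so the left-hand side of the assumed inequality is strictly
% positive while the right-hand side vanishes: no constant D > 0 can work.
```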

Our strategy in this section will be to show that the semi-group estimates of Section 2 can be transferred to a setting without any smoothness assumptions. Our original argument, which at first relied on a method of weak-convergence due to Williams and Zheng (see also Burdzy and Chen ), has been replaced by an elementary argument which we provide below. We continue with the notations used in Section 2, and recall the following definition:

A domain $\Omega\subset(M,g)$ is said to be locally convex, if all geodesics in $M$ tangent to $\partial\Omega$ are locally outside of $\Omega$ . By a result of Bishop , in case that $\Omega$ has $C^{2}$ boundary, this is equivalent to requiring that the second fundamental form of $\partial\Omega$ with respect to the normal pointing into $\Omega$ be positive semi-definite on all of $\partial\Omega$ .

One may always choose a version of $\psi$ which is locally Lipschitz on $\Omega$ .

Let $A\subset M$ denote the Borel subset of points $x\in M$ for which the sequence $\psi_{m}(x)$ converges to $\psi(x)$ (in the wide sense). We know that $vol_{M}(\Omega\setminus A)=0$ . We will show that for each $x_{0}\in\Omega$ , there exists a neighborhood $N_{x_{0}}\subset\Omega$ and a constant $C_{x_{0}}>0$ , so that:

Consequently, it will follow that one may extend $\psi$ by continuity from $A\cap\Omega$ to the entire $\Omega$ , defining $\psi(z_{0})$ for $z_{0}\in\Omega\setminus A$ as $\psi(z_{0})=\lim_{z\rightarrow z_{0},z\in A}\psi(z)$ . The estimate (6.1) will imply that this limit is well defined and that the resulting $\psi$ satisfies the same locally Lipschitz condition.
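
The extension by continuity can be justified by a standard Cauchy-sequence argument. The sketch below assumes that (6.1) is a local Lipschitz estimate of the form $|\psi(x)-\psi(y)|\leq C_{x_{0}}d(x,y)$ for $x,y\in N_{x_{0}}\cap A$ (the display itself is not reproduced here). Note that $A\cap\Omega$ is dense in $\Omega$, since $vol_{M}(\Omega\setminus A)=0$ and every non-empty open subset of $\Omega$ has positive volume.

```latex
% Assumed form of (6.1): |\psi(x) - \psi(y)| \le C_{x_0} d(x,y)  for x, y \in N_{x_0} \cap A.
% Fix z_0 \in \Omega \setminus A and let z_k \in A \cap N_{z_0} with z_k \to z_0. Then
\[
|\psi(z_{k}) - \psi(z_{l})| \;\le\; C_{z_{0}}\, d(z_{k}, z_{l}) \;\longrightarrow\; 0
\qquad (k,l \to \infty),
\]
% so (\psi(z_k)) is Cauchy and converges; interlacing two such sequences shows
% that the limit does not depend on the chosen sequence. Approximating arbitrary
% x, y \in N_{x_0} by points of A \cap N_{x_0} and passing to the limit gives
\[
|\psi(x) - \psi(y)| \;\le\; C_{x_{0}}\, d(x,y)
\qquad \text{for all } x, y \in N_{x_{0}} .
\]
```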

It is known that for any $x_{0}\in M$ , the geodesic open ball $B(x_{0},r)$ for small enough $r>0$ is convex embedded in $M$ , in the sense that it is both geodesically convex and that the exponential map $\exp_{x_{0}}:B_{T_{x_{0}}M}(0,r)\rightarrow B(x_{0},r)$ is a diffeomorphism between $B_{T_{x_{0}}M}(0,r)\subset T_{x_{0}}M$ and $B(x_{0},r)\subset M$ . We will therefore choose our neighborhood $N_{x_{0}}$ to be a convex embedded ball $B(x_{0},r)$ , so that in addition $\overline{B(x_{0},r)}$ is contained in $\Omega$ . Since $\psi_{m}$ converge to $\psi$ almost everywhere, it is clear that if $\overline{B(x_{0},r)}\subset\Omega$ then $B(x_{0},r)$ must be contained in $\Omega_{m}$ for all $m\geq m_{x_{0}}$ (we could add “apart from a subset of zero measure” for safety, but this is in fact not necessary due to the convexity of the domains).

Choosing $r>0$ small enough, it is known (e.g. [28, p. 643], [3, p. 311]) that $f_{x_{0}}:=d(x_{0},\cdot)^{2}$ is a $C^{\infty}$ function on $B(x_{0},r)$ whose Riemannian Hessian satisfies $Hess_{g}f_{x_{0}}\geq A_{x_{0}}g$ on $B(x_{0},r)$ for some $A_{x_{0}}>0$ . Denoting:

we define $h_{x_{0}}=\frac{\max(R_{x_{0},r},0)}{A_{x_{0}}}f_{x_{0}}$ .

Since for all $m\geq m_{x_{0}}$ , $Ric_{g}+Hess_{g}\psi_{m}\geq 0$ on $B(x_{0},r)$ , it follows from the above construction that $Hess_{g}(\psi_{m}+h_{x_{0}})\geq 0$ on $B(x_{0},r)$ . Since $\psi_{m}+h_{x_{0}}\in C^{2}(B(x_{0},r))$ , it is known ([3, p. 310]) that this is equivalent to being geodesically convex in $B(x_{0},r)$ . We now employ [47, Theorem 10.8], whose proof easily passes to the Riemannian setting (taking into account a slight modification provided in [3, Lemma 2.1] of a Euclidean argument). This theorem asserts that if a sequence of geodesically convex functions $\left\{f_{i}\right\}$ pointwise converges on a dense subset of a geodesically convex open set $N$ (to a finite value in each point of the subset), then the pointwise limit in fact exists for each $x\in N$ , and the function $f(x):=\lim_{i\rightarrow\infty}f_{i}(x)$ is finite and geodesically convex on $N$ . Since $\psi_{m}+h_{x_{0}}$ converges to the finite function $\psi+h_{x_{0}}$ on $B(x_{0},r)\cap A$ , it follows that in fact $\psi_{m}+h_{x_{0}}$ converges to a geodesically convex function on the entire $B(x_{0},r)$ . Writing this function as $\psi_{0}+h_{x_{0}}$ , we realize that $\psi_{0}$ coincides with $\psi$ on $B(x_{0},r)\cap A$ . The argument is therefore complete. ∎
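
The convexity of $\psi_{m}+h_{x_{0}}$ used in the proof amounts to the following chain of inequalities between symmetric 2-tensors on $B(x_{0},r)$. Since the display defining $R_{x_{0},r}$ is not reproduced here, we assume (this is our reading, not a quotation) that $R_{x_{0},r}$ is an upper bound for the Ricci curvature on the ball, i.e. $Ric_{g}\leq R_{x_{0},r}\,g$ on $B(x_{0},r)$.

```latex
% Assumption (our reading): Ric_g \le R_{x_0,r}\, g on B(x_0,r).
% Given: Hess_g f_{x_0} \ge A_{x_0} g,  and  Ric_g + Hess_g \psi_m \ge 0  for m \ge m_{x_0}.
\begin{align*}
Hess_{g}(\psi_{m} + h_{x_{0}})
  &= Hess_{g}\psi_{m} + \frac{\max(R_{x_{0},r},0)}{A_{x_{0}}}\, Hess_{g} f_{x_{0}} \\
  &\ge Hess_{g}\psi_{m} + \max(R_{x_{0},r},0)\, g \\
  &\ge Hess_{g}\psi_{m} + Ric_{g} \;\ge\; 0 .
\end{align*}
% Taking the maximum with 0 keeps the coefficient of f_{x_0} non-negative,
% which is what allows the lower bound on Hess_g f_{x_0} to be used.
```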

In fact, we have shown that $\psi_{m}$ converge pointwise on all of $\Omega$ , and that the limit $\psi=\lim_{m\rightarrow\infty}\psi_{m}$ is a semi-convex function (in the sense that in a small enough geodesically convex neighborhood, we may add to it a smooth function to obtain a geodesically convex function in that neighborhood).

We will henceforth choose $\psi$ as constructed in Lemma 6.1. This implies by Rademacher’s theorem that:

The differential $\nabla\psi$ exists almost everywhere on $\Omega$ .

$\left|\nabla\psi\right|$ is bounded almost everywhere on compact subsets of $\Omega$ .

Let $X^{(m)}:\Lambda_{m}\times[0,T]\rightarrow M$ denote the diffusion process on $\overline{\Omega}_{m}$ with reflection on the boundary generated by $\Delta_{(\Omega_{m},\mu_{m})}$ , defined on the probability space $(\Lambda_{m},\mathcal{F}_{m},\mathcal{P}_{m})$ and some fixed $T>0$ . Let $P^{(m)}_{t}$ for $t\in[0,T]$ denote the semi-group associated to $X^{(m)}$ . We will assume that the initial distribution of $X^{(m)}(0)$ is given by the stationary measure $\mu_{m}$ , so that $X^{(m)}$ is a stationary process.

Using a forward-backward martingale decomposition due to Lyons and Zheng and a tightness criterion for stochastic processes with continuous paths, it can be shown as in [56, p. 472] (see also [21, p. 31], [27, pp. 248-257]) with a minor adaptation to the Riemannian setting, that there exists a subsequence $X^{(m_{k})}$ which converges weakly (as measures on $[0,T]\times M$ endowed with the locally uniform topology) to some process $X$ . By passing to a subsequence, let us assume that $X^{(m)}$ converges weakly to $X$ . In particular, for any fixed $t\in[0,T]$ , the law of $X^{(m)}(t)$ weakly converges to that of $X(t)$ , and hence $X$ is also stationary with stationary measure $\mu$ . Since $X(0)$ is by definition distributed according to $\mu$ , there is a one-to-one correspondence between the spaces $L_{2}(\mu):=L_{2}(\Omega,\mathcal{B}(\Omega),\mu)$ and $L_{2}(\Lambda,\sigma(X(0)),\mathcal{P})$ , where $\sigma(X(0))$ is the $\sigma$ -field generated by $X(0)$ and $X$ is defined on the space $\Lambda$ with probability measure $\mathcal{P}$ . Consequently, we can define for any $t\in[0,T]$ the following (bounded) linear operator $P_{t}$ on $L_{2}(\mu)$ :

In fact, as in , it should be possible to show that $X$ is a continuous Markov process, that $P_{t}$ is a strongly continuous semi-group associated to it, and that the associated Dirichlet form is exactly given by:

However, we will not require all this information. We will only use the weak convergence (through a subsequence) of the processes $\left\{X^{(m)}(0),X^{(m)}(t)\right\}$ , defined on the 2-point set $\left\{0,t\right\}$ , to $\left\{X(0),X(t)\right\}$ . We provide an elementary argument to deduce the tightness of this sequence (from which the former statement follows by Prokhorov’s Theorem). Fixing a point $x_{0}\in M$ , since the process $X^{(m)}$ is stationary:

which is easily seen to hold, since this holds for the (probability) measure $\mu$ , and $\mu_{m}$ converge to $\mu$ in total-variation.
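
The tightness argument can be summarized in two lines. This is a sketch under the following assumptions: $K_{R}$ (our notation) denotes the closed geodesic ball of radius $R$ around $x_{0}$, which is compact since $M$ is complete, and the total-variation convergence is used only through the consequence $\sup_{A}|\mu_{m}(A)-\mu(A)|\rightarrow 0$.

```latex
% Notation (ours): K_R = closed geodesic ball of radius R about x_0 (compact, as M is complete).
% Stationarity: X^{(m)}(0) and X^{(m)}(t) both have law \mu_m, so by the union bound
\[
\mathcal{P}_{m}\bigl( (X^{(m)}(0), X^{(m)}(t)) \notin K_{R} \times K_{R} \bigr)
  \;\le\; 2\,\mu_{m}(M \setminus K_{R}) ,
\]
% and total-variation convergence controls the right-hand side:
\[
\mu_{m}(M \setminus K_{R})
  \;\le\; \mu(M \setminus K_{R}) + \sup_{A}\, |\mu_{m}(A) - \mu(A)| .
\]
```

Given $\varepsilon>0$, the last term is below $\varepsilon$ for all $m\geq m_{0}$, and $R$ can then be chosen large enough to handle $\mu$ together with the finitely many remaining measures $\mu_{m}$, $m<m_{0}$.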

We conclude that by passing to a subsequence, we may assume that $\left\{X^{(m)}(0),X^{(m)}(t)\right\}$ converges weakly to $\left\{X(0),X(t)\right\}$ . In other words, for any $f,g$ continuous and bounded on $M$ :

where $P_{t}$ is the linear operator defined in (6.2). Clearly, this also extends to hold for all $f\in L_{\infty}(\mu)$ and $g\in L_{1}(\mu)$ .

Using (6.3), we can now transfer the known estimates (2.3) and Corollary 2.4 on the semi-groups $P^{(m)}_{t}$ to the operators $P_{t}$ . Indeed, given a bounded and smooth function $f$ on $\Omega$ , we need to show that:

using the same estimates (6.4) and (6.5) for $P^{(m)}_{t}$ and $\mu_{m}$ replacing $P_{t}$ and $\mu$ , respectively (the latter are known to be true as described in Section 2, after approximating $f$ with functions in $\mathcal{B}(\Omega_{m})$ ). Note that we interpret the second estimate (6.5) regarding $\nabla P_{t}f$ in the sense of distributions as described below, since this is all that is needed for the applications of Section 2. Writing:

the estimate (6.4) immediately follows from the same estimate for $P^{(m)}_{t}$ and $\mu_{m}$ and the weak convergence (6.3). The estimate (6.5) is harder to handle, since it involves the distributional gradient of $P_{t}(f)$ . Setting $p=q^{*}$ , we interpret:

and $\int\left\langle\nabla P_{t}f,g\right\rangle d\mu$ is interpreted in the distributional sense (using integration by parts). Let $g$ denote such a vector field as above. Since $\mu_{m}$ converges to $\mu$ in total-variation, it remains to show the first equality in:

Since $\||\nabla P^{(m)}_{t}f|\|_{L_{\infty}}\leq\frac{1}{\sqrt{2t}}\left\|f\right\|_{L_{\infty}}$ for all $m$ , the convergence of $\mu_{m}$ to $\mu$ in total-variation implies that the second term converges to 0 as $m\rightarrow\infty$ . To handle the first term, we integrate by parts:

It follows from Lemma 6.1 and Remark 6.3 that $\left\langle g,\nabla\psi\right\rangle$ is a bounded function. This implies that the first term converges to 0 by $(\ref{eq:Pt-approx})$ , and since $\|P^{(m)}_{t}f\|_{L_{\infty}}\leq\left\|f\right\|_{L_{\infty}}$ , the convergence of $\mu_{m}$ to $\mu$ in total-variation implies that the second term converges to 0 as well.
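
For orientation, the integration by parts underlying this step has the following general shape. This is a sketch under assumptions of ours, which may differ in detail from the displays above: $d\mu=e^{-\psi}dvol_{M}$ on $\Omega$ with $\psi$ locally Lipschitz, $g$ is a smooth vector field with compact support in $\Omega$, and $u$ is locally Lipschitz.

```latex
% Assumptions (ours): d\mu = e^{-\psi} dvol_M on \Omega, \psi locally Lipschitz,
% g a smooth vector field with compact support in \Omega, u locally Lipschitz.
% Using div(e^{-\psi} g) = e^{-\psi} ( div g - <g, \nabla\psi> ):
\[
\int_{\Omega} \langle \nabla u, g \rangle \, d\mu
  \;=\; -\int_{\Omega} u \,\bigl( \mathrm{div}\, g - \langle g, \nabla \psi \rangle \bigr)\, d\mu .
\]
% The right-hand side makes sense for merely bounded u, which is how the
% left-hand side is interpreted distributionally; the boundedness of
% <g, \nabla\psi> is what keeps the right-hand side under control.
```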

We conclude that the estimates (6.4) and (6.5) hold for the linear operators $P_{t}$ defined using the limiting process $X$ , and we may use these operators in place of the diffusion semi-group in the relevant parts of the proof of Theorem 2.5 (and consequently Theorems 4.5, 4.13 and 5.1), so that the conclusion of these theorems remains valid for $\mu$ as above.
