Exponential Contraction in Wasserstein Distances for Diffusion Semigroups with Negative Curvature
Feng-Yu Wang
Introduction
Let be a -dimensional connected complete Riemannian manifold possibly with a convex boundary . Let be the Riemannian distance. Consider for the Laplace-Beltrami operator and some -vector field such that the (reflecting) diffusion process generated by is non-explosive. Then the associated Markov semigroup is the (Neumann if ) semigroup generated by on . In particular, it is the case when the curvature of is bounded below; that is,
is equivalent to the curvature condition (1.1). Here, is the class of all probability measures on ; is the -Wasserstein distance induced by , i.e.,
where is the class of all couplings of and ; and for a Markov operator on (i.e. is a positivity-preserving linear operator with ),
where for . In some references, is also denoted by . In the sequel we will use to stand for the adjoint operator of in for the invariant probability measure , and hence adopt the notation rather than to avoid confusion. When the curvature is positive (i.e. ), (1.2) implies the -exponential contraction of for
In this paper, we aim to consider the case when (1.1) only holds for some negative constant and to prove the exponential contraction
for some constants . It is crucial that the exponential rate is independent of . Due to the equivalence of (1.1) and (1.2), in the negative curvature case it is essential that .
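For intuition about the distance appearing in these contraction estimates, the following Python sketch computes the Wasserstein distance between two empirical measures on the real line with the same number of atoms. It relies on the standard fact that in dimension one, for the cost |x - y|^p with p at least 1, the monotone (sorted) coupling is optimal. The function name `wasserstein_1d` and the sample data are illustrative assumptions and are not taken from the paper; the sketch does not cover the manifold setting.

```python
def wasserstein_1d(xs, ys, p=1.0):
    """L^p-Wasserstein distance between two empirical measures on the real line.

    Each measure puts mass 1/n on each of its n atoms. In dimension one the
    optimal coupling for the cost |x - y|**p (p >= 1) pairs the sorted atoms.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("need two non-empty samples of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    n = len(xs)
    # Monotone coupling: i-th smallest atom of xs is matched with i-th smallest of ys.
    total = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (total / n) ** (1.0 / p)


# Hypothetical sample data, used only for illustration.
mu_atoms = [0.0, 1.0, 3.0]
nu_atoms = [5.0, 4.0, 2.0]
w1 = wasserstein_1d(mu_atoms, nu_atoms, p=1)
w2 = wasserstein_1d(mu_atoms, nu_atoms, p=2)
```

By Jensen's inequality the distance is nondecreasing in p, which is why a contraction estimate for large p is a stronger statement than the same estimate for p = 1.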
According to , even when is unbounded below, i.e. goes to when for a fixed , the log-Sobolev inequality may still hold, which implies the exponential convergence of in entropy. This suggests that (1.3) may also hold for a class of diffusion semigroups with negative curvature.
for some constants , where is the Dirac measure at point . Indeed, proved the -exponential contraction with respect to a modified distance in place of , as constructed in for estimates of the spectral gap using the coupling by reflection. Under condition (1.4) the modified distance is comparable with the usual one, so that (1.5) follows. As mentioned in , there is an essential difficulty in proving (1.3) for even in this flat case.
In Luo and Wang the estimate (1.5) was extended as
for some constants . Compared with (1.3), which is equivalent to
according to (see Proposition 3.1 below), (1.6) is less sharp for small and/or large . It is open whether (1.4), or in the Riemannian setting that is uniformly positive outside a compact domain, implies (1.3) for some constants .
As in , we will consider the Wasserstein distances induced by Young functions in the class
For any and a measure on , consider the gauge norm in
In particular, we have for . This is the reason why we do not take in the characterization of Legendre conjugates. We extend the notion to by letting and for all . Now, let
In particular, for . We aim to prove the exponential decay
when (1.1) only holds for a negative constant , where is the inverse of and we set by convention.
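To make the gauge norm concrete, the following Python sketch evaluates it for a finitely supported measure. Since the defining display is not reproduced in this text, the sketch assumes the usual Luxemburg form, namely the infimum of all r > 0 such that the integral of Φ(|f|/r) against the measure is at most 1. The function name `gauge_norm` and the test data are hypothetical. With Φ(s) = s^p this reduces to the ordinary L^p norm, which the usage lines check numerically.

```python
import math


def gauge_norm(values, weights, phi, iterations=200):
    """Luxemburg-type gauge norm of f for a discrete measure (assumed form).

    values[i] is f at the i-th atom, weights[i] is the mass of that atom, and
    phi is a Young function: increasing, phi(0) = 0, phi(s) -> infinity.
    Returns inf{r > 0 : sum_i weights[i] * phi(|values[i]| / r) <= 1}.
    """
    pairs = [(abs(v), w) for v, w in zip(values, weights) if v != 0 and w > 0]
    if not pairs:
        return 0.0

    def modular(r):
        return sum(w * phi(a / r) for a, w in pairs)

    # Bracket the threshold: modular is nonincreasing in r.
    hi = 1.0
    while modular(hi) > 1.0:
        hi *= 2.0
    lo = hi
    while modular(lo) <= 1.0:
        lo /= 2.0
    # Bisection keeps modular(hi) <= 1 < modular(lo).
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical data: a three-point probability measure.
f_values = [1.0, -2.0, 3.0]
masses = [0.2, 0.3, 0.5]
l2 = gauge_norm(f_values, masses, lambda s: s ** 2)  # ordinary L^2 norm
l1 = gauge_norm(f_values, masses, lambda s: s)       # ordinary L^1 norm
```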
To extend condition (1.4) to the Riemannian setting, consider the index
where is the Riemannian distance, is the curvature tensor; is the minimal geodesic from to with unit speed; are Jacobi fields along such that
holds for the parallel transform along the geodesic , and () is an orthonormal basis of the tangent space (at points and , respectively).
Note that when , i.e. is in the cut-locus of , the minimal geodesic may not be unique. As a convention in the literature, all conditions on the index are given outside . We now extend condition (1.4) to the non-flat case as follows: for some constants ,
In the flat case we have and , so that this condition reduces back to (1.4). Moreover, the curvature condition (1.1) is equivalent to
so that (1.8) implies
In the next section, we state our main results and present examples. With condition (1.8) we first extend the main results of to the present Riemannian setting and give the exponential convergence of in . Under the ultracontractivity and condition (1.1) for some , our second result ensures the desired inequality (1.7). Finally, we extend these results to SDEs with multiplicative noise by using explicit conditions on the coefficients. To prove these results, we make some preparations in Section 3. Complete proofs of the main results are given in Sections 4-6, respectively.
Main Results and examples
We first consider the Riemannian setting, then extend to SDEs with multiplicative noise by using explicit conditions on the coefficients instead of the less explicit curvature condition.
We start with condition (1.8). Besides the extension of (1.6), this condition also implies the hypercontractivity and the exponential convergence in for the semigroup . For a measure and constants , let stand for the operator norm from to . Recall that is called hypercontractive if it has a unique invariant probability measure and holds for large . By the interpolation theorem, can be replaced by for some
Let hold for some constants and . Then:
There exist two constants such that for any and ,
has a unique invariant probability measure and the log-Sobolev inequality
holds for some constant . Consequently, is hypercontractive.
There exist constants such that
To illustrate this result, we present below a consequence with explicit curvature conditions in the spirit of . These conditions allow to be negative everywhere, for instance, when and for some constants . As indicated in the Introduction, (1.8) implies , so in the following corollary we assume that is bounded below.
Assume that is bounded below. Let for a fixed point . If there exist constants and such that
Next, we introduce sufficient conditions for (1.7) which allow to be negative. For technical reasons, we will need the ultracontractivity of , which is essentially stronger than the hypercontractivity. We call ultracontractive if for all . The ultracontractivity implies that has a density with respect to (called the heat kernel) and
In references (see e.g. ), the ultracontractivity is also defined by for . When is symmetric in we have
so that these two definitions are equivalent. However, when is non-symmetric, the former might be stronger than the latter. The appearance of the ultracontractivity in our study is very natural: by Theorem 2.3(1) we already have (1.7) for (the weakest case), and by the ultracontractivity we are able to deduce the inequality from to (the strongest case). On the other hand, the result also indicates that (1.7) implies the hypercontractivity of .
Assume that is bounded below.
If is ultracontractive, then there exist constants such that for any ,
Consequently, for any and ,
On the other hand, if there exist constants such that
then the log-Sobolev inequality holds for , so that is hypercontractive.
We note that in Theorem 2.3(1) we have for . Indeed, since is bounded below, by [23, Theorem 2.1] the ultracontractivity implies the super log-Sobolev inequality (3.3) below, so that due to Herbst we have for all (see e.g. ). Therefore, for and satisfying
In the symmetric case (i.e. for some ), explicit sufficient conditions for the ultracontractivity have been introduced in by using the dimension-free Harnack inequality in the sense of . Together with a suitable exponential estimate on the diffusion process, this inequality implies for and thus, is ultracontractive due to (2.5). The conditions can be formulated as
where are increasing functions such that
and for some constants and
When Ric is bounded below, (2.11) as well as the second inequality in (2.9) hold for being a large enough constant. In general, since , (2.11) with follows from
Since (2.5) fails for non-symmetric semigroups, we apply the inequality
due to the semigroup property. So, to ensure the ultracontractivity, we need an additional condition implying (see Corollary 2.4(2) below).
To estimate in (2.6) using , we introduce
Obviously, the inverse function exists on , and since is increasing with , we have
Assume that and hold for some constants and
If is symmetric, i.e. for some , then there exist constants such that and hold for
If is non-symmetric but there exists continuous with for such that and
then there exist constants such that holds for
To conclude this part, we present a simple example to illustrate Corollary 2.4.
Let have non-positive sectional curvatures and a pole . Let outside a compact domain, where are constants and is a vector field with
Let be increasing such that
By (2.13), (2.14) and the Hessian comparison theorem, we see that (2.9), (2.10) and (2.12) hold with for some constant . According to Corollary 2.4, there exist constants such that for any ,
2.2 SDEs with multiplicative noise
We intend to investigate the -exponential contraction for . As mentioned in the Introduction, existing results only apply to and , and as mentioned in , there is an essential difficulty in proving (1.3) for even for . So the present study is non-trivial.
Corresponding to that (1.1) implies (1.2) in the Riemannian setting, we have the following assertion.
Note that this result does apply to when is non-constant. Next, as in the Riemannian case, we intend to prove the exponential contraction in when (2.16) only holds for some negative constant . To this end, we need the SDE to be non-degenerate. The following result contains assertions analogous to those in Theorems 2.1 and 2.3, where the first assertion extends (1.5) to the multiplicative noise setting.
Assume that for some constant .
If there exist constants such that and satisfy
then there exist constants such that
Let have a unique invariant probability measure such that the log-Sobolev inequality
holds for some constant . If there exists a constant such that
Combining this with , we see that (2.17) follows from the following more explicit condition:
Note that conditions in Theorem 2.5 and Theorem 2.6(1) are explicit. To illustrate Theorem 2.6(2)-(3), we present below sufficient conditions for the log-Sobolev inequality (2.18) and the ultracontractivity of . For and , we introduce the Christoffel symbols
for some constant . If there exist constants and such that
then has a unique invariant probability measure and there exists a constant such that
We now introduce a simple example to illustrate Theorem 2.6.
then (2.22) holds for some constant . Moreover, it is easy to see that
holds for some constants . By Proposition 2.7 and Theorem 2.6(3), for any , there exist constants such that
Preparations
This section includes some propositions which will be used to prove the results introduced in Section 2. We first recall a link between the Wasserstein distance and gradient estimates due to , then deduce the hyperboundedness and the exponential convergence in entropy from the log-Sobolev inequality for non-symmetric diffusion semigroups, and finally prove the exponential contraction in gradient for ultracontractive semigroups in a general framework including both diffusion and jump Markov semigroups.
Let be a geodesic Polish space, i.e. it is a Polish space and for any two different points , there exists a continuous curve such that and for Then for any , the class of bounded Lipschitz functions on , the length of gradient
is measurable. Moreover, let be a Markov transition kernel and define the Markov operator
For any , consider the Young norm induced by with respect to
and set Then for The following result follows from [16, Theorem 2.2, Remark 2 and Remark 3].
For any constant and , the following statements are equivalent to each other:
for
When for , they are also equivalent to
3.2 Hyperboundedness and exponential convergence in entropy
When is symmetric, it is well known that the hyperboundedness, exponential convergence in entropy and the log-Sobolev inequality are equivalent to each other, see and references within. In the non-symmetric case, the log-Sobolev inequality implies the former two properties if the generator and the symmetric part of the Dirichlet form satisfy
for some constant and a reasonable class of non-negative bounded functions, which is stable under and dense in for any , see e.g. . In applications, it may not be easy to identify a class such that (3.2) holds. But in general this condition can be replaced by the approximation formula in Lemma 3.2 below, in the spirit of .
Now, consider the (Neumann) semigroup generated by for a locally bounded vector field such that has a unique invariant probability measure . Let
Then is dissipative (thus, closable) in with closure generating in , see e.g. and references within. Let
Let and . There exists a sequence with such that in for any , in , and
Since , there exists a uniformly bounded sequence such that and in . By the uniform boundedness, in for any . Since ,
This implies since is -invariant. So, by the dominated convergence theorem,
Let be a locally bounded vector field such that the (Neumann) semigroup generated by has a unique invariant probability measure .
holds for some , then for any constants and such that there holds
(1) According to Lemma 3.2, for any and , there exists such that in for all , and
Applying (3.3) to and using (3.5), we obtain
for Noting that is -invariant (i.e. ) and dense in for any , the desired assertion follows from the proof of [13, Corollary 3.13].
(2) It suffices to prove for with Applying Lemma 3.2 to and , and using (3.4), we obtain
This implies the desired exponential estimate. ∎
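As a sanity check of exponential convergence in entropy, the following Python sketch uses the one-dimensional Ornstein-Uhlenbeck semigroup, where everything is explicit. This is an assumption chosen for illustration and is not an example from the paper: the generator is taken to be the second derivative minus x times the first derivative, the invariant measure is the standard Gaussian, and a Gaussian initial law N(m, v) evolves to N(m e^{-t}, 1 + (v - 1) e^{-2t}). The relative entropy with respect to the invariant measure then decays at least like e^{-2t}, the classical rate for this model; the constants in (3.3) and (3.4) are not reproduced here. The function names are hypothetical.

```python
import math


def kl_to_standard_normal(mean, var):
    """Relative entropy of N(mean, var) with respect to N(0, 1)."""
    return 0.5 * (var + mean * mean - 1.0 - math.log(var))


def ou_law(mean0, var0, t):
    """Mean and variance at time t of dX = -X dt + sqrt(2) dB started from N(mean0, var0)."""
    decay = math.exp(-t)
    return mean0 * decay, 1.0 + (var0 - 1.0) * decay * decay


def entropy_along_flow(mean0, var0, times):
    """Relative entropy to the invariant law N(0, 1) at each time in `times`."""
    return [kl_to_standard_normal(*ou_law(mean0, var0, t)) for t in times]


# Hypothetical initial law N(3, 0.25) and a few sample times.
sample_times = [0.0, 0.1, 0.5, 1.0, 2.0]
entropies = entropy_along_flow(3.0, 0.25, sample_times)
```

For a pure mean shift (initial variance 1) the bound is attained with equality, so the rate e^{-2t} cannot be improved in this model.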
3.3 Exponential contraction in gradient
In this part, we consider a general framework including both diffusion and jump processes. Let be a separable complete probability space, and let be a Markov semigroup on with as invariant probability measure. Let be the generator of in . We assume that there exists an algebra such that
, is dense in and the algebra induced by
gives rise to a non-degenerate positive definite bilinear form on ; i.e., for any , and it equals if and only if is constant.
In particular, when is the (Neumann) semigroup generated by on with bounded below, the assumption holds for
is closable and the closure is a conservative symmetric Dirichlet form. Although is not associated to when it is non-symmetric, we have
If then has a heat kernel with respect to , i.e.
We consider the “gradient” length induced by . Note that for jump processes the length is non-local and thus essentially different from the usual gradient length. As shown below, estimates of are closely linked to functional inequalities of the associated Dirichlet form.
Assume that there exist and such that
Then there exist constants such that for any and , the gradient estimate
for some constants . By the second inequality in (3.7), for any and we have
Integrating both sides over leads to
Taking and noting that is the invariant probability measure of , we obtain
Since is the closure of under the -norm, this inequality also holds for By condition (ii), the symmetric Dirichlet form is irreducible. So, according to [38, Corollary 1.2] the defective Poincaré inequality (3.11) implies the Poincaré inequality
for some constant . By (3.6) and that is dense in , the Poincaré inequality is equivalent to
On the other hand, by the second inequality in (3.7), for any and we have
Using to replace and integrating with respect to , we obtain
Combining this with (3.13) and (3.12) we arrive at
for some constant ; that is, (3.10) holds for Finally, (3.7) implies (3.10) for
(b) Next, we intend to find a constant such that
Indeed, by (3.13) and the first inequality in (3.7), we obtain
where This implies the desired assertion for such that .
(c) Finally, combining (3.7), (3.14), (3.10) and (3.12), we obtain
for some constants . Then (3.9) holds for ∎
Proof of Theorem 2.1
The proofs of the other two assertions are based on the log-Sobolev inequality and the log-Harnack inequality derived in and respectively for bounded below .
(a) For two different points , let be the parallel displacement along the minimal geodesic from to , and let be the mirror reflection. Both maps are smooth in outside the cut-locus . According to and , the appearance of the cut-locus and/or a convex boundary helps the coupling to succeed, i.e. it makes the distance between the two marginal processes smaller. So, for simplicity, we may and do assume that both the cut-locus and the boundary are empty, see [2, Section 3] or [33, Chapter 2] for details.
where denotes the Itô differential introduced in on Riemannian manifolds, is the -dimensional Brownian motion, and is the horizontal lift of to the frame bundle . Then is a diffusion process generated by . To construct the coupling by reflection for short distance and parallel displacement for long distance, we introduce a cut-off function which is decreasing such that for for , and is also in , see e.g. [40, (3.1)] for a concrete example. To construct the coupling in the above spirit, we split the noise into two parts, i.e. to replace by for two independent Brownian motions and , then make reflection for the part and parallel displacement for the part. More precisely, let solve the following SDE on for :
Since the coefficients of the SDE are at least outside the diagonal , it has a unique solution up to the coupling time
We then let for as usual. By the second variational formula and the index lemma (see e.g. the proof of [34, Lemma 2.3] and [29, (2.4)]), the process satisfies
for some one-dimensional Brownian motion . Thus, by condition (1.8),
Since for while when this implies
On the other hand, since for , as observed in we have
for some constants . Indeed, let
which proves (2.1). Therefore, the proof of (1) is finished since the second inequality therein is a simple consequence of (2.1).
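The coupling used in part (a), reflection at short distance and parallel displacement at long distance with the noise split into two independent parts, can be sketched numerically in the flat one-dimensional case. The Python code below is a minimal Euler discretisation under several assumptions that are not taken from the paper: the piecewise linear `cutoff` stands in for the smooth cut-off function, the names `coupled_step` and `coupling_time` are hypothetical, and a sign change of the difference within one step is treated as the coupling time.

```python
import math
import random


def cutoff(r, r0=1.0):
    """Hypothetical cut-off: 1 on [0, r0], 0 on [2*r0, inf), linear in between."""
    return min(1.0, max(0.0, 2.0 - r / r0))


def coupled_step(x, y, drift, dt, g_refl, g_par, h=cutoff):
    """One Euler step of the mixed coupling for dX = drift(X) dt + sqrt(2) dB.

    g_refl and g_par are independent standard normal samples: the first is
    reflected in the second marginal, the second is shared (parallel part).
    """
    w = h(abs(x - y))
    a, b = math.sqrt(w), math.sqrt(1.0 - w)
    s = math.sqrt(2.0 * dt)
    x_new = x + drift(x) * dt + s * (a * g_refl + b * g_par)
    if x == y:
        # After the coupling time the two processes move together.
        return x_new, x_new
    y_new = y + drift(y) * dt + s * (-a * g_refl + b * g_par)
    if (x - y) * (x_new - y_new) <= 0.0:
        # Discretisation convention: a crossing counts as successful coupling.
        y_new = x_new
    return x_new, y_new


def coupling_time(x0, y0, drift, dt=1e-3, n_steps=20000, seed=0):
    """Return the first time the two marginals meet, or None if they never do."""
    rng = random.Random(seed)
    x, y = x0, y0
    for k in range(n_steps):
        x, y = coupled_step(x, y, drift, dt, rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        if x == y:
            return (k + 1) * dt
    return None


# Hypothetical usage with a linear restoring drift.
t_couple = coupling_time(1.0, 0.0, lambda z: -z)
```

With the cut-off identically zero the noise cancels in the difference of the two marginals, so only the drift acts on the distance; with the cut-off identically one the difference receives twice the reflected noise, which is what allows the processes to meet in finite time.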
(b) According to the proofs of [34, Proposition 3.1 and Theorem 1.1], our conditions imply that is hyperbounded; that is, holds for some . Since (1.8) implies , by the hyperboundedness and [23, Theorem 2.1], we have the defective log-Sobolev inequality
for some constants . Since the symmetric Dirichlet form with domain is irreducible, according to (see also ), the log-Sobolev inequality (3.4) holds for some constant , so that (2) is proved.
(c) According to [25, Theorem 1.10] (see for the case without boundary), the log-Sobolev inequality implies the Talagrand inequality
Next, let be the adjoint of in . By Proposition 3.3 for in place of , the log-Sobolev inequality implies
Moreover, according to [36, Theorem 1.1], the curvature condition is equivalent to the log-Harnack inequality
By [39, Proposition 1.4.4(3)], this implies
Combining (4.4), (4.5) and (4.6), we obtain
for some constant . Noting that implies (see e.g. ), by Proposition 3.1 we have
for some constants . Therefore, the proof of (3) is finished. ∎
Proof of Theorem 2.3 and Corollary 2.4
(1) Since for some constant , we have (see e.g. )
Combining this with Proposition 3.4 for and noting that is continuous, we obtain
for some constants . Obviously, (3.1) implies
According to Proposition 3.1, this is equivalent to
Combining this with (5.1) and the semigroup property, we arrive at
This together with (5.1) implies (2.6) for some constants Moreover, (2.7) follows from (2.6) according to Proposition 3.1.
Then using the standard semigroup calculation of Bakry-Emery, this implies
Since for due to the ergodicity, by letting we prove the log-Sobolev inequality (3.4) for ∎
We first observe that the proof of [34, Theorem 4.2] works also for the non-symmetric case with in place of , so that
Since in the symmetric case we have , the first assertion follows immediately from Theorem 2.3.
by Theorem 2.3 and (5.2) it suffices to prove
for some constant According to [23, Theorem 2.1], (5.2) implies the super log-Sobolev inequality (3.3) for
for some (possibly different) constant . Then Proposition 3.3 with and implies (5.3).
Proofs of Theorems 2.5-2.6 and Proposition 2.7
Let solve (2.15) with initial point . By Itô’s formula and condition (2.16) we obtain
for some martingale . This implies
Then the desired assertion follows from Proposition 3.1. ∎
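The mechanism behind this proof, two solutions driven by the same noise whose distance is controlled through Itô's formula, can be illustrated with a short Python sketch. The displays (2.15) and (2.16) are not reproduced in this text, so the sketch assumes the usual form dX = b(X) dt + σ(X) dB and, for the usage example, additive noise with the linear drift b(x) = -Kx for a positive constant K. These choices, and the name `synchronous_coupling`, are illustrative assumptions only. In this special case the noise cancels in the difference of the two Euler schemes, so the gap contracts deterministically.

```python
import math
import random


def synchronous_coupling(x0, y0, drift, sigma, dt, n_steps, seed=0):
    """Euler scheme for two copies of dX = drift(X) dt + sigma(X) dB with shared noise.

    Returns the list of distances |X_k - Y_k| for k = 0, ..., n_steps.
    """
    rng = random.Random(seed)
    x, y = x0, y0
    gaps = [abs(x - y)]
    for _ in range(n_steps):
        dB = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # same increment for both copies
        x, y = (x + drift(x) * dt + sigma(x) * dB,
                y + drift(y) * dt + sigma(y) * dB)
        gaps.append(abs(x - y))
    return gaps


# Hypothetical parameters: K = 1.5, unit additive noise, step 0.01.
K, step = 1.5, 0.01
gaps = synchronous_coupling(2.0, 0.0, lambda z: -K * z, lambda z: 1.0, step, 200)
```

Each step multiplies the gap by 1 - K·dt, which is bounded by exp(-K·dt); this is the discrete counterpart of a contraction with constant c = 1 in the positive curvature case.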
where and are independent -dimensional Brownian motions. For any , let solve this SDE with , and let solve the following coupled SDE with :
That is, under the flat metric we have made coupling by reflection for and coupling by parallel displacement for . Obviously, the coupled SDE has a unique solution up to the coupling time
We set for as usual. Then by (2.17) and Itô’s formula, we obtain
By repeating the argument leading to (4.3), it is easy to see that (6.3) and (6.4) imply
for some constants independent of . Therefore,
so that the first assertion follows from Proposition 3.1.
(2) According to [37, Theorem 1.1], and (2.19) imply the log-Harnack inequality
for some constants . Combining this with the log-Sobolev inequality, we prove the second assertion as in (c) in the proof of Theorem 2.1.
(3) According to the proof of Theorem 2.5, the condition (2.16) implies the gradient estimate (6.1). Next, by Proposition 3.4, the ultracontractivity and (6.1) imply
for some and independent of . Then the proof is finished by Proposition 3.1. ∎
We will apply results in and . To this end, we introduce the Riemannian metric
and let be the corresponding Laplacian, gradient and Hessian tensor respectively. Then for some vector field . We first verify the Bakry-Emery curvature condition (1.1) for some constant . Using the Christoffel symbols, the intrinsic Hessian tensor induced by is formulated as
Thus, by Bochner-Weitzenböck formula and (2.22), at point there holds
for some constant . Then (1.1) hold for some constant .
Next, (2.23) implies that has a unique invariant probability measure such that for some . By our assumption on , the Riemannian distance induced by the metric is equivalent to the Euclidean metric:
Then we may repeat the proof of [23, Corollary 2.5] with and to prove
for some constant Combining this with the curvature condition (1.1), we obtain from [23, Theorem 2.1] for and that
holds for some constant . Applying Proposition 3.3 for and for constants such that , we obtain
for some constant . Combining this with (6.6) we arrive at
The author would like to thank Jian Wang for helpful comments.