Exponential Convergence in $L^p$-Wasserstein Distance for Diffusion Processes without Uniformly Dissipative Drift

Dejun Luo, Jian Wang

Introduction

In this paper we consider the following Itô stochastic differential equation

Yet there is another approach for studying exponential convergence of the semigroup $(P_{t})_{t\geq 0}$ corresponding to the SDE (1.1) considered in this paper, namely the coupling method. If the drift vector field $b$ fulfills certain dissipative properties, this latter method provides an explicit rate of convergence to equilibrium in a straightforward way, see e.g. and the preprint . The present work is motivated by where the author obtained exponential decay of $(P_{t})_{t\geq 0}$ when the drift $b$ is assumed to be dissipative only at infinity, see the introduction below for more details.

In this paper, we shall establish the exponential convergence of the map $\mu\mapsto\mu P_{t}$ with respect to the $L^{p}$-Wasserstein distance $W_{p}$ for all $p\geq 1$. We first recall some known results.

Suppose that $\sigma=\textup{Id}$ and there exists a constant $K>0$ such that

The proof of this result is quite straightforward, by simply using the synchronous coupling (also called the coupling of marching soldiers in [7, Example 2.16]), see e.g. [2, p.2432] for a short proof.
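For the reader's convenience, the synchronous-coupling argument can be sketched as follows. This is only a sketch under the assumption that $\sigma=\textup{Id}$ and that the dissipativity condition takes the usual form $\langle b(x)-b(y),x-y\rangle\leq-K|x-y|^{2}$; here $X_{t}$ and $Y_{t}$ denote the solutions started from $x$ and $y$ and driven by the same Brownian motion.

```latex
% Synchronous coupling: the noise cancels in the difference X_t - Y_t.
% Assumed condition: <b(x)-b(y), x-y> <= -K |x-y|^2 for all x, y.
\begin{align*}
\mathrm{d}(X_t - Y_t) &= \bigl(b(X_t) - b(Y_t)\bigr)\,\mathrm{d}t, \\
\frac{\mathrm{d}}{\mathrm{d}t}|X_t - Y_t|^2
  &= 2\langle X_t - Y_t,\, b(X_t) - b(Y_t)\rangle \le -2K|X_t - Y_t|^2, \\
|X_t - Y_t| &\le e^{-Kt}|x - y| \qquad\text{(Gronwall's lemma)}, \\
W_p(\delta_x P_t, \delta_y P_t)
  &\le \bigl(\mathbb{E}|X_t - Y_t|^p\bigr)^{1/p} \le e^{-Kt}|x - y|,
  \qquad p \ge 1.
\end{align*}
```

The bound on $|X_{t}-Y_{t}|$ is pathwise, which is why the same rate is obtained for every $p\geq 1$ in this uniformly dissipative setting.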

As in [13, (2.3)], we shall assume throughout the paper that

This technical condition will be used in Section 2 to construct the auxiliary function.

Suppose that the vector field $b$ is locally Lipschitz continuous, and there is a constant $c>0$ such that

In particular, when $\sigma=\textup{Id}$, the condition (1.5) holds true if (1.2) is satisfied only for large $|x-y|$, that is,

for some constant $L>0$ large enough. Therefore, [13, Corollary 2.3] implies that the map $\mu\mapsto\mu P_{t}$ converges exponentially with respect to the standard $L^{1}$-Wasserstein distance $W_{1}$ under locally non-dissipative drift, see [13, Example 1.1] for more details. The proof of [13, Corollary 2.3] is based on the coupling by reflection of diffusion processes and a carefully constructed concave function, cf. [13, Section 2]. A number of direct consequences are presented in [13, Section 2.2], which indicate that a convergence result such as [13, Corollary 2.3] is extremely useful.
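To illustrate the coupling by reflection numerically, the following is a minimal Python sketch. It assumes dimension one, $\sigma=1$, an Euler–Maruyama discretization, and a hypothetical double-well drift $b(x)=x-x^{3}$, which is dissipative only for large $|x|$. The drift, the step size and all function names are illustrative choices and are not taken from [13]. In dimension one the reflected process is simply driven by $-B_{t}$ until the two paths meet.

```python
import math
import random


def drift(x):
    # Hypothetical drift b(x) = x - x^3: repelling near 0, dissipative at infinity.
    return x - x ** 3


def reflection_coupling_path(x0, y0, t_max=20.0, dt=0.01, rng=None):
    """Simulate a 1-D reflection coupling and return |X_t - Y_t| on the time grid."""
    rng = rng or random.Random()
    x, y = x0, y0
    coupled = (x == y)
    dists = [abs(x - y)]
    for _ in range(int(t_max / dt)):
        db = rng.gauss(0.0, math.sqrt(dt))
        x_new = x + drift(x) * dt + db
        if coupled:
            # After the coupling time the two processes move together.
            y_new = x_new
        else:
            # Reflected noise: Y is driven by -dB while X is driven by +dB.
            y_new = y + drift(y) * dt - db
            # A sign change of X - Y means the paths crossed during this step.
            if (x_new - y_new) * (x - y) <= 0.0:
                coupled = True
                y_new = x_new
        x, y = x_new, y_new
        dists.append(abs(x - y))
    return dists


def coupled_fraction(n_paths=200, seed=0):
    """Fraction of simulated pairs that have coupled by time t_max."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        if reflection_coupling_path(1.5, -1.0, rng=rng)[-1] == 0.0:
            hits += 1
    return hits / n_paths
```

Calling `coupled_fraction()` gives a rough empirical impression of how quickly the two marginals merge even though the drift is not dissipative near the origin; it is a numerical illustration only and does not replace the estimates in [13].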

However, [13, Corollary 2.3] is not satisfactory in the sense that no information on the $L^{2}$-Wasserstein distance $W_{2}$ is provided. This point has also been raised in [5, Section 7.1, Remark 19], saying that “the reflection coupling cannot furnish some information on $W_{2}$”. The main result of this paper shows that this is not the case.

Assume that there are constants $c>0$ and $\eta\geq 0$ such that for all $r\geq\eta$ , one has

where $C:=C(c,\eta,\varepsilon,p)>0$ is a positive constant.

Theorem 1.3 above provides new conditions on the drift term $b$ under which the associated semigroup $(P_{t})_{t\geq 0}$ is exponentially convergent with respect to the $L^{p}$-Wasserstein distance $W_{p}$ for all $p\geq 1$. We can obtain the exponential convergence in $W_{p}$ for all $p\geq 1$, not only in $W_{1}$, thanks to our particular choice of the auxiliary function, which is convex near infinity. It is designed by using Chen–Wang’s famous variational formula for the principal eigenvalue of one-dimensional diffusion operators, see for instance or [7, Section 3.4]. The reader can refer to and the references therein for recent studies on this topic.

The assertion of Theorem 1.3 can be slightly strengthened if (1.6) is replaced by a stronger condition.

Assume that there are constants $c>0$ , $\eta>0$ and $\theta>1$ such that for all $r\geq\eta$ , one has

The idea of the proof is to use the synchronous coupling for large $|x-y|$ and the coupling by reflection for small $|x-y|$. For the latter part, we can directly use the result of Theorem 1.3, since (1.8) implies that (1.6) holds with the constant $c\eta^{\theta-1}$ in place of $c$ if $\eta>0$.

Before presenting the consequences of Theorem 1.3, let us first make some comments and give some examples. In the beginning, we intended to generalize Eberle’s results by assuming that there is a constant $c>0$ such that

It turns out that under mild conditions on $\kappa$ , the two conditions (1.5) and (1.10) are equivalent, up to changing the constants. More explicitly, we have

Assume that there are constants $c,r_{0}>0$ such that $\kappa(r_{0})\leq-c$ and $\delta_{0}:=\sup_{0\leq r\leq r_{0}}\kappa(r)<+\infty$ . Then, the condition (1.5) holds with some other positive constant.

This result indicates that if the function $\kappa$ is locally bounded from above, then the following statements are equivalent:

there exist constants $c,r_{0}>0$ such that $\kappa(r_{0})\leq-c$ ;

there exist constants $c>0$ and $\theta\leq 1$ such that $\kappa(r)\leq-cr^{\theta}$ for $r>0$ large enough;

there exists a constant $c>0$ such that $\kappa(r)\leq-cr$ for $r>0$ large enough.

The equivalence stated above also indicates that Theorem 1.3 is sharp in some situations, as shown by the next example.

Assume that $\sigma=\textup{Id}$ and $b(x)=\nabla V(x)$ with $V(x)=-(1+|x|^{2})^{\delta/2}$ for some $\delta\in(0,2)$ . Then, we have the following statements.

If $\delta\in(0,1)$ , then $\kappa(r)\geq 0$ for all $r$ large enough, and the inequality (1.7) does not hold for any positive constants $C$ and $\lambda$ with $p=1$ .

where $d_{TV}$ is the total variation distance between probability measures.

To show the power of Theorem 1.4, we consider the following example which yields the exponential convergence of the semigroup $(P_{t})_{t\geq 0}$ with respect to the $L^{p}$ -Wasserstein distance $W_{p}$ $(p>2)$ for super-convex potentials. The assertion below improves the results mentioned in [5, Section 6.1].

Let $\sigma=\textup{Id}$ and $b(x)=\nabla V(x)$ with $V(x)=-|x|^{2\alpha}$ and $\alpha>1$ . It follows from [5, Section 6, Example 1] that there is a constant $c>0$ such that for all $r>0$ ,

holds with some constant $C:=C(\alpha,p)>0$ .

As applications of Theorem 1.3, we consider the existence and uniqueness of the invariant probability measure, and also the exponential convergence of the semigroup with respect to the $L^{p}$ -Wasserstein distance $W_{p}$ . For $p\in(1,\infty)$ , we define

holds with some positive constant $C:=C(c,p)$ .

Under the assumptions of Corollary 1.8, it is easy to establish the following Foster-Lyapunov type conditions:

where $P(t,\cdot)$ is the associated transition probability.

As was pointed out by the referee, the conclusion of Corollary 1.8 for $p=1$ could also be deduced from Theorem 1.4 and the Foster-Lyapunov type condition above, by Harris’s theorem for the exponential convergence to the invariant measure in the Wasserstein metric. See [17, Theorem 4.8] and [4, Theorem 2.4] for more details.

The following statement is concerned with symmetric diffusion processes. Though we believe the assertion below is known (see e.g. [25, Corollary 1.4]), we stress the relation between the exponential convergence with respect to $L^{1}$ -Wasserstein distance $W_{1}$ and that with respect to the $L^{2}$ -norm, which is equivalent to the Poincaré inequality.

then $\mu$ satisfies the Poincaré inequality, i.e.

Preliminaries

Similar to the main result in , the proof of Theorem 1.3 is based on the reflection coupling of Brownian motion, which was introduced by Lindvall and Rogers and developed by Chen and Li . First, we give a brief introduction to the coupling by reflection. Together with (1.1), we also consider

where $(W_{t})_{0\leq t<T}$ is a one dimensional Brownian motion expressed by $W_{t}=\int_{0}^{t}\langle e_{s},\textup{d}B_{s}\rangle$ .
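The claim that $(W_{t})_{0\leq t<T}$ is a one-dimensional Brownian motion can be checked via Lévy's characterization. The sketch below assumes, as is usual for the coupling by reflection, that $e_{s}$ is an adapted unit vector for $s<T$.

```latex
% Assumption: e_s is adapted and |e_s| = 1 for all s < T.
\begin{align*}
W_t &= \int_0^t \langle e_s, \mathrm{d}B_s\rangle
  \quad\text{is a continuous local martingale with } W_0 = 0, \\
\langle W\rangle_t &= \int_0^t |e_s|^2\,\mathrm{d}s = t,
  \qquad 0 \le t < T.
\end{align*}
% Hence, by Levy's characterization, W is a one-dimensional Brownian motion on [0, T).
```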

Next, by Itô’s formula and (2.2), for $t<T$ ,

Denote by $r_{t}=|\sigma^{-1}Z_{t}|$ . Then

Auxiliary Function

For any $\varepsilon\in(0,c)$ , let $\psi\in C^{2}([0,\infty))$ be the following strictly increasing function

The definition of $\psi$ may seem a little strange at first sight; in fact, it is motivated by Chen–Wang’s variational formula for the principal eigenvalue of the one-dimensional diffusion operator

on $[0,\infty)$. One key to this famous formula is the following elegant mimic eigenfunction $g$ associated with the first eigenvalue (see [7, pp. 52–53] for a heuristic argument):

Now, let $b(r)=-cr$ in the definition of the function $g$ above. Then we have

for some proper choice of the constant $\varepsilon$ in the definition of $\psi$ .

On the one hand, it is easy to see that for any $r>0$ ,

We first show that $\psi_{1}(r)$ and $\psi_{2}(r)$ are comparable.

There exist two constants $C_{0}:=C_{0}(\varepsilon)$ and $\hat{C}_{0}:=\hat{C}_{0}(\varepsilon)>0$ such that for all $r\geq 0$ ,

Note that $r(1+r)\sim r$ as $r\to 0$ and $r(1+r)\sim r^{2}$ as $r\to\infty$, where $\sim$ means that the two quantities are of the same order. By L’Hôpital’s rule,

Therefore, the required assertion follows from the two limits above. ∎

Roughly speaking, the above two inequalities imply that the auxiliary function $\psi$ behaves like $c^{\prime}r$ for small $r$ , and grows exponentially fast as $e^{c^{\prime\prime}r^{2}}$ for large $r$ . Hence, the function $\psi(r)$ can be used to control the function $r^{p}$ with $p\geq 1$ . More explicitly, we have

There is a constant $C_{1}>1$ such that for all $r\geq 0$ ,

Consequently, for any $p\geq 1$ , there is a constant $C_{2}=C_{2}(p,\varepsilon)>0$ such that for all $r\geq 0$ ,

Furthermore, we can give an explicit estimate of the constant $C_{0}$ in Lemma 2.2, which enters the exponential convergence rate.

The constant $C_{0}$ in Lemma 2.2 has the following expression:

In order to estimate $\psi_{1}(r)$, we need the following inequality on the tail of the standard Gaussian distribution (see e.g. [11, (3)]):

where $\Phi(r)$ and $\phi(r)$ are respectively the distribution and density function of the standard Gaussian distribution $N(0,1)$ . Consequently, for $s>0$ ,

Substituting this estimate into the expression of $\psi_{1}$ leads to

Next, we consider two cases. (i) If $r\leq 2/\sqrt{\varepsilon}$ , then

Substituting this estimate into (2.10) yields

Summarizing the above two cases and using (2.9), we obtain

it is easy to show that for all $r\in[0,2/\sqrt{\varepsilon}\,]$ , it holds

On the other hand, for $r>2/\sqrt{\varepsilon}$ , we have

Combining the above two inequalities, we deduce that for all $r>2/\sqrt{\varepsilon}$ ,

Having the two inequalities (2.12) and (2.13) in hand, and using (2.11), we complete the proof. ∎
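The proof above relies on a Gaussian tail inequality. As a numerical sanity check of this kind of estimate, the following Python sketch verifies the classical Mills-ratio bounds $\frac{r}{1+r^{2}}\phi(r)\leq 1-\Phi(r)\leq\frac{\phi(r)}{r}$ for $r>0$. These classical bounds are used here only for illustration and are not claimed to be the exact inequality of [11, (3)]; the function names are hypothetical.

```python
import math


def gaussian_density(r):
    # Density phi(r) of the standard Gaussian distribution N(0, 1).
    return math.exp(-0.5 * r * r) / math.sqrt(2.0 * math.pi)


def gaussian_tail(r):
    # Tail 1 - Phi(r), computed through the complementary error function.
    return 0.5 * math.erfc(r / math.sqrt(2.0))


def mills_bounds(r):
    """Return (lower bound, exact tail, upper bound) for r > 0."""
    phi = gaussian_density(r)
    lower = r / (1.0 + r * r) * phi
    upper = phi / r
    return lower, gaussian_tail(r), upper
```

For instance, `mills_bounds(2.0)` returns three increasing numbers, and the gap between the two bounds shrinks as $r$ grows, which is the regime relevant for the behaviour of $\psi_{1}(r)$ at infinity.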

We also need the following simple result.

If $0\leq r\leq 1$ , then, using the fact that

The required assertion follows immediately from these two conclusions. ∎

Finally, we present a consequence of all the previous results in this part.

Let $\psi$ be the function defined by (2.4), and $\lambda$ be the constant in Theorem 1.3. Then, for all $r>0$ ,

By (2.5), we deduce from Lemmas 2.4 and 2.5 that

where the last inequality follows from (2.6). This, along with the definition of the constant $\lambda$ in Theorem 1.3, yields the required assertion.∎

Proofs of Theorems and Proposition

Let $\psi$ be the function defined by (2.4). According to (2.3) and Itô’s formula, it holds that

Combining this inequality with (2.15), we obtain

where $\lambda$ is the constant in Theorem 1.3.

Then, for $t\leq T_{n}$ , the inequality (3.1) yields

Taking expectations on both sides of the inequality above leads to

Since the coupling process $(X_{t},Y_{t})_{t\geq 0}$ is non-explosive, we have $T_{n}\uparrow T$ a.s. as $n\to\infty$ , where $T$ is the coupling time. Thus by Fatou’s lemma, letting $n\to\infty$ in the above inequality gives us

Thanks to our convention that $Y_{t}=X_{t}$ for $t\geq T$ , we have $r_{t}=0$ for all $t\geq T$ . Therefore,

If $|\sigma^{-1}(x-y)|\leq\eta$ , then for any $p\geq 1$ and any $t>0$ , we deduce from (2.8), (3.3) and (2.7) that

for some constant $C_{6}>1$ . Therefore, if $|x-y|\leq\eta/C_{6}$ , then for any $p\geq 1$ there is a constant $C_{7}>0$ such that

Set $x_{i}=x+i(y-x)/n$ for $i=0,1,\ldots,n$ . Then $x_{0}=x$ and $x_{n}=y$ ; moreover, (3.6) implies $|x_{i-1}-x_{i}|=|x-y|/n\leq\eta/C_{6}$ for all $i=1,2,\ldots,n$ . Therefore, for all $p\geq 1$ , by (3.5),

where in the last inequality we have used $n\leq 2C_{6}|x-y|/\eta$ . The proof of Theorem 1.3 is completed. ∎
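The two properties of $n$ used in this last step can be verified as follows. The sketch assumes $|x-y|>\eta/C_{6}$ and takes $n$ to be the smallest integer not less than $C_{6}|x-y|/\eta$, which is one natural choice; the argument only needs the two bounds displayed below.

```latex
% Assumption: |x-y| > \eta / C_6, and n is the ceiling of C_6 |x-y| / \eta.
\begin{align*}
n &:= \bigl\lceil C_6|x-y|/\eta \bigr\rceil
  \;\Longrightarrow\; \frac{|x-y|}{n} \le \frac{\eta}{C_6}, \\
n &< \frac{C_6|x-y|}{\eta} + 1 < \frac{2C_6|x-y|}{\eta}
  \qquad\text{since } \frac{C_6|x-y|}{\eta} > 1, \\
W_p(\delta_x P_t, \delta_y P_t)
  &\le \sum_{i=1}^{n} W_p(\delta_{x_{i-1}} P_t, \delta_{x_i} P_t)
  \qquad\text{(triangle inequality for } W_p\text{)}.
\end{align*}
```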

and $T=\inf\{t>0:X_{t}=Y_{t}\}$ is the coupling time. For $t\geq T$ , we still set $Y_{t}=X_{t}$ . Therefore, the difference process $(Z_{t})_{t\geq 0}=(X_{t}-Y_{t})_{t\geq 0}$ satisfies

Still denoting by $r_{t}=|\sigma^{-1}Z_{t}|$ , we get

where in the first inequality we have used (3.4), and the last inequality follows from (3.8). In particular, we have for all $|\sigma^{-1}(x-y)|>\eta$ and $t>t_{0}$ ,

Combining all the conclusions above, we complete the proof of Theorem 1.4. ∎

Next, since ${r}/{r_{0}}\leq n_{0}+1$ , we have

which implies $-2cn_{0}\leq 2c-2c{r}/{r_{0}}$ . Therefore

for all $r\geq r_{0}(\delta_{0}+2c)/c$ , and so (1.5) holds with the new constant $c/2r_{0}$ . ∎

Proofs of Examples and Corollaries

Now the result follows immediately from the fact that $V^{\prime}(x)$ is strictly increasing when $|x|$ is large enough. Indeed, we have

which is positive if $|x|>(1-\delta)^{-1/2}$.
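For completeness, the computation behind this claim can be written out in the one-dimensional case, assuming $V(x)=-(1+x^{2})^{\delta/2}$ with $\delta\in(0,1)$ as in the example.

```latex
% One-dimensional case, V(x) = -(1+x^2)^{\delta/2} with 0 < \delta < 1.
\begin{align*}
V'(x)  &= -\delta x\,(1+x^2)^{\delta/2-1}, \\
V''(x) &= -\delta(1+x^2)^{\delta/2-1}
          - \delta(\delta-2)\,x^2(1+x^2)^{\delta/2-2} \\
       &= \delta(1+x^2)^{\delta/2-2}\bigl[(1-\delta)x^2 - 1\bigr].
\end{align*}
% The bracket is positive exactly when (1-\delta) x^2 > 1.
```

In particular, $V^{\prime}$ is strictly increasing on each of the two half-lines where $(1-\delta)x^{2}>1$.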

On the other hand, it is easy to see that with the choices of $\sigma$ and $b$ above, the semigroup $(P_{t})_{t\geq 0}$ is symmetric with respect to the probability measure $\mu(\textup{d}x)=\frac{1}{Z_{V}}e^{V(x)}\,\textup{d}x$ . Then, according to (the proof of) Corollary 1.10 below, we know that $\mu(\textup{d}x)$ fulfills the Poincaré inequality (1.15) if (1.7) is satisfied with $p=1$ ; however, this is impossible, see e.g. [27, Example 4.3.1 (3)].

which implies $V(x)$ is strictly concave. Hence $\kappa(r)\leq 0$ for all $r\geq 0$ . On the other hand, to show that $\kappa(r)\geq 0$ for all $r\geq 0$ , as in the proof of (1), we simply look at the one dimensional case:

Then, the assertion (1.11) immediately follows from (the proof of) Theorem 1.1, by simply using the synchronous coupling, see e.g. [2, p.2432].

Finally we prove the algebraic convergence rate (1.12). For this, we mainly follow [9, Section 5] or [5, Section 7.2] (see also [21, Theorem 1.1]). By (2.3),

where $T$ is the coupling time of the coupling process $(X_{t},Y_{t})_{t\geq 0}$ , and $(W_{t})_{t\geq 0}$ is the same one-dimensional Brownian motion as in (2.2). Hence,

Let $W_{t}^{\ast}=\inf_{0\leq s\leq t}W_{s}$, which has the same law as $-|W_{t}|$. Thus for any $t>0$,

In particular, by the definition of total variation distance,
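The identity in law between the running minimum $W_{t}^{\ast}$ and $-|W_{t}|$ used in this proof follows from the reflection principle. It can be checked numerically with the following Python sketch, which compares a Monte Carlo estimate of $\mathbb{E}\,W_{t}^{\ast}$ with the exact value $\mathbb{E}(-|W_{t}|)=-\sqrt{2t/\pi}$. The grid size, the number of paths and the function names are illustrative assumptions, and the discrete-time minimum slightly overestimates the continuous one.

```python
import math
import random


def running_min_mean(t=1.0, n_steps=500, n_paths=2000, seed=0):
    """Monte Carlo estimate of E[min_{s <= t} W_s] on a uniform time grid."""
    rng = random.Random(seed)
    sd = math.sqrt(t / n_steps)  # standard deviation of one Brownian increment
    total = 0.0
    for _ in range(n_paths):
        w, w_min = 0.0, 0.0
        for _ in range(n_steps):
            w += rng.gauss(0.0, sd)
            w_min = min(w_min, w)
        total += w_min
    return total / n_paths


def exact_mean(t=1.0):
    # E[-|W_t|] = -sqrt(2 t / pi) for a standard Brownian motion W.
    return -math.sqrt(2.0 * t / math.pi)
```

Comparing `running_min_mean()` with `exact_mean()` should show agreement up to Monte Carlo and discretization error, consistent with the reflection principle.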

Finally we present the proofs of the two corollaries of Theorem 1.3.

The inequality above also yields the uniqueness of the invariant measure.

This implies that (e.g. see [6, Theorem 5.10] or [24, Theorem 5.10])

holds for any $t>0$ and any Lipschitz continuous function $f$ , where $\|f\|_{\textrm{Lip}}$ denotes the Lipschitz semi-norm with respect to the Euclidean norm $|\cdot|$ .

Replacing $f$ with $P_{t}f$ in the equality above, we arrive at

Next, we follow the proof of [23, Lemma 2.2] to show that the inequality above yields the desired Poincaré inequality. Indeed, let $f$ satisfy $\mu(f)=0$ and $\mu(f^{2})=1$. By the spectral representation theorem, we have

where in the inequality above we have used Jensen’s inequality. Thus,

which is equivalent to the desired Poincaré inequality, see e.g. [27, Theorem 1.1.1]. ∎

Acknowledgements. The authors are grateful to the referee for his valuable comments which helped to improve the quality of the paper.
