The computation of averages from equilibrium and nonequilibrium Langevin molecular dynamics

Benedict Leimkuhler, Charles Matthews, Gabriel Stoltz

Introduction

A fundamental purpose of molecular simulation is the computation of macroscopic quantities, typically through averages of functions of the variables of the system with respect to a given probability measure $\mu$ which defines the macroscopic state of the system. We consider systems described by a separable Hamiltonian

where $q=(q_{1},\dots,q_{N})$ and $p=(p_{1},\dots,p_{N})$ respectively are the vectors of positions and momenta of $N$ particles in dimension $d$ , $V$ is a potential energy function and $M$ is a positive definite mass matrix, typically a diagonal matrix.

The Hamiltonian (1) represents a fully classical molecular dynamics model. For instance, a fluid of $N$ argon atoms is well described by pairwise interactions among the nuclei, where the potential $V(q)=\sum_{1\leqslant i<j\leqslant N}v(|q_{i}-q_{j}|).$ The distance based potential $v(r)$ may be fitted to Buckingham or Lennard-Jones forms (for instance, see Frenkel & Smit (2001) or Allen & Tildesley (1989)). These short-ranged potentials model van der Waals type interactions including both Pauli repulsion (the inability of the populated electron shells to interpenetrate) and dispersion due to temporary dipoles forming in the charge clouds surrounding the nuclei. In more complicated molecular systems, other potential energy functions are used to capture local covalent bond structure and Coulombic interactions due to charges on the atoms. Coarse-grained classical models may amalgamate several degrees of freedom, as for example when a molecule is replaced by a rigid body description. Classical molecular dynamics models are now a standard and widespread tool in almost every field of science and engineering. For example, see Schulz et al. (2004) for some applications in engineering, Durrant & McCammon (2011) for a discussion of the use of molecular dynamics in drug discovery and see also the motivation provided in classical textbooks on molecular simulation such as Allen & Tildesley (1989); Frenkel & Smit (2001); Schlick (2002); Tuckerman (2010).

$\mathcal{V}$ being the physical volume of the box occupied by the fluid. By studying the variation in pressure with changes in a thermodynamic parameter (temperature or density), one may obtain part of the phase diagram of the material. Other observables may be used to model the determination of molecular form (shape and size) or structural rearrangement under different ambient conditions. It is for instance increasingly common to use molecular dynamics in biology to reveal allosteric mechanisms related to protein function or drug binding; in such cases the observable may measure the distance between two particular groups of atoms or their relative alignment; see Durrant & McCammon (2011) for examples and further references contained therein.

One of the most popular choices of SDE system for sampling purposes is Langevin dynamics, which is given by:

We will also be interested in nonequilibrium situations where a given system is subject to nonconservative driving and dissipative perturbations. In this case, the averages may be taken with respect to a stationary distribution which has no simple functional form. The simulation of nonequilibrium systems in their steady-states is one popular way to compute transport coefficients such as the thermal conductivity or the shear viscosity, as the linear response of an appropriate average property (see for instance Evans & Morriss (2008); Tuckerman (2010)). We discuss a specific example in Section 3: the computation of the mobility of a particle, which measures the tendency of the particle to flow in the direction of an external forcing. The mobility is related to the self-diffusion through Einstein’s relation (see (50) below).

The aim of this work is to provide a numerical analysis of the perfect sampling bias in Langevin dynamics arising from numerical schemes obtained by a splitting strategy, building on studies such as Talay (2002) or Bou-Rabee & Owhadi (2010), and clarifying the sampling properties of recently proposed schemes (see Skeel & Izaguirre (2002); Melchionna (2007); Bussi & Parrinello (2007); Thalmann & Farago (2007); Leimkuhler & Matthews (2013a)). Of particular interest is the behavior of methods in the overdamped limit $\gamma\to+\infty$ and variations of Langevin dynamics incorporating nonequilibrium forcings such as the addition of non-gradient forces (in which case the invariant measure is unknown). The idea behind splitting schemes for stochastic differential equations is to decompose the generator of the dynamics into a sum of generators associated with dynamics which are analytically integrable, or at least very simple to integrate. We refer to the individual splitting terms of the dynamics as “elementary dynamics” in the sequel. One example in the context of Langevin dynamics is the splitting scheme based on a symplectic integration of the Hamiltonian part of the dynamics combined with an exact treatment of the fluctuation-dissipation part. Such methods are more convenient to implement in molecular simulation codes than the implicit schemes proposed in Talay (2002) or Mattingly et al. (2002), and are also efficient in practice (see Leimkuhler & Matthews (2013b)). Some essential elements of the numerical analysis on the accuracy of such splitting schemes have been provided in Bou-Rabee & Owhadi (2010).

Let us emphasize that we expect our results to hold for unbounded position spaces, under appropriate assumptions on the potential energy function. Our proofs may however require non-trivial modifications, using in particular the tools and the results from Mattingly et al. (2002); Talay (2002); Bou-Rabee & Owhadi (2010); Kopec (2013). Generalizations to other dynamics similar to Langevin dynamics such as Generalized Langevin Dynamics (see Mori (1965); Zwanzig (1973)), Dissipative Particle Dynamics (see Hoogerbrugge & Koelman (1992); Espanol & Warren (1995)) or Nosé-Hoover-Langevin dynamics (see Samoletov et al. (2007); Leimkuhler et al. (2009)) are also possible, although a rigorous extension would require substantial work in view of the estimates needed involving the generator of the dynamics for instance (see the discussion in Remark 4.6).

In practice, since Langevin dynamics is discretized, averages computed along a single trajectory converge to averages with respect to a measure $\mu_{\gamma,{\Delta t}}$ , which is an approximation to $\mu$ in the sense that there exists a function $f_{\alpha,\gamma}$ for which

see Section 2.4 for precise statements. Of course, the momenta are usually trivial to sample since they are distributed according to a Gaussian measure. The primary issue is therefore to sample positions according to the marginal of the canonical measure:

the partial average of a function $\varphi$ with respect to the momentum variable, the error estimate (4) becomes, for observables which depend only on the position variable,

Let us conclude this introduction by noting that alternative sampling strategies are available: the bias in the invariant measure sampled by discretization of Langevin dynamics could in principle be eliminated by employing a Metropolis-Hastings procedure (see Metropolis et al. (1953); Hastings (1970) and the discussion in Section 2.2 of Lelièvre et al. (2010)). Another advantage of superimposing a Metropolis-Hastings procedure upon a discretization of Langevin dynamics is that it stabilizes the numerical scheme even for forces $-\nabla V$ which are not globally Lipschitz. The numerical analysis of Langevin-based Metropolis integrators has been performed in Bou-Rabee & Vanden-Eijnden (2009) and Bou-Rabee & Vanden-Eijnden (2012), where strong error estimates are provided. On the other hand, it is not always possible or desirable to use a Metropolis correction. First, the average acceptance probability in the Metropolis step for Langevin-like dynamics in general decreases exponentially with the dimension of the system for a fixed timestep (see for instance Kennedy & Pendleton (1991)). In fact, the timestep should be reduced as some inverse power of the system size in order to maintain a constant acceptance rate (see the recent works on Metropolization of Hamiltonian dynamics by Beskos et al. (2013), following the strategy pioneered in Roberts et al. (1997); Roberts & Rosenthal (1998)). There are ways to limit the decrease of the ratio, by either changing the dynamics or the measure used to compute the Metropolis ratio (see for instance Izaguirre & Hampton (2004) in the context of Hamiltonian dynamics), or by evolving only parts of the system (see Bou-Rabee & Vanden-Eijnden (2012)). The latter strategy may however complicate the implementation of parallel algorithms for the simulation of very large systems, especially if long-range potentials are used (as acknowledged in Remark 2.5 of Bou-Rabee & Vanden-Eijnden (2012)). This may be a reason why Metropolis corrections are not often implemented in popular molecular dynamics packages such as NAMD. Second, the variance of the computed averages may increase since rejections occur, and the numerical trajectory is therefore more correlated in general than for rejection-free dynamics. Lastly, the Metropolis procedure requires that the invariant measure of the system be known. This is the case for equilibrium systems, but no longer is the case for nonequilibrium systems subjected to external forcings such as a temperature gradient or a non-gradient force (this is the framework considered in Section 3 of this article, see for instance the dynamics (45)).

We focus in this article on first- and second-order splitting schemes, relying on Lie-Trotter decompositions of the evolution. This restriction is motivated both by pedagogical purposes and by the dominant role in applications played by second-order splitting schemes. Let us however emphasize that most of our results could, in principle, be extended to higher-order decompositions.

Results corresponding to discretizations of the equilibrium Langevin dynamics and computation of static average properties are gathered in Section 2, while nonequilibrium systems and the computation of transport properties are discussed in Section 3 (relying on the computation of the mobility or autodiffusion coefficient as an illustration). The proofs of all our results can be found in Section 4.

Let us now highlight some of our contributions.

In the equilibrium setting, we rigorously ground in Section 2.4 the results presented in Leimkuhler & Matthews (2013a) giving the leading order correction to the invariant measure with respect to ${\Delta t}$ for general splitting schemes, via a Talay-Tubaro expansion (see Talay & Tubaro (1990)). We carefully study all possible splitting schemes, taking advantage of what we call the “TU lemma” (Lemma 2.11) to relate invariant measures of various splitting schemes where the elementary dynamics are integrated in different orders. From a technical viewpoint, our proofs are a variation on the standard way of establishing similar results since we use the specific structure of splitting schemes to conveniently write evolution operators as compositions of the semigroups of the elementary dynamics (working at the level of generators, as in Debussche & Faou (2012); see also Mattingly et al. (2010) for a related approach based on solution of appropriate Poisson equations). The structure of the proof is highlighted in Section 4.4, see Remark 4.6.

We show in Section 2.5 how the leading order correction to equilibrium averages can be estimated on-the-fly by approximating a time-integrated correlation function. This can be seen as a practical way of numerically solving a Poisson equation (a standard way of proceeding when studying linear response of nonequilibrium systems) and is an alternative to Romberg extrapolation to eliminate the leading order correction as done in Talay & Tubaro (1990).

We carefully study the overdamped regime $\gamma\to+\infty$ in Section 2.6, making use in particular of uniform resolvent estimates obtained in Theorem 2.4 thanks to a uniform hypocoercivity property;

We provide error estimates for the computation of transport coefficients, by assessing the bias arising in the numerical discretization of either (i) the computation of integrated time-correlation functions expressing transport coefficients via Green-Kubo formulae; or (ii) ergodic averages of steady-state nonequilibrium dynamics where the equilibrium evolution (3) is perturbed by a non-gradient force and the transport coefficient is extracted from the linear response of some quantity of interest (see Section 3). The latter approach is illustrated by the study of the mobility, which measures the response in the average velocity arising from a constant external force exerted on the system. We also study the consistency of the numerical estimations in the overdamped limit.

Some numerical simulations are provided to illustrate the most important results (see Section 2.5.3 and 3.3).

Error estimates for the invariant measure for equilibrium dynamics

We start by giving some properties of Langevin dynamics in Section 2.1 (most results are well-known, except for the material on the overdamped limit $\gamma\to+\infty$ presented in Section 2.1.3). The numerical schemes we consider are then described in Section 2.2, their ergodic properties being discussed in Section 2.3. Error estimates for the invariant measure are provided in Section 2.4. We then show in Section 2.5 how to estimate the leading order correction term through an appropriate integrated correlation function. An important side result of this section is the development error estimates for Green-Kubo type formulas. Finally, we study the errors on the invariant measures in the overdamped limit in Section 2.6. Let us emphasize that we will make use of the following assumption throughout this work:

The above assumption is quite restrictive since typical potentials used in molecular simulation, such as the Lennard-Jones potential, have singularities. Although ergodicity for Langevin dynamics with singular potentials has been recently proved in Conrad & Grothaus (2010), there are still many issues with singular potentials, including the existence and uniqueness of an invariant measure for numerical schemes (see Mattingly et al. (2002)), and the derivation of appropriate bounds or estimates on the resolvent of the generator of Langevin dynamics (all the results presented in Section 2.1.1 below are obtained under the assumption of smooth potentials). Since the latter estimates are fundamental for our work, we have to restrict ourselves to smooth potentials. Of course, from a more practical viewpoint, it could also be argued that the potential energy function could be smoothed out by appropriate high energy truncations and regularizations, and that such regularizations should not affect too much the average properties of the system since high energy states are quite unlikely under the canonical measure.

The reference Hilbert space for our analysis is the Hilbert space $L^{2}(\mu)$ . As in Talay (2002) for instance, we will consider errors in the average of smooth functions whose derivatives grow at most polynomially (the space $\mathcal{S}$ defined below). In fact, since the position space is compact, only the growth in the momentum variable has to be controlled.

The polynomial growth of a function can be characterized by the Lyapunov functions:

To characterize the growth of the derivatives, we introduce the spaces $W^{m,\infty}_{\mathcal{K}_{s}}$ defined as

where $|r|=r_{1}+r_{2}+\dots+r_{2dN}$ , and $\partial^{r}$ stands for $\partial_{q_{1}}^{r_{1}}\dots\partial_{q_{dN}}^{r_{dN}}\partial_{p_{1}}^{r_{dN+1}}\dots\partial_{p_{dN}}^{r_{2dN}}$ .

The set $\mathcal{S}$ of smooth functions is the set of functions $f\in L^{2}(\mu)$ such that, for any $m\geqslant 0$ , there exists $s\geqslant 0$ (depending on $f$ and $m$ ) so that $f\in W^{m,\infty}_{\mathcal{K}_{s}}$ . The subset $\widetilde{\mathcal{S}}\subset\mathcal{S}$ is composed of the functions with average zero with respect to $\mu$ :

Some of our results will be stated in the weighted Sobolev spaces $H^{m}(\mu)$ defined as

Note that $W^{m,\infty}_{\mathcal{K}_{s}}\subset H^{m}(\mu)$ since the function $\mathcal{K}_{s}$ is in $L^{2}(\mu)$ . We will also occasionally need the Sobolev spaces $H^{m}(\kappa)$ of functions of the $p$ variable only whose derivatives up to order $m$ are square-integrable with respect to the probability measure $\kappa(dp)$ .

Unless stated otherwise, all the operators appearing below are by default considered as operators defined on the core $\mathcal{S}$ , with range contained in $\mathcal{S}$ . Some results are stated on extensions of the operators under consideration to (sub)spaces of $H^{1}(\mu)$ or $L^{\infty}_{\mathcal{K}_{s}}$ . With some abuse of notation, we will denote the extension of operators by the same letter. The appropriate domain of the operators should always be clear from the context. When an operator $T$ is defined on the core $\mathcal{S}$ , we denote by $T^{*}$ its formal adjoint, which is the operator defined on $\mathcal{S}$ such that, for all $(f,g)\in\mathcal{S}^{2}$ ,

When $T$ is a differential operator with smooth coefficient (which will be the case in many situations here), the action of the formal adjoint is found using integration by parts.

1 Properties of equilibrium Langevin dynamics

Langevin dynamics can be seen as Hamiltonian dynamics perturbed by an Ornstein-Uhlenbeck process in the momenta with friction coefficient $\gamma>0$ :

The existence and uniqueness of strong solutions is guaranteed when the position space is compact since the kinetic energy function $1+|p|^{2}$ is a Lyapunov function, see for instance Theorem 5.9 in Rey-Bellet (2006). We will sometimes denote by $(q_{\gamma,t},p_{\gamma,t})$ the solution of this equation to emphasize the dependence on the friction coefficient.

In order to describe more conveniently splitting schemes, it is useful to introduce the elementary dynamics with generators (defined on the core $\mathcal{S}$ )

The generator $\mathcal{L}_{\gamma}$ for equilibrium Langevin dynamics (7), defined on the core $\mathcal{S}$ , is the sum of the generators of the elementary dynamics:

where $\mathcal{L}_{0}=A+B$ is the generator associated with the Hamiltonian part of the dynamics. The invariance of the canonical measure $\mu$ defined in (2) for Langevin dynamics can be rewritten in terms of the generator $\mathcal{L}_{\gamma}$ : for any test function $\varphi\in\mathcal{S}$ ,

In fact, the operators $A+B$ and $C$ separately preserve $\mu$ . Recall also that, thanks to the compact embedding of

where, we recall, the adjoints are formally defined as operators on $\mathcal{S}$ through integration by parts. Note that the formal adjoint

defined on $\mathcal{S}$ has an action quite similar to the action of the generator $\mathcal{L}_{\gamma}$ defined on $\mathcal{S}$ . Functional estimates valid for (extensions of) $\mathcal{L}_{\gamma}$ will therefore also hold for (extensions of) the formal adjoint of this operator. The equality (10) expresses the reversibility up to momentum reversal of Langevin dynamics with respect to the invariant measure $\mu$ (see the discussion in Section 2.2.3 of Lelièvre et al. (2010)). In particular, introducing the bounded, unitary operator on $L^{2}(\mu)$

(10) can be reformulated $\mathcal{R}\mathcal{L}_{\gamma}\mathcal{R}=\mathcal{L}_{\gamma}^{*}$ .

Note also that the same bound holds for $(\mathcal{L}_{\gamma}^{*})^{-1}$ .

Before presenting these asymptotic estimates, let us first recall that a careful analysis of the proof presented in Talay (2002), as provided by Kopec (2013), allows to prove the following result.

The space $\widetilde{\mathcal{S}}$ is stable under $\mathcal{L}_{\gamma}^{-1}$ and $(\mathcal{L}_{\gamma}^{*})^{-1}$ .

This result is of fundamental importance in our proofs. It allows to state that, if the operators $T_{1},\dots,T_{M}$ are well defined operators from $\widetilde{\mathcal{S}}$ to $\widetilde{\mathcal{S}}$ , then the operator $\mathcal{L}_{\gamma}^{-1}T_{M}\mathcal{L}_{\gamma}^{-1}\dots\mathcal{L}_{\gamma}^{-1}T_{1}\mathcal{L}_{\gamma}^{-1}$ also is a well defined operator from $\widetilde{\mathcal{S}}$ to $\widetilde{\mathcal{S}}$ .

1.2 Hamiltonian limit γ→0→𝛾0\gamma\to 0

Denote by $\|\cdot\|_{\mathcal{B}(\mathcal{H}^{0})}$ the operator norm on the subspace

of the Hilbert space $L^{2}(\mu)$ . There exists two constants $c_{-},c_{+}>0$ such that, for any $0<\gamma\leqslant 1$ ,

We state the result with the upper bound $\gamma\leqslant 1$ , but it holds in fact for $0<\gamma\leqslant\gamma_{\rm max}$ for any finite value $\gamma_{\rm max}>0$ . Note also that the same bound holds for $(\mathcal{L}_{\gamma}^{*})^{-1}$ .

1.3 Overdamped limit γ→+∞→𝛾\gamma\to+\infty

\gamma\to+\infty The overdamped limit can be obtained by either letting the friction go to infinity in (7) together with an appropriate rescaling of time; or by letting masses go to 0. When discussing overdamped limits in this article, we will always set the mass matrix $M$ to identity and consider the limit $\gamma\to+\infty$ . Since we restrict our attention to the invariant measure of the system, the time rescaling is not relevant.

Let us describe more precisely the convergence result. It is shown in Section 2.2.4 of Lelièvre et al. (2010) for instance that the solutions of (7) observed over long times, namely $(q_{\gamma,\gamma s},p_{\gamma,\gamma s})_{s\geqslant 0}$ , converge pathwise on finite time intervals $s\in[0,t]$ to the solutions of overdamped Langevin dynamics

to $\widetilde{H}^{m+2}(\overline{\mu})$ . Let us finally mention that the set of $C^{\infty}(\mathcal{M})$ functions with average zero with respect to $\overline{\mu}$ is of course stable with respect to $\mathcal{L}_{\rm ovd}^{-1}$ .

The following result gives bounds on the resolvent of the Langevin generator in the overdamped regime, and in fact quantifies the difference between the resolvent $\mathcal{L}_{\gamma}^{-1}$ and the resolvent $\mathcal{L}_{\rm ovd}^{-1}$ appropriately rescaled by a factor $\gamma$ .

There exist two constants $c_{-},c_{+}>0$ such that, for any $\gamma\geqslant 1$ ,

More precisely, there exists a constant $K>0$ such that, for any $\gamma\geqslant 1$ ,

where the operator $\pi$ is defined in (6), and $(C^{-1}\psi)(q,p)$ is understood as applying the operator $C^{-1}$ to the function $\psi(q,\cdot)\in L^{2}(\kappa)$ for all values of $q\in\mathcal{M}$ .

Consider the following subspace of $\mathcal{H}^{1}$ :

There exists a constant $K>0$ such that, for any $\gamma\geqslant 1$ ,

The proofs of Theorem 2.4 and Lemma 2.5 are provided in Section 4.1.

2 Splitting schemes for equilibrium Langevin dynamics

We present in this section the splitting schemes to be examined in this article. These schemes can be described by evolution operators $P_{\Delta t}$ defined on the core $\mathcal{S}$ (but which can be extended to bounded operators on $L^{\infty}(\mathcal{E})$ ), and which are such that the Markov chain $(q^{n},p^{n})$ generated by the discretization satisfies

We also briefly give some ergodicity results obtained by minor extensions or variations of existing results in the literature (see in particular Mattingly et al. (2002); Talay (2002); Bou-Rabee & Owhadi (2010); Bou-Rabee & Hairer (2013)). Since these ergodicity issues are by now a rather standard and well-understood matter, especially for compact position spaces, we provide only elements of proofs in Section 4.2.

First-order schemes are obtained by a Lie-Trotter splitting of the elementary evolutions generated by $A,B,\gamma C$ . The motivation for this splitting is that all elementary evolutions are analytically integrable (see the expressions of the associated semigroups in (65)). There are 6 possible schemes, whose evolution operators (defined on the core $\mathcal{S}$ ) are of the general form

with all possible permutations $(Z,Y,X)$ of $(A,B,\gamma C)$ . For instance, the numerical scheme associated with $P_{\Delta t}^{B,A,\gamma C}$ is

The iterations of the three schemes associated with $P^{\gamma C,B,A}_{\Delta t},P^{B,A,\gamma C}_{\Delta t},P^{A,\gamma C,B}_{\Delta t}$ share a common sequence of update operations, as for $P^{\gamma C,A,B}_{\Delta t},P^{A,B,\gamma C}_{\Delta t},P^{B,\gamma C,A}_{\Delta t}$ . More precisely, we mean that equalities of the following form hold:

It is therefore not surprising that the invariant measures of the schemes with operators composed in the same order have very similar properties, as made precise in Theorem 2.13, relying on Lemma 2.11.

2.2 Second-order schemes

Second-order schemes are obtained by a Strang splitting of the elementary evolutions generated by $A,B,\gamma C$ . There are also 6 possible schemes, which are of the general form

with the same possible orderings as for first-order schemes. Again, these schemes can be classified into three groups depending on the ordering of the operators once the elementary one-step evolution is iterated: (i) $P_{\Delta t}^{\gamma C,B,A,B,\gamma C},P_{\Delta t}^{A,B,\gamma C,B,A}$ , (ii) $P_{\Delta t}^{\gamma C,A,B,A,\gamma C},P_{\Delta t}^{B,A,\gamma C,A,B}$ , and (iii) $P_{\Delta t}^{B,\gamma C,A,\gamma C,B},P_{\Delta t}^{A,\gamma C,B,\gamma C,A}$ . We discard the latter category since the invariant measures of the associated numerical schemes are not consistent with $\overline{\mu}$ in the overdamped limit (see Section 2.6).

2.3 Geometric Langevin Algorithms

In fact, as already proved in Bou-Rabee & Owhadi (2010) (see also Corollary 2.17 below), second order accuracy of the invariant measure can be obtained by resorting to a first-order splitting between the Hamiltonian and the Ornstein-Uhlenbeck parts, and discretizing the Hamiltonian part with a second-order scheme. This corresponds to the following evolution operators of Geometric Langevin Algorithm (GLA) type:

3 Ergodicity results for splitting schemes

Let us now give some technical results on the ergodic behavior of the splitting schemes presented in Section 2.2.In this section we denote the evolution operator by $P_{\Delta t}$ (supressing the dependence on the friction parameter $\gamma$ although the constants appearing in the results below a priori depend on this parameter). Ergodicity results for a fixed value of ${\Delta t}$ are obtained with techniques similar to the ones presented in Meyn & Tweedie (2009), by mimicking the proofs presented for certain discretization schemes of the Langevin equation in Mattingly et al. (2002); Talay (2002); Bou-Rabee & Owhadi (2010). A more subtle point is to obtain rates of convergence which are uniform in the timestep ${\Delta t}$ , as done in Bou-Rabee & Hairer (2013) for a class of Metropolis-Hastings schemes based on a discretization of overdamped Langevin dynamics in unbounded spaces as the proposal. We are able here to prove such results by relying on the fact that the position space $\mathcal{M}$ is compact.

The proof is based on two preliminary results, namely a uniform drift inequality or Lyapunov condition and a uniform minorization condition (see Section 4.2 for the proofs). The term uniform refers to estimates which are independent of the timestep ${\Delta t}$ . To obtain such estimates, we have to consider evolutions over fixed times $T\simeq n{\Delta t}$ , which amounts to iterating the elementary evolution $P_{\Delta t}$ over $\lceil T/{\Delta t}\rceil$ timesteps (where $\lceil x\rceil$ denotes the smallest integer larger than $x$ ).

Consider $T>0$ sufficiently large, and fix any $p_{\rm max}>0$ . There exist ${\Delta t}^{*},\alpha>0$ and a probability measure $\nu$ such that, for any bounded, measurable non-negative function $f$ , and any $0<{\Delta t}\leqslant{\Delta t}^{*}$ ,

Lemma 2.7 ensures that Assumption 2 in Hairer & Mattingly (2011) holds for any choice of Lyapunov function $\mathcal{K}_{s}$ ( $s\geqslant 1$ ), provided $p_{\rm max}$ is chosen to be sufficiently large. The uniform minorization condition can formally be rewritten as

We present a direct proof of Lemma 2.7 in Section 4.2. Extending this result to unbounded position spaces is much more difficult in general, see for instance the recent works Klokov & Veretennikov (2006, 2013) and Bou-Rabee & Hairer (2013) where non-degeneracy of the noise is assumed.

Let us now precisely state the ergodicity result.

uniformly in the timestep ${\Delta t}$ . There also exist $\lambda,K>0$ (depending on $s^{*}$ and $\gamma$ but not on ${\Delta t}$ ) such that, and for all functions $f\in L^{\infty}_{\mathcal{K}_{s}}$ , the following holds for almost all $(q,p)\in\mathcal{E}$ :

Let us again emphasize that, compared to the results of Mattingly et al. (2002); Talay (2002); Bou-Rabee & Owhadi (2010), the only new estimate is the uniform-in- ${\Delta t}$ decay rate in (24) as obtained in Bou-Rabee & Hairer (2013) for Metropolis schemes. These uniform estimates follow from an application of the results of Hairer & Mattingly (2011) to the sampled chain $P_{\Delta t}^{\lceil T/{\Delta t}\rceil}$ (see Section 4.2 for more detail). Recall also that the convergence rates we obtain of course depend on the friction parameter $\gamma$ .

The Banach space $L^{\infty}_{\mathcal{K}_{s},{\Delta t}}$ depends both on ${\Delta t}$ and $\gamma$ through $\mu_{\gamma,{\Delta t}}$ , although the dependence on $\gamma$ is not explicitly written. This proves that the series

provided ${\Delta t}$ is sufficiently small. Let us summarize this result as follows.

4 Error estimates for finite frictions

In this section we study the error of the average of sufficiently smooth functions, which allows us to characterize the corrections to the invariant measure. In Theorems 2.13 and 2.16, below, we characterize all the first- and second-order splittings; the technique of proof allows us to provide a rigorous study of the error estimates in the overdamped regime (see Section 2.6) and for nonequilibrium systems (see Section 3).

If only the order of magnitude of the correction is of interest, and not the expression of the correction in itself, no regularity result with regard to the derivatives is required (see Bou-Rabee & Owhadi (2010)), in contrast to situations where such corrections are explicitly considered, as in Talay (2002) for instance.

We classified in Section 2.2 the numerical schemes according to the order of appearance of the elementary operators. More precisely, we considered schemes to be similar when the global ordering of the operators is the same but the operations are started and ended differently, as in (19) above (see also (26) below for an abstract definition). This choice of classification is motivated by the following lemma which demonstrates how we may straightforwardly obtain the expression of the invariant measure of one scheme when the expression for another one is given.

We state the result in an abstract fashion for two schemes $P_{\Delta t}=U_{\Delta t}T_{\Delta t}$ and $Q_{\Delta t}=T_{\Delta t}U_{\Delta t}$ (which implies the condition (26) below). See (19) for a concrete example.

Consider two numerical schemes with associated evolution operators $P_{\Delta t},Q_{\Delta t}$ bounded on $L^{\infty}(\mathcal{E})$ , for which there exist bounded operators $U_{\Delta t},T_{\Delta t}$ on $L^{\infty}(\mathcal{E})$ such that, for all $n\geqslant 1$ ,

We also assume that both schemes are ergodic with associated invariant measures denoted respectively by $\mu_{P,{\Delta t}}$ , $\mu_{Q,{\Delta t}}$ : For almost all $(q,p)\in\mathcal{E}$ and $f\in L^{\infty}(\mathcal{E})$ ,

Then, for all $\varphi\in L^{\infty}(\mathcal{E})$ ,

Ergodicity results such as (27) are implied by conditions such as (24).

The proof of this result relies on the simple observation that, for a given initial measure $\rho$ with a smooth density with respect to the Lebesgue measure, the ergodicity assumption ensures that, for a bounded measurable function $\varphi$ ,

Now, we use the ergodicity property (27) with $f$ replaced by $U_{\Delta t}\varphi$ to obtain the following convergence for almost all $(q,p)\in\mathcal{E}$ :

Since $T_{\Delta t}$ preserves constant functions, there holds

Let us now show how we will use Lemma 2.11 in the sequel. Assume that a weak error estimate holds on the invariant measure $\mu_{P,{\Delta t}}$ : there exist $\alpha\geqslant 1$ and a function $f_{\alpha}\in\mathcal{S}$ such that

with $|r_{\psi,\alpha,{\Delta t}}|\leqslant K$ for ${\Delta t}$ sufficiently small. Combining this equality and (28), the following expansion is obtained for $\mu_{Q,{\Delta t}}$ :

In general, for an evolution operator $U_{\Delta t}$ preserving the measure $\mu$ at order $\delta\geqslant 1$ , we can write

where all the operators on the right hand side are defined on the core $\mathcal{S}$ , and the operators $\mathcal{A}_{k}$ preserve the measure $\mu$ :

while the operator $S_{\delta}$ does not. Typically, $\mathcal{A}_{k}$ is a composition of the operators $A+B$ and $C$ . In addition, for a given function $\varphi\in\mathcal{S}$ , the remainder $R_{\delta,{\Delta t}}\varphi$ is uniformly bounded for ${\Delta t}$ sufficiently small. Three cases should then be distinguished:

When $\delta\geqslant\alpha+1$ , the weak error in the invariant measure $\mu_{Q,{\Delta t}}$ is of the same order as for $\mu_{P,{\Delta t}}$ since

For $\delta\leqslant\alpha-1$ , the weak error in the invariant measure $\mu_{Q}$ arises at dominant order from the operator $U_{\Delta t}$ :

The interesting case corresponds to $\alpha=\delta$ . In this situation,

An increase in the order of the error on the invariant measure is obtained when the leading order correction vanishes for all admissible observables $\psi$ , that is, if and only if $f_{\alpha}+S_{\alpha}^{*}\mathbf{1}=0$ .

4.2 First-order schemes

The following result characterizes at leading order the invariant measure of the schemes based on a first-order splitting (see Section 2.2.1). We first study the error estimates in the invariant measure of the schemes $P_{\Delta t}^{\gamma C,B,A}$ , $P_{\Delta t}^{\gamma C,A,B}$ (which can be interpreted as GLA schemes with a symplectic Euler discretization of the Hamiltonian part, see Bou-Rabee & Owhadi (2010)), and then deduce error estimates for the four remaining schemes introduced in Section 2.2.1 by making use of Lemma 2.11. The proof can be read in Section 4.4.

where the remainder $r_{\psi,\gamma,{\Delta t}}$ is uniformly bounded for ${\Delta t}$ sufficiently small. The expressions of the correction functions $f_{1,\gamma}$ depend on the numerical scheme at hand. They are defined as

It would in fact possible to obtain bounds on the the remainder $r_{\psi,\gamma,{\Delta t}}$ with respect to $\psi$ , thanks to functional inequalities given in Appendix A of Kopec (2013).

The equations (31) could be analytically solved if, instead of the fluctuation/dissipation operator $C$ , we were using the mass-weighted differential operator as in Leimkuhler & Matthews (2013a):

The corresponding generator $\mathcal{L}_{\gamma,M}=A+B+\gamma C_{M}$ defined on the core $\mathcal{S}$ is associated with Langevin dynamics where the friction force is proportional to the momenta rather than velocities. A simple computation shows that

The condition (31) would be replaced by $\mathcal{L}_{\gamma,M}^{*}f_{1}^{\gamma C,B,A}=-(A+B)g/2$ , so that $f_{1}^{\gamma C,B,A}=\beta V/2-g+c$ where $c$ is a constant ensuring that $f_{1}^{\gamma C,B,A}$ has a vanishing average with respect to $\mu$ .

4.3 Hamiltonian limit of the correction term

For first order splitting schemes, the limit of the leading order correction term in (30) can be studied in the limit when $\gamma\to 0$ . Not surprisingly, it turns out that the leading order correction is the first term in the expansion of the modified Hamiltonian of the symplectic Euler method in powers of ${\Delta t}$ . In contrast to the more complete proof we are able to present for the overdamped limit (see Section 2.6), we were not able to study the behavior of the remainder terms $r_{\psi,\gamma,{\Delta t}}$ in (30). There is a technical obstruction to controlling these remainders from the way we prove our results since the limiting operator $\mathcal{L}_{0}=A+B$ is not invertible. Let us also mention that studying the corresponding Hamiltonian limit for second order schemes turns out to be a much more difficult question (see Remark 2.18).

There exists a constant $K>0$ such that, for all $0<\gamma\leqslant 1$ ,

with similar estimates for $f_{1}^{B,\gamma C,A}$ and $f_{1}^{B,A,\gamma C}$ ; and

with similar estimates for $f_{1}^{A,\gamma C,B}$ and $f_{1}^{A,B,\gamma C}$ .

The proof of this result is provided in Section 4.5.

4.4 Second-order schemes

The following result characterizes at leading order the invariant measure of the schemes based on a second-order splitting (see Section 2.2.2).

where the remainder $r_{\psi,\gamma,{\Delta t}}$ is uniformly bounded for ${\Delta t}$ sufficiently small. The expressions of the correction functions $f_{2,\gamma}$ depend on the numerical scheme at hand. They are defined as

It can be checked that the expressions of $f_{2}^{B,A,\gamma C,A,B}$ and $f_{2}^{A,B,\gamma C,B,A}$ agree with the ones presented in Leimkuhler & Matthews (2013a). Let us emphasize that no ${\Delta t}^{3}$ correction term appears in (32) after the ${\Delta t}^{2}$ term. In fact, a more careful treatment would allow us to write an error expansion in terms of higher orders of ${\Delta t}$ , with only even powers of ${\Delta t}$ appearing.

The proof of this result is given in Section 4.6. We use as reference schemes for the proofs the schemes $P_{\Delta t}^{\gamma C,A,B,A,\gamma C}$ , $P_{\Delta t}^{\gamma C,B,A,B,\gamma C}$ . These schemes indeed turn out to be particularly convenient to study the overdamped limit.

The results from Theorem 2.16 allow us to obtain error estimates for the so-called Geometric Langevin Algorithms (GLA) introduced in Bou-Rabee & Owhadi (2010). Recall the somewhat surprising result that the error in the invariant measure of the GLA schemes is of order $\Delta t^{p}$ for a discretization of order $p$ of the Hamiltonian part, even though the weak and strong orders of the scheme are only one. The following result complements the estimate given in Bou-Rabee & Owhadi (2010) by making precise the leading order corrections to the invariant measure of the numerical scheme with respect to the canonical measure (see the proof in Section 4.7).

where the remainder $r_{\psi,\gamma,{\Delta t}}$ is uniformly bounded for ${\Delta t}$ sufficiently small. The expressions of the correction functions $f_{2,\gamma}$ and $f_{3,\gamma}$ are

Note that the leading order term of the error is the same as for the corresponding second order splitting schemes. The next order correction (of order ${\Delta t}^{3}$ ) vanishes for functions $\psi$ depending only on the position variable $q$ .

Proving a result similar to Proposition 2.15 for second order splitting schemes or GLA schemes turns out to be much more difficult, although we formally expect that the limit of $f_{2,\gamma}$ as $\gamma\to 0$ is the first order correction of the modified Hamiltonian constructed by backward analysis. From (33), it should indeed be the case that $f_{2}^{\gamma C,B,A,B,\gamma C}$ converges to

Moreover, as we already mentioned before Proposition 2.15, we are not able to uniformly control remainder terms in the error expansion (32) as $\gamma\to 0$ .

5 Numerical estimation of the correction term

The results of Section 2.4 show that the leading order correction terms for the average of an observable $\psi\in\mathcal{S}$ can be written as

where the function $f_{\gamma}\in\widetilde{\mathcal{S}}$ is the solution of a Poisson equation

the function $g_{\gamma}\in\widetilde{\mathcal{S}}$ depending on the numerical scheme at hand (the fact that $f_{\gamma}\in\widetilde{\mathcal{S}}$ is a consequence of Theorem 2.2). It is in general impossible to analytically solve (36), and very difficult to numerically approximate its solution since it is a high-dimensional partial differential equation. It is however possible to rewrite (35) as an integrated correlation function, a quantity which is amenable to numerical approximation. This is a standard way of computing transport coefficients based on Green-Kubo formulae, see the summary provided in Section 3.1. It provides here a way to compute the first order correction in the perfect sampling bias with a single simulation (as an alternative to Romberg extrapolation, which requires at least two simulations at different timesteps, see Talay & Tubaro (1990)).

The approach we follow is based on the following operator identity (which makes sense in $\mathcal{H}^{1}$ for instance, in view of (12))

where the expectation is taken over all initial conditions $(q_{0},p_{0})$ distributed according to $\mu$ and over all realizations of equilibrium Langevin dynamics (7), the leading order correction term (35) can be rewritten as

Consider a numerical method with an invariant measure $\mu_{\Delta t}$ having bounded moments at all orders (i.e. (23) is satisfied) and such that, for $\psi\in\mathcal{S}$ ,

where the remainder $r_{\psi,{\Delta t}}$ is uniformly bounded for ${\Delta t}$ small enough. Suppose in addition that its evolution operator $P_{\Delta t}$ is such that, for any $\psi\in\mathcal{S}$ ,

where $S_{1}\psi,\dots,S_{\alpha-1}\psi,\widetilde{R}_{\alpha,{\Delta t}}\psi\in\mathcal{S}$ and there exists $s>0$ such that the remainder $\widetilde{R}_{\alpha,{\Delta t}}\psi$ is uniformly bounded in $L^{\infty}_{\mathcal{K}_{s}}$ for ${\Delta t}$ sufficiently small. Assume finally that $P_{\Delta t}$ satisfies the uniform ergodicity condition (24) (hence (25) holds). Then, the integrated correlation of two observables $\psi,\varphi\in\widetilde{\mathcal{S}}$ can be approximated by a Riemann sum up to an error of order ${\Delta t}^{\alpha}$ :

The assumptions of this theorem are satisfied for the splitting schemes considered in this article (see the comment after (70) for the boundedness of the remainder $\widetilde{R}_{\alpha,{\Delta t}}\psi$ ).

In the particular case $\alpha=2$ , which is in fact the most relevant one from a practical viewpoint, it is possible not to modify the observable $\psi$ when the discrete generator is correct at order $2$ (see (41) below for a precise statement), upon considering a time discretization of the integral which leads to errors of order ${\Delta t}^{2}$ , for instance a trapezoidal rule. The following result is obtained by an appropriate application of Theorem 2.19 (see Section 4.8 for the proof).

Consider a numerical scheme satisfying the assumptions of Theorem 2.19, and whose discrete generator is in addition correct at order 2: for any $\psi\in\mathcal{S}$ ,

Then, for two observables $\varphi,\psi\in\widetilde{\mathcal{S}}$ ,

where $r_{\Delta t}^{\psi,\varphi}$ is bounded for ${\Delta t}$ sufficiently small and

5.2 Numerical approximation

There are two principal ways to estimate the expectations in (40) or (42), using either several independent realizations of the nonequilibrium dynamics or a single, long trajectory, see for instance the discussion in Section 13.4 of Tuckerman (2010). When $K$ independent realizations $(q^{n,k},p^{n,k})$ are generated for $N_{\rm iter}$ timesteps each, starting from initial conditions distributed according to $\mu_{\Delta t}$ , the expectation in (40) may be approximated using empirical averages of the correlation functions as

where $\alpha=1$ and $\psi_{{\Delta t},1}=\psi$ for first order splittings; while $\alpha=2$ and $\psi_{{\Delta t},2}=(1+{\Delta t}\mathcal{L}_{\gamma}/2)\psi$ for second order ones since $S_{1}=\mathcal{L}_{\gamma}^{2}/2$ for the schemes presented in Section 2.2.2 (see for instance (77)). The empirical average $\Psi_{{\Delta t},p}^{M,N_{\rm iter}}$ reads

This formula highlights the other errors arising from the discretization: (i) a statistical error related to the finiteness of $K$ and to the fact that initial conditions are obtained in practice by subsampling a single, long trajectory; (ii) a truncation error related to the finiteness of $N_{\rm iter}$ .

5.3 Numerical illustration

Let us now numerically confirm the error estimates (30)-(32)-(34). More precisely, we show that, provided the leading correction term (35) is estimated by discretizing (37) using (42) and subtracted from the estimated result, canonical averages are estimated up to errors of order ${\Delta t}^{4}$ for second order splittings instead of ${\Delta t}^{2}$ without the correction. We use the same trajectory data as above to approximate the canonical average of the total system energy $H$ . We test the effectiveness of the correction both in practice and principle, by computing the observed average and correction term in the same simulation in the former case, while computing a more accurate correction term independently in the latter case (using a smaller timestep ${\Delta t}=0.1$ ). The results are shown in the right panel of Figure 1.

6 Overdamped limit

where $(G^{n})$ are independent and identically distributed Gaussian random vectors with identity covariance. This is indeed a consistent discretization of the overdamped process (15) with an effective timestep $h={\Delta t}^{2}/2$ , and the invariant measure of this numerical scheme is close to $\overline{\mu}$ . Other schemes may have non-trivial large friction limits and invariant measures close to $\overline{\mu}$ . This is the case for the scheme associated with the evolution operator $P_{\Delta t}^{B,A,\gamma C,A,B}$ , for which the limiting discrete dynamics reads (see Leimkuhler & Matthews (2013a))

Note that $(q^{n})$ is not a Markov chain due to the correlations in the random noises.

On the other hand, the limits of the invariant measures associated with certain schemes are not consistent with the canonical measure $\overline{\mu}$ . This is the case for the first-order schemes, as well as the second order splittings listed in item (iii) in Section 2.2.2. For instance, the limit of the scheme associated with $P_{\Delta t}^{\gamma C,A,B}$ reads

The invariant measure of this Markov chain is the uniform measure on $\mathcal{M}$ , and is therefore very different from the invariant measure $\overline{\mu}$ of the continuous dynamics (15) (it amounts to setting $V=0$ ). As another example, consider the limit of the scheme associated with $P_{\Delta t}^{\gamma C,B,A}$ :

This is the Euler-Maruyama discretization of (15) with an effective timestep $h={\Delta t}^{2}$ but an inverse temperature $2\beta$ rather than $\beta$ .

6.2 Rigorous error estimates

The following result quantifies the errors of the invariant measure of second order splitting schemes of Langevin dynamics, for large values of $\gamma$ . We restrict ourselves to the second order splittings where the Ornstein-Uhlenbeck part is either at the ends or in the middle (categories (i) and (ii) in Section 2.2.2). From a technical viewpoint, we are able here to bound remainder terms uniformly in $\gamma$ by relying on the properties of the limiting operator $\mathcal{L}_{\rm ovd}^{-1}$ . The result we obtain is the following (see Section 4.9 for the proof).

where the remainder is of order ${\Delta t}^{4}$ up to terms exponentially small in $\gamma{\Delta t}$ . More precisely, there exist constants $a,b\geqslant 0$ and $c>0$ (all depending on $\psi$ ) such that

The expression of $f_{2,\infty}$ depends on the numerical scheme at hand:

The real number $a_{\beta,V}$ ensures that all functions $f_{2,\infty}$ are of average zero with respect to $\overline{\mu}$ . Two comments are in order. Note first that the result is stated for observables which depend only on the position variable $q$ since the limiting case $\gamma\to+\infty$ corresponds to a dynamics on the positions only. There is anyway no restriction in stating the result using such observables since, as already discussed in the introduction, the error on the marginal in the position variables is the relevant error, momenta being trivial to sample exactly under the canonical measure. Secondly, let us emphasize that the ${\Delta t}^{2}$ correction term vanishes for the method associated with $P_{\Delta t}^{B,A,\gamma C,A,B}$ (as already noted in Leimkuhler & Matthews (2013a)). This means that the corresponding discretization of overdamped Langevin dynamics (formally obtained by setting $\gamma=+\infty$ ) has an invariant measure which is correct at second-order in the effective timestep $h={\Delta t}^{2}/2$ .

6.3 Overdamped limit of the correction terms

In order to relate the convergence result from Theorem 2.21 to the error estimates from Theorem 2.16, we prove that the limits of the correction functions $f_{2,\gamma}$ as $\gamma\to+\infty$ agree with the functions defined in (43) (see Section 4.10 for the proof). This can be seen as a statement regarding the permutation of the limits $\gamma\to+\infty$ and ${\Delta t}\to 0$ for the leading correction term, namely, for a smooth function $\psi=\psi(q)\in C^{\infty}(\mathcal{M})$ ,

There exists a constant $K>0$ such that, for all $\gamma\geqslant 1$ ,

where the constant $a_{\beta,V}$ is defined in (43).

Let us also mention that the overdamped limit of the correction function $f_{1,\gamma}$ for first order splittings is not well defined. This is not surprising since the invariant measures of the corresponding numerical schemes are not consistent with $\overline{\mu}$ , as discussed in Section 2.6.1. For instance, combining (17) and the expressions of the correction functions (31), we see that there exists a constant $K>0$ such that

defined on $\mathcal{S}$ , is the generator of the overdamped Langevin dynamics with non-trivial mass matrix:

Nonequilibrium dynamics and the computation of transport coefficients

We discuss in this section the numerical estimation of transport properties such as the thermal conductivity, the shear stress, etc. (see Evans & Morriss (2008); Tuckerman (2010) for general physical presentations of the computation of transport coefficients, and Section 3.1 of Stoltz (2012) for a mathematically oriented introduction).

the generator of the perturbation (considered as an operator on $L^{2}(\mu)$ , with core $\mathcal{S}$ ). Note that the constant force $F$ does not derive from the gradient of a smooth function defined on $\mathcal{M}$ . (It would indeed seem that this force derives from $-F^{T}q$ , but this potential is not periodic.) Therefore, the expression of the invariant measure is unknown, but can be nonetheless obtained as an expansion in powers of $\eta$ when the magnitude of the forcing is sufficiently small (see Section 3.1). The effect of the force is to create a non-zero average velocity in the direction of $F$ . The magnitude of the average velocity is a property of the system under consideration. For small forcings, it is linear in $\eta$ , with a constant of proportionality called the mobility (see the definition (47) below), related to the autodiffusion coefficient through (50).

As shown in Joubaud et al. (2015), it is possible to consider more general forcing terms $F(q)$ which do not derive from the gradient of a periodic function. A popular example is provided by shearing forces where the particles experience a force in some direction, whose intensity depends on the coordinates of the system in another direction.

We will also be interested in the overdamped limit of the nonequilibrium dynamics (45), which reads

The generator of this dynamics is $\mathcal{L}_{\rm ovd}+\eta\widetilde{\mathcal{L}}_{\rm ovd}$ with $\widetilde{\mathcal{L}}_{\rm ovd}=F\cdot\nabla_{q}$ (all operators being defined on the core $\mathcal{S}$ ). In this case the physically relevant response turns out to be the average force $-F\cdot\nabla V$ exerted in the direction $F$ .

From linear response theory (see for example the presentation in (Stoltz, 2012, Section 3.1), and the short summary provided in Section 4.11), it can be shown that

The mobility can therefore be rewritten as the integrated autocorrelation function of the velocity in the direction $F$ :

where the expectation is over all initial conditions $(q_{0},p_{0})$ distributed according to $\mu$ and over all realizations of the equilibrium Langevin dynamics (7). From this relation, it is easily seen that the mobility in the direction $F$ is related to the autodiffusion coefficient

In practice, the two most popular ways of estimating a transport coefficient rely on the Green-Kubo formula (49) and the linear response of nonequilibrium dynamics in their steady-states (47). Since the error estimates for Green-Kubo type formulas have already been discussed in Theorem 2.19, we will restrict ourselves in the sequel to the analysis of the numerical errors introduced by nonequilibrium methods.

The derivation of this formula is very similar to that leading to (47). The following result summarizes the limiting behavior of the mobility as the friction increases (recall that we set mass matrices to identity when studying overdamped limits).

There exists $K>0$ such that, for any $\gamma\geqslant 1$ ,

This result is already contained in Hairer & Pavliotis (2008), but we nonetheless provide a short alternative proof in Section 4.11.2 (see Remark 4.13 for a more precise comparison of the results). It shows that, in the overdamped regime $\gamma\to+\infty$ ,

which suggests to estimate $\nu_{F,\gamma}$ using the linear response of $F^{T}\nabla V$ for large frictions since this quantity is expected to be a good approximation of $\overline{\nu}_{F}$ – instead of relying on the standard linear response result (47), for which the response is of order $1/\gamma$ and is hence difficult to reliably estimate. Error estimates on the numerical approximation are deduced from (55) below.

2 Numerical schemes for the nonequilibrium Langevin dynamics

We present in this section numerical schemes approximating solutions of (45). These schemes reduce to the schemes presented in Section 2.2 when $\eta=0$ . Since the aim is to decompose the evolution generated by $\mathcal{L}_{\gamma}+\eta\widetilde{\mathcal{L}}$ into analytically integrable parts, there are two principal options: either replace $B$ by

where $\alpha_{\Delta t}$ is defined after (18), and $(G^{n})$ is a sequence of independent and identically distributed Gaussian random vectors with identity covariance. As $\gamma\to+\infty$ , a standard Euler-Maruyama discretization of the equilibrium overdamped Langevin dynamics (i.e. $\eta=0$ ) is obtained, whereas we would like to obtain a consistent discretization of nonequilibrium overdamped Langevin dynamics (46). We therefore instead consider schemes obtained by replacing $B$ with $B+\eta\widetilde{\mathcal{L}}$ , such as the first order splitting

The numerical scheme associated with the first order splitting scheme $P_{{\Delta t}}^{A,B+\eta\widetilde{\mathcal{L}},\gamma C}$

indeed is, in the limit as $\gamma\to+\infty$ , a consistent discretization of the nonequilibrium Langevin dynamics (46), and its invariant measure turns out to converge to the invariant measure of (46) in the limit ${\Delta t}\to 0$ .

uniformly in the timestep ${\Delta t}$ and the forcing magnitude $\eta$ . There also exist $\lambda,K>0$ (depending on $s^{*}$ , $\gamma$ and $\eta^{*}$ but not on ${\Delta t}$ ) such that, for all functions $f\in L^{\infty}_{\mathcal{K}_{s}}$ , the following holds for almost all $(q,p)\in\mathcal{E}$ :

Let us emphasize that we do not have any control on the convergence rate $\lambda$ in terms of $\eta^{*}$ , and it could well be that $\lambda$ goes to 0 as $\eta^{*}$ increases.

3 Error estimates on transport coefficients from nonequilibrium methods

The following result provides error estimates for the invariant measure of the first order or second order splittings schemes of Section 2.2.2 when $B$ is replaced by $B_{\eta}$ .

Denote by $p$ the order of the splitting scheme, by $f_{\alpha,0,\gamma}$ the leading order correction function in the case $\eta=0$ as given by Theorem 2.13 for $\alpha=1$ and by Theorem 2.16 for $\alpha=2$ . Then, there exists a function $f_{\alpha,1,\gamma}\in\widetilde{\mathcal{S}}$ such that, for any smooth function $\psi\in\mathcal{S}$ , there exist ${\Delta t}^{*},\eta^{*}>0$ and a constant $K>0$ for which, for all $\eta\in[-\eta^{*},\eta^{*}]$ and $0<{\Delta t}\leqslant{\Delta t}^{*}$ ,

where $f_{0,1,\gamma}$ is defined in (48), and

The proof of this result can be found in Section 4.12. Note that the remainder term now collects higher order terms both as powers of the timestep and the nonequilibrium parameter $\eta$ . The estimates we obtain on the remainder are however compatible with taking the linear response limit, as made precise by the following error estimate on the transport coefficient (which is an immediate consequence of Theorem 3.4). In order to state the result, we introduce the reference linear response for an observable $\psi$

It is often the case that $\psi$ has a vanishing average with respect to $\mu$ , as is the case for the function $F^{T}M^{-1}p$ in (47). In general, it however has a non-zero average with respect to the invariant measure $\mu_{\gamma,{\Delta t}}$ of the numerical scheme associated with a discretization of the equilibrium dynamics.

There exist ${\Delta t}^{*},\eta^{*}>0$ and a constant $K>0$ such that, for all $\eta\in[-\eta^{*},\eta^{*}]$ and $0<{\Delta t}\leqslant{\Delta t}^{*}$ ,

where $r_{\psi,\gamma,{\Delta t}}$ is uniformly bounded.

In particular, we obtain the following estimate on the numerically computed mobility:

where the reference mobility $\nu_{F,\gamma}$ is defined in (48).

We consider the same system as in Section 2.5.3, with an external force $F=(1,0)$ and $K+1$ forcing strengths $\eta_{k}=(k-1)\Delta\eta$ uniformly spaced in the interval $[0,\eta_{\rm max}]$ with $\eta_{\rm max}=0.5$ (so that $\Delta\eta=\eta_{\rm max}/K$ ). We fix the friction to $\gamma=1$ and the inverse temperature to $\beta=1$ . We use a coupling strategy to reduce the statistical noise in the computation of the linear response (53). The $K+1$ replicas of the system are started at the same position $q=(0,0)$ , with the same velocity (sampled according to the canonical measure $\mu$ ). Each replica experiences the force $-\nabla V+\eta_{k}F$ (Note that the first replica experiences the reference force $-\nabla V$ corresponding to a discretization of the equilibrium dynamics). Most importantly, the same Gaussian random numbers $G^{n}$ are used for all replicas to discretize the Brownian motion. Although not carefully documented here, this coupling strategy tremendously decreases the statistical error in the computed linear response. Such a coupling strategy was already proposed for exclusion processes in Goodman & Lin (2009). However, our experience shows that it fails for higher dimensional systems with more complex potentials (such as Lennard-Jones fluids).

For a given value of the timestep ${\Delta t}$ , we denote by $(q^{k,n},p^{k,n})_{n\geqslant 0}$ the discrete trajectory of the $k$ th replica. The linear response in the projected average velocity $\delta v_{\eta_{k}}$ is approximated over $N_{\rm iter}$ integration steps as

We then estimate the mobility by a linear fit on the first $K^{\prime}=10$ values of $\widehat{v}_{\eta_{k}}^{\,\,N_{\rm iter}}$ considered as a function of $\eta_{k}$ (see Figure 2, left). The value $\nu_{F,\gamma,{\Delta t}}$ is the estimated slope in the fit. The behavior of the mobility $\nu_{F,\gamma,{\Delta t}}$ as a function of the timestep is presented in Figure 2 (right) for the numerical schemes associated with the first order splitting $P_{\Delta t}^{A,B_{\eta},\gamma C}$ and the second order splitting $P_{\Delta t}^{\gamma C,B_{\eta},A,B_{\eta},\gamma C}$ . We used $N_{\rm iter}=4\times 10^{11}$ for the first order scheme, and $N_{\rm iter}=2.5\times 10^{11}$ for the second order one. The statistical error is very small and error bars are therefore not reported. The computed mobilities can be fitted for small ${\Delta t}$ as

for the second order splitting scheme, in agreement with the theoretical prediction (54).

4 Error estimates in the overdamped limit

We now study the numerical errors arising in the simulation of nonequilibrium systems in the large friction limit. We restrict ourselves to the second order splittings where the Ornstein-Uhlenbeck part is either at the ends or in the middle (categories (i) and (ii) in Section 2.2.2). To state the result, we introduce the first order correction to the invariant measure in terms of the magnitude of the nonequilibrium forcing, namely (recall $\widetilde{\mathcal{L}}_{\rm ovd}=F\cdot\nabla_{q}$ )

The proof is presented in Section 4.13. This result allows us to estimate the error in the computation of the transport coefficient $\nu_{F,\gamma}$ based on (51) and Lemma 3.2. Indeed, studying the linear response of the observable $-F^{T}\nabla V$ and defining

In the latter expression, $\overline{\nu}_{F,\gamma,{\Delta t}}$ can be numerically estimated, in a manner similar to that presented at the end of Section 3.3.

Proofs of the results

Unless otherwise stated, the default norm $\|f\|$ and scalar product $\langle f,g\rangle$ are the ones associated with the Hilbert space $L^{2}(\mu)$ . Recall that, unless otherwise mentioned, all operators are defined on $\mathcal{S}$ , and that formal adjoint operators are by default considered on $L^{2}(\mu)$ . Recall also that

with $p_{i}=(p_{i,1},\dots,p_{i,d})$ since $\partial_{p_{i,\alpha}}^{*}=-\partial_{p_{i,\alpha}}+\beta p_{i,\alpha}$ .

The proof of Lemma 2.5 follows the same lines as the proof of uniform hypocoercive estimates in the corrected version of Theorem 3 in Joubaud & Stoltz (2012a) (see the erratum Joubaud & Stoltz (2013) or the updated preprint version Joubaud & Stoltz (2012b)). We provide a simplified version of it for completeness.

We show that the operator $\mathcal{L}_{\gamma}$ is uniformly hypocoercive for $\gamma\geqslant 1$ . The aim is to obtain bounds on the inverse $\mathcal{L}^{-1}_{\gamma}$ extended to $\mathcal{H}^{1}_{\perp}$ . To this end, we decompose $\mathcal{L}_{\gamma}$ for $\gamma\geqslant 1$ as

The proof of Theorem 6.2 in Hairer & Pavliotis (2008) shows that there exists $\widetilde{\alpha}>0$ such that, for all $u\in\mathcal{S}$ ,

where the norm induced by $\left\langle\left\langle\cdot,\cdot\right\rangle\right\rangle$ is equivalent to the $H^{1}(\mu)$ norm. More precisely, $\left\langle\left\langle\cdot,\cdot\right\rangle\right\rangle$ is the bilinear form defined by

with appropriate coefficients $a\gg b\gg 1$ . It follows that there exists $\alpha>0$ independent of $\gamma$ such that

Using the rewriting (56) of the operator $C$ , and the commutation relations $[\partial_{p_{i,\alpha}},\partial_{p_{j,\alpha^{\prime}}}^{*}]=\beta\delta_{\alpha,\alpha^{\prime}}\delta_{ij}$ , a simple computation shows

for functions $u\in\mathcal{H}^{1}_{\perp}$ . Therefore,

Summing (59) on $i\in\{1,\dots,N\}$ and $\alpha\in\{1,\dots,d\}$ , the quantity (58) is seen to be non-negative for an appropriate choice of constants $a\gg b\gg 1$ .

From (57), we then deduce that there exists a constant $K>0$ such that, for any $\gamma\geqslant 1$ and for any $u\in\mathcal{H}^{1}_{\perp}\cap\mathcal{S}$ , it holds $\left\|u\right\|_{H^{1}(\mu)}\leqslant K\|\mathcal{L}_{\gamma}u\|_{H^{1}(\mu)}$ . Taking inverses and passing to the limit in $\mathcal{H}^{1}_{\perp}$ gives

We are now in position to give the proof of Theorem 2.4.

We write the proof for $\mathcal{L}_{\gamma}^{-1}$ . The estimates for $(\mathcal{L}_{\gamma}^{*})^{-1}$ are obtained by using $\mathcal{L}_{\gamma}^{*}=\mathcal{R}\mathcal{L}_{\gamma}\mathcal{R}$ (the momentum reversal operator being defined in (11)), and the fact that $\mathcal{R}C\mathcal{R}=C$ , $\mathcal{R}\mathcal{L}_{\rm ovd}\mathcal{R}=\mathcal{L}_{\rm ovd}$ and $\mathcal{R}(A+B)\mathcal{R}=-(A+B)$ .

The lower bound in (16) could be obtained directly provided $V$ is not constant, by considering the special case

where $v$ is a constant chosen such that $p^{T}\nabla V+\gamma(V-v)$ has a vanishing average with respect to $\mu$ . This example is also useful to motivate the fact that, in general, solutions of the Poisson equation $\mathcal{L}_{\gamma}u_{\gamma}=f$ have divergent parts of order $\gamma$ as $\gamma\to+\infty$ .

Let us now turn to the refined upper and lower bounds (17), which we prove using techniques from asymptotic analysis. Consider $f\in\mathcal{H}^{1}$ , and $u_{\gamma}\in\mathcal{H}^{1}$ the unique solution of the following Poisson equation $\mathcal{L}_{\gamma}u_{\gamma}=f$ . The above example suggests the following expansion in inverse powers of $\gamma$ :

To rigorously prove this expansion, we first proceed formally, taking (60) as an ansatz, plugging it into $\mathcal{L}_{\gamma}u=f$ and identifying terms according to powers of $\gamma$ . This leads to

The first equality implies that $u^{-1}=u^{-1}(q)$ since $C$ satisfies a Poincaré inequality on $L^{2}(\kappa)$ (where $\kappa$ is defined in (6)). The second then reduces to $Cu^{0}=-M^{-1}p\cdot\nabla_{q}u^{-1}$ , from which we deduce $u^{0}(q,p)=p^{T}\nabla u^{-1}(q)+\widetilde{u}^{0}(q)$ . Inserting this expression in the third equality gives

The solvability condition for this equation is that the right-hand side has a vanishing average with respect to $\kappa$ , i.e. belongs to the kernel of $\pi$ . This condition reads

Coming back to (60), we see that the proposed approximate solution is such that

We now choose $\widetilde{u}^{0}$ such that $(A+B)u^{1}$ belongs to $\mathcal{H}^{1}_{\perp}$ , which amounts to

It is easily checked that $\widetilde{u}^{0}=-\mathcal{L}_{\rm ovd}^{-1}\pi(A+B)C^{-1}(f-\pi f)$ is a well defined element in $\mathcal{H}^{1}$ for $f\in H^{1}(\mu)$ : first, $C^{-1}(f-\pi f)\in\mathcal{H}^{1}$ , so $(A+B)C^{-1}(f-\pi f)\in L^{2}(\mu)$ . Finally, the image under $\mathcal{L}_{\rm ovd}^{-1}\pi$ of any function in $L^{2}(\mu)$ is a function of average zero with respect to $\overline{\mu}$ , depending only on the position variable $q$ and belonging to $H^{2}(\overline{\mu})$ ; hence to $\mathcal{H}^{1}$ .

Combining (61) and Lemma 2.5, we see that there exists a constant $R>0$ , such that, for all $\gamma\geqslant 1$ , it holds $\|u_{\gamma}-\gamma u^{-1}-u^{0}\|_{H^{1}(\mu)}\leqslant R\|f\|_{H^{1}(\mu)}/\gamma$ for the above choices of functions $u^{-1},u^{0}$ . This gives (17).

2 Ergodicity results for numerical schemes

We write the proof for the scheme associated with the evolution operator $P_{\Delta t}^{B,A,\gamma C}$ , starting by the case $s=1$ , before turning to the general case $s\geqslant 2$ . The proofs for other schemes are very similar, and we therefore skip them.

The numerical scheme corresponding to $P_{\Delta t}^{B,A,\gamma C}$ is (18). We introduce $m\in(0,+\infty)$ such that $m\leqslant M\leqslant m^{-1}$ (in the sense of symmetric matrices). A simple computation shows that

We choose for instance $\varepsilon=m\gamma$ , in which case

for ${\Delta t}$ sufficiently small. Finally, since $\mathcal{K}_{2}(q,p)=1+|p|^{2}$ ,

for ${\Delta t}$ sufficiently small. This gives (21). To obtain (22), we iterate the bound (21):

The computations are similar for a general power $s\geqslant 2$ . We write $p^{n+1}=\alpha_{\Delta t}p^{n}+\delta_{\Delta t}$ with $\delta_{\Delta t}=-\alpha_{\Delta t}{\Delta t}\nabla V(q^{n})+\sqrt{\beta^{-1}(1-\alpha_{\Delta t}^{2})M}\,G^{n}$ . Note that $\delta_{\Delta t}$ is of order ${\Delta t}^{1/2}$ because of the random term. We work componentwise, using the assumption that $M$ is diagonal, so that, denoting by $m_{i}$ the mass of the $i$ th degree of freedom,

where the remainder $r_{s,{\Delta t}}(q^{n})$ is uniformly bounded as ${\Delta t}\to 0$ . Distinguishing between $|p_{i}|\geqslant 1/\varepsilon$ and $|p_{i}|\leqslant 1/\varepsilon$ , we have

The proof is then concluded as in the case $s=1$ by choosing $\varepsilon$ sufficiently small (independently of ${\Delta t}$ ).

for a well chosen probability measure $\nu$ and a constant $\alpha>0$ . The idea of the proof is to explicitly rewrite $q^{n}$ and $p^{n}$ as perturbations of the reference evolution corresponding to $\nabla V=0$ and $(q^{0},p^{0})=(0,0)$ . Since we consider smooth potentials and the position space is compact, the perturbation can be uniformly controlled when the initial momenta are within a compact set.

We write the proof for the scheme associated with the evolution operator $P_{\Delta t}^{B,A,\gamma C}$ , as in the proof of Lemma 2.6. A simple computation shows that, for $n\geqslant 1$ ,

Denote by $\mathcal{G}^{n}$ the centered Gaussian random variable

is a centered Gaussian random variable. Now,

Note that ${\Delta t}\,F^{n}$ is uniformly bounded: using $0\leqslant\alpha_{\Delta t}\leqslant\exp(-\gamma m{\Delta t})$ in the sense of symmetric, positive matrices (with $m\leqslant M\leqslant m^{-1}$ ),

provided ${\Delta t}$ is sufficiently small. Therefore, there exists a constant $R>0$ (depending on $p_{\rm max}$ ) and ${\Delta t}^{*}>0$ such that, for all timesteps $0<{\Delta t}\leqslant{\Delta t}^{*}$ and corresponding integration steps $0\leqslant n\leqslant T/{\Delta t}$ ,

A lengthy but straightforward computation shows that the variance of the centered Gaussian vector $\left(\widetilde{\mathcal{G}}^{n},\mathcal{G}^{n}\right)$ is

To check that this expression is appropriate, we note that it converges as ${\Delta t}\to 0$ with $n{\Delta t}\to T$ to the variance of the limiting continuous process

starting from $(q_{0},p_{0})=(0,0)$ , which reads

The result follows by combining (62)-(63)-(64) and introducing the probability measure

We only prove (24) and (23) since the other results are standard. To obtain the bound (24), we first note that, by the results of Hairer & Mattingly (2011), there exists $\widetilde{\lambda}>0$ such that, for any function $f\in L^{\infty}_{\mathcal{K}_{s},{\Delta t}}$ and $0<{\Delta t}\leqslant{\Delta t}^{*}$ (the critical timestep being given by Lemmas 2.6 and 2.7), the following holds for almost all $(q,p)\in\mathcal{E}$ :

so that, using the contractivity property $|P_{\Delta t}f(q,p)|\leqslant|f(q,p)|$ ,

Introducing $\lambda=\widetilde{\lambda}/T$ , the argument of the exponent reads

when ${\Delta t}$ is sufficiently small. This gives (24).

The moment estimate (23) is obtained by averaging (21) with respect to the invariant measure:

Since $\mu_{\gamma,{\Delta t}}$ is invariant,

which gives the desired result with $R=2C_{b}/C_{a}$ for instance, provided ${\Delta t}$ is sufficiently small.

3 Some useful results

We give in this section an expression for the evolution operator

It is easy to check that the operators $A,B,C$ defined in (8) map $\mathcal{S}$ to itself. It is in fact possible to analytically write down the action of the associated semigroups:

Coming back to the general case, the key building block for the subsequent numerical analysis is the following equality:

where $\mathcal{T}$ is a notation indicating that the operators with the smallest indices (or their associated semigroups) are farthest to the right. In fact, simple computations show that

Therefore, the following equality holds when applied to functions $\varphi\in X$ :

3.2 Baker-Campbell-Hausdorff (BCH) formula

It is important to rewrite the various terms in the right-hand side of (66) in a form more amenable to analytical computations. More precisely, it is convenient to write the following equality in terms of operators defined on $X$ :

where the operator $S_{n}$ involves commutators $[A_{i},A_{j}]$ , which can also be defined as operators on $X$ with values in $X$ . In fact, the algebraic expressions of the operators $S_{n}$ can be formally obtained from the BCH formula for first order splittings (see for instance (Hairer et al., 2006, Section III.4.2)): for $M=3$ ,

and from the symmetric BCH formula for second order involving 3 operators (obtained by composition of the standard BCH formula involving 2 operators):

where we do not write down the expressions of the higher order terms ${\Delta t}^{2n}$ (for $n\geqslant 2$ ). Let us insist that these formulas are only formal (since the operators appearing the argument of the exponential on the right-hand side involve more and more derivatives), but nonetheless allow us to find the algebraic expressions of $S_{n}$ upon formally expanding the exponential as

and identifying terms with the same powers of ${\Delta t}$ in (66).

3.3 Approximate inverse operators

Consider an operator $A$ defined on some core $X$ (typically some subspace of $\mathcal{S}$ ), and whose inverse is also defined on $X$ in the following sense: for any $g\in X$ , there exist $f\in X$ such that $Af=g$ . We denote by $A^{-1}g$ the element $f\in X$ . At this stage, we do not assume any boundedness in an appropriate operator norm for $A^{-1}$ or some extension of it. We next consider a perturbation ${\Delta t}^{\alpha}B$ for some exponent $\alpha\geqslant 1$ , where $B$ is also defined on $X$ and has values in $X$ . In the typical situations encountered in this article, $B$ is not bounded with respect to $A$ in an appropriate operator norm since it may involve higher order derivatives than $A$ does. It is therefore impossible in general to properly define the inverse of $A+{\Delta t}^{\alpha}B$ .

However, it is possible to introduce an approximate inverse, which we define as an operator $Q_{{\Delta t},n}$ from $X$ to $X$ such that there exists an operator $\widetilde{Q}_{{\Delta t},n}$ from $X$ to $X$ for which the following equality holds for any function $f\in X$ :

4 Proof of Theorem 2.13

First, note that, by definition of the invariant measure $\mu_{\gamma,{\Delta t}}$ , it holds that, for any $\varphi\in\mathcal{S}$ ,

The next step is to choose the correction function $f_{1,\gamma}$ . Using the results of Section 4.3, a simple computation shows that

The dominant term on the right-hand side can be written, using integration by parts,

In view of (69), we choose the correction function in order to eliminate the dominant term:

We would like, at this stage, to replace the observable $\varphi$ appearing on the left hand side by the function

However, we do not have any control on the derivatives of this function (Corollary 2.9 allows to control the norm of the function, not of its derivatives), whereas such a control is required to bound the remainder terms. In order to use an approximate inverse operator involving iterated powers of $\mathcal{L}_{\gamma}^{-1}$ (see Section 4.3.3), we first need to make sure that all operators are defined on $\widetilde{\mathcal{S}}$ , with values in $\widetilde{\mathcal{S}}$ . This is the case for $\mathcal{L}_{\gamma}$ and its inverse, but not for the other operators appearing in (70), which have values in $\mathcal{S}$ . We therefore project out averages with respect to $\mu$ . Define to this end the projector

which maps $\mathcal{S}$ to $\widetilde{\mathcal{S}}$ . Then, for a function $\varphi\in\widetilde{\mathcal{S}}$ (for which $\Pi^{\perp}\varphi=\varphi$ ), (72) can be rewritten as

where we have used the fact that $f_{1,\gamma}$ is of average zero with respect to $\mu$ . On the other hand, (69) may be rewritten

As a consequence of the presence of the projection $\Pi^{\perp}$ , all of the operators in (70) are restricted to the range of $\Pi^{\perp}$ , i.e. the following equality holds on $\widetilde{\mathcal{S}}$ :

which is a well defined operator from $\widetilde{\mathcal{S}}$ to $\widetilde{\mathcal{S}}$ such that

where $Z_{1,{\Delta t}}$ maps $\mathcal{S}$ to $\mathcal{S}$ . We finally replace $\varphi$ by $Q_{1,{\Delta t}}\psi$ in (74). This gives (recall that $\Pi^{\perp}\psi=\psi$ by assumption)

where the functions $\widetilde{R}_{1,{\Delta t}}\psi,\widehat{R}_{1,{\Delta t}}\psi$ belong to $\mathcal{S}$ when $\psi$ does. The integral on the right-hand side is uniformly bounded for small ${\Delta t}$ (using the fact that the functions appearing in the integral are in $\mathcal{S}$ and relying on Proposition 2.8). This gives (30) for the splitting scheme $P_{\Delta t}^{\gamma C,B,A}$ .

The function $f_{1}^{\gamma C,B,A}\in\widetilde{\mathcal{S}}$ (denoted by $f_{1,\gamma}$ above) is uniquely determined by the equation

where we have used $[\mathcal{L}_{\gamma}^{2}]^{*}\mathbf{1}=0$ to simplify the right-hand side. Now, $[C,A+B]^{*}=[C,A+B]$ since $C^{*}=C$ and $(A+B)^{*}=-(A+B)$ . Therefore, $[C,A+B]^{*}\mathbf{1}=0$ . In addition,

since $A^{*}=-A+g$ and $B^{*}=-B-g$ . Therefore,

The expressions for the first order corrections when the operators $A$ and $B$ are exchanged are obtained by noting that the sign of $S_{1}^{*}\mathbf{1}$ is changed and that $f_{1}^{B,\gamma C,A}=f_{1}^{\gamma C,A,B}+A^{*}\mathbf{1}$ .

Let us highlight the structure of the proof, in order to make clear which technical extensions are required in order to state error estimates for other dynamics:

first, an expansion of the evolution operator $P_{\Delta t}$ in powers of ${\Delta t}$ has to be written out. This step is usually quite simple although sometimes algebraically involved. The expansion of $P_{\Delta t}$ is the same as the one used to prove weak error estimates;

second, good control on the resolvent has to be established, such as the stability result provided by Theorem 2.2. This step may already be quite complicated since it involves proving that $\mu$ is the unique invariant measure, and that the resolvent can be inverted for functions with average zero with respect to $\mu$ . A typical way to do so is to establish decay properties of the semigroup. Such decay estimates may be hard to obtain for degenerate noises;

the existence of an invariant measure $\mu_{{\Delta t}}$ for the numerical scheme has to be demonstrated (uniqueness is not required), typically by finding a Lyapunov function. Again, this may be difficult if the dynamics is highly degenerate.

5 Proof of Proposition 2.15

We use a very standard strategy: first, we propose an ansatz for the correction term $f_{1,\gamma}$ as

then identify the two leading order terms in this expression, and finally use the resolvent estimate of Theorem 2.3 to conclude. Note that our ansatz is not obvious since the estimate of Theorem 2.3 shows that, in general, a leading order correction term of order $1/\gamma$ should be considered. It turns out however that, due to the specific structure of the right-hand side of (31) (namely the fact that the right-hand is at leading order in $\gamma$ the image under the Hamiltonian operator of some function), such a divergent leading order term is not necessary.

Consider for instance the case when $f_{1,\gamma}$ is $f_{1}^{\gamma C,B,A}$ . This function solves

so that we consider the ansatz $f_{1}^{\gamma C,B,A}=g/2+\gamma f_{1}^{1}+\dots$ . Identifying terms with same powers of $\gamma$ , we see that the correction term $f_{1}^{1}$ should satisfy

Possible solutions are defined up to elements of the kernel of $A+B$ (which contains function of the form $\varphi\circ H$ ). One possible choice is to set $f_{1}^{1}=\beta p^{T}M^{-2}p/4+c_{1}^{1}$ , where the constant $c_{1}^{1}$ is chosen in order for $f_{1}^{1}$ to have a vanishing average with respect to $\mu$ . Then,

In view of Theorem 2.3, this implies that there exists a constant $K>0$ such that

for $\gamma\leqslant 1$ , which gives the desired estimate on $f_{1}^{\gamma C,B,A}$ . Similar computations give the estimate on $f_{1}^{\gamma C,A,B}$ , while the estimates on the remaining functions are obtained from (31).

6 Proof of Theorem 2.16

The proof follows the same lines as the proof for the first order splitting schemes (see Section 4.4). We present only the required modifications. We write the proof for $P_{\Delta t}^{\gamma C,B,A,B,\gamma C}$ since the correction term has a much simpler right-hand side than $P_{\Delta t}^{A,B,\gamma C,B,A}$ .

Expanding up to terms of order ${\Delta t}^{5}$ the formal expression of $P_{\Delta t}^{\gamma C,B,A,B,\gamma C}$ given by the BCH expansion (67), we obtain the following equality (as operators on $\mathcal{S}$ )

We choose $f_{2}^{\gamma C,B,A,B,\gamma C}\in\widetilde{\mathcal{S}}$ as the unique solution of the Poisson equation $\mathcal{L}_{\gamma}^{*}f_{2}^{\gamma C,B,A,B,\gamma C}=-S_{2}^{*}\mathbf{1}$ (which is indeed well posed since the right hand side has a vanishing average with respect to $\mu$ since it is in the image of $S_{2}$ , and is regular as shown by (78) below). Then, for a function $\varphi\in\mathcal{S}$ ,

where many terms cancel by the invariance of $\mu$ by $\left(\mathcal{L}_{\gamma}^{\alpha}\right)^{*}$ (for integer powers $\alpha$ ). The leading order term on the right-hand side in fact vanishes since it can be rewritten as

We then restrict the above equality to functions $\varphi\in\widetilde{\mathcal{S}}$ , project out the average with respect to $\mu$ of the first factor in the integral on the left using the projector $\Pi^{\perp}$ introduced in (73), and finally replace $\varphi$ by $Q_{2,{\Delta t}}\psi$ where $Q_{2,{\Delta t}}$ is an approximate inverse satisfying

The proof is concluded as in Section 4.4.

To evaluate the expression $S_{2}^{*}\mathbf{1}$ , we need to compute the actions of the formal adjoints of the various commutators. Using $C\mathbf{1}=(A+B)\mathbf{1}=0$ and

straightforward computations show that $S_{2,2}^{*}\mathbf{1}=S_{2,1}^{*}\mathbf{1}=0$ . In addition, since

where we have used $ABg=BAg$ (as can be checked by a direct computation). A similar computation shows that $\big{(}[B,[B,A]]\big{)}^{*}\mathbf{1}=(-AB+2BA+B^{2})g=(A+B)Bg=ABg$ (since $B^{2}g=0$ by a direct verification). Finally,

To obtain the expression of $f_{2}^{A,B,\gamma C,B,A}$ , we use the TU lemma with the operator

In fact, it can be shown that the ${\Delta t}^{3}$ term does not pollute the remainder since the next order correction in the invariant measure has to be of order ${\Delta t}^{4}$ (see (32)). The expressions for $f_{2}^{\gamma C,A,B,A,\gamma C}$ and $f_{2}^{B,A,\gamma C,A,B}$ are obtained in a similar manner.

7 Proof of Corollary 2.17

where we have used the invariance of $\mu$ by $U_{\Delta t}$ . The comparison with (32)-(33) gives the desired result.

8 Approximation of integrated correlation functions

The proof makes use of the projection operator defined on $\mathcal{S}$ as (compare (73))

The range of $\Pi^{\perp}_{\Delta t}$ is contained in the set of functions with average zero with respect to the invariant measure $\mu_{\Delta t}$ of the numerical scheme. We first introduce the invariant measure for the numerical scheme, using the fact that $-\mathcal{L}_{\gamma}^{-1}\psi$ has zero average with respect to $\mu$ :

where $r^{\psi,\varphi}_{\Delta t}$ is uniformly bounded for ${\Delta t}$ sufficiently small by (38). In addition, by (39),

Note that the sum on the right hand side is well defined in view of the decay estimates (24). Plugging the above equality in (79) leads to

The second term on the right hand side is uniformly bounded in view of the moment estimates on $\mu_{\Delta t}$ , the resolvent bounds provided by Corollary 2.9 and the uniform boundedness of the remainder $\widetilde{R}_{\alpha,{\Delta t}}f$ for a given, smooth function $f$ . Since

The idea is to start from (40) and to appropriately rewrite the first order correction term. We use to this end (40) with $\psi$ replaced by its first order correction $(\psi_{{\Delta t},2}-\psi)/{\Delta t}=S_{1}\mathcal{L}_{\gamma}^{-1}\psi$ , and discard terms of order ${\Delta t}^{2}$ :

where $r^{\psi,\varphi}_{\Delta t}$ is uniformly bounded for ${\Delta t}$ sufficiently small and $\varphi_{{\Delta t},0}=\Pi_{\Delta t}^{\perp}\varphi$ . On the other hand, using $S_{1}=\mathcal{L}_{\gamma}^{2}/2$ ,

9 Proof of Theorem 2.21

We write the proof for the evolution operator $P_{\Delta t}^{\gamma C,A,B,A,\gamma C}$ first, and mention then how to extend the result to $P_{\Delta t}^{B,A,\gamma C,A,B}$ using the TU lemma. The proofs for $P_{\Delta t}^{\gamma C,B,A,B,\gamma C}$ and $P_{\Delta t}^{A,B,\gamma C,B,A}$ are very similar, so we only briefly mention the required modifications. By default, all operators appearing in this section are defined on the core $\mathcal{S}$ .

This suggests to consider the limiting operator $P_{\infty,{\Delta t}}=\pi P_{{\rm ham},{\Delta t}}\pi$ and write

For a given smooth function $\varphi\in\mathcal{S}$ which depends only on the position variable $q$ ,

The idea is that the remainders $r_{\varphi,\gamma,{\Delta t}}^{1}$ and $r_{\varphi,\gamma,{\Delta t}}^{2}$ are exponentially small when the function $\varphi$ is sufficiently smooth (see below for a more precise discussion, once $\varphi$ has been replaced by $Q_{\Delta t}\psi$ with $Q_{\Delta t}$ an appropriate approximate inverse). Therefore, the leading order terms in the error estimate are obtained by considering the limiting operator $P_{\infty,{\Delta t}}$ only.

We now study the error estimates associated with $P_{\infty,{\Delta t}}$ , following the strategy used in Section 4.4. We first use the results of Section 4.3.1 with $M=3$ , $A_{1}=A_{3}=A/2$ and $A_{2}=B$ to expand $P_{{\rm ham},{\Delta t}}$ , so that

with $S_{i}=\mathcal{T}[(A_{1}+A_{2}+A_{3})^{i}]$ . To give more precise expressions of the operators appearing on the right-hand side of the above equality, we use the following facts:

Plugging the above results in (83) and introducing $h={\Delta t}^{2}/2$ ,

An approximate inverse of the operator appearing on the left hand side of the above equality is thus

where $f_{2,\infty}$ is the unique solution of

A more explicit expression can be obtained by noting that

so that (recalling $\mathcal{L}_{\rm ovd}=-\beta^{-1}\nabla^{*}\nabla=-\beta^{-1}\sum_{i=1}^{dN}\partial_{q_{i}}^{*}\partial_{q_{i}}$ where the formal adjoints are taken on $L^{2}(\overline{\mu})$ )

Since $f_{2,\infty}$ should have a vanishing average with respect to $\mu$ , this proves that

where the constant $a$ is adjusted to account for the constraint of vanishing average. A simple computation shows that it is equal to the constant $a_{\beta,V}$ defined in (43).

In fact, it is possible for the scheme considered here to precisely determine the leading order correction for numerical averages by noting that

We now come back to (81)-(82) and replace $\overline{\Pi}^{\perp}\varphi$ by $Q_{h}\psi$ :

where $\overline{r}_{{\Delta t},\psi}$ is the same as in (91), while

so that $P_{\Delta t}^{B,A,\gamma C,A,B}=T_{\gamma,{\Delta t}}U_{\gamma,{\Delta t}}$ while $P_{\Delta t}^{\gamma C,A,B,A,\gamma C}=U_{\gamma,{\Delta t}}T_{\gamma,{\Delta t}}$ . By the TU lemma,

where the remainder $\widetilde{r}_{\psi,{\Delta t}}$ is uniformly bounded for ${\Delta t}$ sufficiently small. Therefore,

We mimic the above proof for the evolution operator $P_{\Delta t}^{\gamma C,B,A,B,\gamma C}$ . The equality (83) still holds, but the operator $S_{4}$ now reads

The expression of $f^{A,B,\gamma C,B,A}_{2,\infty}$ is obtained via the TU lemma, introducing the limiting operator

Let us conclude this section with the proof of Lemma 4.9.

The conclusion follows for instance by an application of Theorem 8.7 in Rey-Bellet (2006), considering as a reference dynamics the Ornstein-Uhlenbeck process

for an appropriate constant $b_{s}\geqslant 0$ . This shows the existence of constants $R_{s},\alpha_{s}$ such that

10 Proof of Proposition 2.22

Note that the above function has average zero with respect to $\kappa$ . We then apply Theorem 2.4 to obtain

To compute $\pi(A+B)C^{-1}\widetilde{g}$ , we rely on (86) and (87) and obtain

This allows us to conclude that the limit of $f_{2}^{\gamma C,B,A,B,\gamma C}$ is the argument of the operator $\mathcal{L}_{\rm ovd}$ in the previous line, up to an additive constant chosen to ensure that $f_{2}^{\gamma C,B,A,B,\gamma C}$ has a vanishing average with respect to $\mu$ (which turns out to be $a_{\beta,V}/8$ ). We deduce the limit for $f_{2}^{A,B,\gamma C,B,A}$ with (33) since $(A+B)g=p^{T}(\nabla^{2}V)p-|\nabla V|^{2}$ .

The expressions for the limits of $f_{2}^{\gamma C,A,B,A\gamma C}$ and $f_{2}^{B,A,\gamma C,A,B}$ are obtained in a similar fashion.

11 Linear response theory

We briefly sketch the discussion in (Stoltz, 2012, Section 3.1) (see in particular Theorem 3.1 in this reference). Hypoellipticity arguments show that the measure $\mu_{\gamma,\eta}$ has a smooth density with respect to the Lebesgue measure. It moreover formally satisfies the Fokker-Planck equation

This equation can be given a rigorous meaning when $\eta$ is sufficiently small. We rely on the following result (proved at the end of this section), which is itself based on the fact that $\left(\mathcal{L}_{\gamma}^{*}\right)^{-1}$ can be extended to a bounded operator on $\mathcal{H}^{0}$ (see Theorem 2.3 and the comment after it).

The operator $(\mathcal{L}_{\gamma}^{*})^{-1}\widetilde{\mathcal{L}}^{*}$ , considered as an operator on the Hilbert space $\mathcal{H}^{0}=L^{2}(\mu)\cap\{\mathbf{1}\}^{\perp}$ introduced in (14), is bounded.

Denoting by $r$ the spectral radius of $(\mathcal{L}_{\gamma}^{*})^{-1}\widetilde{\mathcal{L}}^{*}\in\mathcal{B}(\mathcal{H}^{0})$ , it is easily checked that $\left(\mathcal{L}_{\gamma}+\eta\widetilde{\mathcal{L}}\right)^{*}$ is invertible for $|\eta|<r^{-1}$ with

Therefore, a straightforward computation shows that

with $r_{\eta,\gamma}$ uniformly bounded as $\eta\to 0$ . This gives (48).

Note first that the image of $\widetilde{\mathcal{L}}^{*}$ is contained in $\mathcal{H}^{0}$ since, for any $u\in\mathcal{S}$ ,

It is therefore possible to give a meaning to the operator $(\mathcal{L}_{\gamma}^{*})^{-1}\widetilde{\mathcal{L}}^{*}$ as an operator on $\widetilde{\mathcal{S}}$ . We then check that the perturbation $\widetilde{\mathcal{L}}$ is $\mathcal{L}_{\gamma}$ -bounded (with relative bound 0, in fact): for $u\in\widetilde{\mathcal{S}}$ ,

so that, for $u\in\mathcal{H}^{0}$ (recall that $\mathcal{L}_{\gamma}^{-1}u$ is well defined in this case),

This proves that $\widetilde{\mathcal{L}}\mathcal{L}_{\gamma}^{-1}$ is bounded, hence its adjoint is bounded as well.

11.2 Proof of Lemma 3.2

Recall that we set mass matrices to identity when considering overdamped limits. Since

it follows (using first (98) to compute the linear response and then (17) to obtain the asymptotic behavior of $\mathcal{L}_{\gamma}^{-1}(F^{T}\nabla V)$ as $\gamma\to+\infty$ )

where $r_{\gamma}$ is uniformly bounded for $\gamma\geqslant 1$ . This gives the desired result.

The article Hairer & Pavliotis (2008) in fact studies the limiting behavior of the autodiffusion coefficient, as computed from (50):

Using $\mathcal{L}_{\rm ovd}=-\beta^{-1}\nabla_{q}^{*}\nabla_{q}$ , a simple computation shows

so that $\beta\overline{D}_{F}=|F|^{2}+\overline{\nu}_{F}$ .

12 Proof of Theorem 3.4

The proof again is along the lines of the proof written in Section 4.4, and we are therefore very brief, mentioning only the most important modifications.

Let us first consider the first order scheme $P_{{\Delta t}}^{\gamma C,B+\eta\widetilde{\mathcal{L}},A}$ . Using the notation introduced in Section 4.3.1, and recalling the definition $B_{\eta}=B+\eta\widetilde{\mathcal{L}}$ , we write

All the operators appearing in the expressions above are defined on the core $\mathcal{S}$ , and have values in $\mathcal{S}$ . Since

it is easy to see that the operator $R_{\eta,{\Delta t}}$ can be rewritten as the sum of two contributions: $R_{\eta,{\Delta t}}=R_{0,{\Delta t}}+\eta\widetilde{R}_{\eta,{\Delta t}}$ , where, for $\psi\in\mathcal{S}$ , the smooth function $\widetilde{R}_{\eta,{\Delta t}}\psi$ can be uniformly controlled in $\eta$ for $|\eta|\leqslant 1$ . Finally, the evolution operator can be rewritten as

where $S_{1}$ is defined in (70) (which corresponds to the case $\eta=0$ ), $D_{1}=(2\gamma C+B)\widetilde{\mathcal{L}}+\widetilde{\mathcal{L}}(2A+B)$ , and

We then compute, for $\varphi\in\mathcal{S}$ and $f_{1,1,\gamma}\in\widetilde{\mathcal{S}}$ to be chosen later,

The first two terms in the last expression vanish by definition of $f_{0,1,\gamma}$ and $f_{1,0,\gamma}$ , while the third one vanishes when the function $f_{1,1,\gamma}$ is defined by the Poisson equation

It is easy to check that the right-hand side of this equation has a vanishing average with respect to $\mu$ (integrating with respect to $\mu$ and letting the adjoints of the operators act on $\mathbf{1}$ ). We then project (99) using $\Pi^{\perp}$ and introduce the approximate inverse, defined on $\widetilde{\mathcal{S}}$ as

obtained by truncating the formal series expansion of the inverse operator by discarding terms associated with $\eta^{2}$ or ${\Delta t}^{2}$ . The approximate inverse is such that

with $\mathcal{R}^{2}_{\eta,{\Delta t}}=\mathcal{R}^{2}_{0,{\Delta t}}+\eta\widetilde{\mathcal{R}}^{2}_{\eta,{\Delta t}}$ . We then replace $\Pi^{\perp}\varphi$ by $Q_{\eta,{\Delta t}}\psi$ and conclude as in Section 4.4.

The result for the second order splitting is obtained by appropriate modifications of the proof written above for $p=1$ , similar to the ones introduced in Section 4.6. We will therefore mention only the most important point, which is the following. Replacing $B$ by $B_{\eta}$ in the expansion (77), we see that

where $\mathscr{R}_{\eta,{\Delta t}}$ regroups operators of order ${\Delta t}^{3+\alpha}\eta^{\alpha^{\prime}}$ or ${\Delta t}^{2+\alpha}\eta^{2+\alpha^{\prime}}$ for $\alpha,\alpha^{\prime}\geqslant 0$ , the operator $S_{2}$ is defined in (76) and $\widetilde{S}_{2,\eta}$ satisfies

We consider only contributions of the form $\eta^{\alpha}{\Delta t}^{\alpha^{\prime}}$ with $\alpha=0,1$ and $0\leqslant\alpha^{\prime}\leqslant 2$ . The contributions in ${\Delta t},{\Delta t}^{2}$ are the same as in the case $\eta=0$ and therefore vanish. The contribution in $\eta$ vanishes in view of the choice of $f_{0,1,\gamma}$ . For the same reason, the contribution in $\eta{\Delta t}$ vanishes as well:

The contribution in $\eta{\Delta t}^{2}$ is proportional to

The requirement that this expression vanishes for all functions $\varphi\in\mathcal{S}$ characterizes the function $f_{2,1,\gamma}$ (the discussion on the solvability of this equation following the same lines as the discussion on the solvability of (101)). The proof is then concluded as in the case $p=1$ .

13 Proof of Theorem 3.6

The proof of this result is obtained by modifying the proof of Theorem 2.21 presented in Section 4.9 by taking into account the nonequilibrium perturbation, as done in the proof of Theorem 3.4 presented in Section 4.12. We will therefore be very brief and only mention the most important modifications.

We write the proof for the scheme associated with the evolution operator $P_{\Delta t}^{\gamma C,A,B_{\eta},A,\gamma C}$ for instance (since this is the case explicitly treated in Section 4.9 for $\eta=0$ ). First, arguing as in Section 4.9, we see that it is possible to replace $P_{\Delta t}^{\gamma C,A,B_{\eta},A,\gamma C}$ by

up to error terms in the invariant measure which are exponentially small in $\gamma{\Delta t}$ . Note that $B_{\eta}=(F-\nabla V)\cdot\nabla_{p}$ , so that the rules (84)-(85) are still valid. Therefore, introducing again $h={\Delta t}^{2}/2$ ,

where $D$ is defined in (88), and the expressions of the operators $\widetilde{D}_{i}$ ( $i=1,2$ ) are obtained by expanding the various terms $A^{a}B_{\eta}^{b}A^{c}$ in powers of $\eta$ . Keeping only the dominant terms, we arrive at

This relation is the analogue of (100) in the overdamped limit, and the remainder of the proof is carried on following the strategy presented in Section 4.9.

Acknowledgements

The authors thank Francis Nier for several fruitful discussions on the properties of Langevin-type generators. Ben Leimkuhler and Charles Matthews acknowledge the support of the Engineering and Physical Sciences Research Council (UK) and grant EP/G036136/1. G. Stoltz was partially supported by the project DYMHOM (De la dynamique moléculaire, via l’homogénéisation, aux modèles macroscopiques de poroélasticité et électrocinétique) from the program NEEDS (Projet fédérateur Milieux Poreux MIPOR). G. Stoltz also benefited from the scientific environment of the Laboratoire International Associé between the Centre National de la Recherche Scientifique and the University of Illinois at Urbana-Champaign.