Rational Construction of Stochastic Numerical Methods for Molecular Sampling

Benedict Leimkuhler, Charles Matthews

Introduction

Let U:RNRU:{\bf R}^{N}\rightarrow{\bf R} be the potential energy function of a classical model for a molecular system. A fundamental challenge is to sample the configurational Gibbs-Boltzmann (canonical) distribution with density

where β1=kBT\beta^{-1}=k_{B}T where kBk_{B} is Boltzmann’s constant, TT is temperature, and ZZ is a normalization constant so that ρˉβ\bar{\rho}_{\beta} has unit integral over the entire configuration space. A wide variety of methods are available to calculate averages with respect to ρˉβ\bar{\rho}_{\beta}; among these, some of the most popular are based on Brownian dynamics or Langevin dynamics (defined in the phase space of positions and momenta) . (In this article we focus exclusively on molecular dynamics techniques; molecular models can also be sampled using Monte-Carlo methods, and, more generally, using hybrid algorithms which combine molecular dynamics with a Metropolis-Hastings test in order to correct averages. For a recent review of such schemes see .) Recall that Brownian dynamics (overdamped Langevin dynamics) is a system of Itō-type stochastic differential equations of the form

where dW(t){\rm d}W(t) is the infinitesimal increment of a vector of stochastic Wiener processes W(t)W(t), and MM is a positive (we assume here diagonal) mass matrix.In terms of sampling the Gibbs distribution, MM is in fact arbitrary, but it may be useful to allow for coordinate scaling. A simple and popular method for numerical solution of Eq. 2 is the Euler-Maruyama method

where RnR_{n} is a vector of random variables with standard normal distribution. This produces a sequence of points x0,x1,x2,x_{0},x_{1},x_{2},\ldots, which, following a certain relaxation period, are approximately distributed according to the canonical invariant distribution. Euler-Maruyama has the property that the time averages along discrete trajectories, in the limit of large time (under appropriate conditions on the potential U(x)U(x) and assuming no effects from floating point rounding error), have error proportional to hh. One of the observations of this article is that the simple modification

provides a second order approximation of stationary averages.

We arrive at this scheme by considering the large friction limit in a particular numerical method for Langevin dynamics. Recall that Langevin dynamics is a stochastic-dynamical system involving both positions and momenta pp of the form

where W=W(t)W=W(t) is again a vector of NN independent Wiener processes, and γ>0\gamma>0 is a free parameter, the friction coefficient. The methods that are in fact the primary focus of this paper are splitting integrators that decompose the stochastic vector field of Langevin dynamics into simpler vector fields which can be solved exactly. The composition method that results cannot be directly related to a stochastic differential equation, so the analogy with backward error analysis for deterministic problems is incomplete, but nonetheless the invariant measure associated to the numerical method can be derived, using the Baker-Campbell-Hausdorff (BCH) expansion, as an asymptotic series in two parameters: the stepsize and the reciprocal of the friction coefficient. The superconvergence result alluded to above is then obtained in the high friction limit.

The Langevin stepsize must be understood to be proportional to the square root of the stepsize that appears in Eq. 3, so in Langevin dynamics an effective 4th order approximation is obtained, but only for the marginal configurational invariant distribution, Eq. 1. Our approach also provides a simple method for comparative assessment of the invariant measure of a class of Langevin integrators.

Molecular dynamics is a large family of modelling techniques which is widely used in different application areas and for different purposes . This article is addressed specifically to the topic of calculating averages with respect to an invariant distribution, and will probably be of highest interest for applications in molecular sampling, in particular the calculation of averages with respect to the configurational density Eq. 1. The high friction limit renders Langevin dynamics unsuitable for dynamical modelling (except as a method for generating starting configurations for dynamical exploration). It is worth noting that invariant measure computations arise frequently in applications other than molecular modelling, and the techniques described here would be of potential use in many of these.

In large scale simulations, it should be understood that the statistical error (dependent on the number of samples used) is typically the dominant concern. Our approach focuses on the truncation error of the invariant distribution, thus the greatest benefit would be seen only when the statistical error is well controlled. Nonetheless we observe in our numerical experiments that the least biased scheme from the point of view of the error introduced in configurational sampling is also as efficient as the alternatives and the most robust with respect to variation of the parameters (stepsize, friction coefficient), thus there is effectively no price for the improvement in accuracy. Moreover, it would be possible to complement the methods proposed here by procedures such as importance sampling to further reduce the statistical error.

With regard to the approximation of canonical averages, methods have previously been constructed for Brownian dynamics with order >1>1 and for Langevin dynamics with order >2>2 , but these require multiple evaluations of the force; for this reason they are not normally viewed as competitive alternatives for molecular sampling . By contrast all of the methods described in this article use a single force evaluation at each timestep.

The approach used here may be compared to other recent works on stochastic numerical methods, and, in particular . Our technique differs from these in (i) the direct focus on the stationary configurational distribution, and (ii) the use of the BCH expansion. Other articles (see e.g. ) which address the invariant measure in Langevin-type stochastic differential equations do not use the backward error analysis (and do not find the superconvergent scheme).

The rest of this article is as follows. In Section 2 we review necessary background on stochastic differential equations for sampling from the Gibbs measure. Section 3 presents our expansion of the associated perturbed invariant measure and calculations involving the use of the Baker-Campbell-Hausdorff theorem. Section 4 describes the reduction of the methods in the case of overdamped Langevin Dynamics. Section 5 demonstrates the theory obtained using numerical experiments to verify the results.

Background

The Ornstein-Uhlenbeck (OU) stochastic differential equation is an Itō equation of the form

where uu is a random variable defined for each time tt, γ\gamma and σ\sigma are positive parameters and dW{\rm d}W represents the infinitesimal increment of a Wiener process. Langevin dynamics (Eq. 4) combines the Ornstein-Uhlenbeck stochastic vector field with conservative dynamics.

For stochastic differential equations, the density evolves according to an evolution equation (the Fokker-Planck or forward Kolmogorov equation) of the form

where L{\cal L}^{*} is a second order differential operator. In the case of Langevin dynamics, the relevant Kolmogorov operator is defined by its action on a function ϕ\phi of the variables of the system by

Assuming UU is CC^{\infty} it is possible to demonstrate that the operator LLD{\cal L}_{\rm LD}^{*} is hypoelliptic by using Hörmander’s criterion based on iterated commutators, and this implies that the Gibbs measure is the unique steady state (up to normalization). Even stronger is the result of which demonstrates the existence of a spectral gap for the operator LLD{\cal L}_{\rm LD}. Many of the challenges related to obtaining formal analytical results for stochastic differential equations relate to singularities of the potential and/or the assumption of an unbounded solution domain. However, with periodic boundary conditions and strong repulsive potentials (e.g. Lennard-Jones potentials) we observe that configurations typically evolve in a bounded set and remain far from singular points (the radial distribution vanishes in a large interval around the origin). Indeed it is a simple calculation to demonstrate that for a Lennard-Jones system with potentials φ(r)=(σ/r)122(σ/r)6\varphi(r)=(\sigma/r)^{1}2-2(\sigma/r)^{6} the expected number of samples required to observe r(0,σ/2)r\in(0,\sigma/2) at unit temperature would involve a simulation of duration far greater than the age of the universe, due to the steepness of the potential close to the origin. In our simulations of a small Lennard-Jones cluster in Section 5, we did not observe a separation of two atoms beneath 0.7σ0.7\sigma in all of the nearly 101110^{11} timesteps performed to gather statistics; a separation less than 0.5σ0.5\sigma could only be seen at very large stepsizes, when instabilities due to other components of model, e.g. harmonic bonds, would have anyway rendered a typical molecular simulation useless. Thus it is somewhat of a moot point whether we simply assume that configurations stay well away from singular points (domain restriction) or that the potential has been smoothly cut off to remove the singularity; in no case will we encounter the atomic collision singularity in a simulation of the type envisioned here.

LLDf=0{\cal L}_{\rm LD}^{*}f=0 has a unique solution in C(Ω)C^{\infty}(\Omega), i.e., the Gibbs density ρβ\rho_{\beta},

LLD{\cal L}_{\rm LD}^{*} has a compact resolvent , and the Gibbs state is therefore exponentially attracting.

Note that, if, for some given gg, there are two solutions of LLDf=g{\cal L}_{\rm LD}^{*}f=g then, using the linearity of the operator these may differ only in a constant scaling of the Gibbs density.

In a splitting method for a deterministic system z˙=f(z)\dot{z}=f(z), one divides the vector field ff into exactly solvable parts, i.e. f=f1+f2f=f_{1}+f_{2}, which are treated sequentially within a timestep. An example of such a splitting method for the Hamiltonian system with energy H(x,p)=ipi2/(2mi)+U(x1,x2,,xN)H(x,p)=\sum_{i}p_{i}^{2}/(2m_{i})+U(x_{1},x_{2},\ldots,x_{N}) is the “symplectic Euler” method defined by f1=imi1pixif_{1}=\sum_{i}m_{i}^{-1}p_{i}\partial_{x_{i}}, f2=iUxipif_{2}=-\sum_{i}\frac{\partial U}{\partial x_{i}}\partial_{p_{i}}. By dividing the vector field as f=12f1+f2+12f1f=\frac{1}{2}f_{1}+f_{2}+\frac{1}{2}f_{1}, solving each vector field in turn, we obtain the so-called position Verlet method, and by switching the roles of f1f_{1} and f2f_{2} we obtain the velocity Verlet method. Splitting methods like these are explicit and this feature is of particular importance in molecular dynamics, where the force calculation is the usual measure of per-timestep computational complexity.

In a similar way, Langevin dynamics may be treated by splitting . For example, one may divide the Langevin system (Eq. 4) into three parts

and each of the three parts may be solved ‘exactly’. In the case of the OU part (labelled here simply as O) we mean by this that we realize the stochastic process by the equivalent formula

where R(t)R(t) is a vector of uncorrelated independent standard normal random processes (white noise). One method based on the splitting in Eq. 5 is defined by the composition

where exp(δtLf)\exp(\delta t{\cal L}_{f}) represents the phase space propagator associated to the (deterministic or stochastic) vector field ff, and we use δt\delta t as the timestep in Langevin dynamics (later we use hh for the timestep of an associated Brownian dynamics). The deterministic part is approximated by the position Verlet method. This is referred to as a Geometric Langevin Algorithm of order two (GLA-2) following ; an alternative is to use velocity Verlet for the Hamiltonian part (ψBABOδt\psi_{\rm BABO}^{\delta t}). A simple generalization of GLA methods is obtained by interspersing integrators associated to parts of the Hamiltonian vector field with exact OU solves, thus we have

and ψBAOAB\psi_{\rm BAOAB} defined in an analogous way. Recent work in similarly uses exact OU solves to give an integrator equivalent to ψOBABO\psi_{\rm OBABO} (in our notation), though with a reparameterised timestep. The analysis technique we use can employed to study many such integrators, though for brevity we will limit this article to discussion only on a select few interesting cases.

An alternative integrator termed the Stochastic Position Verlet (SPV) method , relies on the splitting

SPV is not a (generalized) GLA-type method, although it, as each of the generalized GLA schemes, is quasisymplectic in the language of . Likewise the commonly used method of Brunger, Brooks and Karplus (BBK) is not of the (generalized) GLA family. Details of all methods examined are given in the Appendix.

To slightly simplify the presentation that follows, we make the change of variables qM1/2qq\rightarrow M^{-1/2}q, pM+1/2pp\rightarrow M^{+1/2}p, with a corresponding adjustment of the potential; this is equivalent to assuming M=IM=I.

Expansion of the Invariant Measure

We shall work here with formal series expansions, however we expect that our derivation could be rigorously founded using techniques found in and . Associated to any given splitting-based method for Langevin dynamics, we define the operator L^\hat{\cal L}^{*} that characterises the propagation of density by an expansion of the form:

For example, for the method labelled by the string ABAO, we have

The perturbation series may be found by successive applications of the BCH expansion and linearity properties of the Kolmogorov operator. However, unlike in the deterministic case, the terms that appear in the series cannot be associated to modified vector fields or even SDEs .

Note that, when iterated n+1n+1 times, the method ABAO produces a sequence of the form

thus, with a minor coordinate transformation, the dynamics sample the same invariant density as BAOAB. Similarly BABO and OBAB are essentially the same method as ABOBA. For this reason we concentrate in the remainder of this article on ABOBA and BAOAB. For these two methods, the symmetry implies that the odd order terms in Eq. 7 vanish identically using the Jacobi identity in the BCH expansions.

After deriving L^\hat{\cal L}^{*} in this way, we seek the invariant distribution which satisfies L^ρ^=0\hat{{\cal L}}^{*}\hat{\rho}=0. For the BAOAB and ABOBA methods, we make the ansatz that the invariant measure of the numerical method has the simple form

Although some technical issues might be encountered, we believe that the existence of such an expansion can be made rigorous using techniques found in , based on the regularity of the operator LLD{\cal L}^{*}_{\rm LD}. We may rewrite this as

This means that the equation L^ρ^=0\hat{\cal L}^{*}\hat{\rho}=0 becomes

Equating second order terms in δt\delta t gives

The equation LLDg=0{\cal L}_{\rm LD}^{*}g=0 has unique solution g=ρβg=\rho_{\beta}, up to a constant multiple. Hence the homogeneous solution to the above PDE is f2(x,p)=cf_{2}(x,p)=c, for some constant cc; we therefore require a particular solution f2f_{2} of Eq. 9. According to the Fredholm Alternative, the equation has a solution provided that, for any solution of

As the only solutions of LLDg=0{\cal L}_{\rm LD}g=0 are the constants, we require

For a symmetric splitting method such as the BAOAB method, recall that we can use the Baker-Campbell-Hausdorff formula to find the overall one-step perturbation operator for the scheme. For linear operators XX, YY and ZZ, we have the relation

Here [X,Y]=XYYX[X,Y]=XY-YX is the commutator of YY and XX.

to compute the perturbed operator for the method. A similar analysis can be conducted for the other generalised GLA-type methods considered in this paper.

To compute L^\hat{\cal L}^{*} we simply plug in our choices into the BCH formula to obtain:

The calculation of the inhomogeneity in (8) then amounts to a straightforward computation of the commutator series applied to ρβ\rho_{\beta}. The commutators needed are:

where we have abbreviated the Hessian xxTU(x)=:U(x).\nabla_{x}\nabla_{x}^{T}U(x)=:U^{\prime\prime}(x).

Observe that Eq. 10 is satisfied since the average of the first term is equivalent to a canonical average which vanishes

whereas the other terms in Eq. 11, being canonical averages of terms which are odd-order in pp, necessarily also average to zero.

An analogous computation can be performed for the ABOBA method, giving a slightly (but crucially) different perturbation operator L(ABOBA){\cal L}^{*\,({\rm ABOBA})}, where

Here UU^{\prime\prime} is the Hessian matrix of UU and Δx=i=1N2xi2\Delta_{x}=\sum_{i=1}^{N}\frac{\partial^{2}}{\partial x_{i}^{2}} is the partial Laplacian in xx. Equation 10 is again seen to be satisfied.

Although we are not able to give the general analytical solution to the partial differential equation of Eq. 9, we can find a solution in an important limiting case: the high friction regime. To do this, we expand the invariant density of the numerical method further, viewing both δt\delta t and ε=γ1\varepsilon=\gamma^{-1} as small parameters:

Dividing by γ\gamma, we may reduce Eq. 9 to

Note that this is a singularly perturbed system as L0{\cal L}_{0} is degenerate and it is only the combined operator that has the necessary regularity to define a unique solution. Nonetheless, as explained below it is possible to find the leading term f2,0f_{2,0} by substituting a truncated expansion of fixed degree and solving the resulting equations.

Equating powers of ε\varepsilon, we find

Truncating at n=2n=2, for example, we find the following solution of these equations:

For ABOBA, f2,0f_{2,0} the solution would change to

3 Marginal distribution

We now turn out attention to the configurational marginal distribution obtained by integrating the density expansion Eq. 12 with respect to the momenta. Our interest is only in the leading term of this expansion, which defines the sampling behavior for large γ\gamma. Ignoring higher order terms in δt\delta t and ε\varepsilon, we would have the distribution

Using the identity det(M)=exp(trace(log(M)))\det(M)=\exp({\rm trace}(\log(M))), we find

We then Taylor expand the logarithm of I+δt24UI+\frac{\delta t^{2}}{4}U^{\prime\prime} and take the trace to obtain a cancellation of the δt2\delta t^{2} terms, giving

If ε\varepsilon is small (or δt\delta t is relatively large), the error will be dominated by the quartic term in δt\delta t and we will observe 4th order accuracy in configurational averages. For small δt\delta t the method is always eventually second order.

In the case of the ABOBA method, the remarkable cancellation of the second order errors does not occur and the method always exhibits 2nd order configuration distribution error.

The Limit Method

We now consider the limit γ\gamma\rightarrow\infty, where the exact solution of the vector OU process reduces to p=kBTM1/2Rp=\sqrt{k_{B}T}M^{1/2}R, where RR is a vector whose components have a standard normal distribution (Gaussian white noise). Alternatively, we could consider the limit of the particle mass going to , although this requires a reformulation of Eq. 4 so that the friction is proportional to the velocity instead of the momentum . Whichever limit is taken, we would expect the ultimate result to be the same. (Here we have reintroduced the masses in order to present the method, since they may be useful scaling parameters in simulation.) In the configurations it is straightforward to show that the BAOAB method therefore becomes

where the RnR_{n} are vectors of i.i.d. standard normal random variables. Replacing δt2/2\delta t^{2}/2 by hh we arrive at Eq. 3. Since the Langevin scheme gives 4th order accurate configurational averages in this limit, we expect the method of Eq. 3 to be second order accurate in modified timestep hh. Moreover, since we completely remove the second order term in the Langevin dynamics configurational density expansion, we expect to observe this behaviour across all values of hh.

By contrast the ABOBA scheme gives a much more complicated limit method as γ\gamma\rightarrow\infty which is not in one-step form.

In the Euler-Maruyama method, the random perturbations introduced at each step are independent. In the method of Eq. 2, the random perturbation is a scaling of Zn=(Rn+Rn+1)/2Z_{n}=(R_{n}+R_{n+1})/\sqrt{2}; the components Zn(i)Z_{n}^{(i)} of these are independent of each other and decay linearly with timestep:

Thus, in this new method, we use a colored noise which has characteristics that directly depend on the stepsize, although the noise decorrelates in just a couple of timesteps. This is therefore no longer a Markov process, however it can be reformulated as such if one considers the appropriate extended space (eg. yn=[xn,Rn,Rn+1]y_{n}=\left[x_{n},R_{n},R_{n+1}\right]).

Numerical Experiments

We implemented the methods ABOBA, BAOAB, SPV and BBK and compared the accuracy of configurational sampling for different values of γ\gamma and a range of timesteps. A brief analysis shows that the use of the harmonic oscillator leads to special cancellations in the BCH series of the splitting schemes, making it a poor test subject. Hence, in order to compare the order of accuracy of the different schemes, we first considered an oscillator model in 1D with potential U(x)=x4/4+sin(1+5x)U(x)=x^{4}/4+\sin(1+5x). This was accomplished by introducing MM intervals (‘bins’) of equal length, and computing the mean error in the observed probability frequency compared to the exact expected frequency (obtained by integration of the probability density). If the observed frequency in bin ii is ωi{\omega}_{i}, and the exact expected frequency is ω^i\hat{\omega}_{i}, then the error calculated is

In this one-dimensional example, we used 2020 bins to cover the interval from 3.5-3.5 to 3.53.5. The configurational density error is plotted against stepsize in log-log scale. If Errorδtr{\rm Error}\propto\delta t^{r} then we expect this graph to be a line of slope rr. Due to the relative simplicity of this model, we were able to perform highly resolved simulations to calculate accurate error estimates for the configurational distribution. The exact expected value that we compare experimental results against can be computed to arbitrary prescision, and we are able to run as many simulations as needed in order to drive the variance of results to a minimum. The variance in our results, in cases where the stepsize was less than 0.30.3, was consistently below 101010^{-10}. Above a stepsize of 0.30.3 some of the methods were found to be unstable. The results of our simulations are summarized in Figure 1.

As we can see, when γ\gamma is small, the methods perform somewhat similarly, at least in the qualitative sense, with all showing a 2nd order error in configurational sampling, and the ABOBA and SPV methods essentially identical. As γ\gamma is increased, the substantial difference between BAOAB and the other methods becomes apparent. In the limit of large γ\gamma the SPV method effectively annihilates the force which results in poor sampling. In the graph for γ=1\gamma=1, we can see that for larger values of the stepsize, the graph steepens (indicating that the fourth order term is dominant); as the stepsize is decreased the method exhibits a 2nd order asymptotic decay. With γ=50\gamma=50 the fourth order behavior is seen for for all indicated data points, although, again, this becomes second order for smaller values of δt\delta t. Note that the limit method Eq. 2 (with the substitution h=δt2/2h=\delta t^{2}/2) gives an essentially identical behavior to the γ=50\gamma=50 case. We also give a comparison of the actual computed configurational distributions, at different stepsizes using each method, for γ=20\gamma=20 in Figure 2.

To examine the performance of the limit method in more detail, we next considered small molecular clusters consisting of seven atoms (motion restricted to the plane), with both Morse (φM\varphi_{M}) and Lennard-Jones (φLJ\varphi_{LJ}) potentials, given by

where we use a=2,ε=1a=2,\varepsilon=1 and rm=1r_{m}=1. The overall potentials are hence

where rijr_{ij} is the distance between particles ii and jj, rkr_{k} is the distance between particle kk and the origin, and a mild harmonic term is included in the case of the Lennard-Jones system in order to prevent particles being ejected from the cluster.

The Morse potential gives a smoother dynamics compared to Lennard-Jones (the Morse forces were on average three times smaller than those for Lennard-Jones) and allows a more satisfying determination of the error scaling behavior with stepsize. To quantify the error in configurational sampling, we calculated the radial density G(r)G(r) by binning the instantaneous interatomic distances at each step into 20 compartments and compared to a calculated reference value (Figure 3). Though not exact, the errors in the reference will be negligible compared with configurational distribution errors at higher stepsizes. For the Morse cluster the reference stepsize we used was href=0.001h_{\rm{ref}}=0.001, whereas for the Lennard-Jones cluster, we used href=0.00025,h_{\rm{ref}}=0.00025, both with the same integration time that was used in their respective test runs. In both cases the cluster was initialized with 6 particles placed on a regular hexagon with unit side-length and with the remaining particle in the centre, and initial velocities randomly drawn from the canonical distribution.

Considerable computation is required to achieve the level of accuracy required, due to the dominance of sampling error and the complexity of the system compared to the earlier 1D example. We ran a large number of independent simulations at each stepsize and computed the average radial density plot for both examples, and compared this to the reference result.

Our results, presented in Figure 4, are entirely consistent with our analysis and show the second order dependence of the configurational sampling error on hh (equivalent to fourth order in δt\delta t), as compared to the Euler-Maruyama’s first order behavior. These results demonstrate a good agreement with our theoretical results, however even using extensive computation the variances in each experiment were still quite high. If ωh,n,m\omega_{h,n,m} is the density of bin mm in simulation nn at stepsize hh, we calculate the variance as

where we used M=20M=20 bins for each experiment, N=200N=200 for the Morse experiment and N=1000N=1000 for the Lennard-Jones experiment. The variances for the Morse experiment were around 101010^{-10} while the Lennard Jones experiment had variances around 10810^{-8}. We expect that completing more simulations (increasing NN) would reduce the variance and give smoother results in Figure 4. Nonetheless, in both cases, we observe a significant reduction in the error compared to Euler-Maruyama.

It might be a suggested that the improved accuracy seen in the high γ\gamma regime could potentially come at the price of a slower convergence to equilibrium due to a reduced rate of transition between metastable states, hence overall sampling of the configurational distribution might be impaired in favour of local sampling. To address this point, we have plotted in Figure 5 the error computed in our Lennard-Jones simulation as a function of the number of force evaluations (vertical) against the friction value γ\gamma used for the simulation (horizontal). Gridpoints in the plot are coloured according to the configurational error, computed using Eq. 13. The results indicate that the convergence rate for the BAOAB method is not diminished for large friction coefficient, so it does not appear that we are sacrificing sampling accuracy using this scheme. Note that the performance of ABOBA is also robust in the limit of large γ\gamma, but the achievable accuracy is reduced as it is only second order, consistent with what we have presented in this article.

Conclusions

Our results confirm the theoretical results of Sections 3 and 4, in particular showing the higher order configurational sampling of the BAOAB method. The order in its Langevin formulation is effectively four in the large γ\gamma limit and large values of γ\gamma do not impair its stability due to the use of exact Ornstein-Uhlenbeck solves. Not only is the order of the method high, the constant multiplying the leading term must be, in the cases looked at here, of modest magnitude, since the errors are relatively small also at the large stepsize stability threshold. Since, in the context of molecular dynamics, BAOAB is a ‘cheap’ and easy to implement scheme using only a single force vector per timestep, we stress that there is no price to pay for its improved accuracy.

In molecular modelling, there are other errors that play important roles, most importantly errors in the force fields (or, more fundamentally, the errors due to not modelling quantum mechanics properly) and sampling errors. Obviously these errors may dominate the overall method error and limit the relative benefit to be gained by using one integrator as compared to another, but it is also clear that both of the other types of errors are constantly being reduced through the design of better models and the use of more powerful computers. More important, one can ask the question: how can a practitioner know which part of the error in a given complicated simulation is due to sampling error and which part due to the truncation errors addressed here? In our experiments with molecular models, even where there was substantial sampling error still present, we nonetheless found the accuracy to be noticeably higher for the BAOAB method; it is likely that this improvement in sampling accuracy would be of direct benefit in many real world simulations. Finally we point out that the BAOAB scheme and its limit method (Eq. 2) was, in each case studied, stable at a larger stepsize than the alternatives, meaning that longer time intervals are made accessible. This was particularly dramatic in the case of the Lennard-Jones system.

Acknowledgements. The first author acknowledges the support of a JTO Fellowship from the Institute for Computational Engineering and Sciences at the University of Texas. The second author was supported by the Centre for Numerical Algorithms and Intelligent Software (funded by EPSRC grant EP/G036136/1 and the Scottish Funding Council). A. Stuart, H. Owhadi, and especially G. Stoltz made helpful suggestions regarding the presentation of the results. We further acknowledge the comments received from A. Abdulle and M. Tretyakov which have improved the article and its relation to other works. Computations were performed on the University of Edinburgh’s ECDF compute facility.

References

Appendix: Langevin Dynamics Integrators

The Langevin dynamics methods used for the numerical experiments in this paper are given here. RiR_{i} is an NN-vector of i.i.d. Normal random numbers, with mean and variance 11. The diagonal mass matrix is denoted MM, and we assume a timestep δt\delta t is provided. Given the parameter γ\gamma, we define useful constants

The Method of Brunger-Brooks-Karplus (1982) (BBK)