Error analysis of the transport properties of Metropolized schemes
Max Fathi, A. -A. Homman, G. Stoltz
Description of the model
Standard results (see for instance the references in [13, Section 2.2]) show that (1) admits the Boltzmann-Gibbs measure
as its unique invariant probability measure (note that is finite since the position space is compact and is smooth hence bounded). In fact, (1) is ergodic with respect to this measure, where ergodicity is understood both as (i) the long-time (almost-sure) convergence of averages along trajectories
for any initial condition and all observables ; or as (ii) the convergence of the law of the process (1), happening here at an exponential rate, for instance in total variation: Denoting with some abuse of notation the measure by , there exist such that
where the total variation distance between two measures is defined as
the suprema being taken over all measurable sets of for the first one, and over all bounded, measurable functions for the second one.
For further purposes, we introduce the generator of (1), namely the operator
This operator (defined with domain ) is self-adjoint on the Hilbert space endowed with the scalar product
The operator moreover has a positive spectral gap (see for instance [13, Section 2] and references therein). Indeed, a simple computation shows that
The Poincaré inequality , valid for any function belonging to
which shows that the spectral gap is larger or equal to . In particular, the resolvent is a well-defined operator on , and the following estimate holds:
Here and in the sequel, for a given Banach space , we denote by the Banach space of bounded operators on , endowed with the norm
2. Self-diffusion
where the expectation is over all realizations of the continuous dynamics (1), starting from initial conditions distributed according to the Boltzmann-Gibbs measure (2). The following result shows that the diffusion tensor (8) is well defined, and naturally arises in a diffusive time-rescaling of the dynamics (1).
Consider for the diffusively rescaled process . Then, as , the process starting from a given initial condition weakly converges on finite time intervals to an effective Brownian motion starting from and with covariance matrix given by (8). Moreover, is a real, positive definite matrix, satisfying
in the sense of symmetric matrices, and which can alternatively be expressed as
where the expectation is over all realizations of the continuous dynamics (1), starting from initial conditions distributed according to the Boltzmann-Gibbs measure (2).
The proof of this statement is standard, and follows from arguments presented in [1, Chapter 3] for instance. We nonetheless provide a short proof in Section 5.1 since the proofs of the discrete counterparts of Theorem 1 rely on an appropriate extension of the argument used in the continuous case (see Section 5.5).
A straightforward consequence of Theorem 1 is that the self-diffusion constant , defined as the average mean-square displacement of the individual particles, is well defined and has two equivalent expressions:
The expression (10) is called the Einstein formula. The second expression (11) involves an integrated autocorrelation function. In accordance with the standard physics and chemistry nomenclature, we call (11) the Green-Kubo formula for the self-diffusion in the sequel.
3. Numerical estimation of the self-diffusion
In order to compute approximations of formulas such as (10) or (11), the first task is to numerically integrate realizations of the continuous dynamics (1). The most straightforward way would be to resort to a Euler-Maruyama scheme: given a time-step and denoting by an approximation of , this scheme reads
where is a sequence of independent and identically distributed (i.i.d.) -dimensional standard Gaussian random variables. However, this simple scheme has been shown to fail to be ergodic when the dynamics is considered on unbounded spaces and the potential energy function is not globally Lipschitz . In simulations of Brownian dynamics for ionic solutions, potential energy functions with Coulomb-type singularities are used and it has been observed that the energy blows up along trajectories of (12) (see Section 3.2).
is the probability transition of the Markov chain (12). When the proposition is accepted, we project into the periodic simulation cell . If the proposal is rejected, the previous configuration is counted twice: (It is very important to count rejected configuration as many times as needed to ensure that the Boltzmann-Gibbs measure is invariant). In conclusion,
where are i.i.d. uniform random variables in $\mathbf{1}_{U^{n}\leqslant\alpha}U^{n}\leqslant\alpha$ and 0 otherwise. The average rejection rate is
In order to avoid confusion, we call the scheme (13) “Metropolized Euler-Maruyama” in the sequel, and denote by its evolution operator:
Of course, the fact that some configurations are rejected destroys the trajectorial accuracy of the dynamics, see for precise statements. The resulting strong errors and, as a consequence, errors on finite time correlation functions have been quantified in , with prefactors which unfortunately depend on time. The estimates provided by these works therefore do not provide error estimates on diffusion coefficients, obtained either as infinite time integrals of correlation functions as in the Green-Kubo formula (11) or as the infinite time average mean square displacement as in Einstein’s formula (10).
The next section quantifies the errors in the approximation of (10) and (11) when the Metropolized Euler-Maruyama scheme is used. Although the formulas (10) and (11) are equivalent for continuous dynamics, they lead to different numerical methods. Let us already emphasize that the errors on the diffusion coefficients are in fact determined by the expansion of the evolution operator . This expansion is the same as the one used to establish weak error estimates. From a technical viewpoint, the techniques used in the proofs of our main results are therefore quite different from the techniques of , which are based on strong error estimates obtained with Gronwall’s lemma.
A priori error estimates on the self-diffusion
As discussed in , error bounds on transport properties in fact depend on approximation properties of the evolution operator (similar to the one used to prove weak error estimates), rather than strong error estimates. A key building block in this framework is the following expansion of the evolution operator, obtained by a slight extension of [5, Lemma 4.7] and [4, Lemma 5.5].
There exist an operator and such that, for any and any smooth function ,
with a remainder uniformly bounded for . Moreover,
where the expectation is over all possible realizations of the standard -dimensional Gaussian random variable .
The precise expression of the operator is unimportant. It is however given in the proof of this result, see Section 5.2. Note that the numerical scheme can be proved to be weakly first order accurate by relying on standard techniques , in view of the equality
Another important result which we will repeatedly use in the analysis below is the following uniform-in- geometric ergodicity of the Metropolized Euler-Maruyama scheme, easily obtained by adapting the results of to the case of compact position spaces (for completeness, we nonetheless provide elements of proof in Section 5.3). To state the result, we introduce the following functional space
As a consequence, there exists such that
We first give error estimates on (11) by appropriately adapting the results from . The result is stated for two smooth observables with average 0 with respect to . Define to this end
Error estimates for (11) are obtained by setting , with and . Note indeed that a simple integration by parts shows that has average 0 with respect to , so .
Consider two observables , and define the modified observable
where the operator is defined in (15). Then, there exists such that, for any ,
with uniformly bounded (with respect to ), and where the expectation on the left hand side of the above equation is with respect to initial conditions and over all realizations of the dynamics (1), while the expectation on the right hand side is with respect to initial conditions and over all realizations of the Metropolized Euler-Maruyama scheme (13).
As a corollary, we obtain first order error bounds on the computation of the self-diffusion through (11):
where is uniformly bounded for sufficiently small, and where the numerically computed self-diffusion reads
The expression of the correction term is obtained by replacing the modified observable by its expression:
The appearance of subleading fractional correction term in (20) (here, of order ) is typical of Metropolis algorithms, and is usually not encountered for standard, un-Metropolized discretizations of SDEs (compare with the results of ).
2. Error bounds on the Einstein formula
In this section, we investigate the discretization error made when using the Metropolized Euler-Maruyama scheme to approximate the self-diffusion using (10). In accordance with the definition (7), we introduce a discrete additive functional allowing to keep track of the diffuse behavior of the Markov chain: Starting from ,
where, as in Theorem 2, the expectation on the right hand side is with respect to initial conditions and for all realizations of the Metropolized Euler-Maruyama scheme.
There exists such that, for any ,
The proof of this result can be read in Section 5.5. In fact, a slight extension of our technique of proof would allow to show that the diffusively rescaled process generated by the Metropolized Euler-Maruyama scheme, namely (where denote the unique integer such that ), weakly converges on finite time intervals to a Brownian motion with covariance matrix .
An immediate corollary of Theorem 3 is the following a priori error estimate on the self-diffusion:
where is uniformly bounded for sufficiently small. Some more work would allow to prove that the subleading correction term is of order , as in the Green-Kubo case (see Remark 4).
Numerical illustration
The aim of this section is to illustrate the errors bounds (20) and (24). We perform long computations so that the statistical errors are negligible. Let us mention that numerical simulations illustrating timestep errors for velocity autocorrelation functions were already presented in .
The self-diffusion coefficient for the Einstein approach is approximated by fitting the unnormalized self-diffusion
by a linear function , the slope being the estimation of the self-diffusion for the time step under consideration. This is indeed confirmed by Figure 1 (Left), which presents the evolution of as a function of the physical time . The results produced in Figures 1 and 3 have been obtained with replicas and steps. Let us also note that, in accordance with (17), the rejection rate scales as .
A numerical approximation of (21) requires both a discretization using finitely many replicas, but also a truncation of the integration in time with an upper bound . We consider the following numerical estimation of the self-diffusion coefficient obtained with the Green-Kubo formula:
The correlation functions shown in Figure 2 suggest that the autocorrelation of is exponentially decreasing. It can be considered as negligible for times larger than 0.2. The numerical results reported in Figure 2 and 3 have been obtained with replicas and a time cut-off .
The results presented in Figure 3 indeed confirm that, for small time steps , the error in the self-diffusion is of order for both methods. For larger time steps, nonlinearities appear. Note also that the errors on the coefficients computed with the Green-Kubo formula are smaller in this simple case. In any case, in accordance with the statements of Theorem 1, the self-diffusion is between 0 and 1 when .
2. The more realistic case of solvated ions
We additionally consider a large ion, modeled as a fixed particle at position (the center of the simulation box), whose interaction with the solvent particles is described by an attractive Yukawa potential (screened Coulomb interaction) plus some repulsive potential preventing the small particles from coming too close to the ion. More precisely, for a solvent particle at position , the interaction reads , with
The potentials and are plotted in Figure 4.
Expectations are approximated using trajectories of the system. We integrate trajectories one after the other, using the Metropolized Euler scheme (13), with initial conditions for the th trajectory obtained by taking the last configuration of the th trajectory. The self-diffusion coefficient calculated with the Green-Kubo formula is obtained as in the previous section, using the estimator (26) (upon introducing the correct normalization factor in the autocorrelation and replacing by ). The values obtained by the Einstein formula are computed by dividing the unnormalized diffusion defined in (25) by the final time of the simulation: for a simulation time ,
The results presented in Figure 5 show that the unnormalized mean squared displacement indeed grows linearly in time, as expected. Note also that, in accordance with (17), the rejection rate scales as . The results of Figure 5 and (7) have been obtained with trajectories, with integrations performed up to .
The results presented in Figure 6 suggest that the decay of the force autocorrelation cannot be represented by a single exponential function. The force autocorrelation can be considered to be small in relative value for times of the order of . The numerical results reported in Figure 6 and 7 were obtained by averaging trajectories with an integration time .
Error estimates for the diffusion coefficients are gathered in Figure 7. For small time steps , the error in the self-diffusion is linear in , while nonlinearities appear for larger time steps. The continuous lines are linear fits obtained over the values corresponding to the 10 smallest time steps. As in the simple example discussed in the previous section, estimates obtained with Green-Kubo’s formula seem more reliable than those obtained with Einstein’s formula. Note also that, in accordance with Theorem 1, the self-diffusion is between 0 and 1 in all cases for sufficiently small time-steps.
Possible work tracks to reduce the error on the estimation of the self-diffusion
Both the Green-Kubo and the Einstein approaches lead to discretization errors of order , as proved theoretically and verified numerically. A natural question is how to reduce this error. In the chemistry literature, it was proposed in to renormalize the time in Einstein’s method by replacing the simulation time appearing in the denominator of (23) by some effective time , where is the average acceptance rate of the Metropolis algorithm for a given time step . However, since the average acceptance rate is of order for small time steps (see (17)), such a correction cannot possibly cancel out the -error in the diffusion coefficient.
A more promising work track, which we started working on at the end of our stay at CIRM, is to modify the proposed move (instead of simply considering the Euler-Maruyama scheme (12)), and possibly the invariant measure as well, in order to increase the weak and strong orders of the associated Metropolized scheme. Some steps in this direction have already been pursued in . In fact, the proofs of Theorem 2 and 3 show that, in order to gain accuracy on the computation of transport coefficients, it is sufficient to find a numerical scheme such that
A key element to obtain such equalities is to decrease the rejection rate for small time steps.
Proof of the results
In all the proofs, the constants and the critical time steps may change from line to line. Upon changing into , we may also assume that .
The idea is to rewrite the part of the additive functional involving as an approximate martingale. To this end, we introduce the solutions of the following Poisson equations
In view of the resolvent estimate (6), the functions are well defined elements of since
In addition, by elliptic regularity, the functions are smooth. By Ito’s lemma, we therefore obtain
so that the integrated displacement from the origin can be rewritten as
The ergodic properties of the diffusion process allow to prove that
Since has a positive spectral gap on (see (5)), we can write the following operator equality on :
Therefore, for general functions with vanishing average with respect to ,
Combining this result with (32) leads to the expression (9) of the diffusion matrix . To prove the convergence of the processes, two arguments should be made precise (see for an elementary account):
the convergence of the finite-dimensional laws, which can be obtained very simply here by considering the exponential martingales
which are such that the conditional expectations converge to those of a Brownian motion as :
denoting the filtration of events until the time . Finite-dimensional laws are then obtained by a simple induction, as made precise in for instance.
the tightness of the process, proved using Prohorov’s criterion (see for instance ):
This criterion is satisfied in view of the tightness of the martingale , itself easily obtained using Doob’s inequality (see ).
2. Proof of Lemma 1
We first determine the magnitude of the acceptance probability in the Metropolis algorithm, which reads
Using the following expansions (with integral remainders)
while there exists a constant such that for any . We next use the inequality
obtained by distinguishing the cases and . This shows that
with and where . The estimate on the average acceptance rate (17) is obtained by taking the expectation over all possible realizations of , with the definition
To obtain the action of the operator , we start from the expression of the Metropolis transition operator (see for instance [13, Section 2.1.2])
We now write , so that, using (37) to estimate the rejection rate,
where we have used for the first integral a Taylor expansion at fourth order similar to (34) to obtain (see the computations in [12, Section 4.9])
and a Taylor expansion at first order for the term involving the rejection rate to obtain
The remainder is uniformly bounded in for sufficiently small. The conclusion follows by setting .
Finally, the invariance of by implies
This equality, together with the expansion (15) proves (16).
3. Proof of Lemma 2
The proof below is a simplification of the argument presented in , made possible since we work on a compact state space. The idea of the proof is to compare the Metropolis dynamics to the continuous dynamics, using the standard, un-Metropolized Euler-Maruyama scheme as an intermediate. Alternatively, it would be possible to directly compare the Metropolized and un-Metropolized schemes, by proving as in [12, Section 4.2] that the standard, un-Metropolized Euler-Maruyama scheme is geometrically ergodic since it satisfies a minorization condition.
Since we work on a compact state space, it is enough to show by Harris’ theorem (see the presentation in ) that there exist such that
uniformly in and . We now introduce the transition kernel of the continuous dynamics (1), defined as
and consider it at time . The transition kernel is well defined and regular since the generator is elliptic. By the triangle inequality,
From [4, Lemma 2.7], since we work on a compact space, we know that there exists such that
To control the second term in (39), we introduce the transition kernel of the standard, un-Metropolized Euler-Maruyama scheme (12), denoted by , and write
By [4, Lemma 4.2], the transition kernel of the standard, un-Metropolized dynamics is uniformly close to the transition kernel of the continuous dynamics when the state space is compact: There exists and such that, for any ,
It therefore remains to control the distance between the transition rate of the Metropolized and un-Metropolized dynamics. It is at this stage that the argument of [4, Lemma 4.6] can be simplified. As in the proof of this lemma, we use a coupling argument, and consider two chains and , corresponding respectively to the Metropolized dynamics and the standard un-Metropolized one, starting from the same initial condition . The probability that is bounded from above by the probability that for some , i.e. at least one rejection occurred along the discrete trajectory. Since the probability to reject the move from to is , it holds
The combination of all previous estimates finally gives (38) provided is sufficiently small.
from which the bound (19) immediately follows.
4. Proof of Theorem 2
We follow the strategy of [12, Section 4.8]. The proof starts by noticing that the integrated correlation function can be written using , as made precise in (33). The strategy of the proof is to write an approximation of using the discrete evolution operator .
In view of (40) and (15), and since is stable under , it holds
where the sum is convergent in view of (18). In conclusion,
5. Proof of Theorem 3
We start by highlighting the martingale part of the increments , similarly to the continuous case (compare (31)) :
where the expectation on the right-hand side is over the Gaussian random variable and the uniform variable (the configuration being fixed). It will be useful to decompose the increment as
In view of Lemma 3 (which shows in particular that can be thought of as being of order by setting in (43)), the first term on the right-hand side of (41) is equal to at dominant order in . This term therefore corresponds to the term in the decomposition (31). The second term in the right-hand side of (41) is handled by introducing an appropriate Poisson equation (see (44) below), which is the discrete analogue of (30).
The average increment has the following expansion in powers of :
where is uniformly bounded for sufficiently small. In addition, there exists a constant such that and, for any ,
There exists, for any , a unique function such that
Moreover, recalling the definition (30) of ,
where is uniformly bounded for sufficiently small, and is the unique solution of the Poisson equation
Since is uniformly bounded as , we obtain
We now expand in powers of . By Lemma 4, it is possible to replace the function in the second term on the right-hand side of (46) by , up to a remainder of order . We next use the following lemma to compute the cross correlation between the various functions of appearing in .
For any smooth functions , growing at most polynomially,
The conclusion then follows by applying this result with the functions replaced by , and for a given test direction . Indeed,
with uniformly bounded as . Manipulations similar to the ones leading to (32) finally give the claimed result.
In order to characterize the leading order term in the error and prove that the subleading order term indeed is of order , it would be necessary to compute correlation terms involving components of the remainder term . This is not possible as such because the regularity of is not established, and obtaining regularity result from (44) is difficult. An expansion of up to terms (as ) is therefore needed in order not have to treat correlations involving the remainder . This, in turn, would require an expansion of up to remainders of order , instead of as in (15). Although this does not pose any problem in principle, we chose not to follow this path in order to keep the arguments as simple as possible.
We conclude this section with the proofs of the technical results quoted above.
The bound (43) is a straightforward consequence of the equality
while immediately gives . ∎
We introduce the normalized average increment, defined as the following periodic function :
Lemma 2 shows that is well defined provided the periodic function has a vanishing average with respect to . To prove this statement, we start from
Since and play symmetric roles, vanishes. For , we obtain, by first exchanging the names of the dummy variables and and then using as well as the invariance of by translations of the periodic cell,
Therefore, , which allows to conclude that
To obtain an expansion of in terms of fractional powers of , we rely on (42), which implies that
This gives the result upon defining . ∎
A Taylor expansion with integral remainder gives
A simple computation using (42) and (43) shows that
where is uniformly bounded for sufficiently small. Therefore,
where the remainder is uniformly bounded. ∎