Computing the optimal protocol for finite-time processes in stochastic thermodynamics

Holger Then, Andreas Engel

I Introduction

There exist plenty of reasons for processes to be optimized. Both, economically and ecologically, it is of high interest to minimize the energy consumption. Mathematically, the issue addresses the question of designing an optimal protocol $\lambda(t)$ according to which some dynamical system is driven from a given initial state to some other desired final state. Macroscopically, from the second law of thermodynamics, there is a lower bound on the required work. If started from thermal equilibrium, the applied work needed to reach the final state in an isothermal process is always larger than or equal to the free energy difference. The amount of the applied work that exceeds the free energy difference is called the dissipated work and is lost by heating the environment. Since tasks need to be finished within a given finite amount of time, the dissipated work is always positive and depends on the details of the protocol resulting in a technical challenge of how to prevent powerful engines and fast microprocessors from their heat death.

Along with miniaturization new aspects arise. Microscopic systems are subject to both, deterministic and stochastic forces, and it becomes necessary to consider ensemble averages. To ensure that a quantum computer is in a well defined state, the fluctuations have to be minimized. Finding the optimal protocol that yields the least fluctuations (beyond cooling down the system) is also of interest in soft and biomatter systems (where cooling is not even be possible). In these situations the second law only yields constraints for the average behavior, whereas individual realizations may extract work from the heat bath thereby consuming instead of producing entropy ECM . The characterization of work and heat distributions in fluctuating non-equilibrium situations has benefited from recent progress in statistical mechanics centered around the so-called work and fluctuation theorems, see BoKu ; Jarzynski1997 ; Jar2 and GaCo ; Kurchan ; JarFT ; Crooks ; Seifert respectively.

Consequently one has to search for the protocol that minimizes the average work. This statement can be sharpened in the linear response regime, where the work fluctuations are proportional to the dissipated work Hermans1991 ; Jarzynski1997 , $\langle\Delta W^{2}\rangle=\frac{2}{\beta}\langle W_{\text{diss}}\rangle$ , as proven in SpeckSeifert2004 . If one is interested in extracting a sharp value for the mean work that is required, e.g. for folding a protein, it is desirable to drive the folding according to a protocol that minimizes the fluctuations of the work, $\langle\Delta W^{2}\rangle=\langle W^{2}\rangle-\langle W\rangle^{2}=\frac{2}{\beta}\langle W_{\text{diss}}\rangle$ , resulting again in the protocol that minimizes the average work. (Far from equilibrium the connection between work fluctuations and dissipation is more complicated and no general and precise relation between the two is known.)

In the present paper we investigate the uniqueness and general properties of optimal protocols $\lambda(t)$ which minimize the average work necessary to accomplish a given isothermal transition between two equilibrium states. We first rederive the results obtained in SchmiedlSeifert2007 for linear systems. We then consider a non-linear system and study the behavior well beyond the validity of linear response.

We are seeking for the optimal protocol $\lambda(t)$ that minimizes the average work in a finite-time process of a small system that is subject to both, deterministic and stochastic forces. The former can be controlled experimentally by the external parameter $\lambda(t)$ , whereas the latter originate from thermal fluctuations of the environment. Taking the environment to be a heat bath in thermal equilibrium at temperature $T$ , the stochastic forces are modeled via a Gaussian white noise, $\sqrt{\frac{2}{\beta}}\zeta(t)$ , that is characterized by its vanishing ensemble average, $\langle\zeta(t)\rangle=0$ , and by the absence of any time-correlations, $\langle\zeta(t)\zeta(t^{\prime})\rangle=\delta(t-t^{\prime})$ . Concerning the deterministic forces, we assume that Stokes friction is present, $F=-\gamma\dot{x}$ , where the force of friction is proportional to the velocity $\dot{x}$ of the system along its trajectory $x(t)$ . The proportionality constant $\gamma$ is the friction coefficient. All the other deterministic forces are assumed to be conservative.

For systems on the nanometer scale-size, e.g. biological systems on the cellular or subcellular level and single molecule experiments, the stochastic forces exceed the inertial forces by far. Neglecting the acceleration term $m\ddot{x}$ in the overdamped motion, where $m$ is the mass that is accelerated, the microscopic dynamics is described by the Langevin equation

where $V=V(x,\lambda)$ is the potential of the conservative deterministic forces whose time-dependence is attributed to the control parameter $\lambda(t)$ . Rescaling the time, we can set the friction constant to unity, $\gamma=1$ .

The time evolution of the probability distribution $p(x,t)$ to observe the system at position $x$ at time $t$ is governed by the Fokker-Planck equation

Starting in thermal equilibrium, the initial canonical distribution is

where the normalization constant $Z_{0}$ is the partition function. According to the stochastic forces, any trajectory $[x(t)],\ 0\leq t\leq t_{f},$ is possible and occurs with probability

where the integrand in the exponent of the probability functional reads

and ${\cal N}$ is a normalization constant. Each realization of the process requires its specific amount of work

where we take possible jumps at the beginning and at the end of the protocol explicitly into account, e.g.

Averaging $W[\lambda(\cdot),x(\cdot)]$ over the initial distribution and the noisy history the average work

becomes a functional of the protocol $[\lambda(t)],\ 0\leq t\leq t_{f},$ according to which the parameter $\lambda(t)$ is varied from its initial value $\lambda_{0}$ at $t=0$ to its final value $\lambda_{f}$ at $t=t_{f}$ .

The optimal protocol is found by solving the non-local Euler-Lagrange equation

where the variation reads SchmiedlSeifert2007

I.2 Known results

Schmiedl and Seifert studied the motion of a colloidal particle in an optical tweezer SchmiedlSeifert2007 . For two cases, namely for varying the position and the strength of the trap, respectively, they could express the mean work as a local functional of one variable. This allowed them to find the optimal protocol that minimizes the mean work. As a surprising result Schmiedl and Seifert found the optimal protocol to jump at the beginning and at the end of the process, i.e. at $t=0$ and $t=t_{f}$ , whereas in between the optimal protocol varies smoothly. The initial jump can be interpreted as an immediate jump from equilibrium to a stationary state in order not to loose valuable time and the final jump allows a slower driving of the system at earlier times. It is worth noticing that the optimal protocol is unique in the two cases studied by Schmiedl and Seifert. Both, the uniqueness of the protocol and the possibility to express the mean work as a local functional of one variable, result from the linearity of the systems considered.

Below, we show how these results can be rederived by an explicit solution of the Euler-Lagrange equation (10), (11). For a generic non-linear problem such an analytic solution seems impossible. We therefore analyze a simple non-linear system numerically and discuss the new features of the optimal protocol arising in this case.

II The analytic solution

In the following we demonstrate that an analytic solution of the Euler-Lagrange equation (10), (11) is possible if, as a necessary condition, the underlying Langevin equation (2) can be explicitly integrated. The main trick consists of combining the Euler-Lagrange equation with its derivatives in such a way that the integrals cancel out. Below, this is carried through explicitly for the two cases studied by Schmiedl and Seifert, i.e. for the stochastic motion of a colloidal particle in an optical tweezer.

Dragging a colloidal particle through a viscous fluid by an optical tweezer with harmonic potential

where the focus of the optical tweezer is moved according to a protocol $\lambda(t)$ from $\lambda(0^{-})=\lambda_{0}$ to $\lambda(t_{f}^{+})=\lambda_{f}$ in a finite time $t_{f}$ , the expressions appearing in the variation of the average work (11) can be computed,

Differentiating the Euler-Lagrange equation twice it is almost identically reproduced. From the difference between the Euler-Lagrange equation and its second derivative follows a simple differential equation for the optimal protocol,

Inserting the solution of (17), $\lambda(t)=at+b$ , back into (16) and setting $\lambda_{0}=0$ yields the integration constants to be equal to

in agreement with the result of Schmiedl and Seifert SchmiedlSeifert2007 .

II.2 Case study II

with $\lambda(0^{-})=\lambda_{0}>0$ and $\lambda(t_{f}^{+})=\lambda_{f}>0$ as boundary conditions, the expressions appearing in the variation of the average work (11) can be computed again,

for $t^{\prime}>t>0$ , and the Euler-Lagrange equation reads

Note that $C=AB-1,\ \dot{C}=\dot{A}B+A\dot{B},\ \ddot{C}=\ddot{A}B+2\dot{A}\dot{B}+A\ddot{B},\ \ldots\$ are complicated, because with $A$ and $B$ they contain integrals and exponentials of integrals.

From $\ddot{A}\dot{C}-\dot{A}\ddot{C}=A(\ddot{A}\dot{B}-\dot{A}\ddot{B})-2\dot{A}^{2}\dot{B}$ and $\ddot{B}\dot{C}-\dot{B}\ddot{C}=B(\ddot{B}\dot{A}-\dot{B}\ddot{A})-2\dot{B}^{2}\dot{A}$ follows an ordinary differential equation for $C$ ,

Remember that we are interested in extremizing the average work

Consequently, $C$ and its derivatives have to vanish identically. This yields

The latter is an ordinary differential equation for the optimal protocol. Using Lie symmetries Hydon2000 it can be decomposed and integrated resulting in $\lambda=\lambda_{1}$ and $\lambda=\lambda_{2,3}$ with

If plugged into the Euler-Lagrange equation (28), $\lambda_{1}(t)$ fails to describe the solution for positive $\lambda_{0}$ . $\lambda_{2,3}(t)$ solve the Euler-Lagrange equation provided the integration constants are $a=\lambda_{0}$ and

in agreement with the result of Schmiedl and Seifert SchmiedlSeifert2007 .

III Numerical methods

We have shown how the Euler-Lagrange equation can be solved analytically. Thereby, it was essential that the Langevin equation can be integrated. Often, this equation cannot be integrated analytically. In the latter case, it is not even possible to express the Euler-Lagrange equation as an integro-differential equation in $\lambda$ , because the correlation functions cannot be evaluated explicitly, and one has to resort to numerical methods.

Numerically, we have implemented a Monte Carlo algorithm that minimizes the average work directly. For a given protocol the average work (9) is approximated using a finite ensemble of trajectories that are discretized in time, $\{x_{n}(t_{m})\}$ , $n=1\ldots N$ and $m=0\ldots M$ with $t_{0}=0$ and $t_{M}=t_{f}$ . The initial distribution $\{x_{n}(0)\}$ according to (4) and the noisy history $\{\zeta_{n}(t_{m})\}$ for each trajectory of the ensemble are diced using the Ziggurat method MarsagliaTsang2000 . For each trajectory the Langevin equation (2) is integrated according to the Heun-scheme Blum1972 ; GarciaPalaciosLazaro1998 . Finally, the optimal protocol is found with the threshold acceptance algorithm DueckScheuer1990 .

In the threshold algorithm we approximate the protocol by a polygon line that connects the $2Q+2$ points $\{(0,\lambda_{0^{-}}),(0,\lambda_{0^{+}}),(\frac{1}{Q}t_{f},\lambda_{1^{-}}),(\frac{1}{Q}t_{f},\lambda_{1^{+}}),\ldots,$ $(\frac{Q-1}{Q}t_{f},\lambda_{(Q-1)^{-}}),(\frac{Q-1}{Q}t_{f},\lambda_{(Q-1)^{+}}),(t_{f},\lambda_{f^{-}}),(t_{f},\lambda_{f^{+}})\}$ , where the boundary values are $\lambda_{0^{-}}=\lambda_{0}$ and $\lambda_{f^{+}}=\lambda_{f}$ , and $Q$ is a positive integer. The threshold algorithm is an iterative algorithm that starts with a random choice of initial values for $\{\lambda_{0^{+}},\ldots,\lambda_{f^{-}}\}$ and computes the corresponding average work. In each iteration the values $\{\lambda_{0^{+}},\ldots,\lambda_{f^{-}}\}$ are randomly perturbed and the corresponding average work is compared with the average work of the best protocol that was found in the previous iterations. If the protocol results in an average work that is not deteriorated by more than some given threshold value, it is used as the best protocol in the next iteration. Whenever no better protocol is found in some finite number of iterations, the algorithm lowers the threshold and continues iterating. If finally the algorithm does not find a better protocol in some finite number of iterations at zero threshold, it eliminates intermediate jumps in the protocol provided the protocol keeps optimal.

It is worth to note that we dice the initial distribution and the noisy history only once and reuse these values in any iteration of the algorithm. This is not just to speed up the numerics, it is necessary for the algorithm to converge.

Using up to about $N=7000$ trajectories discretized into $M=154$ time-steps and approximating the protocol using a few points for the polygon, $2Q+2=44+2$ , the algorithm is reasonably fast and can easily be run on a single processor. Starting from a random initial protocol, it typically converges in less than $30000$ iterations. Depending on the parameter values, it finds the optimal protocol within a few seconds or minutes, but for some parameter values it runs for several hours. The result is a first approximation of the optimal protocol.

In order to increase the numerical resolution, we rerun the algorithm using up to $N=50000$ trajectories discretized into $M=700$ time-steps. Starting from the previously found first approximation of the optimal protocol, but now approximating the protocol using $2Q+2=200+2$ polygon points, the algorithm typically converges in less than $70000$ iterations. The required CPU-time of the rerun is by a factor of $\sim 40$ larger, because of the increased number of trajectories and time-steps involved.

With the numerics at hand, we study a further example.

Consider the stochastic motion of a small dipole in a viscous liquid driven by an external field where the direction of the external field is changed according to a protocol $\lambda(t)$ . The protocol may start in any arbitrary initial direction $\lambda_{0}$ at time $t=0$ and ends in the final direction $\lambda_{f}=\lambda_{0}+\Delta\lambda$ at time $t=t_{f}$ . For notational simplicity, we regard one degree of freedom and explore the stochastic motion

where the angle $x$ characterizes the orientation of the dipole. In the numerics, we set the amplitude of the field to unity, $H=1$ , and stop the protocol at $t_{f}=1$ .

Being subject to two time-scales, the relaxation time and the time for driving the system, the motion of the dipole depends qualitatively on the diffusion constant, $D=\frac{1}{\beta}$ , which we take as an additional constant parameter into account.

Simulating the stochastic motion of the dipole, we find optimal protocols as displayed in figures 1 to 6.

Let us first discuss the numerical solutions with the diffusion constant kept small, $D=10^{-5}$ . If $\Delta\lambda=\lambda_{f}-\lambda_{0}$ is also small, the potential can be Taylor expanded around its minimum and is approximately equal to that of the moving laser trap of case study I. The solid line in figure 1 displays the optimal protocol for $\Delta\lambda=1.2$ .

If the angle $\Delta\lambda$ increases, the dynamics starts to experience the non-linearities of the potential. For e.g. $\Delta\lambda=2.0$ the final jump becomes much larger than the initial jump, see the solid line in figure 2.

In rare cases, the numerical algorithm converges to another protocol, see the dotted lines in figures 1 and 2. The values of the average work tell us that these latter protocols are slightly suboptimal, $\langle W\rangle=0.4620$ and $\langle W\rangle=1.1792$ , while the optimal protocols require $\langle W\rangle=0.4619$ and $\langle W\rangle=1.1791$ , respectively. However, we cannot really decide whether the dotted line in figure 2 is a suboptimal protocol or whether it is optimal, because the differences in the average work are close to the resolution of our numerical procedure.

If the angle $\Delta\lambda$ is further increased, to e.g. $\Delta\lambda=3.0$ , the previously suboptimal protocol becomes optimal. Both protocols, indicated by the solid and the dashed lines in figure 3 yield the same value for the average work, $\langle W\rangle=1.9802$ . The two optimal protocols differ by the sizes of their initial and final jumps, whereas their slopes are identical for $0<t<t_{f}$ .

Being the result of a numerical procedure, we can never claim in a strict mathematical sense that two protocols are both optimal. In practice, however, it is impossible to distinguish the case of two optimal protocols from that of two protocols with extremely close work values.

The existence of two optimal protocols results from the symmetry in the potential, $-\cos(x-\lambda)=\cos(\pi-(x-\lambda))$ . The optimal protocol, as indicated by the solid line in figure 3, jumps at $t=0$ to a stable state that pulls the dipole towards a new orientation, see figure 7 (left). This protocol needs most of its average work in the final jump. The other optimal protocol, displayed by the dashed line in figure 3, requires most of the average work in the initial jump. Immediately after this jump, the system is in a state where the dipole is pushed by the potential, see figure 7 (right). Because of the symmetry in the potential between its minimum and its maximum, the slopes of the two optimal protocols are the same for $0<t<t_{f}$ .

Beyond these two optimal protocols, for $D$ small and $\Delta\lambda$ close to $\pi$ , there is a whole family of optimal protocols. Any protocol of this family starts with an initial jump that brings the system to its pulled or pushed state, respectively. At any arbitrary time the protocol can jump again and the system switches between the pulled and the pushed state. Such a protocol is displayed by the dotted line in figure 3. Finally, the optimal protocol can jump repeatedly between the the pulled and the pushed state allowing for any number of jumps.

For $\Delta\lambda$ approaching $\pi$ , the size of the initial and/or the final jump, and the slope of the optimal protocols decrease towards zero, cf. figures 2 and 3.

Increasing $\Delta\lambda$ further such that it exceeds $\pi$ , the initial and/or the final jump, and the slope of the optimal protocols change sign, cf. figures 3 and 4.

Due to the change in signs and because of the symmetry and periodicity of the potential, we know that for $\Delta\lambda=\pi$ the initial and/or the final jump, and the slope of the optimal protocols vanish identically, and we conclude the following:

For $D$ small and $\Delta\lambda$ close to $\pi$ , the optimal protocol jumps at least two times, except if $\Delta\lambda=\pi$ holds identically. In the latter case, the initial and/or the final jump vanish and the whole protocol can degenerate to one single jump that may happen at any arbitrary time, $0\leq t\leq t_{f}$ . Two members of the family of optimal protocols are displayed in figure 5.

It is interesting to consider also the variances of the work for the different optimal protocols. While they give the same value for the average work, the variances of the work differ slightly among the family members. The optimal protocol that pulls the system all the time in the stable state, see figure 7 (left), yields the smallest variance for the work, whereas the optimal protocol that pushes the system all the time in the unstable state, figure 7 (right), yields the largest variance for the work. Note that different values for the variances of work for protocols that yield the same average work imply that we are not in the linear response regime.

Because of the limited numerical resolution, the slight differences in the variances of the work have to be treated with some care. Nevertheless, the fact that we are well beyond the linear response regime can clearly be demonstrated by checking whether a central identity of linear response theory, $\langle\Delta W^{2}\rangle=2D\langle W_{\text{diss}}\rangle$ Hermans1991 ; Jarzynski1997 ; SpeckSeifert2004 , is violated. For example, the optimal protocol for $\Delta\lambda=2.0$ , see figure 2, results in $2D\langle W_{\text{diss}}\rangle/\langle\Delta W^{2}\rangle=1.76\not=1$ , and the optimal protocols for $\Delta\lambda=\pi$ , see figure 5, result in $2D\langle W_{\text{diss}}\rangle/\langle\Delta W^{2}\rangle\approx 10^{5}\not=1$ .

Yet, we have kept the diffusion constant $D$ small. Increasing the diffusion constant increases the noise, see the scrambled lines that mimic the optimal protocols in figure 6. In principle, we can get rid of the fluctuations in taking the average over larger ensembles, but we have decided to use only $50000$ trajectories for the ensembles in order to keep the CPU-time reasonable. Moreover, in computing the optimal protocol for $D=0.7$ we have reduced the resolution to $2Q+2=100+2$ points, otherwise the CPU-time would exceed two weeks.

Beyond the quantitative change in the amplitude of the fluctuations, there is also a qualitative change in the behavior of the system. As discussed above, if the direction of the external field is changed by the angle $\Delta\lambda=\pi$ in time $t_{f}=1$ and the diffusion constant is small, there exist optimal protocols that consist of only one single jump that may happen at any arbitrary time. The solid line in figure 6 shows that such protocols are optimal up to $D=0.5$ .

If the diffusion becomes larger, the previously optimal protocols that were allowed to jump several times, get suboptimal. Instead, a different protocol becomes optimal. The latter is unique. The transition happens somewhere below $D=0.7$ . The dashed line in figure 6 mimics the optimal protocol for $D=0.7$ , having two jumps, one at $t=0$ and one at $t=t_{f}=1$ .

There are hence two regions in the $\Delta\lambda$ - $D$ -plane. These are shown qualitatively in figure 8. One region is located around $\Delta\lambda\approx\pi$ and $D$ small. In this region, the optimal protocol occurs in a family and may jump several times. In the other region, i.e. for $\Delta\lambda$ far away from $\pi$ or for $D$ large, the optimal protocol is unique and jumps twice.

If one comes close to the curve that separates the two regions in the $\Delta\lambda$ - $D$ -plane, it becomes hard to decide numerically which protocols are optimal and which are suboptimal, because the values of their average work approach each other. Being blurred by the noisy history of the trajectories, it becomes quite impossible to determine the exact curve that separates the two regions in the $\Delta\lambda$ - $D$ -plane. For this reason, we show in figure 8 only a crude approximation of this curve.

IV Conclusion

The central subject of this article was to present a guide according to which the non-local Euler-Lagrange equation, a non-linear integro-differential equation of correlation functions, can be solved in order to find the optimal protocol of an external control parameter that minimizes the average work for driving a small system from one given equilibrium to another in finite time.

Studying the stochastic motion of a dipole where the direction of the external field is varied according to some protocol, we have found as a surprise that the optimal protocol may not be unique. For suitable parameter values the protocol can jump several times.

The reason for the non-uniqueness of the optimal protocol results from a symmetry in the potential that allows the dominant trajectories to be near the unstable maximum of the potential, figure 7 (right), as an alternative to the stable solution, where the trajectories are near the minimum of the potential, figure 7 (left).

If the diffusion becomes large, the optimal protocol becomes unique.

It is tempting to speculate whether biological systems on the cellular and subcellular level benefit from non-unique optimal protocols. Just imagine a protocol to be a triggering mechanism for a cellular process to be activated. It can be optimal without the need of exact timing, allowing the biological system to react spontaneously on the environment.

Acknowledgments

We thank Tim Schmiedl and Udo Seifert for useful correspondence.