Rectified Flow: A Marginal Preserving Approach to Optimal Transport
Qiang Liu
Introduction
The Monge–Kantorovich (MK) optimal transport (OT) problem concerns finding an optimal coupling between two distributions :
We present a different approach tocontinuous OTthat re-frames (1) intoa sequence of simple unconstrained nonlinear least squares optimization problems,whichmonotonically reduce the transport cost of a couplingwhile automatically preserving the marginal constraints.Different from theminimax and regularization approaches that enforce the constraints from outside,our method is an interior approachwhich starts from a valid coupling (typically the naive independent coupling), and traverses inside the constraint set to decrease the transport cost.Such an interior approach is non-trivial and has not been realized before, because there exists no obvious unconstrained parameterization of the set of couplings of and . Our method is made possible by leveragingrectified flow ,a recent approach to constructing (non-optimal) transport maps for generative modeling and domain transfer.What makes rectified flow specialis that it provides a simple procedurethat turns a given coupling into a new one thatobeys the same marginal laws, while yielding no worse transport cost w.r.t. all convex functions simultaneously.Despite this attractive property,as pointed out in ,rectified flow can not be used to optimize any fixed cost ,as it is essentially a special multi-objective optimization procedure that targets no specific cost.Our method is a variant of rectified flow that targets a user-specified cost function and hence yields a new approach to the OT problem (1).
Rectified flow
We provide a high-level overview ofthe rectified flow of and the main results of this work.For a given coupling of and ,the rectified flowinduced by is the time-differentiable process over an artificial notion of time ,that solves the following ordinary differential equation (ODE):
and is the linear interpolation between and .Eq (3) isa least squares regression problem of predicting the line direction of from every space-time point on the linear interpolation path, yielding a solution of
c𝑐c-Rectified flow
In this work,we modify the procedure so that it can be used tosolve (1) given a user-specified cost function .We show that this can be done easily by properly restricting the optimization domain of and modifying the loss functionin (3).The case of quadratic loss is particularly simple,for which we simplyneed to restrict the to be a gradient field in the optimization of (3).For more general convex ,we need to restrict to have a form of , with minimizing the following loss function:
Notation
Outline
The rest of the work is organized as follows. Section 2introduces the background of optimal transport. Section 3reviews rectified flow of from an optimization-based view.Section 4 characterizesthe if and only if conditionfor two differentiable stochastic processes to have equal marginal laws.Section 5 introduces the main -rectified flow method and establishes its theoretical properties.
Background of Optimal Transport
This section introduces the background of optimal transport (OT),including both the static and dynamic formulations.Of special importance is the dynamic formulation,which is closely related to the rectified flow approach.The readers can find systematic introductions to OT ina collection of excellent textbooks.
The optimal transport problem wasfirst formulated byGaspard Monge in 1781 when he studiedthe problem of how toredistribute mass, e.g., a pile of soil, with minimal effort.Monge’s problem can be formulated as
As the left side of (7)only depends on and the right side only on , one can show that is -optimal and solves (6) iffholds with probability one, which provides a basic optimality criterion. Many existingOT algorithms are developed by exploiting the primal dual relation of (1) and(6)(see e.g., ), but have the drawback of yielding minimax problems that are challenging to solve in practice. If is strictly convex,the optimal transport map of (5)is unique (almost surely) and yields a form of
where is the convex conjugate function of , and is an optimal solution of (6), which is -convex in that with the associated solution.In the canonical case of quadratic cost , we can write , where is a convex function.
Dynamic formulations
Both the MK and Monge problems can be equivalently framed in dynamic waysas finding continuous-time processes that transfer to .Let be a smooth path connecting and , whose time derivative is denoted as .For convex , by Jensen’s inequality, we can represent the cost in an integral form:
where the infimum is attained when is the linear interpolation (geodesic) path: .Hence, the MK optimal transport problem (1) is equivalent to
Hence, we can rewrite (9) into an optimization problem on , yielding the celebrated Benamou-Brenier formula :
Rectified Flow: An Optimization-Based View
We introduce rectified flow of from an optimization-based perspective:we show that rectified flow can be viewed as the solutionof a special constrained dynamic optimization problem,which allows us to gain more understanding of rectified flow and motivates the development of -rectified flow.Following ,for a time-differentiable stochastic process , its expected velocity field is defined as
where denotes the time derivative of .Obviously, is the solution of
Precisely, Equation (13) should be interpreted by its weak and integral form:
Moreover, the rectified flow of a coupling is defined as the rectified flow of when is the linear interpolation of .
A stochastic process is called rectifiable if exists and is locally bounded,and Equation (15)has an unique solution.A coupling is called rectifiable if its linear interpolation process , following , is rectifiable.In this case, we call the rectified flow of , and write it (with an abuse of notation) as .The corresponding is called the rectified coupling of , denoted as .
Assume that is rectifiable. We have
Hence,rectified flowturns a rectifiable stochastic process into a flow while preserving the marginal laws.
We show thatthe rectified flow of achieves the minimum of the path-wise -transport cost in the set of time-differentiable stochastic processes whose expected velocity field equals .This explains that the property of non-increasing convex transport costs of rectified flow/coupling.
The rectified flow in (15) attains the minimum of
which yields a proof of Theorem 3.2 of that the rectified coupling yields no larger convex transport costs than .
A primal-dual relation
Let us generalize the least squares loss in (12) to aa Bregman divergence based loss:
For any differentiable convex function ,and rectifiable process , we have
andthe optima above are achieved when and .
Straight couplings
Take .Hence, for with ,the -optimal mapping is the trivial identity coupling with .However,consider the coupling ,where is anon-identity and non-reflecting rotation matrix (namely , , and does not have as an eigenvalue).Then is a straight coupling of and ,but it is not -optimal for allsecond order differentiable convex function whose Hessian matrix is invertible everywhere. See Appendix for the proof.It is the rotation transformthat makes sub-optimal,which is removed in the proposed -rectified flow in Section 5 via a Helmholtz like decomposition.
Differentiable Processes with Equivalent Marginal Laws
The marginal preserving property of rectified flow is due to the property of by construction.However, we show in this section that is only a sufficient condition:two differentiable processes and can have the same marginal lawseven if .This is because , as illustrated in Example 3.5,can be a rotation-only vector field (in a generalized sense shown below)that introduces rotation components into the dynamics without modifying the marginal distributions.Therefore, the constraint of inthe optimization problem (16) may be too restrictive.A natural relaxation of (16) would be
which yields a dynamic OT problem with a continuum ofmarginal constraints.In Section 5, we show thatthe solution of (17) yields our -rectified flow thatsolve the OT problemat the fixed point.Solving (17) allows us to remove the rotational components of , which is whatwhat renders rectified flow non-optimal.In this section,we first characterize the necessary and sufficient conditionfor having equivalent marginal laws.
which gives . This says that is a rotation-only (or divergence-free) vector field in the classical sense.
which is the definition of-marginal-preserving following (18).∎
c𝑐c-Rectified Flow
We introduce -rectified flow,a -dependent variant of rectified flow that guarantees tominimize the -transport costwhen applied recursively. This section is organized as follows:Section 5.1defines and discusses the -rectified flow of a differentiable stochastic process , which we show yields the solution of theinfinite-marginal OT problem (17).Section 5.2considers the -rectified flow of a coupling , which we show is non-increasing on the -transport cost.Section 5.3proves that the fixed points of are -optimal.Section 5.4interprets -rectified flowas an alternating direction descent methodfor the dynamic OT problem (8),and a majorize-minimization (MM) algorithm for the static OT problem (1).Section 5.5 discussesa key lemma relating -optimal couplings and its associated displacement interpolation with Hamilton-Jacobi equation.
Note that we have for following the definition of the conjugate (or the Fenchel-Young inequality).Losses of form is equivalent to the so called matching lossproposed forlearning generalized linear models . Compared with the original rectified flow,the difference of -rectified flow is i) restricting the velocity field to a form of , and ii) replacing the quadratic objective function to the matching loss.These two changes combined yield a Helmholtz like decomposition of as we show below, allowing us to remove the “rotation-only” component of and obtain -optimal couplings at fixed points.
We can equivalently write (23) using Bergman divergence associated with , that is,
Then it is easy to see that , by using the fact that and .Hence, and are equivalent up to the monotonic transform on .The minimum is achieved when ,while is achieved when .Therefore, (23) is equivalent to
Moreover,the generalized Pythagorean theorem of Bregman divergence (e.g., ) gives
which can be viewed as projecting the expected velocity to the set of functions of form ,w.r.t. the Bregman divergence.This yields an orthogonal decomposition of :
where is the residual term.The key result below shows that is -marginal-preserving, which ensures thatthe -rectified flow preserves the marginals of .
We say that is -rectifiable if exists,the minimum of (23) exists and is attained by a locally bounded function ,and the solution of Equation (22) exists and is unique.
Taking if and if yields that is -marginal-preserving following (18).ii) Note that is rectifiable if is -rectifiable. Applying Lemma 4.2yields the result. ∎
For the quadratic cost , the is the identity mapping, and (27) reduces to the Helmholtz decomposition,which represents a velocity field into the sum of a gradient field and a divergence-free field. Hence,(27) yields a generalization of Helmholtz decomposition, in which a monotonic transform is applied on the gradient field component.We call (27) aBregman Helmholtz decomposition.
Remark: score matching
In some special cases, may already be a gradient field,and hence the rectified flow and -rectified flow coincide for . One example of this is when for some time-differentiable functions and , and , satisfying , and . In this case, one can show that
c𝑐c-Rectified flow solves Problem (17)
We are ready to show that the -rectified flow solves the optimization problem in (17).Further, (23) forms a dual problem of(17).
Under the conditions in Theorem 5.2, we havei) attains the minimum of (17).ii) Problem(17) and (23) has a strong duality:
As the optima above are achieved by and , we have
Moreover, if we take and ,then the inequality in is tight because holds -almost surely.Therefore, , which suggests that attains the maximum of (under the marginal constraints) and the strong duality holds.∎
Similar to the case of rectified flow,the -rectified flow/coupling of a coupling is defined as the -rectified flow/coupling of its linear interpolation process.In the following, we show that the -rectified coupling of a coupling yields no larger-transport cost.
Let be the linear interpolation ofcoupling in that . We say that is -rectifiable if is -rectifiable, and call the -rectified flow of .We call the induced the -rectified coupling of , denoted as .
which establishes that yields no larger transport cost than .
Compared with the regular mapping,the key difference here is thatthe monotonicity of only holds for the specific that it employees, rather than all convex cost functions. More importantly,as we show below,recursively applying yields optimal couplings w.r.t. ,a key property that the regular rectified flow misses.
3 Fixed Points ofc𝑐c-𝚁𝚎𝚌𝚝𝚒𝚏𝚢𝚁𝚎𝚌𝚝𝚒𝚏𝚢\mathtt{Rectify} arec𝑐c-Optimal
Knowing that is an indication of -optimality, we show below that it is guaranteed to converge to zero with recursive updates.
Let be the -th -rectified flow of , satisfying and .Assume each is -rectifiable for . Then
Applying (28) to and yields
4 c𝑐c-Rectified Flow as Optimization Algorithms
In this section, we draw more understanding on how iterative -rectified flowing solves the static and dynamic OT problems. We first show that -rectified flow can be viewed as an alternative direction descent on the dynamic OT problem(8), and then that -rectified coupling as a majorize-minimization (MM) algorithm on the statistic OT problem (1).The results in this section are framed in terms of a general path-wise loss function ,and hence provide a useful starting point for deriving -rectified flow like approaches tomore general optimization problems with coupling constraints.
The mapping can beinterpreted as an alternative direction descent procedure for the dynamic OT problem (8):
c𝑐c-Rectified flow as an MM algorithm
In this case,the MM update guarantees that is monotonically non-increasing:
and the minimum is attained by ,where denotes the set of couplings of and .ii) yields the MM update related in that
i) For any coupling and , we have
where the inequality holds because remove the constraint . In addition, it is obvious that the inequality above becomes equality when .ii) Note that
whose minimum of the right side is attained by following Theorem 5.3. Hence, the minimum of the left side is attained by .∎
5 Hamilton-Jacobi Equation and Optimal Transport
On the other hand, define . Then we have
Hence, is a -optimal coupling.∎
where we assume that decays to zero sufficiently fast at infinity.We have
At the saddle points,the functional derivations of equal zero, yielding
Assume is positive everywhere and note that , we have ,and hence .Plugging it back to yields that .Overall, the (formal) KKT condition of (10) is
Discussion and Open Questions
For machine learning (ML) tasks such as generative models and domain transfer, the transport cost is not necessarily the direct object of interest.In these cases,as suggested in, rectified flow might be preferred because it is simpler and does not require to specify a particular cost .Question: for such ML tasks, when would it be preferred to use OT with a specific , and how to choose optimally?
In practice, recursively applying the (-)rectification accumulates errorsbecause the training optimization for the drift field and the simulation of the ODEcan not be conducted perfectly.Howto avoid the error accumulationat each step?Assume , and is obtained by solving the ODE of the -th -rectified flow starting from .As we increase , may yield increasingly bad approximation of due to the error accumulation. One way to fix this is to adjust to make it closer to at each step. This can be done by reweighting/transporting towards by minimizing certain discrepancy measure,or replacing each with where is a permutation that yields a one-to-one matching between and .The key and challenging part is to do the adjustment in a good and fast way, ideally with a (near) linear time complexity.
With or without the adjustment step,build a complete theoretical analysis on the statistical error of the method.
In what precise sense is rectified flow solving a multi-objective variant of optimal transport?
References
Appendix A Proofs
i)The fact that and ensures that and hence is a coupling of and .Let be the linear interpolation of the coupling. Related, we have . Canceling yields that
where is the convex conjugate of .Equation (33) is equivalent to , which means that is continuously differentiable.Taking gradient on both sides of (33) gives
where are both symmetric and is positive definite,and henceThen is a diagonalizable (all its eigenvalues are real) by Lemma A.1.However, is not diagonalizable because must have complex eigenvalues as a non-reflecting, non-identity rotation matrix. Hence, (34) can not hold.
Assume that are two real symmetric matrices and is positive definite. Then is diagonalizable (on the real domain), that is, there exists an invertible matrix , such that is a diagonal matrix.
This is a standard result in linear algebra.Because is positive definite, there exists an invertible symmetric matrix , such that Then, , and it is similar to , which is symmetric and hence diagonalizable.∎