Rectified Flow: A Marginal Preserving Approach to Optimal Transport

Qiang Liu

Introduction

The Monge–Kantorovich (MK) optimal transport (OT) problem concerns finding an optimal coupling between two distributions π0,π1{\pi}_{0},{\pi}_{1}:

We present a different approach tocontinuous OTthat re-frames (1) intoa sequence of simple unconstrained nonlinear least squares optimization problems,whichmonotonically reduce the transport cost of a couplingwhile automatically preserving the marginal constraints.Different from theminimax and regularization approaches that enforce the constraints from outside,our method is an interior approachwhich starts from a valid coupling (typically the naive independent coupling), and traverses inside the constraint set to decrease the transport cost.Such an interior approach is non-trivial and has not been realized before, because there exists no obvious unconstrained parameterization of the set of couplings of π0{\pi}_{0} and π1{\pi}_{1}. Our method is made possible by leveragingrectified flow ,a recent approach to constructing (non-optimal) transport maps for generative modeling and domain transfer.What makes rectified flow specialis that it provides a simple procedurethat turns a given coupling into a new one thatobeys the same marginal laws, while yielding no worse transport cost w.r.t. all convex functions cc simultaneously.Despite this attractive property,as pointed out in ,rectified flow can not be used to optimize any fixed cost cc,as it is essentially a special multi-objective optimization procedure that targets no specific cost.Our method is a variant of rectified flow that targets a user-specified cost function cc and hence yields a new approach to the OT problem (1).

Rectified flow

We provide a high-level overview ofthe rectified flow of and the main results of this work.For a given coupling (X0,X1)(X_{0},X_{1}) of π0{\pi}_{0} and π1{\pi}_{1},the rectified flowinduced by (X0,X1)(X_{0},X_{1})is the time-differentiable process Z={Zt ⁣:t}{\boldsymbol{Z}}=\{Z_{t}\colon t\in\} over an artificial notion of time tt\in,that solves the following ordinary differential equation (ODE):

and XtX_{t} is the linear interpolation between X0X_{0} and X1X_{1}.Eq (3) isa least squares regression problem of predicting the line direction of (X1X0)(X_{1}-X_{0}) from every space-time point (Xt,t)(X_{t},t) on the linear interpolation path, yielding a solution of

c𝑐c-Rectified flow

In this work,we modify the Rectify\mathtt{Rectify} procedure so that it can be used tosolve (1) given a user-specified cost function cc.We show that this can be done easily by properly restricting the optimization domain of vv and modifying the loss functionin (3).The case of quadratic loss c(x)=12x2c(x)=\frac{1}{2}\left\lVert x\right\rVert^{2}is particularly simple,for which we simplyneed to restrict the vvto be a gradient field vt=ftv_{t}=\nabla f_{t} in the optimization of (3).For more general convex cc,we need to restrict vv to have a form of vt(x)=c(ft(x))v_{t}(x)=\nabla c^{*}(\nabla f_{t}(x)), with ff minimizing the following loss function:

Notation

Outline

The rest of the work is organized as follows. Section 2introduces the background of optimal transport. Section 3reviews rectified flow of from an optimization-based view.Section 4 characterizesthe if and only if conditionfor two differentiable stochastic processes to have equal marginal laws.Section 5 introduces the main cc-rectified flow method and establishes its theoretical properties.

Background of Optimal Transport

This section introduces the background of optimal transport (OT),including both the static and dynamic formulations.Of special importance is the dynamic formulation,which is closely related to the rectified flow approach.The readers can find systematic introductions to OT ina collection of excellent textbooks.

The optimal transport problem wasfirst formulated byGaspard Monge in 1781 when he studiedthe problem of how toredistribute mass, e.g., a pile of soil, with minimal effort.Monge’s problem can be formulated as

As the left side of (7)only depends on (X0,X1)(X_{0},X_{1}) and the right side only on (μ,ν)(\mu,\nu), one can show that (X0,X1)(X_{0},X_{1}) is cc-optimal and (μ,ν)(\mu,\nu) solves (6) iffμ(X0)+ν(X1)=c(X1X0)\mu(X_{0})+\nu(X_{1})=c(X_{1}-X_{0})holds with probability one, which provides a basic optimality criterion. Many existingOT algorithms are developed by exploiting the primal dual relation of (1) and(6)(see e.g., ), but have the drawback of yielding minimax problems that are challenging to solve in practice. If cc is strictly convex,the optimal transport map of (5)is unique (almost surely) and yields a form of

where cc^{*} is the convex conjugate function of cc, and ν\nu is an optimal solution of (6), which is cc-convex in that ν(x)=supy{c(yx)+μ(y)}\nu(x)=\sup_{y}\left\{-c(y-x)+\mu(y)\right\} with μ\mu the associated solution.In the canonical case of quadratic cost c(x)=12x2c(x)=\frac{1}{2}\left\lVert x\right\rVert^{2}, we can write T(x)=ϕ(x)T(x)=\nabla\phi(x), where ϕ(x)12x2+ν(x)\phi(x)\coloneqq\frac{1}{2}\left\lVert x\right\rVert^{2}+\nu(x) is a convex function.

Dynamic formulations

Both the MK and Monge problems can be equivalently framed in dynamic waysas finding continuous-time processes that transfer π0{\pi}_{0} to π1{\pi}_{1}.Let {xt ⁣:t}\{x_{t}\colon t\in\} be a smooth path connecting x0x_{0} and x1x_{1}, whose time derivative is denoted as x˙t\dot{x}_{t}.For convex cc, by Jensen’s inequality, we can represent the cost c(x1x0)c(x_{1}-x_{0}) in an integral form:

where the infimum is attained when xtx_{t} is the linear interpolation (geodesic) path: xt=tx1+(1t)x0x_{t}=tx_{1}+(1-t)x_{0}.Hence, the MK optimal transport problem (1) is equivalent to

Hence, we can rewrite (9) into an optimization problem on (v,ϱ)(v,\varrho), yielding the celebrated Benamou-Brenier formula :

Rectified Flow: An Optimization-Based View

We introduce rectified flow of from an optimization-based perspective:we show that rectified flow can be viewed as the solutionof a special constrained dynamic optimization problem,which allows us to gain more understanding of rectified flow and motivates the development of cc-rectified flow.Following ,for a time-differentiable stochastic process X={Xt ⁣:t}{\boldsymbol{X}}=\{X_{t}\colon t\in\}, its expected velocity field vXv^{{\boldsymbol{X}}} is defined as

where X˙t\dot{X}_{t} denotes the time derivative of XtX_{t}.Obviously, vXv^{{\boldsymbol{X}}} is the solution of

Precisely, Equation (13) should be interpreted by its weak and integral form:

Moreover, the rectified flow of a coupling (X0,X1)(X_{0},X_{1})is defined as the rectified flow of X{\boldsymbol{X}} when X{\boldsymbol{X}} is the linear interpolation of (X0,X1)(X_{0},X_{1}).

A stochastic process X{\boldsymbol{X}} is called rectifiable if vXv^{\boldsymbol{X}} exists and is locally bounded,and Equation (15)has an unique solution.A coupling (X0,X1)(X_{0},X_{1}) is called rectifiable if its linear interpolation process X{\boldsymbol{X}}, following Xt=tX1+(1t)X0X_{t}=tX_{1}+(1-t)X_{0}, is rectifiable.In this case, we call Z=Rectflow(X){\boldsymbol{Z}}=\mathtt{Rectflow}({\boldsymbol{X}}) the rectified flow of (X0,X1)(X_{0},X_{1}), and write it (with an abuse of notation) as Z=Rectflow((X0,X1)){\boldsymbol{Z}}=\mathtt{Rectflow}((X_{0},X_{1})).The corresponding(Z0,Z1)(Z_{0},Z_{1}) is called the rectified coupling of (X0,X1)(X_{0},X_{1}), denoted as (Z0,Z1)=Rectify((X0,X1))(Z_{0},Z_{1})=\mathtt{Rectify}((X_{0},X_{1})).

Assume that X{\boldsymbol{X}} is rectifiable. We have

Hence,rectified flowturns a rectifiable stochastic process into a flow while preserving the marginal laws.

We show thatthe rectified flow Z{\boldsymbol{Z}} of X{\boldsymbol{X}}achieves the minimum of the path-wise cc-transport cost in the set of time-differentiable stochastic processes whose expected velocity field equals vXv^{{\boldsymbol{X}}}.This explains that the property of non-increasing convex transport costs of rectified flow/coupling.

The rectified flow Z=Rectflow(Xt){\boldsymbol{Z}}=\mathtt{Rectflow}({\boldsymbol{X}}_{t}) in (15) attains the minimum of

which yields a proof of Theorem 3.2 of that the rectified coupling (Z0,Z1)(Z_{0},Z_{1}) yields no larger convex transport costs than (X0,X1)(X_{0},X_{1}).

A primal-dual relation

Let us generalize the least squares loss LX(v)L_{{\boldsymbol{X}}}(v) in (12) to aa Bregman divergence based loss:

For any differentiable convex function cc,and rectifiable process X{\boldsymbol{X}}, we have

andthe optima above are achieved when v=vXv=v^{{{\boldsymbol{X}}}} and Y=Rectflow(X){\boldsymbol{Y}}=\mathtt{Rectflow}({\boldsymbol{X}}).

Straight couplings

Take π0=π1=N(0,I){\pi}_{0}={\pi}_{1}=\mathcal{N}(0,I).Hence, for c(x)=xpc(x)=\left\lVert x\right\rVert^{p} with p>0p>0,the cc-optimal mapping is the trivial identity coupling (X0,X0)(X_{0},X_{0}) with X0π0X_{0}\sim{\pi}_{0}.However,consider the coupling (X0,AX0)(X_{0},AX_{0}),where AA is anon-identity and non-reflecting rotation matrix (namely AA=IA^{\top}A=I, det(A)=1\det(A)=1, AIA\neq I and AA does not have λ=1\lambda=-1 as an eigenvalue).Then (X0,AX0)(X_{0},AX_{0}) is a straight coupling of π0{\pi}_{0} and π1{\pi}_{1},but it is not cc-optimal for allsecond order differentiable convex function cc whose Hessian matrix is invertible everywhere. See Appendix for the proof.It is the rotation transformthat makes (X0,AX0)(X_{0},AX_{0}) sub-optimal,which is removed in the proposed cc-rectified flow in Section 5 via a Helmholtz like decomposition.

Differentiable Processes with Equivalent Marginal Laws

The marginal preserving property of rectified flow is due to the property of vZ=vXv^{{\boldsymbol{Z}}}=v^{{\boldsymbol{X}}} by construction.However, we show in this section thatvX=vZv^{{\boldsymbol{X}}}=v^{{\boldsymbol{Z}}} is only a sufficient condition:two differentiable processes X{\boldsymbol{X}} and Z{\boldsymbol{Z}}can have the same marginal lawseven if rvXvZ0r\coloneqq v^{{\boldsymbol{X}}}-v^{{\boldsymbol{Z}}}\neq 0.This is because rr, as illustrated in Example 3.5,can be a rotation-only vector field (in a generalized sense shown below)that introduces rotation components into the dynamics without modifying the marginal distributions.Therefore, the constraint of vY=vXv^{{\boldsymbol{Y}}}=v^{{\boldsymbol{X}}} inthe optimization problem (16) may be too restrictive.A natural relaxation of (16) would be

which yields a dynamic OT problem with a continuum ofmarginal constraints.In Section 5, we show thatthe solution of (17) yields our cc-rectified flow thatsolve the OT problemat the fixed point.Solving (17) allows us to remove the rotational components of vXv^{{\boldsymbol{X}}}, which is whatwhat renders rectified flow non-optimal.In this section,we first characterize the necessary and sufficient conditionfor having equivalent marginal laws.

which gives (rtϱt)=0\nabla\cdot(r_{t}\varrho_{t})=0. This says that rtϱtr_{t}\varrho_{t} is a rotation-only (or divergence-free) vector field in the classical sense.

which is the definition ofY{\boldsymbol{Y}}-marginal-preserving following (18).∎

c𝑐c-Rectified Flow

We introduce cc-rectified flow,a cc-dependent variant of rectified flow that guarantees tominimize the cc-transport costwhen applied recursively. This section is organized as follows:Section 5.1defines and discusses the cc-rectified flow of a differentiable stochastic process X{\boldsymbol{X}}, which we show yields the solution of theinfinite-marginal OT problem (17).Section 5.2considers the cc-rectified flow of a coupling (X0,X1)(X_{0},X_{1}), which we show is non-increasing on the cc-transport cost.Section 5.3proves that the fixed points of c-Rectifyc\text{-}\mathtt{Rectify} are cc-optimal.Section 5.4interprets cc-rectified flowas an alternating direction descent methodfor the dynamic OT problem (8),and a majorize-minimization (MM) algorithm for the static OT problem (1).Section 5.5 discussesa key lemma relating cc-optimal couplings and its associated displacement interpolation with Hamilton-Jacobi equation.

Note that we have mc ⁣(x;y)0{\mathsf{m}_{c}}\!\left(x;y\right)\geq 0 for x,y\forall x,y following the definition of the conjugate cc^{*} (or the Fenchel-Young inequality).Losses of form mc ⁣(x;y){\mathsf{m}_{c}}\!\left(x;y\right)is equivalent to the so called matching lossproposed forlearning generalized linear models . Compared with the original rectified flow,the difference of cc-rectified flow is i) restricting the velocity field to a form of gt=cftg_{t}=\nabla c^{*}\circ\nabla f_{t}, and ii) replacing the quadratic objective function to the matching loss.These two changes combined yield a Helmholtz like decomposition of vXv^{{\boldsymbol{X}}} as we show below, allowing us to remove the “rotation-only” component of vXv^{{\boldsymbol{X}}} and obtain cc-optimal couplings at fixed points.

We can equivalently write (23) using Bergman divergence associated with cc, that is,

Then it is easy to see that mc ⁣(x;y)=bc ⁣(x;c(y)){\mathsf{m}_{c}}\!\left(x;y\right)={\mathsf{b}_{c}}\!\left(x;\nabla c^{*}(y)\right), by using the fact that c(c(y))=y\nabla c(\nabla c^{*}(y))=y and c(y)=yc(y)c(c(y))c^{*}(y)=y^{\top}\nabla c^{*}(y)-c(\nabla c^{*}(y)).Hence, mc{\mathsf{m}_{c}} and bc{\mathsf{b}_{c}} are equivalent up to the monotonic transform c\nabla c^{*} on yy.The minimum bc ⁣(x;y)=0{\mathsf{b}_{c}}\!\left(x;y\right)=0 is achieved when y=xy=x,while mc ⁣(x;y)=0{\mathsf{m}_{c}}\!\left(x;y\right)=0 is achieved when c(y)=x\nabla c^{*}(y)=x.Therefore, (23) is equivalent to

Moreover,the generalized Pythagorean theorem of Bregman divergence (e.g., ) gives

which can be viewed as projecting the expected velocity vtXv^{{\boldsymbol{X}}}_{t} to the set of functions of form gt=cftg_{t}=\nabla c^{*}\circ\nabla f_{t},w.r.t. the Bregman divergence.This yields an orthogonal decomposition of vtXv_{t}^{{\boldsymbol{X}}}:

where rtX,c=vtX,ccftX,cr^{{{\boldsymbol{X}}},c}_{t}=v^{{{\boldsymbol{X}}},c}_{t}-\nabla c^{*}\circ\nabla f^{{{\boldsymbol{X}}},c}_{t} is the residual term.The key result below shows that rX,cr^{{{\boldsymbol{X}}},c} is X{{\boldsymbol{X}}}-marginal-preserving, which ensures thatthe cc-rectified flow preserves the marginals of X{{\boldsymbol{X}}}.

We say that X{\boldsymbol{X}} is cc-rectifiable if vXv^{{\boldsymbol{X}}} exists,the minimum of (23) exists and is attained by a locally bounded function fX,cf^{{\boldsymbol{X}},c},and the solution of Equation (22) exists and is unique.

Taking gs=hg_{s}=h if s<ts<t and gs=0g_{s}=0 if s>ts>t yields that rX,c(x)=c(fsX,c(Xs))vX(Xs)r^{{{\boldsymbol{X}}},c}(x)=\nabla c^{*}(\nabla f_{s}^{{{\boldsymbol{X}}},c}(X_{s}))-v^{{\boldsymbol{X}}}(X_{s}) is X{\boldsymbol{X}}-marginal-preserving following (18).ii) Note that Z{\boldsymbol{Z}} is rectifiable if X{\boldsymbol{X}} is cc-rectifiable. Applying Lemma 4.2yields the result. ∎

For the quadratic cost c(x)=c(x)=12x2c(x)=c^{*}(x)=\frac{1}{2}\left\lVert x\right\rVert^{2}, the c\nabla c^{*} is the identity mapping, and (27) reduces to the Helmholtz decomposition,which represents a velocity field into the sum of a gradient field and a divergence-free field. Hence,(27) yields a generalization of Helmholtz decomposition, in which a monotonic transform c\nabla c^{*} is applied on the gradient field component.We call (27) aBregman Helmholtz decomposition.

Remark: score matching

In some special cases, vXv^{{\boldsymbol{X}}} may already be a gradient field,and hence the rectified flow and cc-rectified flow coincide for c(x)=12x2c(x)=\frac{1}{2}\left\lVert x\right\rVert^{2}. One example of this is when Xt=αtX1+βtξX_{t}=\alpha_{t}X_{1}+\beta_{t}\xifor some time-differentiable functions αt\alpha_{t} and βt\beta_{t}, and ξN(0,I)\xi\sim\mathcal{N}(0,I), satisfying α1=1,β1=0\alpha_{1}=1,\beta_{1}=0, and X0=α0X1+β0ξX_{0}=\alpha_{0}X_{1}+\beta_{0}\xi. In this case, one can show that

c𝑐c-Rectified flow solves Problem (17)

We are ready to show that the cc-rectified flow solves the optimization problem in (17).Further, (23) forms a dual problem of(17).

Under the conditions in Theorem 5.2, we havei) Z=c-Rectify(X){\boldsymbol{Z}}=c\text{-}\mathtt{Rectify}({\boldsymbol{X}}) attains the minimum of (17).ii) Problem(17) and (23) has a strong duality:

As the optima above are achieved by fX,cf^{{{\boldsymbol{X}}},c} and Z{\boldsymbol{Z}}, we haveLX,c(fX,c)=Fc(X)Fc(Z).L_{{{\boldsymbol{X}}},c}(f^{{{\boldsymbol{X}}},c})=F_{c}({\boldsymbol{X}})-F_{c}({\boldsymbol{Z}}).

Moreover, if we takeY=Z{\boldsymbol{Y}}={\boldsymbol{Z}} and f=fX,cf=f^{{{\boldsymbol{X}}},c},then the inequality in (1)\overset{(1)}{\leq} is tight because Z˙t=c(ft(Yt))\dot{Z}_{t}=\nabla c^{*}(\nabla f_{t}(Y_{t})) holds tt-almost surely.Therefore, RX,c(Z)=LX,c(fX,c)RX,c(Y)R_{{{\boldsymbol{X}}},c}({\boldsymbol{Z}})=L_{{{\boldsymbol{X}}},c}(f^{{{\boldsymbol{X}}},c})\geq R^{{{\boldsymbol{X}}},c}(Y), which suggests that Z{\boldsymbol{Z}} attains the maximum of RX,cR_{{{\boldsymbol{X}}},c} (under the marginal constraints) and the strong duality holds.∎

Similar to the case of rectified flow,the cc-rectified flow/coupling of a coupling (X0,X1)(X_{0},X_{1})is defined as the cc-rectified flow/coupling of its linear interpolation process.In the following, we show that the cc-rectified coupling of a coupling yields no largercc-transport cost.

Let X{\boldsymbol{X}} be the linear interpolation ofcoupling (X0,X1)(X_{0},X_{1}) in that Xt=tX1+(1t)X0,tX_{t}=tX_{1}+(1-t)X_{0},\forall t\in. We say that (X0,X1)(X_{0},X_{1}) is cc-rectifiable if X{\boldsymbol{X}} is cc-rectifiable, and call Z=c-Rectflow(X){\boldsymbol{Z}}=c\text{-}\mathtt{Rectflow}({\boldsymbol{X}}) the cc-rectified flow of (X0,X1)(X_{0},X_{1}).We call the induced (Z0,Z1)(Z_{0},Z_{1}) the cc-rectified coupling of (X0,X1)(X_{0},X_{1}), denoted as (Z0,Z1)=c-Rectify((X0,X1))(Z_{0},Z_{1})=c\text{-}\mathtt{Rectify}((X_{0},X_{1})).

which establishes that (Z0,Z1)(Z_{0},Z_{1}) yields no larger transport cost than (X0,X1)(X_{0},X_{1}).

Compared with the regular Rectify\mathtt{Rectify} mapping,the key difference here is thatthe monotonicity of c-Rectifyc\text{-}\mathtt{Rectify} only holds for the specific cc that it employees, rather than all convex cost functions. More importantly,as we show below,recursively applying c-Rectifyc\text{-}\mathtt{Rectify}yields optimal couplings w.r.t. cc,a key property that the regular rectified flow misses.

3 Fixed Points ofc𝑐c-𝚁𝚎𝚌𝚝𝚒𝚏𝚢𝚁𝚎𝚌𝚝𝚒𝚏𝚢\mathtt{Rectify} arec𝑐c-Optimal

Knowing that LX,c(fX,c)L_{{{\boldsymbol{X}}},c}(f^{{{\boldsymbol{X}}},c}) is an indication of cc-optimality, we show below that it is guaranteed to converge to zero with recursive Rectify\mathtt{Rectify} updates.

Let Zk{\boldsymbol{Z}}^{k} be the kk-th cc-rectified flow of (X0,X1)(X_{0},X_{1}), satisfying Zk+1=c-Rectflow((Z0k,Z1k)){\boldsymbol{Z}}^{k+1}=c\text{-}\mathtt{Rectflow}((Z_{0}^{k},Z_{1}^{k})) and (Z00,Z10)=(X0,X1)(Z_{0}^{0},Z_{1}^{0})=(X_{0},X_{1}).Assume each (Z0k,Z1k)(Z_{0}^{k},Z_{1}^{k}) is cc-rectifiable for k=0,,Kk=0,\ldots,K. Then

Applying (28) to (Z0k,Z1k)(Z_{0}^{k},Z_{1}^{k}) and (Z0k+1,Z1k+1)(Z_{0}^{k+1},Z_{1}^{k+1}) yields

4 c𝑐c-Rectified Flow as Optimization Algorithms

In this section, we draw more understanding on how iterative cc-rectified flowing solves the static and dynamic OT problems. We first show that cc-rectified flow can be viewed as an alternative direction descent on the dynamic OT problem(8), and then that cc-rectified coupling as a majorize-minimization (MM) algorithm on the statistic OT problem (1).The results in this section are framed in terms of a general path-wise loss function Fc(Y)F_{c}({\boldsymbol{Y}}),and hence provide a useful starting point for deriving cc-rectified flow like approaches tomore general optimization problems with coupling constraints.

The mapping Zk+1=c-Rectflow(Zk){\boldsymbol{Z}}^{k+1}=c\text{-}\mathtt{Rectflow}({\boldsymbol{Z}}^{k}) can beinterpreted as an alternative direction descent procedure for the dynamic OT problem (8):

c𝑐c-Rectified flow as an MM algorithm

In this case,the MM update guarantees that F(Xk)F(X^{k}) is monotonically non-increasing:

and the minimum is attained by (X0,X1)=(Y0,Y1)(X_{0},X_{1})=(Y_{0},Y_{1}),where Π0,1\Pi_{0,1} denotes the set of couplings of π0{\pi}_{0} and π1{\pi}_{1}.ii) c-Rectifyc\text{-}\mathtt{Rectify} yields the MM update related F+F^{+} in that

i) For any coupling (X0,X1)(X_{0},X_{1}) and (Y0,Y1)(Y_{0},Y_{1}), we have

where the inequality holds because remove the constraint YMX{\boldsymbol{Y}}\in\mathcal{M}_{X}. In addition, it is obvious that the inequality above becomes equality when (X0,X1)=(Y0,Y1)(X_{0},X_{1})=(Y_{0},Y_{1}).ii) Note that

whose minimum of the right side is attained by Y=c-Rectflow((X0,X1)){\boldsymbol{Y}}=c\text{-}\mathtt{Rectflow}((X_{0},X_{1})) following Theorem 5.3. Hence, the minimum of the left side is attained by (Y0,Y1)=c-Rectify((X0,X1))(Y_{0},Y_{1})=c\text{-}\mathtt{Rectify}((X_{0},X_{1})).∎

5 Hamilton-Jacobi Equation and Optimal Transport

On the other hand, define ht(x)=tft(x)+c(ft(x))h_{t}(x)=\partial_{t}f_{t}(x)+c^{*}(\nabla f_{t}(x)). Then we have

Hence, (X0,X1)(X_{0},X_{1}) is a cc-optimal coupling.∎

where we assume that λtvrρt\lambda_{t}v_{r}\rho_{t} decays to zero sufficiently fast at infinity.We have

At the saddle points,the functional derivations of L\mathcal{L} equal zero, yielding

Assume ϱt\varrho_{t} is positive everywhere and note that c(c(x))=x\nabla c^{*}(\nabla c(x))=x, we have vt=c(λt)v_{t}=\nabla c^{*}(\nabla\lambda_{t}),and hence λtvtc(vt)=c(λt)\nabla\lambda_{t}^{\top}v_{t}-c(v_{t})=c^{*}(\nabla\lambda_{t}).Plugging it back to δLδρt=0\frac{\delta\mathcal{L}}{\delta\rho_{t}}=0 yields that λ˙t+c(λt)=0\dot{\lambda}_{t}+c^{*}(\nabla\lambda_{t})=0.Overall, the (formal) KKT condition of (10) is

Discussion and Open Questions

For machine learning (ML) tasks such as generative models and domain transfer, the transport cost is not necessarily the direct object of interest.In these cases,as suggested in, rectified flow might be preferred because it is simpler and does not require to specify a particular cost cc.Question: for such ML tasks, when would it be preferred to use OT with a specific cc, and how to choose cc optimally?

In practice, recursively applying the (cc-)rectification accumulates errorsbecause the training optimization for the drift field and the simulation of the ODEcan not be conducted perfectly.Howto avoid the error accumulationat each step?Assume {x1,i}iπ1\{x_{1,i}\}_{i}\sim{\pi}_{1}, and {z0,ik,z1,ik}i\{z_{0,i}^{k},z_{1,i}^{k}\}_{i} is obtained by solving the ODE of the kk-th cc-rectified flow starting from z0,ikπ0z_{0,i}^{k}\sim{\pi}_{0}.As we increase kk, {z0,ik}i\{z_{0,i}^{k}\}_{i} may yield increasingly bad approximation of π1{\pi}_{1} due to the error accumulation. One way to fix this is to adjust {z1,ik}\{z_{1,i}^{k}\} to make it closer to {x1,ik}i\{x_{1,i}^{k}\}_{i} at each step. This can be done by reweighting/transporting {z1,ik}i\{z_{1,i}^{k}\}_{i} towards {x1,ik}i\{x_{1,i}^{k}\}_{i} by minimizing certain discrepancy measure,or replacing each z1,ikz_{1,i}^{k} with xσ(i)kx_{\sigma(i)}^{k} where σ\sigma is a permutation that yields a one-to-one matching between {z1(i)}\{z_{1}^{(i)}\} and {x1(i)}i\{x_{1}^{(i)}\}_{i}.The key and challenging part is to do the adjustment in a good and fast way, ideally with a (near) linear time complexity.

With or without the adjustment step,build a complete theoretical analysis on the statistical error of the method.

In what precise sense is rectified flow solving a multi-objective variant of optimal transport?

References

Appendix A Proofs

i)The fact that AA=IA^{\top}A=I and π0=π1=N(0,I){\pi}_{0}={\pi}_{1}=\mathcal{N}(0,I)ensures that AX0π1AX_{0}\sim{\pi}_{1} and hence(X0,AX0)(X_{0},AX_{0}) is a coupling of π0{\pi}_{0} and π1{\pi}_{1}.Let Xt=tAX0+(1t)X0X_{t}=tAX_{0}+(1-t)X_{0} be the linear interpolation of the coupling. Related, we have X˙t=AX0X0\dot{X}_{t}=AX_{0}-X_{0}. Canceling X0X_{0} yields that

where cc^{*} is the convex conjugate of cc.Equation (33) is equivalent to c(Axx)=ϕ(x)\nabla c(Ax-x)=\nabla\phi(x), which means that ϕ\nabla\phi is continuously differentiable.Taking gradient on both sides of (33) gives

where Hx,BxH_{x},B_{x} are both symmetric and HxH_{x} is positive definite,and henceThen HxBxH_{x}B_{x} is a diagonalizable (all its eigenvalues are real) by Lemma A.1.However, AIA-I is not diagonalizable because AA must have complex eigenvalues as a non-reflecting, non-identity rotation matrix. Hence, (34) can not hold.

Assume that A,BA,B are two real symmetric matrices and AA is positive definite. Then ABAB is diagonalizable (on the real domain), that is, there exists an invertible matrix PP, such that P1ABPP^{-1}ABP is a diagonal matrix.

This is a standard result in linear algebra.Because AA is positive definite, there exists an invertible symmetric matrix CC, such that CC=A.CC=A. Then, AB=CCBAB=CCB, and it is similar to CBC1CBC^{-1}, which is symmetric and hence diagonalizable.∎