Scaling a unitary matrix

Alexis De Vos, Stijn De Baerdemacker

Introduction

By definition, the scaling of an n×mn\times m matrix AA is the multiplication of this matrix to the left by a diagonal n×nn\times n matrix LL and to the right by a diagonal m×mm\times m matrix RR, resulting in the n×mn\times m matrix B=LARB=LAR, called the scaled matrix .

Sinkhorn has demonstrated that an arbitrary matrix AA with exclusively real and positive entries can be scaled by diagonal matrices LL and RR with exclusively positive real entries, such that the resulting scaled matrix BB is doubly stochastic, i.e. such that all entries of BB are real and in the interval (0,1)(0,1) and all line sums (i.e. all row sums and all column sums) of BB are equal to 1.

In order to find the appropriate matrices LL and RR, one could consider the set of n+mn+m line-sum equations and solve it for the nn unknown entries of LL and the mm unknown entries of RR. However, all equations are quadratic. As the equations are non-linear, an analytic solution of the set is not available. Instead, one proceeds iteratively, in the way pioneered by Kruithof . One computes the matrices A1A_{1}, A2A_{2}, …, successive approximations of the wanted matrix BB. At the kk th iteration, one chooses a diagonal matrix LkL_{k} with diagonal entries equal to the inverse of the row sums of the matrix Ak1A_{k-1} :

and one chooses a diagonal matrix RkR_{k} with diagonal entries equal to the inverse of the resulting column sums

Because this procedure converges, the matrices LkL_{k} and RkR_{k} ultimately become the n×nn\times n and m×mm\times m unit matrices, respectively. This means that ultimately AkA_{k} becomes a matrix with unit line sums. By choosing A0A_{0} equal to AA, the two wanted scaling matrices are

and the scaled matrix BB is AA_{\infty}.

In the present paper, we investigate whether it is possible to scale an n×nn\times n unitary matrix AA by two unitary diagonal matrices LL and RR, such that the scaled matrix B=LARB=LAR has all line sums equal to 1. For this purpose, we apply a Sinkhorn-like algorithm, however with

thus guaranteeing that all the diagonal entries of LkL_{k} and RkR_{k} are automatically unitary. Here, we call a complex number xx unitary iff x=1|x|=1 and define the function Φ\Phi of a complex number yy as

If this procedure ultimately converges, then both LkL_{k} and RkR_{k} ultimately become the n×nn\times n unit matrix. This means that ultimately AkA_{k} becomes a matrix with all line sums having Φ=1\Phi=1, i.e. with all line sums real (positive or zero). Below, we will see that not only AkA_{k} converges to a matrix AA_{\infty} with exclusively realThe n×nn\times n unitary matrices with real line-sums form a subset of U(nn), but not a subgroup of U(nn). This can easily be illustrated by multiplying two U(2) matrices: the square root of NOT matrix M_{1}=\frac{1}{2}\ {\scriptsize\left(\begin{array}[]{rr}1+i&1-i\\ 1-i&1+i\end{array}\right)} and the orthogonal matrix M_{2}=\frac{1}{2}\ {\scriptsize\left(\begin{array}[]{rr}\sqrt{3}&-1\\ 1&\sqrt{3}\end{array}\right)}. All four line sums of both matrices are real (and positive); their product M1M2M_{1}M_{2}, however, has a real and positive first column sum c1=(3+1)/2c_{1}=(\sqrt{3}+1)/2, but a complex first row sum r1=(3i)/2r_{1}=(\sqrt{3}-i)/2. (non-negative) line sums, but that surprisingly those line sums moreover equal 1.

A progress measure

For investigating the progress in the matrix sequence A0,A1,A2,...A_{0},A_{1},A_{2},..., we basicly follow a reasoning similar to the elegant proofs of Sinkhorn’s theorem by Linial et al. and by Aaronson . However, the pivotal role (called either ‘progress measure’ or ‘potential function’) played either by the matrix permanent (Linial et al.) or by the matrix product (Aaronson) is taken over here by the absolute value of the matrix sum. The matrix sum of an n×mn\times m matrix XX is defined as the sum of all its entries:

Assume a matrix XX with row sums rar_{a} and column sums cbc_{b}. We denote by LL the diagonal matrix with entries LaaL_{aa} equal to 1/Φ(ra)1/\Phi(r_{a}). If XX^{\prime} is a short-hand notation for LXLX, and rar^{\prime}_{a} and cbc^{\prime}_{b} are its row sums and column sums, respectively, then we have

because ara\sum_{a}r^{\prime}_{a} can be regarded as a vector sum of vectors with the same length as the vectors rar_{a} but with zero angles between them. The equality sign holds iff all numbers rar_{a} have the same argument. Similarly, we denote by RR the diagonal matrix with entries RbbR_{bb} equal to 1/Φ(cb)1/\Phi(c^{\prime}_{b}). If XX^{\prime\prime} is a short-hand notation for XRX^{\prime}R, and cbc^{\prime\prime}_{b} are its column sums, then

where the equality sign holds iff all numbers cbc^{\prime}_{b} have the same argument. We thus can conclude that

where the equality holds iff the equality holds both in (1) and in (2). The equality in (1) occurs if all rar_{a} have the same argument, whereas the equality in (2) occurs if all cbc^{\prime}_{b} have the same argument. But, a constant arg(ra)\arg(r_{a}) leads to an XX^{\prime} of the form eiαXe^{i\alpha}\,X and therefore to column sums cb=eiαcbc^{\prime}_{b}=e^{i\alpha}\,c_{b} and the condition of constant arg(cb)\arg(c^{\prime}_{b}) then is equivalent to the condition of constant arg(cb)\arg(c_{b}).

We conclude that, in the matrix sequence A0,A1,A2,...A_{0},A_{1},A_{2},... of Section 1, we have

where the equality sign holds iff Ak1A_{k-1} is a matrix with all line sum arguments equal. As soon as k1>0k-1>0, this condition is equivalent to iff Ak1A_{k-1} is a matrix with all line sums real, either zero or positiveWe remark that, for k>0k>0, the procedure of Section 1 guarantees that all column sums and thus also the matrix sum are positive or zero. Hence, for k2k\geq 2, the absolute value symbols in (3) could be omitted..

In Appendix A, we prove that, for an arbitrary n×nn\times n unitary matrix UU, we have \mboxsum(U)n|\mbox{sum}(U)|\leq n. The equality sign holds iff UU is, up to a global phase, a matrix with all line sums equal to 1. For sake of convenience, we define the potential function Ψ\Psi of an n×nn\times n matrix MM as

such that for all unitary matrices 0Ψn20\leq\Psi\leq n^{2} holds and the zero potential corresponds with unit line-sum matrices (times a factor eiαe^{i\alpha}). With this convention, we rewrite (3) as

The Ψ\Psi landscape of the matrix group U(nn) displays stationary points. In Appendix B, we show that these points either have zero matrix sum or have all (non-zero) line sums with same argument. We distinguish three categories of stationary points: their potential satisfies

If the first case occurs, then the stationary point is a global maximum; if the second case occurs, then the stationary point is a global minimum. We conjecture that the third class consists of saddle points. In other words: we conjecture that the Ψ\Psi landscape has no local minima (nor local maxima). As a result, the scaling procedure (with ever decreasing Ψ\Psi) ultimaltely converges to the point with minimal potential. We conjecture that this global minimum is a matrix with Ψ=0\Psi=0 and thus is a wanted unit line-sum matrix BB.

Either the given matrix AA does not have constant line sum arguments, in which case we choose A0=AA_{0}=A. As long as the subsequent matrices A1A_{1}, A2A_{2}, … do not have all row sums real, we have a strictly decreasing sequence Ψ(A0)>Ψ(A1)>Ψ(A2)>...\Psi(A_{0})>\Psi(A_{1})>\Psi(A_{2})>.... This sequence is bounded by 0. Therefore a limit matrix AA_{\infty} exists, with Ψ(A)0\Psi(A_{\infty})\geq 0.

If Ψ(A)=0\Psi(A_{\infty})=0, then AA_{\infty} is the wanted scaled matrix. The scaling matrices are L=L...L2L1L=L_{\infty}...L_{2}L_{1} and R=R1R2...RR=R_{1}R_{2}...R_{\infty}.

In the (very unlikely) case that Ψ(A)>0\Psi(A_{\infty})>0, the matrix AA_{\infty} is a stationary point in the potential landscape Ψ\Psi. According to the conjecture, this point is a saddle point and therefore we can apply appropriate matrices LL and RR, both close to the unit matrix, such that LARLA_{\infty}R has potential Ψ\Psi lower than Ψ(A)\Psi(A_{\infty}). It is sufficient to try nn mutually orthogonal directions in the (2n1)(2n-1)-dimensional neighbourhood of the saddle point. After applying these LL and RR, we restart the algorithm with a new A0A_{0}, equal to LARLA_{\infty}R.

Or the given matrix AA has all line sum arguments equal and thus AA is a stationary point. In this case we choose A0=L0AR0A_{0}=L_{0}AR_{0}, with two appropriate matrices L0L_{0} and R0R_{0}, such that the start matrix A0A_{0} is not a stationary point. For this purpose, we can proceed as follows:

If at least two row sums of AA are different from 0, e.g. rx0r_{x}\neq 0 and ry0r_{y}\neq 0, then we take R0R_{0} equal to the unit matrix and all entries of L0L_{0} equal to 1, except (L0)xx(L_{0})_{xx}, thus resulting in at least two different row-sum arguments for A0A_{0}.

If at least two column sums of AA are different from 0, e.g. cx0c_{x}\neq 0 and cy0c_{y}\neq 0, then we take L0L_{0} equal to the unit matrix and all entries of R0R_{0} equal to 1, except (R0)xx(R_{0})_{xx}, thus resulting in at least two different column-sum arguments for A0A_{0}.

If only one row sum (say rxr_{x}) and only one column sum (say cyc_{y}) of AA differ from 0, i.e. if AA is a generalized Hadamard matrix , then we take R0R_{0} equal to the unit matrix and all entries of L0L_{0} equal to 1, except (L0)xx(L_{0})_{xx}, thus resulting in at least two different column-sum arguments for A0A_{0}.

An example of the latter case is the orthogonal matrix

where, for convenience, we assume 0<ϕ<π/40<\phi<\pi/4. Indeed: all its line sums have zero argument. If we would take A0=AA_{0}=A, then L1L_{1} would be equal to the 2×22\times 2 unit matrix and subsequently R1R_{1} would be equal to the unit matrix, such that A1A_{1} and, in fact, all subsequent AkA_{k} would be equal to AA and therefore Ψ(A0)=Ψ(A1)=Ψ(A2)=...\Psi(A_{0})=\Psi(A_{1})=\Psi(A_{2})=..., equal to 2cos(ϕ)2\cos(\phi) in this example. In this way, the algorithm cannot find the wanted solution, in spite of the fact that such scaled matrices with exclusively unit line sums actually exist, e.g.

In order to avoid the no-start of the convergence towards the desired scaled matrix BB, we apply e.g. the matrices L_{0}={\scriptsize\left(\begin{array}[]{rr}i&0\\ 0&1\end{array}\right)} and R_{0}={\scriptsize\left(\begin{array}[]{rr}1&0\\ 0&1\end{array}\right)}, resulting in

where row sums r1=i(cos(ϕ)+sin(ϕ))r_{1}=i(\cos(\phi)+\sin(\phi)) and r2=cos(ϕ)sin(ϕ)r_{2}=\cos(\phi)-\sin(\phi) indeed have unequal arguments: π/2\pi/2 and 0, respectively.

Convergence speed

As a first example of the procedure of Section 2, we take the unitary matrix

with line sums, matrix sum, and potential

Because Ψn2\Psi\neq n^{2}, Ψ0\Psi\neq 0, and the line sums do not have equal argument, we are in a ‘common’ case, i.e. not in a stationary point of the Ψ\Psi landscape. We thus choose A0=AA_{0}=A. Subsequent steps of the algorithm yield potentials

Next, with the method of Życzkowski et al. , we generate 1,000 random elements of U(3), uniformly distributed with respect to the Haar measure. Table 1 shows how the potential Ψ\Psi decreases after each step of the scaling procedure. Whereas the Ψ\Psi values of the initial matrices A0A_{0} have a wide distribution between 0 and n2=9n^{2}=9, the distribution of Ψ(Ak)\Psi(A_{k}) is very peaked at Ψ=0\Psi=0, as soon as k>0k>0.

The convergence speed turns out to be strongly different for different matrices AA. Among the 1,000 samples, some converge exceptionally slowly, as is illustrated by the column ‘maximum(Ψ\Psi)’. Usually, however, convergence is fast, as is illustrated by the column ‘average(Ψ\Psi)’. We stress the fact that all 1,000 experiments directly converge to the global minimum Ψ=0\Psi=0, and thus none ‘gets trapped’ in a local minimum and none temporarily ‘halts’ in a saddle point.

Finally, similar experiments with 1,000 random elements from U(4) (see Table 1) and with 1,000 random elements from U(5) lead to similar results.

For nn equal to 2, 4, 8, 16, and 32, Figure 1 shows the probability distribution of the potential Ψ\Psi, after k=0k=0, 11, 22, and 44 iteration steps. We see how the distribution, at each step, shifts more and more to Ψ=0\Psi=0.

The convergence can also be visualized by displaying Ψk+1\Psi_{k+1} as a function of Ψk\Psi_{k}, i.e. the correlation between a Ψ\Psi after and before an iteration step. Figure 2 shows Ψ1(Ψ0)\Psi_{1}(\Psi_{0}), Ψ2(Ψ1)\Psi_{2}(\Psi_{1}), and Ψ3(Ψ2)\Psi_{3}(\Psi_{2}) for both n=2n=2 and n=3n=3. As expected, all points lay below the line Ψk+1=Ψk\Psi_{k+1}=\Psi_{k}. We see how the cloud of points, after each step, becomes smaller and smaller and moves closer and closer to the point Ψk=Ψk+1=0\Psi_{k}=\Psi_{k+1}=0.

Application

In Reference , two subgroups of the unitary group U(nn) are presented:

the subgroup ZU(nn), consisting of all n×nn\times n diagonal matrices with upper-left entry equal to 1 and other diagonal entries from U(1);

the subgroup XU(nn), consisting of all n×nn\times n unitary matrices with all of their 2n2n line sums are equal to 1,

and the following theorem is proved: any U(nn) matrix UU can be decomposed as

with pn(n1)/2+1p\leq n(n-1)/2+1 and where all ZjZ_{j} are ZU(nn) matrices and all XjX_{j} are XU(nn) matrices. In Reference , it is proved that a shorter decomposition exists: with pnp\leq n.

In the present paper, we conjecture that even a far stronger theorem holds: p2p\leq 2. This means that any U(nn) matrix UU can be decomposed as

where both Z1Z_{1} and Z2Z_{2} are ZU(nn) matrices and XX is an XU(nn) matrix. In , it is proved that the group XU(nn) is isomorphic to U(n1n-1) and hence has dimension (n1)2(n-1)^{2}. Therefore, the product eiαZ1XZ2e^{i\alpha}\,Z_{1}XZ_{2} has

degrees of freedom, matching exactly the dimension of U(nn) and thus making the conjecture dimensionally possible. However, no analytic expression is provided for the unknown entries neither of the matrices Z1Z_{1} and Z2Z_{2} nor of the scaled matrix XX. An analytic solution of the decomposition problem is easily found for n=2n=2. Indeed, an arbitrary member of U(2) can be decomposed according to (4), in two different ways:

For more details about the case U(2), the reader is referred to Appendix C.

No analytic solution is found as soon as n>2n>2. Even the decomposition of an arbitrary member of U(3) is an unsolved problemAs soon as one of the nine entries of the U(3) matrix equals zero, the problem is analytically solvable., in spite of substantial efforts by the authors of the present paper. Independently, in the framework of an other but related problem, Shchesnovich comes to a similar conclusion. For arbitrary nn, Reference gives an analytic solution for a 2n2n-dimensional subset of the n2n^{2}-dimensional group U(nn). We conjecture that the asymptotic scaling procedure of Sections 1 and 2 provides a numerical solution, for any member of U(nn), with arbitrary nn.

If, in particular, we have n=2wn=2^{w}, then a U(nn) matrix represents a quantum circuit of width ww, i.e. acting on ww qubits. We thus may conclude that such circuit can be decomposed as the cascade of an overall phase, two ZZ subcircuits and one XX subcircuit. The basic building block of any ZZ circuit is the 1-qubit circuit represented by 2×22\times 2 matrix

the basic building block of any XX circuit is the 1-qubit circuit represented by

The NEGATOR realizes an arbitrary root of NOT and thus is a natural generalization of the square root of the NOT gate .

Each 2w×2w2^{w}\times 2^{w} matrix ZZ is implemented by a string of 2w12^{w-1} controlled PHASORs; any 2w×2w2^{w}\times 2^{w} matrix XX represents a circuit composed of controlled NEGATORs .

where aa is a short-hand notation for eiαe^{i\alpha} and X0X_{0} is the permutation matrix

we can transform (4) into a decomposition containing exclusively XU and ZU matrices:

where X0X_{0} is an XU matrix which can be implemented with classical reversible gates (i.e. NOTs and controlled NOTs), where Z0Z_{0} is a ZU matrix which can be implemented by a single (uncontrolled) PHASOR gate, and where Z1Z_{1}^{\prime} is the product \mboxdiag(1,a,1,a,1,...,1,a)Z1\mbox{diag}(1,a,1,a,1,...,1,a)\,Z_{1}. The short decomposition (6) illustrates the power of the two subgroups XU(nn) and ZU(nn), which are complementary , in the sense that

they overlap very little, as their intersection is the trivial group consisting of merely the n×nn\times n unit matrix and

they strengthen each other sufficiently, as their closure is the whole unitary group U(nn).

As an example, we give here a decomposition of the Hadamard gate according to schemes (4) and (6), respectively:

Conclusion

We have presented a method for scaling an arbitrary matrix from the unitary group U(nn), by multiplying the matrix to the left and to the right by unitary diagonal matrices. We conjecture that the resulting scaled matrix is a member of XU(nn), i.e. the subgroup of U(nn) consisting of all n×nn\times n unitary matrices with all 2n2n line sums equal to 1. If n=2n=2, then scaling can be performed analytically and thus with infinite precision. If n>2n>2, then scaling has to be performed numerically and thus with finite precision. In the terminology of Linial et al. , we would say that matrices from U(2) are ‘scalable’, whereas matrices from U(nn) with n>2n>2 are ‘almost scalable’. The conjecture that the numerical algorithm converges to a unit line-sum matrix, is based on four observations:

the proof that such matrix exists in the case n=2n=2,

the proof that such matrix exists for a 2n2n-dimensional subset of the general case U(nn) ,

the success of 1,000 numerical experiments in the cases n=3n=3, n=4n=4, and n=5n=5, and

the fact that, according to (5), there are, for arbitrary nn, exactly the right number of freedoms.

For the special case of n=2wn=2^{w}, this leads to a decomposition of an arbitrary quantum circuit into

one overall phase, one X circuit and two Z circuits or

References

Appendix A Theorem

Theorem : The absolute value of the matrix sum of a U(nn) matrix is smaller than or equal to nn. The U(nn) matrices with abs(matrixsum) = nn are member of the subgroup eiαe^{i\alpha} XU(nn), where XU(nn) denotes the subgroup of U(nn) consisting of the matrices with unit line sums.

Let r1r_{1}, r2r_{2}, …, and rnr_{n} be the row sums of an n×nn\times n matrix. For convenience, we give their real and imaginary parts an explicit notation:

as proved in Appendix A of Reference . We rewrite this property as follows:

where the angle Σ\Sigma is allowed to have any value.

We consider (9) as the eqn of an nn-dimensional hypersphere. We ask ourselves what is the highest value of the function

on the surface of this hypersphere. For this purpose, we note that f=kf=k, with kk some positive constant, is the eqn of the set of two parallel hyperplanes:

The highest value of kk on the hypersphere is when the two planes are tangent to the sphere. This happens when kk equals n2cos2(Σ)n^{2}\,\cos^{2}(\Sigma), the two tangent points having coordinates (s1,s2,...,sn)=±cos(Σ) (1,1,...,1)(s_{1},s_{2},...,s_{n})=\pm\cos(\Sigma)\ (1,1,...,1) and ff having the value n2cos2(Σ)n^{2}\cos^{2}(\Sigma). See the 2-dimensional illustration in Figure 3.

A similar reasoning is possible for the function

Noting that m2|m|^{2} equals f+gf+g, we conclude that m2|m|^{2} has maximum value n2cos2(Σ)+n2sin2(Σ)=n2n^{2}\,\cos^{2}(\Sigma)+n^{2}\,\sin^{2}(\Sigma)=n^{2}. The unitary matrices with this particular m2|m|^{2} value are the matrices with rj=±cos(Σ)±isin(Σ)r_{j}=\pm\cos(\Sigma)\pm i\sin(\Sigma), i.e. the matrices with constant row sum equal to eiαe^{i\alpha}, where α\alpha is either Σ\Sigma or Σ+π\Sigma+\pi.

A dual reasoning holds for the column sums, with cj=dj+iejc_{j}=d_{j}+ie_{j}, such that d12+d22+...+dn2=ncos2(Δ)d_{1}^{2}+d_{2}^{2}+...+d_{n}^{2}=n\cos^{2}(\Delta) and e12+e22+...+en2=nsin2(Δ)e_{1}^{2}+e_{2}^{2}+...+e_{n}^{2}=n\sin^{2}(\Delta). The unitary matrices with m2=n2|m|^{2}=n^{2} are the matrices with cj=±cos(Δ)±isin(Δ)c_{j}=\pm\cos(\Delta)\pm i\sin(\Delta), i.e. the matrices with constant column sum equal to eiβe^{i\beta}, where β\beta is either Δ\Delta or Δ+π\Delta+\pi. Because a matrix can have only one matrix sum, a matrix with both constant row sum and constant column sum necessarily has constant line sum. Therefore, for the matrices with m2=n2|m|^{2}=n^{2}, the angle β\beta equals the angle α\alpha.

The maximum-m|m| matrices equal eiαe^{i\alpha} times a matrix with constant line sum equal to 1. Thus they are member of the group eiαe^{i\alpha} XU(nn), a subgroup of U(nn), isomorphic to U(1) ×\times XU(nn), and thus isomorphic to U(1) ×\times U(n1n-1).

Appendix B The potential landscape

Given a unitary matrix AA, finding the scaled matrix BB is equivalent to solving the (non-linear) eqn

we introduce a 2n2n-dimensional landscape Ψ\Psi, given by

We have to find the minimum of this function, i.e. the point Ψ=0\Psi=0.

In order to investigate the shape of the Ψ\Psi function, we linearize the equation around AA, i.e. in the neighbourhood of (λ1,λ2,...,λn,ρ1,ρ2,...,ρn)(\lambda_{1},\lambda_{2},...,\lambda_{n},\rho_{1},\rho_{2},...,\rho_{n}) = (0,0,...,0,(0,0,...,0, 0,0,...,0)0,0,...,0). For this purpose, we write the row sums, column sums, and matrix sum of the given matrix AA as follows:

The coefficients of λj\lambda_{j} and ρj\rho_{j} form the gradient vector of the Ψ\Psi landscape. A stationary point occurs whenever, for all jj,

These conditions can only be fulfilled in the following cases:

i.e. when the matrix sum is zero and hence Ψ\Psi has the global maximum value of n2n^{2},

i.e. when all non-zero line sums have the same argument.

We conjecture that all these stationary points are either maxima or saddle points or global minima. In other words: we conjecture that no local minima exist. Moreover, we conjecture that the global minima satisfy Ψ=0\Psi=0.

Appendix C The case U(2)

We consider the unitary group U(2). All 2×22\times 2 diagonal unitary matrices form the subgroup DU(2), isomorphic to U(1) ×\times U(1). The subgroup DU(2) divides its supergroup U(2) into double cosets. Let AA be an arbitrary U(2) matrix:

Its double coset consists of all matrices

where cc and ss are short-hand notations for cos(ϕ)\cos(\phi) and sin(ϕ)\sin(\phi), respectively. We introduce the variables

Therefore, the double coset is the 3-parameter space

Thus the double coset of AA consists of the matrices U(ϕ,x,y,z)U(\phi,x,y,z), i.e. of all matrices with the same value of the angle ϕ\phi. This constitutes a 3-dimensional subspace of the 4-dimensional space U(2), except for the cases s=0s=0 (i.e. for the double coset of the IDENTITY matrix {\tiny\left(\begin{array}[]{cc}1&0\\ 0&1\end{array}\right)}) and c=0c=0 (i.e. for the double coset of the NOT matrix {\tiny\left(\begin{array}[]{cc}0&1\\ 1&0\end{array}\right)}), which both are 2-dimensional onlyOne may consider an arbitrary place P(φ,λ)P(\varphi,\lambda) on earth. The points Q(φ,x)Q(\varphi,x), with same latitude φ\varphi but arbitrary longitude xx, form a 1-dimensional subspace of the 2-dimensional earth surface, called the parallel of PP, except if either φ=π/2\varphi=\pi/2, in which case Q(φ,x)Q(\varphi,x) is a 0-dimensional subspace, called the north pole, or φ=π/2\varphi=-\pi/2, in which case Q(φ,x)Q(\varphi,x) is a 0-dimensional subspace, called the south pole. We therefore may consider the 2-dimensional double coset of the IDENTITY matrix and the 2-dimensional double coset of the NOT matrix as the north and the south pole, respectively, of the 4-dimensional U(2) manifold..

What are, within the double coset of AA, the stationary points of the Ψ\Psi landscape? We easily find

The conditions Ψ/x=0\partial\Psi/\partial x=0, Ψ/y=0\partial\Psi/\partial y=0, and Ψ/z=0\partial\Psi/\partial z=0 immediately lead to

This set of two trigonometric equations in the three unknowns xx, yy, and zz has infinitely many solutions, leading to an infinite number of matrices:

with xx arbitrary, k{0,1,2,3}k\in\{0,1,2,3\}, and l{0,1,2,3}l\in\{0,1,2,3\}. These sixteen sets of matrices lead to Ψ\Psi values equal to 44, 4c24c^{2}, 4s24s^{2}, and , corresponding to global maxima, saddle points, saddle points, and global minima, respectively. The saddle points belong to U(1)×\timesO(2), subgroup of U(2); the global extrema do not.

What are, within the same double coset, the matrices with all line sums real? We easily find the conditions:

This set of three trigonometric equations in the three unknowns xx, yy, and zz has twelve solutions:

Four matrices have matrix sum and thus Ψ=4\Psi=4; four matrices have matrix sum ±2s\pm 2s and thus Ψ=4cos2(ϕ)\Psi=4\cos^{2}(\phi); four matrices have matrix sum ±2c\pm 2c and thus Ψ=4sin2(ϕ)\Psi=4\sin^{2}(\phi); four matrices have matrix sum ±2\pm 2 and thus Ψ=0\Psi=0. Thus four of the twelve matrices represent a global maximum; eight of the twelve matrices represent a saddle point; four of the twelve matrices represent a global minimum.

As an example, we choose the AA matrix with 0<ϕ<π/40<\phi<\pi/4, such that 0<s<c<1/20<s<c<1/\sqrt{2}. Among the twelve matrices of its double coset with real line sums , there are only four matrices where the four line sums are positive, i.e.

where ee is a short-hand notation for eiϕ=c+ise^{i\phi}=c+is. Among these four matrices, only BB and BB^{\prime} have unit line sum (and thus Ψ=0\Psi=0).

In the neighbourhood of SS, we have the matrices

with xx, yy, and zz small. This yields a matrix sum

The opposite signs of the coefficients of y2y^{2} and z2z^{2} illustrate the fact that SS is a saddle point of the Ψ\Psi landscape. Only if the subsequent matrices Ak,Ak+1,...A_{k},A_{k+1},... are situated on the z=0z=0 line, then the Sinkhorn-like procedure of Section 2 halts at the point SS. In order to leave this stop, it suffices to continue along another line, e.g. y=0y=0. Similar conclusions hold for the point SS^{\prime}.

In the neighbourhood of BB, we have the matrices

with xx, yy, and zz small. This yields a matrix sum

The positive signs of the coefficients of y2y^{2} and z2z^{2} illustrate the fact that BB is a minimum (actually, a global minimum) of the Ψ\Psi landscape. The same conclusion holds for the point BB^{\prime}.

Let us assume that, in spite of the direct analytic solution for n=2n=2, we scale a U(2) matrix by the iterative method of Sections 1 and 2. Once close to the point BB, how fast do we converge to this global minimum of Ψ\Psi? Close to BB, we have AkA_{k} of the form

As soon as k>0k>0, both column sums of AkA_{k} are real (or zero), such that x=0x=0 and z=(c2/s2)yz=(c^{2}/s^{2})y. Thus we have a matrix

and, because of (15), a potential Ψ(Ak)=(4c2/s2)y2\Psi(A_{k})=(4c^{2}/s^{2})\,y^{2}. Thus all matrices AkA_{k} lay on a line, the 1-dimensional space (17), subspace of the 3-dimensional space (16). If we apply the (k+1)(k+1) th step of the iterative algorithm, we find, after some algebra:

where a=14c2s2=cos2(2ϕ)a=1-4c^{2}s^{2}=\cos^{2}(2\phi). Hence the new potential is Ψ(Ak+1)=(4c2/s2)(ay)2\Psi(A_{k+1})=(4c^{2}/s^{2})\,(ay)^{2} and

This illustrates the fact that the convergence speed of the algorithm is indeed dependent on the given matrix AA, more specifically on its parameter ϕ\phi. If this angle is close to π/4\pi/4, then convergence is fast; if the angle is close to , then convergence is slow.

Finally, we ask ourselves, given the matrix AA, does the algorithm of Sections 1 and 2 lead to the scaled matrix BB or to the scaled matrix BB^{\prime} ? The separatrice consists of the spaces χ=ψ\chi=\psi and χ=ψ+π\chi=\psi+\pi. If 0<χψ<π0<\chi-\psi<\pi, then the trajectory A0A_{0}, A1A_{1}, A2A_{2}, … ends in the attractor BB; if π<χψ<0-\pi<\chi-\psi<0, then the trajectory A0A_{0}, A1A_{1}, A2A_{2}, … ends in the attractor BB^{\prime}; if χψ=0\chi-\psi=0 or χψ=π\chi-\psi=\pi, then A1A_{1} is an orthogonal matrix (either SS or SS^{\prime}) and thus a saddle-point, such that the final destination (either BB or BB^{\prime}) depends on the direction in which one leaves the saddle point.

We close this appendix by comparing the above quantitative U(2) results with the qualitative U(nn) properties. It is well-known that, if a finite group G has a subgroup H, then H divides G into double cosets with sizes ranging from order(H) to order2(H). Similarly, if a Lie group G has a Lie subgroup H, then H divides G into double cosets with dimension ranging from dim(H) to 2 dim(H). The group U(nn) is n2n^{2}-dimensional and its subgroup DU(nn) is nn-dimensional. As a result, DU(nn) divides U(nn) into double cosetsThis set of double cosets, i.e. the double coset space \mboxU(1)n\mboxU(n)/ \mboxU(1)n\mbox{U}(1)^{n}\setminus\mbox{U}(n)\,/\ \mbox{U}(1)^{n} can be mapped to the set (not group!) of so-called unistochastic n×nn\times n matrices , a subset of the well-known semigroup of n×nn\times n bistochastic matrices (a.k.a. doubly stochastic matrices)., each with dimension between nn and 2n2n. In fact, in this particular case, the dimensions of the double cosets range from nn to 2n12n-1. Most of the double cosets are (2n1)(2n-1)-dimensional; only some are lower-dimensional, e.g. the n!n! double cosets of permutation matrices being only nn-dimensionalTogether these n!n! double cosets form the group of complex permutation matrices, a group isomorphic to the semidirect product DU(nn) : Sn, where Sn is the symmetric group of degree nn.. Thus within a double coset, we have at most 2n12n-1 degrees of freedom. If we want a matrix with all line sums real, then this imposes 2n12n-1 conditions, usually lowering the number of freedoms to 0. In other words: in each double coset there usually are a finite number of real line-sum matrices. We conjecture that at least one of these matrices is a unit line-sum matrix.