An iterative construction of solutions of the TAP equations for the Sherrington-Kirkpatrick model

Erwin Bolthausen

Introduction

The TAP equations for the Sherrington-Kirkpatrick model describe the quenched expectations of the spin variables in a large system.

We write N,β,h,ω\left\langle\cdot\right\rangle_{N,\beta,h,\omega} for the expectation under this measure. We will often drop the indices N,β,h,ωN,\beta,h,\omega if there is no danger of confusion. We set

which have to be understood in a limiting sense, as N.N\rightarrow\infty. q=q(β,h)q=q\left(\beta,h\right) is the solution of the equation

where ϕ(dz)\phi\left(dz\right) is the standard normal distribution. It is known that this equation has a unique solution q>0q>0 for h>0h>0 (see Proposition 1.3.8). If h=0,h=0, then q=0q=0 is the unique solution if β1,\beta\leq 1, and there are two other (symmetric) solutions when β>1,\beta>1, which are supposed to be the relevant ones. Mathematically, the validity of the TAP equations has only been proved in the high temperature case, i.e. when β\beta is small, although in the physics literature, it is claimed that they are valid also at low temperature, but there they have many solutions, and the Gibbs expectation has to be taken inside “pure states”. For the best mathematical results, see Chap. 1.7.

The appearance of the so-called Onsager term β2(1q)mi\beta^{2}\left(1-q\right)m_{i} is easy to understand. From standard mean-field theory, one would expect an equation

but one has to take into account the stochastic dependence between the random variables mjm_{j} and gij.g_{ij}. In fact, it turns out that the above equation should be correct when one replaces mjm_{j} by mj(i)m_{j}^{\left(i\right)} where the latter is computed under a Gibbs average dropping the interactions with the spin i.i. Therefore mj(i)m_{j}^{\left(i\right)} is independent of the gik, 1kN,g_{ik},\ 1\leq k\leq N, and one would get

The Onsager term is an Itô-type correction expanding the dependency of mjm_{j} on gji=gij,g_{ji}=g_{ij}, and replacing mj(i)m_{j}^{\left(i\right)} on the right hand side by mj.m_{j}. The correction term is non-vanishing because of

i.e. exactly for the same reason as in the Itô-correction in stochastic calculus. We omit the details which are explained in .

In the present paper, there are no results about SK itself. We introduce an iterative approximation scheme for solutions of the TAP equations which is shown to converge below and at the de Almayda-Thouless line, i.e. under condition (2.1) below (see ). This line is supposed to separate the high-temperature region from the low-temperature one, but although the full Parisi formula for the free energy of the SK-model has been proved by Talagrand , there is no proof yet that the AT line is the correct phase separation line.

The iterative scheme we propose reveals, we believe, an interesting structure of the dependence of the mim_{i} on the family {gij},\left\{g_{ij}\right\}, even below the AT line. The main technical result, Proposition 2.5 is proved at all temperatures, but beyond the AT-line, it does not give much information.

We finish the section by introducing some notations.

As mentioned above, we suppress NN in notations as far as possible, but this parameter is present everywhere.

g=(gij)\mathbf{g}=\left(g_{ij}\right) is a Gaussian N×NN\times N-matrix where the gijg_{ij} for i<ji<j are independent centered Gaussians with variance 1/N1/N, and where gij=gji, gii=0.g_{ij}=g_{ji},\ g_{ii}=0. We will exclusively reserve the notation g\mathbf{g} for such a Gaussian matrix.

We will use Z,Z,Z1,Z2,Z,Z^{\prime},Z_{1},Z_{2},\ldots as generic standard Gaussians. Whenever several of them appear in the same formula, they are assumed to be independent, without special mentioning. We then write EE when taking expectations with respect to them. (This notation is simply an outflow of the abhorrence probabilists have of using integral signs, as John Westwater once put it).

provided there exists a constant C>0C>0 such that

Clearly, if XNYN,X_{N}\simeq Y_{N}, and XNYNX_{N}^{\prime}\simeq Y_{N}^{\prime}, then XN+XNYN+YN.X_{N}+X_{N}^{\prime}\simeq Y_{N}+Y_{N}^{\prime}.

We will use C>0C>0 as a generic positive constant, not necessarily the same at different occurrences. It may depend on β,h,\beta,h, and on the level kk of the approximation scheme appearing in the next section, but on nothing else, unless stated otherwise.

In order to avoid endless repetitions of the parameters hh and β,\beta, we use the abbreviation

We always assume h0,h\neq 0, and as there is a symmetry between the signs, we assume h>0.h>0. q=q(β,h)q=q\left(\beta,h\right) will exclusively be used for the unique solution of (1.2). In the case h=0, β>1,h=0,\ \beta>1, there is a unique solution of (1.2) which is positive. Proposition 2.5 is valid in this case, too, but this does not lead to a useful result. So, we stick to the h>0h>0 case.

Gaussian random variables are always assumed to be centered.

The recursive scheme for the solutions of the TAP equations

1\mathbf{1} here the vector with coordinates all 1,1, and q=q(β,h)q=q\left(\beta,h\right) is the unique solution of (1.2). We define

kk will exclusively been used to number this level of the iteration. Our main result is

Assume h>0.h>0. If β>0\beta>0 is below the AT-line, i.e. if

If there is strict inequality in (2.1), then there exist 0<λ(β,h)<1,0<\lambda\left(\beta,h\right)<1, and C>0,C>0, such that for all kk

The theorem is a straightforward consequence of a computation of the inner products m(i),m(j).\left\langle\mathbf{m}^{\left(i\right)},\mathbf{m}^{\left(j\right)}\right\rangle. We explain that first. The actual computation of these inner products will be quite involved and will depend on clarifying the structural dependence of m(k)\mathbf{m}^{\left(k\right)} on g.\mathbf{g.}

where Z,Z,Z,Z,Z^{\prime},Z^{\prime\prime}, as usual, are independent standard Gaussians. Remember that Th(x)=tanh(h+βx).\operatorname*{Th}\left(x\right)=\tanh\left(h+\beta x\right).

ψ\psi satisfies 0<ψ(0)=α2<ψ(q)=q,0<\psi\left(0\right)=\alpha^{2}<\psi\left(q\right)=q, and is strictly increasing and convex on [0,q].\left[0,q\right].

ψ(0)=α2,\psi\left(0\right)=\alpha^{2}, and ψ(q)=q\psi\left(q\right)=q are evident by the definition of α,q.\alpha,q. We compute the first two derivatives of ψ:\psi:

the second equality by Gaussian partial integration.

In both expressions, we can first integrate out Z,Z,Z^{\prime},Z^{\prime\prime}, getting

and the similar expression for ψ\psi^{\prime\prime} with Th\operatorname*{Th}\nolimits^{\prime} replaced by Th.\operatorname*{Th}\nolimits^{\prime\prime}. So, we see that ψ\psi is increasing and convex. Furthermore, as

If (2.1) is satisfied, then qq is the only fixed point of ψ\psi in the interval [0,q].\left[0,q\right]. If (2.1) is not satisfied then there is a unique fixed point of ψ(t)=t\psi\left(t\right)=t inside the interval (0,q).\left(0,q\right).

If there is strict inequality in (2.1) , then Γk2\Gamma_{k}^{2} and ρk\rho_{k} converge to qq exponentially fast.

We prove by induction on kk that ρk>Γk12.\rho_{k}>\Gamma_{k-1}^{2}. For k=1,k=1, as ρ1=γ1q,\rho_{1}=\gamma_{1}\sqrt{q}, the statement follows.

i.e. ρk>Γk2.\rho_{k}>\Gamma_{k}^{2}. As ρk+1>ρk,\rho_{k+1}>\rho_{k}, the statement follows.

c) Linearization of ψ\psi around qq easily shows that the convergence is exponentially fast if ψ(q)<1.\psi^{\prime}\left(q\right)<1.

Remark that by a) of the above lemma, one has γk>0\gamma_{k}>0 for all k.k.

As the variables m(k)\mathbf{m}^{\left(k\right)} are bounded, (2.5) implies

Taking the NN\rightarrow\infty limit, using Proposition 2.5, this converges to 2q2ρk1.2q-2\rho_{k-1}. From Lemma 2.4, the claim follows. ∎

Proposition 2.5 is true for all temperatures. However, beyond the AT-line, it does not give much information on the behavior of the m(k)\mathbf{m}^{\left(k\right)} for large k.k. It would be very interesting to know if these iterates satisfy some structural properties beyond the AT-line.

The main task is to prove the Proposition 2.5. It follows by an involved induction argument. We first remark that (2.7) is a consequence of (2.5) and (2.6).

COND(J)\operatorname{COND}\left(J\right) implies that for all kJ,k\leq J, we have with

Evidently, all variables ϕ(k)\mathbf{\phi}^{\left(k\right)} are bounded by a constant on AJ,A_{J}, if kJ.k\leq J. The constant may depend on J,J, of course. The m(k)\mathbf{m}^{\left(k\right)} are bounded by 11 everywhere.

Iterative modifications of the interaction variables

Let G\mathcal{G} be a sub-σ\sigma-field of F,\mathcal{F}, and y=(yij)1i,jN\mathbf{y}=\left(y_{ij}\right)_{1\leq i,j\leq N} be a random matrix. We are only interested in the case where y\mathbf{y} is symmetric and on the diagonal, but this is not important for the moment. We assume that y\mathbf{y} is jointly Gaussian, conditioned on G,\mathcal{G}, i.e. there is a positive semidefinite N2×N2N^{2}\times N^{2}- G\mathcal{G}-m.b. matrix Γ\Gamma such that

(We do not assume that y\mathbf{y} is Gaussian, unconditionally). Consider a G\mathcal{G}-measurable random vector x\mathbf{x}, and the linear space of random variables

We consider the linear projection πL(y)\pi_{\mathcal{L}}\left(\mathbf{y}\right) of y\mathbf{y} onto L,\mathcal{L}, which is defined to be the unique matrix with components πL(yij)\pi_{\mathcal{L}}\left(y_{ij}\right) in L\mathcal{L} which satisfy

As y\mathbf{y} is assumed to be conditionally Gaussian, given G,\mathcal{G}, it follows that yπL(y)\mathbf{y}-\pi_{\mathcal{L}}\left(\mathbf{y}\right) is conditionally independent of the variables in L,\mathcal{L}, given G.\mathcal{G}.

If y\mathbf{y} is symmetric, then clearly πL(y)\pi_{\mathcal{L}}\left(\mathbf{y}\right) is symmetric, too.

If XX is a G\mathcal{G}-measurable random variable then yX\mathbf{y}X is conditionally Gaussian as well and

g(k)\mathbf{g}^{\left(k\right)} is conditionally Gaussian, given Fk1.\mathcal{F}_{k-1}.

m(k), M(k),\mathbf{m}^{\left(k\right)},\ \mathbf{M}^{\left(k\right)}, and ϕ(k)\mathbf{\phi}^{\left(k\right)} are Fk1\mathcal{F}_{k-1}-measurable

i.e. we perform the above construction with G=Fk1\mathcal{G}=\mathcal{F}_{k-1} and x=M(k)\mathbf{x}=\mathbf{M}^{\left(k\right)}.

In order that the construction is well defined, we have to inductively prove the properties (C1) and (C2). We actually prove a condition which is stronger than (C1):

Conditionally on Fk2,\mathcal{F}_{k-2}, g(k)\mathbf{g}^{\left(k\right)} is Gaussian, and conditionally independent of Fk1.\mathcal{F}_{k-1}.

(C1’) implies that g(k)\mathbf{g}^{\left(k\right)} is conditionally Gaussian, given Fk1,\mathcal{F}_{k-1}, and the conditional law, given Fk1,\mathcal{F}_{k-1}, is the same as given Fk2.\mathcal{F}_{k-2}.

The case k=1k=1 is trivial. We first prove (C2) for k2,k\geq 2, using (C1’), (C2) up to k1.k-1. We claim that

where R(k2)\mathbf{R}^{\left(k-2\right)} stands for a generic Fk2\mathcal{F}_{k-2}-measurable random variable, not necessarily the same at different occurrences.

As g(k1)M(k1)=M(k1)ξ(k1),\mathbf{g}^{\left(k-1\right)}\mathbf{M}^{\left(k-1\right)}=\left\|\mathbf{M}^{\left(k-1\right)}\right\|\mathbf{\xi}^{\left(k-1\right)}, and M(k1)\mathbf{M}^{\left(k-1\right)} is Fk2\mathcal{F}_{k-2}-measurable, by the induction hypothesis, it follows from (3.2) that m(k)\mathbf{m}^{\left(k\right)} is Fk1\mathcal{F}_{k-1}-measurable The statements for M(k),ϕ(k)\mathbf{M}^{\left(k\right)},\mathbf{\phi}^{\left(k\right)} are then trivial consequences.

We therefore have to prove (3.2). We prove by induction on jj that

The case j=1j=1 follows from the definition of m(k),\mathbf{m}^{\left(k\right)}, and the case j=k1j=k-1 is (3.2).

Assume that (3.3) is true for j<k1.j<k-1. We replace g(j)\mathbf{g}^{\left(j\right)} by g(j+1)\mathbf{g}^{\left(j+1\right)} through the recursive definition

as πLj(g(j))\pi_{\mathcal{L}_{j}}\left(\mathbf{g}^{\left(j\right)}\right) is Fj\mathcal{F}_{j}-measurable and therefore πLj(g(j))M(k1,j1)\pi_{\mathcal{L}_{j}}\left(\mathbf{g}^{\left(j\right)}\right)\mathbf{M}^{\left(k-1,j-1\right)} is Fk2\mathcal{F}_{k-2}-measurable

Using (3.1), one gets g(j+1)M(j)=0,\mathbf{g}^{\left(j+1\right)}\mathbf{M}^{\left(j\right)}=0, and therefore

This proves (3.2), and therefore (C2) for k.k.

We condition on Fk2.\mathcal{F}_{k-2}. By (C2), M(k1)\mathbf{M}^{\left(k-1\right)} is Fk2\mathcal{F}_{k-2}-measurable As g(k1),\mathbf{g}^{\left(k-1\right)}, conditioned on Fk3\mathcal{F}_{k-3}, is Gaussian, and independent of Fk2,\mathcal{F}_{k-2}, it has the same distribution also conditioned on Fk2.\mathcal{F}_{k-2}. By the construction of g(k),\mathbf{g}^{\left(k\right)}, this variable is, conditioned on Fk2,\mathcal{F}_{k-2}, independent of Fk1,\mathcal{F}_{k-1}, and conditionally Gaussian. ∎

The proof is by induction on k.k. For k=1,k=1, there is nothing to prove.

Assume that the statement is proved up to k.k. We want to prove g(k+1)ϕ(m)=0\mathbf{g}^{\left(k+1\right)}\mathbf{\phi}^{\left(m\right)}=0 for mk.m\leq k. The case m=km=k is covered by (3.1). For m<k,m<k, it follows by Remark 3.1, as ϕ(m)\mathbf{\phi}^{\left(m\right)} is Fk1\mathcal{F}_{k-1}-measurable, that

as g(k)ϕ(m)=0\mathbf{g}^{\left(k\right)}\mathbf{\phi}^{\left(m\right)}=0 by the symmetry of g(k)\mathbf{g}^{\left(k\right)} and the induction hypothesis. ∎

We write Ok(Nr)O_{k}\left(N^{-r}\right) for a generic Fk\mathcal{F}_{k}-measurable random variable XX which satisfies

for some K>0.K>0. The constants C,K>0C,K>0 here may depend on h,β,h,\beta, and the level k,k, and on the formula where they appear, but on nothing else, in particular not on N,N, and any further indices. For instance, if we write

we mean that there exists C(β,h,k), K(β,h,k)>0C\left(\beta,h,k\right),\ K\left(\beta,h,k\right)>0 with

Furthermore, in such a case, it is tacitly assumed that XijYijX_{ij}-Y_{ij} are Fk\mathcal{F}_{k}-measurable

Evidently, if X,YX,Y are Ok(Nr),O_{k}\left(N^{-r}\right), then X+YX+Y is Ok(Nr),O_{k}\left(N^{-r}\right), and if XX is Ok(Nr),O_{k}\left(N^{-r}\right), and YY is Ok(Ns),O_{k}\left(N^{-s}\right), then XYXY is Ok(Nrs).O_{k}\left(N^{-r-s}\right).

We will finally prove the validity of the following relations:

The λm,A(k)\lambda_{m,A}^{\left(k\right)} are real numbers, not random variables, which depend on AA only through the type of subset which is taken. For instance, there is only one number (for every m,km,k) if all four indices are taken.

The main point with assuming COND(J)\operatorname*{COND}\left(J\right) is (2.9). On AJ,A_{J}, the variables ϕ(k)\mathbf{\phi}^{\left(k\right)} are bounded for kJ.k\leq J.

Assume (4.1) - (4.3) for k=J,k=J, and (2.9). Then

a) As ϕ(J)\mathbf{\phi}^{\left(J\right)} is FJ1\mathcal{F}_{J-1}-measurable, and g(J)\mathbf{g}^{\left(J\right)} is independent of FJ1,\mathcal{F}_{J-1}, conditionally on FJ2,\mathcal{F}_{J-2}, we get

Using (4.1), (4.2), and the boundedness of the ϕ\phi’s on AJA_{J}, and N1iϕi(J)2=1, iϕi(J)ϕi(m)=0N^{-1}\sum_{i}\phi_{i}^{\left(J\right)2}=1,\ \sum_{i}\phi_{i}^{\left(J\right)}\phi_{i}^{\left(m\right)}=0 for m<Jm<J, we get

We split the sum over (s,t)\left(s,t\right) into the one summand s=j,t=is=j,t=i, in A={(s,s):si,j}, B={(j,t):ti,j}, C={(s,i):si,j},A=\left\{\left(s,s\right):s\neq i,j\right\},\ B=\left\{\left(j,t\right):t\neq i,j\right\},\ C=\left\{\left(s,i\right):s\neq i,j\right\}, and D={(s,t):{s,t}{i,j}=}.D=\left\{\left(s,t\right):\left\{s,t\right\}\cap\left\{i,j\right\}=\emptyset\right\}. The one summand s=j,t=is=j,t=i gives ϕi(J)ϕj(J)/N+OJ1(N2).\phi_{i}^{\left(J\right)}\phi_{j}^{\left(J\right)}/N+O_{J-1}\left(N^{-2}\right).

Because ϕ(J),ϕ(m)=0\left\langle\mathbf{\phi}^{\left(J\right)},\mathbf{\phi}^{\left(m\right)}\right\rangle=0 for m<J,m<J, this is seen to be OJ1(N2).O_{J-1}\left(N^{-2}\right). The same applies to C.\sum_{C}.

Take e.g. A={i,j,s}.A=\left\{i,j,s\right\}. Then λm,A(J)=λm,3(J)\lambda_{m,A}^{\left(J\right)}=\lambda_{m,3}^{\left(J\right)} with no further dependence of this number on i,j,s.i,j,s. So we get for this part for any summand on mm with m<Jm<J

Using again ϕ(J),ϕ(m)=0,\left\langle\mathbf{\phi}^{\left(J\right)},\mathbf{\phi}^{\left(m\right)}\right\rangle=0, we get that this is OJ1(N2).O_{J-1}\left(N^{-2}\right). This applies in the same way to all the parts. Therefore b) follows.

due to the orthogonality of the ϕ(m).\mathbf{\phi}^{\left(m\right)}.

where the FJ1\mathcal{F}_{J-1}-measurable coefficients xij,s(J)x_{ij,s}^{\left(J\right)} satisfy

The existence of FJ1\mathcal{F}_{J-1}-measurable coefficients xij,s(J)x_{ij,s}^{\left(J\right)} comes from linear algebra.

Therefore, we can replace the xij,(J)x_{ij,\cdot}^{\left(J\right)} by

which satisfy the desired property (4.9).

We keep i,ji,j fixed for the moment and write xsx_{s} for xij,s(J).x_{ij,s}^{\left(J\right)}. The requirement for them is that for all tt

Due to the orthonormality of the ϕ,\mathbf{\phi}, one gets

Writing rijr_{ij} for the OJ1(N2)O_{J-1}\left(N^{-2}\right) error term in (4.5), and for j=i,j=i, the OJ1(N1)O_{J-1}\left(N^{-1}\right) error term in (4.4), we arrive at

In the first summand, we sum now over all s,s, remarking that we have assumed that sxsϕs(m)=0\sum_{s}x_{s}\phi_{s}^{\left(m\right)}=0 for m<J.m<J. The error for not summing over the single tt can be incorporated into rtt.r_{tt}. We therefore arrive at

Write Φ\Phi for the matrix (N1ϕi(J)ϕj(J))\left(N^{-1}\phi_{i}^{\left(J\right)}\phi_{j}^{\left(J\right)}\right) and RR for (rij).\left(r_{ij}\right). Then we have to invert the matrix (I+Φ+R).\left(I+\Phi+R\right). Remark that (I+Φ)1=IΦ/2.\left(I+\Phi\right)^{-1}=I-\Phi/2. Therefore

The right hand side, we can develop as a Neumann series:

As (Φy)i=OJ1(N3),\left(\Phi\mathbf{y}\right)_{i}=O_{J-1}\left(N^{-3}\right), we get the desired conclusion. ∎

The summands involving the x(J)x^{\left(J\right)} all only give contributions which enter the OJ1O_{J-1}-terms. Take for instance s=j, ti,j.s=j,\ t\neq i,j. In that case, the claimed OJ1O_{J-1}-term is OJ1(N3).O_{J-1}\left(N^{-3}\right). In the last summand of (4.10), there is one summand, namely u=v=j,u=v=j, where the x(J)x^{\left(J\right)} are OJ1(N2),O_{J-1}\left(N^{-2}\right), so this summand is only OJ1(N4).O_{J-1}\left(N^{-4}\right).

The other summands behave similarly. The third and fourth summand in (4.10) behave similarly.

As another case, take {i,j}{s,t}=,\left\{i,j\right\}\cap\left\{s,t\right\}=\emptyset, where we have to get OJ1(N4)O_{J-1}\left(N^{-4}\right) for the second to fourth summand in (4.10).

We write m×nm\times n for the summand, we get by multiplying the mm-th summand in the first bracket with the nn-th in the second. By induction hypothesis, we get

In the 1×21\times 2-term, only the multiplication of gij(J)g_{ij}^{\left(J\right)} with ξj(J)\xi_{j}^{\left(J\right)} counts, the other part giving OJ2(N3).O_{J-2}\left(N^{-3}\right). Therefore

2×12\times 1 gives the same. In 2×2,2\times 2, again only the matching of ξj(J)\xi_{j}^{\left(J\right)} with ξj(J)\xi_{j}^{\left(J\right)} counts, so we get

The other parts are easily seen to give OJ1(N3).O_{J-1}\left(N^{-3}\right). We we have proved that

c) We have here {i,j}{s,t}=.\left\{i,j\right\}\cap\left\{s,t\right\}=\emptyset.

The 1×1,1×2,2×1,1\times 1,1\times 2,2\times 1, and 2×22\times 2-terms are clearly of the desired form, either from induction hypothesis or Lemma 4.2.

For u=iu=i we get for the expectation ϕi(J)ϕj(J)/N+OJ1(N2),\phi_{i}^{\left(J\right)}\phi_{j}^{\left(J\right)}/N+O_{J-1}\left(N^{-2}\right), so this is of the desired form. The same applies to u=j.u=j. It therefore remains

As uϕu(J)ϕu(m)=0,\sum_{u}\phi_{u}^{\left(J\right)}\phi_{u}^{\left(m\right)}=0, the whole expression is OJ1(N4).O_{J-1}\left(N^{-4}\right). The other cases are handled similarly. ∎

Proof of Proposition 2.5

We assume COND(J),\operatorname{COND}\left(J\right), and (4.1) - (4.3) for kJ.k\leq J. By Proposition 4.1 of the last section, this implies (4.1) - (4.3) for kJ+1.k\leq J+1. Using this, we prove now (2.5) and (2.6) for k=J+1,k=J+1, so that we have proved COND(J+1).\operatorname{COND}\left(J+1\right). Having achieved this, the proof of Proposition 2.5 is complete.

Remark that under COND(J)\operatorname{COND}\left(J\right)

From q>Γk12,q>\Gamma_{k-1}^{2}, by (2.5) and (2.6) for kJ,k\leq J, and the fact that the ϕj(k)\phi_{j}^{\left(k\right)} are uniformly bounded on AJ,A_{J}, we have

Remark that by Lemma 3.2, we have g(s)M(k1,s1)=g(s)m(k).\mathbf{g}^{\left(s\right)}\mathbf{M}^{\left(k-1,s-1\right)}=\mathbf{g}^{\left(s\right)}\mathbf{m}^{\left(k\right)}. Evidently

This proposition is correct for all β.\beta. The key point with (2.1) is that the first summand M(k1)ξ(k1)\left\|\mathbf{M}^{\left(k-1\right)}\right\|\mathbf{\xi}^{\left(k-1\right)} disappears for kk\rightarrow\infty as M(k1)qΓk22,\left\|\mathbf{M}^{\left(k-1\right)}\right\|\simeq\sqrt{q-\Gamma_{k-2}^{2}}, so that for large k,k, m^(k)\mathbf{\hat{m}}^{\left(k\right)} stabilizes to Th(t=1k2γtξ(t)),\operatorname*{Th}\left(\sum\nolimits_{t=1}^{k-2}\gamma_{t}\mathbf{\xi}^{\left(t\right)}\right), but above the AT-line qΓk22q-\Gamma_{k-2}^{2} does not converge to 0.0. Therefore, above the AT-line, in every iteration, new conditionally independent contributions appear.

The above proposition is proved by showing that COND(J)\operatorname{COND}\left(J\right) implies

As COND(J)\operatorname{COND}\left(J\right) implies trivially COND(J)\operatorname{COND}\left(J^{\prime}\right) for J<J,J^{\prime}<J, it is then clear that COND(J)\operatorname{COND}\left(J\right) implies m(k)m^(k)\mathbf{m}^{\left(k\right)}\approx\mathbf{\hat{m}}^{\left(k\right)} for all kJ+1.k\leq J+1. As the mj(k)m_{j}^{\left(k\right)} are uniformly bounded by 1,1, we get from that

for all jJ+1.j\leq J+1. We will then prove (Lemma 5.3) that

This will prove COND(J+1),\operatorname{COND}\left(J+1\right), and therefore, this will have finished the whole induction procedure.

Together with proving (5.3), we also show

for k=J+1k=J+1 which is not evident from (5.3) as the ξi(m)\xi_{i}^{\left(m\right)} are not bounded.

Assume the validity of (2.5)-(2.7) and (5.4) for kJ.k\leq J. Then for s=1,,J1s=1,\ldots,J-1

We prove by induction on s, 1sJ1,s,\ 1\leq s\leq J-1, that

and define μ(n)\mathbf{\mu}^{\left(n\right)} where y\mathbf{y} is replaced by y(n),\mathbf{y}^{\left(n\right)}, n=1,,5.n=1,\ldots,5. Remark that

which prove the desired induction in s.s.

To switch from μ(0)\mathbf{\mu}^{\left(0\right)} to μ(1),\mathbf{\mu}^{\left(1\right)}, we observe that by the estimates of Lemma 4.3, one has

By choosing KK large enough, we get for 1/Nt11/\sqrt{N}\leq t\leq 1 by Corollary A.2 a)

For t1/N,t\leq 1/\sqrt{N}, the bound is trivial anyway. This proves (5.7) for n=1.n=1. (5.8) follows in the same way using Corollary A.2 b).

on AJ.A_{J}. (5.7) for n=2n=2 then follows from Corollary A.2 c). As for (5.8), we remark that

We can then again use Corollary A.2 c) remarking that exp[Nt/C]exp[Nt2/C]\exp\left[-Nt/C\right]\leq\exp\left[-Nt^{2}/C\right] for t1.t\leq 1.

(5.7) for n=3n=3 follows from the induction hypothesis (2.7), and Corollary A.2 a). Similarly with (5.8) but here, one has to use part b) of Corollary A.2.

on Ak,A_{k}, and one uses the induction hypothesis (5.4) for JJ to get (5.7) for n=4.n=4. Remark that actually, one has a bound uniform in i:i:

Therefore, one also gets (5.8) using Corollary A.2. Up to now, we have obtained

and we can therefore replace ξ(s),m^(J)ϕ(s)\left\langle\mathbf{\xi}^{\left(s\right)},\mathbf{\hat{m}}^{\left(J\right)}\right\rangle\mathbf{\phi}^{\left(s\right)} on the right hand side, by β(1q)γsϕ(s)\beta\left(1-q\right)\gamma_{s}\mathbf{\phi}^{\left(s\right)} for s<J1,s<J-1, or β(1q)qΓJ22ϕ(J1)\beta\left(1-q\right)\sqrt{q-\Gamma_{J-2}^{2}}\mathbf{\phi}^{\left(J-1\right)} for s=J1,s=J-1, which is the same as replacing X(J1,s1)\mathbf{X}^{\left(J-1,s-1\right)} by X(J1,s).\mathbf{X}^{\left(J-1,s\right)}. Therefore, the lemma is proved. ∎

We assume COND(J)\operatorname{COND}\left(J\right).

We condition on FJ2.\mathcal{F}_{J-2}. Then ξ(J1)\mathbf{\xi}^{\left(J-1\right)} is conditionally Gaussian with covariances given in Lemma 4.2 a), b). We can therefore apply Lemma A.3 which gives, conditionally on FJ2,\mathcal{F}_{J-2}, on an event BJ2FJ2B_{J-2}\in\mathcal{F}_{J-2} which has probability 1Cexp[N/C],\geq 1-C\exp\left[-N/C\right],

Applying now Lemma A.3 successively to ξ(J2),ξ(J2),\mathbf{\xi}^{\left(J-2\right)},\mathbf{\xi}^{\left(J-2\right)},\ldots , we get

The case s<J1s<J-1 uses a minor modification of the argument. One first uses Lemma A.3 successively to get

b) This also comes with a modification of the reasoning in a).

In the case j=J+1,j=J+1, the outcome is similar, one only has to replace the second factor by Th(M(J)ZJ+t=1J1γtξi(t)).\operatorname*{Th}\left(\left\|\mathbf{M}^{\left(J\right)}\right\|Z_{J}+\sum\nolimits_{t=1}^{J-1}\gamma_{t}\mathbf{\xi}_{i}^{\left(t\right)}\right).

The next observation is that by the induction hypothesis, one can replace M(J)\left\|\mathbf{M}^{\left(J\right)}\right\| by qΓJ12\sqrt{q-\Gamma_{J-1}^{2}} and we get

The important point is that the factor before ZJZ_{J} is replaced by a constant, which is due to the induction hypothesis. We can now proceed in the same way with ξ(J1),\mathbf{\xi}^{\left(J-1\right)}, applying again Lemma A.3, conditioned on FJ2,\mathcal{F}_{J-2}, and the induction hypothesis. The final outcome is

For the latter case, the right hand side is simply q.q. For the case jJ,j\leq J, we can rewrite the expression on the right hand side as

Solving, we get a2+b2=qΓj22,a^{2}+b^{2}=q-\Gamma_{j-2}^{2}, and

Appendix A Appendix

then supi,jNαijC/N.\sup_{i,j\leq N}\left|\alpha_{ij}\right|\leq C/N. Therefore, we can represent the ζi\zeta_{i} as

where the ZiZ_{i} are i.i.d. standard Gaussians. Then

By choosing KK appropriate, we get the desired estimate.

To prove (A.2), we use the same representation. As

for large enough K,K, we get the desired conclusion. ∎

Assume COND(J)\operatorname{COND}\left(J\right) and kJ.k\leq J.

For any mkm\leq k there exist C,K>0C,K>0 such that

For any m,l,m,l, there exist C,K>0C,K>0 such that

If YiY_{i} are Fm1\mathcal{F}_{m-1}-measurable with

so that we see that it suffices to consider l=m.l=m. Then we apply the lemma, part b).

As for c), we have that the conditional distribution of Nξ(m),Y\sqrt{N}\left\langle\mathbf{\xi}^{\left(m\right)},\mathbf{Y}\right\rangle, given Fm1,\mathcal{F}_{m-1}, is Gaussian, with bounded variance. So the statement follows. ∎

and there are vectors {xi(N)}iN, {yi(r,N)}iN, rm, m\left\{x_{i}^{\left(N\right)}\right\}_{i\leq N},~{}\left\{y_{i}^{\left(r,N\right)}\right\}_{i\leq N,\ r\leq m},\ m fixed, which are bounded in all indices, such that

We leave out NN in notations, as often as possible. Consider

The constant K>0K>0 will be specified below. Then

where LL is a bound on the Lipshitz constants for the FN,i,F_{N,i}, and cc is a bound of the yi(r).\left|y_{i}^{\left(r\right)}\right|.

We choose KK large enough such that the N×NN\times N-matrix Γ\Gamma which is (rij)\left(r_{ij}\right) off diagonal, and

on the diagonal is positive definite. This is possible as rijCN2.\left|r_{ij}\right|\leq CN^{-2}.

Let {Ui}\left\{U_{i}\right\} be a Gaussian matrix with covariance matrix Γ.\Gamma. Then

has the same distribution as {ηi}.\left\{\eta_{i}^{\prime}\right\}. Here we assume that {Ui}\left\{U_{i}\right\} is independent of the ZZ’s. So, we assume that the ηi\eta_{i}^{\prime} are presented in this way.

We can apply Lemma A.1 to the vector (NUi)1iN\left(\sqrt{N}U_{i}\right)_{1\leq i\leq N}, and (A.3) to the first summand on the right-hand side, obtaining

follows by standard Gaussian isoperimetry (see e.g. ). ∎

References