A Converse to Banach's Fixed Point Theorem and its CLS Completeness

Constantinos Daskalakis, Christos Tzamos, Manolis Zampetakis

Introduction

Several widely used computational methods are fixed point iteration methods. These include gradient descent, the power iteration method, alternating optimization, the expectation-maximization algorithm, kk-means clustering, and others. In several important applications, we have theoretical guarantees for the convergence of these methods. For example, convergence to a unique solution can be guaranteed when the method is explicitly, or can be related to, gradient descent on a convex function [BTN01, Nes13, BV04]. More broadly, convergence to a stationary point can be guaranteed when the method is, or can be related to, gradient descent; for some interesting recent work on the limit points of gradient descent, see [PP16, LSJR16] and their references.

Ultimately, all fixed point iteration methods aim at converging to a fixed point of their iteration map. For global convergence to a unique solution, it should also be the case that the fixed point of the iteration map is unique. It is, thus, unsurprising that another widely used approach for establishing convergence of these methods is by appealing to Banach’s fixed point theorem. To recall, consider an iteration map xt+1f(xt)x_{t+1}\leftarrow f(x_{t}), where f:DDf:{\cal D}\rightarrow{\cal D}, and suppose that there is a distance metric dd such that (D,d)({\cal D},d) is a complete metric space and ff is contracting with respect to dd, i.e. for some constant c<1c<1, d(f(x),f(y))cd(x,y)d(f(x),f(y))\leq c\cdot d(x,y), for all x,yDx,y\in{\cal D}. Under this condition, Banach’s fixed point theorem guarantees that there is a unique fixed point x=f(x)x^{*}=f(x^{*}). Moreover, iterating ff is bound to converge to xx^{*}. Specifically, the tt-fold composition, f[t]f^{[t]}, of ff with itself satisfies: d(f[t](x0),x)ctd(x0,x)d(f^{[t]}(x_{0}),x^{*})\leq c^{t}d(x_{0},x^{*}), for any starting point x0x_{0}.

Given Banach’s theorem, if you established that your iteration method is contracting under some distance metric dd, you would also have immediately proven that your method converges and that it may only converge to a unique point. Moreover, you can predict how many steps you need from any starting point x0x_{0} to reach an approximate fixed point xx satisfying d(f(x),x)<ϵd(f(x),x)<\epsilon for some accuracy ϵ\epsilon.Indeed, it can be easily shown that d(f[t+1](x0),f[t](x0))ctd(x1,x0)d(f^{[t+1]}(x_{0}),f^{[t]}(x_{0}))\leq c^{t}d(x_{1},x_{0}). So t=log1/cd(x1,x0)ϵt=\log_{1/c}{d(x_{1},x_{0})\over\epsilon} steps suffice. Alas, several widely used fixed point iteration methods are not generally contracting, or only contracting in a small neighborhood around their fixed points and not the entire domain where they are defined. At least, this is typically the case for the metric dd under which approximate fixed points, d(f(x),x)<ϵd(f(x),x)<\epsilon, are sought. There is also quite an important reason why they may not be contracting: several of these methods may in fact have multiple fixed points.

Given the above motivation, our goal in this paper is to investigate the extent to which Banach’s fixed point theorem is a universal analysis tool for establishing that a fixed point iteration method both converges and globally converges to a unique fixed point. More precisely, our question is the following: if an iterative map xt+1f(xt)x_{t+1}\leftarrow f(x_{t}) for some f:DDf:{\cal D}\rightarrow{\cal D} converges to a unique fixed point xx^{*} from any starting point, is there always a way to prove this using Banach’s fixed point theorem? Additionally, can we always use Banach’s fixed point theorem to compute how many iterations we would need to find an approximate fixed point xx of ff satisfying d(x,f(x))<ϵd(x,f(x))<\epsilon, for some distance metric dd and accuracy ϵ>0\epsilon>0?

We study this question from both a mathematical and a computational perspective. On the mathematical side, we show a strong converse of Banach’s fixed point theorem, saying the following: given an iterative map xt+1f(xt)x_{t+1}\leftarrow f(x_{t}) for some f:DDf:{\cal D}\rightarrow{\cal D}, some distance metric dd on D{\cal D} such that (D,d)({\cal D},d) is a complete and proper metric space, and some accuracy ϵ>0\epsilon>0, if ff has a unique fixed point that the ff-iteration converges to from any starting point, then for any constant c(0,1)c\in(0,1), there exists a distance metric dcd_{c} on D\cal D such that:

dcd_{c} certifies uniqueness and convergence to the fixed point, by satisfying dc(f(x),f(y))cdc(x,y)d_{c}(f(x),f(y))\leq c\cdot d_{c}(x,y), for all x,yDx,y\in{\cal D};

dcd_{c} allows an analyst to predict how many iterations of ff would suffice to arrive at an approximate fixed point xx satisfying d(x,f(x))<ϵd(x,f(x))<\epsilon; notice in particular that we are interested in finding an approximate fixed point with respect to the original distance metric dd (and not the constructed one dcd_{c}).

Our converse theorem is formally stated as Theorem 1 in Section 3. In the same section we discuss its relationship to other known converses of Banach’s theorem known in the literature, in particular Bessaga’s and Meyers’s converse theorems. The improvement over these converses is that our constructed metric dcd_{c} is such that it allows us to bound the number of steps requied to reach an approximate fixed point according to the metric of interest dd and not just dcd_{c}; namely Property 2 above holds. We discuss this further in Section 3.3. Section 3.2 provides a sketch of the proof, and the complete details can be found in Appendix B.

While the proof of Theorem 1 is non-constructive, it does imply that Banach’s fixed point theorem is a universal analysis tool for establishing global convergence of fixed point iteration methods to unique solutions. Namely, it implies that one can always find a witnessing metric. We illustrate this by studying an important such method: power iteration. The power iteration method is a widely-used and well-understood method for computing the eigenvalues and eigenvectors of a matrix. It is well known that if a matrix AA has a unique principal eigenvector, then the power method starting from a vector non-perpendicular to the principal eigenvector will converge to it. This is shown using a potential function argument, outlined above and in Appendix A.2, which also pins down the rate of convergence.

We close the circle by studying Banach’s fixed point theorem from a computational standpoint. Recent work of Daskalakis and Papadimitriou [DP11] has identified a complexity class, CLS, where finding a Banach fixed point lies. CLS, defined formally in Section 5, is a complexity class at the intersection of PLS [JPY88] and PPAD [Pap94]. Roughly speaking, PLS contains total problems whose existence of solutions is guaranteed by a potential function argument, while PPAD contains total problems whose existence of solutions is guaranteed by Brouwer’s fixed point theorem. Lots of interesting work has been done on both classes in the past two decades; for a small sample see e.g. [DGP09, CDT09, Rub16, FPT04, SV08, ER17, ABPW17] and their references. CLS, lying in the intersection of PLS and PPAD, contains comptutational problems whose existence of solutions is guaranteed by both a potential function and a fixed point argument.More precisely, it contains all problems reducible to Continuous LocalOpt, defined in Section 5, and which doesn’t necessarily capture the whole intersection of PPAD and PLS.

Unsurprisingly CLS contains several interesting problems, whose complexity is not known to lie in P, but which also are unlikely to be complete for PPAD or PLS. One of these problems is finding a Banach fixed point. Others include the P-matrix Linear Complementarity Problem, finding mixed Nash equilibria of network coordination and congestion games, computational problems related to finding KKT points, and solving Simple Stochastic Games; see [DP11] for precise definitions of these problems and for references. Moreover, recent work has provided cryptographic hardness results for CLS [HY17] based on obfuscation, extending work which proved cryptographic hardness results for PPAD [BPR15, RSS16, KNY17].

Ultimately, the definition of CLS was inspired by a vast range of total problems that could not be properly classified as complete in PPAD or PLS due to the nature of their totality arguments. However, no natural complete problem for this class has been identified, besides Continuous LocalOpt, through which the class was defined. By making our converse to Banach’s fixed point theorem constructive, we show that finding a Banach fixed point is CLS-complete. More precisely, in Section 5 we define problem Banach, whose input is a continuous function ff and a continuous metric dd, and whose goal is to either output an approximate fixed point of ff or a violation of the contraction of ff with respect to dd. In Theorem 2 we show that Banach is CLS-complete.It is worth pointing out that, while some problems in CLS (e.g. Banach fixed points, simple stochastic games) have unique solutions, most do not. Given that contraction maps have unique fixed points, the way we bypass the potential oxymoron, is by accepting as solutions violations of contraction.

We note that contemporaneously and independently from our work, Fearnley et al. [FGMS17] have also identified a CLS-complete problem related to Banach’s fixed point theorem. Their problem, called MetametricContraction, takes as input a function ff and a metametric dd, and asks to find an approximate fixed point of ff, or a violation of the contraction of ff with respect to dd. In comparison to our CLS-completeness results, the CLS-hardness of Banach in our paper is stronger than that of MetametricContraction as the input to Banach is a metric. On the other hand, the containment of MetametricContraction into CLS is stronger than the containment of Banach, as Banach is polynomial-time reducible to MetametricContraction.

Notation and Preliminaries

In Appendix A.1, the reader can find some well-known definitions that we are going to use in the rest of the paper. More precisely in the field of:

Metric Spaces, we define the notion of: distance metric, metric space, diameter, bounded metric space, continuous function, open and closed sets in a metric space, compact set, locally compact metric space, proper metric space, open and closed balls, Cauchy sequence, complete metric space, equivalent metrics, continuity, Lipschitz continuity, contraction property, fixed point.

Because of its importance for the rest of the paper we also give here the definition of a distance metric and metric space.

d(x,y)0d(x,y)\geq 0 for all x,yDx,y\in\mathcal{D}.

d(x,y)=d(y,x)d(x,y)=d(y,x) for all x,yDx,y\in\mathcal{D}.

d(x,y)d(x,z)+d(z,x)d(x,y)\leq d(x,z)+d(z,x) for all x,y,zDx,y,z\in\mathcal{D}. This is called triangle inequality.

Then we say that dd is a metric on D\mathcal{D}, and (D,d)(\mathcal{D},d) is a metric space.

If a selfmap ff has a fixed point and is continuous, we can define the following sequence of points xn+1=f(xn)x_{n+1}=f(x_{n}) where the starting point x0x_{0} can be picked arbitrarily. If (xn)(x_{n}) converges to a point xˉ\bar{x} then

This observation implies that a candidate procedure for computing a fixed point of a selfmap ff is to iteratively apply the function ff starting from an arbitrary point x0x_{0}. If this procedure converges then the limit is a fixed point xx^{*} of ff. We will refer to this method of computing fixed points as the Basic Iterative Procedure.

Arithmetic Circuits.

In Section 5 we work with functions from continuous domains to continuous domains represented as arithmetic circuits. An arithmetic circuit is defined by a directed acyclic graph (DAG). The inputs to the circuit are in-degree nodes, and the outputs are out-degree nodes. Each non-input node is a gate from the set {+,,,max,min,>}\{+,-,*,\max,\min,>\}, performing an operation on the outputs of its in-neighbors. The meaning of the “>>” gate is >(x,y)=1>(x,y)=1 if x>yx>y and otherwise. We also allow “output a rational constant” gates. These are gates without any inputs, which output a rational constant.

Converse Banach Fixed Point Theorems

We start, in Section 3.1, with an overview of known converses to Banach’s fixed point theorem. We also explain why these converses are not enough to prove that Banach’s fixed point theorem is a universal tool for analyzing the convergence of iterative algorithms. Then, in Section 3.2, we prove a stronger converse theorem that demonstrates the universality of Banach’s fixed point theorem for the analysis of iterative algorithms. Before beginning, we formally state Banach’s fixed Point Theorem. A useful survey of the applications of this theorem can be found in [Con14].

Suppose dd is a distance metric function such that (D,d)(\mathcal{D},d) is a complete metric space, and suppose that f:DDf:\mathcal{D}\rightarrow\mathcal{D} is a contraction map according to dd, i.e.

Then ff has a unique fixed point xx^{*} and the convergence rate of the Basic Iterative Procedure with respect to dd is cc. That is, d(f[n](x0),x)<cnd(x0,x)d(f^{[n]}(x_{0}),x^{*})<c^{n}\cdot d(x_{0},x^{*}), for all x0x_{0}.

The first known converse to Banach’s fixed point theorem is the following [Bes59].

The implication of the above theorem is that, if we want to prove existence and uniqueness of fixed points of f[n]f^{[n]} for all nn, then Banach’s fixed point theorem is a universal way to do it. Moreover, there is a potential function of the form p(x)=dc(x,f(x))p(x)=d_{c}(x,f(x)), where dcd_{c} is a distance metric, that decreases under successive applications of ff, and successive applications of ff starting from any point x0x_{0} are bound to converge to the unique fixed point of ff.

Unfortunately, dcd_{c} cannot provide any information about the number of steps that the Basic Iterative Procedure needs before computing an approximate fixed point under some metric dd of interest. The reason is that, after logcε\log_{c}\varepsilon steps of the Basic Iterative Procedure, we only have dc(xn,f(xn))εd_{c}(x_{n},f(x_{n}))\leq\varepsilon. However, dcd_{c} might not have any relation to dd, hence an approximate fixed point under dcd_{c} may not be one for dd. So Bessaga’s theorem is not useful for bounding the running time of iterative methods for approximate fixed point computation.

Given the above discussion, it is reasonable to expect that a converse to Banach’s theorem that is useful for bounding the running time of approximate fixed point computation methods should take into account, besides the function ff and its domain D\mathcal{D}, the distance metric dd under which we are interested in computing approximate fixed points. One step in this direction has already been made by Meyers [Mey67].

Let (D,d)(\mathcal{D},d) be a complete metric space, where D\mathcal{D} is compact, and suppose that f:DDf:\mathcal{D}\rightarrow\mathcal{D} is continuous with respect to dd. Suppose further that ff has a unique fixed point xx^{*}, that the Basic Iterative Method converges to xx^{*} from any starting point, and that there exists an open neighborhood UU of xx^{*} such that f[n](U){x}f^{[n]}(U)\rightarrow\{x^{*}\}. Then, for any c(0,1)c\in(0,1), there exists a distance metric dcd_{c}, which is topologically equivalent to dd, such that (D,dc)(\mathcal{D},d_{c}) is a complete metric space and ff is a contraction map with respect to dcd_{c} with contraction cc.

Compared to Bessaga’s theorem, the improvement offered by Meyer’s Theorem is that, instead of the existence of an arbitrary metric, it proves the existence of a metric that is topologically equivalent to the metric dd. However, this is still not enough to bound the number of steps needed by the Basic Iterative Procedure in order to arrive at a point xnx_{n} such that d(xn,f(xn))εd(x_{n},f(x_{n}))\leq\varepsilon. Our goal in the next section is to close this gap. We will also replace the compactness assumption with the assumption that (D,d)(\mathcal{D},d) is proper, so that the converse holds for unbounded spaces.

2 A New Converse to Banach’s Fixed Point Theorem

The main technical idea behind our converse to Banach’s fixed point theorem is to adapt the proof of Meyers’s theorem to get a distance metric dcd_{c} with the property dc(x,y)d(x,y)d_{c}(x,y)\geq d(x,y) everywhere, except maybe for the region d(x,x)εd(x,x^{*})\leq\varepsilon. This implies that, if we guarantee that dc(xn,x)εd_{c}(x_{n},x^{*})\leq\varepsilon, then d(xn,x)εd(x_{n},x^{*})\leq\varepsilon.

Suppose (D,d)(\mathcal{D},d) is a complete, proper metric space, f:DDf:\mathcal{D}\to\mathcal{D} is continuous with respect to dd and the following hold:

for every xDx\in\mathcal{D}, the sequence (f[n](x))(f^{[n]}(x)) converges to xx^{*} with respect to dd; moreover there exists an open neighborhood UU of xx^{*} such that f[n](U){x}f^{[n]}(U)\to\{x^{*}\}.

Then, for every c(0,1)c\in(0,1) and ε>0\varepsilon>0, there exists a distance metric function dc,εd_{c,\varepsilon} that is topologically equivalent to dd and is such that (D,dc,ε)(\mathcal{D},d_{c,\varepsilon}) is a complete metric space and

The proof of our Theorem 1 adapts the construction of Meyers’s proof, to ensure that (2b) is satisfied. We give here a proof sketch postponing the complete details to Appendix B, where we repeat also all the technical details proven by Meyers [Mey67].

The construction of the metric dcd_{c} follows is done in three steps:

Starting from the original metric dd, a non-expanding closure of dd is defined as the metric dM(x,y)=supi0d(f(i)(x),f(i)(y))d_{M}(x,y)=\sup_{i\geq 0}d(f^{(i)}(x),f^{(i)}(y)). This is topologically equivalent to dd, but ensures that the images of any two points are at least as close in dMd_{M} as the original two points (non-expanding property).

Notice that as dM(x,y)d(x,y)d_{M}(x,y)\geq d(x,y) for all points x,yDx,y\in\mathcal{D}, if we ensure that Property (2b) holds with respect to dMd_{M} for the final constructed metric dc,εd_{c,\varepsilon}, it will also hold with respect to the original metric dd.

Given dMd_{M}, the construction proceeds by defining a function ρc,ε\rho_{c,\varepsilon} which satisfies (2a). This function achieves contraction by a constant c<1c<1 by counting the number of steps required to reach an ε\varepsilon-ball close to the fixed point.

While for the original proof of Meyer any such ε\varepsilon-ball suffices, in order to guarantee Property (2b), our proof requires a set SS of points with small diameter with respect to dd such that performing an iteration of ff on any one of them results in a point still in the set SS. We show that such a set always exists in Lemma 5 in Appendix B.

This guarantees that ρc,ε(x,y)dM(x,y)\rho_{c,\varepsilon}(x,y)\geq d_{M}(x,y) if max{d(x,x),d(x,y)}ε\max\{d(x^{*},x),d(x^{*},y)\}\geq\varepsilon, and therefore Property (2b) is preserved.

The function ρc,ε\rho_{c,\varepsilon} satisfies all required properties other than triangle inequality and thus is not a metric. However, it can be converted into one.

Given ρc,ε\rho_{c,\varepsilon}, we construct the sought after metric dc,εd_{c,\varepsilon} by taking it equal to the ρc,ε\rho_{c,\varepsilon}-geodesic distance (metric closure of ρc,ε\rho_{c,\varepsilon}). This directly converts ρc,ε\rho_{c,\varepsilon} into a metric. We show that after this operation Properties (2a) and (2b). This is done in Lemma 9 and Lemma 10 in Appendix B.

3 Corollaries of Theorem 1

Property (2b) of the metric output by Theorem 1 has some interesting corollaries that we would not be able to get using the known converses to Banach’s theorem discussed in Section 3.1. The first one is that we can now compute, from dc,εd_{c,\varepsilon}, the number of iterations needed in order to get to within ε\varepsilon of the fixed point xx^{*} of ff from any starting point x0Dx_{0}\in\mathcal{D}.

Under the assumptions of Theorem 1, starting from a point x0Dx_{0}\in\mathcal{D}, and for any constant c(0,1)c\in(0,1), the Basic Iterative Procedure finds a point xx such that d(x,x)εd(x,x^{*})\leq\varepsilon after

iterations, where dc,ε/2d_{c,\varepsilon/2} is the metric guaranteed by Theorem 1.

In Corollary 1, for any given ε\varepsilon of interest, we have to identify a different distance metric dc,ε/2d_{c,\varepsilon/2}, guaranteed by Theorem 1, to bound the number of steps required by the Basic Iterative Procedure to get to within ε\varepsilon from the fixed point. Sometimes we are interested in the explicit tradeoff between the number of steps required to get to the proximity of the fixed point and the amount of proximity ε\varepsilon. To find such a tradeoff we have to make additional assumptions on ff. A mild assumption that is commonly satisfied by iterative procedures for non-convex problems is that the Basic Iterative Procedure locally converges to the fixed point xx^{*}. That is, if x0x_{0} is appropriately close to xx^{*}, then the Basic Iterative Procedure converges. A common way of proving local convergence is to prove that ff is a contraction with respect to dd locally for x,yBˉ(x,ε)x,y\in\bar{B}(x^{*},\varepsilon). Theorem 1 provides a way to extend this local contraction property to the whole domain D\mathcal{D} and get an an explicit closed form of the tradeoff between the number of steps and ε\varepsilon, as implied by the following result.

Under the assumptions of Theorem 1, and the assumption that there exists 0<c<10<c<1, δ>0\delta>0 such that

starting from any point x0Dx_{0}\in\mathcal{D}, the Basic Iterative Procedure finds a point xx such that d(x,x)εd(x,x^{*})\leq\varepsilon after

iterations, where dc,δ/2d_{c,\delta/2} is the metric guaranteed by Theorem 1.

Example: The Power Iteration as a Contraction Map

The results of the previous section imply that Banach’s fixed point theorem is a universal analysis tool for establishing global convergence of fixed point iteration methods to unique solutions. While the proof of Theorem 1 is non-constructive, it does imply that one can always find a witnessing metric under which the iterative map is contracting.

In this section, we illustrate this possibility by studying an important iterative method, the power iteration. The power iteration method is a widely-used and well-understood method for computing the eigenvalues and eigenvectors of a matrix. For a given matrix AA, it is defined as:

It is well known that if a matrix AA has a unique principal eigenvector, then the power method starting from a vector non-perpendicular to the principal eigenvector will converge to it. This is shown using a potential function argument, presented in Appendix A.2, which also pins down the rate of convergence.

Our converse to Banach’s theorem, guarantees that, besides the potential function argument, there must also exist a distance metric under which the power iteration is a contraction map. To illustrate our theorem, we identify a new distance metric under which the power method is indeed contracting at the optimal rate.

Moreover, t=log(d(x0,v1)/ε)log(λ1/λ2)t=\frac{\log(d(\bm{x}_{0},\bm{v}_{1})/\varepsilon)}{\log(\lambda_{1}/\lambda_{2})} iterations suffice to have xtv12d(xt,v1)ε\left\|\bm{x}_{t}-\bm{v}_{1}\right\|_{2}\leq d(\bm{x}_{t},\bm{v}_{1})\leq\varepsilon.

For any vector x\bm{x}, it holds that Ax,v1=λ1x,v1.\langle A\bm{x},\bm{v}_{1}\rangle=\lambda_{1}\langle\bm{x},\bm{v}_{1}\rangle. We have that

where the inequality is true as the vector xx,v1yy,v1\frac{\bm{x}}{\langle\bm{x},\bm{v}_{1}\rangle}-\frac{\bm{y}}{\langle\bm{y},\bm{v}_{1}\rangle} is perpendicular to the principal eigenvector v1\bm{v}_{1}. This shows that ff is contracting with respect to dd as required.

Using these observations and following the same approach as in Corollaries 1-2 we get the required bound on the number of iterations. ∎

Banach is Complete for CLS

As discussed in Section 1, the complexity class CLS was defined in [DP11] to capture problems in the intersection of PPAD and PLS, such as P-matrix LCP, mixed Nash equilibria of congestion and multi-player coordination games, finding KKT points, etc. It also contains computational variants of finding fixed points whose existence is guaranteed by Banach’s fixed point theorem. In this section, we close the circle by proposing two variants of Banach fixed point computation that are both CLS-complete. Our CLS completeness results are obtained by making our proof of Theorem 1 constructive. We start with a formal definition of CLS, which is defined in terms of the problem Continuous LocalOpt.

Continuous LocalOpt takes as input two functions f:33f:^{3}\to^{3}, p:3p:^{3}\to, both represented as arithmetic circuits, and two rational positive constants ε\varepsilon and λ\lambda. The desired output is any of the following:

a point x3x\in^{3} such that p(f(x))p(x)εp(f(x))\geq p(x)-\varepsilon.

two points x,x3x,x^{\prime}\in^{3} violating the λ\lambda-Lipschitz continuity of ff, i.e. f(x)f(x)1>λxx1|f(x)-f(x^{\prime})|_{1}>\lambda|x-x^{\prime}|_{1}.

two points x,xx,x^{\prime} violating the λ\lambda-Lipschitz continuity of pp, i.e. p(x)p(x)>λxx1|p(x)-p(x^{\prime})|>\lambda|x-x^{\prime}|_{1}.

The class CLS is the set of search problems that can be reduced to Continuous LocalOpt.

The variant of Banach’s theorem that is known to belong to CLS is Contraction Map, defined as follows:

Contraction Map takes as input a function f:33f:^{3}\to^{3} represented as an arithmetic circuit and three rational positive constants ε\varepsilon, λ\lambda, c<1c<1. The desired output is any of the following (where dd represents Euclidean distance):

a point x3x\in^{3} such that d(x,f(x))εd(x,f(x))\leq\varepsilon

two points x,x3x,x^{\prime}\in^{3} disproving the contraction of ff w.r.t. dd with constant cc, i.e. d(f(x),f(x))>cd(x,x)d(f(x),f(x^{\prime}))>c\cdot d(x,x^{\prime})

two points x,x3x,x^{\prime}\in^{3} disproving the λ\lambda-Lipschitz continuity of ff, i.e. f(x)f(x)1>λxx1|f(x)-f(x^{\prime})|_{1}>\lambda|x-x^{\prime}|_{1}.

Contraction Map targets fixed points whose existence is guaranteed by Banach’s fixed point theorem when ff is a contraction map with respect to the Euclidean distance. However, it doesn’t capture the full generality of Banach’s theorem, since the latter can be applied to any complete metric space. We thus define a more general problem, Banach that: (i) still lies inside CLS, (ii) captures the generality of Banach’s theorem, (iii) and in fact tightly captures the complexity of the class CLS, by being CLS-complete. This problem is defined as follows:

a point x3x\in^{3} such that d(x,f(x))εd(x,f(x))\leq\varepsilon

two points x,x3x,x^{\prime}\in^{3} disproving the contraction of ff w.r.t. dd with constant cc, i.e. d(f(x),f(x))>cd(x,x)d(f(x),f(x^{\prime}))>c\cdot d(x,x^{\prime})

two points x,x3x,x^{\prime}\in^{3} disproving the λ\lambda-Lipschitz continuity of ff, i.e. f(x)f(x)1>λxx1|f(x)-f(x^{\prime})|_{1}>\lambda|x-x^{\prime}|_{1}.

four points x1,x2,y1,y23x_{1},x_{2},y_{1},y_{2}\in^{3} with x1x2x_{1}\neq x_{2} and y1y2y_{1}\neq y_{2} disproving the λ\lambda-Lipschitz continuity of d(,)d(\cdot,\cdot), i.e. d(x1,x2)d(y1,y2)>λ(x1y11+x2y21)|d(x_{1},x_{2})-d(y_{1},y_{2})|>\lambda\left(\left|x_{1}-y_{1}\right|_{1}+\left|x_{2}-y_{2}\right|_{1}\right).

We remark that Banach is tightly related to Contraction Map defined above, with the following differences. First, instead of Euclidean distance, the metric with respect to which ff is purportedly contacting is provided as part of the input and it is promised to be a metric. Second, we need to add an extra type of accepted solution (Od), which is a violation of the Lipschitz property of that metric. This is necessary to guarantee that the above problem has a solution of polynomial length for any possible input, and in particular is needed to place the above problem in CLS. (It is not needed for the CLS-hardness.)

We give here a sketch of the proof of Theorem 2 and we present the full proof in Appendix C.

Since the inclusion to CLS is a simple argument very similar to the argument from [DP11] that shows that Contraction Map belongs to CLS, we focus here on the hardness proof.

The inspiration of this proof is to make the proof of Theorem 1 constructive in polynomial time. We therefore follow the steps of the proof sketch of Theorem 1 as presented in Section 3.

Step I. Since we don’t have the strong requirement of Theorem 1 to output a metric that is topologically equivalent with some given metric we can use in place of dMd_{M} any metric dd^{\prime} such that ff is non-expanding with respec to dd^{\prime}. Hence we can easily observe that the discrete metric can be used as dMd_{M}.

Step II. The construction of Theorem 1 uses in the definition of d(x,y)d(x,y) the number of times n(x)n(x), that we have to apply ff on xx in order for f[n(x)](x)f^{[n(x)]}(x) to come ε\varepsilon-close to the fixed point xx^{*} of ff. Of course n(x)n(x) is not a quantity that can be computed in polynomial time. Instead we show that it suffices to use an upper bound on n(x)n(x) which we can get using the potential function, namely p(x)/εp(x)/\varepsilon. Of course the operations that we are allowed to use to describe dd as an arithmetic circuit are limited and this step appears to need more expressive power that the simple arithmetic operations that we are allowed to use. We give a careful construction that bypasses these difficulties and completes this step of the proof.

Steps III. This step of Theorem 1 is highly non-constructive and hence we cannot hope to replicate it in polynomial time. But we prove that our carefully designed metric already has the triangle inequality and hence the transitive closure step is not necessary.

Acknowledgements

The authors were supported by NSF CCF-1551875, CCF-1617730, CCF-1650733, and a Simons Graduate Research Fellowship.

References

Appendix A Preliminaries

Let D\mathcal{D} be a set and τ\tau a collection of subsets of D\mathcal{D} with the following properties.

The empty set τ\emptyset\in\tau and the space Dτ\mathcal{D}\in\tau.

If UaτU_{a}\in\tau for all aAa\in A then aAUaτ\bigcup_{a\in A}Ua\in\tau.

Then we say that τ\tau is a topology on D\mathcal{D} and that (D,τ)(\mathcal{D},\tau) is a topological space. We call open sets the members of τ\tau. Also a subset CC of D\mathcal{D} is called closed if DC\mathcal{D}\setminus C is an open set, i.e. belongs to τ\tau. Let (D,τ)(\mathcal{D},\tau) be a topological space and AA a subset of D\mathcal{D}. We write

Metric Spaces

The diameter of a set WDW\subseteq\mathcal{D} according to the metric dd is defined as

then dSd_{S} is called the discrete metric on D\mathcal{D}.

Remark.

It is very easy to see that discrete metric is indeed a metric, i.e. it satisfies the conditions (i)-(iv).

Let (D,d)(\mathcal{D},d) and (X,d)(\mathcal{X},d^{\prime}) be metric spaces. A function f:DXf:\mathcal{D}\to\mathcal{X} is called continuous if, given xDx\in\mathcal{D} and ε>0\varepsilon>0, we can find a δ(x,ε)>0\delta(x,\varepsilon)>0 such that

We say that a subset EDE\subseteq\mathcal{D} is open in D\mathcal{D} if, whenever eEe\in E, we can find a δ>0\delta>0 (depending on ee) such that

The next lemma connects the definition of open sets according to some metric with the definition of open sets in a topological space.

If (D,d)(\mathcal{D},d) is a metric space, then the collection of open sets forms a topology.

We define the open ball of radius rr around xx to be B(x,r)={yDd(x,y)<r}B(x,r)=\{y\in\mathcal{D}|d(x,y)<r\}.

Closed Sets for Metric Spaces

then we say that xnxx_{n}\rightarrow x as nn\rightarrow\infty and that xx is the limit of the sequence (xn)(x_{n}). A set GDG\subseteq\mathcal{D} is said to be closed if, whenever xnGx_{n}\in G and xnxx_{n}\rightarrow x then xGx\in G. A proof of the following lemma can be found in [Kör10].

We define the closed ball of radius rr around xx to be Bˉ(x,r)={yDd(x,y)r}\bar{B}(x,r)=\{y\in\mathcal{D}|d(x,y)\leq r\}. A subset GG of a metric space (D,d)(\mathcal{D},d) is called compact if GG is closed and every sequence in GG has a convergent subsequence. A metric space (D,d)(\mathcal{D},d) is called compact if D\mathcal{D} is compact, locally compact if for any xDx\in\mathcal{D}, xx has a neighborhood that is compact and proper if every closed ball is compact.

Complete Metric Spaces

A metric space (D,d)(\mathcal{D},d) is complete if every Cauchy sequence converges.

Two metrics dd, dd^{\prime} of the same set D\mathcal{D} are called topologically equivalent (or just equivalent) if for every sequence (xn)(x_{n}) in D\mathcal{D}, (xn)(x_{n}) is dd-Cauchy sequence if and only if it is dd^{\prime}-Cauchy sequence.

Let (D,d)(\mathcal{D},d) be a metric spaces. A function f:DDf:\mathcal{D}\to\mathcal{D} is called continuous with repect to dd, if given xDx\in\mathcal{D} and ε>0\varepsilon>0, we can find a δ(x,ε)\delta(x,\varepsilon) such that

Lipschitz Continuity

If a function f:DXf:\mathcal{D}\to\mathcal{X} is Lipschitz continuous then it is continuous.

A fixed point of a selfmap ff is any point xDx^{*}\in\mathcal{D} such that f(x)=xf(x^{*})=x^{*}.

A.2 Introduction to Power Method

and since we assumed that v0v_{0} is not perpendicular to q1q_{1} we have that c10c_{1}\neq 0.

Since the eigenvalues are assumed to be real, distinct, and ordered by decreasing magnitude, it follows that

So, as kk increases, Akv0A^{k}v_{0} approaches c1λ1kq1c_{1}\lambda_{1}^{k}q_{1}, and thus for large values,

The power iteration method is simple and elegant, but suffers some drawbacks. Except from a measure of initial conditions, the method returns a single eigenvector, corresponding to the eigenvalue of largest magnitude. In addition, convergence is only guaranteed if the eigenvalues are distinct—in particular, the two eigenvalues of largest absolute value must have distinct magnitudes. The rate of convergence primarily depends upon the ratio of these magnitudes, so if the two largest eigenvalues have similar sizes, then the convergence will be slow.

In spite of its drawbacks, the power method is still used in many applications, since it works well on large, sparse matrices when only a single eigenvector is needed. However, there are other methods overcoming some of the issues with the power iteration method.

Appendix B Proof of Theorem 1

The proof of Theorem 1, follows the construction of [Mey67]. We give a complete step by step description of this construction. For every step of this construction we explain what it was already proven by Meyers and we additionally prove some properties that are needed in order to satisfy our additional condition (2b).

There exists an open neighborhood WW of xx^{*} such that

Since f[n](U){x}f^{[n]}(U)\rightarrow\{x^{*}\}, there is an integer kk such that f[k](U)Uf^{[k]}(U)\subseteq U. Let

Then for any xWx\in W and for any 1jk11\leq j\leq k-1 it holds that xf[j](U)x\in f^{[-j]}(U) and thus f(x)f[(j1)](U)f(x)\in f^{[-(j-1)]}(U). Moreover xUx\in U, so that f[k](x)f[k](U)Uf^{[k]}(x)\in f^{[k]}(U)\subset U and thus f(x)f[(k1)](U)f(x)\in f^{[-(k-1)]}(U). Hence xWx\in W implies f(x)Wf(x)\in W, which was to be shown. The diameter of WW can be bounded by the diameter of UU and hence is less that ε\varepsilon.

We now proceed to the main line of the proof. The construction follows three steps:

We first construct a metric dMd_{M}, which is topologically equivalent to dd, and with respect to which ff is non-expanding. It also holds that dM(x,y)d(x,y)d_{M}(x,y)\geq d(x,y) for all x,yDx,y\in\mathcal{D} and therefore Property (2b) can be transferred from dMd_{M} to dd.

Given dMd_{M}, we proceed to construct a “distance” function ρc,ε\rho_{c,\varepsilon}, which satisfies (2a) and all the metric properties except maybe for the triangle inequality. Moreover ρc,ε\rho_{c,\varepsilon} satisfies that ρc,ε(x,y)dM(x,y)\rho_{c,\varepsilon}(x,y)\geq d_{M}(x,y) if max{d(x,x),d(x,y)}ε\max\{d(x^{*},x),d(x^{*},y)\}\geq\varepsilon, and therefore (2b) is preserved.

Given ρc,ε\rho_{c,\varepsilon}, we construct the sought after metric dc,εd_{c,\varepsilon} by taking it equal to the ρc,ε\rho_{c,\varepsilon}-geodesic distance. Given the properties of ρc,ε\rho_{c,\varepsilon} and the definition of dc,εd_{c,\varepsilon}, we can prove that dc,εd_{c,\varepsilon} is a metric and Properties (2a) and (2b) hold.

In the fist step of the construction we define an metric dMd_{M} as

and we show that ff is non-expanding with respect to dMd_{M}.

For the dMd_{M} function defined above we have that:

dMd_{M} is well defined and satisfies all the metric properties (see Definition 1).

dMd_{M} is topologically equivalent with dd.

The proof of Lemma 6 can be found in Section B.2 where for completeness we keep the proofs that were already proved by Meyers’s [Mey67].

For our purposes we also observe that by the definition of dMd_{M} the following holds

and hence if dM(x,y)εd_{M}(x,y)\leq\varepsilon then also d(x,y)εd(x,y)\leq\varepsilon, therefore dMd_{M} satisfies (2b).

For xK0{x}x\in K_{0}\setminus\{x^{*}\} we set n(x)=maxxKn{n}0n(x)=\max_{x\in K_{n}}\{n\}\geq 0. The fact that n(x)n(x) is finite is guaranteed by (4). To see this assume that there is an infinite sequence n1,n2,n_{1},n_{2},\dots such that xKnix\in K_{n_{i}} which implies that xi=1Kix\in\bigcap_{i=1}^{\infty}K_{i} which definitely contradicts (4). We define also n(x)=n(x^{*})=\infty, and for xDK0x\in\mathcal{D}\setminus K_{0} set n(x)=minf[m](x)K0{m}=maxxKn{n}<0n(x)=-\min_{f^{[m]}(x)\in K_{0}}\{m\}=\max_{x\in K_{n}}\{n\}<0 which again is finite because of condition 2. Let κ(x,y)=min{n(x),n(y)}\kappa(x,y)=\min\{n(x),n(y)\}, we define ρc\rho_{c} to be

For the ρc\rho_{c} function defined above we have that:

ρc\rho_{c} is well defined and satisfies the metric properties (see Definition 1), except the triangle inequality (iv).

ff is a contraction map with respect to ρc\rho_{c} with contraction constant cc.

The proof of Lemma 7 is almost immediate from the definition of ρc\rho_{c}, but for a detailed explanation we refer to the initial proof by Meyers [Mey67].

In this last step what we do is that we assign the distance between two points to be the length of the shortest path that connects these two points, with the lengths computed according to ρc\rho_{c}. Then the distance satisfies the triangle inequality because of the shortest path property.

Formally, denote by SxyS_{xy} the set of chains sxy=(x=x0,x1,,xm=y)s_{xy}=(x=x_{0},x_{1},\dots,x_{m}=y) from xx to yy with associated lengths Lc(sxy)=i=1mρc(xi,xi1)L_{c}(s_{xy})=\sum_{i=1}^{m}\rho_{c}(x_{i},x_{i-1}). We define

For the dcd_{c} function defined above we have that:

dcd_{c} is well defined and satisfies all the metric properties (see Definition 1).

ff is a contraction map with respect to dcd_{c} with contraction constant cc.

dcd_{c} is topologically equivalent with dd and hence (D,dc)(\mathcal{D},d_{c}) is a complete metric space.

The proof of Lemma 8 can be found in Section B.2.

We know need to prove two lemmas to prove that Meyers’s construction also satisfies (2b).

Consider any xxx\neq x^{*} and yx,xy\neq x,x^{*} and assume that yK0y\notin K_{0}, then

Proof of Lemma 9: By definition any chain sxys_{xy} either lies in DK0\mathcal{D}\setminus K_{0} entirely, or has a last link which leaves K0K_{0}. If sxys_{xy} lies in DK0\mathcal{D}\setminus K_{0} entirely then n(x),n(y)<0n(x),n(y)<0 and hence dc(x,y)dM(x,y)d_{c}(x,y)\geq d_{M}(x,y). Otherwise we consider the last link that leaves K0K_{0} and we have that the length between xx and yy according to dcd_{c} is greater than the distance with respect to dMd_{M} of xx from K0K_{0} which gives that

If either dM(x,x)εd_{M}(x,x^{*})\leq\varepsilon or dM(y,x)εd_{M}(y,x^{*})\leq\varepsilon then we are done since as we have seen in the construction of dMd_{M}, dM(x,y)d(x,y)d_{M}(x,y)\geq d(x,y), thus either d(x,x)εd(x,x^{*})\leq\varepsilon or d(y,x)εd(y,x^{*})\leq\varepsilon and (2b) is satisfied. So we may assume that d(x,x)εd(x,x^{*})\geq\varepsilon and d(y,x)εd(y,x^{*})\geq\varepsilon. Therefore x,yDK0x,y\in\mathcal{D}\setminus K_{0} and which translates to n(x),n(y)<0n(x),n(y)<0. So now using Lemma 9 and we get

Now we consider two cases according to the value of dM(x,K0)d_{M}(x,K_{0}). If dM(x,K0)εd_{M}(x,K_{0})\geq\varepsilon then

B.2 Omitted Proof of Lemmas proven by Meyers in [Mey67]

Thus dMd_{M} is indeed a metric. We now show that dMd_{M} is topologically equivalent to dd. From the inequality d(x,y)dM(x,y)d(x,y)\leq d_{M}(x,y) it follows that any dMd_{M}-convergent sequence is also dd-convergent, with the same limit point. To prove the implication in the opposite direction, note that condition 2. of the hypotheses of the theorem implies the existence for each η>0\eta>0 of an NN such that

For each xDx\in\mathcal{D}, it follows from 2. that

is finite. Since ff is continuous, there is an δ>0\delta>0 so small that d(x,y)<δd(x,y)<\delta implies

By (3a) we have f[n+N+ν(x)](x)f[n+N](W)f^{[n+N+\nu(x)]}(x)\in f^{[n+N]}(W) and f[n+N+ν(x)](y)f[n+N](W)f^{[n+N+\nu(x)]}(y)\in f^{[n+N]}(W) for all n>0n>0, so that the (11) implies

Thus d(x,y)δd(x,y)\leq\delta implies dM(x,y)ηd_{M}(x,y)\leq\eta. This shows that a sequence which is dd-convergent to xx is also dMd_{M}-convergent to xx, completing the proof of topological equivalence. Finally since dd and dMd_{M} are topologically equivalent and dd is complete for D\mathcal{D} it follows that dMd_{M} is also complete for D\mathcal{D}. \hfill\qed\hfill\qed

Proof of Lemma 8: We will prove that dcd_{c} is the desired metric. That ff is a contraction with constant cc with respect to dcd_{c} follows by applying (7) to the links [xi1,xi][x_{i-1},x_{i}] of any chain sxys_{xy}. Clearly dcd_{c} is symmetric and dc(x,x)=0d_{c}(x,x)=0. The triangle law holds since following a sxys_{xy} with a syzs_{yz} yields a sxzs_{xz}. It remains to show that it is positive definite.

Consider any xxx\neq x^{*} and yxy\neq x and assume n(x)n(y)n(x)\leq n(y) without loss of generality. If yxy\neq x^{*}, any chain sxys_{xy} either lies in DKn(y)+1\mathcal{D}\setminus K_{n(y)+1}, or has a last link which leaves Kn(y)+1K_{n(y)+1}, so that

The remaining case, y=xy=x^{*} is covered by

Thus dcd_{c} is a distance metric. We now have to prove that dcd_{c} is equivalent to dMd_{M}. Let Bν=Df[ν](W)B_{\nu}=\mathcal{D}\setminus f^{[-\nu]}(W) for ν0\nu\geq 0, so that the definition of ν(x)\nu(x) (10) implies dM(x,Bν(x))>0d_{M}(x,B_{\nu(x)})>0 and n(x)ν(x)n(x)\geq-\nu(x). For any xxx\neq x^{*}, if yy obeys

then n(x)ν(x)n(x)\geq-\nu(x), so that (6) and (12), the last with xx and yy interchanged, imply

Now choose k(x)>max{0,n(x)}k(x)>\max\{0,n(x)\} such that zKk(x)z\in K_{k(x)} implies dM(z,x)<dc(x,x)/2d_{M}(z,x^{*})<d_{c}(x,x^{*})/2. Then dc(x,Kk(x))dc(x,x)/2d_{c}(x,K_{k(x)})\geq d_{c}(x,x^{*})/2, so that if yy obeys

then only chains disjoint from Kk(x)K_{k(x)} need enter (6), implying

then with (16) and (17) this implies (14) and hence (15) applies. Thus dc(xn,x)0d_{c}(x_{n},x)\to 0 whenever dM(xn,x)0d_{M}(x_{n},x)\to 0.

Now if x=xx=x^{*}, note first that if dM(x,y)<dM(x,B0)d_{M}(x^{*},y)<d_{M}(x^{*},B_{0}), then

Also note that for any η>0\eta>0, f[n](W){x}f^{[n]}(W)\to\{x^{*}\} guarantees an N(η)>0N(\eta)>0 such that dM(x,z)<η/2d_{M}(x^{*},z)<\eta/2 for all zKN(η)z\in K_{N(\eta)}. Then dM(x,y)>ηd_{M}(x^{*},y)>\eta implies that dM(y,KN(η))η/2d_{M}(y,K_{N(\eta)})\geq\eta/2 and thus that

Hence dc(xn,x)0d_{c}(x_{n},x^{*})\to 0 if and only if dM(xn,x)0d_{M}(x_{n},x^{*})\to 0.

Now exactly as above choose k((xn))=P>max0,Nk((x_{n}))=P>\max{0,N} such that zKk((xn))z\in K_{k((x_{n}))} implies

for all p>ip>i, and using (17) with k(x)=Pk(x)=P, we have

so that (xn)(x_{n}) is a dMd_{M}-Cauchy sequence. Therefore since (D,dM)(\mathcal{D},d_{M}) is complete, the topological space (D,dc)(\mathcal{D},d_{c}) is complete too. \hfill\qed\hfill\qed

Appendix C Proof of Theorem 2

On route to establishing the CLS-completeness of Banach, we will define an intermediate, syntactic problem MetricBanach, which is similar to Banach except that the function dd given in the input is not promised to be a metric, and hence a violation of the metricity of dd is accepted as a solution.

a point x3x\in^{3} such that d(x,f(x))εd(x,f(x))\leq\varepsilon

two points x,x3x,x^{\prime}\in^{3} disproving the contraction of ff w.r.t. dd with constant cc, i.e. d(f(x),f(x))>cd(x,x)d(f(x),f(x^{\prime}))>c\cdot d(x,x^{\prime})

two points x,x3x,x^{\prime}\in^{3} disproving the λ\lambda-Lipschitz continuity of ff, i.e. f(x)f(x)1>λxx1|f(x)-f(x^{\prime})|_{1}>\lambda|x-x^{\prime}|_{1}.

four points x1,x2,y1,y23x_{1},x_{2},y_{1},y_{2}\in^{3} with x1x2x_{1}\neq x_{2} and y1y2y_{1}\neq y_{2} disproving the λ\lambda-Lipschitz continuity of d(,)d(\cdot,\cdot), i.e. d(x1,x2)d(y1,y2)>λ(x1y1+x2y2)|d(x_{1},x_{2})-d(y_{1},y_{2})|>\lambda\left(\left|x_{1}-y_{1}\right|+\left|x_{2}-y_{2}\right|\right).

points x,y,z3x,y,z\in^{3} violating any of the metric properties of dd ((i)-(iv) of Definition 1).

Notice that MetricBanach is syntactic, namely for any input there exists a solution. We proceed to show that the problem is CLS-complete.

We first show that MetricBanach belongs to CLS even when we disallow (Oe). Starting from an instance (f,d,ε,λ,c)(f,d,\varepsilon,\lambda,c) of Continuous LocalOpt we create the following instance

Now we have to show that any output of the Continuous LocalOpt with input (f,p,ε,λ)(f,p,\varepsilon^{\prime},\lambda) will give us a output of MetricBanach with input (f,d,ε,λ,c)(f,d,\varepsilon,\lambda,c).

(CO1)     \implies If d(f(x),f(f(x)))>cd(x,f(x))d(f(x),f(f(x)))>c\cdot d(x,f(x)) then (x,f(x))(x,f(x)) satisfies (Ob). Otherwise

Therefore xx satisfies (Oa) and therefore is a solution of MetricBanach.

(CO3)     \implies Without loss of generality let xf(x)yf(y)\left\|x-f(x)\right\|\leq\left\|y-f(y)\right\|. If x=f(x)x=f(x) then if d(x,f(x))=0d(x,f(x))=0 we immediately satisfy (Oa) otherwise we satisfy (Oe). Otherwise we can give x1=xx_{1}=x, x2=f(x)x_{2}=f(x), y1=yy_{1}=y, y2=f(y)y_{2}=f(y) and since x1x2x_{1}\neq x_{2}, y1y2y_{1}\neq y_{2} we satisfy (ii) of (Od).

This implies that any output of Continuous LocalOpt at the instance (f,p,ε,λ)(f^{\prime},p,\varepsilon^{\prime},\lambda^{\prime}) can produce an output to the instance (f,d,ε,λ,c)(f,d,\varepsilon,\lambda,c) of the MetricBanach problem. Therefore \textscMetricBanach\textscCLS\textsc{MetricBanach}\in\textsc{CLS}.

Now we are going to show the opposite direction and reduce Continuous LocalOpt to MetricBanach. Starting from an instance (f,p,ε,λ)(f,p,\varepsilon,\lambda) of Continuous LocalOpt we define for any x,y3x,y\in^{3},

We also remind the reader the definition of the discrete metric

Finally we define the smooth interpolation function for w0w\leq 0

The basic observation about B()B\left(\cdot\right) since c<1c<1 is that cκ(x,y)+1B(κ(x,y))cκ(x,y)c^{\mathop{\left\lceil\kappa(x,y)\right\rceil}+1}\leq B({\kappa(x,y)})\leq c^{\mathop{\left\lceil\kappa(x,y)\right\rceil}}.

Based on these definitions we create the following instance of MetricBanach

As in the previous reduction we have to show that any result of the MetricBanach with input (f,d,ε,λ,c)(f,d,\varepsilon^{\prime},\lambda,c) will give us a result of Continuous LocalOpt with input (f,p,ε,λ)(f,p,\varepsilon,\lambda).

(Oa)     \implies If p(f(x))p(x)p(f(x))\geq p(x) then xx satisfies (CO1). Otherwise we can see that κ(x,f(x))=p(x)/2ε\kappa(x,f(x))=-p(x)/2\varepsilon and xf(x)x\neq f(x) so

so p(f(x))0p(x)εp(f(x))\geq 0\geq p(x)-\varepsilon and so xx satisfies (CO1).

(Ob)     \implies As in the previous case we may assume that p(f(x))p(x)εp(f(x))\leq p(x)-\varepsilon and that p(f(y))p(y)εp(f(y))\leq p(y)-\varepsilon. Without loss of generality we can assume that p(x)>p(y)p(x)>p(y). If also p(f(x))p(f(y))p(f(x))\geq p(f(y)) then κ(x,y)=p(x)/ε\kappa(x,y)=-p(x)/\varepsilon and κ(f(x),f(y))=p(f(x))/ε\kappa(f(x),f(y))=-p(f(x))/\varepsilon. Therefore

Now similarly if p(f(y))>p(f(x))p(f(y))>p(f(x)) then p(f(y))>p(x)εp(f(y))>p(x)-\varepsilon. But by our assumption that p(x)>p(y)p(x)>p(y) we get p(f(y))>p(y)εp(f(y))>p(y)-\varepsilon. Therefore yy satisfies (CO1).

and because c<1c<1 we have that maxx[0,1/ε]h(x)=c1/εln(1/c)\max_{x\in[0,1/\varepsilon]}h^{\prime}(x)=c^{-1/\varepsilon}\ln(1/c).

Let now κ(x1,x2)=p(x1)/ε\kappa(x_{1},x_{2})=-p(x_{1})/\varepsilon and κ(y1,y2)=p(y1)/ε\kappa(y_{1},y_{2})=-p(y_{1})/\varepsilon. Since x1x2x_{1}\neq x_{2} and y1y2y_{1}\neq y_{2} we have d(x1,x2)=B(cp(x1)/ε)d(x_{1},x_{2})=B\left(c^{-p(x_{1})/\varepsilon}\right) and d(y1,y2)=B(cp(y1)/ε)d(y_{1},y_{2})=B\left(c^{-p(y_{1})/\varepsilon}\right). Since B(κ(x,y))B\left(\kappa(x,y)\right) is just an linear interpolation of points that belong to cκx,yc^{\kappa{x,y}} using the Mean Value Theorem we have that B(κ(x1,x2))B(κ(y1,y2))maxx[0,1/ε]h(x)p(x1)εp(y1)ε\left|B\left(\kappa(x_{1},x_{2})\right)-B\left(\kappa(y_{1},y_{2})\right)\right|\leq\max_{x\in[0,1/\varepsilon]}h^{\prime}(x)\left|\frac{p(x_{1})}{\varepsilon}-\frac{p(y_{1})}{\varepsilon}\right|

Now if p(x1)p(y1)>λx1y1|p(x_{1})-p(y_{1})|>\lambda|x_{1}-y_{1}| then x1,y1x_{1},y_{1} satisfy (CO3) and we have a solution for Continuous LocalOpt. So p(x1)p(y1)λx1y1|p(x_{1})-p(y_{1})|\leq\lambda|x_{1}-y_{1}| and from the last inequality we have that

But this contradicts with (Od) since λ=max{λ,c1/ελln(1/c)ε}\lambda^{\prime}=\max\left\{\lambda,\mathop{\left\lceil c^{-1/\varepsilon}\lambda\frac{\ln(1/c)}{\varepsilon}\right\rceil}\right\}.

By inspecting the proof of the CLS-completeness of MetricBanach we realize that, in the CLS-hardness part of the proof, we can actually guarantee that dd is a metric. We can thus also establish the CLS-completeness of Banach.

Obviously because of Theorem 3, Banach belongs to CLS.

For the opposite direction, we use the same reduction as in the proof of Theorem 3. We then prove that dd satisfies the desired properties. We remind that we used the following instance of Banach for the reduction

We first prove that dd is always a distance metric.

(ii) If xyx\neq y then dS(x,y)>0d_{S}(x,y)>0. Also always cκ(x,y)>0c^{\kappa(x,y)}>0, therefore d(x,y)>0d(x,y)>0. Now since dS(x,x)=0d_{S}(x,x)=0 we also have d(x,x)=0d(x,x)=0.

(iii) It is obvious from the definition of κ\kappa that κ(x,y)=κ(y,x)\kappa(x,y)=\kappa(y,x) and since dSd_{S} is a distance metric, the same is true for the dSd_{S} and thus for dd.

(iv) Without loss of generality we assume that p(x)p(y)p(x)\geq p(y). We consider the following cases

then we have d(x,y)=d(x,z)d(x,y)=d(x,z) and therefore obviously d(x,y)d(x,z)+d(z,y)d(x,y)\leq d(x,z)+d(z,y).

then we have d(x,y)=B(p(x)2ε)d(x,y)=B\left(-\frac{p(x)}{2\varepsilon}\right), d(x,z)=d(z,y)=B(p(z)2ε)d(x,z)=d(z,y)=B\left(-\frac{p(z)}{2\varepsilon}\right) but since p(x)p(z)p(x)\leq p(z) obviously 2B(p(z)2ε)B(p(x)2ε)2B\left(-\frac{p(z)}{2\varepsilon}\right)\geq B\left(-\frac{p(x)}{2\varepsilon}\right).

Finally we will show the completeness of (3,d)(^{3},d). We first observe that for all xyx\neq y, d(x,y)>1d(x,y)>1, this comes from the fact that c<1c<1 and so cp(x)/ε>1c^{-p(x)/\varepsilon}>1.

Appendix D Proofs of Corollaries 1 2

Proof of Corollary 1: Let dc,ε/2d_{c,\varepsilon/2} be the distance metric guaranteed by Theorem 1 with parameters cc, ε/2\varepsilon/2. Let also (xn)(x_{n}) be the sequence produced by the Basic Iterative Procedure. Since ff is a contraction with respect to dc,ε/2d_{c,\varepsilon/2}, we have dc,ε/2(xn,x)cn1cdc,ε/2(x0,x1)d_{c,\varepsilon/2}(x_{n},x^{*})\leq\frac{c^{n}}{1-c}d_{c,\varepsilon/2}(x_{0},x_{1}). If we make sure that dc,ε/2(xn,xn+1)ε/2d_{c,\varepsilon/2}(x_{n},x_{n+1})\leq\varepsilon/2 then according to Theorem 1 d(xn,x)εd(x_{n},x^{*})\leq\varepsilon. So the number of steps that are needed are:

Proof of Corollary 2: Using Corollary 1, we get that after n=log(dc,δ/2(x0,f(x0)))+log((22c)/δ)log(1/c)n=\frac{\log(d_{c,\delta/2}(x_{0},f(x_{0})))+\log((2-2c)/\delta)}{\log(1/c)} iterations we will have d(xn,x)δd(x_{n},x^{*})\leq\delta or d(xn+1,x)δd(x_{n+1},x^{*})\leq\delta. Since in Bˉ(x,δ)\bar{B}(x^{*},\delta), ff is a contraction with respect to dd, it certainly must be that d(xn+1,x)δd(x_{n+1},x^{*})\leq\delta. By the same token, d(xn+1+m,x)cmd(xn+1,x)d(x_{n+1+m},x^{*})\leq c^{m}d(x_{n+1},x^{*}), for all m>0m>0. Therefore, to guarantee d(xn+1+m,x)εd(x_{n+1+m},x^{*})\leq\varepsilon, it suffices to take mlog(1/δ)+log(1/ε)log(1/c)m\geq\frac{-\log(1/\delta)+\log(1/\varepsilon)}{\log(1/c)}. So in total we need n+1+mn+1+m iterations, implying the number of iterations stated in the statement of the corollary. \hfill\qed\hfill\qed