PPP-Completeness with Connections to Cryptography
Katerina Sotiraki, Manolis Zampetakis, Giorgos Zirdelis
Introduction
The fundamental task of Computational Complexity theory is to classify computational problems according to their inherent computational difficulty. This led to the definition of complexity classes such as which contains the decision problems with efficiently verifiable proofs in the “yes” instances. The search analog of the class , called , is defined as the class of search problems whose decision version is in . The same definition extends to the class , as the search analog of . The seminal works of [JPY88, Pap94] considered search problems in that are total, i.e. their decision version is always affirmative and thus a solution must always exist. This totality property makes the definition of inadequate to capture the intrinsic complexity of total problems in the appropriate way as it was first shown in [JPY88]. Moreover, there were evidences for the hardness of total search problems e.g. in [HPV89]. Megiddo and Papadimitriou [MP89] defined the class that contains the total search problems of , and Papadimitriou [Pap94] proposed the following classification rule of problems in :
Total search problems should be classified in terms of the profound mathematical principles that are invoked to establish their totality.
Along these lines, many subclasses for have been defined. Johnson, Papadimitriou and Yannakakis [JPY88] defined the class . A few years later, Papadimitriou [Pap94] defined the complexity classes , , and , each one associated with a profound mathematical principle in accordance with the above classification rule. More recently, the classes and were defined in [DP11] and [Jer16], respectively. In Section 1.1 we give a high-level description of all these classes.
Finding complete problems for the above classes is important as it enhances our understanding of the underlying mathematical principles. In turn, such completeness results reveal equivalences between total search problems, that seemed impossible to discover without invoking the definition of these classes. Since the definition of these classes in [JPY88, Pap94] it was clear that the completeness results about problems that don’t have explicitly a Turing machine or a circuit as a part of their input are of particular importance. For this reason it has been established to call such problems natural in the context of the complexity of total search problems (see [FRG18]).
Many natural complete problems are known for and , and recently natural complete problems for were identified too (see Section 1.1). However, no natural complete problems are known for the classes , that have profound connections with the hardness of important cryptographic primitives, as we explain later in detail.
Our Contributions. Our main contribution is to provide the first natural complete problems for and , and thus solve a longstanding open problem from [Pap94]. Beyond that, our completeness results lead the way towards answering important questions in cryptography and lattice theory as we highlight below.
Universal Collision-Resistant Hash Function. Building on the inherent connection of with collision-resistant hash functions, we construct a natural hash function family with the following properties:
Worst-Case Universality. No efficient algorithm can find a collision in every function of the family , unless worst-case collision-resistant hash functions do not exist.
Moreover, if an (average-case hard) collision-resistant hash function family exists, then there exists an efficiently samplable distribution over , such that is an (average-case hard) collision-resistant hash function family.
Average-Case Hardness. No efficient algorithm can find a collision in a function chosen uniformly at random from , unless we can efficiently find short lattice vectors in any (worst-case) lattice.
The first property of is reminiscent of the existence of worst-case one-way functions from the assumption that [Sel92]. The corresponding assumption for the existence of worst-case collision-resistance hash functions is assuming , but our hash function family is the first natural definition that does not involve circuits, and admits this strong completeness guarantee in the worst-case.
The construction and properties of lead us to the first candidate of a natural and universal collision-resistant hash function family. The idea of universal constructions of cryptographic primitives was initiated by Levin in [Lev87], who constructed the first universal one-way function and followed up by [Lev03, KN09]. Using the same ideas we can also construct collision a universal collision-resistant hash function family as we describe in Appendix C. The constructed hash function though invokes in the input an explicit description of a Turing machine and hence it fails to be natural, with the definition of naturality that we described before. In contrast, our candidate construction is natural, simple, and could have practical applications.
Complexity of Lattice Problems in . The hardness of lattice problems in [AR04] has served as the foundation for numerous cryptographic constructions in the past two decades. This line of work was initiated by the breakthrough work of Ajtai [Ajt96], and later developed in a long series of works (e.g. [AD97, MR07, Reg09, GPV08, Pei09, GVW13, BLP+13, BV14, GSW13, GVW15b, GKW17, WZ17, PRSD17]). This wide use of search (approximation) lattice problems further motivates their study.
We make progress in understanding this important research front by showing that:
the computational problem associated with Blichfeldt’s theorem, which can be viewed as a generalization of Minkowski’s theorem, is -complete,
the problem, a constrained version of the Short Integer Solution (), is -complete,
we combine known results and techniques from lattice theory to show that most approximation lattice problems are reducible to and .
These results create a new path towards a better understanding of lattice problems in terms of complexity classes.
Complexity of Other Cryptographic Assumptions. Besides lattice problems, we discuss the relationship of other well-studied cryptographic assumptions and . Additionally, we formulate a white-box variation of the generic group model for the discrete logarithm problem [Sho97]; we observe that this problem is in and is another natural candidate for being -complete.
In this section we discuss the previous work on the complexity of total search problems, that has drawn attention from the theoretical computer science community over the past decades. We start with a high-level description of the total complexity classes and then discuss the known results for each one of them.
The class of problems whose totality is established using a potential function argument. Every finite directed acyclic graph has a sink.
The class of problems whose totality is proved through a parity argument. Any finite graph has an even number of odd-degree nodes.
The class of problems whose totality is proved through a directed parity argument. All directed graphs of degree two or less have an even number of degree one nodes.
The class of problems whose totality is proved through a pigeonhole principle argument. Any map from a set to itself either is onto or has a collision.
Using the same spirit two more classes were defined after [Pap94], in [DP11] and [Jer16].
The class of problems whose totality is established using both a potential function argument and a parity argument.
The class of problems whose totality is proved through a weak pigeonhole principle. Any map from a set to a strict subset of has a collision.
Recently, a syntactic analog of the semantic class has been defined in [GP17], and a complete problem for this class has been identified. It has also been shown that all the classes we described above are subsets of . Oracle separations between all these classes are known [BCE+98], with the only exception of whether is contained in .
-completeness. The class represents the complexity of local optimization problems. Some important problems that have been shown to be -complete are: Local Max-Cut [SY91], Local Travelling Salesman Problem [Pap92], and Finding a Pure Nash Equilibrium [FPT04]. Recently, important results for the smoothed complexity of the Local Max-Cut problem were shown in [ER17, ABPW17].
-completeness. Arguably, the most celebrated application of the complexity of total search problems is the characterization of the computational complexity of finding a Nash equilibrium in terms of -completeness [DGP09, CDT09]. This problem lies in the heart of game theory and economics. The proof that Nash Equilibrium is -complete initiated a long line of research in the intersection of computer science and game theory and revealed connections between the two scientific communities that were unknown before (e.g. [EGG06, CDDT09, VY11, KPR+13, CDO15, Rub15, Rub16, CPY17, SSB17]).
-completeness. -complete problems usually arise as the undirected generalizations of their -complete analogs. For example, Papadimitriou [Pap94] showed that Sperner’s Lemma in a 3-D cube is -complete and later Grigni [Gri01] showed that Sperner’s Lemma in a 3-manifold consisting of the product of a Möbius strip and a line segment is -complete. Since Möbius strip is non-orientable, this indeed is a non-directed version of the Sperner’s Lemma. Similarly, other problems have been showed to be -complete, all involving some circuit as an input in their definition [ABB15, DEF+16, BIQ+17]. Recently, the first natural -complete problem, without a circuit as part of the input, has been identified in [FRG18]. This illustrates an interesting relation between and complexity of social choice theory problems.
-completeness. The class was defined in [DP11] to capture the complexity of problems such as P-matrix LCP, computing KKT-points, and finding Nash equilibria in congestion and network coordination games. Recently, it has been proved that the problem of finding a fixed point whose existence invokes Banach’s Fixed Point Theorem, is -complete [DTZ18, FGMS17].
and cryptography. The connection of and cryptography was illustrated by Papadimitriou in [Pap94], where he proved that if then one-way permutations cannot exist. In [BO06], a special case of integer factorization was shown to be in . This was generalized in [Jer16] by proving that the problem of factoring integers is in under randomized reductions. Recently, strong cryptographic assumptions were used to prove the average-case hardness of and [BPR15, GPS16, HY17]. In [RSS17] it was shown that average-case hardness does not imply one-way function under black-box reductions, whereas in [HNY17] it was shown that any hard on average problem in implies the average case hardness of . Finally, in [KNY17] it is proved that the existence of multi-collision resistant hash functions is equivalent with a variation of the total search problem , which is not known to belong to any of the above complexity classes. Interestingly, they prove that a variation of called colorful-Ramsey () is -hard. Although this an important result, the problem still invokes a circuit in the input and in not known to be in , hence does not resolve the problem of identifying a natural complete problem for .
2 Roadmap of the paper
We start our exposition with a brief description of the results contained in this paper. First we briefly describe the -completeness of that illustrates some of the basic ideas behind our main result that is -complete. The complete proof of the -completeness of can be found in Section 3. We suggest to readers that have experience with the fundamental concepts of the theory of lattices to skip the details Section 3.
Then, we present a brief description of our main theorem and its proof. The complete proof of the -completeness of can be found in Section 4.
In Section 5 we describe the -completeness of a weaker version of and its relation with the definition of the first natural universal collision resistant hash function family in the worst-case sense. This proof also provides the first candidate for a collision resistant hash function family that is both natural and universal in the average-case sense.
Finally in Section 6 we present, for completeness of our exposition, other lattice problems that are already known to belong to and and in Section 7 we present more general other cryptographic assumptions that belong to and .
3 Overview of the Results
We give a proof overview of our first main theorem, and highlight the obstacles that arise, along with our solutions.
The problem is -complete.
The output of is either an \mathbf{s}_{\boldsymbol{\underaccent{\bar}{x}}}=\begin{bmatrix}\boldsymbol{\underaccent{\bar}{x}}\\ \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}})\end{bmatrix}\in S\cap\mathcal{L} that (by construction of ) implies \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}})=\mathbf{0}, or two different elements of , \mathbf{s}_{\boldsymbol{\underaccent{\bar}{x}}}=\begin{bmatrix}\boldsymbol{\underaccent{\bar}{x}}\\ \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}})\end{bmatrix}, \mathbf{s}_{\boldsymbol{\underaccent{\bar}{y}}}=\begin{bmatrix}\boldsymbol{\underaccent{\bar}{y}}\\ \mathcal{C}(\boldsymbol{\underaccent{\bar}{y}})\end{bmatrix} with \mathbf{s}_{\boldsymbol{\underaccent{\bar}{x}}}-\mathbf{s}_{\boldsymbol{\underaccent{\bar}{y}}}\in\mathcal{L} that implies \boldsymbol{\underaccent{\bar}{x}}\neq\boldsymbol{\underaccent{\bar}{y}} and (by construction of ) \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}})=\mathcal{C}(\boldsymbol{\underaccent{\bar}{y}}).
3.2 𝗰𝗦𝗜𝗦𝗰𝗦𝗜𝗦\boldsymbol{\mathsf{cSIS}} is 𝗣𝗣𝗣𝗣𝗣𝗣\boldsymbol{\mathsf{PPP}}-complete.
Part of the input to is represented with a value function which requires a small circuit. As we explained before this makes a non-natural problem with the respect to the definition of naturality in the context of the complexity of total search problems. We now introduce a natural problem that we call constrained Short Integer Solution (), and show that it is -complete. The problem is a generalization of the well-known Short Integer Solution () problem, and discuss their connection in Section 5.2.
The problem is -complete.
It can be argued that this reduction shares common ideas with the reduction of to ; this shows the importance of the input conditions for and hints to the numerous complications that arise in the proof. Without these conditions, we could end up with a trivial reduction to an -hard problem! Fortunately, we are able to show that our construction satisfies the input conditions of .
3.3 Towards a Natural and Universal Collision-Resistant Rash Family.
is a subclass of in which a collision always exists; it is not hard to show that variations of both and are -complete. We tweak the parameters of valid inputs in order to guarantee that a collision always exists. The -complete variation of , which we denote by , gives a function family which is a universal collision-resistant hash function family in a worst-case sense: if there is a function family that contains at least one function for which it is hard to find collisions, then our function family also includes a function for which it is hard to find collisions.
The great resemblance of with and its completeness for lead us to the first candidate for a universal collision-resistant hash function \mathcal{H}_{\mathsf{cSIS}}=\{h_{\boldsymbol{\underaccent{\bar}{s}}}:\{0,1\}^{k}\rightarrow\{0,1\}^{k^{\prime}}\}:
The key \boldsymbol{\underaccent{\bar}{s}} is a pair of matrices , where is binary invertible.
Because lattice problems have worst-to-average case reductions and our hash family is based on a lattice problem, this gives hope for showing that our construction is universal in the average-case sense.
3.4 Other Lattice Problems Known to be in 𝗣𝗣𝗣𝗣𝗣𝗣\boldsymbol{\mathsf{PPP}}.
3.5 Other Cryptographic Assumptions in 𝗣𝗣𝗣𝗣𝗣𝗣\boldsymbol{\mathsf{PPP}}.
4 Open questions.
Numerous new questions arise from our work and the connections we draw between , cryptography and lattices. We summarize here some of them.
Is there a worst-to-average case reduction from to itself? This result will provide the first natural, in the sense that does not invoke explicitly a Turing machine in the input, and universal collision resistant hash function family.
Is or -hard?
Is - in for ?
Is - -hard for ?
Is the discrete logarithm problem in for general elliptic curves?
Preliminaries
The bit decomposition function is extended to integer vectors and the result is the concatenation of the bit decomposition of each coordinate of the vector. Similarly, this is also extended to integer matrices. Then of course the bit composition function is no longer well defined because its output can be either a number or a vector of numbers, but for simplicity we still use the notation and it will be made clear from the context whether the output is a number or a vector of numbers. When is applied to a set the output is still a set with the bit decomposition of each element .
Circuits. We can now define a circuit with inputs and outputs as an ordered tuple of boolean circuits which defines a function , where \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}})=(\mathcal{C}_{1}(\boldsymbol{\underaccent{\bar}{x}}),\dots,\mathcal{C}_{m}(\boldsymbol{\underaccent{\bar}{x}})). The size of is equal to . It is known that (see [AB09]), where is the class of polynomial-sized circuits. Thus, any polynomial time procedure we describe, implies an equivalent circuit of polynomial size.
Search Problems. A search problem in is defined by a relation that on input of size and for every of size , is polynomial-time computable on . A solution to the search problem with input is a of size such that holds.
A search problem is total if for every input of size , there exists a of size such that holds. The class of total search problems in is called .
Karp Reductions Between Search Problems. A search problem is Karp-reducible to a search problem if there exist polynomial-time (in the input size of ) computable functions and such that if is an input of , then is an input of and if is any solution of with input then is a solution of .
Cook Reductions Between Search Problems. A search problem is Cook-reducible to a search problem if there exists a polynomial-time (in the input size of ) oracle Turing machine such that if is an input of , computes a such that is a solution of whenever all the oracle answers are solutions of . The set of all search problems that are Cook-reducible to problem is denoted by .
The weak Complexity Class. The class is the set of all search problems Karp-reducible to the following problem called .
2 Set Description Using Circuits
Characteristic Function. We say that a circuit with binary inputs and one output is a characteristic function representation of if \mathcal{CH}_{S}(\boldsymbol{\underaccent{\bar}{x}})=1 if and only if \boldsymbol{\underaccent{\bar}{x}}\in\mathsf{bd}(S).
a characteristic function representation of , where ,
a value function representation of , where and
an index function representation of , where .
In the next sections, the constructions of value and index functions for sets lie at the heart of our proofs for showing membership in , and we make frequent use of the above Lemma 2.1. Specifically, in Section 3, we see that the set of integer cosets of a lattice admits an index function that can be implemented with a polynomial sized circuit, and in Section 7 we see that any cyclic group admits a value function that can be implemented with a polynomial-size circuit. In the latter case, we demonstrate that an efficient implementation of an index function of a group shows that the Discrete Logarithm problem for this group belongs to the class .
3 Lattice Basics
The following lemma for -ary lattices is useful in our proofs. For a simple proof of this Lemma 2.3 we refer to Sections 2.3 and 2.4 of [AP11].
and ,
Determinant. The determinant of a lattice is the volume of the fundamental parallelepiped, or simply for full-rank lattices.
We now state a fundamental relation between , and .
For a proof of Proposition 2.4 see the proof of Lemma 3.2.
The concept of lattices was introduced by Hermann Minkowski in his influential book Geometrie der Zahlen [Min10], first published at 1896. In his book, Minkowski developed the theory of the geometry of numbers and resolved many difficult problems in number theory. His fundamental theorem, known as Minkowski’s Convex Body Theorem, was the main tool of these proofs.
Despite the excitement created by Minkowski’s groundbreaking work, it was only after 15 years that a new principle in geometry of numbers was discovered. The credit for this discovery goes to Hans Frederik Blichfeldt, who in 1914 published a paper [Bli14] with his new theorem and some very important applications in number theory These introductory paragraphs were inspired from Chapter 9 of [OLD01]..
In this section, we characterize the computational complexity of Blichfeldt’s existence theorem and in later sections we discuss its applications (see Section 6). We recall the statement of Blichfeldt’s theorem below, introduce its computational search version, and prove that it is -complete.
A proof of Theorem 3.1 can be found in Chapter 9 of [OLD01]. We now define the computational discrete version of Blichfeldt’s theorem.
is in .
Before defining , we note that numbers that satisfy case “0.” in the output of exist if and only if is not a valid value function of . Therefore, an output of case “0.” certifies that the input of is not a valid input.
Now we are ready to describe . Let , observe that
We use Lemma 2.1 to construct an index function . Finally, we define the index function of as .
If \mathcal{V}_{S}(\mathsf{bc}(\boldsymbol{\underaccent{\bar}{x}}))\not\in S, then \mathsf{bc}(\boldsymbol{\underaccent{\bar}{x}}) is a solution to our initial instance. Otherwise, let \boldsymbol{y}=\boldsymbol{\sigma}\left(\mathcal{V}_{S}(\boldsymbol{\underaccent{\bar}{x}})\right) and \boldsymbol{\underaccent{\bar}{y}}=\mathsf{bd}(\boldsymbol{y}). In this case, we have that \mathcal{I}_{R}(\boldsymbol{\underaccent{\bar}{y}})=\boldsymbol{\underaccent{\bar}{0}}, which implies that \mathcal{I}_{R_{\mathbf{D}}}\left(\boldsymbol{\pi}(\boldsymbol{y})\right)=\boldsymbol{\underaccent{\bar}{0}}. From the definition of in Lemma 2.1, we get that and hence . Finally, \boldsymbol{\sigma}\left(\mathcal{V}_{S}(\boldsymbol{\underaccent{\bar}{x}})\right)=\boldsymbol{0} implies \mathcal{V}_{S}(\boldsymbol{\underaccent{\bar}{x}})\in\boldsymbol{0}+\mathcal{L} and so \mathbf{x}=\mathcal{V}_{S}(\boldsymbol{\underaccent{\bar}{x}}) is a solution to our initial instance.
Using the same reasoning as in the previous case we conclude that either \mathsf{bc}(\boldsymbol{\underaccent{\bar}{x}}),\mathsf{bc}(\boldsymbol{\underaccent{\bar}{y}}) is a solution to our initial instance or there exists a such that \mathcal{V}_{S}(\boldsymbol{\underaccent{\bar}{x}}),\mathcal{V}_{S}(\boldsymbol{\underaccent{\bar}{y}})\in\mathbf{c}+\mathcal{L}(\mathbf{B}) and hence if \mathbf{x}=\mathcal{V}_{S}(\boldsymbol{\underaccent{\bar}{x}}), \mathbf{y}=\mathcal{V}_{S}(\boldsymbol{\underaccent{\bar}{y}}) then we have that and so is a solution to our initial instance.
We now proceed to show the -hardness of .
is -hard.
a single vector \mathbf{y}=\begin{bmatrix}\boldsymbol{\underaccent{\bar}{x}}\\ \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}})\end{bmatrix}\in S\cap\mathcal{L}(\mathbf{B}).
two vectors , such that and .
Finally, we argue that so that the problem does not output trivially. This follows directly from Lemma 2.3 since . ∎
Combining Lemma 3.2 and Lemma 3.4, we prove the following theorem.
is -complete.
Constrained Short Integer Solution is 𝗣𝗣𝗣𝗣𝗣𝗣\boldsymbol{\mathsf{PPP}}-Complete
In this section we define the first -complete problem that is natural, i.e. does not explicitly invoke any circuit as part of its input in contrast to the problem. We call this problem the constrained Short Integer Solution () problem because it shares a similar structure with the well-known and well-studied Short Integer Solution problem that was defined in the seminal work of Ajtai [Ajt96], and later studied in [Mic04, MR07, GPV08, MP12].
To expand on the complexity theoretic importance and potential of , we note the problem is contained in by its collision-resistance nature but it is unknown if it is also -hard. This poses a fascinating open question, since it implies a unique characterization of a concrete cryptographic assumption using a complexity class and vice versa. We view our result in this section (as well as in the next section) as a first step towards this direction, by showing that is -complete.
Similar to the presentation of the previous section; we first define , then prove its membership and finally show its -hardness. In all these steps, a matrix with a special structure, that we call binary invertible plays an important role, and thus we define it below and prove a key property that we use.
Next, we formalize the main property of binary invertible matrices that is in the core of our proof for the inclusion of in , and also explains the name “binary invertible”.
where . The fact that can be computed by a polynomial sized circuit, follows easily from (4.1).
The property of Proposition 4.2 is the only property of binary invertible matrices that we need for our proofs. We could potentially define binary invertible matrices in a more general way. For example, a binary invertible matrix could be a permutation of the columns of a matrix of Definition 4.1. Our results follow immediately for the more general class of matrices that satisfy the properties of Proposition 4.2. Let us denote by this set of matrices. We focus on the more restrictive case of Definition 4.1, not only for ease of the exposition, but also because given a matrix there is no known efficient procedure to check whether . In fact, this problem is -complete, since we can encode a instance in and reduce to checking whether . Alternatively, we could define a promise version of where , is promised to be in . However, this would deprive us from a syntactic definition of .
In this case, we define the vectors \mathbf{x}=\begin{bmatrix}\mathcal{C}_{1}(\boldsymbol{\underaccent{\bar}{x}})\\ \boldsymbol{\underaccent{\bar}{x}}\\ \boldsymbol{0}\end{bmatrix} and \mathbf{y}=\begin{bmatrix}\mathcal{C}_{1}(\boldsymbol{\underaccent{\bar}{y}})\\ \boldsymbol{\underaccent{\bar}{y}}\\ \boldsymbol{0}\end{bmatrix} such that
As explained in Section 2.1, it suffices to assume that the in-degree of every non-input node is two. Let be the index of the first, in the topological ordering, predecessor of the node and be the index of the second. Since nodes are indexed in topological ordering we have that . Every row of corresponds to a node , with , of , and contains the coefficients of the variables in the modular equation of a form that appears in Table 2, depending on the label of . We prove the correctness of these equations later in the text but it becomes also clear from the following Claim 4.6. The proof of Claim 4.6 goes through a simple enumeration of the different values for the boolean variables and can be found in Appendix A.1. The equation of node defines also the -th element of according to Table 2.
Let , then the following equivalences holds
So as we said, each column of corresponds to a variable, each row of corresponds to an equation as described above and is defined based on the label of each node of according to Table 2. The order of both the rows and the columns is specified by the topological sorting of . Specifically, the first row of describes the equation corresponding to node (the output node of ), the second row of describes the equation corresponding to node . In general, the -th row of describes the equation corresponding to node . We emphasize that, since there are no equations for the input nodes of , we have equations in total. The order of columns follows a similar rule, i.e. it corresponds to the reverse order of the topological ordering of . The first two columns correspond to variables , of node followed by pairs of rows corresponding to the variables of all non-input nodes of . Among the two columns of each node, the first corresponds to the auxiliary variable and the second to the value variable, unless the label of the node is . In the case, the first corresponds to the value variable and the second to the auxiliary variable. Finally, has columns at the end for the variables that correspond to the input nodes of . For the last columns, all the columns of the value variables precede the columns of the auxiliary variables. These rules completely define matrix (see Table 3 for an illustration).
Before defining the final matrix , we state and prove some basic properties of .
The matrix is binary invertible.
Proof of Claim 4.7. We remind that the dimensions of are . Because of the order of rows and columns of , and by the form of the equations of Table 2, we have that the vectors that appear in the diagonal of are equal to . The only other non-zero elements of the -th row of appear in the columns corresponding to and . But, by construction and are always before in the topological ordering of . Therefore, their corresponding columns are after the columns of . Hence, the only non-zero elements of are above its 3rd diagonal. This implies that has the form and as shown has non-zero elements only above the 3rd diagonal. This concludes the claim that is binary invertible.
Proof of Claim 4.8. We inductively prove that the value of the coordinate of is equal to the value of the non-input node in in the evaluation of \mathcal{C}_{i}(\boldsymbol{\underaccent{\bar}{x}}).
Induction Base. By the definition of \boldsymbol{\underaccent{\bar}{x}} we have that the coordinates are equal to the input values.
Inductive Hypothesis. Assume that for any such that the value of the coordinate of is equal to the value of the non-input node in in the evaluation of \mathcal{C}_{k}(\boldsymbol{\underaccent{\bar}{x}}).
But, from inductive hypothesis we know that and take the correct values of and in the evaluation of \mathcal{C}_{i}(\boldsymbol{\underaccent{\bar}{x}}). Hence, from Equation (4.2) we immediately get that takes the value of node . Similarly, we can show the inductive step for all other possible labels of .
For the statement that we just proved through Induction implies that s_{2}=\mathcal{C}_{i}(\boldsymbol{\underaccent{\bar}{x}}).
We are finally ready to describe our matrix and vector . We remind that and we define . Let . The matrix is of dimension . We remind that from Claim 4.7 matrices are binary invertible and hence let and the matrices that satisfy the equation
From the definition of and Claim 4.7 it is immediate that is binary invertible.
The matrix defined in Equation (4.4) is binary invertible.
Additionally, let , then the following claim is a simple corollary of Equations (4.4) and (4.5) and Claim 4.8.
The output of on input , is one of the following:
This completes the hardness proof for .
Combining Lemma 4.4 and Lemma 4.5, we prove the main theorem of this section.
The problem is -complete.
We provide an example of our reduction for very simple circuit in Figure 3.
Complete Collision Resistant Hash Function
The similarities between the problem and raise the question of whether has cryptographic applications. In this section, we propose a candidate family of collision-resistant hash (CRH) functions based on the average-case hardness of the problem, and also discuss its worst-case hardness. The computational problem associated with our collision resistant hash function family is a variant of presented in Section 4. In this case, the modular constraint are homogeneous, i.e. , and the inequality constraints on the dimension of the matrices are strict. This change in the relation of the parameters might seem insignificant, but it is actually very important since it transforms or problem into a purely lattice problem: On input matrices with corresponding bases and , where is binary invertible, find two vectors and such that and . Our proof that is -complete, increases our hope that is -complete too, since it overcomes one important difficulty towards reducing to . We start with a formal definition of collision-resistant hash function.
Collision-Resistant Hash Functions: Let be the security parameter. A family of functions
where and are polynomials, is collision-resistant if:
(Shrinking) The output of \text{H}_{\boldsymbol{\underaccent{\bar}{s}}} is smaller than its input. Namely, .
(Efficient Sampling) There exists a probabilistic polynomial-time algorithm that on input samples a uniform key \boldsymbol{\underaccent{\bar}{s}}.
(Efficient Evaluation) There exists a deterministic polynomial-time algorithm that on input a key \boldsymbol{\underaccent{\bar}{s}} and an \boldsymbol{\underaccent{\bar}{x}}\in\{0,1\}^{k} outputs \text{H}_{\boldsymbol{\underaccent{\bar}{s}}}(\boldsymbol{\underaccent{\bar}{x}})\in\{0,1\}^{p(k)}.
In the rest of this section, we analyze the hardness of finding collision in the family. We call this problem weak- () because of its similar nature to . First, we consider its worst-case hardness and draw connections between the family , a restricted version of the problem, and the complexity class . Then, we move on to the average-case hardness of and argue that it defines a canditate for a universal CRH function family.
The class Polynomial Weak Pigeon Principle (a subclass of ) is particularly interesting for cryptography because it contains all collision-resistant hash functions. In this part, we show that a generalized version of the problem we define below, namely , is complete for the class .
is in .
Now, we move to the more challenging hardness proof. Even though some of the proof techniques are reminiscent of the ones in Section 4, we need new ideas, mainly because of the homogeneity of the constraints in . Before presenting the proof, we define a special version of problem. Then, we state and prove an easy lemma. This lemma has appeared in various previous works (see Lemma 2.2 in [Jer16]), but it is useful for us to state and prove it using our notation.
Then, the following lemma is easy to check:
The problem is Karp-reducible to .
First, we claim that it suffices to show that is Karp-reducible to . This is because every circuit with inputs and outputs can be transformed into a circuit with inputs and outputs by padding the output with zeros. We note that every collision of is a collision of .
Now, we show that is Karp-reducible to . Let be a circuit with inputs and outputs, we create a new circuit with inputs and outputs such that \mathcal{C}^{\prime}(\boldsymbol{\underaccent{\bar}{x}},b)=\mathcal{C}(\mathcal{C}(\boldsymbol{\underaccent{\bar}{x}}),b), where \boldsymbol{\underaccent{\bar}{x}}\in\{0,1\}^{n} and . Then, with input outputs \boldsymbol{\underaccent{\bar}{y}}_{1}=(\boldsymbol{\underaccent{\bar}{x}}_{1},b_{1}) and \boldsymbol{\underaccent{\bar}{y}}_{2}=(\boldsymbol{\underaccent{\bar}{x}}_{2},b_{2}) such that \boldsymbol{\underaccent{\bar}{y}}_{1}\neq\boldsymbol{\underaccent{\bar}{y}}_{2} and \mathcal{C}^{\prime}(\boldsymbol{\underaccent{\bar}{y}}_{1})=\mathcal{C}^{\prime}(\boldsymbol{\underaccent{\bar}{y}}_{2}). We consider the following possible cases:
, then \boldsymbol{\underaccent{\bar}{y}}_{1} and \boldsymbol{\underaccent{\bar}{y}}_{2} form a collision for .
, then \boldsymbol{\underaccent{\bar}{x}}_{1}\neq\boldsymbol{\underaccent{\bar}{x}}_{2} and one of the following holds: \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}}_{1})\neq\mathcal{C}(\boldsymbol{\underaccent{\bar}{x}}_{2}) or \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}}_{1})=\mathcal{C}(\boldsymbol{\underaccent{\bar}{x}}_{2}). In the first case, \boldsymbol{\underaccent{\bar}{y}}_{1} and \boldsymbol{\underaccent{\bar}{y}}_{2} form a collision for . Otherwise, \boldsymbol{\underaccent{\bar}{x}}_{1} and \boldsymbol{\underaccent{\bar}{x}}_{2} form a collision for .
The above lemma naturally generalizes to any polynomial shrinkage of the input by repeating the same construction. For our purposes shrinking the input by two bits is enough.
is -hard.
From Lemma 5.3, it suffices to show a Karp-reduction from to .
First, we observe that the fact that guarantees the existence of at least pairs of \boldsymbol{\underaccent{\bar}{x}}_{1} and \boldsymbol{\underaccent{\bar}{x}}_{2} such that \boldsymbol{\underaccent{\bar}{x}}_{1}\neq\boldsymbol{\underaccent{\bar}{x}}_{2} and \text{H}_{\boldsymbol{\underaccent{\bar}{s}}}(\boldsymbol{\underaccent{\bar}{x}}_{1})=\text{H}_{\boldsymbol{\underaccent{\bar}{s}}}(\boldsymbol{\underaccent{\bar}{x}}_{2}). In particular, there exist collision pairs such that \boldsymbol{\underaccent{\bar}{x}}_{1}\neq\boldsymbol{\underaccent{\bar}{0}} and \boldsymbol{\underaccent{\bar}{x}}_{2}\neq\boldsymbol{\underaccent{\bar}{0}}.
As we explained, without loss of generality, we assume that consists of only gates. Before starting with the reduction we change our circuit to a circuit such that any collision of corresponds to a collision of . We construct a circuit such that it first computes the of all the inputs, . Then, in the circuit’s graph we substitute the outgoing edges of nodes with label “” with outgoing edges from the node , that we introduced. Hence, ’ only contains gates. The next claim follows by construction of .
For every \boldsymbol{\underaccent{\bar}{x}}\in\{0,1\}^{k}\setminus{\boldsymbol{\underaccent{\bar}{0}}}, it holds that \mathcal{C}^{\prime}(\boldsymbol{\underaccent{\bar}{x}})=\mathcal{C}(\boldsymbol{\underaccent{\bar}{x}}) and \mathcal{C}^{\prime}(\boldsymbol{\underaccent{\bar}{0}})=\boldsymbol{\underaccent{\bar}{0}}.
Let \boldsymbol{\underaccent{\bar}{x}}\in\{0,1\}^{k}\setminus{\boldsymbol{\underaccent{\bar}{0}}}, then for this input the value of the node that we introduced is certainly equal to . Since we replaced all the uses of the constant gate with the output of and outputs one, the results of the circuit will be exactly the same. Also, it is easy to observe that any circuit with only gates, on input \boldsymbol{\underaccent{\bar}{0}} outputs always \boldsymbol{\underaccent{\bar}{0}}, and hence \mathcal{C}^{\prime}(0)=\boldsymbol{\underaccent{\bar}{0}}. ∎
Let (\boldsymbol{\underaccent{\bar}{x}}_{1},\boldsymbol{\underaccent{\bar}{x}}_{2}) be a solution of with \boldsymbol{\underaccent{\bar}{x}}_{1}\neq\boldsymbol{\underaccent{\bar}{0}} and \boldsymbol{\underaccent{\bar}{x}}_{2}\neq\boldsymbol{\underaccent{\bar}{0}}, then the pair \boldsymbol{\underaccent{\bar}{x}}_{1} and \boldsymbol{\underaccent{\bar}{x}}_{2} is a collision for . But, we cannot guarantee that \boldsymbol{\underaccent{\bar}{x}}_{1}\neq\boldsymbol{\underaccent{\bar}{0}} and \boldsymbol{\underaccent{\bar}{x}}_{2}\neq\boldsymbol{\underaccent{\bar}{0}} when we use as input for . Thus, we need a modification of that guarantees that will not return the \boldsymbol{\underaccent{\bar}{0}}-vector as part of a solution. To achieve this, we construct a new circuit that is exactly the same as with one more output variable set to be equal to . This last output of is equal to if and only if \boldsymbol{\underaccent{\bar}{x}}\neq\boldsymbol{\underaccent{\bar}{0}}. Observe that the output of consists of bits, and hence still compresses its domain by one bit. Let be the matrices corresponding to according to the construction of proof of Lemma 4.5, then \boldsymbol{\underaccent{\bar}{s}}^{\prime}=(\mathbf{A}^{\prime},\mathbf{G}^{\prime}) is a valid input for .
We conclude the proof by observing that every output of with input \boldsymbol{\underaccent{\bar}{s}}^{\prime} gives a collision \mathcal{C}^{\prime\prime}(\boldsymbol{\underaccent{\bar}{x}})=\mathcal{C}^{\prime\prime}(\boldsymbol{\underaccent{\bar}{y}}), with \boldsymbol{\underaccent{\bar}{x}}\neq\boldsymbol{\underaccent{\bar}{y}}. If \boldsymbol{\underaccent{\bar}{x}}\neq\boldsymbol{\underaccent{\bar}{0}} and \boldsymbol{\underaccent{\bar}{y}}\neq\boldsymbol{\underaccent{\bar}{0}}, then it follows from Claim 5.6 and the construction of that \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}})=\mathcal{C}(\boldsymbol{\underaccent{\bar}{y}}), and hence the pair \boldsymbol{\underaccent{\bar}{x}}, \boldsymbol{\underaccent{\bar}{y}} is a solution to our initial instance. Additionally, there is no collision of the form \boldsymbol{\underaccent{\bar}{0}} and \boldsymbol{\underaccent{\bar}{x}}. Assume that there was, then since the last bit of \mathcal{C}^{\prime\prime}(\boldsymbol{\underaccent{\bar}{0}}) is , it must be that \mathcal{C}^{\prime\prime}(\boldsymbol{\underaccent{\bar}{x}}) is also . However, by construction of , \boldsymbol{\underaccent{\bar}{0}} is the unique binary vector for which the last bit of the output is , so it must be that \boldsymbol{\underaccent{\bar}{x}}=0 which is a contradiction. Therefore, \boldsymbol{\underaccent{\bar}{x}}\neq\boldsymbol{\underaccent{\bar}{0}} and \boldsymbol{\underaccent{\bar}{y}}\neq\boldsymbol{\underaccent{\bar}{0}}, and the lemma follows. ∎
By combining Lemma 5.2 and Lemma 5.4, we get the following theorem.
The problem is -complete.
In the previous section, we showed that defines a worst-case collision resistant hash function, similar to the result of [Sel92] for one-way functions. However, the definition of collision-resistance in cryptography requires a stronger property. More specifically, it says that given a random chosen function in the family, it is hard to find a collision for this function. In this section, we investigate the average-case hardness of in the search of a construction for a candidate of a both natural and universal collision-resistant hash function.
We have briefly mentioned the connection between and . Now, we describe it in more detail. We start by defining the problem.
Finally we note that, since is -complete and contains all collision-resistant hash functions, the following statement is also true.
If there exists a family of collision-resistant hash functions , then there exists a distribution over keys \boldsymbol{\underaccent{\bar}{s}}\in\{0,1\}^{p(k)} for , where draws key from this distribution and is collision-resistant.
Lattice Problems and 𝗣𝗣𝗣𝗣𝗣𝗣\boldsymbol{\mathsf{PPP}}
The lattice nature of the problem raises also the next question:
What is the connection between the class and lattices?
In this section we present the known result for the lattice problems that are contained in and we also mention how some of these results are implied by the completeness results that we presented in the previous sections. We start with the formal definition of the lattice problems that we are interested in.
Lattice Problems. We recall some of the most important lattice problems: the Shortest Vector Problem (), the Shortest Independent Vectors Problem () and the Closest Vector Problem (). We start by defining some important lattice quantities. For a lattice ,
where .
Where is a non-decreasing function in the lattice dimension . For we get the exact version of the problem.
We now show that well-studied (approximation) lattice problems are contained in . First, we define the problem and show a reduction to . Second, using known reductions, we conclude the membership of other lattice problems to .
For and , is in .
For any it holds that . This implies a Karp-reduction from to . Hence, it suffices to show a Karp-reduction from to .
To show that and are a valid input for , we need to show . This follows from the next claim for .
Finally, on input and will output one of following:
a vector such that . Since , we get . Hence, is a solution to .
is in .
Furthermore, there are known connections between with polynomial approximation factor and the class . Specifically, it is known that is Cook-reducible to (see [Rot16, Theorem 1.23]).
This, along with the reduction from to in [SD15] (for ) and Lemma 6.1, implies the following.
and are Cook-reducible to .
A special type of lattices, that have gained a lot of attention due to their efficiency in cryptographic applications, are ideal lattices. The definition and cryptographic applications of ideal lattices are outside the scope of this work and can be found in [LPR13]. But, we include the following lemma, which needs only the basic fact that in ideal lattices, where is the length of the -th linearly independent vector. Let us denote by the shortest vector problem on ideal lattices.
is in .
For ideal lattices, it holds that . Minkowski’s second Theorem states that
Hence, on ideal lattices . Combining this with the first Minkowski’s theorem, that states that , we get that with input solves of . ∎
Cryptographic Assumptions in 𝗣𝗣𝗣𝗣𝗣𝗣\boldsymbol{\mathsf{PPP}}
Additionally, recently it was shown that integer factorization is in [Jer16]. In fact, Jeřábek [Jer16] showed that integer factorization lies in , which gives strong evidence that factoring is not -complete.
We extend the connections between cryptographic assumptions and by presenting a more general formulation of discrete logarithm and showing its membership to . We call this formulation discrete logarithm over general groups.
the polynomial-size circuit for the group operation function .
a binary vector \boldsymbol{\underaccent{\bar}{x}} such that \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}})=\boldsymbol{\underaccent{\bar}{0}}.
two binary vectors \boldsymbol{\underaccent{\bar}{x}}, \boldsymbol{\underaccent{\bar}{y}} such that \boldsymbol{\underaccent{\bar}{x}}\neq\boldsymbol{\underaccent{\bar}{y}} and \mathcal{C}(\boldsymbol{\underaccent{\bar}{x}})=\mathcal{C}(\boldsymbol{\underaccent{\bar}{y}}).
We note that it is not known if elliptic curves are general groups, and hence discrete logarithm over general elliptic curves is not known to belong in (see Open Problem 1.5).
Acknowledgements
We thank the anonymous FOCS reviewers for their helpful comments. We thank an anonymous reviewer and Nico Döttling for bringing in our attention a universal CRH following Levin’s paradigm. We thank Vinod Vaikuntanathan, Daniel Wichs and Constantinos Daskalakis for helpful and enlightening discussions. We thank Christos-Alexandros Psomas and his coauthors for sharing their unpublished manuscript [BJP+15]. MZ also thanks Christos-Alexandros Psomas and Christos Papadimitriou for many fruitful discussions during his visit to Simons Institute at Berkeley at Fall 2015.
References
Appendix A Missing Proofs
We give a summary of the notation that we use for the parameters of and (or ).
Also, the known relations between these quantities for a valid or input are:
Appendix C Universal Collision Resistant Hash Function Family
We sketch the construction of a universal (average-case hard) hash function, following Levin’s paradigm [Lev87] for a universal one-way function. Let be a hash function family that takes two inputs a key and a vector , and compresses the input by one bit, i.e. . Let be a polynomial that bounds the running time of . First, using padding on the input we argue the existence of a hash function family that is defined as
such that . This implies that runs in quadratic time in (see [Gol06, §2.4.1]). Second, using standard domain extension we argue the existence of a hash function family that compresses the input for more than one bits, i.e. (we exclude the key from the description of the domain). Specifically, we require that the compressing ratio is enough so that the concatenation of copies of is smaller than the input, meaning that . Let be a polynomial that bounds the running time of . The hash function runs in time quadratic to its input, and using this fact, can be made explicit depending on the length of the extended domain that we require from , once that length is explicitly specified. This is enough to get an upper bound for the running time of .
The universal hash function is described by a collection of strings:
The numbers (represented as strings) are the indices to Turing Machines (assume a canonical ordering of TMs) that describe different hash function families , . The keys define the hash functions . Finally, for , we run each for at most steps and output:
If does not terminate after , we output for that . We can see that if at least one hash function family is collision-resistant, then so is . At a high level, if at least one hash function family is collision-resistant then so is , and moreover so is by the security of domain extension (e.g. Merkle–Damgård). Without loss of generality we assume that all families are defined on the same domain and range. Finally, the simple hash function combiner in which we concatenate the output of different hash functions , is collision-resistant as long as at least one hash function family is collision-resistant.