Universally Utility-Maximizing Privacy Mechanisms

Arpita Ghosh, Tim Roughgarden, Mukund Sundararajan

Introduction

Organizations including the census bureau, medical establishments, and Internet companies collect and publish statistical information . The census bureau may, for instance, publish the result of a query such as: “How many individuals have incomes that exceed 100,000?.Animplicithopeinthisapproachisthataggregateinformationissufficientlyanonymoussoasnottobreachtheprivacyofanyindividual.Unfortunately,publicationschemesinitiallythoughttobeprivatehavesuccumbedtoprivacyattacks,highlightingtheurgentneedformechanismsthatareprovablyprivate.Thedifferentialprivacyliteraturehasproposedarigorousandquantifiabledefinitionofprivacy,aswellasprovablyprivacypreservingmechanismsfordiverseapplicationsincludingstatisticalqueries,machinelearning,andpricing.Informally,for100,000?”. An implicit hope in this approach is that aggregate information is sufficiently anonymous so as not to breach the privacy of any individual. Unfortunately, publication schemes initially thought to be “private” have succumbed to privacy attacks , highlighting the urgent need for mechanisms that are provably private. The differential privacy literature has proposed a rigorous and quantifiable definition of privacy, as well as provably privacy-preserving mechanisms for diverse applications including statistical queries, machine learning, and pricing. Informally, for\alpha\in,arandomizedmechanismis, a randomized mechanism is\alphadifferentiallyprivateifchangingarowoftheunderlyingdatabasethedataofasingleindividualchangestheprobabilityofeverymechanismoutputbyatmostan-differentially private if changing a row of the underlying database—the data of a single individual—changes the probability of every mechanism output by at most an\alphafactor.Largervaluesoffactor. Larger values of\alphacorrespondtogreaterlevelsofprivacy.Differentialprivacyistypicallyachievedbyaddingnoisethatscaleswithcorrespond to greater levels of privacy. Differential privacy is typically achieved by adding noise that scales with\alpha.Whileitistriviallypossibletoachieveanylevelofdifferentialprivacy,forinstancebyalwaysreturningrandomnoise,thiscompletelydefeatstheoriginalpurposeofprovidingusefulinformation.Ontheotherhand,returningfullyaccurateresultscanleadtoprivacydisclosures.Thegoalofthispaperistoidentify,foreach. While it is trivially possible to achieve any level of differential privacy, for instance by always returning random noise, this completely defeats the original purpose of providing useful information. On the other hand, returning fully accurate results can lead to privacy disclosures . The goal of this paper is to identify, for each\alpha\in,theoptimal(i.e.,utilitymaximizing), the optimal (i.e., utility-maximizing)\alpha$-differentially private mechanism.

Model

We consider databases with nn rows drawn from a finite domain DD. Every row corresponds to an individual. Two databases are neighbors if they coincide in n1n-1 rows. A count query ff takes a database dDnd\in D^{n} as input and returns the result f(d)N={0,,n}f(d)\in N=\{0,\ldots,n\} that is the number of rows that satisfy a fixed, non-trivial predicate on the domain DD. Such queries are also called predicate or subset-sum queries; they have been extensively studied in their own right , and form a basic primitive from which more complex queries can be constructed .

A randomized mechanism with a (countable) range RR is a function xx from DnD^{n} to RR, where xdrx_{dr} is the probability of outputting the response rr when the underlying database is dd. For α\alpha\in, a mechanism xx is α\alpha-differentially private if the ratio xd1r/xd2rx_{{d_{1}}r}/x_{{d_{2}}r} lies in the interval [α,1/α][\alpha,1/\alpha] for every possible output rRr\in R and pair d1,d2d_{1},d_{2} of neighboring databases The usual definition of differential privacy requires this bound on the ratio for all subsets of the range. With a countable range, the two definitions are equivalent.. (We interpret 0/00/0 as 11.) Intuitively, the probability of every response of the privacy mechanism — and hence the probability of a successful privacy attack following an interaction with the mechanism — is, up to a controllable α\alpha factor, independent of whether a given user “opts in” or “opts out” of the database.

A mechanism is oblivious if, for all rRr\in R, xd1r=xd2rx_{d_{1}r}=x_{d_{2}r} whenever f(d1)=f(d2)f(d_{1})=f(d_{2}) — if the output distribution depends only on the query result. Most of this paper considers only oblivious mechanisms; for optimal privacy mechanism design, this is without loss of generality in a precise sense (see Section 6.2). The notation and definitions above simplify for oblivious mechanisms and count queries. We can specify an oblivious mechanism via the probabilities xirx_{ir} of outputting a response rRr\in R for each query result iNi\in N; α\alpha-differential privacy is then equivalent to the constraint that the ratios xir/x(i+1)rx_{ir}/x_{(i+1)r} lie in the interval [α,1/α][\alpha,1/\alpha] for every possible output rRr\in R and query result iN{n}i\in N\setminus\{n\}.

The α\alpha- geometric mechanism is defined as follows. When the true query result is f(d)f(d), the mechanism outputs f(d)+Zf(d)+Z. ZZ is a random variable distributed as a two-sided geometric distribution: Pr[Z=z]=1α1+ααzPr[Z=z]=\frac{1-\alpha}{1+\alpha}\alpha^{|z|} for every integer zz. This (oblivious) mechanism is α\alpha-differentially private because the probabilities of adjacent points in its range differ by an α\alpha factor and because the true answer to a count query differs by at most one on neighboring databases.

2 Utility Model

This paper pursues strong and general utility guarantees. Just as differential privacy guarantees protection against every potential attacker, independent of its side information, we seek mechanisms that guarantee near-optimal utility to every potential user, independent of its side information and preferences.

We now formally define preferences and side information. We model the preferences of a user via a loss function ll; l(i,r)l(i,r) denotes the user’s loss when the query result is ii and the mechanism’s (perturbed) output is rr. We allow ll to be arbitrary, subject only to being nonnegative, and nondecreasing in ir|i-r| for each fixed ii. For example, the loss function ir|i-r| measures mean error, the implicit measure of (dis)utility in most previous literature on differential privacy. Two among the many other natural possibilities are (ir)2(i-r)^{2}, which essentially measures variance of the error; and the binary loss function lbin(i,r)l_{bin}(i,r), defined as if i=ri=r and 11 otherwise.

We model the side information of a user as a prior probability distribution {pi}\{p_{i}\} over the query results iNi\in N. This prior represents the beliefs of the user, which might stem from other information sources, previous interactions with the mechanism, introspection, or common sense. We emphasize that we are not introducing priors to weaken the definition of differential privacy; we use the standard definition of differential privacy (which makes no assumptions about the side information of an attacker) and use a prior only to discuss the utility of a (differentially private) mechanism to a potential user.

Now consider a user with a prior {pi}\{p_{i}\} and loss function ll and an oblivious mechanism xx with range RR. For a given input dd with query result i=f(d)i=f(d), the user’s expected loss is rRxirl(i,r)\sum_{r\in R}x_{ir}\cdot l(i,r), where the expectation is over the coin flips internal to the mechanism. The user’s prior then yields a measure of the mechanism’s overall (dis)utility to the the user:

This is simply the expected loss over the coin tosses of the mechanism and the prior.The central theorem of choice theory (e.g. [15, Chapter 6]) states that every preference relation over mechanisms that satisfies reasonable axioms (encoding “rationality”) can be modeled via expected utility, just as we propose. In particular, this theorem justifies the use of priors for expressing a rational user’s trade-off over possible query results. We can then define the optimal α\alpha-differentially private mechanism for a user as one that minimizes the user-specific objective function (1).

3 User Post-Processing

Could a single mechanism be good simultaneously for all users? A crucial observation for an affirmative answer is that a user has the power to post-process the output of a privacy mechanism, and that such post-processing can decrease the user’s expected loss.

Fix a database size nn that is odd. Consider a user with the binary loss function lbinl_{bin} and prior p0=pn=1/2p_{0}=p_{n}=1/2, pj=0p_{j}=0 for all jN{0,n}j\in N\setminus\{0,n\}, that interacts with the α\alpha-geometric mechanism. Without post-processing, i.e., when the user accepts the mechanism’s outputs at face value, the user’s expected loss is (2α)/(1+α)(2\cdot\alpha)/(1+\alpha). If the user maps outputs of the geometric mechanism that are at least (n+1)/2(n+1)/2 to nn and all other outputs to , it effectively induces a new mechanism with the much smaller expected loss of α(n+1)/2/(1+α)\alpha^{(n+1)/2}/(1+\alpha).

In general, a (randomized) remap of a mechanism xx with range RR is a probabilistic function yy, with yrry_{r^{\prime}r} denoting the probability that the user reinterprets the mechanism’s response rRr^{\prime}\in R as the response rRr\in R. A mechanism xx and a remap yy together induce a new (α\alpha-differentially private) mechanism yxy\circ x with (yx)ir=rRxiryrr(y\circ x)_{ir}=\sum_{r^{\prime}\in R}x_{ir^{\prime}}\cdot y_{r^{\prime}r}.

We assume that a (rational) user with prior pp and loss function ll, interacting with a publicly known mechanism xx, employs a remap yy that induces the mechanism yxy\circ x that minimizes its expected loss (1) over all such remaps. It is well known (e.g. [15, Chapter 9]) and easy to see that, among all possible (randomized) remappings, the optimal one follows from applying Bayes rule and then minimizing expected loss. Precisely, for each response rr of xx, compute the posterior probability over query results: For every iRi\in R, p(ir)=(xirpi)/(iRxirpi)p(i|r)=(x_{ir}\cdot p_{i})/(\sum_{i^{\prime}\in R}x_{i^{\prime}r}\cdot p_{i^{\prime}}). Then, choose the query result iRi^{*}\in R that minimizes expected loss subject to the posterior and set yri=1y_{ri^{*}}=1 and yri=0y_{ri}=0 for iii\neq i^{*}. This remap is simple, deterministic, and can be computed efficiently.

Main Result and Discussion

Our main result is that for every α\alpha\in, the α\alpha-geometric mechanism is simultaneously optimal for every rational user.

Let xGx^{G} denote the α\alpha- geometric mechanism for some database size n1n\geq 1 and privacy level α\alpha\in, and let yuy^{u} denote an optimal remap of xGx^{G} for the user uu with prior pp and (monotone) loss function ll. Then yuxGy^{u}\circ x^{G} minimizes uu’s expected loss (1) over all oblivious, α\alpha-differentially private mechanisms with range NN.

This is an extremely strong utility-maximization guarantee: every potential user uu, no matter what its side information and preferences, derives as much utility from the geometric mechanism as it does from interacting with a differentially private mechanism MuM_{u} that is optimally tailored to uu. We reiterate that the prior from the utility model plays no role in the definition of privacy, which is the standard, worst-case (over adversaries with arbitrary side-information and intent) guarantee provided by differential privacy. We emphasize that while the geometric mechanism is user-independent (all users see the same distribution over responses), different users remap its responses in different ways, as informed by their individual prior distributions and loss functions. Rephrasing Theorem 3.1, for every user there is an optimal mechanism for it that factors into a user-independent part—the α\alpha-geometric mechanism—and a user-specific computation that can be delegated to the user. (See Figure 1.)

Theorem 3.1 shows how to achieve the same utility as a user-specific optimal mechanism without directly implementing one. Direct user-specific optimization would clearly involve several challenges. First, it would require advance knowledge or elicitation of user preferences, which we expect is impractical in most applications. And even if a mechanism was privy to the various preferences of its users, it would effectively need to answer the same query in different ways for different users, which in turn degrades its differential privacy guarantee.

In Theorem 3.1, the restriction to oblivious mechanisms is, in a precise sense, without loss of generality. (See Section 6.2.) The restriction to the range NN effectively requires that the mechanism output is a legitimate query result for some database; this type of property is called “consistency” in the literature (e.g. ).

Related Work

Differential privacy is motivated in part by the provable impossibility of absolute privacy against attackers with arbitrary side information . One interpretation of differential privacy is: no matter what prior distribution over databases a potential attacker has, its posterior after interacting with a differentially private mechanism is almost independent of whether a given user “opted in” or “opted out” of the database . Below we discuss the papers in the differential privacy literature closest to the present work; see for a recent, thorough survey of the state of the field.

Dinur and Nissim showed that for a database with nn rows, answering O(nlog2n)O(n\log^{2}n) randomly chosen subset count queries with o(n)o(\sqrt{n}) error allows an adversary to reconstruct most of the rows of the database (a blatant privacy breach); see Dwork et al. for a more robust impossibility result of the same type. Most of the differential privacy literature circumvents these impossibility results by focusing on interactive models where a mechanism supplies answers to only a sub-linear (in nn) number of queries. Count queries (e.g. ) and more general queries (e.g. ) have been studied from this perspective.

Blum et al. take a different approach by restricting attention to count queries that lie in a restricted class; they obtain non-interactive mechanisms that provide simultaneous good accuracy (in terms of worst-case error) for all count queries from a class with polynomial VC dimension. Kasiviswanathan et al. give further results for privately learning hypotheses from a given class.

The use of abstract “utility functions” in McSherry and Talwar has a similar flavor to our use of loss functions, though the motivations and goals of their work and ours are unrelated. Motivated by pricing problems, McSherry and Talwar design differentially private mechanisms for queries that can have very different values on neighboring databases (unlike count queries); they do not consider users with side information (i.e., priors) and do not formulate a notion of mechanism optimality (simultaneous or otherwise).

Finally, in recent and independent work, McSherry and Talwar (personal communication, October 2008) also apply linear programming theory in the analysis of privacy mechanisms. Again, their goal is different: they do not consider a general utility model, but instead ask how expected error must scale with the number of queries answered by a differentially private mechanism.

Proof of Main Result

This section proves Theorem 3.1. The proof has three high-level steps.

For a given user uu, we formulate the problem of determining the differentially private mechanism that minimizes expected loss as a solution to a linear program (LP). The objective function of this LP is user-specific, but the feasible region is not.

We identify several necessary conditions met by every privacy mechanism that is optimal for some user.

For every privacy mechanism that satisfies these conditions, we construct a remap yy such that yxG=xy\circ x^{G}=x. By assumption, a rational user employs an “optimal remap” of xGx^{G}, so the mechanism induced by this map must be optimal for the user uu.

Fix a database size nn and a privacy level α\alpha. Theorem 3.1 is trivially true for the degenerate cases of α=0,1\alpha=0,1. So, we assume that α(0,1)\alpha\in(0,1). For every fixed user with loss function ll and prior pp, the formulation of privacy constraints in Section 2.1 together with the objective function (1) yields the following LP whose solution is an optimal mechanism for this user.

Since the LP is bounded and feasible, we have the following (e.g. ).

Every user-specific LP has an optimal solution that is a vertex.

For the rest of this section, fix a user with prior {pi}\{p_{i}\} and a loss function l(i,r)l(i,r) that is monotone in ir|i-r| for every iNi\in N. Fix a mechanism xx that is optimal for this user, and also a vertex of the polytope of the user-specific LP. Vertices can be uniquely identified by the set of all constraints that are tight at the vertex. This motivates us to characterize the state of constraints (slack or tight) of mechanisms that are optimal for some user.

We now begin the second step of the proof. We will view xx as a (n+1)×(n+1)(n+1)\times(n+1)-matrix where rows correspond to query results (inputs) and columns correspond to query responses (outputs). We state our necessary conditions in terms of an n×(n+1)n\times(n+1) constraint matrix CC associated with the mechanism xx. Row ii of the constraint matrix corresponds to rows ii and i+1i+1 of the corresponding mechanism. Every entry of C(i,r)C(i,r), for iNni\in N\setminus{n}, rNr\in N takes on exactly one of four values. If xir=x(i+1)r=0x_{ir}=x_{(i+1)r}=0 then C(i,r)=ZC(i,r)=Z. If xir,x(i+1)rZx_{ir},x_{(i+1)r}\neq Z, then there are three possibilities. If αxir=x(i+1)r\alpha\cdot x_{ir}=x_{(i+1)r} then C(i,r)=C(i,r)=\downarrow. If xir=αx(i+1)rx_{ir}=\alpha\cdot x_{(i+1)r} then C(i,r)=C(i,r)=\uparrow. Otherwise C(i,r)=SC(i,r)=S.

Figure 2 shows an optimal, 1/21/2-differentially private mechanism for a specific user. Figure 3 lists the constraint matrix of this mechanism. This mechanism can be derived from the 1/21/2-geometric mechanism by mapping every negative number to , every number larger than 55 to 55, 11 to 22 and the other numbers to themselves.

The constraint matrix is well defined: since α<1\alpha<1, at most one of αxij=x(i+1)j\alpha\cdot x_{ij}=x_{(i+1)j} or xij=αx(i+1)jx_{ij}=\alpha\cdot x_{(i+1)j} holds (or else both are zero). Also, using that α>0\alpha>0, xij=0x_{ij}=0 implies that xkj=0x_{kj}=0 for all kk. Thus every column of CC is either all ZZ’s, which we then call a ZZ-column, or has no ZZ entries, which we then call a non-ZZ column.

By definition, the constraint matrix encodes the state of all the privacy constraints (3) and (4). The constraint matrix also implicitly encodes the state of the inequality constraints (6). First, xi,j>0x_{i,j}>0 if and only if the jjth column of CC is non-ZZ. Also, we can assume that xij<1x_{ij}<1 for all i,jNi,j\in N; otherwise, since α>0\alpha>0, xx has singleton support and the proof of the theorem is trivial. Since vertices are uniquely identified by the set of all the constraints that are tight at the vertex, we have the following.

Every mechanism that is a vertex of the polytope of the user-specific LP has a unique constraint matrix.

Let ss denote the number of SS entries of CC. Let sis_{i} denote the number of SS entries in the iith non-ZZ column. Let Si=kisiS_{i}=\sum_{k\leq i}s_{i}. Let zz denote the number of ZZ-columns of CC. The next few lemmas, Lemma 5.4, 5.5 and 5.6, use the fact that xx is an optimal mechanism to show that the rows of CC must exhibit certain structure. Corollary 5.7 and Lemma 5.8 additionally use that xx is a vertex to show that the total number of SS entries of CC is equal to the number of ZZ-columns. Unless otherwise stated, we ignore ZZ-columns of CC. The following lemma holds for every feasible mechanism.

No row of the constraint matrix is either all \downarrow’s or all \uparrow’s.

Suppose row ii of the constraint matrix is all \downarrow’s. So, for all rNr\in N, αxir=x(i+1)r\alpha\cdot x_{ir}=x_{(i+1)r} and because α<1\alpha<1, xir>x(i+1)rx_{ir}>x_{(i+1)r}. Since rxir=1\sum_{r}x_{ir}=1, rx(i+1)r<1\sum_{r}x_{(i+1)r}<1, contradicting the feasibility of the mechanism. The other case is analogous. ∎

The next, key lemma relies on the monotonicity of the loss function ll and the optimality of xx.

Every row of the constraint matrix CC has the following pattern: some \downarrow’s followed by at most one SS followed by some \uparrow’s.

Suppose not; then there exists a row index ii and column indices l<ml<m such that C(i,l)C(i,l)\neq\downarrow and C(i,m)C(i,m)\neq\uparrow. There are two cases: i(l+m)/2i\leq(l+m)/2 and i>(l+m)/2i>(l+m)/2. We prove the first case, and suggest modifications to the proof for the second. We now define a feasible mechanism xx^{\prime} with strictly smaller expected loss, contradicting the optimality of xx. We multiply the numbers ximx_{i^{\prime}m} for all iii^{\prime}\leq i by a 1δ1-\delta factor for some small δ>0\delta>0 and set xil=xil+δximx_{i^{\prime}l}=x_{i^{\prime}l}+\delta\cdot x_{i^{\prime}m}.

Because i(l+m)/2i\leq(l+m)/2, ilim|i^{\prime}-l|\leq|i^{\prime}-m|, and, so, for all iii^{\prime}\leq i the expected loss strictly decreases for strictly monotone loss functions and priors with support NN.If the prior does not have full support or if the loss function is not strictly monotone, we can reach the same conclusion using perturbations and a limiting argument. We now discuss feasibility xx^{\prime}. Notice that we modify only two columns of xx. The set of constraints (5) are preserved by feasibility of xx and the definition of xx^{\prime}. Because α>0\alpha>0, xim>0x_{i^{\prime}m}>0 for all iNi^{\prime}\in N and for sufficiently small δ\delta the set of constraints (6) continue to hold. For all j<ij<i, privacy constraints that involve the numbers {xjm}\{x_{jm}\} continue to hold as all we are doing is scaling. For all j<ij<i, privacy constraints that only involve the numbers in the sets {xjm}\{x_{jm}\} and {xjl}\{x_{jl}\} continue to hold since differential privacy is preserved by scaling and adding. Finally, the privacy constraints involving the numbers xim,x(i+1)m,xil,x(i+1)lx_{im},x_{(i+1)m},x_{il},x_{(i+1)l} are preserved for sufficiently small δ\delta since C(i,l)C(i,l)\neq\downarrow and C(i,m)C(i,m)\neq\uparrow.

For the second case, when i>(l+m)/2i>(l+m)/2, we interchange the roles of the columns ll and mm and modify rows ii+1i^{\prime}\geq i+1 rather than iii^{\prime}\leq i, i.e., we multiply xilx_{i^{\prime}l} for all ii+1i^{\prime}\geq i+1 by 1δ1-\delta for a small, positive δ\delta and set xim=xim+δxilx_{i^{\prime}m}=x_{i^{\prime}m}+\delta\cdot x_{i^{\prime}l}. The proof can be modified accordingly. ∎

The next lemma relates adjacent rows of CC; it uses the previous lemma combined with the fact that each row of xx is a valid probability distribution.

For all i0n2i\in 0\ldots n-2, row i+1i+1 of CC has at least one more \downarrow than row ii of CC; unless row i+1i+1 has a SS in which case row i+1i+1 has at least as many \downarrow’s as row ii.

Suppose to the contrary that the pair of rows ii,i+1i+1 violates the condition in the lemma statement, then we will show that row i+2i+2 of the mechanism violates the probability constraint (5), i.e. kx(i+2)k>1\sum_{k}x_{(i+2)k}>1.

Recall the pattern of a row ii of CC from Lemma 5.5. Suppose that row ii has \downarrow’s in exactly positions 0j10\ldots j-1. Let k<jxik=a\sum_{k<j}x_{ik}=a, xij=bx_{ij}=b, k>jxik=c\sum_{k>j}x_{ik}=c and x(i+1)j=bx_{(i+1)j}=b^{\prime}. Because row ii of xx must satisfy the probability constraint (5), we have

By assumption, C(i,k)=C(i,k)=\downarrow, for all k<jk<j. By Lemma 5.5, we have that for all kj+1nk\in j+1\ldots n, C(i,k)=C(i,k)=\uparrow and C(i,j)C(i,j) is either a SS or a \uparrow. So, k<jx(i+1)k=αa\sum_{k<j}x_{(i+1)k}=\alpha\cdot a, and k>jx(i+1)k=c/α\sum_{k>j}x_{(i+1)k}=c/\alpha. Because row i+1i+1 of xx must satisfy the probability constraint (5), we have

By assumption the rows i,i+1i,i+1 violate the lemma condition. So, by Lemma 5.5 and the definition of the index jj, we have C(i+1,k)=C(i+1,k)=\uparrow for all columns kjk\geq j. So, kjx(i+2)k=1/αkjx(i+1)k=b/α+c/α2\sum_{k\geq j}x_{(i+2)k}=1/\alpha\sum_{k\geq j}x_{(i+1)k}=b^{\prime}/\alpha+c/\alpha^{2}. Suppose that for each k1j1k\in 1\ldots j-1, x(i+2)kx_{(i+2)k} is as small as possible subject to the privacy constraint (so x(i+2)k=αx(i+1)kx_{(i+2)k}=\alpha\cdot x_{(i+1)k}); this is the worst case for our argument. Then, k<jx(i+2)k=αk<jx(i+1)k=α2a\sum_{k<j}x_{(i+2)k}=\alpha\sum_{k<j}x_{(i+1)k}=\alpha^{2}\cdot a. To complete the proof we show that α2a+b/α+c/α2\alpha^{2}a+b^{\prime}/\alpha+c/\alpha^{2} is strictly larger than 11 and so row i+2i+2 of xx violates the probability constraint.

Using Equations (7) and (8) to eliminate aa and cc, we can show that α2a+b/α+c/α2=1/α+α1+bbα\alpha^{2}a+b^{\prime}/\alpha+c/\alpha^{2}=1/\alpha+\alpha-1+b-b^{\prime}\alpha, which is at least 1/α+α11/\alpha+\alpha-1 because bbα0b-b^{\prime}\alpha\geq 0 (xx is α\alpha-differentially private). Simple algebra shows that 1/α+α1>11/\alpha+\alpha-1>1 whenever (α1)2>0(\alpha-1)^{2}>0, which holds since α<1\alpha<1. ∎

The next two lemmas relate the number of ZZ-columns of CC to the number of SS entries in it.

The number ss of SS entries of CC is at least the number zz of ZZ-columns of CC.

Suppose there are aa \downarrow’s and bb SS’s in the first row of the constraint matrix. From Lemma 5.4, the first row cannot consist entirely of \uparrow’s, i.e. a+b1a+b\geq 1. By Lemma 5.6, the number of \downarrow’s in the last row is at least a+(n1)(sb)a+(n-1)-(s-b). i.e. at least nsn-s. By Lemma 5.4, the number of \downarrow’s in the last row must be at most nzn-z. Chaining the two inequalities gives us the result. ∎

Unlike the previous lemmas, the next lemma uses the fact that xx is a vertex.

The number ss of SS entries of CC is equal to the number zz of ZZ-columns of CC.

By Corollary 5.7 all we need to show is that szs\leq z. Recall that xx is the solution to an LP with (n+1)2(n+1)^{2}-variables. Since xx is a vertex of the polytope of the user-specific LP, it must be at the intersection of at least (n+1)2(n+1)^{2} linearly independent constraints.

Let us account for them. The n+1n+1 constraints (5) are tight. Exactly z(n+1)z(n+1) constraints of the type specified by (6) are tight—these involve variables xijx_{ij}, for all iNi\in N, such that jj is an index of some ZZ-column of CC. Thus, of the remaining privacy constraints, at least (n+1)(nz)(n+1)(n-z) must be tight. Because (by definition of CC) every such tight constraint corresponds to a unique \downarrow or \uparrow entry, there are at least (n+1)(nz)(n+1)(n-z) such entries. Thus, of the n(n+1z)n(n+1-z) entries which are not ZZ’s, at most n(n+1z)(n+1)(nz)=zn(n+1-z)-(n+1)(n-z)=z of the entries are SS. ∎

Lemma 5.9 leverages the conditions on the rows of CC established by previous lemmas to establish conditions on the columns of CC via a counting argument. This completes the second step of the proof of Theorem 3.1.

For all i0nzi\in 0\ldots n-z, the iith non-ZZ column of CC has \uparrow’s in rows indexed 0i1+Si10\ldots i-1+S_{i-1}, SS’s in the sis_{i} positions that follow and \downarrow’s in the remaining positions.

We do induction on non-ZZ columns. The base case: The first column does not contain any \uparrow’s, because then by Lemma 5.5, an entire row contains only \uparrow’s, contradicting Lemma 5.4. By Lemma 5.6 and Lemma 5.5, a SS cannot follow a \downarrow and a \uparrow cannot follow a SS in the same column. Therefore, the first column contains s0s_{0} SS’s, followed by \downarrow’s.

For the inductive step, assume column i1i-1 has \uparrow’s and SS’s in rows 0i2+Si10\ldots i-2+S_{i-1} and so column ii has \uparrow’s in these positions by Lemma 5.5. We now show by contradiction that C(i1+Si1,i)=C(i-1+S_{i-1},i)=\uparrow; assume otherwise. Let the indicator variable II be 1 if C(i1+Si1,i)=C(i-1+S_{i-1},i)=\downarrow and 0 if C(i1+Si1,i)=SC(i-1+S_{i-1},i)=S. By the induction hypothesis, C(i1+Si1,i1)=C(i-1+S_{i-1},i-1)=\downarrow and so by Lemma 5.5, row i1+Si1i-1+S_{i-1} has at least i+Ii+I \downarrow’s (we index from ). By Lemma 5.5, every one of the rows from row i+Si1i+S_{i-1} to row n1n-1 must either contain an additional \downarrow or expend one of the sSi11+Is-S_{i-1}-1+I remaining SS’s. Thus row nn has at least i+Ii+I + nis+1In-i-s+1-I = n+1sn+1-s \downarrow’s. By Corollary 5.8, s=zs=z and so the last row of CC is all \downarrow’s, which contradicts Lemma 5.4.

We next show that C(i+Si1,i)C(i+S_{i-1},i)\neq\uparrow and so with an argument similar to the base case, rows i+Si1ni+S_{i-1}\ldots n of column ii consists of sis_{i} zeros, followed by \downarrow’s, concluding the proof of the induction step. Suppose C(i+Si1,i)=C(i+S_{i-1},i)=\uparrow. By the induction hypothesis, C(i1+Si1,i1)=C(i-1+S_{i-1},i-1)=\downarrow, thus row i1+Si1i-1+S_{i-1} contains at least ii \downarrow’s. By induction C(i+Si1,i1)=C(i+S_{i-1},i-1)=\downarrow. So by Lemma 5.5, row i+Si1i+S_{i-1} does not contain a SS and contains ii \downarrow’s; this contradicts Lemma 5.6. ∎

This concludes the second step of the proof. For the third step of the proof, we work with a finite-range version of the geometric mechanism xGx^{G}, called GG. GG is wxGw\circ x^{G}, where ww is a deterministic remap that maps all negative integers to zero, all integers larger than nn to nn, and all other integers to themselves. We show, constructively, that there exists a remap y^u\hat{y}^{u} such that x=y^uxGx=\hat{y}^{u}\circ x^{G}. This proves the theorem because the composed remap y^uw\hat{y}^{u}\circ w applied to xGx^{G} induces xx: x=(y^uw)xGx=(\hat{y}^{u}\circ w)\circ x^{G}.Incidentally, there do exist vertices of the polytope of the user-specific LP that are not derivable from GG by remapping. Fortunately, these vertices are not optimal for any user. Lemma 5.10 clarifies the structure of columns of mechanisms induced by deterministic remaps of GG. Lemma 5.11 combines Lemma 5.10 and Lemma 5.9 to prove Theorem 3.1.

Consider a deterministic map that maps the integer sequence aba\ldots b (and no others) to a fixed lNl\in N. Then, the llth column of the constraint matrix of the mechanism induced by applying this map to GG has \uparrow’s in rows 0,,a10,\ldots,a-1, SS’s in rows a,,b1a,\ldots,b-1, and \downarrow’s in the remaining rows.

Let the variables {gij}\{g_{ij}\} denote the entries of GG and let wiw_{i} be the entry of in row ii of column ll of the induced mechanism. By definition, wi=j=abgijw_{i}=\sum_{j=a}^{b}g_{ij}.

The jjth column of GG is a multiple of the column vector (αj,αj1,,α0,,αnj1,αnj)(\alpha^{j},\alpha^{j-1},\ldots,\alpha^{0},\ldots,\alpha^{n-j-1},\alpha^{n-j}). So, for i,ji,j, 0ia10\leq i\leq a-1, ajba\leq j\leq b, gij=αg(i+1)jg_{ij}=\alpha g_{(i+1)j}, and so, wi=αwi+1w_{i}=\alpha w_{i+1}. Similarly, for i,ji,j, binb\leq i\leq n, ajba\leq j\leq b gij=(1/α)gi+1jg_{ij}=(1/\alpha)g_{i+1j}, and so, wi=(1/α)wi+1w_{i}=(1/\alpha)w_{i+1}.

Finally, for every ii, aib1a\leq i\leq b-1, there exist j,jj,j^{\prime}, aj,jba\leq j,j^{\prime}\leq b, such that gij=αgi+1jg_{ij}=\alpha g_{i+1j} and gij=(1/α)gi+1jg_{ij^{\prime}}=(1/\alpha)g_{i+1j^{\prime}}. Also α<1\alpha<1. Therefore, for ii such that aib1a\leq i\leq b-1 we have that αwi<wi+1<(1/α)wi\alpha w_{i}<w_{i+1}<(1/\alpha)w_{i}. We have the proof by the definition of constraint matrices. ∎

The following lemma shows that there exists a remapping of GG that induces a mechanism with constraint matrix identical to CC, the constraint matrix corresponding to our fixed (vertex) optimal mechanism xx. By Observation 5.3, the induced mechanism must be xx and the theorem is proved.

There exists a remapping of GG such that the induced mechanism has a constraint matrix identical to CC.

Define the following (user-specific) deterministic map y^\hat{y}, which we apply to GG. Let kik_{i} be the index of the iith non-ZZ column of CC. Map the integers in i+Si1i+Sii+S_{i-1}\ldots i+S_{i} to kik_{i}. We check that y^\hat{y} is well defined: Si+1S_{i}+1 responses of GG are mapped to kik_{i}, and so, a total of n+1z+sn+1-z+s distinct responses are mapped. By Lemma 5.8, z=sz=s, so the map is well defined for every member of the range NN of GG.

No integer that is an index of a ZZ-column of CC is in the range of the map. So, the constraint matrix of the induced mechanism has the same set of ZZ-columns as CC. By Lemma 5.9, the iith non-ZZ column has \uparrow’s in rows 0,,i1+Si10,\ldots,i-1+S_{i-1}, SS’s in rows i+Si1,,i1+Sii+S_{i-1},\ldots,i-1+S_{i}, and \downarrow’s in the remaining rows. By Lemma 5.10 and the definition of y^\hat{y}, this is precisely the pattern of the iith non-ZZ column of the constraint matrix of the induced mechanism. ∎

Discussion

Are there other mechanisms for which an analog of Theorem 3.1 holds? An obvious candidate is the well-studied Laplace mechanism (release the result with noise added from a distribution with density function ϵ/2eϵt\epsilon/2\cdot e^{-\epsilon|t|}, where α=eϵ\alpha=e^{-\epsilon}), essentially a continuous version of the geometric mechanism. Here is an example where the Laplace mechanism compares unfavorably with the geometric mechanism. Fix a count query, a database size nn, and a user with loss function lerrl_{err} and prior (1/2,1/2)(1/2,1/2) over the two possible results {0,1}\{0,1\}. The geometric mechanism has (optimal) expected loss α/(1+α)\alpha/(1+\alpha) whereas the Laplace mechanism has expected loss α/2\sqrt{\alpha}/2. The approximation achieved by the Laplace mechanism tends to \infty as α\alpha approaches . Though somewhat pathological, this example rules out the Laplace mechanism as an answer to the above question.

So, is the α\alpha-geometric mechanism the unique mechanism for which an analog of Theorem 3.1 holds? No, because, the proofs from the previous section demonstrate that the range-restricted variant GG also satisfies the guarantee of the theorem. But a uniqueness result is possible: if we restrict attention to mechanisms with range NN, then GG is the unique simultaneously optimal mechanism, up to a permutation of the range.

Fix a database size n1n\geq 1 and a privacy level α\alpha\in. Suppose there is an α\alpha-differentially private mechanism xx with range NN such that there exists remap yuy^{u} for every user uu where yuxy^{u}\circ x minimizes the expected loss of user uu (Eq 1) over all oblivious, α\alpha-differentially private mechanisms with range NN. Then, there exists permutation remap pp such that px=xGp\circ x=x^{G}.

Consider a user with uniform prior over NN and the loss function lbinl_{bin}. We can solve the user-specific LP to show that GG is the unique optimal mechanism for the user, where the user is constrained to accept the mechanism’s responses at face value. Suppose there is a mechanism GG^{\prime} distinct from GG that is optimal for all users, and in particular for this user. As GG in the above sense, GG^{\prime} must induce GG on application of the user’s optimal remap. By assumption, the range of GG^{\prime} is NN, while GG is onto NN. Recall that optimal remaps are deterministic and so GG^{\prime} must also be onto NN and the optimal remap must be a permutation. ∎

2 Obliviousness

Recall that our main result (Theorem 3.1) compares the utility of the remapped α\alpha-geometric mechanism only to oblivious mechanisms. While natural mechanisms (such as the Laplace mechanism from ) are usually oblivious, we now justify this restriction from first principles. Suppose we deploy non-oblivious mechanisms. Measuring the expected utility of such a mechanism for a given user requires the user to have a prior over databases, and not merely over query results. If a user begins only with a prior over query results (arguably the most natural assumption if that’s all it cares about), the prior could be extended to one over databases in numerous ways. Singling out any one extension would be arbitrary, so we consider optimizing worst-case utility over all such extensions. Formally, for Lemma 6.2 below, we temporarily replace (1) with the objective function

In (9), pdpip_{d}\rightarrow p_{i} indicates that pdp_{d} is a prior over DnD^{n} that induces the user’s prior pip_{i} over NN, meaning pi=df(d)=ipdp_{i}=\sum_{d|f(d)=i}p_{d} for every iNi\in N. The following lemma then shows that the restriction to oblivious mechanisms is without of generality.

Fix a database size n1n\geq 1 and privacy level α\alpha. For every user with prior {pi}\{p_{i}\} and loss function ll, there is an α\alpha-differentially private mechanism that minimizes the objective function (9) and is also oblivious.

Fix a privacy level α\alpha, a database size nn and a user with prior {pt}\{p_{t}\} and monotone loss function ll. Let xx be a mechanism that minimizes (9) for this user. We define a mechanism xx^{\prime} that is oblivious, α\alpha-differentially private, and has at most as much expected loss as xx for the objective (9). For any database dDnd\in D^{n}, define E(d)E(d) as the set of databases with the same query result as dd. For a database dDnd\in D^{n} and response rNr\in N, let xdrx^{\prime}_{dr} be the average of xdrx_{d^{\prime}r} over the databases {ddE(d)}\{d^{\prime}|d^{\prime}\in E(d)\}; xx^{\prime} is oblivious by definition and also a valid mechanism, in that it specifies a distribution over responses for every underlying database.

We now show that xx^{\prime} is α\alpha-differentially private. Fix two databases d1,d2Dnd_{1},d_{2}\in D^{n} such that d1d_{1} and d2d_{2} differ in exactly one row; We need to show that αxd1rxd2r\alpha x^{\prime}_{d_{1}r}\leq x^{\prime}_{d_{2}r}. Assume f(d1)f(d2)f(d_{1})\neq f(d_{2}), otherwise the proof is trivial.

For any database of E(d1)E(d_{1}), we can generate all its neighbors (databases that differ in exactly one row) in E(d2)E(d_{2}) by enumerating all the ways in which we can change the query result by exactly 11. For instance when f(d1)=f(d2)+1f(d_{1})=f(d_{2})+1, pick one of the nf(d1)n-f(d_{1}) rows that satisfy the predicate in d1d_{1} and change its value to one of those that violates the predicate. This process is identical for all databases of E(d1)E(d_{1}), and so for all dE(d1)d\in E(d_{1}), the number of neighbors of dd that belong to the set E(d2)E(d_{2}) is the same (does not vary with dd). Similarly, for all dE(d2)d\in E(d_{2}), the number of neighbors of dd that belong to the set E(d1)E(d_{1}) is the same.

Consider the following set of inequalities that hold because xx is α\alpha-differentially private: dE(d1)d\in E(d_{1}), dE(d2)d^{\prime}\in E(d_{2}), where d1d_{1} and d2d_{2} are neighbors, αxdrxdr\alpha x_{dr}\leq x_{d^{\prime}r}. By the argument in the above paragraph, all the databases in E(d1)E(d_{1}) appears equally frequently in the left-hand-side of the above inequality and all the databases in E(d2)E(d_{2}) equally frequently in the right-hand-side. Summing the inequalities and recalling the definition of xx^{\prime} completes the proof of privacy.

We now show that xx^{\prime} has better worst case expected loss. Since xx^{\prime} is oblivious, the expected loss is the same for all prior distributions over databases that induces the prior {pi}\{p_{i}\} over results. By definition of xx^{\prime}, the expected loss is: iNpiavgdf(d)=irNxdrl(i,r)\sum_{i\in N}p_{i}\,avg_{d|f(d)=i}\sum_{r\in N}x_{dr}\cdot l(i,r). For the mechanism xx, we can construct an adversarial distribution over databases that induces {pi}\{p_{i}\}, such that for every iNi\in N, the weight pip_{i} is assigned entirely to a database dd that maximizes rxdrl(f(d),r)\sum_{r}x_{dr}\cdot l(f(d),r) over df(d)=i{d|f(d)=i}. Thus xx incurs at least as much expected loss as xx^{\prime}. ∎

An alternative is to model users as having a prior over databases. Though priors on databases induce priors on query results, the converse is not necessarily true. Formally, for Theorem 6.3 below, we use the following objective function.

We show that no analog of Theorem 3.1 is possible in this model.Proposition 6.2 implicitly shows that with the restriction to oblivious mechanisms, Theorem 3.1 would continue to hold in this model. However, Theorem 6.3 implies that obliviousness is no longer without loss of generality.

There exists a database size, level of privacy and two users, each with monotone loss functions and distinct priors over databases such that no mechanism is simultaneously optimal for both of them.

We now prove Theorem 6.3. Suppose that the domain DD is {0,1}\{0,1\} and we draw databases from D3D^{3}. Fix a count query that counts the number of rows that are 11. Fix a privacy level α=1/2\alpha=1/2. We label databases by the set of rows that have values 11; in this notation, the result of the query is the cardinality of the set. The first user’s prior over databases is 14\frac{1}{4} on {1}\{1\} and {2}\{2\}, 12ϵ\frac{1}{2}-\epsilon on {2,3}\{2,3\}, ϵ\epsilon on {1,3}\{1,3\} and on all others. The second user’s prior is defined by interchanging the roles of rows 11 and 22 in this definition. Both users have the monotone loss function l(i,r)=ir1δl(i,r)=|i-r|^{1-\delta}. There exist small, positive ϵ\epsilon and δ\delta, for which the unique (non-oblivious) optimal mechanism (where the user accepts results at face value) for the first user can be shown to be the mechanism x1x_{1} defined in Figure 4. (Only the rows that referred to by our proof are specified.) Similarly, the unique optimal mechanism for the second user can be derived by interchanging the roles of the rows 11 and 22.

We will show that there is no 1/21/2-differentially private mechanism xx that implements x1x_{1} and x2x_{2} simultaneously. Assume to the contrary: Let y1y_{1} and y2y_{2} be deterministic maps that when applied to xx induce x1x_{1} and x2x_{2} respectively. (We skip the easy modification to the proof that allows y1y_{1} and y2y_{2} to be randomized.) We first show that xx, y1y_{1} and y2y_{2} have the following form:

Without any loss of generality, xx has a range R={l,m,n,o}R=\{l,m,n,o\} of size 44, y1y_{1} maps ll and nn to 11 and mm and oo to 22 and y2y_{2} maps ll and oo to 11 and mm and nn to 22.

Both x1x_{1} and x2x_{2} have range size 22. Label each member of the range of xx in one of four ways based on where it is mapped by y1y_{1} (11 or 22) and y2y_{2} (11 or 22). All the range points with the same label may be combined into one range point. For instance oo consists of all the range points mapped to 22 by y1y_{1} and 11 by y2y_{2}. The proof is complete. ∎

Proof of Theorem 6.3: By the previous lemma and the definitions of x1x_{1} and x2x_{2}, the mechanism xx must have the form in Figure 5. Because xx is 1/21/2-differentially private and as {1}\{1\} and {2}\{2\} differ in exactly two rows, columns nn and oo yield the following inequalities: 3+δ14δ23+\delta_{1}\leq 4\delta_{2}, and 3+δ24δ13+\delta_{2}\leq 4\delta_{1}. Adding the two inequalities, we have: 63(δ1+δ2)6\leq 3(\delta_{1}+\delta_{2}). Because all the entries of xx are probabilities, we have 0δ1,δ210\leq\delta_{1},\delta_{2}\leq 1. So it must be that δ1=δ2=1\delta_{1}=\delta_{2}=1. By a similar argument applied to the databases {3\{3} and {4}\{4\}, we can show that δ3=δ4=1\delta_{3}=\delta_{4}=1.

Thus, the probability masses on ll when the underlying databases are {1}\{1\} and {1,3}\{1,3\} are 7/127/12 and 1/61/6 respectively. But this violates privacy because, the two databases differ in exactly one row but the two probabilities are not within a factor 22 of each other. ∎

Future Directions

We proposed a model of user utility, where users are parametrized by a prior (modeling side information) and a loss function (modeling preferences). Theorem 3.1 shows that for every fixed count query, database size, and level of privacy, there is a single simple mechanism that is simultaneously optimal for all rational users. Are analogous results possible for other definitions of privacy, such as the additive variant of differential privacy (see )? Is an analogous result possible for other types of queries or for multiple queries at once? When users have priors over databases (Theorem 6.3), are any positive results (such as simultaneous approximation) achievable via a single mechanism?

Acknowledgments

We thank Preston McAfee, John C. Mitchell, Rajeev Motwani, David Pennock and the anonymous referees.

References