The Complexity of Computing the Optimal Composition of Differential Privacy

Jack Murtagh, Salil Vadhan

Introduction

Differential privacy is a framework that allows statistical analysis of private databases while minimizing the risks to individuals in the databases. The idea is that an individual should be relatively unaffected whether he or she decides to join or opt out of a research dataset. More specifically, the probability distribution of outputs of a statistical analysis of a database should be nearly identical to the distribution of outputs on the same database with a single person’s data removed. Here the probability space is over the coin flips of the randomized differentially private algorithm that handles the queries. To formalize this, we call two databases $D_{0},D_{1}$ with $n$ rows each neighboring if they are identical on at least $n-1$ rows, and define differential privacy as follows:

A randomized algorithm $M$ is $(\epsilon,\delta)$-differentially private for $\epsilon\geq 0$, $\delta\in[0,1)$ if for all pairs of neighboring databases $D_{0},D_{1}$ and all output sets $S\subseteq\mathrm{Range}(M)$,

$$\Pr[M(D_{0})\in S]\leq e^{\epsilon}\cdot\Pr[M(D_{1})\in S]+\delta,$$

where the probabilities are over the coin flips of the algorithm $M$.

In the practice of differential privacy, we generally think of $\epsilon$ as a small, non-negligible constant (e.g. $\epsilon=0.1$). We view $\delta$ as a “security parameter” that is cryptographically small (e.g. $\delta=2^{-30}$). One of the important properties of differential privacy is that if we run multiple distinct differentially private algorithms on the same database, the resulting composed algorithm is also differentially private, albeit with some degradation in the privacy parameters $(\epsilon,\delta)$. In this paper, we are interested in quantifying the degradation of privacy under composition. We will denote the composition of $k$ differentially private algorithms $M_{1},M_{2},\ldots,M_{k}$ as $(M_{1},M_{2},\ldots,M_{k})$ where

$$(M_{1},M_{2},\ldots,M_{k})(x)=(M_{1}(x),M_{2}(x),\ldots,M_{k}(x)).$$

A handful of composition theorems already exist in the literature. The first basic result says:

For every $\epsilon\geq 0$, $\delta\in[0,1]$, and $(\epsilon,\delta)$-differentially private algorithms $M_{1},M_{2},\ldots,M_{k}$, the composition $(M_{1},M_{2},\ldots,M_{k})$ satisfies $(k\epsilon,k\delta)$-differential privacy.

This tells us that under composition, the privacy parameters of the individual algorithms “sum up,” so to speak. We care about understanding composition because in practice we rarely want to release only a single statistic about a dataset. Releasing many statistics may require running multiple differentially private algorithms on the same database. Composition is also a very useful tool in algorithm design. Often, new differentially private algorithms are created by combining several simpler algorithms. Composition theorems help us analyze the privacy properties of algorithms designed in this way.

Theorem 1.2 shows that global privacy degrades linearly in the number $k$ of algorithms in the composition, and it is of interest to improve on this bound. If we can prove that privacy degrades more slowly under composition, we can get more utility out of our algorithms under the same global privacy guarantees. Dwork, Rothblum, and Vadhan gave the following improvement on the basic summing composition above [DRV10].

Theorem 1.3 shows that privacy under composition degrades by a function of $O(\sqrt{k\ln(1/\delta^{\prime})})$ which is an improvement if $\delta^{\prime}=2^{-O(k)}$ . It can be shown that a degradation function of $\Omega(\sqrt{k\ln(1/\delta)})$ is necessary even for the simplest differentially private algorithms, such as randomized response [War65].
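To make the comparison between the two bounds concrete, the following Python sketch evaluates both. It assumes the advanced composition bound in the form in which it is usually stated for [DRV10], namely $\epsilon_{g}=\sqrt{2k\ln(1/\delta^{\prime})}\cdot\epsilon+k\epsilon(e^{\epsilon}-1)$ and $\delta_{g}=k\delta+\delta^{\prime}$; the function names are ours and purely illustrative.

```python
import math


def basic_composition(eps, delta, k):
    """Basic composition: k algorithms that are each (eps, delta)-DP
    compose to (k * eps, k * delta)-DP."""
    return k * eps, k * delta


def advanced_composition(eps, delta, k, delta_prime):
    """Advanced composition bound, in the form assumed in the lead-in.

    Returns (eps_g, delta_g) with
      eps_g   = sqrt(2 k ln(1/delta')) * eps + k * eps * (e^eps - 1)
      delta_g = k * delta + delta'
    """
    if not 0.0 < delta_prime < 1.0:
        raise ValueError("delta_prime must lie in (0, 1)")
    eps_g = (math.sqrt(2.0 * k * math.log(1.0 / delta_prime)) * eps
             + k * eps * (math.exp(eps) - 1.0))
    return eps_g, k * delta + delta_prime
```

Under this assumed form, the advanced bound only beats the basic one once $k$ is large relative to $\ln(1/\delta^{\prime})$; for a single algorithm it is strictly worse than $\epsilon$ itself, which is one reason an exact characterization is of practical interest.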

Despite giving an asymptotically correct upper bound for the global privacy parameter, $\epsilon_{g}$ , Theorem 1.3 is not exact. We want an exact characterization because, beyond being theoretically interesting, constant factors in composition theorems can make a substantial difference in the practice of differential privacy. Furthermore, Theorem 1.3 only applies to “homogeneous” composition where each individual algorithm has the same pair of privacy parameters, $(\epsilon,\delta)$ . In practice we often want to analyze the more general case where some individual algorithms in the composition may offer more or less privacy than others. That is, given algorithms $M_{1},M_{2},\ldots,M_{k}$ , we want to compute the best achievable privacy parameters for $(M_{1},M_{2},\ldots,M_{k})$ . Formally, we want to compute the function:

It is convenient for us to view $\delta_{g}$ as given and then compute the best $\epsilon_{g}$ , but the dual formulation, viewing $\epsilon_{g}$ as given, is equivalent (by binary search). Actually, we want a function that depends only on the privacy parameters of the individual algorithms:

Empirically (see Appendix A), this optimal bound provides a 30-40 $\%$ savings in $\epsilon_{g}$ compared to Theorem 1.3 (and a $20\%$ savings compared to an improved asymptotic bound from [KOV15]). The problem remains to find the optimal composition behavior for the more general heterogeneous case. Kairouz, Oh, and Viswanath also provide an upper bound for heterogeneous composition that generalizes the $O(\sqrt{k\ln(1/\delta^{\prime})})$ degradation found in Theorem 1.3 for homogeneous composition but do not comment on how close it is to optimal.

We begin by extending the results of Kairouz, Oh, and Viswanath [KOV15] to the general heterogeneous case.
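The brute-force evaluation of such a characterization can be sketched in a few lines of Python. The sketch assumes that the optimal $\epsilon_{g}$ is the least value $\epsilon_{g}\geq 0$ satisfying $\frac{1}{\prod_{i}(1+e^{\epsilon_{i}})}\sum_{S\subseteq\{1,\ldots,k\}}\max\{e^{\sum_{i\in S}\epsilon_{i}}-e^{\epsilon_{g}}\cdot e^{\sum_{i\not\in S}\epsilon_{i}},0\}\leq 1-\frac{1-\delta_{g}}{\prod_{i}(1-\delta_{i})}$; the function names and the tolerance parameter are hypothetical.

```python
import math
from itertools import combinations


def composition_lhs(eps, eps_g):
    """Left-hand side of the assumed characterization, summed over all 2^k subsets."""
    k = len(eps)
    total = sum(eps)
    acc = 0.0
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            inside = sum(eps[i] for i in subset)
            # e^{sum over S} - e^{eps_g} * e^{sum over complement of S}
            acc += max(math.exp(inside) - math.exp(eps_g + total - inside), 0.0)
    return acc / math.prod(1.0 + math.exp(e) for e in eps)


def opt_comp_bruteforce(eps, deltas, delta_g, tol=1e-9):
    """Least eps_g >= 0 satisfying the inequality, found by bisection.

    Returns math.inf when delta_g is too small for any eps_g to work.
    Runs in time exponential in k, so it is only usable for small k.
    """
    rhs = 1.0 - (1.0 - delta_g) / math.prod(1.0 - d for d in deltas)
    if rhs < 0.0:
        return math.inf
    lo, hi = 0.0, sum(eps)  # the sum of the eps_i always makes the left side zero
    if composition_lhs(eps, lo) <= rhs:
        return 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if composition_lhs(eps, mid) <= rhs:
            hi = mid
        else:
            lo = mid
    return hi
```

The bisection relies on the left-hand side being non-increasing in $\epsilon_{g}$. Each evaluation touches all $2^{k}$ subsets, which is exactly the exponential cost discussed next.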

Theorem 1.5 exactly characterizes the optimal composition behavior for any arbitrary set of differentially private algorithms. It also shows that optimal composition can be computed in time exponential in $k$ by computing the sum over $S\subseteq\{1,\ldots,k\}$ by brute force. Of course, in practice an exponential-time algorithm is not satisfactory for large $k$. We show that this exponential complexity is likely necessary: computing OptComp is $\#P$-complete, as we prove below. Nevertheless, our next result shows that OptComp can be approximated efficiently:

There is a polynomial-time algorithm that, given rational $\epsilon_{1},\ldots,\epsilon_{k}\geq 0$, $\delta_{1},\ldots,\delta_{k},\delta_{g}\in[0,1)$, and $\eta\in(0,1)$, outputs $\epsilon^{*}$ satisfying

where $\overline{\epsilon}=\sum_{i\in[k]}\epsilon_{i}/k$ , assuming constant-time arithmetic operations.

Note that we incur a relative error of $\eta$ in approximating $\delta_{g}$ and an additive error of $\eta$ in approximating $\epsilon_{g}$ . Since we always take $\epsilon_{g}$ to be non-negligible or even constant, we get a very good approximation when $\eta$ is polynomially small or even a constant. Thus, it is acceptable that the running time is polynomial in $1/\eta$ .

In addition to the results listed above, our proof of Theorem 1.5 also provides a somewhat simpler proof of the Kairouz-Oh-Viswanath homogeneous composition theorem (Theorem 1.4 [KOV15]). The proof in [KOV15] introduces a view of differential privacy through the lens of hypothesis testing and uses geometric arguments. Our proof relies only on elementary techniques commonly found in the differential privacy literature.

The theoretical results presented here were motivated by our work on an applied project called “Privacy Tools for Sharing Research Data” (privacytools.seas.harvard.edu). We are building a system that will allow researchers with sensitive datasets to make differentially private statistics about their data available through data repositories using the Dataverse (dataverse.org) platform [Cro11, Kin07]. Part of this system is a tool that helps both data depositors and data analysts distribute a global privacy budget across many statistics. Users select which statistics they would like to compute and are given estimates of how accurately each statistic can be computed. They can also redistribute their privacy budget according to which statistics they think are most valuable in their dataset. We implemented the approximation algorithm from Theorem 1.7 and integrated it with this tool to ensure that users get the most utility out of their privacy budget.

Technical Preliminaries

A useful notation for thinking about differential privacy is defined below.

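For two discrete random variables $Y$ and $Z$ taking values in the same output space $S$, the $\delta$-approximate max-divergence of $Y$ and $Z$ is defined as:

$$D_{\infty}^{\delta}(Y\|Z)\equiv\max_{T\subseteq S:\,\Pr[Y\in T]>\delta}\ln\frac{\Pr[Y\in T]-\delta}{\Pr[Z\in T]}.$$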

Notice that an algorithm $M$ is $(\epsilon,\delta)$-differentially private if and only if for all pairs of neighboring databases $D_{0},D_{1}$, we have $D_{\infty}^{\delta}(M(D_{0})\|M(D_{1}))\leq\epsilon$. The standard fact that differential privacy is closed under “post-processing” [DMNS06, DR13] can now be formulated as:

If $f\colon S\to R$ is any randomized function, then

$$D_{\infty}^{\delta}(f(Y)\|f(Z))\leq D_{\infty}^{\delta}(Y\|Z).$$
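For small output spaces the approximate max-divergence can be evaluated directly by enumerating events. The Python sketch below assumes the usual definition, $D_{\infty}^{\delta}(Y\|Z)=\max_{T:\Pr[Y\in T]>\delta}\ln\big((\Pr[Y\in T]-\delta)/\Pr[Z\in T]\big)$, and represents each distribution as a dictionary from outcomes to probabilities; the function name is ours.

```python
import math
from itertools import combinations


def approx_max_divergence(p, q, delta):
    """delta-approximate max-divergence of p from q by brute force.

    p and q map outcomes to probabilities. Every non-empty event T is
    enumerated, so this is only meant for very small output spaces.
    Returns math.inf if some event has p-mass above delta and q-mass zero,
    and -math.inf if no event has p-mass above delta.
    """
    outcomes = list(dict.fromkeys(list(p) + list(q)))
    best = -math.inf
    for size in range(1, len(outcomes) + 1):
        for event in combinations(outcomes, size):
            p_mass = sum(p.get(o, 0.0) for o in event)
            q_mass = sum(q.get(o, 0.0) for o in event)
            if p_mass <= delta:
                continue  # the maximum only ranges over events with Pr[Y in T] > delta
            if q_mass == 0.0:
                return math.inf
            best = max(best, math.log((p_mass - delta) / q_mass))
    return best
```

For example, the two output distributions of randomized response with keep-probability $3/4$ have $0$-approximate max-divergence $\ln 3$, matching its $(\ln 3,0)$-differential privacy.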

The composition results in our paper actually hold for a more general model of composition than the one described in the introduction. The model is called $k$ -fold adaptive composition and was formalized in [DRV10]. We generalize their formulation to the heterogeneous setting where privacy parameters may differ across different algorithms in the composition.

The idea is that instead of running $k$ differentially private algorithms chosen all at once on a single database, we can imagine an adversary adaptively engaging in a “composition game.” The game takes as input a bit $b\in\{0,1\}$ and privacy parameters $(\epsilon_{1},\delta_{1}),\ldots,(\epsilon_{k},\delta_{k})$. A randomized adversary $A$ tries to learn $b$ through $k$ rounds of interaction as follows: on the $i$th round of the game, $A$ chooses an $(\epsilon_{i},\delta_{i})$-differentially private algorithm $M_{i}$ and two neighboring databases $D_{(i,0)},D_{(i,1)}$. $A$ then receives an output $y_{i}=M_{i}(D_{(i,b)})$, where the internal randomness of $M_{i}$ is independent of the internal randomness of $M_{1},\ldots,M_{i-1}$. The choices of $M_{i},D_{(i,0)},$ and $D_{(i,1)}$ may depend on $y_{1},\ldots,y_{i-1}$ as well as the adversary’s own randomness.

The outcome of this game is called the view of the adversary, $V^{b}$, which is defined to be $(y_{1},\ldots,y_{k})$ along with $A$’s coin tosses. The algorithms $M_{i}$ and databases $D_{(i,0)},D_{(i,1)}$ from each round can be reconstructed from $V^{b}$. Now we can formally define privacy guarantees under $k$-fold adaptive composition.

We say that the sequences of privacy parameters $\epsilon_{1},\ldots,\epsilon_{k}\geq 0,\delta_{1},\ldots,\delta_{k}\in[0,1)$ satisfy $(\epsilon_{g},\delta_{g})$ -differential privacy under adaptive composition if for every adversary $A$ we have $D_{\infty}^{\delta_{g}}(V^{0}\|V^{1})\leq\epsilon_{g}$ , where $V^{b}$ represents the view of $A$ in composition game $b$ with privacy parameter inputs $(\epsilon_{1},\delta_{1}),\ldots,(\epsilon_{k},\delta_{k})$ .
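The composition game can be sketched as a short simulation. Everything below is illustrative: the adversary interface, the use of one-bit databases, and the choice of randomized response as the mechanism are assumptions made for the example, and the returned tuple contains only the outputs $(y_{1},\ldots,y_{k})$ (the adversary's coin tosses, which are also part of the view, are held in `adversary_rng`).

```python
import math
import random


def randomized_response(eps):
    """An (eps, 0)-DP mechanism on one-bit databases (illustrative choice)."""
    keep = math.exp(eps) / (1.0 + math.exp(eps))

    def mechanism(database_bit, rng):
        # Report the true bit with probability e^eps / (1 + e^eps).
        return database_bit if rng.random() < keep else 1 - database_bit

    return mechanism


def toy_adversary(eps_i, delta_i, previous_outputs, adversary_rng):
    """Toy adversary: ignores the history and delta_i, and always plays
    randomized response on the neighboring one-bit databases 0 and 1."""
    return randomized_response(eps_i), 0, 1


def composition_game(b, adversary, params, adversary_rng, mechanism_rng):
    """Play k rounds; params is a list of (eps_i, delta_i) pairs.

    In round i the adversary sees the earlier outputs and picks a mechanism
    and two neighboring databases; it then receives the mechanism's output
    on the database indexed by the hidden bit b.
    """
    outputs = []
    for eps_i, delta_i in params:
        mechanism, d0, d1 = adversary(eps_i, delta_i, tuple(outputs), adversary_rng)
        outputs.append(mechanism(d1 if b == 1 else d0, mechanism_rng))
    return tuple(outputs)
```

A genuinely adaptive adversary would use `previous_outputs` to choose later mechanisms and databases; the definition above quantifies over all such strategies.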

Computing real-valued functions.

where $\epsilon_{g}$ is the true optimal parameter with full precision.

Characterization of OptComp

Following [KOV15], we show that to analyze the composition of arbitrary $(\epsilon_{i},\delta_{i})$ -DP algorithms, it suffices to analyze the composition of the following simple variant of randomized response [War65].
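As a concrete illustration, the following Python sketch builds such a mechanism and checks its privacy numerically. It assumes the four-outcome form used in [KOV15]: on input $b=0$ the outputs $0,1,2,3$ have probabilities $\delta$, $\frac{(1-\delta)e^{\epsilon}}{1+e^{\epsilon}}$, $\frac{1-\delta}{1+e^{\epsilon}}$, $0$, and on input $b=1$ the same probabilities in reverse order. The helper names are ours.

```python
import math


def rr_variant(eps, delta):
    """Output distributions on {0, 1, 2, 3} of the assumed randomized
    response variant, indexed by the input bit b."""
    high = (1.0 - delta) * math.exp(eps) / (1.0 + math.exp(eps))
    low = (1.0 - delta) / (1.0 + math.exp(eps))
    return {0: [delta, high, low, 0.0], 1: [0.0, low, high, delta]}


def excess_mass(p, q, eps):
    """max over events T of Pr_p[T] - e^eps * Pr_q[T], which for discrete
    distributions equals the sum of the positive pointwise differences."""
    return sum(max(pi - math.exp(eps) * qi, 0.0) for pi, qi in zip(p, q))


def is_dp(p, q, eps, delta, slack=1e-12):
    """Check the (eps, delta)-DP inequalities in both directions,
    with a small slack for floating-point rounding."""
    return (excess_mass(p, q, eps) <= delta + slack
            and excess_mass(q, p, eps) <= delta + slack)
```

With these definitions the mechanism meets the $(\epsilon,\delta)$ inequalities with equality, so it fails the check for any smaller $\epsilon$ at the same $\delta$.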

Given an $(\epsilon,\delta)$ -DP algorithm $M$ with output space $R$ and neighboring databases $D_{0},D_{1}$ , let $P_{0},P_{1}$ be the probability mass functions of $M(D_{0})$ and $M(D_{1})$ , respectively. The definition of differential privacy tells us that for all sets $S\subseteq R$ :

The left-hand side of the first inequality is maximized by $S=S_{0}$ for

and the left-hand side of the second inequality is maximized by

We will show how to simulate $M$ using the following algorithm.

To see that all of the terms are non-negative we need to show that the recurring terms $e^{\epsilon}\alpha_{1}-\alpha_{0}$ and $e^{\epsilon}\alpha_{0}-\alpha_{1}$ are non-negative and the rest follows by inspection.

For every $(\epsilon,\delta)$ -DP algorithm, $M$ with output space $R$ and neighboring databases $D_{0}$ and $D_{1}$ , $e^{\epsilon}\alpha_{1}-\alpha_{0}$ and $e^{\epsilon}\alpha_{0}-\alpha_{1}$ are non-negative where $\alpha_{0}=1-\delta_{0},\alpha_{1}=1-\delta_{1}$ and $\delta_{0},\delta_{1}$ are defined in Equations 4 and 5.

The other inequality follows by symmetry. ∎

Fix neighboring databases, $D_{0},D_{1}$ and let $P_{0},P_{1}$ be the probability mass functions of $M$ on $D_{0},D_{1}$ , respectively. We will use $S_{0},S_{1},\delta_{0},$ and $\delta_{1}$ as defined above in Equations 2, 3, 4, and 5. Fix $r\in R$ . $T^{\prime}\colon\{0,1,2,3\}\to R$ is defined in the table below.

We need to show that $T^{\prime}(x)$ is a valid probability distribution for each $x$ . All of the terms are non-negative because $e^{\epsilon}\alpha_{1}-\alpha_{0}$ and $e^{\epsilon}\alpha_{0}-\alpha_{1}$ are non-negative by Lemma 3.4.

The sums of $\Pr[T^{\prime}(0)=r]$ and $\Pr[T^{\prime}(3)=r]$ are immediate from the definitions of $\delta_{0}$ and $\delta_{1}$ , respectively:

A symmetrical argument works for $\Pr[T^{\prime}(3)=r]$ . We now analyze the sum for $\Pr[T^{\prime}(1)=r]$ . The sum for $\Pr[T^{\prime}(2)=r]$ follows by symmetry. We use the following identities:

From here we break the calculation into the three possible cases:

Case 3: $r\in R\setminus S_{0}\setminus S_{1}$

All of the weights are non-negative because $\alpha_{1}\geq\alpha_{0}\geq\alpha$ , $e^{\epsilon}\alpha_{1}\geq\alpha_{0}$ , and $p$ is also at most $1$ , which we verify now:

Next we check the probabilities with which $T^{\prime\prime}$ outputs $1$ and $2$ when $b=0$ .

For all $\epsilon_{1},\ldots,\epsilon_{k}\geq 0,\delta_{1},\ldots,\delta_{k},\delta_{g}\in[0,1)$ ,

For the other direction, it suffices to show that for every $M_{1},\ldots,M_{k}$ that are $(\epsilon_{1},\delta_{1}),\ldots,(\epsilon_{k},\delta_{k})$ -differentially private, we have

without loss of generality. Given $\epsilon_{g}$ , the set $S\subseteq\{0,1,2,3\}^{k}$ that maximizes the right-hand side is

We can further split $S(\epsilon_{g})$ into $S(\epsilon_{g})=S_{0}(\epsilon_{g})\cup S_{1}(\epsilon_{g})$ with

Fix an adversary $A$ . On each round $i$ , $A$ uses its coin tosses $r$ and the previous outputs $y_{1},\ldots,y_{i-1}$ to select an $(\epsilon_{i},\delta_{i})$ -differentially private algorithm $M_{i}=M_{i}^{r,y_{1},\ldots,y_{i-1}}$ and neighboring databases $D_{0}=D_{0}^{r,y_{1},\ldots,y_{i-1}},D_{1}=D_{1}^{r,y_{1},\ldots,y_{i-1}}$ . Let $V^{b}$ be the view of $A$ with the given privacy parameters under composition game $b$ for $b=0$ and $b=1$ .

For $i=1,\ldots,k,$ let $y_{i}\leftarrow T_{i}^{r,y_{1},\ldots,y_{i-1}}(z_{i})$

Hardness of OptComp

$\#P$ is the class of all counting problems associated with decision problems in NP. It is a set of functions that count the number of solutions to some NP problem. More formally:

A function $g$ is called $\#P$-hard if every function $f\in\#P$ can be computed in polynomial time given oracle access to $g$. That is, each evaluation of $g$ counts as a single time step.

If a function is $\#P$ -hard, then there is no polynomial-time algorithm for computing it unless there is a polynomial-time algorithm for counting the number of solutions of all NP problems.

A function $f$ is called $\#P$ -easy if there is some function $g\in\#P$ such that $f$ can be computed in polynomial time given oracle access to $g$ .

If a function is both $\#P$ -hard and $\#P$ -easy, we say it is $\#P$ -complete. Proving that computing OptComp is $\#P$ -complete can be broken into two steps: showing that it is $\#P$ -easy and showing that it is $\#P$ -hard.

For convenience we will view rational $(\epsilon_{1},\delta_{1}),\ldots,(\epsilon_{k},\delta_{k})$ and $\epsilon_{g}$ as given arguments to OptComp and compute $\delta_{g}$. Recall that the two versions of OptComp, viewing $\epsilon_{g}$ as given and computing $\delta_{g}$ and vice versa, are equivalent up to a polynomial factor (just run binary search over values of $\delta_{g}$, computing polynomially many bits of precision). So the formulation we choose for the proof will not affect whether computing OptComp is $\#P$-easy or not. Recall that in our model of computing real-valued functions, we will take another input $q$ and we will output an approximation of $\delta_{g}$ to $q$ bits of precision in polynomial time using a $\#P$ oracle, where $\delta_{g}$ satisfies the following:

Notice that the only part of the expression above that cannot be computed in polynomial time is the summation over subsets of $\{1,\ldots,k\}$. If we knew the sum, computing $\delta_{g}$ would be easy given our inputs. We show how to compute the sum in polynomial time using a $\#P$ oracle, and it follows that computing $\delta_{g}$ is $\#P$-easy.

We can now phrase a decision problem in NP: Does there exist a pair $(S,n)$ such that $g(S,n)=1$ ? This is in NP because given a witness $(S,n)$ , we can compute $m\cdot\hat{f}(S)$ and compare the output to $n$ , thereby verifying the solution, in polynomial time. Since this is an NP problem, a $\#P$ oracle can count the number of solutions to it in one time step. Notice that for every set $S$ , the number of solutions (pairs of the form $(S,n)$ satisfying $g(S,n)=1$ ) is exactly $m\cdot\hat{f}(S)$ because $g$ will output $1$ for $g(S,1),g(S,2),\ldots,g(S,m\cdot\hat{f}(S))$ . So over all possible sets $S$ , the number of solutions as counted by the $\#P$ oracle equals $m\cdot\sum_{S\subseteq[k]}\hat{f}(S)$ . Dividing this by $m$ gives us the sum up to an additive error of $\frac{2^{k}}{2^{q+k}}=\frac{1}{2^{q}}$ , which can be used to compute $\delta_{g}$ to $q$ bits of precision in polynomial time. This only required one call to a $\#P$ oracle. So computing OptComp is $\#P$ -easy. ∎

Next we show that computing OptComp is also $\#P$ -hard through a series of reductions. We start with a multiplicative version of the partition problem that is known to be $\#P$ -complete by Ehrgott [Ehr00]. The problems in the chain of reductions are defined below.

$\#\textsc{INT-PARTITION}$ is the following problem: given a set $Z=\{z_{1},z_{2},\ldots,z_{k}\}$ of positive integers, count the number of partitions $P\subseteq[k]$ such that

$$\prod_{i\in P}z_{i}=\prod_{i\not\in P}z_{i}.$$

All of the remaining problems in our chain of reductions take inputs $\{w_{1},\ldots,w_{k}\}$ where $1\leq w_{i}\leq e$ is the $D$th root of a positive integer $z_{i}$ for all $i\in[k]$ and some positive integer $D$. All of the reductions we present actually hold for every positive integer $D$, including $D=1$ (in which case the inputs are integers). However, we will constrain $D$ to be large enough so that our inputs are in the range $[1,e]$. This is because in the final reduction to OptComp, the $\epsilon_{i}$ values in the proof are set to $\ln(w_{i})$. We want to show that our reductions hold for reasonable values of the $\epsilon_{i}$’s in a differential privacy setting, so throughout the proofs we use $w_{i}\in[1,e]$ to correspond to $\epsilon_{i}\in[0,1]$ in the final reduction. In fact, we will later state our reductions as applying to instances where $\prod_{i}w_{i}\leq e^{\epsilon}$ (and hence $\sum_{i}\epsilon_{i}\leq\epsilon$) for any desired $\epsilon>0$.
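The constraint on $D$ is easy to make explicit: $z_{i}^{1/D}\leq e$ holds exactly when $D\geq\ln z_{i}$, and $\prod_{i}w_{i}\leq e^{\epsilon}$ holds exactly when $D\geq\sum_{i}\ln z_{i}/\epsilon$. The Python sketch below picks the smallest such $D$ in floating point; the function names and the optional `eps_budget` argument are hypothetical, and an exact implementation would need integer arithmetic near the boundary cases.

```python
import math


def smallest_root_degree(z, eps_budget=None):
    """Smallest positive integer D such that every z_i ** (1 / D) lies in [1, e].

    If eps_budget is given, D is also made large enough that the product of
    the roots is at most e ** eps_budget. Uses floating-point logarithms,
    so results exactly on a boundary may be off by one.
    """
    need = max(math.log(zi) for zi in z)
    if eps_budget is not None:
        need = max(need, sum(math.log(zi) for zi in z) / eps_budget)
    return max(1, math.ceil(need))


def epsilons_from_roots(z, degree):
    """The values eps_i = ln(w_i) = ln(z_i) / D used in the final reduction."""
    return [math.log(zi) / degree for zi in z]
```

For instance, with $Z=\{2,3,6\}$ the choice $D=2$ already places every root in $[1,e]$, while keeping $\sum_{i}\epsilon_{i}\leq 1$ requires $D=4$.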

(The real numbers $w_{1},\ldots,w_{k}$ are specified in the input by $z_{1},\ldots,z_{k}$ and $D$ with the input size being the combined bit length of these integers in binary).

(The real numbers $w_{1},\ldots,w_{k}$ and $T$ are specified in the input by $z_{1},\ldots,z_{k},t,t^{\prime}$ and $D$ with the input size being the combined bit length of these integers in binary).

(The real numbers $w_{1},\ldots,w_{k}$ are specified in the input by $z_{1},\ldots,z_{k}$ and $D$ with the input size being the combined bit length of these integers and the numerator and denominator of $r$ in binary).

Since the output of SUM-PARTITION is irrational, the actual computational problem is defined according to our convention in Section 2 for computing real-valued functions. That is, given an additional input $q$ , compute a number $y$ such that

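We prove that computing OptComp is $\#P$-hard by the following series of reductions:

$$\#\textsc{INT-PARTITION}\leq\#\textsc{PARTITION}\leq\#\textsc{T-PARTITION}\leq\text{SUM-PARTITION}\leq\text{OptComp}.$$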

For every constant $c>1$ , $\#\textsc{PARTITION}$ is $\#P$ -hard, even on instances where $\prod_{i}w_{i}\leq c$ .

There is a one-to-one correspondence between solutions to the $\#\textsc{PARTITION}$ problem and solutions to the given $\#\textsc{INT-PARTITION}$ instance. We can solve $\#\textsc{INT-PARTITION}$ in polynomial time with a $\#\textsc{PARTITION}$ oracle. Therefore $\#\textsc{PARTITION}$ is $\#P$ -hard. ∎

For every constant $c>1$ , $\#\textsc{T-PARTITION}$ is $\#P$ -hard, even on instances where $\prod_{i}w_{i}\leq c$ .

Let $c>1$ be a constant. We will reduce from $\#\textsc{PARTITION}$ , so consider an instance of the $\#\textsc{PARTITION}$ problem, $W=\{w_{1},w_{2},\ldots,w_{k}\}$ of $D$ th roots of integers $z_{1},\ldots,z_{k}$ , respectively. We may assume $\prod_{i}w_{i}\leq\sqrt{c}$ since $\sqrt{c}$ is also a constant greater than $1$ .

Set $W^{\prime}=W\cup\{w_{k+1}\}$, where $w_{k+1}=\prod_{i=1}^{k}w_{i}$. Notice that $\prod_{i=1}^{k+1}w_{i}\leq(\sqrt{c})^{2}=c$. Set $T=\sqrt{w_{k+1}}\left(w_{k+1}-1\right)$. Notice that $w_{k+1}=\left(\prod_{i=1}^{k}z_{i}\right)^{\frac{1}{D}}$, so by setting integers $t=\left(\prod_{i=1}^{k}z_{i}\right)^{3}$ and $t^{\prime}=\prod_{i=1}^{k}z_{i}$ we get that

$$T=w_{k+1}^{3/2}-w_{k+1}^{1/2}=\sqrt[2D]{t}-\sqrt[2D]{t^{\prime}},$$

which meets the input requirement for $\#\textsc{T-PARTITION}$. So we can use a $\#\textsc{T-PARTITION}$ oracle to count the number of partitions $Q\subseteq\{1,\ldots,k+1\}$ such that

$$\prod_{i\in Q}w_{i}-\prod_{i\not\in Q}w_{i}=T.$$

Let $P=Q\cap\{1,\ldots,k\}$ . We will argue that $\prod_{i\in Q}w_{i}-\prod_{i\not\in Q}w_{i}=T$ if and only if $\prod_{i\in P}w_{i}=\prod_{i\not\in P}w_{i}$ , which completes the proof. There are two cases to consider: $w_{k+1}\in Q$ and $w_{k+1}\not\in Q$ .

Case 1: $w_{k+1}\in Q$ . In this case, we have:

So there is a one-to-one correspondence between solutions to the $\#\textsc{T-PARTITION}$ instance $W^{\prime}$ where $w_{k+1}\in Q$ and solutions to the original $\#\textsc{PARTITION}$ instance $W$ .

Case 2: $w_{k+1}\not\in Q$ . Solutions now look like:

One way this can be true is if $w_{i}=1$ for all $i\in[k]$ . We can check ahead of time if our input set $W$ contains all ones. If it does, then there are $2^{k}-2$ partitions that yield equal products (all except $P=[k]$ and $P=\emptyset$ ) so we can just output $2^{k}-2$ as the solution and not even use our oracle. The only other way to satisfy the above expression is for $\prod_{i\in P}w_{i}>\prod_{i\in[k]}w_{i}$ which cannot happen because $P\subseteq[k]$ . So there are no solutions in the case that $w_{k+1}\not\in Q$ .

Therefore the output of the $\#\textsc{T-PARTITION}$ oracle on $W^{\prime}$ is the solution to the $\#\textsc{PARTITION}$ problem. So $\#\textsc{T-PARTITION}$ is $\#P$ -hard. ∎

For the next two proofs we will make use of the following fact to bound the amount of precision needed when approximating irrational numbers by rational ones in our reductions:

For all real numbers $y>x$ and functions $f$ that are differentiable on the interval $[x,y]$:

$$f(y)-f(x)\geq(y-x)\cdot\min_{z\in[x,y]}f^{\prime}(z).$$

For every constant $c>1$ , SUM-PARTITION is $\#P$ -hard even on instances where $\prod_{i}w_{i}\leq c$ and where there are no partitions $S$ such that $\prod_{i\in S}w_{i}=r\cdot\prod_{i\not\in S}w_{i}$ .

We will use a SUM-PARTITION oracle to solve $\#\textsc{T-PARTITION}$ given a set $W=\{w_{1},\ldots,w_{k}\}$ of $D$ th roots of positive integers $z_{1},\ldots,z_{k}$ , respectively, and a positive real number $T=\sqrt[2D]{t}-\sqrt[2D]{t^{\prime}}$ for integers $t,t^{\prime}$ given in the input. Notice that for every $x>0$ :

Above, $j$ must be a positive integer greater than $\left(\prod_{i=1}^{k}z_{i}\right)^{1/2}$, which tells us that the gap in products from every partition must take a particular form. This means that for a given $D$ and $W$, $\#\textsc{x-PARTITION}$ can only be non-zero on a discrete set of possible values of $x$. So given our $\#\textsc{T-PARTITION}$ instance we can find a $T^{\prime}>T$ such that the above has no solutions for $x$ in the interval $(T,T^{\prime})$. Specifically, solve the above quadratic for $\sqrt[D]{j}$. If $j$ is not an integer, then we know the answer to the $\#\textsc{T-PARTITION}$ instance is 0, so assume $j$ is an integer and set $T^{\prime}=\sqrt[D]{j+1}-\prod_{i}w_{i}/\sqrt[D]{j+1}$. We can also find an interval $(T^{\prime\prime},T)$ just below $T$ where no value of $x$ in the interval can yield a solution above by setting $T^{\prime\prime}=\sqrt[D]{j-1}-\prod_{i}w_{i}/\sqrt[D]{j-1}$. We use these discreteness properties twice in the proof. Also notice that these intervals are not too small:

$T^{\prime}-T\geq 2^{-\text{poly}(n)}$ and $T-T^{\prime\prime}\geq 2^{-\text{poly}(n)}$ where $n$ is the input length (i.e. the bit lengths of the integers $z_{1},\ldots,z_{k},t,t^{\prime}$ ).

where the last inequality follows from Fact 4.11. This final value is only exponentially small because $j$ is upper bounded by $\prod_{i=1}^{k}z_{i}$ , which is at most exponentially large in the bit length of the $z_{i}$ ’s. A very similar proof shows that $(T^{\prime\prime},T)$ is only exponentially small. ∎

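This means that we can always find $\hat{T}\in(T,T^{\prime})$ such that $\hat{T}$ is rational and can be fully specified with a bit length that is polynomial in the input length. Fix such a quantity $\hat{T}$. For all $y>0$, define $P^{y}\equiv\{P\subseteq[k]\mid\prod_{i\in P}w_{i}-\prod_{i\not\in P}w_{i}\geq y\}$. Then, since $x$-PARTITION has no solutions for $x\in(T,T^{\prime})$, every partition in $P^{T}\setminus P^{\hat{T}}$ has gap exactly $T$, and the number of solutions to the $\#\textsc{T-PARTITION}$ instance is:

$$\left|P^{T}\setminus P^{\hat{T}}\right|=\frac{1}{T}\left(\sum_{P\in P^{T}}\left(\prod_{i\in P}w_{i}-\prod_{i\not\in P}w_{i}\right)-\sum_{P\in P^{\hat{T}}}\left(\prod_{i\in P}w_{i}-\prod_{i\not\in P}w_{i}\right)\right).$$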

We now show how to compute the two sums in the final term using the SUM-PARTITION oracle. We will give the procedure for computing $\sum\limits_{P\in P^{T}}\left(\prod\limits_{i\in P}w_{i}-\prod\limits_{i\not\in P}w_{i}\right)$ and the case with $\hat{T}$ will follow by symmetry. The oracle returns a real number, so by our model of computing real-valued functions, we will also give the oracle an additional input that specifies the number of bits of precision in its output. Ultimately we only need to approximate each sum to within $\pm T/4$. This will give an approximation to the $\#\textsc{T-PARTITION}$ problem to within $\pm 1/2$, thereby solving it by rounding the approximation because the solution will be an integer. We want to set the input $r$ to the SUM-PARTITION oracle to be $r=r_{T}$ such that for all $P\subseteq[k]$, we have:

$$\prod_{i\in P}w_{i}-r_{T}\cdot\prod_{i\not\in P}w_{i}\geq 0\iff\prod_{i\in P}w_{i}-\prod_{i\not\in P}w_{i}\geq T.\tag{6}$$

Taking $w=\prod_{i\in[k]}w_{i}$ and thinking of $v=\prod_{i\in P}w_{i}$ (so that $\prod_{i\not\in P}w_{i}=w/v$), it suffices that all positive solutions to each of the following two inequalities are the same:

$$v-r_{T}\cdot\frac{w}{v}\geq 0\qquad\text{and}\qquad v-\frac{w}{v}\geq T.$$

The positive solutions to the left one are $v\geq\sqrt{r_{T}w}$, and to the right one are $v\geq(T+\sqrt{T^{2}+4w})/2$. Setting the right-hand sides equal gives

$$r_{T}=\frac{\left(T+\sqrt{T^{2}+4w}\right)^{2}}{4w}.\tag{7}$$

Since $r_{T}$ might be irrational and SUM-PARTITION takes as input rational values of $r$ , we need to find a rational $r$ that approximates $r_{T}$ and preserves the set of solutions $P^{T}$ . Recall from Claim 4.13 that there is an (only) exponentially small interval $(T^{\prime\prime},T)$ below $T$ such that for all $\bar{T}\in(T^{\prime\prime},T)$ , $P^{T}=P^{\bar{T}}$ . This translates to a corresponding interval $(r_{T^{\prime\prime}},r_{T})$ such that for all $r\in(r_{T^{\prime\prime}},r_{T})$ , equivalence (6) holds. Furthermore, this interval is also only exponentially small.

$r_{T}-r_{T^{\prime\prime}}\geq 2^{-\text{poly}(n)}$ where $n$ is the input length (i.e. the bit lengths of the integers $z_{1},\ldots,z_{k},t,t^{\prime}$ ).

To see this, view $r_{T}$ from Equation 7 as a function $r(T)$ of $T$ , and calculate the derivative:

(Recall that $1\leq w=\prod_{i}w_{i}\leq c$ ). This is only exponentially small in the input length by Claim 4.13. ∎

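So we can choose a rational $r\in(r_{T^{\prime\prime}},r_{T})$ that can be specified with a number of bits that is polynomial in the input length and preserves $P^{T}=\left\{P\subseteq[k]\mid\prod_{i\in P}w_{i}-r\cdot\prod_{i\not\in P}w_{i}\geq 0\right\}$. However, the SUM-PARTITION oracle gives us

$$\sum_{P\in P^{T}}\left(\prod_{i\in P}w_{i}-r\cdot\prod_{i\not\in P}w_{i}\right),$$

whereas we want to compute the right-hand side without the $r$ coefficient. To get this we just pick another rational $r^{\prime}\in(r_{T^{\prime\prime}},r_{T})$ such that $r^{\prime}-r\geq 2^{-\text{poly}(n)}$. If precision were not an issue, we could run our SUM-PARTITION oracle for $r$ and $r^{\prime}$ and receive the output:

$$S_{1}=\sum_{P\in P^{T}}\left(\prod_{i\in P}w_{i}-r\cdot\prod_{i\not\in P}w_{i}\right),\qquad S_{2}=\sum_{P\in P^{T}}\left(\prod_{i\in P}w_{i}-r^{\prime}\cdot\prod_{i\not\in P}w_{i}\right).$$

Then the following linear combination of $S_{1}$ and $S_{2}$ gives us what we want:

$$\frac{(r^{\prime}-1)\cdot S_{1}-(r-1)\cdot S_{2}}{r^{\prime}-r}=\sum_{P\in P^{T}}\left(\prod_{i\in P}w_{i}-\prod_{i\not\in P}w_{i}\right).$$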
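The elimination step can be checked numerically. Writing $A=\sum_{P\in P^{T}}\prod_{i\in P}w_{i}$ and $B=\sum_{P\in P^{T}}\prod_{i\not\in P}w_{i}$, the two oracle answers are $S_{1}=A-rB$ and $S_{2}=A-r^{\prime}B$, and solving this linear system gives $A-B=\big((r^{\prime}-1)S_{1}-(r-1)S_{2}\big)/(r^{\prime}-r)$. The Python sketch below uses a brute-force stand-in for the oracle and floating-point arithmetic, both of which are simplifications made for illustration; it takes SUM-PARTITION to return $\sum_{S\subseteq[k]}\max\{\prod_{i\in S}w_{i}-r\cdot\prod_{i\not\in S}w_{i},0\}$.

```python
from itertools import combinations
from math import prod


def sum_partition(w, r):
    """Brute-force stand-in for the SUM-PARTITION oracle (exponential time).

    Returns the sum over all subsets S of max(prod_{S} w - r * prod_{not S} w, 0).
    """
    k = len(w)
    total = prod(w)
    acc = 0.0
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            inside = prod(w[i] for i in subset)
            outside = total / inside
            acc += max(inside - r * outside, 0.0)
    return acc


def recover_gap_sum(s1, s2, r, r_prime):
    """Combine two oracle answers into the sum of (prod_{P} w - prod_{not P} w).

    Valid only when r and r_prime select the same set of partitions.
    """
    return ((r_prime - 1.0) * s1 - (r - 1.0) * s2) / (r_prime - r)
```

The combination is only meaningful when $r$ and $r^{\prime}$ lie in an interval on which the set of contributing partitions does not change, which is what the choice of $r,r^{\prime}\in(r_{T^{\prime\prime}},r_{T})$ guarantees.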

Computing $S_{1}$ and $S_{2}$ to within $\pm 2^{-\text{poly}(n)}$ yields an approximation of $\sum_{P\in P^{T}}\left(\prod_{i\in P}w_{i}-\prod_{i\not\in P}w_{i}\right)$ to within $\pm T/4$ .

We just need to approximate $S_{1}$ and $S_{2}$ to within $\pm\frac{T}{8}\cdot\frac{r^{\prime}-r}{r^{\prime}-1}$ to get the desired precision. This additive error is only exponentially small by Claim 4.14. ∎

Running this whole procedure again for $\hat{T}\in(T,T^{\prime})$ , which we fixed above gives us all the information we need to count the number of solutions to the $\#\textsc{T-PARTITION}$ instance we were given. We can solve $\#\textsc{T-PARTITION}$ in polynomial time with four calls to a SUM-PARTITION oracle. Therefore SUM-PARTITION is $\#P$ -hard. ∎

Now we prove that computing OptComp is $\#P$ -complete.

We have already shown that computing OptComp is $\#P$ -easy. Here we prove that it is also $\#P$ -hard, thereby proving $\#P$ -completeness.

This last expression is exactly the solution to the instance of SUM-PARTITION we were given. Taking precision into account, the input SUM-PARTITION instance has an additional input $q$ that specifies the desired number of bits of precision in the output and we can only pass OptComp rational values so we will have to approximate $\epsilon_{i}=\ln(w_{i})$ for all $i$ and $\epsilon_{g}=\ln(r)$ . Again there is a worry that when we approximate these values the set of partitions $S$ that make $\prod_{i\in S}w_{i}-r\cdot\prod_{i\not\in S}w_{i}>0$ might change. We want to get enough precision in our inputs so that the set of partitions over which we sum does not change and enough precision so that the output is accurate to $q$ bits. We will calculate the approximations required for each of these two goals separately and the final precision that we use will just be the maximum of the two. We prove that we can achieve both of these goals with the next two claims.

There exists a polynomial $p(n)$ in the length $n$ of the input (the bit lengths of $z_{1},\ldots,z_{k},q$, and the numerator and denominator of $r$) such that if $|w_{i}-w^{\prime}_{i}|\leq 2^{-p(n)}$ for each $i$, then the set of partitions $S$ satisfying

$$\prod_{i\in S}w_{i}-r\cdot\prod_{i\not\in S}w_{i}>0$$

is the same as the set of partitions satisfying

$$\prod_{i\in S}w^{\prime}_{i}-r\cdot\prod_{i\not\in S}w^{\prime}_{i}>0.$$

Recall that SUM-PARTITION is $\#P$-hard even on instances where there are no partitions $S$ such that $\prod_{i\in S}w_{i}=r\cdot\prod_{i\not\in S}w_{i}$, so we may assume our input instance of SUM-PARTITION has no such partitions and still prove the hardness of OptComp. So to ensure that we have enough precision such that the set over which we sum does not change, we must make the error smaller than the minimum possible (in absolute value) nonzero outcome of $\prod_{i\in S}w_{i}-r\cdot\prod_{i\not\in S}w_{i}$. We now bound this quantity. Let

Since $r$ is rational, $r=a/b$ for two integers $a$ and $b$ . Let $a^{\prime}=a^{D}$ and $b^{\prime}=b^{D}$ . Then:

where the last line follows from Fact 4.11 applied to the function $f(x)=x^{1/D}$. The quantity $1/\left(\prod_{i\in[k]}z_{i}\right)^{(D-1)/D}$ is only exponentially small because $\prod_{i\in[k]}z_{i}$ is at most exponentially large in the bit length of the integers $z_{1},\ldots,z_{k}$. We claim that $\left|\prod_{i\in S}z_{i}-\frac{a^{\prime}}{b^{\prime}}\prod_{i\not\in S}z_{i}\right|$ is at least $1/b^{\prime}$ for all $S\in\mathcal{S}$. Fix $S\in\mathcal{S}$:

where the last implication follows because $b^{\prime}\cdot\prod_{i\in S}z_{i}-a^{\prime}\cdot\prod_{i\not\in S}z_{i}$ is a difference of integers, so the nonzero values closest to zero that it can take are $\pm 1$. ∎

We will choose $p(n)=p_{1}(n)+p_{2}(n)$ where $p_{1}(n)$ is the polynomial that exists from Claim 4.16 and $p_{2}(n)$ will be determined later. Define

Bounding each term in the final expression above by $2^{-(q+1)}$ then gives us the accuracy we want. We will show directly how to bound the second term and the argument for the first term follows symmetrically. By hypothesis we have that for all $S\subseteq[k]$ :

Since $|S^{+}|\leq 2^{k}$ and $1\leq\prod_{i\not\in S}w_{i}\leq c$ for all $S$ we get:

Picking $p_{2}(n)$ such that $p(n)=p_{1}(n)+p_{2}(n)>2k+\log(rc)+q+1$ then suffices to bound the absolute value of the sum by $2^{-(q+1)}$ . Repeating the same calculation for $\sum_{S\in S^{+}}\left(\prod_{i\in S}w^{\prime}_{i}-\prod_{i\in S}w_{i}\right)$ will yield the same approximation except without the factor of $r$ . So we can bound both terms by $2^{-(q+1)}$ (and therefore their sum by $2^{-q}$ ) by approximating each $w_{i}$ to a precision that is polynomial in $n$ , which proves the claim. ∎

Approximation of OptComp

where $\overline{\epsilon}=\sum_{i\in[k]}\epsilon_{i}/k$ , assuming constant-time arithmetic operations.

We prove Theorem 1.7 using the following three lemmas:

In other words, if the $\epsilon$ values we are given are all integer multiples of some $\epsilon_{0}$ where $e^{\epsilon_{0}}$ is rational, we can determine whether or not the composition of those privacy parameters is $(a^{*}\cdot\epsilon_{0},\delta_{g})$ -DP in pseudo-polynomial time, for every positive integer $a^{*}$ . Running binary search over integers $a^{*}$ , we can find the minimum such integer. When $\epsilon_{0}$ is small, this gives us a good overestimate of the optimal composition of the discrete input privacy parameters. This means that given any inputs $(\epsilon_{1},\delta_{1}),\ldots,(\epsilon_{k},\delta_{k}),\delta_{g}$ to OptComp, we can discretize and polynomially bound the $\epsilon_{i}$ values to new values $\epsilon_{i}^{\prime}$ for all $i\in[k]$ and use Lemma 5.2 to approximate OptComp $((\epsilon_{1}^{\prime},\delta_{1}),\ldots,(\epsilon_{k}^{\prime},\delta_{k}),\delta_{g})$ . The next lemma tells us that this is also a good approximation of OptComp $((\epsilon_{1},\delta_{1}),\ldots,(\epsilon_{k},\delta_{k}),\delta_{g})$ .

For all $\epsilon_{1},\ldots,\epsilon_{k},c_{1},\ldots,c_{k}\geq 0$ and $\delta_{1},\ldots,\delta_{k},\delta_{g}\in[0,1)$ :

Next we prove the three lemmas and then show that Theorem 1.7 follows.

Each cell $F(r,s)$ in the table can be computed in constant time given earlier cells $F(r^{\prime},s^{\prime})$ where $r^{\prime}<r$ . Thus filling the entire table takes time $O(Bk)$ . ∎
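The table-filling step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the paper's exact construction: it assumes the table counts, for each target value $s$, the subsets of the first $r$ integers $a_{1},\ldots,a_{r}$ that sum to $s$, and it assumes the sum to be evaluated has terms of the form $\max\{x^{\sum_{i\in S}a_{i}}-x^{a^{*}+\sum_{i\not\in S}a_{i}},0\}$ with $x=e^{\epsilon_{0}}$. The names `subset_sum_counts`, `grouped_sum`, and `brute_force_sum` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def subset_sum_counts(a):
    """counts[s] = number of index subsets S with sum_{i in S} a[i] == s.
    Runs in O(B * k) time for k = len(a) and B = sum(a)."""
    B = sum(a)
    counts = [1] + [0] * B                  # before any item: only the empty set
    for a_r in a:                           # one pass per item (one table row)
        for s in range(B, a_r - 1, -1):     # descending, so each item is used at most once
            counts[s] += counts[s - a_r]
    return counts


def grouped_sum(a, a_star, x):
    """Sum over all S of max(x**sum_S - x**(a_star + sum_notS), 0),
    computed by grouping subsets according to s = sum_S."""
    B = sum(a)
    counts = subset_sum_counts(a)
    return sum(c * max(x**s - x**(a_star + B - s), 0) for s, c in enumerate(counts))


def brute_force_sum(a, a_star, x):
    """Same sum by direct enumeration of all 2^k subsets (for checking only)."""
    k, B = len(a), sum(a)
    total = 0
    for size in range(k + 1):
        for S in combinations(range(k), size):
            s = sum(a[i] for i in S)
            total += max(x**s - x**(a_star + B - s), 0)
    return total


a = [1, 2, 2, 3]            # hypothetical integer multiples a_i
x = Fraction(11, 10)        # hypothetical rational value of e^{eps_0}
print(grouped_sum(a, 2, x) == brute_force_sum(a, 2, x))
```

Grouping by the subset sum is what replaces the exponential enumeration: every subset with the same value of $\sum_{i\in S}a_{i}$ contributes the same term, so only the counts are needed.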

Given a rational $e^{\epsilon_{0}}\geq 0$ and $\epsilon_{1}=a_{1}\cdot\epsilon_{0},\ldots,\epsilon_{k}=a_{k}\cdot\epsilon_{0},\epsilon^{*}=a^{*}\cdot\epsilon_{0}$ for positive integers $a_{1},\ldots,a_{k},a^{*}$ and rational $\delta_{1},\ldots,\delta_{k},\delta_{g}\in[0,1)$, Theorem 1.5 tells us that answering whether or not

is equivalent to answering whether or not the following inequality holds:

The right-hand side and $\prod_{i=1}^{k}(1+e^{\epsilon_{i}})$ are easy to compute given the inputs (note that $e^{\epsilon_{i}}$ is rational for all $i\in[k]$ because each is an integer power of $e^{\epsilon_{0}}$ ). So in order to check the inequality, we will show how to compute the sum. Define

and observe that by setting $T=S^{\mathsf{c}}$ , we have

Multiplying both sides by $e^{c/2}$ gives:

The above inequality together with Theorem 1.5 means that showing the following will complete the proof:

Since $(1+e^{\epsilon_{i}+c_{i}})/(1+e^{\epsilon_{i}})\geq e^{c_{i}/2}$ for every $\epsilon_{i},c_{i}\geq 0$ (the inequality rearranges to $(e^{c_{i}/2}-1)(e^{\epsilon_{i}+c_{i}/2}-1)\geq 0$), it suffices to show:

This inequality holds term by term. If a right-hand term is zero $\left(\sum_{i\in S}\epsilon_{i}\leq\epsilon_{g}+\sum_{i\not\in S}\epsilon_{i}\right)$ , then so is the corresponding left-hand term $\left(\sum_{i\in S}(\epsilon_{i}+c_{i})\leq\epsilon_{g}+c+\sum_{i\not\in S}(\epsilon_{i}+c_{i})\right)$ . For the nonzero terms, the factor of $e^{c}$ ensures that the right-hand terms are larger than the left-hand terms. ∎
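The elementary inequality $(1+e^{\epsilon_{i}+c_{i}})/(1+e^{\epsilon_{i}})\geq e^{c_{i}/2}$ used in this proof can be spot-checked numerically. The sketch below is only a sanity check over a hypothetical grid of values, not a proof; the function name `ratio_gap` and the grid are assumptions made for the example.

```python
import math


def ratio_gap(eps, c):
    """(1 + e^(eps + c)) / (1 + e^eps) - e^(c / 2); claimed to be nonnegative
    for all eps, c >= 0."""
    return (1 + math.exp(eps + c)) / (1 + math.exp(eps)) - math.exp(c / 2)


# Hypothetical grid: eps and c ranging over 0.0, 0.1, ..., 5.0.
grid = [i / 10 for i in range(51)]
worst = min(ratio_gap(eps, c) for eps in grid for c in grid)
print(worst >= -1e-12)   # small tolerance for floating-point rounding
```

The gap vanishes when $c_{i}=0$ and is strictly positive for $c_{i}>0$, which is consistent with the factorization $(e^{c_{i}/2}-1)(e^{\epsilon_{i}+c_{i}/2}-1)\geq 0$.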

Lemma 5.2 tells us that we can determine whether a set of privacy parameters satisfies some $(\epsilon_{g},\delta_{g})$-differential privacy guarantee if the $\epsilon_{i}$ values and $\epsilon_{g}$ are all positive integer multiples of some $\epsilon_{0}$ where $e^{\epsilon_{0}}$ is rational. We are given rational $\epsilon_{1},\ldots,\epsilon_{k}\geq 0$, $\delta_{1},\ldots,\delta_{k},\delta_{g}\in[0,1)$, and $\eta\in(0,1)$. Let $\overline{\epsilon}=\sum_{i\in[k]}\epsilon_{i}/k$ be the arithmetic mean of the $\epsilon_{i}$ values. Let $\beta=\eta/(k\cdot(1+\overline{\epsilon})+1)$, set $\epsilon_{0}=\ln(1+\beta)$, and for all $i\in[k]$ set $a_{i}=\lceil{\epsilon_{i}\cdot(1/\beta+1)}\rceil$ and $\epsilon_{i}^{\prime}=\epsilon_{0}\cdot a_{i}$. We will use the following bounds on $\epsilon_{0}$ in the proof:

$$\frac{\beta}{1+\beta}\leq\epsilon_{0}\leq\beta.$$

With these settings, the $a_{i}$'s are non-negative integers, the $\epsilon_{i}^{\prime}$ values are all integer multiples of $\epsilon_{0}$, and $e^{\epsilon_{0}}$ is rational. So for every positive integer $a$ we can apply Lemma 5.2 to determine whether or not OptComp $((\epsilon_{1}^{\prime},\delta_{1}),\ldots,(\epsilon_{k}^{\prime},\delta_{k}),\delta_{g})\leq a\cdot\epsilon_{0}$ in time $O\left(k\cdot\sum_{i\in[k]}a_{i}\right)$. Running binary search over integers $a$, we can find the minimum such integer, which we will call $a^{*}$. The algorithm's estimate of OptComp $((\epsilon_{1},\delta_{1}),\ldots,(\epsilon_{k},\delta_{k}),\delta_{g})$ will be $a^{*}\cdot\epsilon_{0}$. However, since this number is irrational, we will use the Taylor approximation of the natural logarithm to output $\epsilon^{*}$ satisfying $a^{*}\cdot\epsilon_{0}\leq\epsilon^{*}\leq a^{*}\cdot\epsilon_{0}+\beta-\epsilon_{0}$. Since we only need to calculate a few terms of the Taylor expansion of $\ln(1+\beta)$ to achieve this approximation, this step will not affect our running time.
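The discretization and the binary search described above can be sketched in Python. This is a minimal sketch under stated assumptions: floating-point numbers stand in for the rationals of the proof, the names `discretize` and `min_feasible` are hypothetical, and the feasibility test (which in the proof is the Lemma 5.2 procedure) is passed in as an arbitrary monotone predicate rather than implemented.

```python
import math


def discretize(eps, eta):
    """Return (beta, eps0, a, eps_prime) for the choices made in the proof:
    beta = eta / (k * (1 + mean(eps)) + 1), eps0 = ln(1 + beta),
    a_i = ceil(eps_i * (1 / beta + 1)), eps'_i = eps0 * a_i."""
    k = len(eps)
    eps_bar = sum(eps) / k
    beta = eta / (k * (1 + eps_bar) + 1)
    eps0 = math.log1p(beta)
    a = [math.ceil(e * (1 / beta + 1)) for e in eps]
    eps_prime = [eps0 * a_i for a_i in a]
    return beta, eps0, a, eps_prime


def min_feasible(upper, is_feasible):
    """Smallest integer a in [1, upper] with is_feasible(a) true.
    Assumes is_feasible is monotone (false ... false true ... true)
    and that is_feasible(upper) holds."""
    lo, hi = 1, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if is_feasible(mid):
            hi = mid          # mid works, so the minimum is at most mid
        else:
            lo = mid + 1      # mid fails, so the minimum is larger
    return lo


eps = [0.5, 1.0, 0.25]        # hypothetical input privacy parameters
beta, eps0, a, eps_prime = discretize(eps, eta=0.1)
# Each eps'_i should lie between eps_i and eps_i + beta * (eps_i + 1).
print(all(e <= ep <= e + beta * (e + 1) for e, ep in zip(eps, eps_prime)))
```

In an actual implementation the predicate passed to `min_feasible` would be the pseudo-polynomial-time test from Lemma 5.2, and exact rational arithmetic would replace the floats.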

Since we choose $a^{*}$ to be the minimum integer satisfying composition we have:

Since $a^{*}$ is at most $\sum_{i\in[k]}a_{i}$, the binary search can be done in $\log\left(\sum_{i\in[k]}a_{i}\right)=\log O\left(k^{2}\cdot\overline{\epsilon}\cdot(1+\overline{\epsilon})/\eta\right)$ iterations. This gives us a total running time of:

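Now we argue that $\epsilon^{*}$ is a good approximation of OptComp $((\epsilon_{1},\delta_{1}),\ldots,(\epsilon_{k},\delta_{k}),\delta_{g})$. Since $\epsilon_{0}=\ln(1+\beta)\geq\beta/(1+\beta)$, for all $i\in[k]$ we have:

$$\epsilon_{i}^{\prime}=\epsilon_{0}\cdot a_{i}\geq\frac{\beta}{1+\beta}\cdot\epsilon_{i}\cdot\left(\frac{1}{\beta}+1\right)=\epsilon_{i}.$$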

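So all of the $\epsilon_{i}^{\prime}$ values are overestimates of their corresponding $\epsilon_{i}$ values and therefore

$$\mathrm{OptComp}((\epsilon_{1},\delta_{1}),\ldots,(\epsilon_{k},\delta_{k}),\delta_{g})\leq\mathrm{OptComp}((\epsilon_{1}^{\prime},\delta_{1}),\ldots,(\epsilon_{k}^{\prime},\delta_{k}),\delta_{g})\leq a^{*}\cdot\epsilon_{0}\leq\epsilon^{*},$$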

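satisfying one of the inequalities in the theorem. Using $\epsilon_{0}\leq\beta$ and $\lceil x\rceil\leq x+1$, we also have for all $i\in[k]$:

$$\epsilon_{i}^{\prime}=\epsilon_{0}\cdot a_{i}\leq\beta\cdot\left(\epsilon_{i}\cdot\left(\frac{1}{\beta}+1\right)+1\right)=\epsilon_{i}+\beta\cdot(\epsilon_{i}+1).$$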

Let $c_{i}=\beta\cdot(\epsilon_{i}+1)$ for all $i\in[k]$ and let $c=\sum_{i\in[k]}c_{i}=\beta\cdot k\cdot(1+\overline{\epsilon})$ . Now we get

by Lemma 5.3. Noting that $\beta\cdot k\cdot(1+\overline{\epsilon})$ and $\beta\cdot k\cdot(1+\overline{\epsilon})+\beta$ are both at most $\eta$ completes the proof. ∎

References

Appendix A Comparison of Composition Theorems

The figures below compare the performance of four homogeneous composition theorems. In all figures, “Summing” refers to basic composition (Theorem 1.2 [DKMMN06]), “DRV” refers to advanced composition (Theorem 1.3 [DRV10]), “KOV Bound” refers to a bound in [KOV15] that is a closed-form approximation of the optimal composition theorem, and “Optimal” refers to the optimal composition theorem (Theorem 1.4 [KOV15]). Here we are composing $k$ mechanisms that are $(\epsilon,\delta)$-differentially private to obtain an $(\epsilon_{g},\delta_{g})$-differentially private mechanism as guaranteed by one of the composition theorems.
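Two of the four bounds have simple closed forms and can be compared directly. The sketch below is illustrative only: it assumes the usual statements of basic composition, $(k\epsilon,k\delta)$, and of advanced composition [DRV10], $\epsilon_{g}=\sqrt{2k\ln(1/\delta^{\prime})}\cdot\epsilon+k\epsilon(e^{\epsilon}-1)$ with $\delta_{g}=k\delta+\delta^{\prime}$. The function names and the sample parameters are hypothetical, and the "KOV Bound" and "Optimal" curves are not reproduced here.

```python
import math


def basic_composition(k, eps, delta):
    """'Summing' bound: k-fold composition is (k * eps, k * delta)-DP."""
    return k * eps, k * delta


def advanced_composition(k, eps, delta, delta_prime):
    """'DRV' bound, assuming the standard form of advanced composition:
    eps_g = sqrt(2 k ln(1/delta')) * eps + k * eps * (e^eps - 1),
    delta_g = k * delta + delta'."""
    eps_g = math.sqrt(2 * k * math.log(1 / delta_prime)) * eps \
        + k * eps * (math.exp(eps) - 1)
    return eps_g, k * delta + delta_prime


# Hypothetical parameters: many mechanisms with a small eps each.
k, eps, delta = 1000, 0.01, 0.0
print(basic_composition(k, eps, delta))
print(advanced_composition(k, eps, delta, delta_prime=1e-6))
```

For small $k$ the summing bound can be the smaller of the two, while for large $k$ and small $\epsilon$ the square-root dependence on $k$ in the DRV bound dominates the comparison.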