Optimal Membership Inference Bounds for Adaptive Composition of Sampled Gaussian Mechanisms

Saeed Mahloujifar, Alexandre Sablayrolles, Graham Cormode, Somesh Jha

Introduction

The recent success of machine learning models make them the go-to approach to solve a variety of problems, ranging from computer vision (Krizhevsky et al., 2012) to NLP (Sutskever et al., 2014), including applications to sensitive data such as health records or chatbots. Access to a trained machine learning model, through a black-box API or a white-box access to a published model, can leak traces of information (Dwork et al., 2015) from the training data. Researchers have tried to measure this information leakage through metrics such as membership inference (Shokri et al., 2017). Membership inference is the task of guessing, from a trained model, whether it includes a given sample or not. This task is both interesting in its own right, as the participation of an individual in a data collection can be a sensitive information. It also serves as the “most significant bit” of information: if membership inference fails, attacks revealing more information such as reconstruction attacks (Fredrikson et al., 2014; Carlini et al., 2020) will also fail. In other words, defending against membership inference attacks would also defend against attacks such as reconstruction attacks that aim at reconstructing training examples.

The standard approach to provably defeat these membership privacy attacks is differential privacy (Dwork et al., 2006). Differential privacy defines a class of training algorithms that respect a privacy budget ϵ\epsilon and a probability of failure δ\delta. These quantities quantify how much information about each individual training example is revealed by the output of the algorithm. Most algorithms obtaining differential privacy need to inject noise somewhere in their process. The amount of injected noise then creates a trade-off between privacy utility of the trained model. To measure the privacy of a given algorithms, researchers have developed advanced mathematical tools and notions such as Renyi differential privacy Mironov (2017); Abadi et al. (2016) and advanced composition theorems Dwork et al. (2010); Kairouz et al. (2015). These tools allow us to calculate (ϵ,δ)(\epsilon,\delta) values for carefully designed algorithms.

Previous work has shown that any deferentially private algorithm will provably bound the accuracy of any membership inference adversary. Specifically, starting with a given (ϵ,δ)(\epsilon,\delta), Humphries et al. (2020) prove that any model trained with (ϵ,δ)(\epsilon,\delta) differential privacy will induce an upper bound on the accuracy of membership adversary and this upper bound depends only on ϵ\epsilon and δ\delta. These upper bounds enables us to obtain provable defenses against membership inference attacks by using deferentially private algorithms. In fact, there is a large gap between upper bounds proved for the power of adversaries in performing membership inference, and the power of real adversaries that try to attack deferentially private models. One hypothesis is that the membership inference bound obtained by differential privacy is in reality stronger than what we could prove. In particular, the process of obtaining differential privacy bounds and then converting those bounds to membership inference bounds could be sub-optimal . In this work we ask the following question:

Can we develop tools to directly and optimally analyze the membership inference upper bounds for algorithms, without going through differential privacy?

Our contributions in this work are as follows:

Membership inference bounds for composition of sampled Gaussian mechanisms Our main theorem bounds the membership inference advantage of any adversary the composition of an arbitrary set of sampled Gaussian mechanisms that could adaptive depend on each other. Specifically, for any adaptive series of sampled Gaussian mechanism (M1,,MT)(M_{1},\dots,M_{T}) where MiM_{i} has sensitivity 1.01.0 and standard deviation σi\sigma_{i} and sub-sampling rate qiq_{i}, we show that the membership inference advantage of any adversary is bounded by the total variation distance between two mixture of Gaussians defined by σ=(σ1,,σT)\sigma=(\sigma_{1},\dots,\sigma_{T}) and q=(q1,,qT)q=(q_{1},\dots,q_{T}). This bound is optimal as it reflects the membership inference advantage for a real adversary on a particular series of Gaussian mechanisms.

We propose a numerical way to calculate the total variation distance between mixture of Gaussians. Our algorithm is computationally tractable and works in linear time with respect to number of mechanisms and is independent from the dimension. We use our numerical approach to obtain concrete bounds on membership inference for Gaussian mechanisms and compare our bounds to that of Humphries et al. (2020).

Finally, to understand the practical implication of our bound for mainstream datasets, we use DP-SGD to train models on CIFAR10 and calculate the membership inference bounds using our techniques. Our approach allows us to achieve state-of-the-art provable membership inference privacy for any given accuracy.

The most widely-used DP algorithm in machine learning applications is DP-SGD: it is a small change of the classical stochastic gradient descent algorithm that only requires to clip per-sample gradients, average them and add Gaussian noise. DP-SGD has been shown to be more accurate than other differentially private training algorithms in the case of linear and convex models (van der Maaten and Hannun, 2020). Each iteration of DP-SGD is an instance of the sampled Gaussian mechanism, which chooses a fraction qq of a dataset and outputs a noisy sum of the desired quantity. Our analysis mirrors the recent shift in the field of empirical membership inference, from advantage (or accuracy) metrics (Shokri et al., 2017; Yeom et al., 2018; Sablayrolles et al., 2019) to precision/recall measures (Watson et al., 2022; Carlini et al., 2020). Differential privacy guarantees are known to yield tight true positive and false positive rates (Nasr et al., 2021). Our paper is the first to show equivalent results in the case of advantage. Our analysis explains, from a theoretical point of view, why the precision and recall of membership inference attacks is stronger than the accuracy.

Background

In this section we provide all the background information necessary for understanding our main theorems, proofs, and algorithms.

Here, we recall two notion of distance between probability distribution, KL divergence and total variation distance.

Total variation distance between two probability distributions XX and YY is defined as:

TV properties.

We recall briefly some properties of TV that will be useful in the remainder of this paper. We first note the following characterization of TV:

Numerous properties about TV\mathbf{TV} can be derived from this characterization. In particular, we are interested in post-processing. Given a function g:EFg:E\to F, applying gg to the samples from XX and YY can only decrease TV\mathbf{TV}:

In particular if gg is a bijective transform, we have TV(g(X),g(Y))=TV(X,Y)\mathbf{TV}(g(X),g(Y))=\mathbf{TV}(X,Y).

Kullback-Leibler (KL) divergence:

The KL divergence between two probability distributions XX and YY defined over a emasurebale space Ω\Omega is defined as:

Now we are ready to state Pinsker inequality that connects total variation to KL divergence:

Pinsker’s Inequality.

For any two probability distributions XX and YY we have

2 Differential Privacy

A randomized algorithm M\mathcal{M} satisfies (ϵ,δ)(\epsilon,\delta)-differential privacy if, for any two datasets DD and DD^{\prime} that differ in at most one sample, and for any subset RR of the output space, we have:

While the probability of failure δ\delta is often chosen to be inversely proportional to the number of samples, there is no consensus in the literature over desirable values of ϵ\epsilon.

Rényi differential privacy (RDP)

is a stronger definition of privacy. For two probability distributions PP and QQ defined over R\mathcal{R}, the Rényi divergence of order α>1\alpha>1 is

with D1D_{1} defined by continuity D1(P    Q)=KL(P    Q)D_{1}(P\;\|\;Q)=\text{KL}(P\;\|\;Q).

A randomized mechanism f ⁣:DRf\colon\mathcal{D}\to\mathcal{R} satisfies (α,ϵ)(\alpha,\epsilon)-Rényi differential privacy (RDP) if, for any adjacent datasets DD and DD^{\prime}, we have

Rényi divergence enjoys nice properties: it is non-decreasing in α\alpha, and limα1Dα(P    Q)=KL(P    Q)\lim_{\alpha\to 1}D_{\alpha}(P\;\|\;Q)=\text{KL}(P\;\|\;Q).

DP-SGD with the subsampled Gaussian mechanism.

DP-SGD (Abadi et al., 2016) is a modification of Stochastic Gradient Descent that makes it differentially private. The core mechanism behind it is the subsampled Gaussian mechanism. Given a function ϕ\phi that operates on sets SS with sensitivity 11 (i.e. ϕ(S)ϕ(S)21\|\phi(S)-\phi(S^{\prime})\|_{2}\leq 1), the subsampled Gaussian mechanism selects sets SS as random and adds noise to the output of ϕ\phi. Note that the mechanism is differentially private even if all intermediate steps of the training process are revealed.

RDP accounting for DP-SGD.

(Abadi et al., 2016; Mironov, 2017) propose methods to account for RDP for Gaussian mechanism. The implementations of DP-SGD Opacus have these accounting procedures. This is important for us as we use these accounting methods to calculate one of the bounds we prove for membership inference for composition of sub-sampled Gaussian mechanims.

3 Membership inference attacks

is the task of predicting whether a given sample was in the training set of a given model. Homer et al. (2008) showed the first proof of concept, and Shokri et al. (2017) showed that a wide variety of machine learning models are vulnerable to such attacks. Shokri et al. (2017) train neural networks to attack machine learning models, and measure the success of the attack by the percentage of correctly predicted (train/test) samples, or equivalently the advantage (Yeom et al., 2018). While Shokri et al. (2017) trained neural networks to attack machine learning models, it was shown later that simple heuristics such as the loss (Yeom et al., 2018; Sablayrolles et al., 2019) is a more accurate and robust measure of membership inference.

Recent works (Watson et al., 2022; Carlini et al., 2021; Rezaei and Liu, 2021) have proposed to evaluate membership inference by the precision/recall trade-off (Watson et al., 2022) or the precision at low levels of recall (Carlini et al., 2021). In particular, such works show that some setups which were thought to be private because the membership accuracy is significantly less than 100%100\% can actually reveal membership of a small group of samples with very high precision.

There is also a line of work developed to design algorithms to specifically defend against membership inference attacks . Although differential privacy would bound the membership inference, these empirical defenses are potentially able to achieve better utility while withstanding against existing membership inference attacks.

Membership inference and total variation

In this section, we define our security game for membership inference and show its connection to the notion of total variation distance (or statistical distance) between probability distributions.

We adopt the classical assumptions of membership inference (Yeom et al., 2018; Sablayrolles et al., 2019; Humphries et al., 2020). We assume that data is assembled in a fixed set D={z1,,zm}D=\{z_{1},\dots,z_{m}\} (resp. D={z1,,zm,z}D^{\prime}=\{z_{1},\dots,z_{m},z^{\prime}\}), and a model θ\theta is produced by a training algorithm M\mathcal{M}: θM(z1,,zm)\theta\sim\mathcal{M}(z_{1},\dots,z_{m}).

Similar to Humphries et al. (2020), we study in particular a more powerful adversary who knows θ\theta, z1,,zmz_{1},\dots,z_{m}, zz^{\prime} and wants to know whether zz^{\prime} was used in training θ\theta. Specifically, we use the following security game between an adversary and a challenger to measure membership inference advantage.

Adversary picks a datasets D={z1,,zT}D=\{z_{1},\dots,z_{T}\} and a data point zz^{\prime}

Challenger samples a bit bb uniformly at random and creates

Challenger learns a model θ\theta by running L(D)L(D^{\prime}) and sends θ\theta to adversary.

Adversary guess a bit bb^{\prime} and wins if b=bb^{\prime}=b.

We then define the advantage of adversary AA on learning algorithm LL to be

to denote the advantage of the worst adversary against algorithm LL.

Note that we are using the notion of add/remove for neighboring datasets where the two datasets are exactly the same except that one of them has one less example. In the rest of paper, wherever we report advantage, we report it for this setting (including when we discuss the analysis of the previous works). To convert this advantage to the advantage defined for notion of neighboring datasets with replacement, we can just double the advantage of add/remove setting. Note that doubling the advantage can potentially lead to values greater than 1, in which case the bound will be vacuous.

In section 4 we prove bounds on the membership advantage for composition of Gaussian mechanisms with and without sub-sampling. Both of these bounds are tight for the advantage defined based on addition/removal. However, for the notion of advantage defined based on replacement, if we double the bounds, only the bound without replacement will remain tight. We leave the question of obtaining tight bounds on the advantage based on replacement for the sub-sampled Gaussian as an open question.

Let ZZ be a random variable that corresponds to the output of the challenger in step 3 of the security game. Also let X:=Zb=0X:=Z\mid b=0 and Y:=Zb=1Y:=Z\mid b=1. A deterministic adversary AA defines a region A\mathcal{A} and predicts that θ\theta is sampled from XX if θA\theta\in\mathcal{A} and from YY if θA\theta\notin\mathcal{A}. We use X(A)X(\mathcal{A}) and Y(A)Y(\mathcal{A}) to denote Pr[XA]\Pr[X\in\mathcal{A}] and Pr[YA]\Pr[Y\in\mathcal{A}]. For such an adversary we have

Note that with a simple averaging argument we can show that the best adversarial strategy in membership security game is a deterministic strategy. Therefore, the advantage for the learning algorithm LL is then defined as

where TV\mathbf{TV} is the total variation distance. Therefore, total variation distance gives us an upper bound on the advantage of any adversary. However, it is not clear how to calculate the total variation distance in general. In next subsection we discuss an approximation of total variation distance that can actually be calculated using existing techniques for RDP accounting.

2 Bounding Membership Inference using Pinsker’s inequality

Using Equation 5 and directly applying Pinsker’s inequality, we have:

where DαD_{\alpha} is the Renyi divergence at α\alpha. Now one might ask why this bound is better than our bound using total variation distance. The reason we state this bound is that we have techniques for calculating the Renyi divergence of composition of adaptive and sampled Gaussian mechanisms. This enables us to calculate numerical upper bounds on the membership inference advantage of any adversary against adaptive composition of sampled Gaussian mechanisms, e.g. DP-SGD.

To this end, we use RDP accounting to calculate the DαD_{\alpha} for an α>1\alpha>1. We know that for any α>1\alpha>1, Dα(X,Y)D_{\alpha}(X,Y) is greater than KL((    X),Y)\text{KL}((\;\|\;X),Y) because DαD_{\alpha} is increasing in α\alpha. This means that for any α>1\alpha>1 DαD_{\alpha} will be a valid upper bound on the membership inference advantage and the bound becomes better as we decrease α\alpha.

In Figures 1 and 2 we calculate numerical upper bounds for membership inference for DP-SGD using typical parameters. We refer to this bound as the Pinsker bound. The figure shows that this bound obtains better numerical values compared to the bound of Humphries et al. (2020). However, there are two main limitations with our Pinsker bound: 1) The bound is not tight. We are applying Pinsker’s inequality which is not optimal for Gaussian mechanism. 2) It provides vacuous bounds in cases where KL(X    Y)>2\text{KL}(X\;\|\;Y)>2. In next section, we will optimally bound the membership inference for composition of adaptive and sampled gaussian mechanisms.

Membership Inference Bounds for Composition of Gaussian Mechanisms

In this section, we show a tighter upper-bound for the adversary’s advantage. Our main results are Theorem 5 and Theorem 6. In particular, we will upper-bound the total variation between the transcripts of the (subsampled) Gaussian mechanism, specifically the noisy gradients produced by the DP-SGD algorithm. The upper bound on the result of the DP-SGD algorithm follows by application of post-processing.

The proof technique is the following: we show that the entire process of DP-SGD can be replaced by a process where each step is replaced by either N(0,σ2)\mathcal{N}(0,\sigma^{2}) or N(r,σ2)\mathcal{N}(r,\sigma^{2}) for the basic Gaussian mechanism, and replaced by (1q)N(0,σ2)+qN(r,σ2)(1-q)\mathcal{N}(0,\sigma^{2})+q\mathcal{N}(r,\sigma^{2}) for the subsampled Gaussian mechanism. Our proof technique relies on a modified version of TV\mathbf{TV}, called TVa\mathbf{TV}_{a}.

For a>0a>0 define TVa(X,Y)=12ΩP(x)aY(x)dx.\mathbf{TV}_{a}(X,Y)=\frac{1}{2}\int_{\Omega}|P(x)-a\cdot Y(x)|dx.

While we do not know an explicit form of the TVa\mathbf{TV}_{a} divergence between Gaussians, we can still prove that it increases with u1u22\lVert u_{1}-u_{2}\rVert_{2}, as formalized in the following lemma.

We use s=(s1,,sT)s=(s_{1},\dots,s_{T}) to denote the output (or transcript) of a random process that consists of TT adaptive steps (typically the subsampled Gaussian mechanism). We use st{s}_{\leq t} to denote the first tt steps of the transcript of the random process. The sampling rate qq is the probability of including any sample in the random set SS. We use SiS_{i} to denote the union of the support set of the iith step of the mechanism MM on all possible datasets. That is Si={si;D,siSupp(M(D)i)}.S_{i}=\{s_{i};\exists D,s_{i}\in\mathsf{Supp}(M(D)_{i})\}. We also define Si=S1×,Si{S}_{\leq i}=S_{1}\times\dots,S_{i}.

Let M1,,MTM_{1},\dots,M_{T} be a series of adaptive Gaussian Mechanisms with L2L_{2} sensitivity rr and Gaussian noise with standard deviation σ\sigma. The membership inference risk of the composition of MiM_{i}’s is at most as much as a single Gaussian mechanism with sensitivity Tr\sqrt{T}\cdot r and standard deviation σ\sigma.

Let Mi(D)M^{i}(D) be the mechanism that works on DD (resp. DD^{\prime}) and consists of ii Gaussians N(0,σ2)\mathcal{N}(0,\sigma^{2}) (resp. N(r,σ2)\mathcal{N}(r,\sigma^{2})) steps followed by TiT-i steps of DP-SGD applied to DD (resp. DD^{\prime}). Specifically, the mechanism M0M^{0} corresponds to DP-SGD, and MTM^{T} to pure Gaussians. We will show in this proof that

To this end, we will first argue that the very final step of the mechanism MiM^{i} can be replaced with a Gaussian step without increasing the total variation distance. Then we can move this noise to the start of the process without affecting the result, obtaining Mi+1M^{i+1}. Figure 3 illustrates this procedure.

Let us fix a step ii and let s=(s1,,sT)s=(s_{1},\dots,s_{T}) be a transcript. Denoting XMi(D)X\sim M^{i}(D) and YMi(D)Y\sim M^{i}(D^{\prime}), we have

We know that the last (TT’th) step of Mi(D)M^{i}(D) (resp. Mi(D)M^{i}(D^{\prime})) follow isotropic Gaussian distributions centered around two points u1u_{1} and u2u_{2} such that u1u22r\lVert u_{1}-u_{2}\rVert_{2}\leq r and with standard deviation σ\sigma. These centers could be chosen adaptively according to the history of the mechanism. We use a(sT1)a({s}_{\leq T-1}) to denote Pr[YT1=sT1]Pr[XT1=sT1]\frac{\Pr[{Y}_{\leq T-1}={s}_{\leq T-1}]}{\Pr[{X}_{\leq T-1}={s}_{\leq T-1}]}. By Lemma 4 we have

Denoting by NiN^{i} the mechanism that coincides with MiM^{i} for the first T1T-1 steps and is replaced by a Gaussian at the last (TT’th) step, we thus have, by following Equation 9 in the reverse direction

Given that the last step of NiN^{i} does not depend on the first T1T-1 steps, we can permute to put it in first position (see Figure 3), which shows that TV(Ni(D),Ni(D))=TV(Mi+1(D),Mi+1(D))\mathbf{TV}(N^{i}(D),N^{i}(D^{\prime}))=\mathbf{TV}(M^{i+1}(D),M^{i+1}(D^{\prime})). ∎

2 With sampling

Let M1,,MTM_{1},\dots,M_{T} be a series of adaptive Gaussian Mechanisms with L2L_{2} sensitivity rr and Gaussian noise with standard deviation σ\sigma and sub-sampling rate qq. The membership inference risk of the composition of MiM_{i}’s is at most

Let X(1q)Y+qXX^{\prime}\equiv(1-q)\cdot Y+q\cdot X then we have

The proof steps are similar to that of Theorem 5. First, we have

But since XTX_{T} and YTY_{T} are subsampled Gaussian mechanisms we have XT(1q)YT+qXTX_{T}\equiv(1-q)Y_{T}+qX^{\prime}_{T} where YY and XX^{\prime} are mixtures of Gaussians. Therefore, by Lemma 4 and Lemma 7 we have

Therefore, we can replace XTX_{T} with a mixture of two Gaussians centered at and rr and YTY_{T} with a single Gaussian centered at 0.0. Now we can use the same technique used in proof of Theorem 5 and move XTX_{T} and YTY_{T} to the first round and repeat this process. At the end, YY will turn to a nn-dimensional Gaussian centered at and standard deviation σ\sigma and XX will be a mixture of Gaussians with center randomly selected according to a nn-dimensional Bernoulli distribution with probability qq. That is, the advantage is bounded by

Theorem 5 could be simply extended to composition of Gaussian mechanisms with varying noise levels. However, if the noise levels are selected adaptively, the proof is not clear. We leave the composition of Gaussians with adaptive noise selection as an open question.

3 Numerical computation

In order to numerically approximate the upper-bound, we first convert the notion of TV\mathbf{TV} into a expectation formulation as follows:

Note that this expectation is over distribution XX. So we can sample a dataset from XX and approximate this expectation using empirical averaging (or Monte-Carlo sampling):

We know that Monte-Carlo estimation of this expectation using is very precise because the quantity is bounded between and 11.

Note that in-order to calculate this, we need to calculate Y(ti)/X(ti)Y(t_{i})/X(t_{i}) and that is possible because we have the mathematical form of the probability distribution function of for YY and XX. In Figures 1 and 2 we calculate the upper bound using this Monte-Carlo simulation for typical settings in DP-SGD.

Conclusion

In this paper, we directly analyzed membership inference bounds for composition of adaptive sampled Gaussian mechanisms. Our analysis enables us to obtain bounds that are much better that one can obtain by converting differential privacy guarantees to membership inference guarantees. Our analysis shows that although differential privacy guarantees might sometimes large membership inference guarantees, but the mechanisms that obtain differential privacy can be in fact much more secure against membership inference attacks. Previously, this phenomenon was observed for DP-SGD and here for the first time we prove it.

Our analysis is the first to directly analyze membersihp inference bounds. We limited our study to membership inference attacks against sampled Gaussian mechanisms as DP-SGD is the most used differential private learning algorithm. But this kind of membership inference analysis could be potentially done for other mechanisms and algorithm. We leave this for future work.

We also note that The parameters in DP-SGD that achieve optimal membership privacy versus utility might be different than that of differential privacy. Our new analysis opens up the possibility of a systematic search for optimal hyper parameters to obtain optimal utility for a given upper bound on membership inference advantage.

References

Appendix A Proof of Lemma 4

The first part follows by the symmetry of isotropic Gaussian. For the second part (monotonicity) we use the definition of TVa\mathbf{TV}_{a}. Without loss of generality we can assume aa\in as otherwise we can work with TVa(P,Q)/a=TV1/a(Q,P)\mathbf{TV}_{a}(P,Q)/a=\mathbf{TV}_{1/a}(Q,P). Let r=u1u22r=\lVert u_{1}-u_{2}\rVert_{2}. We can show that the derivative of the integral is always positive. In the following calculations, we use c1,c2,c3c_{1},c_{2},c_{3} and c4c_{4} to denote positive constants that are independent of rr.

Now note that we have e(r2ln(a)σ222σr)2=a1/2e(r2ln(a)σ222σr)2e^{-\left(\frac{r^{2}-\ln(a)\sigma^{2}}{2\sqrt{2}\sigma r}\right)^{2}}=a^{1/2}\cdot e^{-\left(\frac{-r^{2}-\ln(a)\sigma^{2}}{2\sqrt{2}\sigma r}\right)^{2}}. Therefore, we have

Now since aa\in, we have ln(a)0\ln(a)\leq 0 and a1<0\sqrt{a}-1<0. Which means the term 1+a22σ+ln(a)(a1)σ22r2\frac{1+\sqrt{a}}{2\sqrt{2}\sigma}+\frac{\ln(a)(\sqrt{a}-1)\sigma}{2\sqrt{2}r^{2}} is positive. This implies that the whole gradient is positive.

Appendix B Membership inference precision

In this section, we refine the analysis of Sablayrolles et al. for the accuracy of a membership attack.

Let us first derive a bound on the precision of membership inference. We assume that there are two datasets DD and DD^{\prime} and that a differentially-private mechanism M\mathcal{M} trains a model represented by θ\theta.

With probability (1δ(1-\delta) over the choice of θ\theta, we have:

with σ(u)=1/(1+exp(u))\sigma(u)=1/(1+\exp(-u)) the sigmoid function.

Upper-bound on attack accuracy.

This means that the attack accuracy is bounded by σ(ϵ)\sigma(\epsilon) with probability 1δ1-\delta. Empirically, we see that the sigmoid function closely matches the bound given by Humphries et al. . Simply stated, this derivation shows that the bound proven by Humphries et al. actually holds with probability 1δ1-\delta instead of on average.