Optimal Membership Inference Bounds for Adaptive Composition of Sampled Gaussian Mechanisms

Saeed Mahloujifar, Alexandre Sablayrolles, Graham Cormode, Somesh Jha

Introduction

The recent success of machine learning models make them the go-to approach to solve a variety of problems, ranging from computer vision (Krizhevsky et al., 2012) to NLP (Sutskever et al., 2014), including applications to sensitive data such as health records or chatbots. Access to a trained machine learning model, through a black-box API or a white-box access to a published model, can leak traces of information (Dwork et al., 2015) from the training data. Researchers have tried to measure this information leakage through metrics such as membership inference (Shokri et al., 2017). Membership inference is the task of guessing, from a trained model, whether it includes a given sample or not. This task is both interesting in its own right, as the participation of an individual in a data collection can be a sensitive information. It also serves as the “most significant bit” of information: if membership inference fails, attacks revealing more information such as reconstruction attacks (Fredrikson et al., 2014; Carlini et al., 2020) will also fail. In other words, defending against membership inference attacks would also defend against attacks such as reconstruction attacks that aim at reconstructing training examples.

The standard approach to provably defeat these membership privacy attacks is differential privacy (Dwork et al., 2006). Differential privacy defines a class of training algorithms that respect a privacy budget $\epsilon$ and a probability of failure $\delta$ . These quantities quantify how much information about each individual training example is revealed by the output of the algorithm. Most algorithms obtaining differential privacy need to inject noise somewhere in their process. The amount of injected noise then creates a trade-off between privacy utility of the trained model. To measure the privacy of a given algorithms, researchers have developed advanced mathematical tools and notions such as Renyi differential privacy Mironov (2017); Abadi et al. (2016) and advanced composition theorems Dwork et al. (2010); Kairouz et al. (2015). These tools allow us to calculate $(\epsilon,\delta)$ values for carefully designed algorithms.

Previous work has shown that any deferentially private algorithm will provably bound the accuracy of any membership inference adversary. Specifically, starting with a given $(\epsilon,\delta)$ , Humphries et al. (2020) prove that any model trained with $(\epsilon,\delta)$ differential privacy will induce an upper bound on the accuracy of membership adversary and this upper bound depends only on $\epsilon$ and $\delta$ . These upper bounds enables us to obtain provable defenses against membership inference attacks by using deferentially private algorithms. In fact, there is a large gap between upper bounds proved for the power of adversaries in performing membership inference, and the power of real adversaries that try to attack deferentially private models. One hypothesis is that the membership inference bound obtained by differential privacy is in reality stronger than what we could prove. In particular, the process of obtaining differential privacy bounds and then converting those bounds to membership inference bounds could be sub-optimal . In this work we ask the following question:

Can we develop tools to directly and optimally analyze the membership inference upper bounds for algorithms, without going through differential privacy?

Our contributions in this work are as follows:

Membership inference bounds for composition of sampled Gaussian mechanisms Our main theorem bounds the membership inference advantage of any adversary the composition of an arbitrary set of sampled Gaussian mechanisms that could adaptive depend on each other. Specifically, for any adaptive series of sampled Gaussian mechanism $(M_{1},\dots,M_{T})$ where $M_{i}$ has sensitivity $1.0$ and standard deviation $\sigma_{i}$ and sub-sampling rate $q_{i}$ , we show that the membership inference advantage of any adversary is bounded by the total variation distance between two mixture of Gaussians defined by $\sigma=(\sigma_{1},\dots,\sigma_{T})$ and $q=(q_{1},\dots,q_{T})$ . This bound is optimal as it reflects the membership inference advantage for a real adversary on a particular series of Gaussian mechanisms.

We propose a numerical way to calculate the total variation distance between mixture of Gaussians. Our algorithm is computationally tractable and works in linear time with respect to number of mechanisms and is independent from the dimension. We use our numerical approach to obtain concrete bounds on membership inference for Gaussian mechanisms and compare our bounds to that of Humphries et al. (2020).

Finally, to understand the practical implication of our bound for mainstream datasets, we use DP-SGD to train models on CIFAR10 and calculate the membership inference bounds using our techniques. Our approach allows us to achieve state-of-the-art provable membership inference privacy for any given accuracy.

The most widely-used DP algorithm in machine learning applications is DP-SGD: it is a small change of the classical stochastic gradient descent algorithm that only requires to clip per-sample gradients, average them and add Gaussian noise. DP-SGD has been shown to be more accurate than other differentially private training algorithms in the case of linear and convex models (van der Maaten and Hannun, 2020). Each iteration of DP-SGD is an instance of the sampled Gaussian mechanism, which chooses a fraction $q$ of a dataset and outputs a noisy sum of the desired quantity. Our analysis mirrors the recent shift in the field of empirical membership inference, from advantage (or accuracy) metrics (Shokri et al., 2017; Yeom et al., 2018; Sablayrolles et al., 2019) to precision/recall measures (Watson et al., 2022; Carlini et al., 2020). Differential privacy guarantees are known to yield tight true positive and false positive rates (Nasr et al., 2021). Our paper is the first to show equivalent results in the case of advantage. Our analysis explains, from a theoretical point of view, why the precision and recall of membership inference attacks is stronger than the accuracy.

Background

In this section we provide all the background information necessary for understanding our main theorems, proofs, and algorithms.

Here, we recall two notion of distance between probability distribution, KL divergence and total variation distance.

Total variation distance between two probability distributions $X$ and $Y$ is defined as:

TV properties.

We recall briefly some properties of TV that will be useful in the remainder of this paper. We first note the following characterization of TV:

Numerous properties about $\mathbf{TV}$ can be derived from this characterization. In particular, we are interested in post-processing. Given a function $g:E\to F$ , applying $g$ to the samples from $X$ and $Y$ can only decrease $\mathbf{TV}$ :

In particular if $g$ is a bijective transform, we have $\mathbf{TV}(g(X),g(Y))=\mathbf{TV}(X,Y)$ .

Kullback-Leibler (KL) divergence:

The KL divergence between two probability distributions $X$ and $Y$ defined over a emasurebale space $\Omega$ is defined as:

Now we are ready to state Pinsker inequality that connects total variation to KL divergence:

Pinsker’s Inequality.

For any two probability distributions $X$ and $Y$ we have

2 Differential Privacy

A randomized algorithm $\mathcal{M}$ satisfies $(\epsilon,\delta)$ -differential privacy if, for any two datasets $D$ and $D^{\prime}$ that differ in at most one sample, and for any subset $R$ of the output space, we have:

While the probability of failure $\delta$ is often chosen to be inversely proportional to the number of samples, there is no consensus in the literature over desirable values of $\epsilon$ .

Rényi differential privacy (RDP)

is a stronger definition of privacy. For two probability distributions $P$ and $Q$ defined over $\mathcal{R}$ , the Rényi divergence of order $\alpha>1$ is

with $D_{1}$ defined by continuity $D_{1}(P\;\|\;Q)=\text{KL}(P\;\|\;Q)$ .

A randomized mechanism $f\colon\mathcal{D}\to\mathcal{R}$ satisfies $(\alpha,\epsilon)$ -Rényi differential privacy (RDP) if, for any adjacent datasets $D$ and $D^{\prime}$ , we have

Rényi divergence enjoys nice properties: it is non-decreasing in $\alpha$ , and $\lim_{\alpha\to 1}D_{\alpha}(P\;\|\;Q)=\text{KL}(P\;\|\;Q)$ .

DP-SGD with the subsampled Gaussian mechanism.

DP-SGD (Abadi et al., 2016) is a modification of Stochastic Gradient Descent that makes it differentially private. The core mechanism behind it is the subsampled Gaussian mechanism. Given a function $\phi$ that operates on sets $S$ with sensitivity $1$ (i.e. $\|\phi(S)-\phi(S^{\prime})\|_{2}\leq 1$ ), the subsampled Gaussian mechanism selects sets $S$ as random and adds noise to the output of $\phi$ . Note that the mechanism is differentially private even if all intermediate steps of the training process are revealed.

RDP accounting for DP-SGD.

(Abadi et al., 2016; Mironov, 2017) propose methods to account for RDP for Gaussian mechanism. The implementations of DP-SGD Opacus have these accounting procedures. This is important for us as we use these accounting methods to calculate one of the bounds we prove for membership inference for composition of sub-sampled Gaussian mechanims.

3 Membership inference attacks

is the task of predicting whether a given sample was in the training set of a given model. Homer et al. (2008) showed the first proof of concept, and Shokri et al. (2017) showed that a wide variety of machine learning models are vulnerable to such attacks. Shokri et al. (2017) train neural networks to attack machine learning models, and measure the success of the attack by the percentage of correctly predicted (train/test) samples, or equivalently the advantage (Yeom et al., 2018). While Shokri et al. (2017) trained neural networks to attack machine learning models, it was shown later that simple heuristics such as the loss (Yeom et al., 2018; Sablayrolles et al., 2019) is a more accurate and robust measure of membership inference.

Recent works (Watson et al., 2022; Carlini et al., 2021; Rezaei and Liu, 2021) have proposed to evaluate membership inference by the precision/recall trade-off (Watson et al., 2022) or the precision at low levels of recall (Carlini et al., 2021). In particular, such works show that some setups which were thought to be private because the membership accuracy is significantly less than $100\%$ can actually reveal membership of a small group of samples with very high precision.

There is also a line of work developed to design algorithms to specifically defend against membership inference attacks . Although differential privacy would bound the membership inference, these empirical defenses are potentially able to achieve better utility while withstanding against existing membership inference attacks.

Membership inference and total variation

In this section, we define our security game for membership inference and show its connection to the notion of total variation distance (or statistical distance) between probability distributions.

We adopt the classical assumptions of membership inference (Yeom et al., 2018; Sablayrolles et al., 2019; Humphries et al., 2020). We assume that data is assembled in a fixed set $D=\{z_{1},\dots,z_{m}\}$ (resp. $D^{\prime}=\{z_{1},\dots,z_{m},z^{\prime}\}$ ), and a model $\theta$ is produced by a training algorithm $\mathcal{M}$ : $\theta\sim\mathcal{M}(z_{1},\dots,z_{m})$ .

Similar to Humphries et al. (2020), we study in particular a more powerful adversary who knows $\theta$ , $z_{1},\dots,z_{m}$ , $z^{\prime}$ and wants to know whether $z^{\prime}$ was used in training $\theta$ . Specifically, we use the following security game between an adversary and a challenger to measure membership inference advantage.

Adversary picks a datasets $D=\{z_{1},\dots,z_{T}\}$ and a data point $z^{\prime}$

Challenger samples a bit $b$ uniformly at random and creates

Challenger learns a model $\theta$ by running $L(D^{\prime})$ and sends $\theta$ to adversary.

Adversary guess a bit $b^{\prime}$ and wins if $b^{\prime}=b$ .

We then define the advantage of adversary $A$ on learning algorithm $L$ to be

to denote the advantage of the worst adversary against algorithm $L$ .

Note that we are using the notion of add/remove for neighboring datasets where the two datasets are exactly the same except that one of them has one less example. In the rest of paper, wherever we report advantage, we report it for this setting (including when we discuss the analysis of the previous works). To convert this advantage to the advantage defined for notion of neighboring datasets with replacement, we can just double the advantage of add/remove setting. Note that doubling the advantage can potentially lead to values greater than 1, in which case the bound will be vacuous.

In section 4 we prove bounds on the membership advantage for composition of Gaussian mechanisms with and without sub-sampling. Both of these bounds are tight for the advantage defined based on addition/removal. However, for the notion of advantage defined based on replacement, if we double the bounds, only the bound without replacement will remain tight. We leave the question of obtaining tight bounds on the advantage based on replacement for the sub-sampled Gaussian as an open question.

Let $Z$ be a random variable that corresponds to the output of the challenger in step 3 of the security game. Also let $X:=Z\mid b=0$ and $Y:=Z\mid b=1$ . A deterministic adversary $A$ defines a region $\mathcal{A}$ and predicts that $\theta$ is sampled from $X$ if $\theta\in\mathcal{A}$ and from $Y$ if $\theta\notin\mathcal{A}$ . We use $X(\mathcal{A})$ and $Y(\mathcal{A})$ to denote $\Pr[X\in\mathcal{A}]$ and $\Pr[Y\in\mathcal{A}]$ . For such an adversary we have

Note that with a simple averaging argument we can show that the best adversarial strategy in membership security game is a deterministic strategy. Therefore, the advantage for the learning algorithm $L$ is then defined as

where $\mathbf{TV}$ is the total variation distance. Therefore, total variation distance gives us an upper bound on the advantage of any adversary. However, it is not clear how to calculate the total variation distance in general. In next subsection we discuss an approximation of total variation distance that can actually be calculated using existing techniques for RDP accounting.

2 Bounding Membership Inference using Pinsker’s inequality

Using Equation 5 and directly applying Pinsker’s inequality, we have:

where $D_{\alpha}$ is the Renyi divergence at $\alpha$ . Now one might ask why this bound is better than our bound using total variation distance. The reason we state this bound is that we have techniques for calculating the Renyi divergence of composition of adaptive and sampled Gaussian mechanisms. This enables us to calculate numerical upper bounds on the membership inference advantage of any adversary against adaptive composition of sampled Gaussian mechanisms, e.g. DP-SGD.

To this end, we use RDP accounting to calculate the $D_{\alpha}$ for an $\alpha>1$ . We know that for any $\alpha>1$ , $D_{\alpha}(X,Y)$ is greater than $\text{KL}((\;\|\;X),Y)$ because $D_{\alpha}$ is increasing in $\alpha$ . This means that for any $\alpha>1$ $D_{\alpha}$ will be a valid upper bound on the membership inference advantage and the bound becomes better as we decrease $\alpha$ .

In Figures 1 and 2 we calculate numerical upper bounds for membership inference for DP-SGD using typical parameters. We refer to this bound as the Pinsker bound. The figure shows that this bound obtains better numerical values compared to the bound of Humphries et al. (2020). However, there are two main limitations with our Pinsker bound: 1) The bound is not tight. We are applying Pinsker’s inequality which is not optimal for Gaussian mechanism. 2) It provides vacuous bounds in cases where $\text{KL}(X\;\|\;Y)>2$ . In next section, we will optimally bound the membership inference for composition of adaptive and sampled gaussian mechanisms.

Membership Inference Bounds for Composition of Gaussian Mechanisms

In this section, we show a tighter upper-bound for the adversary’s advantage. Our main results are Theorem 5 and Theorem 6. In particular, we will upper-bound the total variation between the transcripts of the (subsampled) Gaussian mechanism, specifically the noisy gradients produced by the DP-SGD algorithm. The upper bound on the result of the DP-SGD algorithm follows by application of post-processing.

The proof technique is the following: we show that the entire process of DP-SGD can be replaced by a process where each step is replaced by either $\mathcal{N}(0,\sigma^{2})$ or $\mathcal{N}(r,\sigma^{2})$ for the basic Gaussian mechanism, and replaced by $(1-q)\mathcal{N}(0,\sigma^{2})+q\mathcal{N}(r,\sigma^{2})$ for the subsampled Gaussian mechanism. Our proof technique relies on a modified version of $\mathbf{TV}$ , called $\mathbf{TV}_{a}$ .

For $a>0$ define $\mathbf{TV}_{a}(X,Y)=\frac{1}{2}\int_{\Omega}|P(x)-a\cdot Y(x)|dx.$

While we do not know an explicit form of the $\mathbf{TV}_{a}$ divergence between Gaussians, we can still prove that it increases with $\lVert u_{1}-u_{2}\rVert_{2}$ , as formalized in the following lemma.

We use $s=(s_{1},\dots,s_{T})$ to denote the output (or transcript) of a random process that consists of $T$ adaptive steps (typically the subsampled Gaussian mechanism). We use ${s}_{\leq t}$ to denote the first $t$ steps of the transcript of the random process. The sampling rate $q$ is the probability of including any sample in the random set $S$ . We use $S_{i}$ to denote the union of the support set of the $i$ th step of the mechanism $M$ on all possible datasets. That is $S_{i}=\{s_{i};\exists D,s_{i}\in\mathsf{Supp}(M(D)_{i})\}.$ We also define ${S}_{\leq i}=S_{1}\times\dots,S_{i}$ .

Let $M_{1},\dots,M_{T}$ be a series of adaptive Gaussian Mechanisms with $L_{2}$ sensitivity $r$ and Gaussian noise with standard deviation $\sigma$ . The membership inference risk of the composition of $M_{i}$ ’s is at most as much as a single Gaussian mechanism with sensitivity $\sqrt{T}\cdot r$ and standard deviation $\sigma$ .

Let $M^{i}(D)$ be the mechanism that works on $D$ (resp. $D^{\prime}$ ) and consists of $i$ Gaussians $\mathcal{N}(0,\sigma^{2})$ (resp. $\mathcal{N}(r,\sigma^{2})$ ) steps followed by $T-i$ steps of DP-SGD applied to $D$ (resp. $D^{\prime}$ ). Specifically, the mechanism $M^{0}$ corresponds to DP-SGD, and $M^{T}$ to pure Gaussians. We will show in this proof that

To this end, we will first argue that the very final step of the mechanism $M^{i}$ can be replaced with a Gaussian step without increasing the total variation distance. Then we can move this noise to the start of the process without affecting the result, obtaining $M^{i+1}$ . Figure 3 illustrates this procedure.

Let us fix a step $i$ and let $s=(s_{1},\dots,s_{T})$ be a transcript. Denoting $X\sim M^{i}(D)$ and $Y\sim M^{i}(D^{\prime})$ , we have

We know that the last ( $T$ ’th) step of $M^{i}(D)$ (resp. $M^{i}(D^{\prime})$ ) follow isotropic Gaussian distributions centered around two points $u_{1}$ and $u_{2}$ such that $\lVert u_{1}-u_{2}\rVert_{2}\leq r$ and with standard deviation $\sigma$ . These centers could be chosen adaptively according to the history of the mechanism. We use $a({s}_{\leq T-1})$ to denote $\frac{\Pr[{Y}_{\leq T-1}={s}_{\leq T-1}]}{\Pr[{X}_{\leq T-1}={s}_{\leq T-1}]}$ . By Lemma 4 we have

Denoting by $N^{i}$ the mechanism that coincides with $M^{i}$ for the first $T-1$ steps and is replaced by a Gaussian at the last ( $T$ ’th) step, we thus have, by following Equation 9 in the reverse direction

Given that the last step of $N^{i}$ does not depend on the first $T-1$ steps, we can permute to put it in first position (see Figure 3), which shows that $\mathbf{TV}(N^{i}(D),N^{i}(D^{\prime}))=\mathbf{TV}(M^{i+1}(D),M^{i+1}(D^{\prime}))$ . ∎

2 With sampling

Let $M_{1},\dots,M_{T}$ be a series of adaptive Gaussian Mechanisms with $L_{2}$ sensitivity $r$ and Gaussian noise with standard deviation $\sigma$ and sub-sampling rate $q$ . The membership inference risk of the composition of $M_{i}$ ’s is at most

Let $X^{\prime}\equiv(1-q)\cdot Y+q\cdot X$ then we have

The proof steps are similar to that of Theorem 5. First, we have

But since $X_{T}$ and $Y_{T}$ are subsampled Gaussian mechanisms we have $X_{T}\equiv(1-q)Y_{T}+qX^{\prime}_{T}$ where $Y$ and $X^{\prime}$ are mixtures of Gaussians. Therefore, by Lemma 4 and Lemma 7 we have

Therefore, we can replace $X_{T}$ with a mixture of two Gaussians centered at and $r$ and $Y_{T}$ with a single Gaussian centered at $0.$ Now we can use the same technique used in proof of Theorem 5 and move $X_{T}$ and $Y_{T}$ to the first round and repeat this process. At the end, $Y$ will turn to a $n$ -dimensional Gaussian centered at and standard deviation $\sigma$ and $X$ will be a mixture of Gaussians with center randomly selected according to a $n$ -dimensional Bernoulli distribution with probability $q$ . That is, the advantage is bounded by

Theorem 5 could be simply extended to composition of Gaussian mechanisms with varying noise levels. However, if the noise levels are selected adaptively, the proof is not clear. We leave the composition of Gaussians with adaptive noise selection as an open question.

3 Numerical computation

In order to numerically approximate the upper-bound, we first convert the notion of $\mathbf{TV}$ into a expectation formulation as follows:

Note that this expectation is over distribution $X$ . So we can sample a dataset from $X$ and approximate this expectation using empirical averaging (or Monte-Carlo sampling):

We know that Monte-Carlo estimation of this expectation using is very precise because the quantity is bounded between and $1$ .

Note that in-order to calculate this, we need to calculate $Y(t_{i})/X(t_{i})$ and that is possible because we have the mathematical form of the probability distribution function of for $Y$ and $X$ . In Figures 1 and 2 we calculate the upper bound using this Monte-Carlo simulation for typical settings in DP-SGD.

Conclusion

In this paper, we directly analyzed membership inference bounds for composition of adaptive sampled Gaussian mechanisms. Our analysis enables us to obtain bounds that are much better that one can obtain by converting differential privacy guarantees to membership inference guarantees. Our analysis shows that although differential privacy guarantees might sometimes large membership inference guarantees, but the mechanisms that obtain differential privacy can be in fact much more secure against membership inference attacks. Previously, this phenomenon was observed for DP-SGD and here for the first time we prove it.

Our analysis is the first to directly analyze membersihp inference bounds. We limited our study to membership inference attacks against sampled Gaussian mechanisms as DP-SGD is the most used differential private learning algorithm. But this kind of membership inference analysis could be potentially done for other mechanisms and algorithm. We leave this for future work.

We also note that The parameters in DP-SGD that achieve optimal membership privacy versus utility might be different than that of differential privacy. Our new analysis opens up the possibility of a systematic search for optimal hyper parameters to obtain optimal utility for a given upper bound on membership inference advantage.

References

Appendix A Proof of Lemma 4

The first part follows by the symmetry of isotropic Gaussian. For the second part (monotonicity) we use the definition of $\mathbf{TV}_{a}$ . Without loss of generality we can assume $a\in$ as otherwise we can work with $\mathbf{TV}_{a}(P,Q)/a=\mathbf{TV}_{1/a}(Q,P)$ . Let $r=\lVert u_{1}-u_{2}\rVert_{2}$ . We can show that the derivative of the integral is always positive. In the following calculations, we use $c_{1},c_{2},c_{3}$ and $c_{4}$ to denote positive constants that are independent of $r$ .

Now note that we have $e^{-\left(\frac{r^{2}-\ln(a)\sigma^{2}}{2\sqrt{2}\sigma r}\right)^{2}}=a^{1/2}\cdot e^{-\left(\frac{-r^{2}-\ln(a)\sigma^{2}}{2\sqrt{2}\sigma r}\right)^{2}}$ . Therefore, we have

Now since $a\in$ , we have $\ln(a)\leq 0$ and $\sqrt{a}-1<0$ . Which means the term $\frac{1+\sqrt{a}}{2\sqrt{2}\sigma}+\frac{\ln(a)(\sqrt{a}-1)\sigma}{2\sqrt{2}r^{2}}$ is positive. This implies that the whole gradient is positive.

Appendix B Membership inference precision

In this section, we refine the analysis of Sablayrolles et al. for the accuracy of a membership attack.

Let us first derive a bound on the precision of membership inference. We assume that there are two datasets $D$ and $D^{\prime}$ and that a differentially-private mechanism $\mathcal{M}$ trains a model represented by $\theta$ .

With probability $(1-\delta$ ) over the choice of $\theta$ , we have:

with $\sigma(u)=1/(1+\exp(-u))$ the sigmoid function.

Upper-bound on attack accuracy.

This means that the attack accuracy is bounded by $\sigma(\epsilon)$ with probability $1-\delta$ . Empirically, we see that the sigmoid function closely matches the bound given by Humphries et al. . Simply stated, this derivation shows that the bound proven by Humphries et al. actually holds with probability $1-\delta$ instead of on average.