Efficient Neural Network Robustness Certification with General Activation Functions

Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, Luca Daniel

Introduction

While neural networks (NNs) have achieved remarkable performance and accomplished unprecedented breakthroughs in many machine learning tasks, recent studies have highlighted their lack of robustness against adversarial perturbations. For example, in image learning tasks such as object classification or content captioning, visually indistinguishable adversarial examples can be easily crafted from natural images to alter a NN’s prediction result. Beyond the white-box attack setting where the target model is entirely transparent, visually imperceptible adversarial perturbations can also be generated in the black-box setting by only using the prediction results of the target model. In addition, real-life adversarial examples have been made possible through the lens of realizing physical perturbations. As NNs are becoming a core technique deployed in a wide range of applications, including safety-critical tasks, certifying robustness of a NN against adversarial perturbations has become an important research topic in machine learning.

Although certifying the largest possible robustness is challenging for ReLU networks, the piece-wise linear nature of ReLUs can be exploited to efficiently compute a non-trivial certified lower bound of the minimum distortion. Beyond ReLU, one fundamental problem that remains largely unexplored is how to generalize the robustness certification technique to other popular activation functions that are not piece-wise linear, such as tanh and sigmoid, and how to motivate and certify the design of other activation functions towards improved robustness. In this paper, we tackle the preceding problem by proposing an efficient robustness certification framework for NNs with general activation functions. Our main contributions in this paper are summarized as follows:

We propose a generic analysis framework CROWN for certifying NNs using linear or quadratic upper and lower bounds for general activation functions that are not necessarily piece-wise linear.

Unlike previous work, CROWN allows flexible selection of upper/lower bounds for activation functions, enabling an adaptive scheme that helps to reduce approximation errors. Our experiments show that CROWN achieves up to 26% improvement in certified lower bounds over that work.

Our algorithm is efficient and can scale to large NNs with various activation functions. For a NN with over 10,000 neurons, we can give a certified lower bound in about 1 minute on 1 CPU core.

Background and Related Work

For ReLU networks, finding the minimum adversarial distortion for a given input data point $x_{0}$ can be cast as a mixed integer linear programming (MILP) problem. Reluplex uses satisfiability modulo theories (SMT) to encode ReLU activations into linear constraints. Similarly, Planet uses satisfiability (SAT) solvers. However, because solving such a problem is NP-complete, these methods can only find the minimum distortion for very small networks. It can take Reluplex several hours to find the minimum distortion of an example for a ReLU network with 5 inputs, 5 outputs and 300 neurons.

On the other hand, a computationally feasible alternative for robustness certification is to provide a non-trivial certified lower bound of the minimum distortion. Some analytical lower bounds based on operator norms of the weight matrices or the Jacobian matrix in NNs do not exploit the special properties of ReLU and thus can be very loose. Other bounds are based on the local Lipschitz constant: one assumes a continuously differentiable NN and hence excludes ReLU networks, and its closed-form lower bound is hard to derive for networks beyond 2 layers; another applies to ReLU networks and uses Extreme Value Theory to provide an estimated lower bound (CLEVER score). Although the CLEVER score is capable of reflecting the level of robustness in different NNs and is scalable to large networks, it is not a certified lower bound. Another line of work uses the idea of a convex outer adversarial polytope in ReLU networks to compute a certified lower bound by relaxing the MILP certification problem to linear programming (LP). Semidefinite programming has also been applied to robustness certification in ReLU networks, but that approach is limited to NNs with one hidden layer. Fast-Lin and Fast-Lip exploit the ReLU property to bound the activation function (or the local Lipschitz constant) and provide efficient algorithms for computing a certified lower bound, achieving state-of-the-art performance. A recent work uses abstract transformations on zonotopes to prove robustness properties of ReLU networks. Nonetheless, some applications still demand non-ReLU activations, e.g. RNNs and LSTMs, so a general framework that can efficiently compute non-trivial and certified lower bounds for NNs with general activation functions is of great importance. We aim to fill this gap and propose CROWN, which performs efficient robustness certification for NNs with general activation functions. Table 1 summarizes the differences between other approaches and CROWN.
Note that a recent work based on solving a Lagrangian dual can also handle general activation functions, but it trades off the quality of the robustness bound against scalability.

CROWN: A general framework for certifying neural networks

3.1 General framework

Input perturbation and pre-activation bounds.

Below, we first define the linear upper bounds and lower bounds of activation functions in Definition 3.1, which are the key to derive explicit output bounds for an $m$ -layer neural network with general activation functions. The formal statement of the explicit output bounds is shown in Theorem 3.2.

Note that the parameters $\mathbf{\alpha}^{(k)}_{U,{r}},\mathbf{\alpha}^{(k)}_{L,{r}},\mathbf{\beta}^{(k)}_{U,{r}},\mathbf{\beta}^{(k)}_{L,{r}}$ depend on $\mathbf{l}^{(k)}_{r}$ and $\mathbf{u}^{(k)}_{r}$ , i.e. for different $\mathbf{l}^{(k)}_{r}$ and $\mathbf{u}^{(k)}_{r}$ we may choose different parameters. Also, for ease of exposition, in this paper we restrict $\mathbf{\alpha}^{(k)}_{U,{r}},\mathbf{\alpha}^{(k)}_{L,{r}}\geq 0$ . However, Theorem 3.2 can be easily generalized to the case of negative $\mathbf{\alpha}^{(k)}_{U,{r}},\mathbf{\alpha}^{(k)}_{L,{r}}$ .

Theorem 3.2 illustrates how a NN function $f_{j}(\mathbf{x})$ can be bounded by two linear functions $f^{U}_{j}(\mathbf{x})$ and $f^{L}_{j}(\mathbf{x})$ when the activation function of each neuron is bounded by two linear functions $h^{(k)}_{U,r}$ and $h^{(k)}_{L,r}$ in Definition 3.1. The central idea is to unwrap the activation functions layer by layer by considering the signs of the associated (equivalent) weights of each neuron and apply the two linear bounds $h^{(k)}_{U,r}$ and $h^{(k)}_{L,r}$ . As we demonstrate in the proof, when we replace the activation functions with the corresponding linear upper bounds and lower bounds at the layer $m-1$ , we can then define equivalent weights and biases based on the parameters of $h^{(m-1)}_{U,r}$ and $h^{(m-1)}_{L,r}$ (e.g. $\mathbf{\Lambda}^{(k)},\mathbf{\Delta}^{(k)},\mathbf{\Omega}^{(k)},\mathbf{\Theta}^{(k)}$ are related to the terms $\mathbf{\alpha}^{(k)}_{U,{r}},\mathbf{\beta}^{(k)}_{U,{r}},\mathbf{\alpha}^{(k)}_{L,{r}},\mathbf{\beta}^{(k)}_{L,{r}}$ , respectively) and then repeat the procedure to “back-propagate” to the input layer. This allows us to obtain $f^{U}_{j}(\mathbf{x})$ and $f^{L}_{j}(\mathbf{x})$ in (1). The formal proof of Theorem 3.2 is in Appendix A. Note that for a neuron $r$ in layer $k$ , the slopes of its linear upper and lower bounds $\mathbf{\alpha}^{(k)}_{U,{r}},\mathbf{\alpha}^{(k)}_{L,{r}}$ can be different. This implies:

Fast-Lin is a special case of our framework, as it requires the slopes $\mathbf{\alpha}^{(k)}_{U,{r}},\mathbf{\alpha}^{(k)}_{L,{r}}$ to be the same and only applies to ReLU networks (cf. Sec. 3.2). In Fast-Lin, $\mathbf{\Lambda}^{(0)}$ and $\mathbf{\Omega}^{(0)}$ are identical.

Our CROWN framework allows adaptive selection of the linear approximation when computing certified lower bounds of minimum adversarial distortion, which is the main contributor to the improved certified lower bounds demonstrated in the experiments in Section 4.
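To make the sign-dependent substitution concrete, the following is a minimal sketch for a network with a single hidden layer and a scalar output under an $\ell_\infty$ perturbation. It is an illustration under stated assumptions, not the authors' implementation: the names `relu_lines` and `output_bounds` are hypothetical, lines are stored as (slope, intercept) pairs rather than in the $\alpha(y+\beta)$ form, and the pre-activation bounds are obtained by simple interval arithmetic.

```python
from typing import Callable, List, Sequence, Tuple

Line = Tuple[float, float]  # (slope, intercept) of h(y) = slope * y + intercept


def relu_lines(l: float, u: float) -> Tuple[Line, Line]:
    """Illustrative (h_U, h_L) for ReLU on [l, u]; any valid pair of lines would do."""
    if l >= 0.0:
        return (1.0, 0.0), (1.0, 0.0)
    if u <= 0.0:
        return (0.0, 0.0), (0.0, 0.0)
    s = u / (u - l)
    return (s, -s * l), (s, 0.0)


def output_bounds(
    W1: Sequence[Sequence[float]],
    b1: Sequence[float],
    w2: Sequence[float],
    b2: float,
    x0: Sequence[float],
    eps: float,
    lines: Callable[[float, float], Tuple[Line, Line]] = relu_lines,
) -> Tuple[float, float]:
    """Bounds on f(x) = w2 . sigma(W1 x + b1) + b2 over ||x - x0||_inf <= eps."""
    n_in = len(x0)
    slope_up: List[float] = [0.0] * n_in  # equivalent input weights, upper bound
    slope_lo: List[float] = [0.0] * n_in  # equivalent input weights, lower bound
    const_up = const_lo = b2
    for r, row in enumerate(W1):
        # Pre-activation bounds of neuron r over the l_inf ball.
        center = sum(w * x for w, x in zip(row, x0)) + b1[r]
        radius = eps * sum(abs(w) for w in row)
        h_u, h_l = lines(center - radius, center + radius)
        # A non-negative output weight keeps h_U for the upper bound and h_L
        # for the lower bound; a negative weight swaps the two.
        (a_up, c_up), (a_lo, c_lo) = (h_u, h_l) if w2[r] >= 0.0 else (h_l, h_u)
        for i in range(n_in):
            slope_up[i] += w2[r] * a_up * row[i]
            slope_lo[i] += w2[r] * a_lo * row[i]
        const_up += w2[r] * (a_up * b1[r] + c_up)
        const_lo += w2[r] * (a_lo * b1[r] + c_lo)
    # Maximise / minimise the two linear functions over the l_inf ball.
    upper = sum(s * x for s, x in zip(slope_up, x0)) + eps * sum(abs(s) for s in slope_up) + const_up
    lower = sum(s * x for s, x in zip(slope_lo, x0)) - eps * sum(abs(s) for s in slope_lo) + const_lo
    return lower, upper
```

With more hidden layers the same substitution would be repeated layer by layer, with the sign test applied to the equivalent weights rather than to the last-layer weights.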

Certified lower bound of minimum distortion.

Time Complexity.

3.2 Case studies: CROWN for ReLU, tanh, sigmoid and arctan activations

In Section 3.1 we showed that as long as one can identify two linear functions $h_{U}(y),h_{L}(y)$ to bound a general activation function $\sigma(y)$ for each neuron, we can use Corollary 3.3 with a binary search to obtain certified lower bounds of minimum distortion. In this section, we illustrate how to find parameters $\mathbf{\alpha}^{(k)}_{U,{r}},\mathbf{\alpha}^{(k)}_{L,{r}}$ and $\mathbf{\beta}^{(k)}_{U,{r}},\mathbf{\beta}^{(k)}_{L,{r}}$ of $h_{U}(y),h_{L}(y)$ for four of the most widely used activation functions: ReLU, tanh, sigmoid and arctan. Other activations, including but not limited to leaky ReLU, ELU and softplus, can be easily incorporated into our CROWN framework following a similar procedure.
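The binary search mentioned above can be sketched as follows. The predicate `is_certified` is a hypothetical stand-in for a check built on Corollary 3.3 (for instance, that the lower bound of the true class exceeds the upper bound of the target class at radius `eps`); the search assumes this predicate holds for all radii below some threshold and fails above it.

```python
from typing import Callable


def certified_lower_bound(
    is_certified: Callable[[float], bool],
    eps_init: float = 1.0,
    tol: float = 1e-4,
    max_doublings: int = 20,
) -> float:
    """Largest radius (up to tol) for which is_certified(eps) holds.

    is_certified is a hypothetical, monotone predicate supplied by the caller.
    """
    lo, hi = 0.0, eps_init
    # Grow the bracket until certification fails at hi.
    for _ in range(max_doublings):
        if not is_certified(hi):
            break
        lo, hi = hi, 2.0 * hi
    else:
        return lo  # never failed within the budget; lo is still certified
    # Invariant: lo is certified (or 0), hi is not certified.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_certified(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

The returned value is always a radius at which the check succeeded (or zero), so it remains a valid certified lower bound even when the search stops early.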

Bounding tanh/sigmoid/arctan.

For tanh activation, $\sigma(y)=\frac{1-e^{-2y}}{1+e^{-2y}}$ ; for sigmoid activation, $\sigma(y)=\frac{1}{1+e^{-y}}$ ; for arctan activation, $\sigma(y)=\arctan(y)$ . All three functions are convex on one side ( $y<0$ ) and concave on the other side ( $y>0$ ), thus the same rules can be used to find $h^{(k)}_{U,r}$ and $h^{(k)}_{L,r}$ . Below we call $(\mathbf{l}^{(k)}_{r},\sigma(\mathbf{l}^{(k)}_{r}))$ the left end-point and $(\mathbf{u}^{(k)}_{r},\sigma(\mathbf{u}^{(k)}_{r}))$ the right end-point. For $r\in\mathcal{S}^{+}_{k}$ , since $\sigma(y)$ is concave, we can let $h^{(k)}_{U,r}$ be any tangent line of $\sigma(y)$ at point $d\in[\mathbf{l}^{(k)}_{r},\mathbf{u}^{(k)}_{r}]$ , and let $h^{(k)}_{L,r}$ pass the two end-points. Similarly, $\sigma(y)$ is convex for $r\in\mathcal{S}^{-}_{k}$ , thus we can let $h^{(k)}_{L,r}$ be any tangent line of $\sigma(y)$ at point $d\in[\mathbf{l}^{(k)}_{r},\mathbf{u}^{(k)}_{r}]$ and let $h^{(k)}_{U,r}$ pass the two end-points. Lastly, for $r\in\mathcal{S}^{\pm}_{k}$ , we can let $h^{(k)}_{U,r}$ be the tangent line that passes the left end-point and $(d,\sigma(d))$ where $d\geq 0$ , and $h^{(k)}_{L,r}$ be the tangent line that passes the right end-point and $(d,\sigma(d))$ where $d\leq 0$ . The value of $d$ for transcendental functions can be found using a binary search. The plots of upper and lower bounds for tanh and sigmoid are in Figure 1 and 3 (in Appendix). Plots for $\arctan$ are similar and so omitted.
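The three cases can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, lines are stored as (slope, intercept) pairs, the tangent point in the purely concave and purely convex cases is taken to be the midpoint of the interval, and `l < u` is assumed.

```python
import math
from typing import Callable, Tuple

Line = Tuple[float, float]  # (slope, intercept) of h(y) = slope * y + intercept
Fn = Callable[[float], float]


def tangent(sigma: Fn, dsigma: Fn, d: float) -> Line:
    """Tangent line of sigma at the point d."""
    return dsigma(d), sigma(d) - dsigma(d) * d


def chord(sigma: Fn, l: float, u: float) -> Line:
    """Line through the end-points (l, sigma(l)) and (u, sigma(u))."""
    slope = (sigma(u) - sigma(l)) / (u - l)
    return slope, sigma(l) - slope * l


def tangent_point(sigma: Fn, dsigma: Fn, p: float, upper: bool, iters: int = 60) -> float:
    """Binary search for d whose tangent (nearly) passes through (p, sigma(p)).

    upper=True searches d >= 0 for the left end-point p = l < 0;
    upper=False searches d <= 0 for the right end-point p = u > 0.
    The returned d stays on the side where the tangent is a valid bound.
    """
    sign = 1.0 if upper else -1.0

    def valid(d: float) -> bool:
        slope, intercept = tangent(sigma, dsigma, d)
        return sign * (slope * p + intercept - sigma(p)) >= 0.0

    safe = sign  # start at +1 or -1 and move away from 0 until valid
    while not valid(safe):
        safe *= 2.0
    unsafe = 0.0  # the tangent at the inflection point crosses the curve
    for _ in range(iters):
        mid = 0.5 * (safe + unsafe)
        if valid(mid):
            safe = mid
        else:
            unsafe = mid
    return safe


def linear_bounds(
    l: float,
    u: float,
    sigma: Fn = math.tanh,
    dsigma: Fn = lambda y: 1.0 - math.tanh(y) ** 2,
) -> Tuple[Line, Line]:
    """Return (h_U, h_L) valid for all y in [l, u], assuming l < u."""
    if l >= 0.0:  # concave on [l, u]: tangent above, chord below
        return tangent(sigma, dsigma, 0.5 * (l + u)), chord(sigma, l, u)
    if u <= 0.0:  # convex on [l, u]: chord above, tangent below
        return chord(sigma, l, u), tangent(sigma, dsigma, 0.5 * (l + u))
    d_up = tangent_point(sigma, dsigma, l, upper=True)   # d >= 0
    d_lo = tangent_point(sigma, dsigma, u, upper=False)  # d <= 0
    return tangent(sigma, dsigma, d_up), tangent(sigma, dsigma, d_lo)
```

Passing a different `sigma` and `dsigma` pair (for sigmoid or arctan) reuses the same rules, since only the convex-then-concave shape is relied upon.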

Bounding ReLU.

For ReLU activation, $\sigma(y)=\max(0,y)$ . If $r\in\mathcal{S}^{+}_{k}$ , we have $\sigma(y)=y$ and so we can set $h^{(k)}_{U,r}=h^{(k)}_{L,r}=y$ ; if $r\in\mathcal{S}^{-}_{k}$ , we have $\sigma(y)=0$ , and thus we can set $h^{(k)}_{U,r}=h^{(k)}_{L,r}=0$ ; if $r\in\mathcal{S}^{\pm}_{k}$ , we can set $h^{(k)}_{U,r}=\frac{\mathbf{u}^{(k)}_{r}}{\mathbf{u}^{(k)}_{r}-\mathbf{l}^{(k)}_{r}}(y-\mathbf{l}^{(k)}_{r})$ and $h^{(k)}_{L,r}=ay,\,0\leq a\leq 1$ . Setting $a=\frac{\mathbf{u}^{(k)}_{r}}{\mathbf{u}^{(k)}_{r}-\mathbf{l}^{(k)}_{r}}$ leads to the linear lower bound used in Fast-Lin. Thus, Fast-Lin is a special case under our framework. We propose to adaptively choose $a$ , where we set $a=1$ when $\mathbf{u}^{(k)}_{r}\geq|\mathbf{l}^{(k)}_{r}|$ and $a=0$ when $\mathbf{u}^{(k)}_{r}<|\mathbf{l}^{(k)}_{r}|$ . In this way, the area between the lower bound $h^{(k)}_{L,r}=ay$ and $\sigma(y)$ (which reflects the gap between the lower bound and the ReLU function) is always minimized. As shown in our experiments, the adaptive selection of $h^{(k)}_{L,r}$ based on the value of $\mathbf{u}^{(k)}_{r}$ and $\mathbf{l}^{(k)}_{r}$ helps to achieve a tighter certified lower bound. Figure 4 (in Appendix) illustrates the idea discussed here.
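The ReLU rules above translate directly into code. The sketch below uses a hypothetical function name and stores each line as a (slope, intercept) pair; the `adaptive` flag switches between the adaptive choice of $a$ and the Fast-Lin choice.

```python
from typing import Tuple

Line = Tuple[float, float]  # (slope, intercept) of h(y) = slope * y + intercept


def relu_linear_bounds(l: float, u: float, adaptive: bool = True) -> Tuple[Line, Line]:
    """Return (h_U, h_L) for ReLU given pre-activation bounds l <= y <= u."""
    if l >= 0.0:  # always active: sigma(y) = y
        return (1.0, 0.0), (1.0, 0.0)
    if u <= 0.0:  # always inactive: sigma(y) = 0
        return (0.0, 0.0), (0.0, 0.0)
    s = u / (u - l)
    upper = (s, -s * l)  # h_U(y) = s * (y - l)
    if adaptive:
        a = 1.0 if u >= abs(l) else 0.0  # adaptive lower-bound slope
    else:
        a = s  # same slope as the upper bound, as in Fast-Lin
    return upper, (a, 0.0)
```

The adaptive rule follows from the area argument: for $l<0<u$ the area between $ay$ and the ReLU function on $[l,u]$ is $\frac{1}{2}\left(a\,l^{2}+(1-a)\,u^{2}\right)$, which is linear in $a$ and therefore minimized at $a=1$ when $u\geq|l|$ and at $a=0$ otherwise.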

Summary.

We summarize the above analysis on choosing valid linear functions $h^{(k)}_{U,r}$ and $h^{(k)}_{L,r}$ in Tables 2 and 3. In general, as long as $h^{(k)}_{U,r}$ and $h^{(k)}_{L,r}$ are identified for the activation functions, we can use Corollary 3.3 to compute certified lower bounds for general activation functions. Note that there remain many other choices of $h^{(k)}_{U,r}$ and $h^{(k)}_{L,r}$ as valid upper/lower bounds of $\sigma(y)$ , but ideally, we would like them to be close to $\sigma(y)$ in order to achieve a tighter lower bound of minimum distortion.

3.3 Extension to quadratic bounds

Experiments

Methods. For ReLU networks, CROWN-Ada is CROWN with adaptive linear bounds (Sec. 3.2), and CROWN-Quad is CROWN with quadratic bounds (Sec. 3.3). Fast-Lin and Fast-Lip are state-of-the-art fast certified lower bound methods. Reluplex can solve for the exact minimum adversarial distortion but is only computationally feasible for very small networks. LP-Full is based on an LP formulation from prior work, in which we solve the LPs for each neuron exactly to achieve the best possible bound. For networks with other activation functions, CROWN-general is our proposed method.

Model and Dataset. We evaluate CROWN and other baselines on multi-layer perceptron (MLP) models trained on MNIST and CIFAR-10 datasets. We denote a feed-forward network with $m$ layers and $n$ neurons per layer as $m\times[n]$ . For models with ReLU activation, we use pretrained models provided by prior work and evaluate the same set of 100 random test images and random attack targets (according to the released code of that work) to make our results comparable. For training NN models with other activation functions, we search for the best learning rate and weight decay parameters to achieve a similar level of accuracy as the ReLU models.

Implementation and Setup. We implement our algorithm using Python (numpy with numba). Most computations in our method are matrix operations that can be automatically parallelized by the BLAS library; however, we set the number of BLAS threads to 1 for a fair comparison to other methods. Experiments were conducted on an Intel Skylake server CPU running at 2.0 GHz on Google Cloud. Our code is available at https://github.com/huanzhang12/CROWN-Robustness-Certification
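One common way to pin BLAS to a single thread is through environment variables set before the numerical library is first imported. The snippet below is a sketch of that approach, not a description of the released code; which variable takes effect depends on the BLAS backend numpy is linked against, so all three are set here as an assumption.

```python
import os

# Set these before importing numpy: the BLAS backend reads them at load time.
# OMP_NUM_THREADS covers OpenMP-based builds; the other two are specific to
# OpenBLAS and Intel MKL respectively.
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = "1"

# import numpy as np  # import only after the variables above are set
```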

Conclusion

We have presented a general framework CROWN to efficiently compute a certified lower bound of minimum distortion in neural networks for any given data point $\mathbf{x_{0}}$ . CROWN features adaptive bounds for improved robustness certification and applies to general activation functions. Moreover, experiments show that (1) CROWN outperforms state-of-the-art baselines on ReLU networks and (2) CROWN can efficiently certify non-trivial lower bounds for large networks with over 10K neurons and with different activation functions.

Acknowledgement

This work was supported in part by NSF IIS-1719097, Intel faculty award, Google Cloud Credits for Research Program and GPUs donated by NVIDIA. Tsui-Wei Weng and Luca Daniel are partially supported by MIT-IBM Watson AI Lab and MIT-Skoltech program.

References

Appendix A Proof of Theorem 3.2

Assume the activation function $\sigma(y)$ is bounded by two linear functions $h^{(m-1)}_{U,i},h^{(m-1)}_{L,i}$ in Definition 3.1, we have

Thus, if the associated weight $\mathbf{W}^{(m)}_{j,i}$ to the $i$ -th neuron is non-negative (the terms in $F_{1}$ bracket), we have

otherwise (the terms in $F_{2}$ bracket), we have
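Written out, the sandwich on the activation and the two sign cases would read as follows. This is a reconstruction from the surrounding text rather than a quotation: $\mathbf{y}^{(m-1)}_{i}$ denotes the pre-activation of the $i$ -th neuron at layer $m-1$ , and the labels (7) and (8) are assumed to match the references in the next paragraph.

$$
h^{(m-1)}_{L,i}(\mathbf{y}^{(m-1)}_{i})\leq\sigma(\mathbf{y}^{(m-1)}_{i})\leq h^{(m-1)}_{U,i}(\mathbf{y}^{(m-1)}_{i}),
$$

$$
\mathbf{W}^{(m)}_{j,i}\,h^{(m-1)}_{L,i}(\mathbf{y}^{(m-1)}_{i})\leq\mathbf{W}^{(m)}_{j,i}\,\sigma(\mathbf{y}^{(m-1)}_{i})\leq\mathbf{W}^{(m)}_{j,i}\,h^{(m-1)}_{U,i}(\mathbf{y}^{(m-1)}_{i})\quad\text{if }\mathbf{W}^{(m)}_{j,i}\geq 0,\tag{7}
$$

$$
\mathbf{W}^{(m)}_{j,i}\,h^{(m-1)}_{U,i}(\mathbf{y}^{(m-1)}_{i})\leq\mathbf{W}^{(m)}_{j,i}\,\sigma(\mathbf{y}^{(m-1)}_{i})\leq\mathbf{W}^{(m)}_{j,i}\,h^{(m-1)}_{L,i}(\mathbf{y}^{(m-1)}_{i})\quad\text{if }\mathbf{W}^{(m)}_{j,i}<0.\tag{8}
$$

Multiplying by a negative weight flips both inequalities, which is why the roles of $h^{(m-1)}_{U,i}$ and $h^{(m-1)}_{L,i}$ are exchanged in the second case.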

Let $f_{j}^{U,m-1}(\mathbf{x})$ be an upper bound of $f_{j}(\mathbf{x})$ . To compute $f_{j}^{U,m-1}(\mathbf{x})$ , (6), (7) and (8) are the key equations. Precisely, for the $\mathbf{W}^{(m)}_{j,i}\geq 0$ terms in (6), the upper bound is the right-hand-side (RHS) in (7); and for the $\mathbf{W}^{(m)}_{j,i}<0$ terms in (6), the upper bound is the RHS in (8). Thus, we obtain:

From (9) to (10), we replace $h^{(m-1)}_{U,i}(\mathbf{y}^{(m-1)}_{i})$ and $h^{(m-1)}_{L,i}(\mathbf{y}^{(m-1)}_{i})$ by their definitions; from (10) to (11), we use variables $\mathbf{\lambda}^{(m-1)}_{j,i}$ and $\mathbf{\Delta}^{(m-1)}_{j,i}$ to denote the slopes in front of $\mathbf{y}^{(m-1)}_{i}$ and the intercepts in the parentheses:

and repeat again iteratively until we obtain the final upper bound $f_{j}^{U,1}(\mathbf{x})$ , where $f_{j}(\mathbf{x})\leq f_{j}^{U,m-1}(\mathbf{x})\leq f_{j}^{U,m-2}(\mathbf{x})\leq\ldots\leq f_{j}^{U,1}(\mathbf{x})$ . We let $f_{j}^{U}(\mathbf{x})$ denote the final upper bound $f_{j}^{U,1}(\mathbf{x})$ , and we have

Lower bound.

The above derivations of upper bound can be applied similarly to deriving lower bounds of $f_{j}(\mathbf{x})$ , and the only difference is now we need to use the LHS of (7) and (8) (rather than RHS when deriving upper bound) to bound the two terms in (6). Thus, following the same procedure in deriving the upper bounds, we can iteratively unwrap the activation functions and obtain a final lower bound $f_{j}^{L,1}(\mathbf{x})$ , where $f_{j}(\mathbf{x})\geq f_{j}^{L,m-1}(\mathbf{x})\geq f_{j}^{L,m-2}(\mathbf{x})\geq\ldots\geq f_{j}^{L,1}(\mathbf{x})$ . Let $f_{j}^{L}(\mathbf{x})=f_{j}^{L,1}(\mathbf{x})$ , we have:

Indeed, $\mathbf{\lambda}^{(k)}_{j,i}$ and $\mathbf{\omega}^{(k)}_{j,i}$ differ only in the conditions for selecting $\mathbf{\alpha}^{(k)}_{U,{i}}$ or $\mathbf{\alpha}^{(k)}_{L,{i}}$ ; the same holds for $\mathbf{\Delta}^{(k)}_{i,j}$ and $\mathbf{\Theta}^{(k)}_{i,j}$ .
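For the first unwrapping step at layer $m-1$ , where the sign test is on the last-layer weight $\mathbf{W}^{(m)}_{j,i}$ , the selection rules implied by the derivation above can be written side by side. This is a sketch consistent with the text of this appendix; for deeper layers the same test is assumed to apply to the equivalent weights instead of $\mathbf{W}^{(m)}_{j,i}$ .

$$
\mathbf{\lambda}^{(m-1)}_{j,i}=\begin{cases}\mathbf{\alpha}^{(m-1)}_{U,{i}}&\text{if }\mathbf{W}^{(m)}_{j,i}\geq 0\\ \mathbf{\alpha}^{(m-1)}_{L,{i}}&\text{if }\mathbf{W}^{(m)}_{j,i}<0\end{cases}\qquad\mathbf{\omega}^{(m-1)}_{j,i}=\begin{cases}\mathbf{\alpha}^{(m-1)}_{L,{i}}&\text{if }\mathbf{W}^{(m)}_{j,i}\geq 0\\ \mathbf{\alpha}^{(m-1)}_{U,{i}}&\text{if }\mathbf{W}^{(m)}_{j,i}<0\end{cases}
$$

$$
\mathbf{\Delta}^{(m-1)}_{i,j}=\begin{cases}\mathbf{\beta}^{(m-1)}_{U,{i}}&\text{if }\mathbf{W}^{(m)}_{j,i}\geq 0\\ \mathbf{\beta}^{(m-1)}_{L,{i}}&\text{if }\mathbf{W}^{(m)}_{j,i}<0\end{cases}\qquad\mathbf{\Theta}^{(m-1)}_{i,j}=\begin{cases}\mathbf{\beta}^{(m-1)}_{L,{i}}&\text{if }\mathbf{W}^{(m)}_{j,i}\geq 0\\ \mathbf{\beta}^{(m-1)}_{U,{i}}&\text{if }\mathbf{W}^{(m)}_{j,i}<0\end{cases}
$$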

Appendix B Proof of Corollary 3.3

Global lower bound.

Since $f_{j}^{L}(\mathbf{x})=\mathbf{\Omega}^{(0)}_{j,:}\mathbf{x}+\sum_{k=1}^{m}\mathbf{\Omega}^{(k)}_{j,:}(\mathbf{b}^{(k)}+\mathbf{\Theta}^{(k)}_{:,j})$ , we can derive $\gamma^{L}_{j}$ (similar to the derivation of $\gamma^{U}_{j}$ ) below:
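A sketch of that derivation, assuming the perturbation set is the $\ell_{p}$ ball $\mathbb{B}_{p}(\mathbf{x_{0}},\epsilon)=\{\mathbf{x}:\|\mathbf{x}-\mathbf{x_{0}}\|_{p}\leq\epsilon\}$ and that $q$ is the dual exponent with $1/p+1/q=1$ (the symbols $\mathbb{B}_{p}$ , $\epsilon$ and $q$ are assumptions here, since they are not defined in this appendix):

$$
\begin{aligned}
\gamma^{L}_{j}&=\min_{\mathbf{x}\in\mathbb{B}_{p}(\mathbf{x_{0}},\epsilon)}f_{j}^{L}(\mathbf{x})\\
&=\epsilon\Big(\min_{\|\mathbf{z}\|_{p}\leq 1}\mathbf{\Omega}^{(0)}_{j,:}\mathbf{z}\Big)+\mathbf{\Omega}^{(0)}_{j,:}\mathbf{x_{0}}+\sum_{k=1}^{m}\mathbf{\Omega}^{(k)}_{j,:}(\mathbf{b}^{(k)}+\mathbf{\Theta}^{(k)}_{:,j})\\
&=-\epsilon\|\mathbf{\Omega}^{(0)}_{j,:}\|_{q}+\mathbf{\Omega}^{(0)}_{j,:}\mathbf{x_{0}}+\sum_{k=1}^{m}\mathbf{\Omega}^{(k)}_{j,:}(\mathbf{b}^{(k)}+\mathbf{\Theta}^{(k)}_{:,j}).
\end{aligned}
$$

The second line substitutes $\mathbf{x}=\mathbf{x_{0}}+\epsilon\mathbf{z}$ , and the third uses the dual-norm identity $\min_{\|\mathbf{z}\|_{p}\leq 1}\mathbf{a}^{\top}\mathbf{z}=-\|\mathbf{a}\|_{q}$ .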

Appendix C Illustration of linear upper and lower bounds on sigmoid activation function.

Let $f_{j}^{U}(\mathbf{x})$ be an upper bound of $f_{j}(\mathbf{x})$ . To compute $f_{j}^{U}(\mathbf{x})$ with quadratic approximations, we can still apply (7) and (8) except that $h^{(k)}_{U,r}(y)$ and $h^{(k)}_{L,r}(y)$ are replaced by the following quadratic functions:

From (21) to (22), we replace $h^{(m-1)}_{U,i}(\mathbf{y}^{(m-1)}_{i})$ and $h^{(m-1)}_{L,i}(\mathbf{y}^{(m-1)}_{i})$ by their definitions and let

From (22) to (23), we let $\mathbf{q}^{(m-1)}_{U,{j}}=\mathbf{W}^{(m)}_{j,:}\odot\mathbf{\tau}^{(m-1)}_{j,i}$ , and write in the matrix form. From (23) to (24), we substitute $\mathbf{y}^{(m-1)}$ by its definition: $\mathbf{y}^{(m-1)}=\mathbf{W}^{(m-1)}\Phi_{(m-2)}(\mathbf{x})+\mathbf{b}^{(m-1)}$ and then collect the quadratic terms, linear terms and constant terms of $\Phi_{(m-2)}(\mathbf{x})$ , where

Lower bound.

Similar to the above derivation, we can simply swap $h^{(k)}_{U,r}$ and $h^{(k)}_{L,r}$ and obtain lower bound $f_{j}^{L}(\mathbf{x})$ :
