Efficient Neural Network Robustness Certification with General Activation Functions
Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, Luca Daniel
Introduction
While neural networks (NNs) have achieved remarkable performance and accomplished unprecedented breakthroughs in many machine learning tasks, recent studies have highlighted their lack of robustness against adversarial perturbations . For example, in image learning tasks such as object classification or content captioning , visually indistinguishable adversarial examples can be easily crafted from natural images to alter a NN’s prediction result. Beyond the white-box attack setting where the target model is entirely transparent, visually imperceptible adversarial perturbations can also be generated in the black-box setting by only using the prediction results of the target model . In addition, real-life adversarial examples have been made possible through the lens of realizing physical perturbations . As NNs are becoming a core technique deployed in a wide range of applications, including safety-critical tasks, certifying robustness of a NN against adversarial perturbations has become an important research topic in machine learning.
Although certifying the largest possible robustness is challenging for ReLU networks, the piece-wise linear nature of ReLUs can be exploited to efficiently compute a non-trivial certified lower bound of the minimum distortion . Beyond ReLU, one fundamental problem that remains largely unexplored is how to generalize the robustness certification technique to other popular activation functions that are not piece-wise linear, such as tanh and sigmoid, and how to motivate and certify the design of other activation functions towards improved robustness. In this paper, we tackle the preceding problem by proposing an efficient robustness certification framework for NNs with general activation functions. Our main contributions in this paper are summarized as follows:
We propose a generic analysis framework CROWN for certifying NNs using linear or quadratic upper and lower bounds for general activation functions that are not necessarily piece-wise linear.
Unlike previous work , CROWN allows flexible selections of upper/lower bounds for activation functions, enabling an adaptive scheme that helps to reduce approximation errors. Our experiments show that CROWN achieves up to 26% improvements in certified lower bounds compared to .
Our algorithm is efficient and can scale to large NNs with various activation functions. For a NN with over 10,000 neurons, we can give a certified lower bound in about 1 minute on 1 CPU core.
Background and Related Work
For ReLU networks, finding the minimum adversarial distortion for a given input data point can be cast as a mixed integer linear programming (MILP) problem . Reluplex uses a satisfiable modulo theory (SMT) to encode ReLU activations into linear constraints. Similarly, Planet uses satisfiability (SAT) solvers. However, due to the NP-completeness for solving such a problem , these methods can only find minimum distortion for very small networks. It can take Reluplex several hours to find the minimum distortion of an example for a ReLU network with 5 inputs, 5 outputs and 300 neurons.
On the other hand, a computationally feasible alternative of robustness certificate is to provide a non-trivial and certified lower bound of minimum distortion. Some analytical lower bounds based on operator norms on the weight matrices or the Jacobian matrix in NNs do not exploit special property of ReLU and thus can be very loose . The bounds in are based on the local Lipschitz constant. assumes a continuous differentiable NN and hence excludes ReLU networks; a closed form lower-bound is also hard to derive for networks beyond 2 layers. applies to ReLU networks and uses Extreme Value Theory to provide an estimated lower bound (CLEVER score). Although the CLEVER score is capable of reflecting the level of robustness in different NNs and is scalable to large networks, it is not a certified lower bound. On the other hand, use the idea of a convex outer adversarial polytope in ReLU networks to compute a certified lower bound by relaxing the MILP certification problem to linear programing (LP). apply semidefinite programming for robustness certification in ReLU networks but their approach is limited to NNs with one hidden layer. exploit the ReLU property to bound the activation function (or the local Lipschitz constant) and provide efficient algorithms (Fast-Lin and Fast-Lip) for computing a certified lower bound, achieving state-of-the-art performance. A recent work uses abstract transformations to zonotopes for proving robustness property for ReLU networks. Nonetheless, there are still some applications demand non-ReLU activations, e.g. RNN and LSTM, thus a general framework that can efficiently compute non-trivial and certified lower bounds for NNs with general activation functions is of great importance. We aim at filling this gap and propose CROWN that can perform efficient robustness certification to NNs with general activation functions. Table 1 summarizes the differences of other approaches and CROWN. Note that a recent work based on solving Lagrangian dual can also handle general activation functions but it trades off the quality of robustness bound with scalability.
CROWN: A general framework for certifying neural networks
1 General framework
Input perturbation and pre-activation bounds.
Below, we first define the linear upper bounds and lower bounds of activation functions in Definition 3.1, which are the key to derive explicit output bounds for an -layer neural network with general activation functions. The formal statement of the explicit output bounds is shown in Theorem 3.2.
Note that the parameters depend on and , i.e. for different and we may choose different parameters. Also, for ease of exposition, in this paper we restrict . However, Theorem 3.2 can be easily generalized to the case of negative .
Theorem 3.2 illustrates how a NN function can be bounded by two linear functions and when the activation function of each neuron is bounded by two linear functions and in Definition 3.1. The central idea is to unwrap the activation functions layer by layer by considering the signs of the associated (equivalent) weights of each neuron and apply the two linear bounds and . As we demonstrate in the proof, when we replace the activation functions with the corresponding linear upper bounds and lower bounds at the layer , we can then define equivalent weights and biases based on the parameters of and (e.g. are related to the terms , respectively) and then repeat the procedure to “back-propagate” to the input layer. This allows us to obtain and in (1). The formal proof of Theorem 3.2 is in Appendix A. Note that for a neuron in layer , the slopes of its linear upper and lower bounds can be different. This implies:
Fast-Lin is a special case of our framework as they require the slopes to be the same; and it only applies to ReLU networks (cf. Sec. 3.2). In Fast-Lin, and are identical.
Our CROWN framework allows adaptive selections on the linear approximation when computing certified lower bounds of minimum adversarial distortion, which is the main contributor to improve the certified lower bound as demonstrated in the experiments in Section 4.
Certified lower bound of minimum distortion.
Time Complexity.
2 Case studies: CROWN for ReLU, tanh, sigmoid and arctan activations
In Section 3.1 we showed that as long as one can identify two linear functions to bound a general activation function for each neuron, we can use Corollary 3.3 with a binary search to obtain certified lower bounds of minimum distortion. In this section, we illustrate how to find parameters and of for four most widely used activation functions: ReLU, tanh, sigmoid and arctan. Other activations, including but not limited to leaky ReLU, ELU and softplus, can be easily incorporated into our CROWN framework following a similar procedure.
Bounding tanh/sigmoid/arctan.
For tanh activation, ; for sigmoid activation, ; for arctan activation, . All functions are convex on one side () and concave on the other side (), thus the same rules can be used to find and . Below we call as left end-point and as right end-point. For , since is concave, we can let be any tangent line of at point , and let pass the two end-points. Similarly, is concave for , thus we can let be any tangent line of at point and let pass the two end-points. Lastly, for , we can let be the tangent line that passes the left end-point and where and be the tangent line that passes the right end-point and where . The value of for transcendental functions can be found using a binary search. The plots of upper and lower bounds for tanh and sigmoid are in Figure 1 and 3 (in Appendix). Plots for are similar and so omitted.
Bounding ReLU.
For ReLU activation, . If , we have and so we can set ; if , we have , and thus we can set ; if , we can set and . Setting leads to the linear lower bound used in Fast-Lin . Thus, Fast-Lin is a special case under our framework. We propose to adaptively choose , where we set when and when . In this way, the area between the lower bound and (which reflects the gap between the lower bound and the ReLU function) is always minimized. As shown in our experiments, the adaptive selection of based on the value of and helps to achieve a tighter certified lower bound. Figure 4 (in Appendix) illustrates the idea discussed here.
Summary.
We summarized the above analysis on choosing valid linear functions and in Table 3 and 3. In general, as long as and are identified for the activation functions, we can use Corollary 3.3 to compute certified lower bounds for general activation functions. Note that there remain many other choices of and as valid upper/lower bounds of , but ideally, we would like them to be close to in order to achieve a tighter lower bound of minimum distortion.
3 Extension to quadratic bounds
Experiments
Methods. For ReLU networks, CROWN-Ada is CROWN with adaptive linear bounds (Sec. 3.2), CROWN-Quad is CROWN with quadratic bounds (Sec. 3.3). Fast-Lin and Fast-Lip are state-of-the-art fast certified lower bound proposed in . Reluplex can solve the exact minimum adversarial distortion but is only computationally feasible for very small networks. LP-Full is based on the LP formulation in and we solve LPs for each neuron exactly to achieve the best possible bound. For networks with other activation functions, CROWN-general is our proposed method.
Model and Dataset. We evaluate CROWN and other baselines on multi-layer perceptron (MLP) models trained on MNIST and CIFAR-10 datasets. We denote a feed-forward network with layers and neurons per layer as . For models with ReLU activation, we use pretrained models provided by and also evaluate the same set of 100 random test images and random attack targets as in (according to their released code) to make our results comparable. For training NN models with other activation functions, we search for best learning rate and weight decay parameters to achieve a similar level of accuracy as ReLU models.
Implementation and Setup. We implement our algorithm using Python (numpy with numba). Most computations in our method are matrix operations that can be automatically parallelized by the BLAS library; however, we set the number of BLAS threads to 1 for a fair comparison to other methods. Experiments were conducted on an Intel Skylake server CPU running at 2.0 GHz on Google Cloud. Our code is available at https://github.com/huanzhang12/CROWN-Robustness-Certification
Conclusion
We have presented a general framework CROWN to efficiently compute a certified lower bound of minimum distortion in neural networks for any given data point . CROWN features adaptive bounds for improved robustness certification and applies to general activation functions. Moreover, experiments show that (1) CROWN outperforms state-of-the-art baselines on ReLU networks and (2) CROWN can efficiently certify non-trivial lower bounds for large networks with over 10K neurons and with different activation functions.
Acknowledgement
This work was supported in part by NSF IIS-1719097, Intel faculty award, Google Cloud Credits for Research Program and GPUs donated by NVIDIA. Tsui-Wei Weng and Luca Daniel are partially supported by MIT-IBM Watson AI Lab and MIT-Skoltech program.
References
Appendix A Proof of Theorem 3.2
Assume the activation function is bounded by two linear functions in Definition 3.1, we have
Thus, if the associated weight to the -th neuron is non-negative (the terms in bracket), we have
otherwise (the terms in bracket), we have
Let be an upper bound of . To compute , (6), (7) and (8) are the key equations. Precisely, for the terms in (6), the upper bound is the right-hand-side (RHS) in (7); and for the terms in (6), the upper bound is the RHS in (8). Thus, we obtain:
From (9) to (10), we replace and by their definitions; from (10) to (11), we use variables and to denote the slopes in front of and the intercepts in the parentheses:
and repeat again iteratively until obtain the final upper bound , where . We let denote the final upper bound , and we have
Lower bound.
The above derivations of upper bound can be applied similarly to deriving lower bounds of , and the only difference is now we need to use the LHS of (7) and (8) (rather than RHS when deriving upper bound) to bound the two terms in (6). Thus, following the same procedure in deriving the upper bounds, we can iteratively unwrap the activation functions and obtain a final lower bound , where . Let , we have:
Indeed, and only differs in the conditions of selecting or ; similarly for and .
Appendix B Proof of Corollary 3.3
Global lower bound.
Since , we can derive (similar to the derivation of ) below:
Appendix C Illustration of linear upper and lower bounds on sigmoid activation function.
Let be an upper bound of . To compute with quadratic approximations, we can still apply (7) and (8) except that and are replaced by the following quadratic functions:
From (21) to (22), we replace and by their definitions and let
From (22) to (23), we let , and write in the matrix form. From (23) to (24), we substitute by its definition: and then collect the quadratic terms, linear terms and constant terms of , where
Lower bound.
Similar to the above derivation, we can simply swap and and obtain lower bound :