Title: Adversarial Defenses and Intrinsic Robustness: Bridging the Gap
Since the discovery that state-of-the-art machine learning models are vulnerable to adversarial examples, numerous defense mechanisms have been proposed. However, none of them produces adversarially robust models. Starting with Gilmer et al. (2018), recent theoretical results show that, under certain assumptions on the input metric probability space, adversarial examples with small perturbations are inevitable. To understand the inherent limitations of robust learning on real datasets, our preliminary work (Mahloujifar et al., 2019) developed an empirical estimator of intrinsic robustness, the maximum robustness one can hope to achieve for a given robust learning problem. We demonstrated a large gap between the estimated intrinsic robustness limit and the robustness attained by state-of-the-art adversarial training methods on typical classification tasks. In this proposal, we aim to push down the intrinsic robustness limit by restricting attention to a smaller but meaningful collection of models, and to leverage this theoretical understanding of intrinsic robustness to design better defenses against adversarial examples.
Committee:
- Mohammad Mahmoody, Chair (CS)
- David Evans, Advisor (CS)
- David Wu (CS)
- Tom Fletcher (ECE)
- Somesh Jha (CS, University of Wisconsin-Madison)