Title: Adversaries Don't Care About Averages: Batch Attacks Against Deep Learning Models
Abstract: We study a new batch attack scenario on deep learning models, where the goal of the adversary is to find adversarial examples (from a large pool of candidate instances) within a limit on total resource cost. We consider black-box attacks, where cost is measured by the number of model queries. Our basic hypotheses are that (1) there is high variance in the cost to find an adversarial example across a set of seed images, and (2) there exist efficient strategies to identify the easiest-to-attack seeds. Hence, the number of successful adversarial examples found by a batch attacker can be much larger than an average-cost evaluation would suggest, since adversaries can focus their resources on the most cost-efficient seeds. Our preliminary results on state-of-the-art deep learning models support both hypotheses, and show that a simple greedy strategy often provides surprisingly good performance. Since designing robust models resistant to adversarial examples has become an important research direction, we test the robustness of existing robust models against batch attackers, and propose to develop and evaluate methods for testing the robustness of machine learning systems in settings where attackers have limited knowledge and resources.
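To make the batch attack setting concrete, the sketch below illustrates one way a greedy seed-prioritization strategy could be organized; it is not the proposal's actual algorithm. The helper names (estimate_cost, attack_one_step) and the scoring heuristic are assumptions introduced here for illustration only.

```python
import heapq

def greedy_batch_attack(seeds, estimate_cost, attack_one_step, query_budget):
    """Hypothetical sketch of a greedy batch attack under a total query budget.

    seeds: candidate inputs to attack.
    estimate_cost: heuristic score per seed (lower = likely easier to attack);
        a real attacker might use the target model's loss on each seed.
    attack_one_step: hypothetical callable that spends some queries attacking
        one seed and returns (queries_used, adversarial_example_or_None).
    query_budget: total number of black-box model queries the attacker may spend.
    """
    # Prioritize seeds that the heuristic predicts are cheapest to attack.
    heap = [(estimate_cost(s), i, s) for i, s in enumerate(seeds)]
    heapq.heapify(heap)

    found, spent = [], 0
    while heap and spent < query_budget:
        score, i, seed = heapq.heappop(heap)
        used, adv = attack_one_step(seed)
        spent += used
        if adv is not None:
            # Success: record the adversarial example and move to the next-cheapest seed.
            found.append((i, adv))
        else:
            # Not yet successful: push the seed back with a worse score so that
            # harder seeds gradually fall behind easier ones.
            heapq.heappush(heap, (score + used, i, seed))
    return found
```

The point of the sketch is only the resource-allocation logic: because attack cost varies widely across seeds, spending the budget on seeds that appear cheapest yields many more successes than spreading queries uniformly.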
Committee Members:
Yanjun Qi (Chair)
David Evans (Advisor)
Yuan Tian (Advisor)
Yangfeng Ji