Improving Robustness of Machine Learning Models using Domain Knowledge
David Evans (Advisor), Yanjun Qi (Advisor), Vicente Ordóñez Román (Chair), Patrick McDaniel (Pennsylvania State University), Homa Alemzadeh (Minor Representative)
Although machine learning techniques have achieved great success in many areas, recent studies have shown that they are not robust under attack. A motivated adversary can often craft input samples that force a machine learning model to produce incorrect predictions, even when the target model achieves high accuracy on normal test inputs. This raises serious concerns when machine learning models are deployed for security-sensitive tasks. The root cause is that correlation does not imply causation: machine learning models often do not learn truly causal factors in their decision rules. The gap between the ground-truth decision rules and the approximate ones learned by a machine learning model inevitably creates opportunities for adversaries. I propose to improve the robustness of machine learning models by exploiting domain knowledge. Domain knowledge goes beyond the dataset for a given task; it helps uncover weaknesses of machine learning models and, ultimately, improve their robustness. My completed work has shown that domain knowledge helps find weaknesses in machine learning models for malware classification and can be used to improve the robustness of computer vision models.