Measuring and Mitigating Biases in Vision and Language Models
While we have seen remarkable progress in many vision and language tasks, recent studies have revealed that many proposed models exhibit various biases. For example, a human activity recognition model can overly correlate "man" with "coaching" and "woman" with "shopping". Serious concerns have been raised about the potential adverse effects of such correlations on societal fairness and equality. As more and more research techniques are adopted in practical applications, it is critical to understand how biases arise in datasets and models, and how to mitigate them. In this thesis, we approach the problem through three projects. In the first project, we present a framework to measure and mitigate intrinsic biases with respect to protected variables, such as gender, in visual recognition tasks. We show that even when datasets are balanced such that each label co-occurs equally with each gender, learned models amplify the association between labels and gender as much as if the data had not been balanced. To mitigate bias amplification, we adopt an adversarial approach that removes unwanted features corresponding to protected variables from intermediate representations in a deep neural network. In the second project, we discover that semantic-agnostic corpus regularities captured by word embeddings, such as word frequency, negatively impact the performance of existing debiasing algorithms. We propose a simple but effective technique that purifies the word embeddings of such corpus regularities before inferring and removing the gender subspace in biased word embeddings. Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches. In the third project, we focus on more general biases. We propose a clustering-based metric to measure bias without sensitive attribute annotations.
We demonstrate that our metric provides estimates consistent with measurements from existing metrics that rely on sensitive attribute annotations. We find that metrics of this type are especially useful for active learning, where the iterative selection of examples may introduce significant biases.
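As a rough illustration of the second project's idea of purifying embeddings before removing the gender subspace, the sketch below projects a direction out of a vector and applies that step twice. It is a minimal sketch under stated assumptions, not the thesis's implementation: the toy vectors, `frequency_dir`, and `gender_dir` are made up for illustration, and the abstract does not say how these directions are inferred.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def remove_direction(vec, direction):
    """Return vec with its component along `direction` projected out."""
    norm_sq = dot(direction, direction)
    if norm_sq == 0.0:
        raise ValueError("direction must be non-zero")
    scale = dot(vec, direction) / norm_sq
    return [a - scale * d for a, d in zip(vec, direction)]


# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
embeddings = {
    "nurse": [0.6, 0.5, 0.2],
    "engineer": [-0.4, 0.7, 0.3],
}

# Hypothetical directions. In practice they would be estimated from data;
# here they are chosen to be orthogonal so the two removals do not interact.
frequency_dir = [0.0, 0.0, 1.0]
gender_dir = [1.0, 0.0, 0.0]

debiased = {}
for word, vec in embeddings.items():
    purified = remove_direction(vec, frequency_dir)        # step 1: drop the frequency component
    debiased[word] = remove_direction(purified, gender_dir)  # step 2: drop the gender component
```

After both steps, each vector in `debiased` has no component along either assumed direction, while the remaining coordinate is left unchanged.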
- Yanjun Qi, Committee Chair (CS/SEAS/UVA)
- Vicente Ordóñez Román, Advisor (CS/SEAS/UVA)
- Yangfeng Ji (CS/SEAS/UVA)
- Jundong Li (CS, ECE/SEAS, DSI/UVA)
- Paul Humphreys (Department of Philosophy/GSAS/UVA)
- Olga Russakovsky (Department of Computer Science, Princeton University)