Assessing and Improving Critical Properties of Test Oracles for Effective Software Bug Detection

Abstract: 
 

As software becomes integral to every aspect of our lives, particularly in safety-critical sectors such as healthcare and aviation, ensuring its robustness and reliability is of utmost importance. Even seemingly insignificant software bugs can compromise system stability and security. For instance, a mere copy-paste mistake caused Apple devices to accept invalid SSL certificates, introducing a major security vulnerability, while a date formatting bug led to a large-scale Twitter outage. These realities underscore the need for effective testing and bug detection mechanisms to ensure software reliability. At the heart of this challenge are test oracles, a fundamental component of testing, which play a crucial role in detecting software bugs.

My research focuses on exploring the properties of test oracles that are crucial for effective bug detection. Through extensive studies, I have identified three key properties that test oracles must possess: they must thoroughly check program behavior, correctly align with program specifications, and be strong enough to detect bugs. Collectively, I refer to these as the CCS properties of test oracles. My analysis reveals that even mature test suites often leave a significant portion of executed code unchecked, and that automated test oracle generation methods frequently produce high rates of false positives (i.e., incorrect oracles) as well as weak oracles incapable of detecting bugs. To address these limitations, my research introduces the OracleGuru framework, a suite of methods that employ dynamic and static program analysis and machine learning. OracleGuru aims to improve the quality of test oracles by enabling more extensive checks of program behavior and by generating correct and strong oracles that significantly outperform previous state-of-the-art methods. Its effectiveness has been empirically evaluated on extensive real-world codebases, demonstrating significant improvements in test oracle quality and bug detection effectiveness, which in turn improves software reliability.
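
As a rough illustration of the three properties, the sketch below contrasts oracles that fall short on each one with an oracle that satisfies all three. It is not taken from OracleGuru: the `clamp` function, its faulty variant, and the oracle names are hypothetical and exist only for this example.

```python
def clamp(value, low, high):
    """Reference behavior (assumed specification): restrict value to [low, high]."""
    return max(low, min(value, high))


def clamp_buggy(value, low, high):
    """Hypothetical faulty version that forgets to enforce the lower bound."""
    return min(value, high)


def unchecked_oracle(impl):
    # Executes the code but asserts nothing, so the behavior goes unchecked.
    impl(-5, 0, 10)
    return True


def incorrect_oracle(impl):
    # Contradicts the specification: it expects -5 where the correct result is 0,
    # so it raises a false alarm on the correct implementation.
    return impl(-5, 0, 10) == -5


def weak_oracle(impl):
    # Consistent with the specification but too permissive to expose the bug.
    return impl(-5, 0, 10) <= 10


def strong_oracle(impl):
    # Checks the exact expected result, so it accepts the correct
    # implementation and rejects the faulty one.
    return impl(-5, 0, 10) == 0


if __name__ == "__main__":
    for oracle in (unchecked_oracle, incorrect_oracle, weak_oracle, strong_oracle):
        print(oracle.__name__, "correct:", oracle(clamp), "buggy:", oracle(clamp_buggy))
```

Only the last oracle both passes on the correct implementation and fails on the faulty one; the others either miss the bug or reject correct behavior.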

 

Committee:  

  • Sebastian Elbaum, Committee Chair (CS/SEAS/UVA)
  • Matthew Dwyer, Advisor (CS/SEAS/UVA)
  • Yangfeng Ji (CS/SEAS/UVA)
  • Matthew Bolton (ESE/SEAS/UVA)
  • Antonio Filieri (Amazon Web Services)