- Dr. Donald Brown (ESE) (Advisor)
- Dr. Afsaneh Doryab (ESE) (Chairperson)
- Dr. Laura Barnes (ESE)
- Dr. Sana Syed (UVA Health)
Zoom Meeting Information: Please send an email to firstname.lastname@example.org for the Zoom information.
Title: Combining and evaluating interpretable computer vision and natural language processing techniques for use in clinical contexts
Medical data is a valuable resource that can help enable positive patient outcomes. Deep learning has proven capable of analyzing data collected across health systems and accurately predicting disease classifications or generating disease phenotypes. However, one of the major challenges of incorporating deep learning in healthcare is the need for explainability to validate model decisions. Existing explainability methods include generating saliency maps or using self-attention weights to assign importance to input data. The aim of this research is to investigate and extend explainable deep learning prediction models for multi-modal medical data, drawn from Electronic Medical Records (EMR), clinical notes, and radiology images, and to evaluate whether the generated explanations are meaningful.
For preliminary research, we analyzed EMR containing patient diagnosis codes using LSTM-based deep learning models to predict the probability of suffering from Long COVID. We found that analyzing patients' historical diagnosis data from just before they contracted the COVID-19 virus let us predict the chances of a patient suffering from Long COVID with an Area Under the ROC Curve (AUROC) of about 0.75. We investigated this model using Gradient-weighted Class Activation Mapping (Grad-CAM) to assign a weight to each temporally arranged input in the patient's diagnosis history, revealing which diagnosis was most important for the classification. With these weights we can summarize which diagnosis codes the model weighted most highly across all patients and see the distribution of when those diagnoses were made. For example, we found that "Difficulty Breathing" was the highest-weighted diagnosis code for most patients in the test set, and we could then look up how long before contracting COVID-19 those patients had been suffering from "Difficulty Breathing".
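The Grad-CAM-style attribution described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the actual study code: the model, its dimensions, and the helper names (`DiagnosisLSTM`, `timestep_importance`) are hypothetical, and the weighting scheme (gradient of the logit times the per-timestep activations, rectified) is one common Grad-CAM adaptation for sequence models.

```python
import torch
import torch.nn as nn

class DiagnosisLSTM(nn.Module):
    """Toy LSTM classifier over a patient's temporally ordered diagnosis codes."""
    def __init__(self, n_codes=100, emb_dim=16, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(n_codes, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, 1)  # single logit, e.g. "Long COVID"

    def forward(self, codes):
        h, _ = self.lstm(self.emb(codes))  # (batch, time, hidden)
        h.retain_grad()                    # keep gradients at each timestep
        self._acts = h
        return self.fc(h[:, -1])           # classify from the final state

def timestep_importance(model, codes):
    """Grad-CAM-style weights: gradient of the logit w.r.t. each timestep's
    activations, multiplied by the activations, summed over hidden units."""
    logit = model(codes)
    model.zero_grad()
    logit.sum().backward()
    acts, grads = model._acts, model._acts.grad
    weights = (grads * acts).sum(dim=-1)   # (batch, time)
    return torch.relu(weights)             # keep positively contributing visits

model = DiagnosisLSTM()
codes = torch.randint(0, 100, (2, 12))     # 2 toy patients, 12 visits each
imp = timestep_importance(model, codes)    # importance per visit
top_visit = imp.argmax(dim=1)              # most influential visit per patient
```

Aggregating `top_visit` (and the codes at those positions) over a test set is what lets one summarize which diagnoses the model relied on, and when they occurred relative to infection.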
The secondary aim of this research is to extend this interpretability to deep learning on histopathology images. Digital histopathology consists of very high-resolution images of tissue that are used to detect diseases such as cancer, Crohn's disease, and other gut-related diseases. These digitized images, called Whole Slide Images (WSIs), contain enough detail to be analyzed with deep learning to automate disease classification. Previous methods have used Grad-CAM to identify which areas of the image were most responsible for activating the model's neurons. Recent methods have used Vision Transformers, whose self-attention layers provide inherent explanatory power, to make classification of these high-resolution images more interpretable while also making the models more accurate. These images have associated clinical notes that further describe the image and can help characterize the disease. Paired image-text data has been used to pre-train deep learning models with a contrastive training framework on radiological imaging data such as Chest X-rays and their associated reports, and such models have been shown to produce better interpretations and higher accuracy than models trained on images alone. Thus, we should be able to create equivalent models that combine medical reports with their associated WSIs to achieve equal or higher accuracy and build more interpretable models, as has been shown to be possible on Chest X-rays.
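The contrastive image-report pre-training objective referenced above can be sketched as a symmetric InfoNCE loss, in the style popularized by CLIP-like image-text models. This is a hedged illustration, assuming pre-computed embedding vectors for each image and its paired report; the function name `contrastive_loss` and the temperature value are illustrative, not taken from any specific system.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched image/report pairs sit on the
    diagonal of the similarity matrix; all other pairs are negatives."""
    img = F.normalize(img_emb, dim=-1)         # unit-norm image embeddings
    txt = F.normalize(txt_emb, dim=-1)         # unit-norm report embeddings
    logits = img @ txt.t() / temperature       # (batch, batch) cosine sims
    targets = torch.arange(len(logits))        # i-th image matches i-th report
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy batch: 4 image embeddings paired with 4 report embeddings.
img_emb = torch.randn(4, 64)
txt_emb = torch.randn(4, 64)
loss = contrastive_loss(img_emb, txt_emb)
```

Minimizing this loss pulls each image embedding toward its own report and away from the other reports in the batch, which is what makes the learned representation transferable to downstream classification and interpretation.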