- Chair: Laura Barnes, ESE
- Advisor: Donald Brown, ESE/SDS
- Michael Porter, ESE/SDS
- Rafael Alvarado, SDS
- Peter Alonzi, SDS
Send an email to email@example.com for Zoom information.
Title: Knowledge Discovery and Decision Support Systems using Natural Language Processing in Applications for Societal Good
Artificial Intelligence (AI) is becoming a crucial innovation that significantly affects everyday life. Commercial AI applications have improved our daily experiences, and AI has also proved beneficial in social domains such as healthcare, transportation, and education. As an essential subfield of AI, Natural Language Processing (NLP) has great potential to benefit these domains. In particular, many social domains produce textual information that NLP techniques can leverage to help individuals generate insights or make decisions quickly. This study presents several NLP approaches that can provide societal benefits in three problem areas: transportation and safety, security and counter-terrorism, and labor market gaps.
In the first application, we focused on short, domain-specific texts in the field of transportation and safety. We used the narratives of accident reports, a free-text field, to identify the causes of accidents. We combined two deep-learning architectures, Recurrent Neural Networks and Convolutional Neural Networks, with word embeddings (Word2Vec and GloVe) to train a model capable of identifying an accident’s cause from its narrative. Such a system can be used as a decision-support tool for evaluating accident reports.
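To make the idea concrete, the following is a minimal sketch of how a convolutional text classifier consumes a narrative: each token is mapped to an embedding vector, a filter slides over adjacent tokens, and max-over-time pooling keeps the strongest response. It is not the model from this work. The embedding values, the cause labels, the hand-set filters (standing in for learned weights), and the names `embed`, `conv_max_pool`, and `predict_cause` are all assumptions made for illustration.

```python
# Toy forward pass: word embeddings -> 1D convolution -> ReLU -> max-over-time pooling.
# Every number and label below is hypothetical; a real model would learn these weights.

EMBEDDINGS = {
    "brake": [0.9, 0.1, 0.0],
    "failure": [0.8, 0.2, 0.1],
    "track": [0.1, 0.9, 0.0],
    "defect": [0.2, 0.8, 0.1],
    "the": [0.0, 0.0, 0.1],
    "train": [0.1, 0.1, 0.2],
}
UNKNOWN = [0.0, 0.0, 0.0]  # vector used for out-of-vocabulary tokens

# One width-2 filter per (hypothetical) accident cause.
FILTERS = {
    "equipment": [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    "track": [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]],
}


def embed(tokens):
    """Look up an embedding vector for each token."""
    return [EMBEDDINGS.get(token, UNKNOWN) for token in tokens]


def conv_max_pool(vectors, filt):
    """Slide the filter over the sequence, apply ReLU, and keep the maximum response."""
    width = len(filt)
    best = 0.0
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        score = sum(
            w * x
            for filt_row, vec in zip(filt, window)
            for w, x in zip(filt_row, vec)
        )
        best = max(best, max(0.0, score))  # ReLU, then running max over positions
    return best


def predict_cause(tokens):
    """Use each filter's pooled response directly as the score for its cause."""
    vectors = embed(tokens)
    scores = {cause: conv_max_pool(vectors, filt) for cause, filt in FILTERS.items()}
    return max(scores, key=scores.get)


print(predict_cause("the train brake failure".split()))  # equipment
print(predict_cause("track defect".split()))             # track
```

In a trained network, many filters of several widths would feed a dense output layer, and the embeddings would come from a pretrained Word2Vec or GloVe model rather than a hand-written table.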
In the security and counter-terrorism domain, we used NLP to analyze ISIS’s propaganda approach to women and compared it with the approach of a non-violent religious group for women. We collected the relevant texts using web scraping and optical character recognition (OCR), and analyzed them with an unsupervised learning method. Furthermore, we used emotion analysis to examine the emotional content of these documents.
Finally, to address the skill gap in data science-related jobs, we collected a large corpus of online job advertisements and used the embedding vector space of the advertised skill terms and phrases in a semi-supervised approach to identify the hard skills that the jobs require and extract them from these documents. We also present a complete framework for analyzing skills in the U.S. that helps individuals and organizations understand the job market.
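One common way to use an embedding space in a semi-supervised setting is seed expansion: start from a few known skill terms and repeatedly add terms whose vectors lie close to any term already accepted. The sketch below illustrates that general idea only and should not be read as the procedure used in this work. The toy vectors, the similarity threshold, and the names `cosine` and `expand_skills` are assumptions.

```python
import math

# Hypothetical 3-dimensional embeddings; real vectors would be trained on the job-ad corpus.
VECTORS = {
    "python": [0.9, 0.1, 0.0],
    "sql": [0.8, 0.2, 0.1],
    "r": [0.85, 0.15, 0.05],
    "tensorflow": [0.7, 0.3, 0.0],
    "teamwork": [0.1, 0.9, 0.1],
    "benefits": [0.0, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def expand_skills(seeds, vectors, threshold=0.9, max_rounds=3):
    """Grow a seed set with terms whose similarity to any accepted term meets the threshold."""
    skills = set(seeds) & set(vectors)  # ignore seeds that have no embedding
    for _ in range(max_rounds):
        added = {
            term
            for term, vec in vectors.items()
            if term not in skills
            and any(cosine(vec, vectors[s]) >= threshold for s in skills)
        }
        if not added:
            break  # nothing new was close enough, so the set has stabilized
        skills |= added
    return skills


print(sorted(expand_skills({"python"}, VECTORS)))
```

With these toy vectors, the single seed "python" pulls in the other technical terms while "teamwork" and "benefits" stay out. The threshold and the number of rounds control how far the set can drift from the original seeds.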