Class aims to equip students with skills to thrive in an era of 'big data'

Caitlin Wylie thinks about the social and ethical implications of engineering every day. As an assistant professor with a doctorate in the history and philosophy of science, she studies and teaches socially relevant and ethical research and design.

Gianluca Guadagni, her colleague in the Engineering and Society Department at the University of Virginia’s School of Engineering and Applied Science, is an applied mathematician. His research uses random processes and probabilistic models to describe complex systems and, eventually, to predict outcomes, without regard to who or what is affected by them.

“Math is not right or wrong,” he points out. “It’s just correct or incorrect.”

Yet, Guadagni knows these are not words he and his students can live by. We live in an era of big data and increasing reliance on statistics to inform policy on everything from health care to access to loans. Consequently, how one collects, analyzes and distributes data is fraught with questions of right and wrong. People in all walks of life ― from data scientists to policymakers to consumers whose data is aggregated and served back to them in the form of advertising ― should have some understanding of statistical methodologies and the moral issues attached to their use.

Together, the social scientist and the mathematician recently won a Donchian Teaching Fellowship from UVA’s Institute for Practical Ethics and Public Life. They will co-create and team-teach a new class called Ethical Analytics. The fellowship provides a $7,000 stipend to develop a new academic course.

The purpose of Donchian awards is to expose more students to issues of right, wrong and justice in contexts where ethics is often neglected. When Guadagni saw the call for proposals, he emailed his Engineering and Society colleagues for any takers on a collaboration.

“Data science and ethics are actually a nice pairing and since we have people fluent in ethics here, I thought it was a good match for our department,” Guadagni said, noting that Engineering and Society may comprise the most interdisciplinary group of scholars in the Engineering School, if not in the university.

Wylie eagerly joined the project, and the two soon drafted the proposal that went on to win the fellowship.

“Our areas of expertise are equally necessary to understand today’s culture of data-driven social institutions,” they argued in their proposal. “Together we will integrate mathematical skills of analyzing data with ethical skills of judging how to analyze data justly and for social good.”

Wylie and Guadagni are working on the course this fall. The fellowship requires it to be offered at least twice over six semesters. Longer term, they want to see the new class added to the engineering curriculum.

“Gianluca and I didn’t get that kind of [cross-disciplinary] training, but we wish we had,” Wylie said. “Hopefully the next generation will have one person who can teach a socio-technical approach to data science.”

The course will be structured around case studies, with an emphasis on applying analysis and judgment. Pamela Norris, Frederick Tracy Morse Professor of Mechanical and Aerospace Engineering and executive dean of UVA Engineering, said she appreciates how the course integrates the School’s values and its philosophy of learning by doing.

“Understanding the social implications of the knowledge we create and what we do with it is fundamental to engineering education today,” Norris said. “This collaboration demonstrates the kind of enriching experience offered to engineers trained at comprehensive universities, such as UVA, which values the sciences and technology as well as the humanities.”

And the course isn’t just for engineers. It is intended for any student who meets a calculus prerequisite, and Guadagni and Wylie see potential applications in the medical and nursing schools, or in any program where data analysis and ethics are in play.

“We don’t need a rigorous statistics course to understand how to analyze the data,” Guadagni said. “We’re going to use R, a commonly used statistical software program. Students from any background will be able to use it.”

Educating “ethically aware” data scientists of the future is an important goal, but so is cultivating data literacy in general. Wylie was quick to answer Guadagni’s email because she was already researching how data is used in “open governance” from the perspective of science and technology studies. She is studying Charlottesville Open Data, an online portal the city built to improve transparency and public engagement. The website launched a year ago, following a trend of local governments publishing huge datasets for public access.

“What does it mean to be a citizen if, instead of a government official handing you a report that analyzes the homelessness numbers in your city, you’re expected to make that analysis yourself?” Wylie said. “How were the data collected? By whom? How were they processed into a standard format? How do these decisions, made by people with their own purposes and biases, shape what the data can tell us?

“My research interest and the course idea come from a desire that my students would be able to ask those kinds of questions themselves.”

The course will be structured around case studies, including Charlottesville’s data portal. For example, student teams will choose a dataset, such as local crime or real estate values, and design the best ways to analyze that dataset, both mathematically and ethically.

Wylie and Guadagni wrote in the course proposal that students “will learn through experience that ethics is an integral part of all stages of data analytics, including posing research questions, organizing data, calculating results, and sharing their interpretations with relevant audiences, such as through valid and comprehensible visualizations.”

The course content is still in development, and there are big questions yet to be answered, some of which will be taken up in the class itself.

“What is data? What isn’t data? What do we mean by ‘Big Data’? These are things that nobody really knows. It’s such an evolving field,” Wylie said.

“We hope students will take these things less for granted, when we say ‘What is data?’ and they get stumped. And they see that we’re also stumped. There are many ways to think about it. I think that’s a really valuable lesson.”