Navigating an Imperfect, Messy Data World

Date

Data-driven software has the ability to shape human behavior: it affects the products we view and purchase, the news articles we read, the social interactions we engage in, and, ultimately, the opinions we form. The correctness and proper functioning of such data-driven systems rely heavily on the correctness of their data. Unfortunately, data, a piece so central to the operation of many modern systems, is far from perfect. In this talk, I will focus on how errors, omissions, biases, and poor data quality in general can lead to disruptions, loss of revenue, incorrect conclusions, and misguided policy decisions, and on what we can do to address these issues. Our work is grounded in an important insight: existing data repair techniques focus on purging datasets of errors and other problems, but they disregard the fact that many problems are systemic, inherent to the process that produces the data, and thus will keep occurring unless they are corrected at their source. I will present my lab's work on error diagnostic and bias mitigation frameworks, and I will also provide a broader overview of our work on enhancing usability, understandability, and trust in data technologies, highlighting the role of data management in realizing a vision for toolsets that assist the exploration and effective use of information in a varied, diverse, and highly non-integrated data world.

Speaker
Alexandra Meliou
Speaker Title
Associate Professor, College of Information and Computer Sciences
Speaker Institution
University of Massachusetts Amherst
Speaker Biography

Alexandra Meliou is an Associate Professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst. Prior to joining UMass, she was a Postdoctoral Research Associate at the University of Washington. Alexandra received her PhD from the Electrical Engineering and Computer Sciences Department at the University of California, Berkeley. She has received recognitions for research, teaching, and service, including a CACM Research Highlight, an ACM SIGMOD Research Highlight Award, an ACM SIGSOFT Distinguished Paper Award, an NSF CAREER Award, a Google Faculty Research Award, multiple Distinguished Reviewer Awards, and a Lilly Fellowship for Teaching Excellence. Her research focuses on data provenance, causality, explanations, data quality, and algorithmic fairness.