Studying Bias in Models, Data Collection Processes, and Text
Abstract:
Language technologies are becoming increasingly prevalent in our daily lives, and with their power comes significant risk, given the bias that has been documented in these models. While this bias has often been measured with respect to stereotypes, there are other ways in which models might demonstrate bias. I study gender bias in how models retrieve factual information and present it to users. To do so, I prompt a variety of models to recount the results of Olympic events, and find that when prompts are ambiguous, the output shows significant gender bias. In addition to gender bias in knowledge retrieval, I will discuss work on bias in data collection methods in NLP and, finally, work that uses NLP to characterize bias in text.
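The abstract describes prompting models about Olympic results with and without a specified gender. Below is a minimal sketch of that kind of comparison; the specific prompts, model name, and use of the OpenAI chat API are illustrative assumptions, not the actual setup used in the work.

```python
# Minimal sketch: compare an ambiguous prompt to gender-specified controls.
# (Hypothetical prompts and model choice; not the talk's actual experimental setup.)
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompts = [
    "Who won the 100m freestyle at the 2020 Olympics?",          # ambiguous: gender unspecified
    "Who won the men's 100m freestyle at the 2020 Olympics?",    # men's event
    "Who won the women's 100m freestyle at the 2020 Olympics?",  # women's event
]

for prompt in prompts:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; the talk covers "a variety of models"
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    print(prompt)
    print(response.choices[0].message.content, "\n")
```

One could then check whether the answer to the ambiguous prompt systematically matches the men's or the women's event across many events and models, which is one way to quantify the kind of gender bias in knowledge retrieval the abstract describes.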
Bio:
Laura Biester is an assistant professor of computer science at Middlebury College. Her research interests lie at the intersection of natural language processing (NLP) and computational social science (CSS), with a particular focus on mental health and bias. She has worked on modeling language change over time on social media for individuals with depression, perspectivism in NLP, quantification of bias in language and in language models, and clinical NLP. She earned her PhD in computer science and engineering from the University of Michigan in 2023, and her BA in computer science from Carleton College in 2016.