AI & Big Data in Finance Research Forum (ABFR) Webinar
Title: The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
Presenter: Sharad Goel (Harvard University)
Discussant: Jann Spiess (Stanford University)
Zoom webinar link:
Webinar ID: 960 9678 9670
Abstract: The nascent field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Over the last several years, three formal definitions of fairness have gained prominence: (1) anti-classification, meaning that protected attributes—like race, gender, and their proxies—are not explicitly used to make decisions; (2) classification parity, meaning that common measures of predictive performance (e.g., false positive and false negative rates) are equal across groups defined by the protected attributes; and (3) calibration, meaning that conditional on risk estimates, outcomes are independent of protected attributes. Here we show that all three of these fairness definitions suffer from significant statistical limitations. Requiring anti-classification or classification parity can, perversely, harm the very groups they were designed to protect; and calibration, though generally desirable, provides little guarantee that decisions are equitable. In contrast to these formal fairness criteria, we argue that it is often preferable to treat similarly risky people similarly, based on the most statistically accurate estimates of risk that one can produce. Such a strategy, while not universally applicable, often aligns well with policy objectives; notably, this strategy will typically violate both anti-classification and classification parity. In practice, it requires significant effort to construct suitable risk estimates. One must carefully define and measure the targets of prediction to avoid retrenching biases in the data. But, importantly, one cannot generally address these difficulties by requiring that algorithms satisfy popular mathematical formalizations of fairness. By highlighting these challenges in the foundation of fair machine learning, we hope to help researchers and practitioners productively advance the area.
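As a concrete illustration of two of the quantities the abstract defines, the following minimal sketch (with hypothetical toy data, not from the paper) computes a group-wise false positive rate, which classification parity would require to be equal across groups, and a calibration check, the observed outcome rate among cases assigned a given risk score:

```python
# Hypothetical toy records: (group, risk_score, decision, outcome).
# The data and helper names are illustrative assumptions, not the authors' code.
records = [
    ("A", 0.2, 0, 0), ("A", 0.2, 0, 1), ("A", 0.8, 1, 1), ("A", 0.8, 1, 0),
    ("B", 0.2, 0, 0), ("B", 0.2, 0, 0), ("B", 0.8, 1, 1), ("B", 0.8, 1, 1),
]

def false_positive_rate(rows):
    """Share of decision=1 among outcome=0 cases.

    Classification parity compares this rate (and similar error rates)
    across groups defined by the protected attribute.
    """
    negatives = [r for r in rows if r[3] == 0]
    return sum(r[2] for r in negatives) / len(negatives) if negatives else None

def outcome_rate_at_score(rows, score):
    """Observed outcome rate among cases with the given risk score.

    Calibration asks that this rate match the score, regardless of group.
    """
    bucket = [r for r in rows if r[1] == score]
    return sum(r[3] for r in bucket) / len(bucket)

for g in ("A", "B"):
    rows = [r for r in records if r[0] == g]
    print(g, false_positive_rate(rows), outcome_rate_at_score(rows, 0.8))
```

In this toy data the two groups have different false positive rates even though decisions follow the same risk threshold, which is the kind of tension between threshold-based decisions and classification parity the abstract describes.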
Bio of speaker:
Sharad Goel is a Professor of Public Policy at Harvard Kennedy School. He looks at public policy through the lens of computer science, bringing a new computational perspective to a diverse range of contemporary social issues. Some topics Sharad has recently worked on are: policing practices, including statistical tests for discrimination; fair machine learning, including in automated speech recognition; and democratic governance, including swing voting, polling errors, voter fraud, and political polarization. Before joining Harvard, he was on the faculty at Stanford University, with appointments in management science & engineering, computer science, sociology, and the law school. At Stanford, he founded the Computational Policy Lab, which comprises researchers, data scientists, and journalists who work to address policy problems through technical innovation. For example, they deployed a “blind charging” platform in San Francisco to mitigate racial bias in prosecutorial decisions. They also collected, released, and analyzed data on over 100 million traffic stops as part of the Stanford Open Policing Project. Sharad also writes essays about policy issues from a statistical perspective. These include discussions of algorithms in the courts (in the New York Times, the Washington Post, and the Boston Globe); policing (in Slate and the Huffington Post); mass incarceration (in the Washington Post); election polls (in the New York Times); claims of voter fraud (in Slate, and also an extended interview with This American Life); and affirmative action (in Boston Review). Sharad holds a BS in mathematics from the University of Chicago, as well as a master’s degree in computer science and a PhD in applied mathematics from Cornell University. Before joining Stanford, Sharad was a senior researcher at Microsoft Research.
Bio of discussant:
Jann Spiess is an Assistant Professor at the Stanford Graduate School of Business. He works on integrating techniques and insights from machine learning into the econometric toolbox. His research brings together microeconometric methods, statistical decision theory, and mechanism design to clarify the use of flexible prediction algorithms in causal inference and data-driven decision-making. He is particularly interested in the role of human and machine decisions in drawing replicable and robust inferences from big data. Jann holds a PhD in economics from Harvard University. Previously, Jann obtained a master’s degree in public policy from the Harvard Kennedy School. His background is in mathematics with a focus on probability theory and combinatorics, which he studied at the University of Cambridge (Part III of the Mathematical Tripos) and the Technical University of Munich. Jann also studied and worked in Hangzhou, China, and Ouagadougou, Burkina Faso.