Abstract

Machine learning algorithms are becoming ubiquitous in modern life. When used to help inform human decision making, they have been criticized by some for insufficient accuracy, an absence of transparency, and unfairness. Many of these concerns can be legitimate, although they are less convincing when compared with the uneven quality of human decisions. There is now a large literature in statistics and computer science offering a range of proposed improvements. In this article, we focus on machine learning algorithms used to forecast risk, such as those employed by judges to anticipate a convicted offender's future dangerousness and by physicians to help formulate a medical prognosis or ration scarce medical care. We review a variety of conceptual, technical, and practical features common to risk algorithms and offer suggestions for how their development and use might be meaningfully advanced. Fairness concerns are emphasized.
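
To make the fairness concerns concrete, the sketch below fits a conventional risk forecasting model and compares two widely used group fairness diagnostics: error-rate balance across protected groups (here, false positive rates) and calibration within groups (here, positive predictive value). This is a minimal illustration only, assuming synthetic data and scikit-learn; the variable names (group, risk) and the 0.5 decision threshold are illustrative choices, not taken from any deployed risk instrument.

    # Minimal sketch: a risk forecaster plus two group fairness diagnostics.
    # All data are synthetic; scikit-learn is assumed to be installed.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 5000
    group = rng.integers(0, 2, size=n)            # illustrative protected attribute (0/1)
    x = rng.normal(size=(n, 3)) + group[:, None] * 0.3
    logit = x @ np.array([1.0, -0.5, 0.25])
    y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

    x_tr, x_te, y_tr, y_te, g_tr, g_te = train_test_split(
        x, y, group, test_size=0.3, random_state=0
    )

    clf = GradientBoostingClassifier(random_state=0).fit(x_tr, y_tr)
    risk = clf.predict_proba(x_te)[:, 1]          # forecasted risk scores in [0, 1]
    pred = (risk >= 0.5).astype(int)              # illustrative decision threshold

    for g in (0, 1):
        mask = g_te == g
        # False positive rate: predicted high risk among true negatives in group g
        fpr = ((pred == 1) & (y_te == 0) & mask).sum() / ((y_te == 0) & mask).sum()
        # Positive predictive value: outcome rate among predicted positives in group g
        ppv = y_te[mask & (pred == 1)].mean()
        print(f"group {g}: FPR = {fpr:.3f}, PPV = {ppv:.3f}")

A well-known result in this literature is that when the groups differ in their base rates, a nontrivial risk instrument cannot equalize error rates and within-group calibration simultaneously, which is one reason the choice among competing fairness definitions is ultimately a policy decision rather than a purely technical one.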

