We review statistical methods for estimating and interpreting league tables used to infer hospital quality, with a primary focus on methods for partitioning variation into two types: (a) that associated with within-hospital variation for a homogeneous group of patients and (b) that produced by between-hospital variation. We discuss the types of covariates included in the model, hierarchical and nonhierarchical logistic regression models for conducting inferences in a low-information context and their associated trade-offs, and the role of hospital volume. We use all-cause mortality rates for US hospitals to illustrate concepts and methods.
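As a rough illustration of the within-hospital versus between-hospital partition, the sketch below applies a simple method-of-moments decomposition to raw mortality rates and then shrinks each hospital's rate toward the pooled rate. This is a linear empirical-Bayes approximation chosen for brevity, not the hierarchical logistic regression discussed in the review. It ignores patient-level covariates, and the `HOSPITALS` counts and the function name `partition_and_shrink` are hypothetical.

```python
from statistics import mean, variance

# Hypothetical (cases, deaths) counts per hospital, for illustration only.
HOSPITALS = [(25, 5), (40, 4), (400, 48), (1200, 120), (60, 12), (800, 64)]


def partition_and_shrink(hospitals):
    """Split variation in raw rates into within- and between-hospital parts.

    Assumes no case-mix adjustment: every patient is treated as coming from
    one homogeneous group, so all within-hospital variation is binomial noise.
    """
    total_cases = sum(n for n, _ in hospitals)
    total_deaths = sum(d for _, d in hospitals)
    pooled = total_deaths / total_cases

    raw = [d / n for n, d in hospitals]

    # Within-hospital (sampling) variance of each raw rate; larger for
    # low-volume hospitals, which carry less information.
    within = [pooled * (1 - pooled) / n for n, _ in hospitals]

    # Crude method-of-moments estimate of between-hospital variance:
    # observed spread of raw rates minus the average sampling variance.
    between = max(0.0, variance(raw) - mean(within))

    # Weight on each hospital's own data; the remainder goes to the pooled rate.
    weights = [between / (between + w) if between + w > 0 else 0.0
               for w in within]
    shrunk = [pooled + b * (r - pooled) for b, r in zip(weights, raw)]

    return {
        "pooled": pooled,
        "raw": raw,
        "between_var": between,
        "within_var": within,
        "weights": weights,
        "shrunk": shrunk,
    }


if __name__ == "__main__":
    result = partition_and_shrink(HOSPITALS)
    for (n, _), r, s in zip(HOSPITALS, result["raw"], result["shrunk"]):
        print(f"cases={n:5d}  raw={r:.3f}  shrunk={s:.3f}")
```

In this sketch, low-volume hospitals receive small weights and are pulled strongly toward the pooled rate, which is one way to see why hospital volume matters when the estimates are read as league-table entries.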






  • Article Type: Review Article