
Abstract

Although both randomized and nonrandomized study data relevant to a question of treatment efficacy are often available and separately analyzed, these data are rarely formally combined in a single analysis. One possible reason for this is the apparent or feared disagreement of effect estimates across designs, which can be attributed both to differences in estimand definition and to analyses that may produce biased estimators. This article reviews specific models and general frameworks that aim to harmonize analyses from the two designs and combine them via a single analysis that ideally exploits the relative strengths of each design. The development of such methods is still in its infancy, and examples of applications with joint analyses are rare. This area would greatly benefit from more attention from researchers in statistical methods and applications.

/content/journals/10.1146/annurev-statistics-010814-020249
2015-04-10
2024-06-16

  • Article Type: Review Article