
Abstract

While some scientists study insects, molecules, brains, or clouds, other scientists study science itself. Meta-research, or research-on-research, is a burgeoning discipline that investigates efficiency, quality, and bias in the scientific ecosystem, topics that have become especially relevant amid widespread concerns about the credibility of the scientific literature. Meta-research may help calibrate the scientific ecosystem toward higher standards by providing empirical evidence that informs the iterative generation and refinement of reform initiatives. We introduce a translational framework that involves (a) identifying problems, (b) investigating problems, (c) developing solutions, and (d) evaluating solutions. In each of these areas, we review key meta-research endeavors and discuss several examples of prior and ongoing work. The scientific ecosystem is perpetually evolving; the discipline of meta-research presents an opportunity to use empirical evidence to guide its development and maximize its potential.
