
Estimating the Reproducibility of Psychological Science

🧐 Skeptical/Critical
Open Science Collaboration • 2015 • Modern Era • methodology


Plain English Summary

In a landmark wake-up call for science, 270 researchers teamed up to redo 100 psychology studies from top journals. The results were striking: while 97% of originals claimed significant findings, only 36% held up on retry. Effect sizes -- how strong a finding is -- were cut in half. Social psychology fared worst at just 25% replicating, versus 50% for cognitive psychology. The likely culprits? Publication bias (journals preferring exciting positive results) and flexible analysis that makes noise look like signal. This matters hugely for parapsychology debates, because critics single out psi research for failing to replicate -- yet mainstream psychology clearly has the same problem.

Actual Paper Abstract

Reproducibility is a defining feature of science, but the extent to which it characterizes current research is unknown. We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects. Correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.

Research Notes

Landmark empirical foundation for the replication crisis in psychology. Directly relevant to psi debates: Rabeyron (2020) and Kennedy (2016, 2013) cite it to contextualize whether parapsychology's replication difficulties are unique or reflect discipline-wide problems. Establishes a 36% replication base rate against which psi effect claims can be evaluated.

A collaborative effort by 270 researchers replicated 100 experimental and correlational studies from three leading psychology journals (2008 issues of Psychological Science, JPSP, and JEP:LMC) using pre-registered, high-powered designs with original materials. While 97% of original studies reported significant results, only 36% of replications achieved significance. Mean replication effect sizes (r = 0.197) were half the original magnitudes (r = 0.403). Cognitive psychology findings replicated better (50%) than social psychology (25%). Replication success correlated with original evidence strength rather than replication team characteristics, implicating publication bias and analytic flexibility as likely contributors to inflated original effects.
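One of the project's replication criteria was whether the original effect size fell inside the replication's 95% confidence interval (the 47% figure above). A minimal sketch of that check for a correlation coefficient, using the standard Fisher z-transform; the sample size and effect sizes below are hypothetical illustrations, not values from the paper:

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """95% confidence interval for a correlation via the Fisher z-transform."""
    z = math.atanh(r)                  # transform r to an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)        # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical replication: n = 120 participants, observed r = 0.20,
# checked against a hypothetical original effect of r = 0.40.
lo, hi = fisher_ci(0.20, 120)
original_r = 0.40
covered = lo <= original_r <= hi
print(f"replication 95% CI: ({lo:.3f}, {hi:.3f}); original covered: {covered}")
```

With these illustrative numbers the interval excludes the original effect, which is how a replication can "fail" on the CI criterion even when its own estimate is positive.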


πŸ“‹ Cite this paper
APA
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716
BibTeX
@article{open_science_2015_reproducibility,
  title = {Estimating the Reproducibility of Psychological Science},
  author = {Open Science Collaboration},
  year = {2015},
  journal = {Science},
  doi = {10.1126/science.aac4716},
}