
Estimating the Reproducibility of Psychological Science

🧐 Skeptical/Critical
Open Science Collaboration • 2015 • Modern Era • methodology


Plain English Summary

In a landmark wake-up call for science, 270 researchers teamed up to redo 100 psychology studies from top journals. The results were striking: while 97% of originals claimed significant findings, only 36% held up on retry. Effect sizes -- how strong a finding is -- were cut in half. Social psychology fared worst at just 25% replicating, versus 50% for cognitive psychology. The likely culprits? Publication bias (journals preferring exciting positive results) and flexible analysis that makes noise look like signal. This matters hugely for parapsychology debates, because critics single out psi research for failing to replicate -- yet mainstream psychology clearly has the same problem.

Actual Paper Abstract

Reproducibility is a defining feature of science, but the extent to which it characterizes current research is unknown. We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects. Correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.

Research Notes

Landmark empirical foundation for the replication crisis in psychology. Directly relevant to psi debates: Rabeyron (2020) and Kennedy (2016, 2013) cite it to contextualize whether parapsychology's replication difficulties are unique or reflect discipline-wide problems. Establishes a 36% replication base rate against which psi effect claims can be evaluated.

A collaborative effort by 270 researchers replicated 100 experimental and correlational studies from three leading psychology journals (2008 issues of Psychological Science, JPSP, and JEP:LMC) using pre-registered, high-powered designs with original materials. While 97% of original studies reported significant results, only 36% of replications achieved significance. Mean replication effect sizes (r = 0.197) were half the original magnitudes (r = 0.403). Cognitive psychology findings replicated better (50%) than social psychology (25%). Replication success correlated with original evidence strength rather than replication team characteristics, implicating publication bias and analytic flexibility as likely contributors to inflated original effects.
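One of the project's replication criteria was whether the original effect size fell inside the replication's 95% confidence interval (the 47% figure above). A minimal sketch of that check for a correlation coefficient, using the standard Fisher z-transform; the sample size and effect sizes below are hypothetical illustrations, not values from the paper:

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """95% confidence interval for a correlation via the Fisher z-transform."""
    z = math.atanh(r)                  # transform r to an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)        # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical replication: n = 120 participants, observed r = 0.20,
# checked against a hypothetical original effect of r = 0.40.
lo, hi = fisher_ci(0.20, 120)
original_r = 0.40
covered = lo <= original_r <= hi
print(f"replication 95% CI: ({lo:.3f}, {hi:.3f}); original covered: {covered}")
```

With these illustrative numbers the interval excludes the original effect, which is how a replication can "fail" on the CI criterion even when its own estimate is positive.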


πŸ“‹ Cite this paper
APA
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716
BibTeX
@article{open_science_2015_reproducibility,
  title = {Estimating the Reproducibility of Psychological Science},
  author = {Open Science Collaboration},
  year = {2015},
  journal = {Science},
  doi = {10.1126/science.aac4716},
}