The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No "Fishing Expedition" or "P-Hacking" and the Research Hypothesis Was Posited Ahead of Time

⚑ Contested
Gelman, Andrew, Loken, Eric • 2013 Modern Era • skeptical

Plain English Summary

Imagine you're walking through a garden where the path keeps branching, and at each fork you make a choice that feels obvious, though a different person might have chosen differently at every turn. That's the metaphor this paper introduced to explain how researchers can unknowingly inflate their results. Even without deliberately cheating or fishing for significance, the sheer number of small analytical decisions (how to split groups, which outliers to drop, which measure to emphasize) means the one analysis a scientist reports isn't really just "one" test: it's the survivor of many invisible alternatives. Using Bem's precognition studies as a prime example, the authors argue that pre-registration (publicly committing to your analysis plan before seeing data) is the best antidote to this hidden flexibility.

Actual Paper Abstract

Researcher degrees of freedom can lead to a multiple comparisons problem, even in settings where researchers perform only a single analysis on their data. The problem is there can be a large number of potential comparisons when the details of data analysis are highly contingent on data, without the researcher having to perform any conscious procedure of fishing or examining multiple p-values. We discuss in the context of several examples of published papers where data-analysis decisions were theoretically-motivated based on previous literature, but where the details of data selection and analysis were not pre-specified and, as a result, were contingent on data.

Research Notes

Foundational replication-crisis paper that uses Bem (2011) as a central example. Introduced the influential 'garden of forking paths' metaphor, now a standard reference in debates about analytic flexibility in psi research and the Feeling the Future controversy.

Researcher degrees of freedom can produce a multiple comparisons problem even when scientists perform only a single analysis on their data. Drawing on case studies from published psychology, including Bem's (2011) precognition experiments, menstrual-cycle effects on voting, and upper-body strength and political attitudes, the authors propose a four-level typology of testing procedures. The typology distinguishes deliberate fishing from the more common pattern in which a single analysis path appears predetermined but is actually contingent on the observed data. They recommend pre-registration and pre-publication replication as solutions.
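
The mechanism described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not an analysis from the paper: the two-subgroup design, the sample sizes, the normal-approximation test, and the names `two_sided_p` and `simulate` are all hypothetical choices made for illustration. Every simulated study has no true effect. One analyst tests a subgroup fixed in advance; the other runs only a single test too, but picks which subgroup to test after seeing which one shows the larger apparent difference.

```python
import math
import random
import statistics


def two_sided_p(a, b):
    """Approximate two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_studies=2000, n_per_cell=50, seed=0):
    """Return (pre-specified, data-contingent) false-positive rates under a true null."""
    rng = random.Random(seed)
    prereg_hits = 0
    forking_hits = 0
    for _ in range(n_studies):
        # Hypothetical design: subgroups "A" and "B", each with a treatment and
        # a control cell. Every cell is pure noise, so any "effect" is spurious.
        cells = {
            (group, arm): [rng.gauss(0.0, 1.0) for _ in range(n_per_cell)]
            for group in ("A", "B")
            for arm in ("treat", "control")
        }

        def gap(group):
            # Size of the raw treatment-control difference within one subgroup.
            treat = statistics.mean(cells[(group, "treat")])
            control = statistics.mean(cells[(group, "control")])
            return abs(treat - control)

        # Pre-specified path: subgroup A is tested no matter what the data show.
        if two_sided_p(cells[("A", "treat")], cells[("A", "control")]) < 0.05:
            prereg_hits += 1

        # Data-contingent path: still one test, but the subgroup is chosen
        # after looking at which one shows the larger apparent difference.
        chosen = max(("A", "B"), key=gap)
        if two_sided_p(cells[(chosen, "treat")], cells[(chosen, "control")]) < 0.05:
            forking_hits += 1

    return prereg_hits / n_studies, forking_hits / n_studies


if __name__ == "__main__":
    prereg, forking = simulate()
    print(f"pre-specified analysis: {prereg:.3f}")
    print(f"data-contingent analysis: {forking:.3f}")
```

Under these assumptions the pre-specified rate should sit near the nominal 5%, while the data-contingent rate should be noticeably higher, even though each analyst reports exactly one test. Real analyses have many more forks than the single one modeled here, so the sketch likely understates how much flexibility is available in practice.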

Cite this paper
APA
Gelman, A., & Loken, E. (2013). The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No "Fishing Expedition" or "P-Hacking" and the Research Hypothesis Was Posited Ahead of Time. Columbia University Department of Statistics Working Paper.
BibTeX
@article{gelman_2013_forking_paths,
  title = {The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No "Fishing Expedition" or "P-Hacking" and the Research Hypothesis Was Posited Ahead of Time},
  author = {Gelman, Andrew and Loken, Eric},
  year = {2013},
  journal = {Columbia University Department of Statistics Working Paper},
}