“Theoretical False Positive Psychology”, 2022-05-02:
A fundamental goal of scientific research is to generate true positives (ie. authentic discoveries). Statistically, a true positive is a statistically-significant finding for which the underlying effect size (δ) is greater than 0, whereas a false positive is a statistically-significant finding for which δ equals 0. However, the null hypothesis of no difference (δ = 0) may never be strictly true because innumerable nuisance factors can introduce small effects for theoretically uninteresting reasons. If δ never equals zero, then with sufficient power, every experiment would yield a statistically-significant result. Yet running studies with higher power by increasing sample size (N) is one of the most widely agreed upon reforms to increase replicability. Moreover, and perhaps not surprisingly, the idea that psychology should attach greater value to small effect sizes is gaining currency.
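The claim that any nonzero δ eventually reaches statistical-significance can be illustrated with a small power calculation. The following is only a sketch under stated assumptions, not a calculation from the paper: it uses a normal approximation to a two-sided, two-sample test with equal group sizes and unit variance, and the function name `two_sample_power` and the example value δ = 0.05 are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def two_sample_power(delta, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Assumptions (illustrative only): delta is a standardized mean
    difference, both groups have n_per_group observations, and the
    test statistic is treated as exactly normal.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect is delta.
    ncp = delta * sqrt(n_per_group / 2)
    upper_tail = 1 - _STD_NORMAL.cdf(z_crit - ncp)
    lower_tail = _STD_NORMAL.cdf(-z_crit - ncp)
    return upper_tail + lower_tail


if __name__ == "__main__":
    # A tiny "nuisance-sized" effect (hypothetical value) at growing N.
    for n in (50, 500, 5_000, 50_000):
        print(n, round(two_sample_power(0.05, n), 3))
```

Under these assumptions, power equals α when δ = 0 and climbs toward 1 as N grows for any fixed nonzero δ, however small.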
Increasing N without limit makes sense for purely measurement-focused research, where the magnitude of δ itself is of interest, but it makes less sense for theory-focused research, where the truth status of the theory under investigation is of interest.
Increasing power to enhance replicability will increase true positives at the level of the effect size (statistical true positives) while increasing false positives at the level of theory (theoretical false positives). With too much power, the cumulative foundation of psychological science would consist largely of nuisance effects masquerading as theoretically important discoveries.
Positive predictive value at the level of theory is maximized by using an optimal N, one that is neither too small nor too large…PPV at the level of theory is the probability that a p < 0.05 result confirming a theory-based prediction reflects the effect of the theoretical mechanism, not a nuisance factor.
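The idea of an optimal N can be sketched with a toy model. This is an illustration under assumptions that are not taken from the paper: a theory is true with prior probability `prior`; if it is true, the effect is `delta_theory` in the predicted direction (any added nuisance effect is ignored); if it is false, a nuisance effect of size `delta_nuisance` is present and is equally likely to point in either direction. All function names, parameter names, and default values are hypothetical, and the test is again approximated as a two-sample z-test.

```python
from math import sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal


def directional_hit_rate(delta, n_per_group, alpha=0.05):
    """P(two-sided p < alpha AND the estimate is in the predicted direction).

    delta is a standardized effect; positive means "as predicted".
    Normal approximation with equal group sizes (an assumption).
    """
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    return 1 - _Z.cdf(z_crit - delta * sqrt(n_per_group / 2))


def theory_ppv(n_per_group, prior=0.5, delta_theory=0.4, delta_nuisance=0.05):
    """Toy PPV at the level of theory (all defaults are hypothetical).

    Probability that a significant, prediction-confirming result came
    from the theoretical mechanism rather than from a nuisance effect.
    """
    hit_if_true = directional_hit_rate(delta_theory, n_per_group)
    # Nuisance effect points in the predicted direction half the time.
    hit_if_false = 0.5 * (
        directional_hit_rate(+delta_nuisance, n_per_group)
        + directional_hit_rate(-delta_nuisance, n_per_group)
    )
    confirmed_true = prior * hit_if_true
    confirmed_false = (1 - prior) * hit_if_false
    return confirmed_true / (confirmed_true + confirmed_false)


if __name__ == "__main__":
    for n in (5, 20, 100, 1_000, 100_000):
        print(n, round(theory_ppv(n), 3))
```

In this toy model, PPV at the level of theory first rises with N, because the theoretical effect becomes detectable, and then falls, because the nuisance effect becomes detectable too. The location of the peak depends entirely on the assumed effect sizes and prior.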
[Keywords: null hypothesis statistical-significance-testing, false positives, positive predictive value, replication crisis]
See Also:
Heterogeneity in direct replications in psychology and its association with effect size
Small Effects: The Indispensable Foundation for a Cumulative Psychological Science
Small Telescopes: Detectability and the Evaluation of Replication Results
Psychological measurement and the replication crisis: Four sacred cows
Most Published Research Findings Are False—But a Little Replication Goes a Long Way
Evaluating the replicability of social science experiments in Nature and Science 2010–2015