“Star Wars: The Empirics Strike Back”, 2013-03:
Journals favor rejection of the null hypothesis. This selection upon tests may distort the behavior of researchers.
Using 50,000 tests published 2005–2011 in the AER, JPE, and QJE, we identify a residual in the distribution of tests that cannot be explained by selection. The distribution of p-values exhibits a camel shape with abundant p-values above 0.25, a valley between 0.25 and 0.10, and a bump slightly below 0.05.
The missing tests (with p-values between 0.25 and 0.10) can be retrieved just after the 0.05 threshold and represent 10% to 20% of marginally rejected tests.
Our interpretation is that researchers might be tempted to inflate the value of those almost-rejected tests by choosing a “significant” specification. We propose a method to measure inflation and decompose it along articles’ and authors’ characteristics.