“The Power of Twins: The Scottish Milk Experiment”, Gwern, 2016-01-12:

In discussing a large Scottish public health experiment, Student noted that it would’ve been vastly more efficient using a twin experiment design; I fill in the details with a power analysis.

The more variable each datapoint is, the more subjects a randomized experiment needs to overcome the noise that obscures any effect of the intervention. Reducing that noise permits better inferences from the same data, or the same inferences from less data; one way to do so is to balance observed characteristics between control and experimental datapoints.

A particularly dramatic example of this approach is running experiments on identical twins rather than regular people, because twins vary far less from each other than random people do, due to shared genetics & family environment. In 1931, the great statistician Student (William Sealy Gosset) noted problems with an extremely large (n = 20,000) Scottish experiment in feeding children milk (to see if they grew more in height or weight). He claimed that the experiment could have been done far more cost-effectively, with >95% fewer children, if it had been conducted using twins, and that 100 identical twins would have been more accurate than 20,000 children. He did not, however, provide any calculations or data demonstrating this.

I revisit the issue and run a power calculation on height indicating that Student’s claims were correct and that the experiment would have required ~97% fewer children if run with twins.

This reduction is not unique to the Scottish milk experiment on height/weight, and in general, one can expect a reduction of 89% in experiment sample sizes using twins rather than regular people, demonstrating the benefits of using behavioral genetics in experiment design/power analysis.
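
To make the logic of such a power calculation concrete, here is a minimal sketch using the standard normal-approximation sample-size formulas for an unpaired two-arm design versus a paired (twin) design. It is not the analysis from the essay itself: the function names, the standardized effect size `effect_sd`, and the within-pair correlation `twin_r` are hypothetical illustration values, not figures taken from the Scottish data or from Student.

```python
from math import ceil
from statistics import NormalDist


def z_sum(alpha=0.05, power=0.80):
    """Sum of the critical z-values for a two-sided test at the given power."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def subjects_unpaired(effect_sd, alpha=0.05, power=0.80):
    """Total subjects for two arms of unrelated individuals.

    effect_sd is the assumed treatment effect in standard-deviation units.
    Per-arm n = 2 * (z_sum / effect_sd)^2 under the normal approximation.
    """
    n_per_arm = 2 * (z_sum(alpha, power) / effect_sd) ** 2
    return 2 * ceil(n_per_arm)


def subjects_twins(effect_sd, twin_r, alpha=0.05, power=0.80):
    """Total subjects when one twin per pair is treated and one is a control.

    twin_r is the assumed within-pair correlation of the outcome; the
    variance of a within-pair difference is 2 * sigma^2 * (1 - twin_r).
    """
    pairs = 2 * (1 - twin_r) * (z_sum(alpha, power) / effect_sd) ** 2
    return 2 * ceil(pairs)


if __name__ == "__main__":
    effect_sd = 0.1  # hypothetical small effect, in SD units
    for twin_r in (0.5, 0.9, 0.95):  # hypothetical twin correlations
        unpaired = subjects_unpaired(effect_sd)
        twins = subjects_twins(effect_sd, twin_r)
        saved = 1 - twins / unpaired
        print(f"r={twin_r}: unpaired={unpaired}, twins={twins}, saved={saved:.0%}")
```

Under these simplifying assumptions, the fraction of subjects saved is approximately the within-pair correlation itself, which is why a trait on which identical twins are very similar yields such large savings.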