“The Optimum Emphasis on Dams’ Records When Proving Dairy Sires”, Jay L. Lush, 1944-11-01:

Nearly all sire indexes which have been proposed can be described by the general equation

$$I = a + c\,(X - bY)$$

in which a, b, and c are constants, X is the average production of the daughters, Y is the average production of their dams and I is the index.

The size of a affects only the general level (the mean) of the indexes. The size of c affects the variability of I but not its accuracy for comparing the breeding values (G) of 2 or more indexed sires. The size of b affects the accuracy of the index as well as its variability.
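A short worked equation may help show why only b matters for accuracy. It assumes that "accuracy" means the correlation between breeding value G and the index I, that the index has the form I = a + c(X − bY), and it writes r_GX, r_GY and r_XY for the pairwise correlations among G, X and Y (notation supplied here for the sketch):

```latex
r_{GI}
  = \frac{\operatorname{Cov}\bigl(G,\; a + c\,(X - bY)\bigr)}{\sigma_G\,\sigma_I}
  = \frac{c\,\bigl(r_{GX}\,\sigma_X - b\,r_{GY}\,\sigma_Y\bigr)}
         {|c|\,\sqrt{\sigma_X^2 - 2b\,r_{XY}\,\sigma_X\sigma_Y + b^2\sigma_Y^2}},
\qquad
\sigma_I = |c|\,\sqrt{\sigma_X^2 - 2b\,r_{XY}\,\sigma_X\sigma_Y + b^2\sigma_Y^2}
```

Under these assumptions a drops out entirely, c cancels from the correlation for any positive c while still scaling σ_I, and b remains in both numerator and denominator.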

The main contribution of this paper is in showing that maximum accuracy of the index is attained when

$$b = \frac{\sigma_X}{\sigma_Y} \cdot \frac{r_{GX}\, r_{XY} - r_{GY}}{r_{GX} - r_{GY}\, r_{XY}}$$

If rGY = zero this optimum value of b becomes simply the regression of X on Y. If rGY has a small positive value (as is possible if breeders whose cows have high records generally try harder than other breeders to get good bulls—and if the extra efforts are partially successful) the optimum value of b is a little less than the regression of X on Y. The regression of X on Y is about 0.5 to 0.6 both for milk and for test in most sets of data actually used for proving dairy sires. The optimum value for b in dairy data will, therefore, be not far from 0.5.
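The two cases described above can be checked numerically. The following is a minimal sketch, assuming the optimum weight is b = (σX/σY)(rGX·rXY − rGY)/(rGX − rGY·rXY); the function name and all input values are hypothetical and are not taken from the paper's data.

```python
def optimum_b(sigma_x, sigma_y, r_gx, r_gy, r_xy):
    """Weight on the dams' average that maximises corr(G, X - b*Y)."""
    return (sigma_x / sigma_y) * (r_gx * r_xy - r_gy) / (r_gx - r_gy * r_xy)


# Hypothetical inputs chosen only for illustration.
sigma_x, sigma_y = 1.0, 1.0
r_gx, r_xy = 0.7, 0.55

# Ordinary regression of X on Y.
regression_x_on_y = r_xy * sigma_x / sigma_y

# Case 1: r_GY = 0, so the optimum equals the regression of X on Y.
b_zero = optimum_b(sigma_x, sigma_y, r_gx, 0.0, r_xy)

# Case 2: a small positive r_GY pulls the optimum slightly below it.
b_small = optimum_b(sigma_x, sigma_y, r_gx, 0.05, r_xy)

print(f"regression of X on Y: {regression_x_on_y:.3f}")
print(f"optimum b, r_GY = 0.00: {b_zero:.3f}")
print(f"optimum b, r_GY = 0.05: {b_small:.3f}")
```

With these illustrative inputs the second value falls a little below the first, which is the direction of the effect described in the text.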

If rGY is zero, selection of sires on the optimum index, as thus defined, will make 1⧸√(1 − rXY²) times as much progress as choosing the sires on the average of their daughters alone. The size of this factor, when rGY is very small and rXY has such values as are usually encountered in proving dairy sires, is about 1.12 to 1.20.
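The quoted range can be reproduced with a few lines of arithmetic. This sketch assumes the factor is 1/√(1 − rXY²); the rXY values are hypothetical, picked because they land near the stated 1.12 to 1.20, and are not figures reported in the paper.

```python
import math


def progress_factor(r_xy):
    """Relative progress of the optimum index versus daughters' average alone,
    assuming r_GY = 0."""
    return 1.0 / math.sqrt(1.0 - r_xy ** 2)


# Hypothetical values of r_XY for illustration.
for r_xy in (0.45, 0.50, 0.55):
    print(f"r_XY = {r_xy:.2f} -> factor = {progress_factor(r_xy):.2f}")
```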

The size of rXY or of the regression of X on Y is affected more by the correlation (v) between a daughter’s record and the record of a mate of her sire, other than her own dam, than it is by the correlation (r) between a daughter and her own dam, especially when n is large. The regression of X on Y approaches v⁄u and rXY approaches v⁄√(uw) as a limit when n becomes extremely large, u being the phenotypic correlation between the mates of the same sire and w being the phenotypic correlation between daughters of a sire.
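The approach to these limits can be sketched numerically. The finite-n expressions below are not quoted from the paper: they follow from assuming that every record has the same variance and that each sire has n daughter-dam pairs with one daughter per dam, so that X and Y are averages of n correlated records. All correlation values are hypothetical.

```python
import math


def regression_x_on_y(n, r, v, u):
    """Regression of the daughters' average on the dams' average for n pairs
    (assumes equal variances and one daughter per dam)."""
    return (r + (n - 1) * v) / (1 + (n - 1) * u)


def corr_xy(n, r, v, u, w):
    """Correlation between the daughters' average and the dams' average."""
    return (r + (n - 1) * v) / math.sqrt((1 + (n - 1) * u) * (1 + (n - 1) * w))


# Hypothetical correlations for illustration only.
r, v, u, w = 0.30, 0.10, 0.20, 0.25

for n in (1, 5, 20, 100, 10_000):
    print(n, round(regression_x_on_y(n, r, v, u), 4), round(corr_xy(n, r, v, u, w), 4))

print("limits:", v / u, round(v / math.sqrt(u * w), 4))
```

In these expressions r enters once while v is weighted by n − 1, which is why v dominates as n grows.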

A sire index can be made as variable as desired by adjusting c. The value 2.0, used for c in the intermediate or equal-parent indexes, makes σI generally just a little larger than σD or σO. This index can be used rather fairly for comparing proven sires directly with individual cows, as is necessary in evaluating pedigrees. It is, however, more variable than real breeding values. Consequently, if it is to be used directly as the sire’s most probable breeding value, the index needs first to be regressed far toward the breed average (just as cows’ records do) to allow for the average amount of non-genetic variation in such indexes. Approximately this amount of regression would already be accomplished in an index which used for c twice the heritability of differences between the records of individual cows. Rice’s (1944) proposed “NEW” index, which uses 1.0 for c, is the equal-parent index regressed halfway toward the breed average. It is, therefore, half as variable but has exactly the same accuracy.
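The relation between the two indexes can be illustrated with a small sketch. It assumes the general form I = a + c(X − bY) with b = 0.5 for both indexes, and it infers a = half the breed average for the c = 1.0 index from the statement that it is the equal-parent index regressed halfway toward the breed average. The production figures and the breed average are hypothetical.

```python
def sire_index(x, y, a, b, c):
    """General sire index I = a + c * (X - b * Y)."""
    return a + c * (x - b * y)


# Hypothetical figures (e.g. pounds of butterfat), for illustration only.
breed_avg = 400.0
x, y = 450.0, 420.0  # daughters' average, dams' average

# Equal-parent index: c = 2.0, b = 0.5, i.e. 2X - Y.
equal_parent = sire_index(x, y, a=0.0, b=0.5, c=2.0)

# Index with c = 1.0; a is inferred here as half the breed average.
rice_new = sire_index(x, y, a=0.5 * breed_avg, b=0.5, c=1.0)

# Equal-parent index regressed halfway toward the breed average.
halfway = breed_avg + 0.5 * (equal_parent - breed_avg)

print(equal_parent, rice_new, halfway)  # 480.0 440.0 440.0
```

Because the second index is a linear rescaling of the first, it ranks sires identically, which is consistent with its having half the variability but the same accuracy.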