“Quantifying Bias from Measurable & Unmeasurable Confounders Across 3 Domains of Individual Determinants of Political Preferences”, Rafael Ahlskog, Sven Oskarsson, 2022-02-22:

A core part of political research is to identify how political preferences are shaped. The nature of these questions is such that robust causal identification is often difficult to achieve, and we are often stuck with observational methods that we know have limited causal validity.

The purpose of this paper is to measure the magnitude of bias stemming from both measurable and unmeasurable confounders across 3 broad domains of individual determinants of political preferences: socio-economic factors [education, income, wealth], moral values [social trust, altruism & antisocial attitudes, utilitarian judgement], and psychological constructs [risk preferences, Extraversion, locus of control, IQ]. We leverage a unique combination of rich Swedish population registry data for a large sample of identical twins, with a comprehensive battery of 34 political preference measures, and build a meta-analytical model comparing our most conservative observational (naive) estimates with discordant twin estimates. This allows us to infer the amount of bias from unobserved genetic and shared environmental factors that remains in the naive models for our predictors, while avoiding precision issues common in family-based designs.

The results are sobering: in most cases, substantial bias remains in naive models. A rough heuristic is that about half of the effect size even in conservative observational estimates is composed of confounding.

[Keywords: policy preferences, causal inference, twin, family fixed effects, genetic confounding]

…The results are sobering: for a large set of important determinants, a substantial bias seems to remain even in conservative naive models. In a majority of cases, half or more of the naive effect size appears to be composed of confounding, and in 0 cases are the naive effect sizes underestimated. The implications of this are important. First, it provides a reasonable bound on effect estimates stemming from observational methods without similar adjustments for unobserved confounders. While the degree of bias will vary depending on both predictors and outcomes, a rough but useful heuristic derived from the results of this paper is that effect sizes are often about half as big as they appear. Second, future research will have to consider more carefully the confounding effects of genetic factors and elements of the rearing environment that are not easily captured and controlled for.

Method: The method employed follows 3 steps for each predictor separately. First, 3 regression models (empty, naive, and within, as outlined below) are run for each political preference outcome in the sample of complete twin pairs. Second, a meta-analytical average for all outcomes, per model, is calculated. Third, this average effect size is compared across models to see how it changes with specification…The precision problem is at least partially solved by the aggregation of many [34] outcomes: while we should expect standard errors to be higher in the discordant models, the coefficients should not change in any systematic direction if the naive effect sizes are unbiased. Systematic changes in the average effect size across the different preference items are therefore a consequence of model choice (and, we argue, a reduction in bias) rather than variance artefacts.
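The second and third steps can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: it assumes a fixed-effect (inverse-variance) average, which may differ from the meta-analytical model used in the paper, and the function names `pooled_effect` and `confounded_share` as well as all coefficients and standard errors are made up for the example.

```python
import numpy as np


def pooled_effect(betas, ses):
    """Inverse-variance weighted average of per-outcome coefficients.

    Assumption: a fixed-effect meta-analytic average. The returned standard
    error treats outcomes as independent, which is optimistic when all
    outcomes are measured on the same respondents.
    """
    betas = np.asarray(betas, dtype=float)
    ses = np.asarray(ses, dtype=float)
    weights = 1.0 / ses**2
    estimate = float(np.sum(weights * betas) / np.sum(weights))
    se = float(np.sqrt(1.0 / np.sum(weights)))
    return estimate, se


def confounded_share(naive, within):
    """Share of the naive average effect that disappears in the within model."""
    return 1.0 - within / naive


# Hypothetical coefficients for one predictor across three outcomes.
naive_betas, naive_ses = [0.10, 0.12, 0.08], [0.02, 0.02, 0.02]
within_betas, within_ses = [0.05, 0.07, 0.03], [0.04, 0.04, 0.04]

naive_avg, naive_se = pooled_effect(naive_betas, naive_ses)
within_avg, within_se = pooled_effect(within_betas, within_ses)
share = confounded_share(naive_avg, within_avg)

print(f"naive average:  {naive_avg:.3f} (SE {naive_se:.3f})")
print(f"within average: {within_avg:.3f} (SE {within_se:.3f})")
print(f"share of naive effect attributable to confounding: {share:.2f}")
```

In this toy input the within-pair standard errors are twice as large per outcome, yet the pooled within estimate is still clearly separated from the naive one, which is the sense in which aggregating many outcomes eases the precision problem.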

Models: Naive: The second model (the “naive” model, n), and hence the first model comparison, adds a comprehensive set of controls available in the register data. The ambition is to produce as robust a model as possible with conventional statistical controls. The controls include possible contextual (municipal fixed effects), familial (parental birth years, income, and education) and individual (occupational codes, income, and education) confounders. In total, this should produce a model that is fairly conservative…Within: Finally, the third model (the “within” model, w) adds twin-pair fixed effects, producing a discordant twin design. This controls for all unobserved variables shared within an identical twin pair, that is, genetic factors, upbringing and home environment, as well as possible neighborhood and network effects.
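A small simulation can show why twin-pair fixed effects remove confounding from anything shared within a pair. It is a sketch on synthetic data, not a reproduction of the paper's models: the "naive" regression here has no controls at all, the confounder `u` stands in for all shared genetic and environmental factors, and the values of `true_beta` and the confounder strength are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs = 20_000
true_beta = 0.2  # assumed causal effect of the predictor on the outcome

# One unobserved confounder per twin pair, shared by both twins.
u = rng.normal(size=n_pairs)

# Predictor and outcome for each twin; arrays have shape (n_pairs, 2).
x = u[:, None] + rng.normal(size=(n_pairs, 2))
y = true_beta * x + 0.5 * u[:, None] + rng.normal(size=(n_pairs, 2))


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sum(xc * xc))


# Pooled regression that ignores pair membership: biased by u.
naive_beta = ols_slope(x.ravel(), y.ravel())

# Twin-pair fixed effects: demean within each pair, which cancels u.
x_within = x - x.mean(axis=1, keepdims=True)
y_within = y - y.mean(axis=1, keepdims=True)
within_beta = ols_slope(x_within.ravel(), y_within.ravel())

print(f"true effect:     {true_beta:.2f}")
print(f"naive estimate:  {naive_beta:.2f}")
print(f"within estimate: {within_beta:.2f}")
```

Under these assumptions the pooled estimate is inflated by the shared confounder while the within-pair estimate stays close to `true_beta`. The within estimate uses only variation between twins in the same pair, which is why it is noisier in real samples.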

Figure 2: Main results, all outcomes, ‘naive’ versus ‘within’. Average beta coefficients across all outcomes, per model and predictor. 90% confidence intervals shown.