I investigate the relationship between fertility and intelligence in a data set from Wisconsin

A new study finds that among men born in Sweden from 1951-1967, less intelligent men tend to have fewer kids. I look into whether this pattern is also true in people in a data set of people born in Wisconsin in ~1939.

The data is from the Wisconsin Longitudinal Study. Data preparation is performed here.

```
library(pacman)
p_load(tidyverse, magrittr, broom, rsample, janitor, feather, plotrix)
source('../../src/extra.R', echo = F, encoding="utf-8")
df <- read_feather("data/data.f") %>%
select(iq100, nkids, gender, rtype, income, eduyears, nkids75, nkids92) %>%
drop_na(iq100, nkids, gender)
levels = c("< 74", "74-81", "81-89", "89-96", "96-104",
"104-111", "111-119", "119-126", "> 126")
df %<>% drop_na(iq100, nkids, gender)
df %<>%
mutate(
iq_group = case_when(
iq100 < 74 ~ "< 74",
between(iq100, 74, 81) ~ "74-81",
between(iq100, 81, 89) ~ "81-89",
between(iq100, 89, 96) ~ "89-96",
between(iq100, 96, 104) ~ "96-104",
between(iq100, 104, 111) ~ "104-111",
between(iq100, 111, 119) ~ "111-119",
between(iq100, 119, 126) ~ "119-126",
iq100 > 126 ~ "> 126"
)
) %>%
mutate(iq_group = factor(iq_group, levels = levels))
```

```
df %>% tabyl(gender) %>% select(-percent)
```

gender | n |
---|---|

female | 5540 |

male | 5001 |

I divide the data set into the same IQ subgroups as in the Swedish study. We do not see the same relationship as in the Swedish study between low IQ and low fertility in men. In women we see a strong negative correlation between IQ and fertility.

```
df %>% group_by(gender, iq_group) %>%
summarise(
mean = mean(nkids),
se = plotrix::std.error(nkids)
) %>%
ggplot(aes(x = iq_group, y = mean, group=gender, color=gender)) +
geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width=.1) +
geom_line(size=1.1) +
geom_point(size=3) +
labs(x = "IQ", y = "n_children")
```

Note that there are some issues with the reporting of number of children in the data set. The initial question was given in 1975, when the participants were ~36 years old. Thus there is still time to have more children, especially for men. This is likely the reason that men’s overall fertility is lower than women’s in the figure. The question was repeated again in 1992, and the above figure includes this update, but there was quite low response rate. It is also possible that especially more intelligent men tend to have kids later in life, which would make us question the dip in the graph for men at high IQ.

The gender difference in the influence of IQ on the number of offspring is significant in an interaction test:

```
df <- bind_rows(
df %>% filter(gender == "male") %>% mutate(nkids_z = stdize(nkids)),
df %>% filter(gender == "female") %>% mutate(nkids_z = stdize(nkids))
)
lm(nkids_z ~ iq100 * gender, df)
```

Estimate | Std. Error | t value | Pr(>|t|) | |
---|---|---|---|---|

(Intercept) |
0.637 | 0.0945 | 6.74 | 1.67e-11 |

iq100 |
-0.00637 | 0.000936 | -6.81 | 1.04e-11 |

gendermale |
-0.336 | 0.133 | -2.53 | 0.0115 |

iq100:gendermale |
0.00337 | 0.00131 | 2.57 | 0.0103 |

A subset of the data set also has polygenic scores available. Thus we can calculate the expected decline in IQ / education polygenic scores for the population caused by selection. This is the same method as used by Kong et al. We use the polygenic score for educational attainment as a proxy for the polygenic score for IQ, since they have very high correlation.

```
df <- read_csv("data/data_nosibs.csv", guess_max=1500) %>%
mutate_at(vars(starts_with("z_cmkdb")), funs(. - 1900 - birthyear)) %>%
mutate(mean_birth_age = rowMeans(select(., starts_with("z_cmkdb")), na.rm = T)) %>%
drop_na(pgs_edu)
```

```
df %>% tabyl(gender) %>% select(-percent)
```

gender | n |
---|---|

female | 2783 |

male | 2496 |

\[\begin{equation} X = \text{Education attainment polygenic score / IQ with mean 0 and std 1} \end{equation}\]

\[\begin{align} Regression 1:\nonumber \\ \text{Number of children} &\sim X \\ \text{Number of children} &= a + bX \\ \nonumber \\ Regression 2:\nonumber \\ \text{Average age at child birth} &\sim X \\ \text{Average age at child birth} &= c + dX \end{align}\]

\[\begin{equation} \beta = \frac{b}{a*c} - \frac{d * log(\frac{a}{2})}{c^2} \end{equation}\]

\[\begin{align} \text{If X is polygenic score:} \nonumber \\ \text{Mean change in genetic edutainment per year} &= \beta \\ \nonumber \\ \text{If X is IQ:} \nonumber \\ \text{Mean change in genetic intelligence per year} &= h^2 * \beta \end{align}\]

```
set.seed(0)
n_samples <- 1000
calc_bootstrap <- function(split, measure) {
r1 <- lm(as.formula(glue("nkids ~ {measure}")), analysis(split)) %>%
tidy %$% estimate
r2 <- lm(as.formula(glue("mean_birth_age ~ {measure}")), analysis(split)) %>%
tidy %$% estimate
a <- r1[1]; b <- r1[2]; c <- r2[1]; d <- r2[2]
return(b / (a * c) - d *log(a/2) / (c*c))
}
get_beta <- function(g) {
boots <- rsample::bootstraps(df %>% filter(gender==g), times = n_samples)
res <- boots %>% mutate(beta = map_dbl(splits, calc_bootstrap, "pgs_edu"))
alpha <- .05
res[3] %>%
summarise(gender = g,
mean_beta = mean(beta),
high = quantile(beta, 1 - alpha / 2),
low = quantile(beta, alpha / 2)
)
}
betas <- bind_rows(get_beta("male"), get_beta("female"))
betas <- bind_rows(
betas,
betas %>% summarise(
mean_beta = mean(mean_beta),
high = mean(high),
low = mean(low),
gender = "combined"
)
) %>%
rename("CI_95%_lower_bound" = high, "CI_95%_upper_bound" = low)
beta <- betas %>% filter(gender == "combined") %>% pull(mean_beta)
betas
```

gender | mean_beta | CI_95%_lower_bound | CI_95%_upper_bound |
---|---|---|---|

male | -9e-04 | -9.14e-05 | -0.0017 |

female | -0.00137 | -0.00065 | -0.00211 |

combined | -0.00114 | -0.000371 | -0.0019 |

We use the combined beta in this estimate.

```
change_pgs = beta * 10
change_iq_per_pgs = coef(lm(iq100 ~ pgs_edu, df))[["pgs_edu"]]
change_iq_from_pgs = change_pgs * change_iq_per_pgs
h2_iq_high = 0.8
h2_iq_low = 0.4
explained_var_pgs = summary(lm(iq100 ~ pgs_edu, df))$adj.r.squared
change_iq_high = change_iq_from_pgs * h2_iq_high / explained_var_pgs
change_iq_low = change_iq_from_pgs * h2_iq_low / explained_var_pgs
tribble(
~Description, ~Value, ~Variable,
"Change in EDU PGS per decade", change_pgs, "change_pgs",
"Change in IQ per std change in EDU PGS", change_iq_per_pgs, "change_iq_per_pgs",
"Change in IQ per decade estimated from the change in EDU PGS",
change_iq_from_pgs, "change_iq_from_pgs",
"Variance in IQ in the data that is explained by the EDU PGS",
explained_var_pgs, "explained_var_pgs")
tribble(~"IQ Heritability estimate", ~"Predicted change in IQ per decade",
"0.8 (classical behavior genetics)", change_iq_high,
"0.4 (GCTA GREML)", change_iq_low)
```

Description | Value | Variable |
---|---|---|

Change in EDU PGS per decade | -0.0114 | change_pgs |

Change in IQ per std change in EDU PGS | 4.65 | change_iq_per_pgs |

Change in IQ per decade estimated from the change in EDU PGS | -0.0528 | change_iq_from_pgs |

Variance in IQ in the data that is explained by the EDU PGS | 0.0981 | explained_var_pgs |

IQ Heritability estimate | Predicted change in IQ per decade |
---|---|

0.8 (classical behavior genetics) | -0.43 |

0.4 (GCTA GREML) | -0.215 |