Number of siblings vs. total fertility rate

This is commentary on:

For over a century, social scientists have predicted declines in religious beliefs and their replacement with more scientific/naturalistic outlooks, a prediction known as the secularization hypothesis. However, skepticism surrounding this hypothesis has been expressed by some researchers in recent decades. After reviewing the pertinent evidence and arguments, we examined some aspects of the secularization hypothesis from what is termed a biologically informed perspective. Based on large samples of college students in Malaysia and the USA, religiosity, religious affiliation, and parental fertility were measured using self-reports. Three religiosity indicators were factor analyzed, resulting in an index for religiosity. Results reveal that average parental fertility varied considerably according to religious groups, with Muslims being the most religious and the most fertile and Jews and Buddhists being the least. Within most religious groupings, religiosity was positively associated with parental fertility. While cross-sectional in nature, when our results are combined with evidence that both religiosity and fertility are substantially heritable traits, findings are consistent with view that earlier trends toward secularization (due to science education surrounding advancements in science) are currently being counter-balanced by genetic and reproductive forces. We also propose that the inverse association between intelligence and religiosity, and the inverse correlation between intelligence and fertility lead to predictions of a decline in secularism in the foreseeable future. A contra-secularization hypothesis is proposed and defended in the discussion. It states that secularism is likely to undergo a decline throughout the remainder of the twenty-first century, including Europe and other industrial societies.

To investigate fertility differentials, they rely on surveys that include a question about the number of siblings. This is a potentially problematic measure of fertility.

Total fertility rate (TFR) = number of children born per woman over her entire reproductive life.

Suppose we have a population where all women have exactly 2 children (TFR=2). If we sample children at random from this population, the mean number of siblings will be 1.00 (sd=0). If we add 1 to take into account the respondent himself, we get 2.00. So, this seems to be a way to estimate the TFR using another type of data.

However, suppose we have a population with 5 kinds of women: those who have 0, 1, 2, 3 and 4 children. These are equally common, so each is 20%. Pretend that we have 100 women. The distribution looks like this:

  • 20 women with 0 children
  • 20 women with 1 child
  • 20 women with 2 children
  • 20 women with 3 children
  • 20 women with 4 children

The total population of children is 20 * 0 + 20 * 1 + 20 * 2 + 20 * 3 + 20 * 4 = 200. So, TFR=2.00. Now consider the case where we sample the children instead, and you can see the problem:

  • 80 of the children have 3 siblings
  • 60 of the children have 2 siblings
  • 40 of the children have 1 sibling
  • 20 of the children have 0 siblings

The mean number of siblings if we count everybody is (80 * 3 + 60 * 2 + 40 * 1)/200 = 2. If we then add the respondent, we get 3. But the total fertility rate is only 2, not 3. This is because when you sample at the child level, you are more likely to sample from the larger sibships and indeed it’s impossible to sample from the children among women who had no children, so these are necessarily under-counted. Thus, the number of siblings metric is not usually comparable to TFR.
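The arithmetic above can be checked with a few lines of Python. This is only a sketch of the toy example: the function name `child_level_summary` and the dictionary layout (number of children mapped to number of women) are made up for illustration.

```python
def child_level_summary(women_by_children):
    """Return (TFR, mean number of siblings among children).

    women_by_children maps "number of children" to "number of women".
    """
    total_women = sum(women_by_children.values())
    total_children = sum(k * n for k, n in women_by_children.items())
    tfr = total_children / total_women

    # A family with k children contributes k children to the sample,
    # and each of those children has k - 1 siblings.
    sibling_total = sum(k * n * (k - 1) for k, n in women_by_children.items())
    mean_siblings = sibling_total / total_children
    return tfr, mean_siblings


# The toy population from the text: 20 women at each of 0..4 children.
example = {0: 20, 1: 20, 2: 20, 3: 20, 4: 20}
tfr, mean_siblings = child_level_summary(example)
print(tfr, mean_siblings, mean_siblings + 1)  # 2.0 2.0 3.0
```

Women with 0 children drop out of the sibling total automatically, since they contribute no children to sample.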

Two populations that have different sibship distributions but equal TFRs may have different mean numbers of siblings, so the metric may be problematic for comparison purposes. To find out how problematic it is in practice, one needs either data that allow both metrics to be calculated, which means a familial design, or, given a realistic idea of the data-generating process, one can simply simulate some data. My hunch is that for comparison purposes the number of siblings will be a good metric, because the shape of the distribution will not vary markedly between populations.

Here’s some simple empirical support at the country level from Eurostat.

Basically, it follows something close to a Poisson distribution with a lambda that varies by group. The Poisson distribution is tricky to understand. It takes only a single parameter, lambda, which is the average number of events, for instance per unit of time. In our case, it will be children per woman. This way, we are treating the number of children a woman will have as being independent of the number of children she already has, and each birth as a random event with some chance of happening. As such, there is no maximum number of children a woman can have. This is unrealistic for three reasons: 1) It takes some time to have a child, because pregnancy takes time and the reproductive span is limited. If we take history as our guide to realism, the max achieved seems to have been 27 births with 69 offspring. 2) There are genetic defects that cause untreatable infertility, and which thus always result in 0 children. 3) One can have multiple children in one birth (twins, triplets etc.). Still, it’s a good approximation.
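To make the single-parameter point concrete, here is a small sketch that prints the Poisson probabilities for 0 to 6 children. The value lambda = 2 is an arbitrary choice for illustration, not an estimate from any data, and `poisson_pmf` is just a helper name used here.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events under a Poisson with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam = 2.0  # assumed average number of children per woman (illustrative)

# 40 terms is far more than needed for lam = 2; the tail is negligible.
probs = [poisson_pmf(k, lam) for k in range(40)]

for k in range(7):
    print(k, round(probs[k], 3))

# The mean implied by these probabilities is lambda itself.
implied_mean = sum(k * p for k, p in enumerate(probs))
print(round(implied_mean, 6))
```

Note that the distribution assigns a small but nonzero probability to every family size, which is the "no maximum" property mentioned above.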

So, I did some simple simulations and…

So it turns out that the mean number of siblings exactly balances out and matches the TFR when simulating data using a Poisson distribution. Yes, I double checked. There is some deeper math here I don’t understand, but empirical math is good enough for me.
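A simulation of this kind can be sketched as follows. This is not the original code, just a minimal standard-library version under stated assumptions: each woman's number of children is an independent Poisson draw, the lambdas (1.5, 2.0, 3.0) and the sample size are arbitrary, and the helper names `draw_poisson` and `simulate` are made up for illustration.

```python
import math
import random


def draw_poisson(lam, rng):
    """One Poisson draw via Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate(lam, n_women=100_000, seed=1):
    """Return (TFR, mean number of siblings among children) for one run."""
    rng = random.Random(seed)
    children = [draw_poisson(lam, rng) for _ in range(n_women)]

    n_children = sum(children)
    tfr = n_children / n_women

    # Sampling at the child level: a family of size k appears k times,
    # and each of its children reports k - 1 siblings.
    mean_siblings = sum(k * (k - 1) for k in children) / n_children
    return tfr, mean_siblings


for lam in (1.5, 2.0, 3.0):
    tfr, sibs = simulate(lam)
    print(f"lambda={lam}: TFR~{tfr:.3f}, mean siblings~{sibs:.3f}")
```

The two numbers printed on each line should agree up to sampling noise. Note that it is the raw mean number of siblings that lines up with the TFR here, without adding 1 for the respondent.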

PS. The authors are aware that this metric can be problematic.

Obviously, parental fertility cannot be equated with the fertility of couples, of women, or of populations (the most common bases for operationalizing fertility). The main distinction between these more common fertility measures and ours is that our measure over-estimates fertility by excluding all childless couples from being counted.

It seems that it can. But note that this does not imply that it works the same as a comparative measure at the individual level. This I did not test.

Nathaniel Bechhofer worked out the proof.
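
For readers who want the gist, here is a minimal sketch of one way such a proof can go. It is a reconstruction under stated assumptions, not necessarily the argument he gave. Let X be the number of children per woman, with mean μ (the TFR) and variance σ²; these symbols are introduced here for illustration. A randomly sampled child comes from a family of size k with probability proportional to k · P(X = k), and then has k − 1 siblings.

```latex
\begin{align*}
E[\text{siblings}]
  &= \sum_{k \ge 1} (k - 1)\,\frac{k\,P(X = k)}{E[X]}
   = \frac{E[X^2] - E[X]}{E[X]} \\
  &= \frac{\sigma^2 + \mu^2 - \mu}{\mu}
   = \mu + \frac{\sigma^2}{\mu} - 1 .
\end{align*}
% For a Poisson distribution, \sigma^2 = \mu = \lambda, so
% E[\text{siblings}] = \lambda, which is the TFR.
```

So the mean number of siblings equals the TFR exactly when the variance equals the mean, which is the defining property of the Poisson. The toy example with 0 to 4 children happens to have variance 2 and mean 2 as well, which is why its mean number of siblings was also 2 before adding the respondent.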