Measuring antisocial behavior well

I have previously written about problems with measuring antisocial/criminal behavior, but I found a study that does it very well, so let's go through it.

Genetic and environmental influences on childhood antisocial and aggressive behavior (ASB) during childhood were examined in 9- to 10-year-old twins, using a multi-informant approach. The sample (605 families of twins or triplets) was socioeconomically and ethnically diverse, representative of the culturally diverse urban population in Southern California. Measures of ASB included symptom counts for conduct disorder, ratings of aggression, delinquency, and psychopathic traits obtained through child self-reports, teacher, and caregiver ratings. Multivariate analysis revealed a common ASB factor across informants that was strongly heritable (heritability was .96), highlighting the importance of a broad, general measure obtained from multiple sources as a plausible construct for future investigations of specific genetic mechanisms in ASB. The best fitting multivariate model required informant-specific genetic, environmental, and rater effects for variation in observed ASB measures. The results suggest that parent, children, and teachers have only a partly “shared view” and that the additional factors that influence the “rater-specific” view of the child’s antisocial behavior vary for different informants. This is the first study to demonstrate strong heritable effects on ASB in ethnically and economically diverse samples.


The USC Twin Study of Risk Factors for Antisocial Behavior is a longitudinal study of the interplay of genetic, environmental, social, and biological factors on the development of antisocial behavior across adolescence. The first wave of assessment occurred during 2001 to 2004, when the twins were 9 to 10 years old, with a 2-year follow-up assessment in the laboratory when twins were ages 11 to 12. Two additional follow-up assessments will be conducted when the twins are ages 14 to 15 (third wave) and 16 to 17 years old (fourth wave). The present analyses are based on data from the first wave. Comprehensive assessment of each child was made, including cognitive, behavioral, psychosocial, and psycho-physiological measures based on individual testing and interviews of the child and primary caregiver during the laboratory visit, with additional teacher surveys completed and returned by mail. A detailed description of the study, including a summary of the measures, can be found in .

So, the sample is kinda small for a behavioral genetics (BG) study, but large enough to show a general pattern of results.

Measurement agreement across observers is not that high:

Another important aspect to consider when comparing results across studies is the source of the information about ASB. It is well-known that different informants produce different reports of a child’s behavior. Correlations between raters of the same child are typically about .60 between mother and father ratings, .28 between parent and teacher ratings, and .22 between the parent and child ratings (). Largely, each rater provides a unique perspective on the child’s behavior. Children would seem to be the most knowledgeable source to report on their own behavior (particularly covert actions) as well as their motivations, although their cognitive development, truthfulness, and social desirability factors may limit the accuracy of their reports. Parents may be more able to objectively report on a child’s externalizing behaviors, although they may be unaware of covert actions or unwilling to report them to researchers. Although teachers’ reports may also have the advantage of greater objectivity, teachers may have limited knowledge of the child’s antisocial behavior, particularly as it may occur outside of classroom or other school settings. Although researchers sometimes combine ratings across reporters in an attempt to increase scale reliability, different etiologies may exist for scales derived from different informants (, ; ). Thus, the best way to model information from multiple informants is to use a multivariate, factor-based approach that allows for both differences and correlations across informants simultaneously ().

Of course, this means that ‘unique environment’ (Everything Else, as we call it here) will be greatly inflated by random and systematic measurement error.
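To make the inflation concrete, here is a minimal sketch under classical test theory assumptions: the observed score is a true score plus random error, and the error is uncorrelated between twins, so all of it ends up in the unique environment term. The function name and the numbers (true heritability of .80, reliabilities of .60 and .90) are illustrative assumptions, not estimates from the study, and the sketch covers random error only.

```python
def observed_ace(true_a, true_c, true_e, reliability):
    """Return observed (A, C, E) variance shares for a measure with the given reliability.

    Assumes classical test theory: random error is uncorrelated across twins,
    so the unreliable share of variance (1 - reliability) is added to E.
    """
    if abs(true_a + true_c + true_e - 1.0) > 1e-9:
        raise ValueError("true variance components must sum to 1")
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    a = true_a * reliability
    c = true_c * reliability
    e = true_e * reliability + (1.0 - reliability)
    return a, c, e


# Hypothetical values: same true components, two different reliabilities.
for rel in (0.60, 0.90):
    a, c, e = observed_ace(0.80, 0.00, 0.20, rel)
    print(f"reliability={rel:.2f}: A={a:.2f} C={c:.2f} E={e:.2f}")
```

With these made-up inputs, the lower the reliability, the more the observed heritability shrinks and the more the E share grows, even though the true components are identical.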

To avoid issues with measurement error, this study used an impressive collection of measures:

The present study used a total of 18 different measures of antisocial behavior taken from five different instruments from a total of three unique informants (caregivers, teachers, and children). Instruments varied in terms of their mode of assessment, with some being administered through semistructured interviews (i.e., the Diagnostic Interview Schedule for Children—Version IV [DISC–IV]) and others through questionnaires administered either in an interview format (i.e., the Childhood Aggression Questionnaire [CAQ] and the Child Psychopathy Scale [CPS]) or in paper-and-pencil format (i.e., the Child Behavior Checklist [CBCL]). Each instrument was given to at least two of the three possible informants. The following sections provide detailed information about each of the five instruments, including information about the instrument itself, mode of assessment, informant type, and use of any relevant subscales.

The main thing missing is some kind of official crime record from the criminal justice system. The first author tells me by email that this is of course not possible here because the twins were 9-10 years old at the time. Perhaps school delinquency records? She further notes that since they are now in their early 20s, this information is being pursued, so perhaps in a few years we will get an update with a longitudinal design.

What is particularly nice here is that they used a test-retest approach to estimate random measurement error for a subsample (n=60).

We are mostly interested in the general factor (the first principal component), which has a quite high test-retest correlation of .94 in the total sample. Girls’ antisocial behavior seems harder to measure, since all their reliabilities are lower. If not accounted for, this will produce a spurious sex difference in the estimates. However, the subsample is too small to be sure of this sex difference in reliability.
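To get a feel for how much a retest correlation can wobble at this sample size, here is a rough sketch using the standard Fisher z-transform confidence interval. The r = .94 and n = 60 come from the text above; n = 30 is a hypothetical per-sex subsample size, since the actual sex split is not given here. The interval also assumes independent observations, which twins are not, so treat it as optimistic.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation (Fisher z)."""
    if n <= 3:
        raise ValueError("need n > 3")
    z = math.atanh(r)              # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# n = 60: the retest subsample; n = 30: hypothetical size of one sex within it.
for n in (60, 30):
    lo, hi = fisher_ci(0.94, n)
    print(f"n={n}: 95% CI [{lo:.2f}, {hi:.2f}]")
```

The interval widens as the subsample is split, which is why a difference in reliability between boys and girls is hard to pin down from a retest group this small.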

They present the complete correlation matrix for all their measures, and note that:

Additional comparisons of caregiver, teacher, and child reports of ASB were made by computing correlations between informants for the various scales (see Table 4). Informant agreement (indicated in boldface type in Table 4 for each measure common to two or more raters) was lowest between the child and either the caregiver or teacher (r = .17 to .29 for boys; r = .02 to .21 for girls). Agreement between caregiver and teacher ratings was somewhat higher (r = .26 to .43 for boys; r = .10 to .21 for girls) across the board. Although not shown in Table 4, correlations across raters for the composite measure of antisocial behavior (described in the next section) were also significant: r = .30 for caregiver–child agreement, r = .23 for child–teacher agreement, and r = .44 for caregiver–teacher agreement (sexes combined).

Principal-components analysis

Although all of the within-rater correlations were significant and were of moderate to high magnitude, they were not unity, which at first blush might indicate that heterogeneity of ASB may exist in these preadolescent children. The positive manifold of correlations within each rater, however, is suggestive of a single, general factor of antisocial behavior underlying the various measures. Principal-components analyses of the ASB scales within each rater confirmed that a single factor could account for much of the variance among these measures. Loadings on the first principal component within each rater are provided in Table 5, along with the percentage of variance explained among the scales in each case. All factor loadings were .70 or higher, and the general ASB factor accounted for 57.4% of the variance among the child report measures of ASB, 58.7% of variance among caregiver reports, and 77.4% of variance among teacher reports. Within each rater, scree plots clearly indicated a strong preference for a single principal component, such that only the first eigenvalue exceeded 1.0 (i.e., 3.44 for child report measures, 4.11 for caregiver ratings, and 3.87 for teacher ratings) with the second eigenvalue being clearly less than 1.0 in all three analyses (0.70, 0.72, and 0.41 for child, caregiver, and teacher ratings, respectively). It would thus appear that there is considerable overlap between the individual ASB scales, consistent with the notion of a general externalizing factor (). We therefore computed composite measures of ASB for each rater (using factor-weighted scores), and used these in the multivariate genetic models. 
It is noteworthy that the 6-month test–retest correlations were strong for the composite scores (r = .81 for child reports and .94 for caregiver reports) and that interrater agreement for the three composites (r = .30 between caregiver and child, r = .23 between child and teacher, and r = .44 between caregiver and teacher) was comparable to—and, in many instances, higher than—the values for each individual scale reported in Table 4.
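The eigenvalues and variance percentages quoted above are linked by simple arithmetic: in a principal-components analysis of a correlation matrix, a component's share of variance is its eigenvalue divided by the number of variables. A small sketch of that check follows. The number of scales per rater (6, 7, and 5) is my assumption, inferred from the 18 measures in total and the reported percentages; it is not stated in the quoted text.

```python
# First eigenvalues per rater, as quoted from the paper above.
first_eigenvalue = {"child": 3.44, "caregiver": 4.11, "teacher": 3.87}

# ASSUMPTION: number of ASB scales per rater, inferred (they sum to 18), not quoted.
n_scales = {"child": 6, "caregiver": 7, "teacher": 5}


def variance_share(eigenvalue, k):
    """Share of total variance for one component of a k x k correlation matrix."""
    return eigenvalue / k


for rater, ev in first_eigenvalue.items():
    share = variance_share(ev, n_scales[rater])
    print(f"{rater}: {100 * share:.1f}% of variance")
```

Under that assumption the shares land within rounding of the reported 57.4%, 58.7%, and 77.4%.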

Generally speaking, teacher reports look more trustworthy than self-reports or caregiver reports, in that they have a stronger general factor. My interpretation is that 1) teachers aren’t intimately linked to the child in question, so they are more impartial; 2) they have a lot of experience with different kids, so they are aware of real differences; and 3) using teachers’ ratings avoids the common method variance that comes from relying on one parent to rate their multiple children (twins in this case).

Their main result:

Standardized parameter estimates from the full common pathways model with equal effects across gender are provided in Figure 2. Estimates shown to be statistically significant at p < .05 are indicated with an asterisk (based on results of post hoc analyses; these analyses are available upon request). As shown, the common ASB factor underlying all three raters was primarily explained by genetic influences, with a heritability of .96 and no effect of shared twin environment. (In order to calculate estimates for proportions of variation, each standardized parameter estimate shown in Figure 2 is squared; i.e., h² of shared view = .98².) Only a small proportion of variation in the underlying latent factor (.04) was explained by nonshared environmental influences (.19²). Moreover, post hoc analyses indicated that these nonshared environmental influences were not statistically significant and that all variation in the latent ASB factor representing the shared view could be accounted for entirely by genetic influence (i.e., the h² of the latent factor = 1.0). Figure 2 also demonstrates that the latent factor representing the shared viewpoint accounted for only 17.6% of the overall variation in child reports (.42²) but explained approximately one third (.55² = .303) and nearly half (.67² = .449) of the variation in teacher and caregiver reports, respectively.
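The quoted passage turns each standardized path coefficient into a variance share by squaring it. A quick sketch of that arithmetic, using only the coefficients quoted above (the dictionary labels are my own shorthand, not the paper's notation):

```python
# Standardized path coefficients as quoted from Figure 2 of the paper.
# The labels are informal shorthand chosen for this sketch.
paths = {
    "genes -> shared view": 0.98,
    "nonshared env -> shared view": 0.19,
    "shared view -> child report": 0.42,
    "shared view -> teacher report": 0.55,
    "shared view -> caregiver report": 0.67,
}


def variance_explained(path_coefficient):
    """Proportion of variance explained by a standardized path."""
    return path_coefficient ** 2


for label, coef in paths.items():
    print(f"{label}: {coef:.2f} squared = {variance_explained(coef):.4f}")
```

The genetic and nonshared environmental shares of the latent factor sum to slightly less than 1 here only because the published coefficients are rounded to two decimals.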

Another way to look at it is the breakdown for each rater’s composite measure:

As the authors conclude:

Our analyses revealed that although mean levels of ASB differed for boys and girls, the sources of individual differences in ASB were similar across gender. One of the most important findings from this study is that a shared view of antisocial behavior is strongly genetically influenced, with little or no effect of shared sibling environment. Although our analyses revealed a moderate genetic basis to individual views of antisocial behavior, with heritabilities ranging from .40 to .50 for individual composites from child, teacher, and caregiver, the estimated heritability of the underlying shared view of antisocial behavior from the common pathways model was nearly 1.0. This latent factor may reflect constellations of stable personality traits (e.g., disinhibition, lack of constraint) that may influence antisocial behavior across many contexts (). This highly heritable common factor representing the shared view across multiple informants could therefore prove especially useful in future investigations of specific genetic associations, or quantitative trait loci, in human aggression and antisocial behavior.

As a bonus, they looked at the classroom effect on the teacher ratings and found the same thing other studies have:

In contrast, for teacher reports, we were able to differentiate rater effects from true shared environmental effects, because although virtually all twins attended the same school, less than half of twin pairs were in the same classroom at school. This allowed us to disentangle shared environmental influences, which would affect the similarity of all twin pairs, regardless of classroom, from rater effects, which would only increase similarity among twins who were rated by the same teacher. In this study, rater effects accounted for more than one fourth (28.1%) of the overall variation in teacher reports. This indicates that the twins in the same classroom are rated more similarly than twins in different classrooms. Although we speculate that this is due to rater bias on the part of the teacher, it is theoretically possible that twins in the same classroom may in fact have a greater shared environment than those in separate classrooms (i.e., a direct classroom effect on behavior). To investigate this possibility, we examined post hoc whether caregiver or child ratings were also more similar if twins were in the same classroom at school, using the same dummy code for shared classroom that we used to evaluate the teacher rater effects (as described earlier). The results of these post hoc analyses indicate that being in the same classroom at school had virtually no effect on twin similarity of antisocial behavior as rated by either caregivers or the twins themselves. Being in the same classroom at school, therefore, does not lead to increased twin similarity in ASB based on either the caregiver’s or the child’s own view. Thus, our findings suggest that reports from teachers may be more heavily influenced by rater bias effects than are ratings from other reporters, leading to a spurious effect of shared environment when teacher reports are examined alone. 
However, in the absence of direct observational data, we cannot rule out the possibility that twins in the same classroom behave more similarly while at school. Nevertheless, if this is the case, it is important to note that these “classroom effects” are situational specific and do not affect similarity of behavior in other contexts.

All in all, this is a great study. I hope there are replications somewhere with larger samples and official measures of antisocial behavior (court records). The finding of near-perfect heritability is questionable, so I doubt a replication will find that value. However, as I said in the last review post: “heritabilities of harmful criminal behavior are probably seriously underestimated”. Going by the results for intelligence and personality when they are measured well, my guess is that the true heritability of antisocial behavior is about the same, around 80%. Perhaps all behavioral traits have about the same heritability of 80%, and the differences we see are artifacts of differential measurement issues.
