Stereotypes and accuracy

So I decided to look up some studies after reviewing Pinker’s comments in The Blank Slate. I found some studies mentioned on Wikipedia.


A book chapter was mentioned, and I found the book it is in. I downloaded the book via the excellent ebook site bookos. It is chapter 10.


I also found a .doc of the chapter alone, if someone wants that: jussim et al, unbearable stereotpes, handbook 10-12-06


There is a second type of discrepancy reported in the literature that is still relevant as “inaccuracy,” but has considerably less theoretical or practical importance with respect to stereotypes. Independent of perceiving how two (or more) groups mutually differ on a given attribute (e.g., height), sometimes people have a general tendency to overestimate or underestimate the level of some attribute for all groups. For example, let’s say men and women in the United States average 72 and 66 in. in height, respectively. Fred, however, believes that men and women average 74 and 68 in., respectively. He consistently overestimates height by 2 in. (this is a fairly meaningless “elevation” effect; see, e.g., Judd & Park, 1993; Jussim, 2005), but he does not exaggerate sex differences in height.


In absolute terms, no, since he estimates the difference as 6 in., which is correct. In relative terms, however, his estimate is off, though in the other direction: men are actually 9.09% taller than women, while Fred's figures imply 8.82%. That is, (gender difference / female height) * 100.
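The relative figures can be checked with a few lines of Python. This is only a sketch: the function name `relative_gap_percent` is my own, and the heights are the ones from the book's example.

```python
def relative_gap_percent(male_height, female_height):
    """Height gap expressed as a percentage of the female height."""
    return (male_height - female_height) / female_height * 100

# Actual averages from the example (inches).
actual = relative_gap_percent(72, 66)
# Fred's beliefs: both averages overestimated by 2 in.
fred = relative_gap_percent(74, 68)

print(f"actual: {actual:.2f}%  Fred: {fred:.2f}%")
```

The absolute gap is 6 in. in both cases, so any change in the percentage comes entirely from the larger denominator.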



This standard has been supported by two recent studies that have examined the typical effect sizes found in clinical and social psychological research. One recent review of more than 300 meta-analyses—which included more than 25,000 studies and over 8 million human participants—found that mean and median effect sizes in social psychological research were both about .2 (Richard et al., 2003). Only 24% of social psychological effects exceeded .3. A similar pattern has been found for the phenomena studied by clinical psychologists (Hemphill, 2003). Psychological research rarely obtains effect sizes exceeding correlations of .3. Effect sizes of .4 and higher, therefore, constitute a strong standard for accuracy. Last, according to Rosenthal’s (1991) binomial effect size display, a correlation of at least .4 roughly translates into people being right at least 70% of the time. This means they are right more than twice as often as they are wrong. That seems like an appropriate cutoff for considering a stereotype reasonably accurate.
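The 70% figure follows from how the binomial effect size display is usually computed: a correlation r is shown as two rates, 0.5 + r/2 and 0.5 - r/2. A minimal sketch of that arithmetic, where the function name `besd_rates` is my own choice:

```python
def besd_rates(r):
    """Binomial effect size display: turn a correlation r into two rates."""
    right = 0.5 + r / 2
    wrong = 0.5 - r / 2
    return right, wrong

right, wrong = besd_rates(0.4)
ratio = right / wrong  # how many times more often right than wrong

print(f"right: {right:.0%}  wrong: {wrong:.0%}  ratio: {ratio:.2f}")
```

With r = .4 this gives 70% versus 30%, which is where "more than twice as often" comes from.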


Seriously? Such low correlations? Correlations reported in the intelligence research journals are often much higher than that.



Definitive individuating information

The first situation involves having vividly clear and relevant individuating information about a particular target. We refer to such individuating as “definitive” because it provides a clear, valid, sufficient answer to whatever question one has about a target. For example, when judging academic accomplishments, we might have standardized test scores and class rank and grade point average for a college applicant; when judging sales success, we might have 10 years of sales records for a salesperson; and when judging personality, we might have multiple expert judges’ observations of, and well-validated personality test scores for, a particular individual. When we have this quality and quantity of information, how much should we rely on stereotypes?

If one discovers from a credible source (say, the Weather Channel) that it is 80 degrees today in much of Alaska, but only 60 in New York, what should one conclude? The fact that it is usually colder in Alaska is not relevant. Today, it is warmer in Alaska.

Professional basketball players tend to be tall—very tall. It is very rare to find one shorter than 6’4”. It is, therefore, reasonable to expect all basketball players to be very tall.

Once in a while, though, a short player makes it into the National Basketball Association (NBA). Spud Webb was a starting player in the 1990s, and he was about 5’7”. Once one knows his height, should one allow one’s stereotype to influence one’s judgment of his height? Of course not. His height is his height, and his membership in a generally very tall group—NBA players—is completely irrelevant.

In situations where one has abundant, vividly clear, relevant individuating information about a member of a group, the stereotype—its content, accuracy, and so on—becomes completely irrelevant. One should rely entirely on the individuating information.


The authors are wrong about this, although they aren't too far off the truth (in their terms, it is a near miss!).

As I have written before somewhere else (I forget where), the reason is that they don't take a Bayesian approach to the data. They commit the base rate fallacy.


Let's take their example of the temperature in the states New York (NY) and Alaska (AK). Surely, most of the time, it is warmer in New York. The average temperatures are -3.0°C in AK and 7.4°C in NY. This is a pretty large difference in averages. Without knowing the standard deviations, I can't calculate the effect size (Cohen’s d). However, let’s suppose that the base rate is P(AK>NY)=0.02. That is, only two times out of a hundred is AK warmer than NY. Assuming they can't be equally warm, this means that P(NY>AK)=0.98. Or, alternatively, P(~AK>NY)=0.98: the probability that it is false that AK is warmer than NY is 0.98.


Now comes the evidence part. Suppose we have good evidence from The Weather Channel (WC) that today AK really is warmer than NY. To calculate P(AK>NY|WC), that is, the probability that AK is warmer than NY today given the WC report, we need the error rates of the WC. Suppose that P(WC|AK>NY)=0.99, that is, WC reports AK as warmer than NY 99% of the time when AK is in fact warmer than NY. They miss the other 1% (false negative rate). Suppose also that P(WC|~AK>NY)=0.01, that is, the probability that WC reports that AK is warmer than NY given that it is false that AK is warmer than NY is 1%. In other words, on days when AK is not warmer, WC wrongly reports that it is 1% of the time (false positive rate). These numbers indicate that the WC is a very reliable source of info. But given that they report AK>NY on a given day, what is the chance that it is true?


We can plug the numbers into a calculator, or work through the equation ourselves. The probability is about 67%, even though the WC is very reliable. This is because the base rate is so low. If both error rates are increased to just 5%, the probability drops below 50% (to about 28%).
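For anyone who wants to check the numbers, here is a minimal sketch of the Bayes' theorem calculation. The function and parameter names are my own, and the inputs are the supposed values from above, not real Weather Channel error rates.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """P(hypothesis | report) via Bayes' theorem.

    prior            -- P(AK>NY), the base rate
    hit_rate         -- P(WC | AK>NY)
    false_alarm_rate -- P(WC | ~AK>NY)
    """
    true_reports = prior * hit_rate
    false_reports = (1 - prior) * false_alarm_rate
    return true_reports / (true_reports + false_reports)

# Supposed values: 2% base rate, 1% error rates in both directions.
p_one_percent = posterior(0.02, 0.99, 0.01)
# Same base rate, but with 5% error rates in both directions.
p_five_percent = posterior(0.02, 0.95, 0.05)

print(f"1% errors: {p_one_percent:.3f}")
print(f"5% errors: {p_five_percent:.3f}")
```

The false reports come from the 98% of days when AK is not warmer, so even a small false alarm rate produces a lot of them compared with the true reports.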


In more general terms, when something is very unlikely to begin with, we need stronger evidence to believe it than if it is not quite as unlikely to begin with. That there is the same evidence (in the sense above) in favor of two propositions A and B does not imply that their probabilities are the same. The base rate must be taken into account.



Accuracy in Perception of Small Group Differences

Madon et al. (1998) examined the accuracy of seventh-grade teachers’ perceptions of their students’ performance, talent, and effort at math about 1 month into the school year. Madon et al. assessed accuracy in the following manner. First they identified the teachers’ perceptions of group differences by correlating teachers’ perceptions of individual students with the students’ race, sex, and social class. This correlation indicated the extent to which teachers systematically evaluated individuals from one group more favorably than individuals from another group. Next, Madon et al. assessed actual group differences in performance, talent, and effort by correlating individual students’ final grades the prior year (before teachers knew the students), standardized test scores, and self-reported motivation and effort with students’ race, sex, and social class. The teachers’ accuracy was assessed by correlating the teachers’ perceived differences between groups with the groups’ actual differences.

Madon et al. (1998) found that teachers were mostly accurate. The correlation between teachers’ perceived group differences and actual group differences was r = .71. The teachers’ perceptions of sex differences in effort, however, were highly inaccurate—they believed girls exerted more effort than boys, but there was no sex difference in self-reported motivation and effort. When this outlier was removed, the correlation between perceived and actual group differences increased to r = .96.


Perhaps the members of each gender group reported their effort levels relative to members of their own gender, not relative to the composite group of both genders. This would mask the gender difference in the data.


If the result is genuine, I will be surprised, as it is widely believed that girls work harder in school (as the teachers believed), and it is known that school effort is correlated with the conscientiousness factor, on which women score higher than men.



C. E. Cohen (1981) examined whether people more easily remember behaviors and attributes that are consistent with a stereotype than those that are inconsistent with that stereotype. Perceivers in her study viewed a videotape of a dinner conversation between a husband and wife (they were actually husband and wife, but they were also experimental confederates trained by Cohen). Half of the time, this conversation led perceivers to believe the woman was a waitress; half of the time, the conversation led perceivers to believe the woman was a librarian. The remainder of the conversation conveyed an equal mix of librarian-like and waitress-like attributes and behaviors.


what remainder? lol



Sex Stereotypes: Jussim et al. (1996) and Madon et al. (1998)

Both Jussim et al. (1996) and Madon et al. (1998) examined the accuracy of teacher expectations. (Madon et al., 1998, was described previously; Jussim et al., 1996, was similar, except that it was conducted in sixth grade rather than seventh grade, and it did not examine the accuracy of perceived differences between students from different demographic groups.) Both found that, when controlling for individuating information (motivation, achievement, etc.), student social class and race or ethnicity had little or no effect on teacher expectations. Thus, teachers essentially jettisoned their social class and ethnic stereotypes when judging differences between children from different social class and ethnic backgrounds. Although this finding is in many ways laudable, teachers relying entirely on individuating information does not help address the question of whether relying on a stereotype increases or reduces accuracy.


Both studies, however, found that sex stereotypes biased teachers’ perceptions of boys’ and girls’ performance (standardized regression coefficients of .09 and .10 for performance, and .16 and .19 for effort, for Madon et al. and Jussim et al., respectively). In both studies, teachers perceived girls as performing higher and exerting more effort than boys. Because these effects occurred in the context of models controlling for individuating information, they are best interpreted as stereotypes influencing teacher perceptions—bias effects, in traditional social psychological parlance.

Did these sex stereotyping bias effects increase or reduce the accuracy of teachers’ perceptions? They did both. In the case of performance, the sex stereotype effect increased teacher accuracy. The real performance difference, as indicated by final grades the prior year, was r = .08 and r = .10 (for the 1996 and 1998 studies, respectively, girls received slightly higher grades). The regression model producing the “biasing” effect of stereotypes yielded a “bias” that was virtually identical to the real difference. In other words:


The small independent effect of student sex on teacher perceptions (of performance) accounted for most of the small correlation between sex and teacher perceptions (of performance). This means that teachers apparently stereotyped girls as performing slightly higher than boys, independent of the actual slight difference in performance. However, the extent to which teachers did so corresponded reasonably well with the small sex difference in performance. In other words, teachers’ perceptions of differences between boys and girls were accurate because teachers relied on an accurate stereotype. (Jussim et al., 1996, p. 348)


The same conclusion, of course, also characterizes the results for the 1998 study.


On the other hand, the results regarding effort provided evidence of bias that reduced accuracy. There was no evidence that girls exerted more effort than boys. Therefore, the influence of student sex on teacher perceptions of effort (i.e., teachers’ reliance on a sex stereotype to arrive at judgments of effort) led teachers to perceive a difference where none existed. This is an empirical demonstration of something that, logically, has to be true. Relying on an inaccurate stereotype when judging individuals can only harm one’s accuracy.



Table 10.4 compares the frequency with which social psychological research produces effects exceeding correlations of r = .30 and r = .50, with the frequency with which the correlations reflecting the extent to which people’s stereotypes correspond to criteria exceed r = .30 and r = .50. Only 24% of social psychological effects exceed correlations of r = .30 and only 5% exceed r = .50. In contrast, all 18 of the aggregate and consensual stereotype accuracy correlations shown in Table 10.1 and Table 10.2 exceed r = .30, and all but two exceed r = .50. Furthermore, 9 of 11 personal stereotype accuracy correlations exceeded r = .30, and 4 of 11 exceeded r = .50.


This is doubly important. First, it is yet another way to convey the impressive level of accuracy in laypeople’s stereotypes. Second, it is surprising that so many scholars in psychology and the social sciences are either unaware of this state of affairs, unjustifiably dismissive of the evidence, or choose to ignore it (see reviews by Funder, 1987, 1995; Jussim, 1991, 2005; Ryan, 2002). When introductory texts teach about social psychology, they typically teach about phenomena such as the mere exposure effect (people like novel stimuli more after repeated exposure to it, r = .26), the weapons effect (they become more aggressive after exposure to a weapon, r = .16), more credible speakers are more persuasive (r = .10), and self-serving attributions (people take more responsibility for successes than failures, r = .19; correlations all obtained from Richard et al., 2003). How much time and space is typically spent in such texts reviewing and documenting the much stronger evidence of the accuracy of people’s stereotypes? Typically, none at all. For a field that aspires to be scientific, this is a troubling state of affairs. Some might even say unbearable.



Generating a Coherent Understanding of Both Past and Future Research

The decades of research on the role of stereotypes in expectancy effects, self-fulfilling prophecies, person perception, subtyping, and memory, are jeopardized if all stereotypes are regarded as wholly inaccurate. This past research will be haunted by a definitional tautology; that is, that people who believe in stereotypes are in error because stereotypes are erroneous beliefs. On the other hand, accepting that stereotypes range in accuracy makes this past research coherent, and allows for more edifying interpretations of past and future research, such as “people in X condition, or of Y disposition, are more likely to believe in, subscribe to, and maintain false stereotypes, whereas people in A condition, or of B disposition are more likely to believe in, subscribe to, and maintain accurate stereotypes.”

In sum, accepting that stereotypes can sometimes be accurate provides the means to distinguish innocent errors from motivated bigotry, assess the efficacy of efforts to correct inaccurate stereotypes, and reach a more coherent scientific understanding of stereotypes. We believe that this proposition can advance the depth, scope, and validity of scientific research on stereotypes, and thereby help improve intergroup relations.

