Assessment of Psychometric Intelligence for Racial and Ethnic Minorities: Some Unanswered Questions

Eleanor Armour-Thomas. Handbook of Racial & Ethnic Minority Psychology. Editor: Guillermo Bernal. Sage Publications. 2003.

The validity of conventional tests of intelligence for racial and ethnic minorities has remained one of the most enduring and contentious issues in psychoeducational assessment. At the heart of the seemingly unending debate is whether intelligence as measured by standardized tests is a universal phenomenon and can therefore be subjected to legitimate comparisons between and among cultural groups. Despite the furor, conventional tests of intelligence continue to be used for educational purposes. For example, test scores predict school grades, standardized academic achievement, and some aspects of job performance (Greenfield, 1997; Neisser et al., 1996), and they are used as a major factor in placement in gifted and talented and special education programs (Suzuki & Valencia, 1997). But in a democratic and multicultural society such as the United States, it is very difficult to avoid misunderstanding test results for populations whose cultural frame of reference is different from that of the population on whom intelligence tests were initially normed. Indeed, since the inception of intelligence testing, critics have voiced concerns about the use and interpretation of results of standardized measures of intelligence for some racial and ethnic minorities, particularly those socialized beyond the pale of the mainstream or dominant culture of the United States (e.g., African Americans, Latinos, and Native Americans).

Some problems in assessment for these populations concern conceptual issues regarding the construct of intelligence itself, whereas others are more methodological in nature and relate to the standardization criteria used in intelligence test construction, administration, and validation. It is crucial not only that these challenges be understood but also that steps be taken to ensure greater cultural sensitivity by test developers, researchers, and practitioners who work with children from culturally diverse backgrounds.

The thesis of this chapter is that any standardized test of intelligence, oftentimes called an IQ test, has validity only for the cultural group(s) for whom it was developed. This chapter opens with a brief revisitation of a perspective of intelligence that I refer to as psychometric intelligence. Although other conceptions of intelligence have emerged in recent years, such as Gardner’s (1983) multiple intelligences, Sternberg’s (1985) triarchic theory of intelligence, and Goleman’s (1995) emotional intelligence, the psychometric perspective is singled out for particular attention because the majority of conventional tests of intelligence are based on psychometric methodologies. Moreover, it is the validity of such tests that has been suspect for some racial and ethnic minority groups in the United States. Next, the interdependency of culture and human cognition is explored, the discussion of which forms the backdrop for challenging assumptions and raising unanswered questions about psychometric intelligence for some racial and ethnic minority children in the United States. This chapter ends with recommendations for researchers and professional practitioners for promoting greater cultural sensitivity in the assessment of cognitive abilities, a term oftentimes used interchangeably with the construct intelligence.

Psychometric Intelligence … a Work in Progress

Psychometric intelligence is used here to refer to the mental operations underlying performance on a set of cognitive tasks identified through factor analysis. Factor analysis is a mathematical procedure that analyzes the intercorrelations among different kinds of cognitive tasks. The results of the analysis reveal sources of observable individual differences in performance that psychometricians call factors, each of which is presumed to represent a mental or cognitive ability. Factors may differ in terms of number (e.g., Horn’s [1991] nine abilities; Spearman’s [1927/1981] two-factor theory; Thurstone’s [1938] seven primary abilities) or structure (e.g., Carroll’s [1993] three-stratum theory; Guilford’s structure of the intellect model [1982]; Vernon’s [1971] hierarchical model). Examples of psychometric abilities include memory; quantitative, inductive, and deductive reasoning; comprehension; knowledge; visual spatialization; visual and auditory processing; and speed of cognitive processing. There is strong empirical evidence for the psychometric ability model (see, e.g., Carroll, 1993), which serves as a reference for the analysis and interpretation of most conventional tests of intelligence, such as the third edition of the Wechsler Intelligence Scale for Children (Wechsler, 1991), the fourth edition of the Stanford-Binet Intelligence Scale (Thorndike, Hagen, & Sattler, 1986), the Woodcock-Johnson Psycho-Educational Battery—Revised (Woodcock & Johnson, 1989), and the Kaufman Adolescent and Adult Intelligence Test (Kaufman & Kaufman, 1993).
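For readers unfamiliar with the procedure, the logic of extracting a factor from intercorrelations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the scores are simulated rather than drawn from any real test, the four "tasks" and their loadings are hypothetical, and the extraction uses the leading principal component of the correlation matrix as a simple stand-in for a full factor-analytic model, which would also estimate the unique variance of each task.

```python
import math
import random


def correlation_matrix(scores):
    """Pearson correlations between the columns of a list-of-rows score table."""
    n, k = len(scores), len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(k)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in scores) / n)
           for j in range(k)]
    corr = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(k):
            cov = sum((row[a] - means[a]) * (row[b] - means[b])
                      for row in scores) / n
            corr[a][b] = cov / (sds[a] * sds[b])
    return corr


def first_factor_loadings(corr, iterations=200):
    """Leading eigenvector by power iteration, scaled to component loadings."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(k))
                     for i in range(k))
    return [math.sqrt(eigenvalue) * x for x in v], eigenvalue


# Simulated (not real) data: one hypothetical common ability drives four tasks.
rng = random.Random(0)
true_loadings = [0.8, 0.7, 0.6, 0.5]  # illustrative values only
scores = []
for _ in range(2000):
    g = rng.gauss(0, 1)  # simulated examinee's standing on the common factor
    scores.append([l * g + math.sqrt(1 - l * l) * rng.gauss(0, 1)
                   for l in true_loadings])

corr = correlation_matrix(scores)
loadings, eigenvalue = first_factor_loadings(corr)
share = eigenvalue / len(corr)  # proportion of total variance on this component

for task, loading in enumerate(loadings, start=1):
    print(f"task {task}: loading {loading:.2f}")
print(f"share of variance: {share:.2f}")
```

The sketch shows only the mechanics: positive intercorrelations among tasks yield a common component on which every task loads. It says nothing about what that component means for any particular group of examinees, which is the question the remainder of this chapter takes up.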

A major assumption of the psychometric view of intelligence that has influenced the development of these measures is that individual differences in intellectual functioning can be understood in terms of these “factors of the mind” that are unaffected by culture. Recently, Salovey and Mayer (1994) articulated this assumption that seems to undergird any measure of psychometric intelligence: “Intelligence, as defined by Western psychology, is the property of the individual, and that individual, idiocentric or allocentric, can have his or her intelligence gauged by abilities at manipulating symbols” (p. 310).

An even more compelling justification for the use of psychometric intelligence came from a group of well-known researchers considered as experts on intelligence and intelligence testing. In a position paper titled “Mainstream Science on Intelligence,” developed in 1994 and reprinted in the journal Intelligence (Gottfredson, 1997), they signed a statement that

intelligence is a very general mental capability that… can be measured, and individual tests measure it well…. Intelligence tests are not culturally biased against American blacks, or other native-born, English-speaking peoples in the U.S. Rather, IQ scores predict equally accurately for all such Americans, regardless of race and social class. (p. 17)

This claim, though, has not gone unchallenged by other well-known researchers who question whether any standardized measure of intelligence, psychometric or otherwise, can be used with populations whose cultural frame of reference is substantively different from that of the cultural group(s) for whom the test was initially developed and validated. For example, as early as 1970, the Association of Black Psychologists (Williams, 1970) called for a moratorium on intelligence testing for racial minorities, with charges that test data

label black children as uneducable; place black children in special classes; potentiate inferior education; assign black children to lower education tracks than whites; deny black children higher educational opportunities; and destroy positive intellectual growth and development of black children. (p. 5)

Today, criticisms of IQ tests are as strong as the condemnation of them more than 30 years ago. For example, Dent (1996) argued that the cultural content of standardized intelligence tests unfairly penalizes some ethnic minorities:

Asking an African American child who has lived in the inner-city, a Hispanic youngster, brought up in a barrio, or a refugee child who recently arrived from another country questions that reflect White American middle-class values and experiences will reveal little about that child’s cognitive ability or intellectual functioning. (p. 110)

Finally, recent reviews (e.g., Gresham & Witt, 1997; Reschly, 1997) have found little justification for the use of intelligence tests in schools, with claims that (a) IQ measures are no better at identifying low-achieving students or students with low cognitive abilities and learning disabilities than teacher judgments are, and (b) no empirical evidence shows that recommendations from IQ tests do in fact lead to better educational intervention for children.

Despite the voluminous research and widespread practice of psychometric tests of intelligence, I consider them “works in progress” because neither test developers, researchers, nor professional practitioners have given sufficient attention to aspects of culture that matter when assessing children from diverse backgrounds. If differences due to culture are real but unexamined or if no meaningful benefits can be shown for all children who obtain low IQ scores, might not the psychometric principles on which tests are based be reasonably questioned? In a later section, I try to show how continuing cultural insensitivity in intellectual assessment for some racial and ethnic minorities will guarantee that test development, research, and practice based on psychometric methodologies will remain “a work in progress.”

Culture and Cognition

I take the position that intelligence, however defined, is a culture-dependent construct because it develops and finds expression in the shared ways of life of a social group. The position is neither new nor original and is consistent with a fundamental assumption in Vygotskian theory that human development is inseparable from culturally and socially organized activities (Vygotsky, 1978). This premise has undergirded much of the cross-cultural research in cognitive development and behavior in the past 20 years and should guide test development and administration as well. The term cognition is used to describe those mental or cognitive processes that any culture considers necessary for solving problems, representing information, understanding, making decisions, reasoning, and remembering. Because these processes are similar to “factors of the mind” as defined in psychometric tests of intelligence, they are used interchangeably with terms such as mental abilities or intellectual abilities. Admittedly, this is a rather narrow conception of intelligence but one that has informed the majority of widely used conventional tests of intelligence.

In previous research, my colleague and I (Armour-Thomas & Gopaul-McNicol, 1998) suggested a biocultural perspective for assessing intelligence, and the current conceptualization about culture and cognition draws heavily from and builds on that work. Although cognitions underlying behavior deemed “intelligent” may be identified through psychometric procedures, they do not function as abstract thought. Rather, they are inextricably wedded to a cultural group’s (a) values and beliefs about what and how these processes are to be applied to cognitive tasks important and relevant to the cultural group’s way of life, (b) the language style used by its members for communicating about matters pertaining to the development and manifestation of cognitive skills, and (c) the symbol system that the cultural group uses to embody cognitive tasks of interest.

Cultural Attributes

Although definitions of culture abound in the psychological literature, I examined those that include measurable characteristics or attributes with relevance for understanding human behavior. This decision finds support among a growing number of researchers who study the role of culture in psychology (e.g., Betancourt & Lopez, 1993; Laboratory of Comparative Human Cognition, 1986; Gauvain, 1995; Greenfield, 1997; Helms, 1992; Phinney, 1996; Rohner, 1984; Triandis et al., 1980; van de Vijver & Leung, 1997). In keeping with this criterion, the following four attributes of culture are singled out for further discussion: values, beliefs, language style, and symbol system.

Values and Beliefs

Values refer to the social norms or conventions that define the standards and expectations for behaviors that, according to Berry (1976), members of a social group regard as proper, right, and natural. Values also include unspoken but shared understandings of a social group about the goal of an activity, the “right” strategies for pursuing it, and judgments of what constitutes “appropriate” performance of goal attainment. In keeping with this definition, the cognitive processes or “factors of the mind” underlying intelligent behavior cannot operate in isolation but rather are tied in large measure to the values of a social group.

Beliefs are emotionally charged mind-sets that reflect a tacit consensus of assumptions that members of a social group form about themselves and others as a consequence of their collective experiences and understandings. Beliefs are closely associated with values in that they are deeply embedded in the standards against which behavior is perceived and judged. Like values, beliefs are also included in a definition of worldview, which Mbiti (1970) defined as an attitude of mind or perceptions that influence the way people think, act, and speak in various situations of life. Along these lines, how intelligence is defined and who has how much or little of it are strongly influenced by the beliefs of a social group.

Language Styles

The term language styles describes the courtesies and conventions governing the different ways a social group communicates ideas, feelings, and thoughts among its members in various situations. The context-specific nature of language styles means that the norms of communication may differ from one setting to the next; consequently, what is considered as an appropriate style of social interaction in one context may be culturally incorrect in another. These idiosyncratic ways of using language do have relevance for human cognition to the extent that individuals who engage in cognitive tasks understand and can apply the rules of social engagement. In adult-child communication, what types of questions are commonly asked by adults and what are the expectations of children in responding to them? Is the interactional format for communication mutually understood by adults and children? Are some of the questions relevant for the communication medium in which cognitive tasks are negotiated?

Symbol System

A symbol system describes the technologies or modes of representation that embody cognitive tasks of any given cultural group. Some cognitive tasks are represented in linguistic, pictorial, figural, and numerical domains, whereas others are represented in symbolic media such as maps, charts, manipulatives, and tools. Vygotsky (1978) was among the first cultural psychologists to call attention to the psychological functions of ancient tools such as tying knots that were used as mnemonic devices to help retrieve information from memory or counting on fingers as a support in higher intellectual functioning involved in basic arithmetic operations. The efficacy with which children are able to engage in cognitive tasks that produce intelligent behavior depends to some extent on the familiarity or the amount of practice they have had with the symbol system in which such tasks are represented.

Learning Experiences in Cultural Niches

What is the mechanism by which these attributes of culture, as described in the previous section, become linked to human cognition in ways that result in intelligent behavior? To answer this question, I elaborate on two concepts my colleague and I (Armour-Thomas & Gopaul-McNicol, 1998) discussed in our biocultural perspective of intelligence: “learning experiences” and “cultural niche.”

A learning experience with relevance for cognition describes an encounter with at least four critical ingredients: (a) an adult, capable peer, or anyone who is a significant other in a person’s life who provides structure, guidance, and direction for the developing person; (b) the cognitive task or activity of interest to be engaged; (c) a process of social interaction for engaging the cognitive task or activity; and (d) the desired goal to be attained from task or activity engagement. The cultural attributes as described in the previous section are embodied in learning experiences, thus making it impossible to separate cognition from culture. For example, to minimize misunderstanding, the significant other more than likely will use a familiar language style when communicating with the child about the cognitive task or activity of interest. Engagement in the cognitive task or activity is likely to be more efficient if it is represented in a symbol system familiar to the child. Moreover, what constitutes the “right” cognitive strategies during task engagement or “good” performance after task completion are value-laden judgments communicated by the significant other to the child.

The second concept, a cultural niche, refers to a highly specialized area in the environment that contains critical ingredients for children’s healthy growth and development. The use of the term here is similar to terms used by other researchers who have studied the role of culture in cognitive development. For example, Super and Harkness (1986) and Gauvain (1995) used the term developmental niche to discuss cultural influences on children’s cognitive development. Earlier, Bronfenbrenner (1979) coined the term ecological niches to draw attention to the importance of properties and conditions of some physical and social contexts in fostering cognitive growth. What accounts for a cultural niche’s psychological significance, in my judgment, is not merely its physical or social address but the nature and quality of the learning experiences embedded within it and to which the developing person is exposed in a consistent and systematic manner over time. It is the routinization of learning experiences within cultural niches that accounts for the development of cognitive potentials along particular trajectories toward particular end states. Contexts such as the home, community, the school, and peer groups may be conceived as cultural niches to the extent that they provide the kinds of learning experiences conducive to cognitive growth and development.

The Challenge to Develop and Administer Culturally Sensitive Assessment

The United States is a nation of different racial and ethnic groups, each with its own distinctive culture. Thus, one can speak of the culture of Black or White Americans to describe cultural differences between two racial groups. Or one can speak of the culture of African Americans, Anglo-Americans, Native Americans, Latinos, and Asian and Pacific Islander Americans to describe cultural differences between or among ethnic groups. Although there are many cultural attributes of various racial or ethnic groups, I have focused only on those measurable aspects of culture that have relevance for intellectual functioning: shared values, beliefs, language style, and a symbol system.

Attention to these cultural attributes suggests a number of questions for designers of intelligence tests: Does intelligence have the same valued meaning among the various racial and ethnic groups assessed? Is the stimulus embodying the intelligence task familiar to all racial and ethnic groups assessed? Finally, is the language style used in test item development and administration similarly appropriate for all racial and ethnic groups assessed? Essentially, these questions are about cultural equivalence—whether the various racial and ethnic groups in our society share similar values and beliefs, language styles, and symbol system(s) associated with intelligence. In the section that follows, I explore the difficulties inherent in developing and administering a standardized intelligence test to meet the criterion of cultural equivalence.

Does Intelligence Have the Same Meaning Across Racial/Ethnic Groups?

In developing a standardized test of intelligence, test developers assume that there is agreement on the meaning of the construct intelligence that is being measured within and across racial and ethnic groups. It is also assumed that during administration of an intelligence test, both examiner and examinee would have a common understanding as to what constitutes an intelligent question and what constitutes an intelligent answer. However, these assumptions about construct equivalences are not necessarily applicable for some racial and ethnic groups in the United States. For example, Okagaki and Sternberg (1991) interviewed native-born Anglo-American and Mexican American parents as well as immigrant parents from Cambodia, Mexico, the Philippines, and Vietnam about their beliefs about child rearing and intelligence. Findings indicated that parents had different views about what characterizes an intelligent first-grade child. For the Asian parents, noncognitive factors (e.g., motivation, social skills, and self-management skills) were more important than cognitive skills such as problem-solving skills, creative skills, and verbal skills. The Mexican American and Mexican immigrant parents equally valued noncognitive factors and cognitive skills in their conception of intelligence. In contrast, the Anglo-American parents thought that cognitive skills were more important to their conception of intelligence than factors such as motivation and hard work.

In another study, Gopaul-McNicol (1993) found that some immigrant children in the United States with a Caribbean background experienced difficulty in completing tasks on psychometric intelligence tests because slow but careful execution of a task was valued in their culture. Even when requested by the examiner for a quick response, such children ignored the request and continued to work methodically and cautiously. Their approach to cognitive tasks is not unlike that of Ugandan villagers who use words such as careful, slow, and active to define intelligence (Wober, 1972).

More than 30 years ago, Messick and Anderson (1970) claimed that the same test may measure different cognitive processes among minority children from low-income backgrounds than it measures among White middle-class children. More recent investigations have found profile differences in intellectual abilities among racial and ethnic groups. Some studies reported that Asian Americans performed better on visual and quantitative reasoning than on verbal subtests on an intelligence measure. In other studies, it was found that Native Americans also tended to show relatively better performance on the visual reasoning subtest than on the verbal subtests. Although other explanations cannot be ruled out, it is possible that the observed ethnic differences in intellectual performance may be attributable to different cultural values about what it means to be intelligent. If this is the case, an intelligence measure may be assessing different notions of valued intellectual abilities in different racial and ethnic groups and, in so doing, invalidating its results for these groups.

Are There Stereotypical Beliefs About Intelligence?

Much has been written over the years about the damaging effects of prejudice and discrimination to which some racial and ethnic groups have been subjected in the United States. One negative impact of discrimination with relevance to performance on intelligence tests has to do with the concept of stereotype threat, which Steele (1997) defined as

the event of a negative stereotype about a group to which one belongs becoming self-relevant, usually as a plausible interpretation for something one is doing, for an experience one is having, or for a situation one is in, that has relevance to one’s self-definition. (p. 617)

According to Steele (1997), some African American students have internalized the stereotypical belief or myth of intellectual inferiority that has been so pervasive in much of the heated debates about racial differences in IQ scores within and outside the academy. When told that an IQ test was diagnostic of their abilities, these students tend to perform less well than their Anglo-American peers for whom such information holds no threat.

Ogbu (1992) provided a different perspective on how beliefs associated with Black and White culture can negatively affect the performance of some African Americans and other ethnic minorities of color who share similar beliefs. According to Ogbu, opposition and ambivalence are distinguishing attributes of Black culture in its relation to White or mainstream culture that have emerged in response to the subordination and exploitation of Black Americans by the dominant group in U.S. society. These aspects of Black culture are reflected in the belief among some African Americans that their cultural frame of reference is substantively different from the White cultural frame of reference. Moreover, these elements are embodied in an oppositional cultural system with mechanisms for protecting and maintaining the identity of its members. One of the mechanisms with relevance for academic (or intellectual) functioning is cultural inversion, the tendency of members of one social group (e.g., Black Americans) to consider certain forms of behavior, symbols, and events as inappropriate for them because these elements are not valued by members of another group (e.g., White Americans).

Many questions may be raised about stereotypical beliefs that are relevant in the administration of an intelligence test: How many children from racial groups recommended for psychoeducational evaluation, of which the IQ test is a central component, are vulnerable to the stereotype threat? How many of them hold beliefs that an IQ test is a product of White or mainstream culture and should therefore not be taken seriously? How many examiners who administer the IQ test to children from racial and ethnic minority groups hold caste thinking about them? To the extent that examiners and/or examinees hold stereotypical beliefs during the administration of an intelligence test, it will clearly violate the assumption of equivalence of testing conditions that will be necessary for making comparative judgments of intellectual abilities between Black and White ethnic groups.

Are Cultural Attributes Comparable Between Racial/Ethnic Groups?

One of the difficulties in developing a measure of intelligence that is culturally sensitive is figuring out how to ensure comparability of those aspects of culture that have implications for intellectual functioning represented among the racial and ethnic groups included in the standardization sample. The United States is a multicultural society and, as such, reflects attributes of cultures of diverse racial and ethnic groups. Some individuals within each group, though, may choose to retain some aspects of their ancestral culture while identifying with aspects of the dominant culture to which they have acculturated. A customary practice in instrument development is to ask participants to identify themselves by choosing from among social categories such as race and ethnicity. Given the heterogeneity of beliefs and values that must inevitably be embedded within any racial or ethnic group, how do test designers ensure that all members within each group that make up the representative sample have had equivalent culturally relevant experiences? The problem is further complicated by the comingling of cultural attributes with other dimensions of human diversity, such as socioeconomic status and regionality.

For example, some years ago, Williams (1975) developed the BITCH 100 (Black Intelligence Test of Cultural Homogeneity), a vocabulary test whose words were selected from the dictionary of Afro-American slang. African American high school students scored significantly higher than their Anglo-American peers, a finding that the author attributed to the possibility that the Anglo group had less opportunity to learn the words than the African American group. Because socioeconomic status was uncontrolled in this study, it may well be that the items favored a particular social class within the African American group or a region in which they live rather than African Americans as a cultural group.

Are Conventions of Discourse Comparable Between Racial/Ethnic Groups?

An assumption in any standardized measure of intelligence developed in the United States is that the format of test questions is similarly understood by all U.S.-born racial and ethnic groups to whom the test is administered. In addition, it is also assumed that all respondents know the cultural convention about when and how to respond to an examiner’s question. In describing the convention underlying every cognitive test, Greenfield (1997) stated that “the test question assumes that a questioner who already has a given piece of information can sensibly ask a listener for the same information” (p. 1119).

However, this particular convention of discourse may be a function of formal schooling and child-rearing practices among some racial and ethnic groups from middle-class backgrounds. For example, in a study conducted in the southern United States, Heath (1989) examined the linguistic conventions of African American and Anglo-American adults and their children. It was found that Anglo-American parents questioned their children a great deal in a manner similar to what is found in a formal testing context. In contrast, the African American parents infrequently questioned their children and hardly ever used testlike questions. In another study, Miller-Jones (1989) described the interactions between a 5-year-old lower-middle-class African American and an examiner in a standardized IQ testing context in which it was obvious that both the child and the examiner were responding to different expectations regarding communication conventions.

There is tremendous variation in language style within and across racial and ethnic groups in the United States (African Americans, Latinos, Asian Americans, and Anglo-Americans), many of whom have distinctive conventions of discourse (for further discussion on this issue, see Butcher & Pancheri, 1976; Hamayan & Damico, 1991; van de Vijver & Leung, 1997). Validity of test scores will be compromised if members of racial and ethnic groups use communicative conventions different from what is assumed by any standardized IQ test. In my judgment, it is virtually impossible to design and administer a test that meets the criterion of communication convention equivalence in a linguistically diverse society such as the United States.

Do Racial/Ethnic Groups Have Comparable Familiarity With the Symbol System?

To make valid comparisons between two groups on any standardized intelligence measure, test developers must ensure familiarity of symbols or stimuli used to represent test items. This is very difficult to do because children have had differential exposure to cultural practices in which certain symbols are used to embody cognitive tasks. For example, an early cross-cultural study (Gay & Cole, 1967) used bowls of rice and geometric blocks to assess classification skills among schooled and unschooled Liberians and U.S.-schooled children. It was found that the unschooled Liberians experienced greater difficulty sorting geometric shapes than did the U.S.-schooled children. However, when the materials to be classified were changed to rice, the results were reversed. The U.S. children showed greater difficulty in sorting rice than the Liberians. In another study, Serpell (1979) compared the performance of English and Zambian children on a pattern reproduction task embodied in three symbolic media: clay, paper and pencil, and strips of wire. The English children performed better with the paper-and-pencil medium, whereas the Zambian children performed better in the wire medium. When a medium familiar to both groups (clay) was used, both groups performed equally well on the same task. Also, Lantz (1979) compared the classification skills of Indian children using grains, seeds, and colors. It was found that children showed better skills when grains and seeds were used than when the same task used an array of colors. Finally, numerous researchers (Hatano, 1982; Stigler, 1984; Stigler, Barclay, & Aiello, 1982) have examined Japanese abacus experts who used the abacus as a tool for arithmetic operations. The researchers found that the experts were able to do mental arithmetic calculations without the abacus as accurately as with it. It appeared that the experts created a representation of the problems using a “mental abacus” that enhanced skills of remembering digits forward or backward.

These studies underscore the importance of the medium in which cognitive tasks are embedded and demonstrate how differential familiarity with the medium can account for differences in cognitive functioning. At least two questions can be raised from these studies, with implications for the standardized assessment of intelligence of children from different racial and ethnic backgrounds in the United States: Are the symbol systems used in the test sufficiently familiar to all children to whom the IQ test is administered? If not, how much variation in intellectual performance between racial and ethnic groups can be explained in terms of differential familiarity with the symbol systems used in the test?

Future Directions for the Assessment of Intelligence

When considered from the perspective of cultural psychology, any standardized test of intelligence is a reflection of the values, beliefs, language styles, and symbol system of a particular cultural group. But as I have tried to show in the previous section, it is extremely difficult to design and administer a valid standardized test of intelligence for culturally diverse groups in a democratic society. Moreover, inattention to cultural issues in test construction and administration leaves unknown the contribution of culture to interpretations of observed differences in intellectual performance between Anglo-American and racial and ethnic minority groups. Despite these concerns, psychometric tests of intelligence continue to have widespread use in psychoeducational evaluations of racial and ethnic minority groups in the United States, and claims about their inappropriateness for these populations have been met with spirited resistance (see Frisby, 1999, for the latest defense of psychometric intelligence). I submit that until efforts are made to adequately address the culture question in psychometric assessments of intelligence, the controversy and debate surrounding their questionable validity for racial and ethnic minorities in the United States will continue. In the section that follows, I propose some suggestions for making intellectual assessments more culturally sensitive in the areas of research and practice.

Psychometric Assessment Research

The primary research challenge in studies seeking to compare intelligence within and across racial and ethnic groups in the United States has to do with validity of results: whether the interpretation for observed differences in performance on any psychometric test of intelligence is legitimate for racial and ethnic minorities. Threats to validity derive from incomplete answers to the following questions: Does the construct intelligence have the same meaning for the racial and ethnic groups investigated as it does for the researcher? Are the subjects selected for comparative study truly independent? Are the items that make up the test equivalent for the groups studied? Are alternative explanations ruled out for the differences observed when comparing the results of racial and ethnic groups? These questions are consistent with rudimentary principles of science, and I argue that researchers in the psychometric tradition have unevenly adhered to them in their investigations of intelligence of racial and ethnic minorities in the United States. There are at least four areas in research that, if addressed, could enhance the validity of results of psychometric intelligence tests.

Conceptual Definition of Construct

The absence of consensus with respect to the meaning of intelligence among some racial and ethnic minorities in the United States makes it difficult to explain their high or low performance on an intelligence test. To be sure, defenders of psychometric intelligence would argue otherwise by citing studies that used factor-analytic procedures to demonstrate the equivalence of the construct of intelligence between groups on the same psychometric test (see Jensen, 1980; Keith et al., 1995; Mishra, 1981; Valencia, 1995, for construct validity studies between White, Mexican American, and African American groups). But a posteriori statistical procedures, no matter what their scientific credibility is, yield insufficient proof of construct equivalence. These methods must be complemented by a priori procedures to establish construct familiarity between or within the racial or ethnic group to be studied.
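To make concrete what an a posteriori check of this kind does and does not establish, consider a minimal sketch in Python. It computes Tucker's congruence coefficient, one widely used index of similarity between two sets of factor loadings. The choice of index, the function name, and the loading values are assumptions made for illustration only; the studies cited above are not claimed to have used this particular statistic, and the numbers come from no actual test.

```python
from math import sqrt


def tucker_congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two loading vectors.

    Values near 1.0 indicate that the same items load on the factor in
    similar proportions in both groups.
    """
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must cover the same items")
    numerator = sum(a * b for a, b in zip(loadings_a, loadings_b))
    denominator = sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    return numerator / denominator


# Hypothetical loadings of five subtests on one factor in two groups.
# These values are invented for illustration.
group_a = [0.70, 0.65, 0.60, 0.55, 0.50]
group_b = [0.68, 0.66, 0.58, 0.57, 0.49]

phi = tucker_congruence(group_a, group_b)
```

A coefficient close to 1.0 shows only that the pattern of loadings is similar across the two groups. It is silent on whether the groups attach the same meaning to the behaviors sampled by the items, which is the gap that a priori procedures are meant to fill.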

Before administering any psychometric intelligence test, the researcher needs to ascertain rather than assume that the groups to be studied share the same values and beliefs about the construct of intelligence as their counterparts on whom the test was initially normed. Absent such evidence, the researcher has three options:

  • Develop a test consistent with a conception of intelligence as agreed on by the groups under investigation. As Greenfield (1997) pointed out, such a decision is not made at an individual level but rather at a cultural level in terms of social norms of the groups.
  • Modify an existing psychometric test in an effort to accommodate legitimate cultural differences among members of the group.
  • Measure the cultural attributes that are likely to influence the performance on the psychometric test.

The last two options are elaborated on in subsequent paragraphs.

Measurement of Cultural Attributes

A comparative study of intelligence using a psychometric test may still proceed without conceptual consensus of intelligence. However, in the interest of scientific fairness and to avoid misattribution for observed differences between racial or ethnic groups studied, the researcher must take steps to rule out a cultural explanation. This can be done by developing additional measures of cultural attributes to be used in conjunction with the psychometric test. For example, if there is a possibility that stereotype threat may be a factor in the observed differences in psychometric test scores of African Americans and Anglo-Americans, interviews or questionnaires could be administered to determine whether that is the case. Similarly, if there is a possibility that differences in performance between Asian Americans and Mexican Americans may be due, in part, to differential levels of acculturation to U.S. mainstream culture, an acculturation scale can be developed and administered to these groups. Moreover, specific behaviors indicative of values associated with the ancestral culture of these social groups may be clinically observed during the administration of the psychometric test.

A common practice in validity studies is to minimize as much as possible alternative explanations for observed differences in performance on a measure used with culturally diverse populations.

Similarly, the hypothesis that differences in performance between Asian Americans and Mexican Americans are related to acculturation to U.S. mainstream culture could be validated by measuring the cultural values, language usage, and symbolic modes of representation associated with cognitive functioning in both the ancestral culture of the group and the U.S. culture.
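The logic of ruling out a cultural explanation can be sketched as a comparison of the group difference before and after adjusting for a measured cultural attribute. The following Python sketch is offered under stated assumptions: the data are invented, the single covariate (a hypothetical "familiarity" score) stands in for whatever cultural measure the researcher develops, and a simple linear adjustment is only one of several ways such an analysis might be carried out.

```python
from statistics import mean


def slope_intercept(x, y):
    """Least-squares slope and intercept for y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(x, y):
    """Part of y that is not linearly predicted by x."""
    slope, intercept = slope_intercept(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def raw_gap(group, score):
    """Unadjusted difference in mean score (group 1 minus group 0)."""
    ones = [s for g, s in zip(group, score) if g == 1]
    zeros = [s for g, s in zip(group, score) if g == 0]
    return mean(ones) - mean(zeros)


def adjusted_gap(group, score, covariate):
    """Group difference after linear adjustment for the covariate.

    Equivalent to the group coefficient in score ~ group + covariate,
    obtained by residualizing both variables on the covariate.
    """
    score_resid = residuals(covariate, score)
    group_resid = residuals(covariate, [float(g) for g in group])
    slope, _ = slope_intercept(group_resid, score_resid)
    return slope


# Invented data: scores depend only on familiarity, not on group,
# but group 1 happens to have higher familiarity with the test medium.
group = [0, 0, 0, 0, 1, 1, 1, 1]
familiarity = [1, 2, 3, 4, 3, 4, 5, 6]
score = [90 + 2 * f for f in familiarity]

unadjusted = raw_gap(group, score)
adjusted = adjusted_gap(group, score, familiarity)
```

In this constructed case the unadjusted difference is 4 points, while the adjusted difference is zero, because the scores were built to depend on familiarity alone. Real data would rarely be so clean, but the contrast shows why an unmeasured cultural attribute can masquerade as a racial or ethnic difference.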

Modification of Existing Psychometric Test

It was argued earlier that differential language style during administration of a psychometric test and/or differential familiarity with the stimuli used in test items could lead to misunderstanding and invalidate the results of the test for some racial and ethnic groups. Without a doubt, proponents of psychometric tests of intelligence with culturally diverse populations would be quick to point to empirical research that shows the absence of situational bias in test administration (Jensen, 1980; Mishra, 1980, 1983; Oplesch & Genshaft, 1981) and content validity of test items (Jensen, 1980; Koh, Abbatiello, & McLoughlin, 1984; Mishra, 1981; Pugh & Boer, 1989; Sandoval, Zimmerman, & Woo-Sam, 1983) in refuting these claims. However, I raise the concern again about the insufficiency of a posteriori statistical procedures in “proving” the validity of a psychometric measure for all English-speaking racial and ethnic minorities in the United States. Efforts must also be made to ensure cultural equivalence before and during administration of a psychometric test of intelligence developed and administered to groups for whom the conventions of language usage and the symbols used in test items may be unfamiliar. I elaborate on this issue in a subsequent section, but the interested psychometric researcher may refer to a number of excellent references on the topic of equivalence in assessments for culturally diverse populations (e.g., Butcher, 1982; Cole, Gay, Glick, & Sharp, 1971; Helms, 1992; Laboratory of Comparative Human Cognition, 1986; Lonner, 1981; van de Vijver & Leung, 1997).

Categorization of Subjects

In typical studies using a psychometric test of intelligence to examine similarities or differences in cognitive ability, researchers use terms such as White and Black to categorize subjects into racial groups. They may also use a continent, or a country of origin, paired with “American” to classify members of ethnic groups (e.g., African American, Asian American, Mexican American). But these socially derived categories mask psychologically relevant cultural attributes that may be similar or different across and within racial and ethnic groups. After all, most social groups in the United States, irrespective of the social category to which they are assigned or to which they self-select, share some aspects of the mainstream culture of the United States while simultaneously retaining others from their ancestral culture. What these cultural attributes are and their significance for intellectual behavior depend, in part, on the racial or ethnic groups selected for study and the kinds of socialization experiences members would have had in cultural niches such as the home, the school, and the peer group. By not assessing or controlling for these cultural attributes, researchers run the risk of attributing differences or similarities to race or ethnicity that may in actuality be due to cultural differences or similarities in values, beliefs, language style, or symbol usage. In the interest of scientific objectivity and to avoid misunderstanding psychometric test results of two racial or ethnic groups, psychometric researchers must at least make the effort to unbundle and measure the cultural attributes within and between the groups under investigation. (See Armour-Thomas & Gopaul-McNicol, 1998; Betancourt & Lopez, 1993; Helms, 1992; Phinney, 1996; Rohner, 1984, for further discussion of the problem of commingling of culture with other dimensions of human diversity such as race, ethnicity, and socioeconomic status.)

Psychometric Assessment Practices

The use of a psychometric test of intelligence and the interpretation of its results for certain racial and ethnic minorities are sources of unending controversy both within and outside the academic community. Critics may be hard-pressed to argue against the use of these measures for understanding individuals’ intellectual strengths and weaknesses to make intervention more responsive to their needs. However, the overrepresentation of some racial and ethnic minorities in special education classes and underrepresentation of these groups in gifted and talented programs have fueled suspicion that the test provides an inaccurate diagnosis of their cognitive strengths and weaknesses. Moreover, the tacit interpretations of their performance in terms of genetic inferiority or cultural deprivation are suspect due to the questionable methodologies in psychometric test construction and administration used with these groups as discussed earlier. In light of these concerns, what options are open to professional practitioners such as school psychologists, who are required to use a psychometric test of intelligence as part of their psychoeducational assessment regimen? The following recommendations are offered as strategies for enabling greater cultural sensitivity before, during, and after the administration of a psychometric test of intelligence. Central to these recommendations is a hope for an assessment system rather than a single test for assessing intelligence.

Before Assessment

To protect against premature labeling and probable misattribution for cognitive performance to be analyzed later, the professional practitioner should gather background data of potential cultural significance prior to the administration of a psychometric test. Consider some of the procedures that may be used for this purpose.

An observational protocol could be developed to collect qualitative data on the examinee in contexts or cultural niches such as the classroom, cafeteria, and playground. As indicated earlier, cultural experiences within these environments may differ for some racial and ethnic minority children, and these experiences could have either a positive or negative impact on their cognitive functioning. The practitioner would need to pay particular attention to the nature and quality of the interactions between the examinee and the teacher, coach, or other adult in the school setting, as well as peers. Observation in multiple contexts is also important to ascertain whether there is behavioral inconsistency with respect to cognitive functioning from one context to the next.

The professional practitioner may conduct a case history interview with significant others in the examinee’s life (e.g., parent or guardian, minister, coach, teacher, community social worker) to gather information of cultural significance for cognitive functioning. Examples include specific beliefs about intelligence, language styles, conception of time, and developmental and educational history. A written questionnaire about these cultural factors may also be given to significant adults in the examinee’s life.

During Assessment

There is good reason to believe that the administration of a psychometric test of intelligence under standardized conditions provides an incomplete picture of the cognitive strengths and weaknesses of some children from racial and ethnic minority backgrounds. Differential familiarity of test stimuli, different patterns of language usage, and different values about time in answering questions are some of the cultural variables that could affect behavior during assessment. Ignoring or trivializing their importance could lead to the misunderstanding and, in turn, misattribution of test results for some children for whom culture matters. There are many clinical strategies a professional practitioner can use to ensure cultural sensitivity during the administration of a psychometric test. Techniques that deviate from standardized procedures are commonly defined as “testing to the limits” in the psychoeducational literature.

One technique may involve suspending the time allotted on the psychometric test items to see whether the examinee may eventually obtain the right answer. Another might be to teach items to the examinee to establish familiarity with the test format or stimuli and then readminister the psychometric items to see if the examinee’s performance improves. A third technique might include giving the examinee a paper and pencil to ascertain whether he or she could solve psychometric items requiring mental computation. Yet another procedure may involve contextualizing vocabulary items on the psychometric test to determine whether the examinee understands the meaning of words. Finally, the professional practitioner may try to make items on the psychometric test equivalent to the examinee’s cultural experience by matching selected items to the examinee’s culture. A more comprehensive discussion of these strategies is found in Armour-Thomas and Gopaul-McNicol (1998).

After Assessment

Prior to writing up the report, the professional practitioner analyzes the results of the psychometric test using the clinical data obtained before and during assessment as well as the psychometric data. One primary question of interest is the following: To what extent is the psychometric score obtained due to cultural factors? It is extremely important that the professional practitioner rule out these factors before attributing observed low or high scores to factors intrinsic to the individual.

To the extent that cultural factors did affect the test results, another question of interest would need to be answered: What culture-relevant recommendations can be made to the classroom teacher, parent, and mental health worker that are likely to improve the cognitive functioning of the examinee? (For examples for writing a culturally sensitive psychological report, see Armour-Thomas & Gopaul-McNicol, 1998; Gopaul-McNicol & Armour-Thomas, 2002.)


The psychometric model has been dominant in psychology for almost 100 years since Alfred Binet designed the first intelligence test. It has contributed insightful notions of the structure of the human intellect, about which numerous comparative studies have been conducted and measures of intelligence developed and used with diverse groups. In this chapter, I have argued that incomplete attention to, or neglect of, matters of culture as they pertain to some racial and ethnic minorities in the United States has raised doubts about the validity of psychometric intelligence research and test practices for these populations. I believe that a rigid adherence to the continuing use of psychometric measures of intelligence in research and practice, irrespective of cultural considerations, creates an unhealthy tension in a multicultural society committed in principle to the twin ideals of equity and social justice. After all, genuine respect for cultural diversity in a democracy allows those who identify with a particular culture the right to develop and express intelligence in accordance with the values, beliefs, symbols, and language systems of their particular social group. It also allows them the choice of adopting the cultural attributes of another social group or switching cultural frames of reference, depending on the salience of experiences they would have had in one cultural niche or another. What this means is that the “one size fits all” approach of standardized assessment of intelligence must give way to an assessment system in which the psychometric IQ test is important but is only one measure on the assessment menu. The recommendations proposed in this chapter are intended to encourage greater cultural sensitivity in intelligence research and test practices for those racial and ethnic minorities in the United States who have been disadvantaged by psychometric intelligence tests.