Personality and Individual Differences

Robert Hogan, Allan Harkness, David Lubinski. The International Handbook of Psychology. Editor: Kurt Pawlik & Mark R Rosenzweig. Sage Publications. 2000.

Basic Concepts

This chapter is about personality and individual differences, topics that are related but not identical. Personality concerns the nature of human nature; the study of individual differences (sometimes called differential psychology; cf. Stern, 1900) concerns the ways in which people’s performance differs. In addition to social behavior, this includes individual differences in intellectual, psychomotor, perceptual, and cognitive performance. This chapter primarily focuses on personality, temperament, and intelligence, and their implications for individual differences in human performance.

Personality psychology concerns the distinctive and important characteristics of people compared with other biological species. It is designed to answer three kinds of questions: (a) how and in what ways are people all alike, e.g., are people naturally aggressive?; (b) how and in what ways are people different, e.g., what are the major dimensions of individual differences in human social behavior?; and (c) how can the puzzling behavior of single individuals be explained, e.g., why did Adolf Hitler hate Jews? Most of the interesting questions we have about people are the subject of personality psychology.

How is personality defined? The word personality comes from the Latin word ‘persona,’ which referred to the mask worn by actors in ancient drama; the mask denoted an actor’s part or role in a play, but left open the question of who or what was behind the mask. This suggests there are two sides to a person: that which we see in public and that which is behind the public mask. MacKinnon (1948) also noted that the word personality is defined in two very different ways, each pointing to a different but important aspect of personality. The first is personality from the outside, the manner in which a person is perceived and described by others. The second is personality from the inside, and concerns the factors inside people that explain why others perceive them as they do. This suggests that any definition of personality must take account of these two perspectives. In everyday language personality in the first sense (the observer’s view) is referred to as reputation, the unique way in which each person is described by others. The terms used to describe personality in the second sense (the factors inside people that explain their characteristic behavior) differ depending on the writer, and the great debates in personality psychology concern the nature of these inner factors.

Measures of personality from the observer’s and from the actor’s perspective are only moderately correlated, suggesting that the two forms of personality are not identical. In addition, personality from the observer’s perspective is relatively easy to study; it is studied using observer ratings, for which high levels of agreement are routinely found. In contrast, personality from the actor’s perspective is hard to study because the hypothesized processes inside people cannot be directly observed; their nature must be inferred, based on sources of indirect data, whether they come from experiments or self-reports.


There are two separate traditions running through the history of personality psychology; they can be called the applied and the academic traditions. In the applied tradition, researchers think about personality in the context of trying to help people solve their problems. In the academic tradition, researchers study personality simply because they think it is interesting and important.

History of Applied Tradition

This tradition begins when Hippocrates, the father of Greek medicine (c. 460 B.C.E.-c. 370 B.C.E.), decided, on the basis of the physical theory of Empedocles (the world is composed of four elements: earth, air, fire, and water), that people are composed of four humors or bodily fluids: black bile, yellow bile, phlegm, and blood. Galen (130-200 C.E.), using Hippocrates’ physical theory, suggested that people’s behavior reflects the influence of these basic humors, and problems occur when they get out of balance. Depending on which humor is predominant, this yields four types of people: those with an excess of black bile are melancholic or depressed; those with an excess of yellow bile (from the spleen) are splenetic or hostile; those with an excess of phlegm are phlegmatic or lacking in energy; and those with an excess of blood are sanguine, or cheerful and optimistic. Galen’s types have been very influential; they were adopted by Immanuel Kant in his best-selling eighteenth-century textbook Anthropologie (1798), by Wilhelm Wundt in his model of physiological psychology (1903), and by Eysenck (1970) in his taxonomy of personality types.

The ancient Greeks invented the word ‘hysteria’ to label medical symptoms that have no obvious physical basis. The Greeks believed that only women have hysterical symptoms; as a result, the word hysterical is related to the Greek word for uterus. The larger point is that hysterical symptoms have been recognized for at least 2,000 years and they continue to be important today, although now they are called somatization disorders. For example, the Centers for Disease Control in the US estimates that perhaps 65% of the people who seek medical attention each day are suffering from somatization disorders.

Galen and other non-Christian, non-European thinkers explained hysterical symptoms in terms of a lack of balance among the various parts of the psyche. During the Christian Middle Ages in Europe, however, hysterical disorders were thought to be caused by a devil who had entered a person’s body, and the disorder could be treated by exorcism. Johann Joseph Gassner (1727-1779), an Austrian priest, became famous for curing poor people, many of whose symptoms were undoubtedly hysterical. A commission was formed to investigate Gassner, and it included a physician named Franz Anton Mesmer (1734-1815). Mesmer soon claimed to be able to reproduce Gassner’s cures using magnets and argued that the secret to Gassner’s cures was his unusual ‘animal magnetism.’ Mesmer developed a method for curing hysterical disorders based on magnets and his own personal magnetism. Although Mesmer was a fraud, he is credited with inventing hypnotism, originally called mesmerism.

Auguste Liébeault (1823-1904), a French country doctor, read about hypnotism in school; he gave his patients the choice between being treated conventionally for a fee or being hypnotized for free; his practice rapidly expanded. Hippolyte Bernheim (1840-1919) studied with Liébeault, and later proposed that hypnosis is a function of the patient’s suggestibility rather than the hypnotist’s personality, and that suggestibility is normally distributed in the population.

Jean-Martin Charcot (1825-1893), the best-known psychiatrist in Europe in his prime, was the director of Paris’ largest psychiatric hospital, where he studied hysterics using hypnosis in the 1870s. Charcot concluded that both men and women suffer from hysterical disorders, that hysterics suffer too much to be faking, that hysteria and hypnotic susceptibility are related, and that both are caused by dissociation, an inability to integrate one’s thoughts and memory due to a degeneration of the nervous system.

Charcot established hypnosis as a legitimate scientific topic; he also quarrelled with Bernheim, arguing that only neurotics could be hypnotized. Over time, Bernheim’s view (that hypnotizability is a normal characteristic) prevailed. More important for this history, however, is the fact that a young doctor from Vienna named Sigmund Freud went to Paris in the winter of 1885-1886 to study with Charcot; when Freud returned to Vienna, he began using hypnosis to treat hysteria, and he began developing psychoanalysis, the first systematic theory of personality.

Freud is not well regarded by academic psychologists today, but thoughtful non-psychologists believe that he is one of the three or four most influential thinkers of the twentieth century. Freud explained psychological problems (including hysteria) in terms of powerful sexual and aggressive desires that are repressed and then reappear as physical symptoms. Three Freudian revisionists are particularly important: Alfred Adler and Karen Horney explained psychological problems in terms of disturbed social relationships (Horney especially emphasized problems that are specific to women); Carl Jung explained psychological problems in terms of frustrated religious longings.

After Freud’s death, personality researchers extended Adler’s and Horney’s views by arguing that people’s problems are largely caused by disturbed relationships with their primary caretakers (parents) during childhood; this viewpoint, called attachment theory (Bowlby, 1983), became an important area of research at the turn of the century (Simpson & Rholes, 1998).

George Kelly’s (1955) book, The Psychology of Personal Constructs, stimulated the cognitive revolution in applied personality psychology. In an abrupt departure from Freud, Kelly argued that people are compelled to make conceptual sense out of their social worlds; they develop theories of what they believe others expect of them during social interaction, and then use these theories to guide their behavior. People with problems often have developed incorrect theories about what others expect, and then they have problems dealing with others. Kelly’s ideas have been very influential, as seen in the writings of Walter Mischel (1990) and his students, and in the works of Julian Rotter (1966) and Albert Bandura (1982). Rotter and Bandura also developed psychometric measures of individual differences in people’s expectations about the world (Locus of Control and Self-Efficacy) that have been very popular and influential.

History of Academic Tradition

The applied tradition of personality research has a specific agenda—initially to explain the origins of hysterical disorders, then more generally to explain individual differences in psychological adjustment—and most of the classical theories of personality concern this problem. The academic tradition has an entirely different agenda—it is to capture the unique qualities of individuals, to describe how people differ from one another (cf. Stern, 1900). To do this, units of analysis become important—because they provide the means for comparing people. In personality psychology, two kinds of units have dominated these discussions—types and traits.

Types are coherent composites of traits; types are the discernible clusters of people found in populations, e.g., the bureaucrat, the absent-minded professor, the bohemian or hippie. Traits, on the other hand, are recurring themes in the behavior of individuals, e.g., aggressiveness, charm, conscientiousness. Types are composed of, and can always be broken down into, traits. However, it is usually difficult to examine a group of traits and decide what sort of type would be formed if they were combined. So the relationship between types and traits is asymmetric: types can be decomposed into traits with some precision, but the reverse is not necessarily true. Rephrasing this point, people have traits, but they fit types.

Regardless of whether we are talking about types or traits, our ability to study them scientifically depends on having available a taxonomy, a method for classifying them reliably. Consequently, the history of type theory and the history of trait theory concern the search for an agreed-upon taxonomy, a way to classify the phenomena.

Type Theory

The first theory of types that we know about was proposed by Theophrastus, a botanist and student of Aristotle; Theophrastus classified the people who were prominent in public life in Athens during the time of Alexander (356-323 B.C.E.). These types are clearly and vividly drawn, and many of them, e.g., the Flatterer, are easily recognized today. Theophrastus’ types are an interesting starting point, but they are not based on systematic observations and, as a taxonomy, his types are not very comprehensive.

We mentioned earlier Hippocrates, who defined four types of people based on their balance of bodily humors. Rostan (1824) developed the most influential typology after Hippocrates, based on physique or body build. He defined four types: cerebral (long and thin); muscular (square and athletic); digestive (rotund); and respiratory (a combination). Viola (1909), using more sophisticated measurement methods, reduced this typology to Rostan’s first three, which Viola called microsplanchnic, normosplanchnic, and macrosplanchnic. Following Viola, Ernst Kretschmer (1926) developed a type theory based on physique that was popular prior to World War II. He defined three types: asthenic (an emaciated body build); athletic (a muscular build); and pyknic (a plump build). He also found relationships between the asthenic build and schizophrenia, and between the pyknic build and manic-depression. Building on the tradition of Rostan, Viola, and Kretschmer, W. H. Sheldon (Sheldon, Dupertuis, & McDermott, 1954) also developed a typology based on body build, which he called somatotypes. He believed the somatotypes reflected underlying genetic factors, and were associated with characteristic personality styles. Long, thin people (ectomorphs) were shy and retiring and potentially disposed to schizophrenia. Square, athletically built people (mesomorphs) were aggressive and assertive and potentially disposed to paranoia. Round people (endomorphs) were jolly and fun and potentially disposed to manic-depression.

Carl Jung also developed an influential type theory in his 1923 book, Psychological Types. There are two features of Jung’s theory that should be noted. First, it is an information-processing model; the Jungian types reflect: (a) where people deploy their attention (inward for Introverts, outward for Extraverts); (b) how they take in information (intuitively or empirically); and (c) how they evaluate information (logically or according to its personal meaning). The second thing to note about the Jungian types is that they form the basis for a very popular personality inventory, the Myers-Briggs Type Indicator, millions of copies of which are sold each year to business and educational organizations around the world.

Spranger (Types of Men, 1928) suggested there are six ideal types of personality, and each type is defined by a particular value orientation. The Theoretical type values truth and wants to impose rational order on the world. The Economic type values things that are useful and wants to make money. The Aesthetic type values form and harmony and wants to make life attractive. The Social type loves people and wants to help them. The Political type values power and wants to gain personal authority and influence. Finally, the Religious type values unity or oneness with the universe and wants to understand his/her relationship to the cosmos. The American personality psychologist Gordon Allport was a great enthusiast of German psychology, and he developed a personality inventory, The Study of Values, to measure Spranger’s types. Allport’s inventory enjoyed considerable popularity in the 1950s and 1960s.

Drawing on Jung, Spranger, and Allport, John Holland (1985) proposed a type theory based on people’s interests, competencies, and values. The resemblance between Holland’s types and the earlier theories is obvious: Realistic types (e.g., engineers) build, operate, and maintain things and equipment; Investigative types (e.g., scientists) use data to solve problems; Artistic types (e.g., artists) design and decorate things, and entertain people; Social types (e.g., clinical psychologists) help people; Enterprising types (e.g., politicians) persuade and manipulate people; and Conventional types (e.g., accountants) regulate and codify things. Holland’s theory can classify every job in the Dictionary of Occupational Titles; it is an exhaustive taxonomy of occupations that has been extensively replicated and is generally regarded as the model theory of personality types. Holland’s theory is the end point of the search for personality types that began with Theophrastus.

Trait Theory

Trait theory begins with the research of Franz Joseph Gall (1758-1828), the foremost brain anatomist of his day. Gall is best known today as the father of phrenology, a fraudulent discipline based on the notion that well-developed mental faculties can be identified by bumps or protrusions on the skull. Gall identified groups of people who were characterized by a specific trait (generosity, truthfulness, lasciviousness) and then tried to find a common denominator in the shape of their skulls. Over time he identified 27 human capacities and designed charts to show the bumps that corresponded to these capacities. The notion that bumps on the skull correspond to overdeveloped areas of the brain is, of course, nonsense. Nonetheless, Gall proposed a clear link between the brain and behavior and, while his contemporaries searched for general laws of the mind, Gall tried to identify the dimensions along which people differ and to find the causes of these differences. In formulating his observations of how people differ from one another, Gall created the first taxonomy of traits.

Francis Galton (1822-1911), Charles Darwin’s cousin and an enthusiastic practitioner of applied measurement, invented the foundations of modern psychometrics, including the concepts of the regression line, regression to the mean, and the correlation coefficient; and his work led directly to the development of factor analysis and modern behavior genetics. Although he studied ability and not personality, he developed methods needed to study personality traits.

The formal study of personality and individual differences as an academic discipline begins with the work of William Stern (1900, 1906). In his two founding books Stern developed a systematic framework for the study of individual differences and of psychological individuality as such. In this framework he already distinguished between a variable-oriented and a type-oriented research approach, paralleling the later distinction between the so-called R- and Q-techniques of factor analysis. As to variables measuring individual differences, Stern (1900) laid out a broad spectrum of personality indicators, including so-called historical methods. In Part II of the book the reader is introduced to a range of statistical methods for analyzing variations and co-variations (correlations) of individual difference measures. The final part, Part III, is devoted to the study of individuality on the basis of biographical and psychological profile (‘psychogram’) data.

The Dutch psychologists Heymans and Wiersma published the first personality inventory in 1906; it was a checklist designed to measure individual differences in three dimensions of adjustment, and it was the prototype for personality trait measurement for the next 50 years. From 1906 until the end of World War II, personality measurement primarily focused on dimensions of psychopathology; Woodworth’s (1908) Personal Data Sheet, used to screen Army recruits in World War I, was the next in this series of tests (Woodworth, 1920). Somewhat later, Thurstone developed a personality inventory (Thurstone & Thurstone, 1929) based on factor analysis, the analytical methodology invented by Spearman (1904) to study intelligence; Thurstone’s inventory produced a gross score indicating the presence or absence of neurotic tendencies. Using this measure, Thurstone showed that neurotic tendencies are relatively independent of mental abilities, but are related to success in college. The most famous test in this tradition is the Minnesota Multiphasic Personality Inventory (Hathaway & McKinley, 1943), still widely used to evaluate personality, psychopathology, and the emotional stability of applicants for jobs where public safety is an issue.

William McDougall, a Scottish child prodigy and polymath, published his Social Psychology in 1908; despite the title, the book concerns personality. McDougall is best known today for his instinct-based theory of motivation, but he was a prolific researcher who also developed an interesting theory of personality, parts of which were adopted by later writers without attribution. Kurt Lewin’s (1935) book, A Dynamic Theory of Personality, extends Stern’s theory of ‘personalism’ or person theory of conscious experience, i.e., what people do depends on what is on their minds. Gordon Allport’s (1937) book, Personality: a Psychological Interpretation, is an elaborate defense of traits. Strongly influenced by both Stern and McDougall, Allport regarded traits as conscious intentions that have a motivational force. Drawing on the German philosopher Windelband’s distinction between nomothetic and idiographic analyses, Allport further distinguished between traits that describe people in general and traits that are specific to individuals, a distinction that is important when the goal of research is to characterize the distinctive features of individuals. Henry Murray’s (1938) Explorations in Personality contains an elaborate taxonomy of needs, which are almost identical to Allport’s traits, a taxonomy that is still used today.

Immediately after World War II, several researchers began studying normal personality using factor analysis. The issue here, once again, concerned the nature and number of fundamental traits underlying normal personality, i.e., the search for an adequate taxonomy of traits. This led to the development of several trait-based personality inventories, the best known of which included the 16PF (Cattell, Eber, & Tatsuoka, 1970), the Eysenck Personality Inventory (Eysenck & Eysenck, 1975), and the Guilford-Zimmerman Temperament Survey (Guilford & Zimmerman, 1949).

In the mid-1960s personality trait research went through a major crisis caused by three unrelated events. The first was the ‘response set controversy’ (cf. Edwards, 1959); advocates of the response set model argued that rather than measuring traits, all existing personality inventories measured individual differences in a person’s desire to be well-regarded, i.e., social desirability. This argument severely challenged the methodological base of personality research. The second event was a series of behaviorist critiques of personality research, which argued that a careful review of the data provided no support for the existence of any general traits that govern, control, or explain people’s behavior across situations (more about this important debate in Section 16.4 below). Third, by the mid-1960s, there were so many different personality tests in the literature, each measuring a different set of traits, that the possibility of discovering an adequate taxonomy of traits seemed hopeless.

The response set controversy was finally resolved when Block (1964), in a series of ingenious analyses, demonstrated that the basic claims of response set theory are false. Next, the reality of traits was established by research in behavior genetics showing that scales on major personality inventories had substantial heritability (cf. Loehlin, Willerman, & Horn, 1989), i.e., there is an important genetic component to personality trait measures. And finally, an obscure Air Force technical report (Tupes & Christal, 1961) provided a solution to the taxonomic issue, arguing that personality could be classified in terms of five broad traits (Adjustment, Ascendance, Agreeableness, Prudence, and Intellect/Openness). This argument, which became known as the Big Five Theory (cf. Wiggins, 1996), suggests that all existing measures of personality concern parts or combinations of the same five dimensions; this viewpoint is accepted by many, but not all, modern personality psychologists. Thus, many people believe that the search for a taxonomy of personality traits that began with Gall’s faculties has ended with a parsimonious list of five broad dimensions.


Individual differences research from Stern to the present shows that people reliably differ from one another along every dimension of performance that has been studied. The major theories of personality concern why differences in social behavior occur. For purposes of exposition, these theories can be sorted into six broad categories—with apologies for the inevitable oversimplifications that this entails. These categories are: depth psychology, behaviorism, cognitive theory, trait theory, interpersonal theory, and evolutionary theory.

Depth Psychology

Depth psychology (e.g., Freudian psychoanalysis) is primarily a continental European tradition that contains many subtheories. Despite their differences, these subtheories share some important assumptions. First, they assume that the major structures of personality are primarily unconscious, that we are normally unaware of the ‘true’ reasons for our actions, and that a major goal in life is to become aware of these reasons. Second, in these theories, intrapsychic and interpersonal conflict is inevitable, and caused by unconscious sources. Third, these theories assume that development is important; events that happen early in life are more significant and determinative than events that happen later. Fourth, they assume that memories of these early developmental experiences persist in the unconscious and cause problems in adulthood, so that most adults have problems. Finally, these theories regard personality as very stable, over long periods of time. For traditional depth psychology, individual differences in social behavior arise from unconscious sources that are often hard to discover. Because of the emphasis on unconscious processes, depth psychology does not easily lend itself to standardized psychometric measurement and is hard to evaluate empirically.

Behaviorist Theories

Behaviorist theories are primarily a North American tradition, and behaviorism is largely defined by a point-for-point rejection of the key assumptions of depth psychology. For example, behaviorists believe that what people do depends on the circumstances they are in, and how they have learned to behave in those circumstances, rather than on underlying personality characteristics. Nor do they believe that conflict is inevitable; nothing about people is inevitable because their tendencies are learned. Behaviorist theories also conceptualize development very differently from depth psychology: early experience is no more important than later experience; what matters is how often certain forms of behavior have been successful for a person, not when they happened. Finally, because what people do depends on what they have learned, and because they are always learning, personality is not at all stable; one of the primary criticisms behaviorists make of depth psychology is that it vastly overestimates the degree to which personality is consistent over time.

People learn behaviors that are successful and repeat them until they are no longer successful. People also learn expectancies about what they, and others, will do in various circumstances; thus, individual differences in social behavior are caused by differences in learned behaviors and expectancies. Historically, a major criticism of behaviorism is the degree to which it ignores individual differences; nonetheless, among the behaviorists, Bandura and Rotter developed useful individual differences measures that have been very influential and widely used in research. The behaviorist emphasis on change is consistent with the widespread belief in American culture that people can always change and improve themselves, a set of beliefs that are not well supported by empirical data.

Cognitive Theories

Beginning with Kurt Lewin and extending through George Kelly to Walter Mischel and his students, cognitive theory tries to redefine personality psychology from the ground up. It begins with a critique of the concept of motivation, based on three points. First, motivational terms are useless when we are trying to help someone, e.g., what do you do after you have decided someone is lazy? Second, when motives are attributed to someone, the attribution will turn into a self-fulfilling prophecy, e.g., if we think a person is lazy, we will then treat the person as if he/she were lazy, thereby confirming what was initially only our hypothesis. And third, cognitive theory argues that motivational terms are unnecessary if we simply assume that people are active to begin with. This view of motivation resembles Isaac Newton’s notion that gravity is a constant and physicists need not worry about it; although Newton was widely criticized at the time, his theoretical decision proved ultimately to be wise.

The essential insight of cognitive theory is that people develop theories of the world, and of other people, and then use these theories to organize their lives. Sometimes the theories are accurate, but when they are not, they cause problems. In any case, people do what is on their minds and personality research concerns discovering the laws of thought, which can be described in purely cognitive terms. Cognitive theory describes personality development very much the way behaviorists do—what is important is what is on your mind and it does not matter where it came from in a developmental sense.

Like behaviorist theory, cognitive theory believes that traditional depth psychology vastly overestimates the degree to which personality is stable. For cognitive theory, if a person changes his/her theory of the world, his/her personality will change. Cognitive theory is quite popular in the United States and the United Kingdom. It is essentially an offshoot of experimental psychology; despite the attention it ostensibly gives to individual differences, cognitive theory has had almost no influence on personality assessment, largely because the key concepts are almost impossible to operationalize with traditional psychometric methods. Because both behaviorism and cognitive theory are only peripherally interested in individual differences, this raises the possibility that they are not, in fact, theories of personality, as is widely assumed.

Trait Theory

Modern trait theory argues that social behavior is controlled in important ways by real ‘neuro-psychic structures’ that exist inside the body. Sometimes called ‘constructive realism,’ modern trait theory distinguishes between: (a) real traits that exist in people; (b) our theories of those traits, which are called constructs; and (c) the measures that we use to observe the traits. Trait theory argues that a person can have a high level of a trait like neuroticism, and have underlying physiological systems that make the person alert to threat and danger, and yet the person, and those who observe the person, can be unaware of this. The reports of oneself and others regarding a person’s neuroticism are useful data, but they need to be supplemented with data from other sources: performance on experimental tasks, biological assays, physiological recordings, etc.

Trait theory and behavior genetics are closely linked in the study of personality. Loehlin et al. (1989), using data from many types of behavior genetic studies, estimated that slightly over 40% of the variability in the most prominent five factors of personality—personality traits—is due to genetic variation. The environmental variation that is shared within families contributed less than 10% of the variability in the five factors. Such findings support the reality of biological systems that produce consistencies in overt social behavior—and individual differences among people—and such evidence strongly challenges the behaviorist critique of traits (cf. Mischel, 1968). Some people argue that trait theories are not very ‘psychological’ and do not provide a link between biological traits and social behavior, but Tellegen and his colleagues propose trait theories that include motivational, perceptual, and information-processing components. Some features of modern trait theory are illustrated in the section of this chapter on temperament (Section 16.7).
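
The variance figures cited above from Loehlin et al. (1989) can be read as an approximate decomposition. What follows is only a sketch, assuming the standard additive behavior-genetic model with no gene-environment correlation or interaction; the symbols h^2, c^2, and e^2 are conventional labels introduced here for illustration and are not taken from the original text.

```latex
% Assumed additive model: phenotypic variance V_P splits into genetic (V_G),
% shared-environment (V_C), and nonshared-environment-plus-error (V_E) parts.
\begin{align*}
V_P &= V_G + V_C + V_E, \\
h^2 &= \frac{V_G}{V_P} \approx 0.40, \qquad c^2 = \frac{V_C}{V_P} < 0.10, \\
e^2 &= 1 - h^2 - c^2 \approx 1 - 0.40 - 0.10 = 0.50 .
\end{align*}
```

On this reading, roughly half of the variability in the five factors would be left to environmental influences not shared within families, together with measurement error.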

Interpersonal Theory

The fundamental assumption of interpersonal theory is that personality arises out of, and is primarily expressed during, social interaction. Depth psychology, behaviorism, and cognitive theory concern intrapsychic processes; in these models other people are objects in the external world, differing from lamp posts and trees only in that they are more dangerous or more fun. In contrast, the interpersonal theorists argue that we need other people, that we live for social interaction, and the person that we become depends on feedback from others.

Social interaction is fueled by two motives: a need for approval, and a need to dominate or outperform others. Development is also important: interaction with parents and caretakers in childhood establishes core beliefs about one’s competencies and feelings of self-worth. We then carry these beliefs forward into the way we deal with others in adulthood. Individual differences in personality reflect different strategies and methods for dealing with others, some of which are more productive than others. Finally, the interpersonal theorists have been very active in developing methods for measuring and classifying the differences in people’s interpersonal behavior.

Evolutionary Theory

Models of personality based on evolutionary theory attempt to synthesize the preceding five traditions. They assume that human nature was shaped by the conditions to which our ancestors in the Pleistocene era had to adapt—people evolved as group-living animals and our culture (e.g., language and technology) and flexible intelligence gave us a substantial advantage over our animal competitors. People always live in groups, and every group has a status hierarchy; the major problem in life is to gain acceptance and approval from the other members of our social groups while at the same time gaining status and the control of resources. Acceptance and status confer reproductive benefits and are pursued during social interaction. Thus, social interaction is a major preoccupation, during which we try to build coalitions, attract support, and negotiate the status hierarchy.

Individual differences in personality reflect individual differences in temperament—which are inherited—and individual differences in strategies and behaviors designed to enhance acceptance and status—which are learned. In contrast with behaviorism, models of personality based on evolutionary theory regard personality as quite stable over long periods of time, a belief that is firmly supported by data. Personality is stable in part because it is rooted in biology; the heritability coefficients for the major dimensions of personality average about .50, suggesting that half the variance in personality scale scores is controlled by genetics. In addition, for evolutionary theory, development matters. Specifically, because people are fundamentally oriented toward social interaction, infants are born prewired to need attention and care. Attachment theory argues that, under normal circumstances, children become attached to their primary caretakers and develop unconscious cognitive prototypes of themselves as worthwhile and other people as trustworthy. These unconscious mental representations guide social interaction in adulthood and provide additional stability to personality.

Finally, writers in this tradition have been leading advocates of the Five-Factor Model (Wiggins, 1996), the view that the major dimensions of personality can be summarized in terms of five broad themes of Adjustment, Ascendance, Agreeableness, Prudence, and Intellectance/Openness.

Traits and Situations

Starting in the late 1950s and continuing for about 15 years thereafter, personality psychology was severely criticized on many grounds, but the most vigorous line of criticism came from behaviorism. It began with a Yale University research project in the late 1920s, called the Character Education Inquiry. Thousands of school children were tested in classrooms, on playgrounds, at parties, and in experimental laboratories. The tests concerned honesty/dishonesty and each test provided children with an opportunity to cheat in some way. The major finding was that a child’s performance on one task could not be predicted by the child’s performance on another task, i.e., what children did seemed to depend on the situations they were in. If honesty is considered a trait, then this research shows that the expression of traits depends on situations. Therefore, people’s behavior must be a function of situations and not personality, defined in terms of traits. Gordon Allport understood the importance of these findings and spent some time discussing the Character Education Inquiry in his 1937 book.

Personality psychologists largely ignored these findings, but Walter Mischel returned to them in his influential (1968) book; this book summarized the standard criticisms of personality psychology and set off the ‘person/situation’ debate, which was never satisfactorily resolved; people just grew tired of it. Mischel’s argument can be summarized in terms of two claims. First, if personality exists, then people’s behavior should be consistent across situations; a review of the empirical literature from the Character Education Inquiry to the present reveals no such consistency. Second, if personality exists, then personality measures should be able to predict people’s behavior. A review of the empirical literature shows that validity coefficients for personality measures rarely exceed a correlation of .30 and the vast majority of them are substantially smaller. Taken together, these two points strongly suggest that: (a) personality does not exist; or (b) it is essentially irrelevant as a cause of behavior. It is hard to overstate the negative impact this argument had on personality psychology in the 1970s and 1980s. Suffice it to say that, in the US, the field almost disappeared in the 1970s.

A closer examination of Mischel’s argument shows that it is flawed. Consider the claim that if personality exists, then behavior should be consistent across situations. There are three problems with this proposition. First, why should behavior be consistent? Why not intentions, or values, or personal goals? Second, the question of how to define consistency in a world of continuous flux is one of the oldest problems in philosophy and seems insoluble. Third, what is a situation? The empirical literature reveals no consensus regarding how to define the term, nor is there any consensus regarding a taxonomy of situations, which means that there is no agreement even on multiple definitions. Thus, Mischel’s first claim turns out to be logically odd and impossible in principle to rebut in an empirical fashion.

The second claim, that validity coefficients for personality measures rarely exceed .30, is more interesting. In the 1960s there were thousands of personality measures available in the published literature, many of dubious technical quality, and if one lumped together the results based on all these measures, one would certainly conclude that personality measurement does not work very well. A more careful review would have led to a different conclusion, however. Specifically, the empirical literature surrounding the few technically competent inventories of normal personality available at the time contained many validity coefficients substantially in excess of .30. Today, the results are still more promising; the development of the Five-Factor Model has provided an invaluable taxonomy for organizing the empirical literature, so that measures of adjustment can be compared with one another and not, for example, with measures of extraversion. When this taxonomy is combined with the recent rise of meta-analysis as a statistical method, validity coefficients in excess of .50, summing across tests and criterion measures, have become commonplace.
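The pooling step that meta-analysis performs can be illustrated with a minimal sketch. It assumes the simplest ("bare-bones") approach, a sample-size-weighted mean of validity coefficients from studies that measure the same trait against the same kind of criterion. The function name and the study values are hypothetical, and the sketch omits the corrections for unreliability and range restriction that published meta-analyses often apply.

```python
def weighted_mean_r(studies):
    """Sample-size-weighted mean correlation.

    studies: list of (sample_size, correlation) pairs.
    """
    total_n = sum(n for n, _ in studies)
    return sum(n * r for n, r in studies) / total_n

# Hypothetical validity studies: one trait (e.g., adjustment), one criterion type.
adjustment_studies = [(100, 0.20), (300, 0.40), (600, 0.30)]
print(round(weighted_mean_r(adjustment_studies), 2))  # 0.32
```

Larger studies pull the pooled estimate toward their own result, which is why grouping measures by trait before pooling matters: averaging adjustment and extraversion coefficients together would blur both.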

Concerning Mischel’s original argument, modern research reveals substantial evidence for the existence of personality and the utility of personality assessment. But for some persons outside the field of personality psychology there is some remaining confusion as reflected in the question, ‘Which is the more important determinant of personality: traits or situations?’

The Fallacy of Situationism

There is a problem with the argument that people’s behavior is a function of traits and situations. The problem concerns the fact that there is no agreed-upon definition or taxonomy of situations. Thus, behavior is claimed to be a function of something that has yet to be defined, even by those people who most believe in situations as explanatory concepts. As a result, the search for ‘person-by-situations interactions’ that so preoccupied researchers in the late 1970s seems to have been an empty exercise, because the concept of ‘situation’ has never been defined or given operational specification. It is very difficult systematically to link persons with situations when situations are undefined. Lewin (1935) and Murray (1938) suggested that situations should be defined in terms of how they are perceived by individuals. If so, then situations become a function of individual personality, and the person-situation debate becomes moot.

Personality Measurement and Structure

There are two primary questions in personality measurement: (a) what to measure; and (b) how to measure it? Both questions have been the subject of considerable debate. The first is a question about theoretical taste; the second is an empirical question which in principle can be answered with data.

What to Measure?

If we assume that personality assessment has a job to do, i.e., if we adopt an applied perspective, then what we measure depends on what we are trying to accomplish. For depth psychology, people are primarily motivated by unconscious wishes and desires, and this is what we should measure. Depth psychologists also believe that, when conscious controls relax, unconscious motives will appear in fantasy material: dreams, artistic creations. This belief led to the development of projective tests which are used to assess unconscious themes, desires, and aspirations.

American psychology is heavily grounded in behaviorism and American psychiatrists and clinical psychologists want to identify characteristics such as dysfunctional behaviors, conscious anxiety, and emotional distress. They generally believe people can talk about their problems and reveal them in interviews or through self-report inventories and behavioral check lists. This led to the development of standardized psychiatric inventories such as the Minnesota Multiphasic Personality Inventory, which are used as aids to diagnosis and treatment planning.

Career counselors and vocational/occupational psychologists want to measure qualities or factors that predict occupational satisfaction and success. They, therefore, measure the values, interests, and skills that characterize various well-defined occupational groups, e.g., scientists like to work alone, solve puzzles and problems, and tend not to value money, so that people who are introverted, like problem solving, and are indifferent to the profit motive are compatible with careers in science. Occupational psychologists also want to measure the characteristics that are associated with success in the various occupations, which may include such factors as intelligence, drive, creativity, and social skill. They then use this information to guide people into occupations.

Industrial psychologists overlap with the foregoing groups; they use personality assessment to solve three general categories of problems. The first is to identify people who will be undesirable employees; the goal here is to measure factors associated with absenteeism, bogus medical complaints, and tendencies toward theft, violence, and other forms of antisocial behavior. The second problem is to identify people who will perform well in specific occupations; here the goal is to develop the psychological profile of high performers in particular jobs, then use the profile to identify others who would perform well in that occupation. The third way industrial psychologists use assessment is to give individuals feedback on how to enhance and improve their overall career performance. This third case requires an extensive and in-depth assessment of all the factors that might influence career development, including cognitive ability, normal personality, abnormal personality, and values and interests.

Personality psychologists in the academic tradition want to measure traits, which they see as the building blocks of personality. Measuring traits relies heavily on factor analysis as a tool, and the goal is to identify factors that recur in different samples and languages; these factors are seen as traits. This academic orientation led to the development of the Five-Factor Model mentioned earlier, widely regarded as a substantial scientific achievement. The goal of measuring traits is very different from the goal of applied personality psychology, where the key question in measurement is validity, the degree to which scores on a test predict measures of a desired outcome such as job success. For academic researchers, validity does not matter; the goal is to measure traits and that is sufficient unto itself. Once again, the answer to the question ‘What should we measure?’ depends on what we are trying to do.

How to Measure Personality?

At the beginning of this chapter we distinguished between the actor’s view and the observer’s view of personality, i.e., between identity and reputation.

Measuring the Observer’s View of Personality

It is a relatively straightforward task to measure personality from the observer’s perspective; this involves having observers rate the person in question. There are a number of well-standardized rating instruments, including the Q sorts used by clinical psychologists and the 360-degree appraisal forms used by modern industrial psychologists. Moreover, the Five-Factor Model (Wiggins, 1996) brings a useful taxonomic discipline to the rating process. Some writers suggest that the Five-Factor Model is, in fact, the natural structure of observer ratings, the innate categories in terms of which we think about and evaluate others. And this suggests that the various standardized rating instruments in use today can, at least in principle, be reconfigured in terms of five broad dimensions.

The reliability of ratings of personality depends on the number of raters and the observability of the characteristics to be rated. More raters and greater observability (e.g., talkativeness, which is observable, versus brittle ego structures, which are not) always enhance the reliability of ratings. In addition, when done correctly, these ratings tend to be stable over long time periods. And finally, ratings of personality can predict useful outcomes, i.e., they are valid as well as reliable. In fact, generally speaking, the validity of personality ratings equals or exceeds the validities of personality inventory scales.
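The effect of adding raters can be sketched with the Spearman-Brown formula, a standard psychometric result that the chapter does not name; it is used here as an assumption about how the ratings are combined (averaging raters who are treated as interchangeable). The single-rater reliabilities below are hypothetical values chosen to contrast an easily observed characteristic with a poorly observed one.

```python
def spearman_brown(single_rater_reliability, n_raters):
    """Reliability of the average of n_raters interchangeable ratings."""
    r = single_rater_reliability
    return (n_raters * r) / (1 + (n_raters - 1) * r)

# Hypothetical single-rater reliabilities (not taken from the chapter).
traits = [("talkativeness (observable)", 0.50),
          ("ego brittleness (hard to observe)", 0.20)]

for label, r1 in traits:
    by_panel_size = [round(spearman_brown(r1, k), 2) for k in (1, 2, 4, 8)]
    print(label, by_panel_size)
```

Under these assumptions, four raters of the observable trait reach a reliability of .80, while the less observable trait needs eight raters to reach about .67.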

Measuring the Actor’s View of Personality

There are two models for measuring personality from the actor’s perspective; these might be called the trait theory model and the empirical model. The trait theory model, which is by far the more popular, is based on two major assumptions. First, it assumes, à la Allport, that individual personality is configured in terms of traits (indwelling neuropsychic structures) and the goal of assessment is to measure these traits. Second, this model assumes that people can report on the degree to which various traits are salient in their lives. With these two assumptions in mind, the measurement process is relatively straightforward: one writes an initial set of items designed to reflect the trait in question; one next tests a group of people with the items; one then calculates correlations among the items and retains the items that are most highly correlated; finally, one begins again for the next trait of interest. After creating a number of trait measures in this way, one can calculate correlations among the scales and retain the scales that are most independent or uncorrelated.

This process will result in a set of homogeneous or internally consistent measures of hypothesized trait dimensions that are also relatively independent, a highly desirable state of affairs for most measurement researchers. The problem with this model is that, although it maximizes scale reliabilities and independence, it tends to ignore validity, which is the bottom line in applied assessment.
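The item-retention step of the trait theory model can be sketched as follows. This is one common way to operationalize "retain the items that are most highly correlated," not necessarily the procedure any particular test author used: each item is correlated with the sum of the other items (a corrected item-total correlation), items below a cutoff are dropped, and internal consistency is summarized with Cronbach's alpha. The responses, the cutoff of .30, and all function names are assumptions made for illustration.

```python
from statistics import mean, pvariance

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def corrected_item_total(items):
    """Correlate each item with the sum of the remaining items."""
    n_people = len(items[0])
    result = []
    for i, item in enumerate(items):
        rest = [sum(other[p] for j, other in enumerate(items) if j != i)
                for p in range(n_people)]
        result.append(pearson(item, rest))
    return result

def cronbach_alpha(items):
    """Internal consistency of the scale formed by summing the items."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    item_var = sum(pvariance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var / pvariance(totals))

# Hypothetical 1-5 ratings from six people on four candidate items.
items = [
    [1, 2, 3, 4, 5, 5],
    [2, 2, 3, 4, 4, 5],
    [1, 3, 3, 5, 4, 5],
    [3, 1, 4, 2, 5, 1],  # written for the trait but behaves like noise
]
CUTOFF = 0.30  # assumed retention threshold, not from the chapter

r_it = corrected_item_total(items)
retained = [i for i, r in enumerate(r_it) if r >= CUTOFF]
print("retained items:", retained)
print("alpha, all items:", round(cronbach_alpha(items), 2))
print("alpha, retained :", round(cronbach_alpha([items[i] for i in retained]), 2))
```

Dropping the weakly related item raises alpha in this toy data set. Nothing in the procedure refers to an external outcome, which is the sense in which the model optimizes homogeneity while leaving validity unexamined.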

The alternative and minority perspective is that the goal of assessment is not to measure traits but to predict significant outcomes: status, popularity, income, occupational performance, creativity, delinquency, leadership. Here one identifies items that discriminate between people who have high and low standing on the desired outcome, and then retains those items as scales. This model of measurement maximizes validity at the expense of scale homogeneity and scale independence. Some of the best-known tests in the history of psychological measurement have been composed in this way, including Binet’s original measure of intelligence, E. K. Strong’s Vocational Interest Blanks, Hathaway and McKinley’s Minnesota Multiphasic Personality Inventory (Hathaway & McKinley, 1943), and Gough’s California Psychological Inventory (Gough, 1987). It is, of course, possible to combine these two approaches, but it is rarely done.
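The empirical model can be sketched in the same spirit. The example below assumes true/false items and a criterion that splits people into high and low groups; an item is kept when its endorsement rates in the two groups differ by at least an assumed margin, and it is keyed in whichever direction favors the high group. The data, the margin of .30, and the function names are hypothetical, and this is not a reconstruction of how any of the inventories named above were actually built.

```python
def empirical_key(responses, criterion_high, min_diff=0.30):
    """Select and key items by how well they separate criterion groups.

    responses: one list of 0/1 item answers per person.
    criterion_high: one bool per person (True = high standing on the outcome).
    Returns {item_index: keyed_answer}, where keyed_answer is 1 or 0.
    """
    high = [r for r, h in zip(responses, criterion_high) if h]
    low = [r for r, h in zip(responses, criterion_high) if not h]
    key = {}
    for i in range(len(responses[0])):
        p_high = sum(r[i] for r in high) / len(high)
        p_low = sum(r[i] for r in low) / len(low)
        diff = p_high - p_low
        if abs(diff) >= min_diff:
            key[i] = 1 if diff > 0 else 0
    return key

def scale_score(response, key):
    """Count the answers that match the keyed direction."""
    return sum(1 for i, keyed in key.items() if response[i] == keyed)

# Hypothetical answers to four true/false items from eight people.
responses = [
    [1, 0, 1, 1], [1, 0, 0, 1], [1, 1, 1, 0], [1, 0, 1, 1],  # high on outcome
    [0, 1, 1, 0], [0, 1, 0, 1], [1, 1, 1, 0], [0, 1, 0, 1],  # low on outcome
]
criterion_high = [True] * 4 + [False] * 4

key = empirical_key(responses, criterion_high)
print(key)
print(scale_score([1, 0, 1, 1], key), scale_score([0, 1, 0, 1], key))
```

Item content plays no role in the selection, so the resulting scale can be heterogeneous. Because items are chosen for their fit to one sample, a key built this way would normally be checked against a fresh sample before use.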

The modern assessment center, which provides a comprehensive analysis of an individual’s personality, typically includes a variety of measurement procedures: simulations, ratings, and inventories. The modern assessment center was invented by the German Army after World War I and widely imitated in the United Kingdom and the United States. There is a rich research tradition associated with assessment centers, which when properly designed and conducted generally yield valid results, i.e., significant and useful correlations between scores for performance in the assessment center and outcome measures such as organizational status and rated creativity.

Special Topics

Occupational Performance

Personality psychology has historically been a part of clinical psychology and psychiatry, and the history of personality measurement reflects this theme. From the beginning of the discipline in the early 1900s, personality measurement focused on assessing aspects of psychopathology. In 1943 the United States’ Office of Strategic Services (today the Central Intelligence Agency) established an assessment center to screen applicants for membership in the organization. The book Assessment of Men (MacKinnon, 1948) evaluated the effectiveness of the assessment center and concluded that psychopathology was not a good predictor of performance, that some highly effective individuals had experienced unusually traumatic upbringings and some undistinguished performers seemed very well-adjusted.

This lesson has been largely overlooked by researchers studying the links between personality and occupational performance, and this is partly responsible for the negative reviews in the 1960s regarding the validity of personality measures. Measures of psychopathology—anxiety, depression, and self-esteem—are largely uncorrelated with occupational performance.

For about 30 years—1960 to 1990—it was widely believed that personality measures were uncorrelated with significant real-life behavior. Two developments changed this perception. On the one hand, the Five-Factor Model provided the necessary taxonomy for organizing literature reviews—measures of extraversion could be compared with one another rather than with measures of conscientiousness and adjustment. On the other hand, researchers realized that certain dimensions of personality are more relevant to some occupational outcomes than to others—this is called aligning predictor variables (personality measures) with criterion variables (outcomes). Beginning in the early 1990s, researchers in the United States and Europe, using meta-analysis, organizing personality variables in terms of the Five-Factor Model, and appropriately aligning predictors with criteria, reported finding that personality reliably predicts occupational performance above and beyond the prediction afforded by cognitive variables.


Creativity

Organizations and cultures exist in environments that are constantly changing. All organizations and cultures are in competition with other organizations and cultures; in order to remain viable, they must constantly change. Groups are notoriously resistant to change; from where, then, does the stimulus for change come? It comes from the innovative people in the group or organization. In this sense, creativity is a resource for cultural and organizational survival. It then becomes a matter of some importance to be able to identify creative talent, and then to encourage it.

The Institute for Personality Assessment and Research at U.C. Berkeley was established in 1948 to study high-level effectiveness. In the 1950s and 1960s the staff of the Institute conducted a series of studies of creativity with highly significant and well-replicated results (cf. Barron, 1969). One of the best of these was a study of creativity in architects. The researchers identified, using various nomination methods, three groups of architects: the first was highly creative—as judged by a substantial number of experts; the second group had worked with the first group but was not regarded as creative; the third group was journeymen architects who had no contact with the first two groups. All of these architects went through a 2½ day assessment center. An analysis of the assessment center results revealed highly significant, cross-validated differences between the three groups with the creative people being no brighter, but substantially more troubled, ambitious, and harder working than the other groups. Moreover, these differences in early adulthood persisted through the life span—the creative group remained substantially more productive, professionally active, and well-regarded than the other two groups well into their seventies and eighties.


Temperament

Temperament involves stable individual differences that appear early in life. One of the most influential discussions of temperament was a round-table discussion in 1987 (Goldsmith et al., 1987). Goldsmith summarized the discussion in terms of four points:

  • Temperament involves individual differences.
  • Temperament consists of dispositions, that is, tendencies to engage in classes of behavior or to experience certain classes of emotion; temperament is not defined by specific behaviors.
  • Temperament is expressed most directly during infancy; later it is expressed less directly because temperament becomes subject to newly developed control processes.
  • Temperament theorists emphasize the ‘biological underpinnings’ and continuity of individual differences.

Goldsmith also concluded that most of the discussants regarded temperament as modifiable. But he noted that there are significant disagreements over boundaries of the temperament construct, over its boundaries with personality, and over the distinctive or defining features of temperament.

A number of the issues raised at the 1987 discussion remain quite current. On the distinction between temperament and personality, Strelau (1987) argues that temperament is distinguishable from personality in that (1) temperament is determined by biology, whereas personality is shaped by social forces; (2) temperament is expressed in the early years and personality appears later; (3) temperament can be observed in nonhuman animals and personality cannot; (4) temperament is expressed in the style of the execution of behaviors rather than in the specific content of behaviors (although he acknowledged anxiety is an example of a temperament saturated with content); and (5) personality, but not temperament, is concerned with how behavior is directed by higher cognitive processes.

Rothbart argued that personality is built upon the foundation of temperament:

… ‘personality’ is a far more inclusive term than ‘temperament.’ Personality includes important cognitive structures such as self-concept and specific expectations and attitudes that may, if they are sufficiently negative, result in frequent displays of distress even if an individual is not temperamentally predisposed toward it. Personality also includes perceptual and response strategies that mediate between the individual’s biological endowment and cognitive structures and the requirements, demands, and possibilities of the environment … Thus, temperament and personality are seen as broadly overlapping domains of study, with temperament providing the primarily biological basis for the developing personality. (Rothbart in Goldsmith et al., 1987, p. 510)

On the other hand, Hofstee (1991) argued that if personality is defined in terms of fundamental dispositions, then the distinction between personality and temperament vanishes. The world’s literature thus contains a range of opinion anchored on one end by Hofstee’s contention that personality and temperament are the same, and on the other end by views that temperament is a primal organizer, a foundation for a later developing personality that is much more than temperament. Based on the view that personality and temperament are different, Angleitner and Riemann (1991) described the implications of this distinction for the measurement of temperament.

Another common assertion is that temperament should be particularly observable in infancy. That is, basic dispositions such as fearfulness should be easily observed in infancy because the child has not yet developed higher-level control processes, the elaboration of the self-concept, and so on. However, one could also argue that because infancy is the time of greatest environmental potency, when children cannot defend themselves from scratchy clothes, intrusive relatives, or aversive foods, dispositional differences are less likely to be observed. Certainly heritable individual differences are less detectable when potent environmental factors produce great phenotypic variation. The field of behavior genetics also raises the possibility that genetic differences are powerfully enhanced by people actively selecting and modifying environments, a process that becomes prominent only later in development (Scarr & McCartney, 1983). Thus, although armchair psychologists can argue one side or the other, the degree of stability and prominence of early-appearing individual differences remains an open research question, not to be resolved by definition.

Hinde, also a discussant of the 1987 round-table discussion, pointed out the problems in defining temperament by reference to its supposed biological features. A number of problems have been caused by the careless use of three terms: biological, genetic, and heritable. Unless one believes that some psychological processes are mediated by physical processes while others are mediated by metaphysical processes, it is not helpful to assert that temperament has biological underpinnings; all psychological processes have biological underpinnings or they would not exist. The general term genetics refers to molecular recipes for both individual differences and features that are shared across all members of a species, whereas heritability refers only to individual differences. Having lungs to breathe oxygen is encoded in the genome, and is thus genetically transmitted, but it is not classically heritable in humans, whereas individual differences in endurance probably have a degree of heritability. Heritability thus more specifically refers to the extent to which phenotypic variation across individuals is due to genetic differences between them.
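The distinction between genetic and heritable can be made concrete with a small sketch of heritability as a ratio of variances. It assumes the simplest textbook decomposition, in which phenotypic variance is the sum of a genetic and an environmental component that are uncorrelated; the function name and the variance figures are hypothetical.

```python
def heritability(var_genetic, var_environment):
    """Share of phenotypic variance attributable to genetic variance.

    Assumes phenotypic variance = genetic variance + environmental variance.
    Returns None when the trait does not vary, because a proportion of
    zero variance is undefined.
    """
    var_phenotype = var_genetic + var_environment
    if var_phenotype == 0:
        return None
    return var_genetic / var_phenotype

# Hypothetical variance components (arbitrary units).
print(heritability(4.0, 6.0))   # a trait that varies, e.g., endurance
print(heritability(0.0, 0.0))   # a trait everyone shares, e.g., having lungs
```

A trait shared by every member of the species has no individual differences to apportion, so the ratio has nothing to describe even though the trait is fully specified by the genome.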

Some people define temperament as individual differences that are heritable. This approach to defining temperament is not very discriminating: most well-measured behavioral traits have some degree of heritability. But the heritability of virtually all traits is well below ‘perfect’ heritability in which all observed differences between individuals can be accounted for by genetic differences between them. As Hinde noted, there is no natural cut-off point for deciding when a trait is heritable. Rowe (1997) reviewed research on early-appearing individual differences and concluded that, ‘one-third to one-half of individual differences in temperamental traits can be attributed to genetic variation among children’ (p. 378). Temperamental traits thus do not seem to have higher heritability than other human dispositions. Thus the heritability of early-appearing individual differences should be treated as a research problem to be studied, and the problems entailed in defining temperament as ‘that which is heritable’ should be recognized.

Others have attempted to define temperament as ‘that which is stable.’ However, the degree to which temperament is stable over time is another question that should be answered by research rather than by definition. Differential continuity refers to the extent to which a child retains his or her relative position in individual differences across developmental periods. This does not mean that behavior stays the same. For example, fearfulness in a one-year-old may be evidenced in very different ways than fearfulness in a 13-year-old. Nevertheless, the type of continuity evaluated in this section is differential continuity, that is, maintaining one’s position, high, low or medium, on a disposition across developmental periods.

Wachs (1994) summarized stability research by noting that age-to-age correlations in temperament are modest, suggesting only 4 to 9% of the variance in temperament measures is predictable over six months. Initial research on the sources of stability and change in adult positive and negative emotionality suggests that indeed, much of the stability of traits arises from genetic factors, and much of the change arises from environmental factors that are not shared by members of families (McGue, Bacon, & Lykken, 1993). Even if a dimension of individual difference shows high heritability at one developmental point, this does not imply that there will be differential continuity. Consider the stages of the butterfly’s life; the transformations from larva to butterfly show that change can be the result of genetic control. The misconception that heritability implies stability seems particularly prominent in the temperament domain. But nothing known in genetics would rule out the possibility that some children might have a special temperament that is turned on for a period of time, e.g., ‘a squeaky wheel gets the grease’ personality, only to be shut off later. The point is that continuity of temperament must be established by research, not decided by definition.
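The link between the modest age-to-age correlations and the 4 to 9% figure cited from Wachs (1994) is the usual rule that the proportion of variance predictable from a correlation is its square. The correlations of .20 and .30 below are not quoted in the text; they are the values implied by working back from 4% and 9%, and the function name is hypothetical.

```python
def variance_explained(r):
    """Proportion of variance in one measure predictable from the other."""
    return r ** 2

# Correlations implied by the 4% and 9% figures (square roots of .04 and .09).
for r in (0.20, 0.30):
    print(f"r = {r:.2f}: {variance_explained(r):.0%} of variance predictable")
```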

What are the Major Dimensions of Temperament?

If one defines temperament as early-appearing psychological individual differences excluding intelligence, the problem remains of providing a list of the major dimensions that comprise temperament. At present there is no consensus. Thus we will examine the lists provided by prominent researchers and theorists who have been concerned with delineating the features of temperament. Buss and Plomin (1975) initially listed dimensions of emotionality (ranging from easily distressed to relatively phlegmatic), activity, sociability and impulsivity; impulsivity was subsequently dropped. Thomas and Chess (1977) listed nine features, most of them dimensions: rhythmicity, activity level, approach/withdrawal for novel stimuli, adaptability, sensory thresholds, predominant mood, mood intensity, distractibility, and persistence/attention. In addition to dimensions, some theorists include temperament types, such as a ‘difficult’ child. Thomas and Chess (1977) included three temperament types: easy, difficult, and slow to warm up.

Strelau (1983) has emphasized stylistic rather than content features of temperament, and he developed the theory and measurement instruments to apply Pavlovian concepts of nervous system function to the study of temperament: strength of excitation, strength of inhibition, and mobility. This contrasts with a content emphasis, as found in the work of Goldsmith and Campos, who defined temperament as dispositions to experience fundamental emotions that link coherent stimulus classes with organized output classes. Goldsmith and Campos developed a list of dispositions to experience arousal of primary emotions, including sadness, pleasure, anger, disgust, fear, interest, and activity.

Rusalov (1989) developed a measure of temperament that includes ergonicity, an ambition or achievement-like construct; social ergonicity, which resembles extraversion; plasticity, resembling openness; tempo; social tempo; emotionality; and social emotionality. Gray (1991) coordinated temperament hypotheses with his three theoretical neuropsychological systems. First, there is a Behavioral Inhibition System, which generates individual differences in sensitivity to novelty and signals of punishment or nonreward. The second system is the Fight/Flight system, which responds to unconditioned aversive stimuli. Finally, the Behavioral Approach System generates individual differences in responsiveness to signals of reward or nonpunishment.

A number of researchers have used factor analysis to examine the relationships between measures of temperament to see if some consensus might be reached regarding a set of descriptive dimensions. Ruch, Angleitner, and Strelau (1991) suggested five factors of temperament: emotional stability, rhythmicity, activity and tempo, sociability, and impulsiveness. Zuckerman, Kuhlman, and Camac (1988) also suggested five factors: sociability, activity, aggression-sensation-seeking, neuroticism-anxiety, and impulsive-unsocialized-sensation seeking.

Angleitner and Ostendorf (1994) examined the relationships among a wide range of temperament measures and concluded that the Five-Factor Model of personality represented a single comprehensive framework within which measures of temperament and personality could be located. As noted earlier, the Five-Factor Model of personality is defined by the dimensions of Neuroticism, Extraversion, Agreeableness, Conscientiousness, and a fifth factor called Culture, Intellectance, or Openness.

Rothbart (1981) developed measures of smiling and laughter, fear, frustration, soothability, activity level, and duration of orienting. Ahadi and Rothbart (1994) analyzed the structure of temperament dimensions in a manner consistent with Gray’s approach, and concluded that temperament dimensions can be classified as approach related (mapping onto the Five-Factor Model dimension of Extraversion); anxiety-inhibition related (mapping onto Neuroticism); and a third aspect, effortful control, exerting modulating effects on the other dimensions. These summary dimensions seem to correspond with Tellegen’s superfactors of Positive Emotionality, Negative Emotionality, and Constraint. Rothbart, Derryberry, and Posner (1994) presented a developmental model of temperament that outlines the appearance of a generalized distress and negative affect system in the newborn period. Over the first year, more differentiated fear and frustration/anger develop from generalized distress states. The development of approach systems beginning in the second month adds the possibility of frustration when a child is unable to gain a reward. Finally, effortful control appears later because its development is tied to the maturation of attentional control systems after the tenth month.

A number of structural analyses of temperament converge on solutions that resemble four of the five factors found in the Five-Factor Model of personality (e.g., Digman & Shmelyov’s, 1996, analysis of the temperament of Russian schoolchildren). These temperament findings converge with the integrative model of personality proposed by Watson, Clark, and Harkness (1994); this model is composed of Neuroticism or Negative Affectivity, Extraversion or Positive Emotionality (linked with approach systems), Conscientiousness or Constraint (linked with effortful control), and Agreeableness. Thus, one promising development is an increasing convergence between lists of the major features of temperament and lists of the major features of personality.

Recent Developments in Studying the Behavior Genetics of Temperament

Studies of the behavior genetics of early-appearing individual differences have moved beyond estimating the heritabilities of temperament traits. Recent studies show, for example, that some of the situational specificity of behavior (e.g., classroom activity level versus activity level in a laboratory) is genetic in origin (Schmitz, Saudino, Plomin, Fulker, & DeFries, 1996).

A comprehensive review of findings is beyond the scope of this chapter; as an example of recent developments, however, we will describe the issue of apparent ‘contrast effects’ in twin studies of temperament. A large inconsistency in the literature on the behavior genetics of temperament concerns the discrepancy between twin studies and other methods. The degree of similarity in parental temperament ratings of dizygotic twins (twins resulting from two separate conceptions, as contrasted with monozygotic twins, who result from a single conception that divides to become two persons) was inconsistent with similarity estimates from other methods. Comparing across multiple methods of study, the twin method produced unexpected results: dizygotic twins seemed to be too different. Buss and Plomin (1984) conjectured that parents amplify any existing real differences, resulting in a ‘contrast effect.’ Suppose one child is moderately high in fearfulness and the other is moderately low; if parents amplify the real difference by rating one child extremely high and the other extremely low in fearfulness, this would be a contrast effect. Because real differences tend to be larger between dizygotic twins than between monozygotic twins, parental ratings amplifying real differences would make dizygotic twin ratings more different than expected. Subsequent research has supported this interpretation. Saudino and Eaton (1991), using automated data collection procedures rather than ratings (automated procedures not being subject to contrast effects), showed dizygotic twin similarities to be in line with expectations. Further, the development of measurement procedures not affected by contrast effects provides a clearer understanding of the role of environment in shaping temperament (Goldsmith, Buss, & Lemery, 1997). This maturation of research methods, in this case the ability to check the consistency of heritability estimates across methods, allowed for improvement in the measurement methods themselves.
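
The contrast-effect argument can be made concrete with a toy model. The sketch below is an illustration under stated assumptions, not the procedure used in any of the studies cited: each twin pair is represented as a pair mean plus or minus a half-difference, the two components are assumed independent with total variance 1, and a rater is assumed to multiply every within-pair difference by a constant factor. The "true" correlations (.80 and .50) and the amplification factor (1.5) are hypothetical.

```python
def twin_correlation(true_r: float, amplification: float = 1.0) -> float:
    """Expected correlation between co-twins' ratings under a toy model.

    Each pair is modeled as pair_mean +/- half_difference, with the two
    parts independent and total variance fixed at 1. A rater who
    exaggerates within-pair differences multiplies half_difference by
    `amplification` (> 1 means a contrast effect).
    """
    between = (1 + true_r) / 2                    # variance of pair means
    within = (1 - true_r) / 2                     # variance of half-differences
    within_rated = amplification ** 2 * within    # inflated by the rater
    return (between - within_rated) / (between + within_rated)


# Hypothetical true similarities and a hypothetical amplification factor.
for label, true_r in (("MZ", 0.80), ("DZ", 0.50)):
    rated = twin_correlation(true_r, amplification=1.5)
    print(f"{label}: true r = {true_r:.2f}, rated r = {rated:.2f}")
```

With these made-up values the rated monozygotic correlation falls from .80 to .60, while the rated dizygotic correlation falls from .50 to about .14, so the same rating bias makes dizygotic twins look disproportionately dissimilar.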

To summarize the major developments in temperament research, structural analyses examining the list of dimensions defining temperament suggest convergence between models of temperament and the Five-Factor Model of personality. Developmental models, such as that proposed by Rothbart et al. (1994), have embedded these lists of features in a picture of dynamic change across the first years of life. The measurement of temperament has become more sophisticated through the identification and amelioration of rating contrast effects, allowing for further gains in precision.

Intelligence and Cognitive Ability

Although this chapter begins with personality, the study of individual differences is most often associated with the study of intelligence, and this section concerns the theory, measurement, and consequences of the intelligence construct. Most people believe that the modern study of intelligence begins with Alfred Binet (1857-1911), because he developed, in 1905, the first test of general mental ability, revised the test in 1908, and refined it in 1911. Binet’s instruments spawned the applied mental measurement movement, but there were some important antecedents to Binet’s contribution. Earlier, for example, Esquirol distinguished between mental deficiency and mental disorders, two conditions that had often been conflated, and stressed that mental deficiencies come in degrees. In the late nineteenth century, Fechner, Weber, and Wundt tried to define the relationship between intervals of objective stimulus intensity and subjective appraisals of just noticeable differences (JNDs) in the personal experience of those stimuli. Their efforts to scale simultaneously physical and psychological phenomena (psychophysics) led to the development of psychometrics, which is based on the idea that responses to environmental cues reflect individual differences in ability, interests, and personality. The limen of psychophysics became the standard error of measurement in psychometrics (the measurement of individual differences in abilities).

Next, Francis Galton and James McKeen Cattell, among others, tried to measure intelligence by scaling individual differences in the strength of various sensory systems (Thorndike & Lohman, 1990); they used Aristotle’s view that the mind is informed to the extent that one’s sensory systems provide clear and reliable information. This approach did not pan out. Binet, in a creative departure, examined complex behavior (e.g., comprehension, judgment, reasoning) directly. His methods were less reliable than psychophysical assessments, but they more than made up for this in validity. Binet’s insight was to use an external criterion to validate his measuring tool. Thus, Binet pioneered the empirically-keyed or external validation approach to scale construction. His external criterion was chronological age, and test items were grouped such that the typical member of each age group was able to achieve 50% correct answers on questions (items) of differential complexity. With Binet’s procedure, individual differences in scale scores, or mental age (MA), were quite variable within students of similar chronological age (CA). William Stern used these components to create a ratio of mental development: MA/CA. This was later multiplied by 100 to form what we now know as the intelligence quotient (IQ), namely IQ = (MA/CA) × 100.
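
As a small worked illustration of the ratio IQ just defined, the sketch below computes MA/CA multiplied by 100. The function name and the example ages are hypothetical and chosen only to show the arithmetic.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return mental_age / chronological_age * 100


# Hypothetical eight-year-olds performing at three different mental ages.
for mental_age in (6, 8, 10):
    print(f"MA = {mental_age}, CA = 8 -> IQ = {ratio_iq(mental_age, 8):.0f}")
```

A child whose mental age equals his or her chronological age scores 100; performing like a typical ten-year-old at age eight gives 125, and like a typical six-year-old gives 75.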

Binet’s approach was impressive; unlike Galton’s and Cattell’s psychophysical assessments of sensory systems, Binet’s test predicted teacher ratings and school performance, and the progressive educational movement in America eagerly adopted it. Two of G. Stanley Hall’s students, H. H. Goddard and Lewis M. Terman, promoted applied psychological testing. Although they specialized in opposite ends of the IQ spectrum (Chapman, 1988; Zenderland, 1998), they shared similar views about the need to tailor curriculum complexity and speed to individual differences in mental age. Terman later conducted one of the most famous longitudinal studies in all of psychology, devoted to the intellectually gifted (Chapman, 1988), while Goddard concentrated on the ‘feeble minded’ and directed the Vineland institution for training practitioners to work with this special population (Zenderland, 1998). They later joined Robert M. Yerkes to develop a cognitive ability measure for personnel selection during World War I. The Armed Forces needed to screen recruits, many of whom were illiterate; one of Terman’s students, Arthur S. Otis, devised a nonverbal test of general intelligence that was used for this purpose. The group developed the Army Alpha (for literates) and Beta (for illiterates). The role mental measurements played in World War I and, subsequently, in World War II legitimized the use of cognitive ability measures to screen people in the general public for training purposes; as a consequence, virtually every modern adult has taken some sort of cognitive ability test at some point in his or her life.

Following World War I, Terman recognized the link between the use of intellectual assessment in the military and problems in the public schools. Terman, a former teacher, understood that there is a range of ability in students who are grouped based on chronological age; he then became an advocate of homogeneous grouping based on mental age. He felt strongly that beyond two standard deviations on either side of IQ’s normative mean, the likelihood of encountering special students increases exponentially, and the more extreme the IQ, the more intense the need. Optimal rates of curriculum presentation, as well as its complexity, vary throughout the range of individual differences in general intelligence. With IQ centered on 100 and a standard deviation of 16, IQs extending from the bottom to the top 1% in ability cover an IQ range of approximately 63 to 137. But since IQs go beyond 200, this 74-point span only covers one-third of the possible range. Hollingworth’s (1942) Children over 180 IQ dramatized the unique educational needs of this special population, a need that has been empirically supported in every decade since (Benbow & Stanley, 1996).
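
The 63 to 137 range can be checked with a short calculation. The sketch below assumes that IQ is normally distributed, which the text implies but does not state; it uses the mean of 100 and standard deviation of 16 given above and looks up the 1st and 99th percentiles.

```python
from statistics import NormalDist

# Assumption: IQ follows a normal distribution with the mean and SD
# quoted in the text (100 and 16).
iq = NormalDist(mu=100, sigma=16)

bottom_1pct = iq.inv_cdf(0.01)   # score below which 1% of people fall
top_1pct = iq.inv_cdf(0.99)      # score above which 1% of people fall

print(f"bottom 1% cutoff: {bottom_1pct:.1f}")
print(f"top 1% cutoff:    {top_1pct:.1f}")
print(f"span:             {round(top_1pct) - round(bottom_1pct)} points")
```

Rounded to whole points, the two cutoffs are 63 and 137, which gives the 74-point span mentioned in the text.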

The Nature of Intelligence

American psychologists were interested primarily in the application of IQ measures to education; the early research on the nature of general intelligence came from Europe. In a groundbreaking publication, Charles Spearman (1904) showed that a dominant dimension (‘g’) runs through various collections of intellectual tasks (test items). Ostensibly, items used to form such groupings should be a ‘hotchpotch.’ Yet, all such items are positively correlated and when they are summed, the construct-relevant aspects of each coalesce through aggregation, whereas their construct-irrelevant (uncorrelated or unique) aspects vanish within the composite. Spearman and Brown formalized this property of aggregation in 1910. The Spearman-Brown Prophecy formula estimates the proportion of common or reliable variance running through a composite: $r_{tt} = \frac{k\,r_{xx}}{1 + (k - 1)\,r_{xx}}$ (where $r_{tt}$ = common or reliable variance, $r_{xx}$ = average item intercorrelation, and $k$ = number of items). This formula reveals how a collection of items with uniformly light positive intercorrelations (say, averaging r = .15) will form a composite dominated by common variance. Although each item on a typical intelligence test is dominated by unwanted noise, aggregation serves to amass the construct-relevant aspect (signal) from each item, while simultaneously attenuating their construct-irrelevant aspect (noise) within the composite. Aggregation amplifies signal, and attenuates noise.
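
To see how quickly aggregation builds up common variance, the Spearman-Brown formula above can be evaluated for composites of different lengths. This is a minimal sketch: the average item intercorrelation of .15 comes from the text, while the item counts are arbitrary illustrative choices.

```python
def spearman_brown(k: int, r_xx: float) -> float:
    """Proportion of common (reliable) variance in a composite of k items
    whose average intercorrelation is r_xx."""
    return k * r_xx / (1 + (k - 1) * r_xx)


# Average item intercorrelation from the text; item counts are illustrative.
for k in (1, 10, 40, 100):
    print(f"k = {k:>3}: r_tt = {spearman_brown(k, 0.15):.3f}")
```

A single item carries only 15% common variance, but a 40-item composite of such items carries about 88%, and a 100-item composite about 95%.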

At the phenotypic level, all modern intelligence tests measure essentially the same construct (‘g’) described at the turn of the century in Spearman’s (1904) paper, ‘“General intelligence,” Objectively determined and measured,’ although more efficiently and precisely. Not everyone is happy with this; some complain that this finding indicates a lack of progress, while others complain that ‘g’ is simple and complex human behavior is multiply determined. Yet, given the validity data that have accrued for g over the years, how much additional forecasting power can be reasonably anticipated from the ability domain? Everyone agrees that, for predicting complex human behavior, many things matter, and that we must remain open to new ways to forecast performance. In the meantime, however, writers as different as Snow (1989), an educational psychologist, and Campbell (1990), an I/O psychologist, underscore the real-world significance of general intelligence:

Given new evidence and reconsideration of old evidence, [g] can indeed be interpreted as ‘ability to learn’ as long as it is clear that these terms refer to complex processes and skills and that a somewhat different mix of these constituents may be required in different learning tasks and settings. The old view that mental tests and learning tasks measure distinctly different abilities should be discarded. (Snow, 1989, p. 22)

General mental ability is a substantively significant determinant of individual differences in job performance for any job that includes information processing tasks. If the measure of performance reflects the information processing components of the job and any of several well-developed standardized measures used to assess general mental ability, then the relationship will be found unless the sample restricts the variances in performance or mental ability to near zero. The exact size of the relationship will be a function of the range of talent in the sample and the degree to which the job requires information processing and verbal cognitive skills. (Campbell, 1990, p. 56)

Modern research on the nature of intelligence has focused on predicting with greater precision educational outcomes, occupational training, and work performance. Other researchers have extended the network of general intelligence’s external relationships to topics such as aggression, delinquency and crime, income and poverty. Some representative findings include correlations in the .70–.80 range with academic-achievement measures, .40–.70 with military training assignments, .20–.60 with work performance (higher values reflect job complexity), .30–.40 with income, and around .20 with law abidingness (see Brody, 1992; Gottfredson, 1997; Jensen, 1998, and references therein). Brand (1987, Table 2) documents a variety of light correlations between general intelligence and altruism, sense of humor, practical knowledge, response to psychotherapy, social skills, supermarket shopping ability (all positive correlates), and impulsivity, accident proneness, delinquency, smoking, racial prejudice, obesity (all negative correlates), among others. This diverse set of correlates reveals how individual differences in general intelligence ‘pull’ cascades of primary (direct) and secondary (indirect) effects. Murray’s (1998) 15-year analysis of income differences between biologically related siblings (reared together) who differed on average by 12 IQ points corroborates a handful of studies using a similar control for family environment (Bouchard, 1997), while not confounding socioeconomic status with biological relatedness.

IQ and Social Policy

The foregoing data are widely accepted among experts in the individual differences field (Carroll, 1993; Gottfredson, 1997; Jensen, 1998; Thorndike & Lohman, 1990). Yet research regarding general intelligence routinely stimulates contentious debate (Cronbach, 1975, American Psychologist), and this is likely to be with us always. Because measures of cognitive ability are used to allocate educational and vocational opportunities, they affect social policies. But psychometric data do not (and cannot) dictate policies for test use. Moreover, because the test scores and criterion performance of different demographic groups differ, concern about using these tests emerged shortly after Spearman’s (1904) initial article appeared (cf. Chapman, 1988; Jenkins & Paterson, 1961). Nevertheless, Robert Thorndike summarized his research findings (through the 1980s) on cognitive abilities as follows:

[T]he great preponderance of the prediction that is possible from any set of cognitive tests is attributable to the general ability that they share. What I have called ‘empirical g’ is not merely an interesting psychometric phenomenon, but lies at the heart of the prediction of real-life performances …

Remarks such as these highlight the importance of understanding general intelligence, its measurement, nature, and how best to nurture its development, because powerful scientific tools can be used wisely or unwisely, and their use is almost always accompanied by unintended (indirect) effects. So a number of wide-ranging studies have appeared over the last two decades, designed to reveal what this dimension forecasts relative to other psychological attributes. For example, John B. Carroll (1993) published his massive Human Cognitive Abilities, which included 467 data sets of factor-analytic work dating back to the 1920s. Psychological Science (1992) published a special section on ‘Ability testing,’ as did Current Directions in Psychological Science (1993). The National Academy of Sciences published two book-length special reports on fairness and validity of ability testing (Hartigan & Wigdor, 1989; Wigdor & Garner, 1982), while the Journal of Vocational Behavior launched two special issues, ‘The g factor in employment’ and ‘Fairness in employment testing’ (Gottfredson, 1986a, b; Gottfredson & Sharf, 1988, respectively). Finally, Sternberg’s (1994) two-volume Encyclopedia of Intelligence is an excellent source for examining, systematically, the landscape of psychological concepts, findings, history, and research about intelligence. Sternberg’s Encyclopedia, like his Advances series (Erlbaum), goes well beyond the psychometric assessment of cognitive abilities.

In the mid-1990s, Herrnstein and Murray (1994) published The Bell Curve: Intelligence and class structure, a controversial book that heightened the intensity of attention devoted to general intelligence and its assessment. Among other things, Herrnstein and Murray (1994) examined the relative predictive power of general intelligence and SES for forecasting a variety of social outcomes. Because of the controversy this volume stimulated, the American Psychological Association formed a special task force; the task force report, ‘Intelligence: Knowns and unknowns’ (Neisser et al., 1996), concluded that measures of general intelligence assess individual differences in ‘abstract thinking or reasoning,’ ‘the capacity to acquire knowledge,’ and ‘problem solving ability’ (Brody, 1992; Carroll, 1993; Gottfredson, 1997; Snyderman & Rothman, 1987), and that individual differences in these attributes affect facets of life outside of academic and vocational arenas, because abstract reasoning, problem solving, and rate of learning impact many aspects of life in general, especially in a computer-driven, information-dense society.

Recent Issues in Cognitive Ability Research

Flynn Effect

Raw score increases on measures of general intelligence definitely occur over time: average scores go steadily up as time goes by. Observed scores on intelligence tests have been steadily rising cross-culturally during most of the twentieth century, and this observation is called the ‘Flynn effect,’ after the man who documented it. Whether these increases reflect genuine gains in general intelligence within the general population is less clear. Increases can occur due to increases on a measure’s construct-relevant or construct-irrelevant variance, or both. The problem is complex, and it has generated a considerable amount of discussion (Neisser, 1998). However, the following suggests that the changes are at least in part due to construct-irrelevant aspects of measuring tools.

The magnitude of the Flynn effect is positively correlated with the amount of nonerror uniqueness of various measures of g. For example, population gains over time on the Raven Matrices are greater than gains on the Verbal Reasoning Composites of heterogeneous verbal tests, which, in turn, are greater than gains on broadly sampled tests of General Intelligence (aggregates of heterogeneous collections of numerical, spatial, and verbal problems). The Raven Matrices consist of approximately 50% g variance, whereas aggregated collections of cognitive tests approach 85%. The issue becomes more complex when we consider that test scores have also probably increased due to advances in medical care, dietary factors, and educational opportunities. Moreover, scores at the upper end of this dimension may not have increased much due to the gifted being deprived of appropriate developmental opportunities (Benbow & Stanley, 1996). Nonetheless, the Flynn effect definitely deserves intense study.

Whatever the reason for these raw score gains, the gains neither detract from nor enhance the construct validity of measures of general intelligence. Populations at different levels of ability, for example, typically show the same covariance structure with respect to the trait indicators under analysis (Lubinski, in press). It does not follow that mean changes on an individual difference dimension somehow attenuate the construct validity of measures purporting to assess it.

Vertical Inquiry

Jensen (1998) argues that basic research on general intelligence needs to identify more fundamental (biological) vertical paths, and develop more ultimate (evolutionary) explanations, for genuine advances to occur. Like other psychological constructs, general intelligence can be studied at different levels of analysis. By pooling studies of monozygotic and dizygotic twins reared together and apart, and a variety of adoption designs, the heritability of general intelligence in industrialized nations has been estimated to be between 60 and 80% (Bouchard, 1997). Using magnetic resonance imaging (MRI) technology, brain size correlates in the high .30s with general intelligence after removing the variance associated with body size (Jensen, 1998, pp. 146-149). Glucose metabolism is related to problem-solving behavior, and gifted people appear to engage in more efficient and less energy-expensive problem-solving behavior. Also, gifted people have enhanced right hemispheric functioning. The complexity of electroencephalograph (EEG) waves is positively correlated with g, as are the amplitude and latency of average evoked potential (AEP). Some investigators suggest that dendritic arborization (amount of branching) is correlated with g. In addition, a multidisciplinary team claims to have uncovered a DNA marker associated with g.

Proximal and Ultimate Investigations of Cognitive Ability

Bouchard and his colleagues have introduced a revision of experience producing drives (EPD) theory that concerns the development of human intelligence (Bouchard, Lykken, Tellegen, & McGue, 1996). EPD theory is a modification of the views of Hayes (1962), a comparative psychologist who studied the language and socialization capabilities of nonhuman primates. Like all organisms, humans were designed by evolution to do something; inherited EPDs facilitate skill acquisition by motivating individuals toward particular kinds of experiences and developmental opportunities. Moreover, these selective sensitivities operate in a wide range of environments (because the environments children evolved in were highly variable). Bouchard et al.’s formulation is consistent with developmental theories concerning the active role individuals take in structuring their environments (see Lubinski, in press).

Other investigators propose synthesizing evolutionary psychology with chronometrical procedures for measuring inspection time (Jensen, 1998), i.e., perceptual discrimination of stimulus configurations that typically take less than one second for average adults to perform with essentially zero errors. Theoretically, elementary cognitive tasks can be used to index the time required for information processing in the nervous system. Despite the measurement issues in this area of research, it appears that the time to perform elementary cognitive tasks covaries negatively with g (faster processing is associated with higher g levels). Inspection-time measures have also been used to assess individual differences in cognitive sophistication among nonhuman primates.

This procedure may be a vehicle for the comparative study of the biological underpinnings of general cognitive sophistication, comparable to using sign language to study language learning in nonhuman primates. Primatologists have always recognized the range of individual differences in cognitive ability in primate groups. Premack (1983, p. 125) noted, pertaining to individual differences in language versus non-language trained groups of chimpanzees:

Although chimpanzees vary in intelligence, we have unfortunately never had any control over this factor, having to accept all animals that are sent to us. We have, therefore, had both gifted and non-gifted animals in each group. Sarah is a bright animal by any standards, but so is Jessie, one of the non-language-trained animals. The groups are also comparable at the other end of the continuum, Peony’s negative gifts being well matched by those of Luvy.

Some researchers propose that individual differences in processing stimulus equivalency (verbal/symbolic) relationships are a marker of general intelligence. If such individual differences are linked to individual differences in CNS microstructure within and between the primate order and these, in turn, are linked to observations such as Premack’s, ‘all of the ingredients are in place to advance a comparative psychology of mental ability.’ If individual differences in cognitive skills are linked to more fundamental biological mechanisms, we would have an especially powerful lens through which to view common phylogenetic processes involved in cognitive development. Research developments on this front will be interesting to follow, and may even assuage E. O. Wilson’s (1998, p. 184) concern: ‘[S]ocial scientists as a whole have paid little attention to the foundations of human nature, and they have had almost no interest in its deep origins.’