Psychologists and Their Theories for Students. Editor: Kristine Krapp. Volume 1, Gale, 2005.
Alfred Binet is best remembered as the developer of the first useful test for measuring intelligence. Along with Théodore Simon, Binet developed the Binet-Simon Scale, the forerunner of modern IQ tests. Binet’s original goal for the scale was relatively modest and very practical. In the early years of the 1900s, the French government had just enacted laws requiring that all children be given a public education. For the first time, mentally “subnormal” children—those who today might be called mentally retarded or developmentally disabled—were to be provided with special classes, rather than simply ignored by the schools. However, this raised the issue of how to identify which children would benefit from special programs. Binet and Simon set out to solve this problem. In the process, they developed a revolutionary approach to testing mental abilities.
Yet intelligence testing was only one small part of Binet’s highly productive career. Although his work was cut short when he died at age 54, he still managed to author almost 300 published books, articles, and reviews. His wide-ranging interests included sensitivity to touch, mental associations, hypnosis, child development, personality, memory, eyewitness testimony, and creativity, to name just a few. The breadth of his interests led him to study a wide spectrum of the population, including schoolchildren, experts at chess and mental arithmetic, authors, mentally retarded individuals, and his own two daughters.
Nevertheless, Binet is mainly remembered for his groundbreaking intelligence test. It was so useful for predicting school performance that a variation, the Stanford-Binet Intelligence Scales, is still in use today. In a 1930 essay, Lewis Terman, the American psychologist who developed the Stanford-Binet, described his great predecessor this way: “My favorite of all psychologists is Binet; not because of his intelligence test, which was only a by-product of his life work, but because of his originality, insight, and openmindedness, and because of the rare charm of personality that shines through all his writings.”
- The Psychology of Reasoning. Paris: Alcan, 1886. Translated by A. G. Whyte. Chicago: Open Court, 1899.
- With Charles Féré. Animal Magnetism. Paris: Alcan, 1887. New York: Appleton, 1892.
- The Experimental Study of Intelligence. Paris: Schleicher Frères, 1903.
- With Théodore Simon. “New Methods for the Diagnosis of the Intellectual Level of Subnormals.” L’Année psychologique 12 (1905): 191-244.
- With Théodore Simon. “A Method of Measuring the Development of the Intelligence of Young Children.” Bulletin de la Société Libre pour l’Etude Psychologique de l’Enfant 70-1 (1911): 187-248. Translated by C. H. Town. Chicago: Medical Book Co., 1913.
- With Théodore Simon. The Development of Intelligence in Children. Translated by E. S. Kite. Vineland, NJ: Publications of the Training School at Vineland, 1916.
- With Théodore Simon. “The Development of Intelligence in Children.” L’Année psychologique 14 (1908): 1-94. Translated by E. S. Kite. Baltimore: Williams & Wilkins, 1916.
- Modern Ideas About Children. Paris: Flammarion, 1909. Translated by Suzanne Heisler. Menlo Park, CA: 1984.
Binet’s life is notable for both its successes and its failures. On one hand, Binet’s intelligence test became one of the most influential tests in the history of psychology. On the other hand, his innovative ideas about child development and memory had a much more limited impact. Both of these results can be traced, at least in part, to the independence that marked Binet’s career. Self-taught in psychology, he never held a position as a university professor. This kept him from building alliances with other professors and from training many students to follow in his footsteps. Yet it also gave him free rein to nurture his own tremendous curiosity and creativity.
The Early Years
Binet was born on July 8, 1857, in Nice, France. He was the only child of a father who was a physician and a mother who dabbled in art. His wealthy parents separated when he was young, leaving his mother, Moïna Binet, with most of the responsibility for raising him. Until age 15, Binet attended school in Nice. He also spent some summers at a boardinghouse in England, where he undoubtedly improved his fluency in English. This paid off later, when he was able to read the English and American psychological literature.
Once Binet turned 15, his mother took him to Paris so that he could attend a renowned school, the Lycée Louis-le-Grand. Binet studied there for three years. Upon graduating, he had trouble deciding what career path he wanted to pursue. He first earned a law license in 1878; however, he seems to have almost immediately concluded that practicing law was not for him. Next came a brief stint studying medicine. There was a strong medical tradition in his family; his father and both of his grandfathers had been physicians. This choice, too, proved short-lived. Binet suffered an emotional breakdown and dropped out of medical school.
False Starts and Lessons Learned
Discouraged and directionless, Binet began spending time in the Bibliothèque Nationale, a great library in Paris. There, he started browsing through books on psychology. He was fascinated by what he found. In particular, his interest was drawn to experiments on the two-point threshold, the smallest distance at which touching the skin at two different points at once is felt as two sensations rather than just one. Previous research had shown that this distance varied from one part of the body to another. For example, the distance was about 30 times greater on the small of the back than on the tip of the index finger. Several theories had been proposed explaining the differences. After trying a few simple experiments on himself and his friends, Binet concluded that these theories contained some errors. In 1880, he published his ideas in a paper titled “On the Fusion of Similar Sensations.” He soon learned a lesson about the hazards of rushing into print. Joseph Delboeuf, a Belgian physiologist who had already done much more complex research on the subject, published an article outlining the flaws in Binet’s work. Fortunately, Binet’s interest in psychology was strong enough to withstand the blow.
Early on, Binet became an avid reader of British philosopher John Stuart Mill. In his theory of associationism, Mill had proposed that the flow of thoughts and ideas through a person’s consciousness was controlled by the associations among these ideas. Mill had also outlined the basic laws that he believed determined which ideas would arise from a particular thought. In 1886, Binet published his first book, a fervent defense of associationism. In the book, titled The Psychology of Reasoning, Binet argued that the laws of associationism could explain everything that happened in the mind. Yet cracks in this theory had already become apparent. For example, associationism was unable to explain how one starting idea might lead to totally different trains of thought under different circumstances. Binet realized that he was on shaky ground once again. He soon gave up the position that associationism alone could explain all mental phenomena. However, he never stopped believing in the great, although incomplete, power of mental associations. Years later, he would argue that intelligence could not be studied without considering an individual’s personal associations, circumstances, and experiences.
Not all of Binet’s early ideas about psychology came from books. In 1883, Binet began working as an unpaid researcher for Jean Martin Charcot, director of the Salpêtrière, a famous hospital in Paris. Charcot was one of the most esteemed neurologists in the world. At the time, he was studying hypnosis, a temporary state of altered attention. Charcot noted that, under hypnosis, good subjects often became unable to move, insensitive to pain, or unable to remember what had happened. These were very much like the symptoms seen in patients with hysteria, a mental disorder in which people had physical ailments when no physical cause could be found. In fact, the similarities were so striking that Charcot jumped to some wrong conclusions. He believed that the ability to be hypnotized was actually a sign of hysteria. He also believed that the unusual behavior seen under hypnosis was caused by some underlying feature of the nervous system. In fact, it turned out to be caused by nothing more than the subject’s response to suggestions given by the hypnotist.
When Binet first arrived at the Salpêtrière, however, he accepted the older man’s theories without question. Binet and a young doctor named Charles Féré spent the next seven years doing research under Charcot’s guidance. The two researchers were assigned to study a woman named Blanche Wittmann, called Wit in their writings. Recalling the days when hypnotism was known as “animal magnetism,” Binet and Féré found that they could reverse Wit’s physical symptoms or emotional state under hypnosis simply by reversing a magnet. One minute, Wit would be laughing. The next minute, with a turn of the magnet, she was sobbing. Not surprisingly, when Binet and Féré published their findings, other scientists reacted with skepticism. One skeptic was Delboeuf, the same physiologist who had debunked Binet’s earlier work on the two-point threshold. Delboeuf finally traveled to Paris to observe Wit in person. He immediately saw the obvious: The hypnotist was reversing the large magnet right in front of Wit. It seemed clear that Wit was responding to the hypnotist, rather than the magnet. At first, Binet defended his findings. Slowly, however, the truth dawned. He was forced to admit that he had been blinded by Charcot’s reputation.
Binet’s career was off to a rocky start. After public missteps in work on the two-point threshold, associationism, and hypnosis, Binet appeared destined for anything but greatness. Yet these setbacks just seemed to strengthen his resolve to move ahead and make his mark on psychology.
The Psychologist at Home
The years at the Salpêtrière were a time of growth and change in Binet’s home life as well. In 1884, Binet married Laure Balbiani, daughter of biologist E. G. Balbiani. Two daughters soon followed: Madeleine, born in 1885, and Alice, born in 1887. Ever the scientist, Binet began coming up with tests and puzzles for his young daughters to solve. He proved to be a keen observer of their developing minds and personalities. In papers about his observations, Binet called the girls Marguerite and Armande.
Many of the first tests Binet tried were based on the ones used by two earlier pioneers in intelligence research, Francis Galton and James McKeen Cattell. Both men had tried to measure mental ability using physiological tests. For example, some tests measured reaction time, the split-second needed for mental processing between the time when an event occurs and the time when the muscles start responding to it. Such tests were thought to measure how efficiently the nervous system worked. Other tests, such as the two-point threshold, measured the sharpness of the senses. The idea was that intelligence requires information, and this information comes from sensations.
When Binet tried reaction-time tests with his daughters and their young friends, he found that their average reaction times were indeed longer than those of adults. However, the children’s individual reaction times varied widely. Sometimes, the children reacted just as quickly as adults, but other times, they were much slower. Binet concluded that the real difference between children and adults was not in the speed with which they could react, but in their ability to pay attention to the task. When the children’s attention wandered, as it often did, their reaction times suffered. These observations led Binet to doubt that simple physiological tests could ever be useful for sorting out the differences between immature and mature minds. Instead, it seemed that more complex tests, such as those requiring sustained attention, would be needed. This realization probably played a role in shaping the kinds of tasks Binet chose for his intelligence test years later.
In hindsight, many of the ideas that Binet formed about child development seem ahead of their time. Several of them appear to foreshadow the later work of Jean Piaget, the famous Swiss psychologist who described four stages in children’s mental development. Like Piaget, Binet believed that the purpose of mental development was to adapt effectively to the demands of the outside world. He also thought that new information was incorporated into existing ways of thinking. In addition, he believed that intelligence played a role in all human activities, from the simple to the complex.
Binet did not believe in distinct stages of development. Yet some of his descriptions of mental differences between children and adults come close to Piaget’s descriptions of various stages. For example, Binet noted that a young child might be struck by a detail on an object that an adult would overlook. Yet that same child might be unable to see the object as a whole the way an adult could.
Might the similarities between the ideas of Binet and those of Piaget be more than just coincidence? The answer is still unclear. Piaget never acknowledged any such influence. After Binet’s death, however, Piaget spent time working in Paris with Simon, coauthor of the Binet-Simon Scale. In this setting, it seems likely that some of Binet’s ideas might have rubbed off on Piaget.
Along with watching his daughters’ developing mental abilities, Binet also observed their personality differences. Madeleine tended to be thoughtful and cautious in her actions, while Alice tended to be impulsive and easily distracted. This observation convinced Binet that problem-solving was a matter not only of ability level, but also of personal style. It was another theme that would reappear in his later work on intelligence.
A Second Chance at Success
After the split from Charcot, Binet found himself at loose ends. Although his family wealth meant he did not need to work for money, he was still eager to get on with his research. In 1891, Binet happened to meet Henri Beaunis in a railway station at Rouen, France. Beaunis, a physiologist, was director of the new Laboratory of Physiological Psychology at the Sorbonne, a world-famous college in Paris. During the hypnosis controversy, Beaunis had publicly criticized Binet. It must have taken courage and perhaps desperation on Binet’s part to ask Beaunis for a job in his lab. Yet that is exactly what Binet did, offering to work without pay. Beaunis, for his part, was struggling to staff the lab with limited funds. He agreed to give Binet a position. It turned out to be an excellent bargain. In 1895, when Beaunis retired, Binet took over as director. This job, which Binet held until his death, lent him legitimacy and gave him freedom to pursue his own research ideas.
Binet flourished at the Sorbonne laboratory. The events during just two years, 1894-95, show how amazingly productive he could be, given the right environment. During this period, Binet published two books. One was an introduction to experimental psychology, and the other described his research on experts at chess and mental calculations. He and Beaunis also founded and edited the first French psychological journal, L’Année psychologique, for which Binet himself wrote 85 reviews and four original articles. In addition, Binet was appointed to the board of a new American journal, Psychological Review. At the same time, he studied optical illusions and developed a method for making a graphic record of piano playing. With Jacques Passy, he studied dramatic authors. With Victor Henri, he studied memory in schoolchildren.
Somehow, Binet also found time to finish his doctoral degree in 1894. Six years earlier, he had begun studying biology in his father-in-law’s laboratory. Over time, he grew fascinated by the behavior, anatomy, and physiology of insects. His thesis, titled “A Contribution to the Study of the Subintestinal Nervous System of Insects,” was filled with detailed drawings, most of which he made himself. This detour into natural science just added to Binet’s credentials as a well-rounded scientist and skilled observer.
Binet continued to be very interested in child development as well. With the authority of his new job behind him, he was no longer limited to just studying his own daughters. Now, he could gain access to the schools to observe subjects of all ages. During this period, Binet and Henri conducted studies of children’s memory that are still surprisingly up-to-date. In experiments on prose memory, the researchers presented schoolchildren with paragraphs, and then asked the children to write down what they remembered. The researchers found that the children tended to remember general ideas better than specific words. The longer the delay between presentation and recall, the more pronounced this difference became. Also, the more important an idea was within the overall paragraph, the more likely it was to be recalled. Binet and Henri concluded that memory processes for connected ideas and memory processes for isolated words were totally different. Once again, Binet was ahead of the curve. These findings were eventually borne out by studies on prose memory in the 1970s.
Binet’s research also foretold later findings on eyewitness testimony. In one study, Binet presented schoolchildren with a poster depicting several objects and a scene. The children were allowed to look at the poster for just a matter of seconds. Afterward, they were asked about what they remembered. The answers tended to vary depending on how the questions were worded, a result that has been confirmed many times in recent years.
Although Binet had clearly learned the value of testing his ideas in larger groups of subjects, he also continued to conduct in-depth case studies of individuals. By studying a handful of individuals with extraordinary skill at playing chess or doing mental arithmetic, he explored the nature and limits of these mental abilities. By studying the working habits of leading French authors, he explored creativity. Of course, Binet’s longest-running case studies were of his own daughters. As they grew older, he continued to test them on everything from number judgment and memory to inkblot interpretation and storytelling. He described the results from 20 of these tests in a 1903 book called The Experimental Study of Intelligence. Despite its title, however, the book was less about intelligence than about general mental development and personality.
The Stage Is Set for Greatness
In 1896, Binet and his assistant, Henri, published a paper describing what they called “individual psychology.” As they explained it, general psychology dealt with broad psychological properties that are common to everyone. Individual psychology, in contrast, dealt with properties that vary from one person to another. Their aim was to study this variation both within and across individuals. In order to do that, however, Binet soon realized that he needed practical tests of psychological functioning. He set an ambitious goal for himself: to devise a series of such tests that could be given in less than two hours and would assess 10 major psychological processes. The processes were memory, imagery, imagination, attention, comprehension, suggestibility, aesthetic sentiment, moral sentiment, muscular strength and willpower, and motor ability and eye-hand coordination.
Unfortunately, the tests Binet and Henri devised were a flop. In one influential study, Stella Sharp, a graduate student at Cornell University, gave the tests to seven of her fellow psychology students. She found little evidence of a meaningful pattern in the scores. There was also a troubling lack of relationship among the scores for subtests that were supposed to measure the same ability. Binet himself found similarly disappointing results. In 1904, after eight years of effort, Binet admitted defeat. Today, the goal of developing a quick yet complete test of psychological functioning remains elusive. Yet Binet’s time had not been wasted. It had prepared him well for his next challenge: devising an intelligence test.
Several other events also helped to set the stage for Binet’s achievement. In 1899, Simon began to perform doctoral research under Binet’s supervision. At the time, Simon was a young doctor working at a large institution for the mentally retarded, and Binet was eager to try out his tests on this new group of subjects. Their collaboration was the most fruitful of Binet’s career, and the two researchers became close friends.
The next year, Binet played a key role in organizing the new Free Society for the Psychological Study of the Child. This was a group of psychologists and educators who banded together to seek solutions to problems facing the schools. Binet became a leader of the group and founded its Bulletin for publishing members’ research. One of the most pressing problems was how to carry out new laws requiring that all French children be provided a public education. This included mentally retarded children, who in earlier years would never have gone to school or would have dropped out early. In 1904, the French government appointed Binet to a commission that was charged with improving the education of this previously overlooked group of children.
Binet soon zeroed in on a critical problem: identifying which children should be considered mentally retarded and placed in special educational programs. Binet and Simon set out to solve this problem by developing a test. Traditionally, mentally retarded individuals had been divided into three categories: profoundly retarded (called idiots), moderately retarded (called imbeciles), and mildly retarded. Binet called the mildly retarded group débiles, or “weak ones.” His English translators later substituted the term moron, from a Greek word meaning “dull.” The test was intended to sort out children who belonged in one of these categories from the children whose intelligence could be considered normal. The first Binet-Simon Scale was introduced in 1905. That same year, Binet opened a research center in the school at Belleville, a working-class neighborhood of Paris. The next several years were spent improving his test. Revisions followed in 1908 and 1911.
Triumphs and Disappointments
Binet was busy revising the scale when he died in Paris on October 18, 1911. He was at the height of a remarkable career. Binet’s final years, however, were marked by disappointments as well as triumphs. Perhaps the greatest disappointment was his failure to secure a position as a university professor. In 1895, Binet visited the University of Bucharest in Romania as a guest lecturer. His lectures were a hit with the students, and he was invited to stay on as a professor. He turned down the offer, partly because he hoped to get a similar post in France. As was the custom of the time, he proposed himself for two such positions: one at the College of France, and one at the Sorbonne. He was not chosen for either post, however.
Binet’s family life had once been a source of comfort. He and his wife lived in Paris when they were first married, but they later moved to a suburb called Meudon. The Binets stayed there until 1908, when they returned to Paris. Life in Meudon seems to have been quite pleasant for several years. The family shared interests in art and drama. They also enjoyed a lovely home and garden, pets, bicycling, long walks, and summer vacations.
After about 1900, however, Binet’s family life took a turn for the worse. His wife became depressed and ill, and the couple rarely went out socially. His daughters had been isolated, too, since they were schooled at home. As the girls grew into young women, Binet worried about their ability to form healthy friendships. He also fretted about Alice’s health and Madeleine’s marriage, of which he did not approve. The gloomy atmosphere at home may have been reflected in Binet’s hobby. In the last years of his life, he wrote plays with dramatist André de Lorde, nicknamed “The Prince of Terror.” The plays all dealt with ghoulish themes, such as a released mental patient who committed murder and a scientist who tried to bring his dead daughter back to life.
In the ultimate irony, Binet’s own intelligence test was largely ignored and even ridiculed in France during his lifetime. It was already being hailed abroad, however. After Binet’s death, his test and those that followed had a profound impact on psychology, education, and society at large. Binet’s name became forever linked with intelligence tests.
Although Binet intended his intelligence test to be a practical tool, it became impossible to separate this tool from the theoretical questions it raised: What is intelligence? How can it be tested? And how should researchers use the test results? These questions remain at the heart of a lively debate over intelligence testing.
Binet’s ideas about intelligence were rooted in his earlier theory of individual psychology. He continued to stress variation, both within and across individuals. Based on his previous work, Binet was also convinced that such individual differences could best be detected by studying complex mental processes, such as memory, attention, imagination, and comprehension.
What is intelligence? Binet was always more concerned with measuring intelligence than with defining it. Nevertheless, the test he developed embodied his ideas about the nature of intelligence. Binet believed that intelligence was not a single entity. Instead, he viewed it as a collection of specific processes. Therefore, any general test of intelligence needed to sample the whole range of mental processes, rather than just one or two isolated abilities.
Binet also believed that people’s mental abilities differed in quality as well as quantity. His observations of his daughters apparently convinced him of this point. From a very young age, Madeleine seemed to think things through more carefully, while Alice seemed to act more impulsively. When the girls were learning to walk, for example, Binet noticed that Madeleine would go only to objects a short distance away. Alice, on the other hand, would head straight for an empty part of the room, apparently unconcerned about whether or not it contained an object she could grab for support.
Based on such observations, Binet was well aware that two children might arrive at the same overall result on his test by two very different paths. He wrote about the importance of noting the specific errors made by a child on the test, in order to get a more complete picture of how that child’s mind worked. Unlike many psychologists who followed, Binet was unwilling to reduce a person’s whole intelligence to a single number. In fact, the concept of an IQ score was not introduced until after Binet’s death.
Binet also believed that intelligence was changeable within limits, rather than fixed; consequently, an individual’s intelligence level could be raised through proper education. Binet acknowledged that each person probably had an upper limit, but he thought that very few people came close to reaching it. Therefore, there was usually room for improvement. This was especially true of the mentally retarded children that Binet’s test was designed to identify. In a 1909 book, titled Modern Ideas About Children, Binet decried the “brutal pessimism” of psychologists and educators who believed intelligence to be fixed at a set level.
Binet never set forth a rigorous definition of intelligence. In a 1905 paper, however, he and Simon argued that judgment played a central role:
It seems to us that in intelligence there is a fundamental faculty, the alteration or lack of which, is of the utmost importance for practical life. This faculty is judgment, otherwise called good sense, practical sense, initiative, the faculty of adapting one’s self to circumstances. To judge well, to comprehend well, to reason well, these are the essential activities of intelligence.
To Binet, the very essence of intelligence was rooted in practical experience.
How can intelligence be tested? To develop his test, Binet started with groups of children who had been identified by teachers or doctors as mentally retarded or of normal intelligence. Binet then had both groups perform a wide variety of tasks. He hoped to find tasks that would clearly differentiate the groups. He quickly ran into a snag, however. It proved nearly impossible to find tasks that were almost always done successfully by the normal intelligence group, but almost never by the retarded group. There was always some overlap in the results.
Then, Binet had one of the most important insights of his career. He realized that age made a critical difference. Both the retarded children and those with normal intelligence might eventually master the same skill. However, the normal intelligence children did so at a younger age. This idea has become so widely accepted that it seems like common sense today. Before Binet, however, other researchers had missed the crucial connection.
With this insight as a starting point, Binet and Simon came up with 30 tasks of gradually increasing difficulty. The simplest tasks were at the very basic level of intelligence seen in normal infants or in the most profoundly retarded children of any age. The hardest tasks could be passed easily by normal 11- or 12-year-olds, but were beyond the grasp of even the oldest and most capable retarded children. These items, and the others in between, made up the first Binet-Simon Scale of 1905.
A child’s score on the total scale revealed his mental level. For example, a seven-year-old child who passed all the tasks normally passed by children of his age would have a mental level of seven. However, if that same child could only pass the tasks normally passed by five-year-olds, he would have a mental level of five. Binet noted that it was common for children to have a mental level that lagged behind their chronological age by a year. Most of these children did fine in a regular classroom. If a child’s mental level trailed his chronological age by at least two years, however, and if the child came from an ordinary French background and was healthy and alert when he took the test, then a diagnosis of mental retardation could be considered.
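The decision rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the parameter names, and the two boolean flags are assumptions introduced here, while the two-year threshold comes from the text. The rule says only when a diagnosis could be considered; it does not make one.

```python
def diagnosis_may_be_considered(chronological_age: int,
                                mental_level: int,
                                ordinary_background: bool = True,
                                healthy_and_alert: bool = True) -> bool:
    """Return True when the lag is at least two years and the test conditions hold.

    All names are hypothetical; only the two-year lag and the two
    conditions (background, health and alertness) come from the text.
    """
    lag = chronological_age - mental_level  # years behind chronological age
    return lag >= 2 and ordinary_background and healthy_and_alert


# A seven-year-old with a mental level of five lags by two years.
print(diagnosis_may_be_considered(7, 5))   # True
# A one-year lag was common and usually no cause for concern.
print(diagnosis_may_be_considered(7, 6))   # False
```

Keeping the background and alertness conditions as explicit arguments reflects the caveats in the text: the lag alone was not meant to settle the question.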
Binet wanted his test to be psychological rather than educational. Therefore, he avoided tasks that relied heavily on reading, writing, and other school-related skills. Yet he also believed that the test should assess judgment in lifelike situations. Therefore, he included many tasks that required a basic knowledge of French culture and life. Binet knew this meant that his test would only be valid for children who had grown up in the mainstream French culture, but he reasoned that it would be able to accurately assess most of the French schoolchildren for whom the test was designed.
Although the Binet-Simon Scale of 1905 was a groundbreaking achievement, it had some flaws. For one thing, the mental levels were based on research that had studied only 50 normal-intelligence children and 45 mentally retarded children. Therefore, the levels provided only rough guidelines. In addition, more than half of the tasks were geared to very young or severely retarded children. Yet, in real life, most of the tough decisions that needed to be made involved older children around the borderline between mental retardation and normal intelligence. Binet and Simon attempted to correct these flaws in the 1908 and 1911 revisions of the scale.
To do this, the researchers set out to expand and refine the tasks that made up the test. Starting in 1905, they tested numerous tasks in a larger number of children between the ages of three and 13. For a task to be assigned a mental level of seven, for example, it had to be passed by only a few of the six-year-olds, most of the seven-year-olds, and even more of the eight-year-olds with normal intelligence. Of course, not all tasks broke down neatly this way. By 1908, however, Binet and Simon had found 58 tasks that met their criteria. These refined tasks made up the 1908 revision of the Binet-Simon Scale.
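The criterion for assigning a task to an age level can be illustrated with a short Python sketch. The text gives no numbers for "only a few" or "most," so the cutoffs below (25 percent and 60 percent) are purely hypothetical, as are the function and parameter names. The sketch shows the shape of the rule, not the figures Binet and Simon actually used.

```python
def fits_level(pass_rates: dict[int, float], level: int,
               few: float = 0.25, most: float = 0.60) -> bool:
    """Check whether a task fits a given mental level.

    pass_rates maps an age in years to the fraction of normal-intelligence
    children of that age who passed the task. The `few` and `most`
    cutoffs are hypothetical; the source does not state them.
    """
    younger = pass_rates.get(level - 1)
    same = pass_rates.get(level)
    older = pass_rates.get(level + 1)
    if younger is None or same is None or older is None:
        return False  # not enough data for the three neighboring ages
    # Few younger children pass, most same-age children pass,
    # and at least as many older children pass.
    return younger <= few and same >= most and older >= same


rates = {6: 0.20, 7: 0.70, 8: 0.90}  # made-up pass rates for one task
print(fits_level(rates, 7))  # True
print(fits_level(rates, 8))  # False: no data for nine-year-olds
```

As the text notes, not all tasks broke down this neatly, which is why a check of this kind would reject many candidate tasks.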
That same year, Simon left Paris to become director of a mental hospital in Rouen. He and Binet continued to work together afterward, but not as closely as before. Meanwhile, Binet expanded the intelligence scale up to a mental level of 15. He also adjusted the test so that there were exactly five items for each age level. In an effort to better standardize and quantify the test, Binet came up with a formula. It calculated the mental level of a child by counting one-fifth of a year for each subtest passed. Binet worried that dividing year levels into fifths implied a misleading degree of precision, however. He warned that the fractions “do not merit absolute confidence.” Even for the same person, they could vary noticeably from one test-taking to another. The higher age level and the new formula were included in the 1911 revision of the test.
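The one-fifth-of-a-year formula can be sketched as follows. The text states only that each subtest passed counted for one-fifth of a year; the idea of starting from a base age at which the child passed every item, and the function and parameter names, are assumptions made for this illustration.

```python
from fractions import Fraction

ITEMS_PER_AGE_LEVEL = 5  # the 1911 revision had exactly five items per age level


def mental_level(base_age, extra_items_passed):
    """Return a mental level in years as an exact fraction.

    base_age: highest age level at which the child passed all items
              (an assumed starting point for this sketch).
    extra_items_passed: number of items passed above that level,
              each counted as one-fifth of a year.
    """
    if extra_items_passed < 0:
        raise ValueError("extra_items_passed must not be negative")
    return base_age + Fraction(extra_items_passed, ITEMS_PER_AGE_LEVEL)


# A child who passes everything at level 7 plus three harder items.
level = mental_level(7, 3)
print(level)  # 38/5, that is, 7 and 3/5 years
```

The fractional result looks precise, which is exactly the impression Binet cautioned against: the same child might pass a different handful of items on another day.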
How should intelligence test results be used? For Binet, there were at least two reasons why intelligence test results should not be considered exact measurements of mental ability. One, the test itself was imperfect, containing sources of error and unreliability. Two, he believed intelligence could change over time. The latter view set Binet apart from some of the psychologists who expanded upon his test in the decades after his death. It also led Binet to recommend frequent retesting.
Before the Binet-Simon Scale, children had been placed in special educational programs based on nothing more than subjective opinions. Binet knew that such opinions were often biased. For example, teachers in regular schools might label troublemakers as mentally retarded to get them out of their classes. Conversely, teachers in special schools might exaggerate their students’ achievements to make themselves look good. Likewise, parents might understate their children’s mental ability to escape responsibility for them. Or they might overstate it to avoid embarrassment. Even professional evaluators tended to be quite inconsistent. For example, one principal claimed not to have a single mentally retarded child at his school, while another claimed to have 50 of them. Clearly, a more objective means of assessment was needed.
Binet argued that his test should be adopted for two reasons. First, it avoided the bias and inconsistency that occurred when placement decisions were based strictly on subjective opinions. Instead, the test was rooted in objective data. Second, the test tried to assess mental capability rather than school-based learning. Therefore, a child’s performance on the test was thought to be relatively independent of his or her past school experiences.
Binet thought his test could identify which children would be able to succeed in regular classrooms and which would need special educational programs. He also believed, however, that the categories of normal and retarded were not carved in stone. Steps could be taken to raise the intelligence of mentally retarded children, at least to a degree. To this end, he helped design a series of exercises called “mental orthopedics.” Binet had noted that retarded children, much like young children of normal intelligence, had trouble paying attention to anything for very long. Therefore, many of the exercises were geared to helping children increase their attention span. For example, one exercise was the game Statue. The teacher would give a signal to freeze, and the children would try to hold their position until they were told to relax.
Binet had begun his career by studying mental ability using simple physiological measures, such as the two-point threshold and reaction time. Eventually, however, he concluded that measures of complex mental processes—such as memory, attention, imagination, and comprehension—were needed to sort out individual differences in intelligence. Therefore, his intelligence test included tasks that were intended to assess these complex processes.
Binet-Simon Scale of 1905
The first Binet-Simon Scale included 30 items. They are listed below in order from easiest to most difficult.
- Le regard. This item tested a child’s ability to follow a lighted match with his or her eyes. The goal was to assess a very basic capacity for attention.
- Prehension provoked by a tactile stimulus. This item tested a child’s ability to grasp a small object placed in his or her hand, hold it without letting it fall, and carry it to the mouth.
- Prehension provoked by a visual perception. This item was similar to the previous one; however, it tested a child’s ability to reach for and grab an object placed within his or her view.
- Recognition of food. In this task, a piece of chocolate was placed next to a little cube of wood. The aim was to see whether the child could tell by sight alone which of the objects was food.
- Quest of food complicated by a slight mechanical difficulty. In this task, a piece of candy was shown to the child and then wrapped in paper. The aim was to see whether the child would unwrap the candy.
- Execution of simple commands and imitation of simple gestures. This item tested whether the child knew how to shake hands with the examiner and comply with simple spoken or gestured commands. The goal was to assess very basic social and language skills.
Children with normal intelligence could pass the first six items on the test by age two. Some of the items, however, were too difficult for the most profoundly retarded children. Therefore, profound retardation came to be defined as a mental level no higher than that of a two-year-old with normal intelligence, including the inability to interact socially and use language.
- Verbal knowledge of objects. In this task, the examiner asked the child to point to various parts of the body. The child was then asked to give the examiner various common objects, such as a cup and a key.
- Verbal knowledge of pictures. In this task, the child was asked to point to familiar objects in a picture, such as a window and a broom.
- Naming of designated objects. This item was the opposite of the previous one. Using another picture, the examiner pointed to familiar objects and asked the child to name them.
- Immediate comparison of two lines of unequal lengths. In this task, the child was shown pieces of paper with pairs of lines on them. One line was always 4 cm long; the other, 3 cm. The child was asked to indicate which line was longer.
- Repetition of three figures. This item tested a child’s ability to repeat back a string of three numbers.
- Comparison of two weights. In this task, the child was shown two boxes that looked identical, but were of different weights. The child was asked to decide which box was heavier.
- Suggestibility. In some of the previous tasks, the examiner would make false suggestions to see how the child would respond. For example, after asking the child to point to various common objects, the examiner would ask the child about an object that was not there.
- Verbal definition of known objects. This item tested a child’s ability to give simple definitions for familiar things, such as a house and a fork.
- Repetition of sentences of 15 words. This item tested a child’s ability to repeat back sentences averaging 15 words long.
These last nine items on the test could be passed by children with normal intelligence by age five. The items assessed simple vocabulary and language skills as well as basic judgment and memory. The sentence-repetition item was considered the cut-off point for moderate retardation. That is, moderately retarded children were thought to operate at the level of a two- to five-year-old with normal intelligence.
- Comparison of known objects from memory. In this task, the child was asked to state the differences between pairs of common objects, such as a piece of wood and a piece of glass.
- Exercise of memory on pictures. In this task, the child was shown several pictures of familiar objects for a brief time. The child was then asked to name the objects from memory.
- Drawing a design from memory. In this task, the child was briefly shown two geometric designs, then asked to draw them from memory.
- Immediate repetition of figures. This item was identical to the earlier one in which the examiner asked the child to repeat back a string of three numbers. Now, however, the examiner gave greater weight to the nature of any errors.
- Resemblances of several known objects given from memory. In this task, the child was asked to state the similarities between sets of objects, such as a fly, an ant, a butterfly, and a flea.
- Comparison of lengths. In this task, the child was shown pieces of paper with pairs of lines on them. The child was asked to indicate which line was longer. While this was similar to an earlier task, the differences in line lengths were smaller this time.
- Five weights to be placed in order. This item required the child to arrange five identical-looking boxes in order of heaviness. The boxes varied in weight from 3 grams to 15 grams.
- Gap in weights. After the previous task, one of the middle boxes was removed while the child closed his or her eyes. The child was then asked to figure out which box was missing by hand-weighing.
- Exercise upon rhymes. This item tested the child’s ability to name words that rhymed with the French word obéissance.
- Verbal gaps to be filled. This item tested the child’s ability to fill in the blanks in simple spoken sentences. For example, one sentence was: “The weather is clear, the sky is (blue).”
- Synthesis of three words in one sentence. In this task, the child was given three words: “Paris,” “river,” and “fortune.” The child was then asked to make up a sentence using all the words.
- Reply to an abstract question. This item tested the child’s ability to answer 25 questions dealing with practical problem-solving and social judgment. The questions ranged from very easy to fairly difficult. For example, one medium-difficulty question asked: “When anyone has offended you and asks you to excuse him, what ought you to do?”
- Reversal of the hands of a clock. This item tested the child’s ability to figure out in his or her head what time it would be if the large and small hands on a clock were reversed for various times.
- Paper cutting. In front of the child, the examiner folded a paper into quarters, and then cut out a triangle at the edge with a single fold. Without actually unfolding the paper, the child was then asked to draw the design he would see if the paper were opened.
- Definitions of abstract terms. In this task, the child was asked to state the differences between two abstract terms, such as weariness and sadness.
These last 15 items on the test contained the boundary line between mild retardation and normal intelligence. In general, these items could be passed by children of normal intelligence between the ages of 5 and 11. However, some of the most difficult tasks near the end were not always passed by even 11-year-olds with normal intelligence.
Binet-Simon Scale of 1911
The final version of the Binet-Simon Scale included similar items. Some examples are given below. The ages refer to the age at which typical children of normal intelligence were able to perform certain tasks.
- Age three: Pointing as told to the eyes, nose, and mouth; naming common objects in a picture; repeating back a string of two numbers; repeating a six-syllable sentence; knowing their last names.
- Age six: Telling the difference between morning and evening; telling an “attractive” face from an “ugly” one in a picture; copying a diamond-shaped design from memory; counting 13 pennies; giving simple definitions for familiar things, such as a fork and a table.
- Age 10: Copying line drawings from memory; composing a sentence with the words “Paris,” “fortune,” and “river”; placing five identical-looking boxes in order by weight; answering questions involving social judgment; finding and explaining absurdities in statements. Some of the latter statements showed Binet’s fascination with ghoulish themes, similar to the subject matter of the plays he was writing at the time. For example, one item asked children to explain what was wrong with this statement: “The body of an unfortunate girl was found, cut into 18 pieces. It is thought that she killed herself.”
- Age 15: Repeating back a string of seven numbers; naming three rhymes for the French word obéissance; repeating a 26-syllable sentence; giving appropriate explanations for pictured scenes of people; solving problems such as this one: “My neighbor has just been receiving strange visitors. He has received in turn a doctor, a lawyer, and then a priest. What is taking place?”
Test-giving procedures
Binet and Simon provided general instructions on how to give their test. Many of these echo the procedures still used in individual testing today. For example, the test was to be given in a quiet room with no distractions. When the child met the examiner for the first time, a familiar person, such as a relative or the school principal, was to be present. The examiner was to greet the child with “friendly familiarity,” to help put the child at ease. Binet realized that the child’s emotional state and motivation could affect the results, so he stressed that these factors should not be ignored.
Binet had not forgotten his early mistake made when studying hypnosis; specifically, that the subject’s behavior had unintentionally been changed by suggestions from the hypnotist. Binet’s research on memory in schoolchildren had also underscored the power of suggestion to affect behavior. Therefore, Binet was well aware that unwitting suggestions by an examiner might affect children’s performance on the intelligence test. In a 1905 paper, he and Simon warned: “It is a difficult art to be able to encourage a subject, to hold his attention, to make him do his best without giving aid in any form by an unskillful suggestion.”
In his 1909 book, Modern Ideas About Children, Binet noted four mental processes that he thought played a key role in intelligence. He also described how these processes might look in young children of normal intelligence. Of course, these descriptions also fit older children and adults with moderate retardation.
- Comprehension. This term referred to the ability to notice and understand things. Binet wrote that young children experienced the world largely through their senses. They also tended to see parts of things rather than the whole, and they had trouble differentiating unimportant details from important ones. When it came to language, the children used few adjectives and conjunctions. They also tended to use concrete words rather than abstract ones. In short, they had “a comprehension that remains always on the surface.”
- Inventiveness. This concept referred to the ability to describe and interpret things. Binet wrote that young children still used words in a very limited and rather dull way. When shown a picture, the children described it in vague terms that could describe any number of pictures.
- Direction. This term referred to the ability to pay attention and stay on task. Binet noted that young children frequently forgot what they were doing. They tended to get carried away by fantasy, losing track of their real-world aims. When speaking, the children jumped from subject to subject, based on chance associations rather than logical connections.
- Criticism. This referred to the ability to make critical judgments. Binet noted that this ability, too, was quite limited in young children. The children naively accepted the most absurd explanations. They also told lies because of their weak ability to tell the difference between reality and fantasy. In addition, young children were highly suggestible.
Like everyone else, Binet was shaped by the times in which he lived. In part, his intelligence test was a reaction to earlier efforts by two of his colleagues, the British scientist Sir Francis Galton and the American psychologist James McKeen Cattell, each of whom had tried to assess mental ability with physiological measures.
Galton and Hereditary Intelligence
The first person to try to develop a scientific intelligence test was Francis Galton. This British scientist, a half-cousin of English naturalist Charles Darwin, was a polymath, a person who is knowledgeable in many scientific areas. His interests included studying weather, fingerprints, and the peoples of Africa. Galton argued that plants and animals varied in systematic ways, and he devised new statistical methods for studying heredity. When it came to people, Galton proposed a controversial idea: the planned selection of superior parents as a means of improving the human race. To this end, he coined the term “eugenics” for the theoretical science of human breeding.
Before a practical program of eugenics could gain wide support, however, Galton had to show that his ideas were sound. Galton had been greatly influenced by his famous half-cousin’s theory of evolution. A basic premise of that theory is that the variation among members of any species is inherited. The differences among parents in one generation are passed down to their offspring in the next generation. In an 1869 book titled Hereditary Genius, Galton set out to show that high mental ability was passed down this way. It is likely that Galton’s own family tree inspired this line of thinking, since both he and Darwin were grandsons of Erasmus Darwin, a noted physician and naturalist in his own right.
For the book, Galton picked a sample of people who had achieved great enough success in their careers to be listed in biographical reference works. Galton then researched their family backgrounds and found that about 10% had at least one close relative who was successful enough to be listed, too. Although this was a small percentage, it was still a much higher rate than would have been expected based on chance alone. This finding was consistent with Galton’s theory of hereditary ability. It did not settle the issue, however, since most individuals in the same family share not only genes, but also similar lifestyles and experiences. Thus began the great nature-nurture debate, which asks: How much of people’s intelligence is due to nature (the genes they inherited from their parents), and how much is due to nurture (the way they were raised and the experiences they have had)? This question continues to be hotly debated today.
In 1865, Galton suggested that a test might be devised to measure inherited differences in mental ability. When it came time to actually develop such a test, however, he was stumped. All he had was a vague notion that the inherited differences must arise from measurable differences within the brain and nervous system. Eventually, Galton developed a series of physiological tests for measuring reaction time, the sharpness of the senses, and physical energy. He hoped these tests would show the efficiency of a person’s nervous system and, thus, the basis for his or her hereditary intelligence.
In 1884, Galton set up a laboratory at the South Kensington Museum in London to measure individual differences in mental ability. For a small fee, people could be tested there. Today, Galton’s choice of tests seems amusingly misguided. For one test, he used a special whistle to measure the highest pitch people could hear. For another, he tested people’s sensitivity to the smell of roses. Perhaps it is not surprising, then, that the tests did not work out as well as Galton had hoped. People with sharp senses and fast reaction times did not, as a group, turn out to be especially gifted in other areas. Still, about 9,000 people paid for Galton’s services, and scientists took note. If nothing else, Galton’s laboratory was very successful at introducing the idea of intelligence testing to scientists and to the public.
Cattell and Mental Tests
James McKeen Cattell, an American psychologist, soon built upon Galton’s physiological method of measuring intelligence. In 1890, he published a set of “mental tests,” a catchy term he coined. Cattell suggested 10 mental tests for use with the general public.
- Dynamometer pressure. This test measured the strength of a person’s hand grip. Cattell explained that he included this test because “it is impossible to separate bodily from mental energy.”
- Rate of hand movement. This test measured how quickly a person could move his or her hand across 50 centimeters.
- Two-point threshold. A researcher touched a pair of rubber-tipped compass points to the back of a person’s hand. When the tips were very close together, the subject felt them as a single point. The researcher attempted to find the smallest distance at which the tips were felt by the subject as two separate points.
- Pressure-causing pain. An instrument was pressed against a person’s forehead with increasing force. The aim was to find the amount of pressure needed to cause signs of pain.
- Weight differentiation. This test required a person to put a set of identical-looking boxes in order by weight. The boxes, which differed in weight by 1 gram, ranged from 100 to 110 grams.
- Reaction time for sound. This test measured the very brief period that elapsed between the time when a sound was made and the time when a person’s muscles started reacting to it.
- Time for naming colors. A set of red, yellow, green, and blue patches, arranged in random order, was shown to a person. The aim was to measure how long it took the person to name the colors.
- Bisection of line. A 50-centimeter strip of wood with a sliding line attached was used. The person was asked to place the line as close as possible to the exact middle of the strip.
- Judgment of time. In this test, the examiner first tapped out a 10-second interval. The examiner then tapped on the table and asked the person to signal when another 10 seconds had passed.
- Number of repeated letters. This test measured how well a person could repeat back lists of random consonants.
A flurry of this kind of mental testing followed in the 1890s. When powerful new statistical methods came into use, however, it soon became clear that the tests were sorely lacking. Earlier, Galton had developed the concept of a correlation, the degree and direction of association between two things. Karl Pearson perfected the method of computing a correlation coefficient, an index of the strength of the relationship between two things when certain conditions are met. This statistic became known as the Pearson r. Now, researchers had a more sophisticated way to analyze test results.
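For readers who want to see what a correlation coefficient measures, here is a minimal sketch of the Pearson r using the standard textbook formula. The function name and all of the scores are hypothetical; none of the numbers come from the historical studies.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for four people on three measures.
test_a = [1, 2, 3, 4]
test_b = [2, 4, 6, 8]    # rises in step with test_a
test_c = [1, -1, -1, 1]  # no linear relationship with test_a

print(pearson_r(test_a, test_b))  # close to 1: strong positive association
print(pearson_r(test_a, test_c))  # close to 0: no linear association
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates that knowing one score says little about the other.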
In 1901, Clark Wissler, one of Cattell’s own graduate students, dealt a death blow to this type of mental testing. Using the new statistical methods, he studied the scores of college students who had taken Cattell’s tests. Wissler found virtually no correlation among the tests. In other words, a student who did well on one of the tests was not especially likely to do well on any of the other tests. Even worse, scores on the tests also did not correlate with college grades. This meant Cattell’s tests and college grades were measuring different things. Since college grades were thought to reflect intelligence, it seemed Cattell’s tests must be measuring something else.
Binet Compared to Galton and Cattell
The failure of Galton’s and Cattell’s intelligence tests opened the door for Binet to develop a more practical alternative. He succeeded where they had failed at devising a test that was related to intelligent behavior in real life. Today, most useful intelligence tests for people of all ages are still based on Binet’s model. Such tests require people to use several mental abilities to perform a broad range of complex tasks.
One factor that may have helped Binet succeed was his choice of study population. Galton and his followers had been mainly interested in studying intelligence in adults at the high end of the ability range. Binet, in contrast, was interested in testing the intelligence of children at the low end. Because he worked with children, Binet was able to see the way intelligence developed over time. And because he looked at less advanced mental processes, basic patterns may have been easier to notice.
Binet’s own studies of very creative adults, such as dramatists, had found that there was great individuality and complexity in higher-order abilities. When the Binet-Simon Scale was introduced, Binet noted that some children had a mental level that was a year or more ahead of their age in years. Were these children destined to grow up into very bright and talented adults? At first, Binet believed that it might be possible to answer that question by extending his scale upward. By 1908, however, he had developed doubts. The mixture of mental abilities measured by his test had only been shown to be something that prevented people from being retarded. These abilities had not been shown to be the source of high ability, talent, or genius. Therefore, the very nature of the “intelligence” measured by Binet’s test seemed to be rather different from the “intelligence” Galton had had in mind.
Another major difference between Binet and Galton was their position on the nature-nurture debate. Galton mainly focused on the nature side of the equation. He viewed the upper limits of a person’s ability as fixed by genetics rather than culture. Binet, in contrast, was more interested in the role of nurture. He believed that cultural factors played a large role in shaping an individual’s mental abilities. He also stressed that intelligence was changeable within limits through proper education.
Because Binet saw culture and intelligence as closely related, he had no qualms about including culturally based items on his test. Of course, this meant that the test was only valid for people who came from a certain background. Galton’s and Cattell’s physiological tests, on the other hand, would have been more applicable to people from many different backgrounds—if only they had worked.
By a twist of fate, both Galton and Binet died in 1911. After the two men’s deaths, a strange thing occurred: Binet’s scale was immediately taken up by scientists whose views and goals were otherwise much closer to Galton’s. Clearly, Binet’s test survived because it had practical value. The theory behind the test, however, was not as quickly embraced. In part, this may have been because Binet himself was always more interested in measuring intelligence than in explaining it within a theoretical framework.
However, it may also have been due in part to the way the two scientists led their lives. At the time of their deaths, Galton was an old man, long past his active research days, while Binet was still in the prime of his career. Yet Galton held greater sway in scientific circles. During the last years of his life, Galton drummed up considerable support for his eugenics program and the hereditary theory of intelligence. Binet, on the other hand, had gained far fewer followers. As a result, the next generation of intelligence testers tended to use Binet’s techniques to advance Galton’s ideas.
Spearman and General Intelligence
Around the same time that Binet introduced his intelligence test, English psychologist Charles Spearman published his own theory of intelligence. It, too, was at odds with Binet’s concepts. Yet in later years, Spearman’s ideas, like those of Galton and Cattell, were often promoted using Binet’s test.
Spearman’s early work was actually inspired by Galton and Cattell. In one experiment, he studied two dozen schoolchildren in three ways. First, he had their teacher rank them on “cleverness in school.” Second, he had the two oldest children rank their classmates on “sharpness and common sense out of school.” Third, Spearman himself ranked the children’s performance on tests designed to measure the sharpness of their senses. Then Spearman calculated the correlations among these measures. He found a modest association between the teacher’s and classmates’ rankings, on one hand, and the sensory rankings, on the other. These findings differed from those of Clark Wissler, who had found virtually no correlation. Spearman explained the difference, however, by pointing out flaws in Wissler’s work. In truth, Spearman’s own method was far from perfect. Later researchers have tended to confirm Wissler rather than Spearman, finding very little association between sensory abilities and mental abilities.
Nevertheless, Spearman was encouraged. He went on to study the grades that children had earned in various school subjects. He found that children who did well in one subject tended to do well in the others, too. Likewise, children who did poorly tended to do so across the board. Taken together, Spearman’s findings seemed to point to a common thread tying together all these measures of mental ability. Spearman referred to this single, broad capability as general intelligence, or g. He first published his theory of general intelligence in an influential 1904 paper.
Spearman viewed general intelligence as a single, broad entity. Binet, in contrast, viewed intelligence as a group of mental processes that were arranged in different patterns within different people. Unlike Spearman, Binet did not focus on finding a unifying factor for these processes.
Although Spearman disagreed with Binet’s theory, he was quite impressed by the Binet-Simon Scale. Even Spearman realized that his own method of measuring intelligence with teacher rankings, classmate rankings, and grades was not ideal. For one thing, it was too closely tied to school performance. Binet’s test offered a useful alternative that was not as greatly affected by past classroom experiences.
Of course, Spearman saw Binet’s test from his own point of view. When Spearman calculated the correlations among individual items on the test, he found a familiar trend: Children who did well on one item tended to also do well on the others. Spearman took this as evidence that the items were actually measuring general intelligence to a large extent. He argued that Binet’s test worked precisely because the overall result provided a useful estimate of a person’s level of general intelligence. In addition, he believed that a person’s general intelligence level owed more to heredity than to lifestyle and experiences. This became a popular view, even though Binet himself did not share many of Spearman’s ideas.
When Binet died, he considered his test to be a work in progress. He was still constantly striving to improve it. Yet this imperfect test was itself widely adopted, and it became the model for other tests that have had an enormous impact on society. Since Binet’s death, the field of intelligence testing has attracted both ardent supporters and vocal critics. Few other areas of psychology have proven to be such lightning rods for controversy.
Stern and the Intelligence Quotient
Today, the terms “intelligence test” and “IQ test” are often used interchangeably. Therefore, many people assume incorrectly that Binet came up with the idea of an intelligence quotient (IQ), a single number for expressing the overall result on an intelligence test. This distinction actually goes to German psychologist William Stern. In fact, Binet resisted the idea of reducing a person’s intelligence to a single number. When Stern introduced the concept of IQ in 1912, Binet was no longer alive to complain. But his coauthor, Simon, later called the IQ concept a betrayal of their original ideas.
Nevertheless, Stern’s concept caught on quickly; it involved some seemingly small but critical changes in the way Binet’s test results were used. Binet had talked about the mental level of children who took his test. Stern recast this as mental age, which implied a more precise measurement scale. Then, Stern proposed that mental age could be divided by chronological age to yield a handy numerical score. In 1916, the American psychologist Lewis Terman suggested multiplying this score by 100 to get rid of fractions. For example, consider a seven-year-old child with a mental age of six. To calculate this child’s IQ, an examiner would divide six by seven, then multiply the answer by 100. The child’s IQ would be 86.
Most people would agree that a five-year-old with a mental age of three has a more serious delay than a 15-year-old with a mental age of 13. Using Binet’s method, both children would simply be regarded as being two years behind. Using the IQ method, however, the differences in severity would be more obvious. The 15-year-old would have an IQ of 87 (in the normal intelligence range), while the five-year-old would have an IQ of only 60 (in the mentally retarded range). The 15-year-old would need to have a mental age of nine to get an IQ score that low. As this example shows, the use of IQs helped to equalize the scores for children who had roughly the same degree of mental retardation or normal intelligence, but who were of different ages.
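The arithmetic in the two preceding paragraphs can be checked with a short sketch. The function names `ratio_iq` and `years_behind` are illustrative labels chosen here, not terms used by Stern or Terman, and rounding to the nearest whole number is assumed because the examples in the text are given as whole numbers.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> int:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    # Multiply by 100 before dividing so whole-number ages divide cleanly.
    return round(100 * mental_age / chronological_age)


def years_behind(mental_age: float, chronological_age: float) -> float:
    """Binet-style gap: how far the mental level trails the actual age."""
    return chronological_age - mental_age


if __name__ == "__main__":
    # (mental age, chronological age) pairs taken from the examples in the text.
    for mental, actual in [(6, 7), (3, 5), (13, 15), (9, 15)]:
        print(f"age {actual}, mental age {mental}: "
              f"{years_behind(mental, actual)} years behind, "
              f"IQ {ratio_iq(mental, actual)}")
```

The five-year-old and the 15-year-old are both two years behind on the gap measure, yet the ratio separates them (60 versus 87), which is the point the text makes about severity.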
Expressing intelligence as a single number also had other effects, however. For one thing, it encouraged people to look at intelligence as a single entity, along the lines of Spearman’s general intelligence. For another thing, it gave researchers a number they could use in correlational studies. A flood of studies followed in which researchers looked at the association between “intelligence” (as measured by IQ tests) and an endless list of other variables. Yet many people had, and still have, grave doubts about whether something as complex as intelligence could really be boiled down into something as simple as a numerical score.
Goddard and Negative Eugenics
Stern may have come up with the IQ formula, but American psychologist Henry Goddard did the most to popularize the Binet-Simon Scale in the early days. As director of research at the Training School for the Feebleminded in Vineland, New Jersey, Goddard was eager to learn all about the latest advances in the mental retardation field. In 1908, he traveled to Europe to see what was being done there. Although he visited Paris, he never met Binet. At the time, Binet had yet to earn prestige within his own country, and Goddard got the impression that Binet was making little progress. In fact, Goddard did not even realize that Binet had just published a revision of his scale. As Goddard wrote in his travel diary: “Visited Sorbonne. Binet’s lab is largely a myth. Not much being done…”
When Goddard reached Belgium, however, he found out just how wrong he had been, learning there about the latest revision of the Binet-Simon Scale. Back in New Jersey, Goddard translated the test. Although skeptical at first, he gave the test to the mentally retarded children at his school. He became an instant convert when he saw how well the test classified the children’s degree of retardation. By 1915, Goddard had distributed more than 22,000 copies of the translated test and 88,000 answer blanks around the United States.
Goddard was a fan of Binet’s test, but not of his ideas. Instead, Goddard believed firmly in hereditary intelligence and eugenics. In fact, he took these views to an extreme. Galton, the founder of eugenics, had mainly wanted to foster breeding among people at the upper end of the intelligence range. Binet, the opposing voice, had wanted to promote the education and improve the lives of people at the lower end. Goddard, in contrast to both, was determined to prevent breeding among people with low intelligence—a policy called negative eugenics. Austrian biologist Gregor Mendel’s basic laws of heredity, which had gone unnoticed when he proposed them in the 1860s, had been rediscovered in the early 1900s. Influenced by the excitement over Mendel’s laws, Goddard incorrectly believed that mental retardation was caused by a single gene, and he thought it could be wiped out by preventing people with defective intelligence genes from having children.
In 1912, Goddard published a popular book titled The Kallikak Family: A Study in the Heredity of Feeble-Mindedness. This book was a sensationalized account of two branches of a family. One branch supposedly had a gene for feeblemindedness, which showed up in all manner of unsavory and immoral behavior among the relatives. The other branch, which supposedly lacked the gene, was filled with upstanding citizens. Goddard’s methods of gathering and presenting data for this book were later shown to be quite biased. Yet, even taking his arguments at face value, they failed to prove his hereditary theory. As with Galton’s earlier study of genius, it was impossible to separate the effects of nature and nurture.
While Goddard’s book may not have been great science, it was certainly effective propaganda. As a result of his book and others like it, several states passed laws requiring the involuntary sterilization of people with mental retardation. Intelligence tests were used to help identify which individuals would be candidates for sterilization.
Tragically, events in Nazi Germany during the 1930s and 1940s would highlight all too clearly the dark side of eugenics. In the early years of Nazism, more than 200,000 “degenerates” of all types, including people with mental retardation, were sterilized in Germany. Later, Germans with mental retardation and physical disabilities were among the millions of people killed alongside the Jews during the Holocaust. Once the extent of these atrocities became known, public revulsion helped turn opinion against negative eugenics, including the practice of involuntary sterilization. Today, some states still have involuntary sterilization laws on the books, but the policy is rarely enforced.
It is sad that it took such a brutal turn of events to make a crucial point: The improvement of the human race depends not only on heredity, but also on providing a better environment and improved education. Modern social policy often focuses on environmental and educational programs. Thus, society has come full circle to embrace the views of Binet and his “mental orthopedics.” Yet it is ironic that Binet’s test was used by others to justify policies that were so at odds with his personal philosophy.
Terman and the Stanford-Binet Intelligence Scales
While Goddard introduced Binet’s test to the United States, it was Lewis Terman who ensured its lasting popularity. At the same time that Binet and Simon were developing the first version of their scale in France, Terman was working on his doctoral thesis at Clark University in Massachusetts. A former teacher, Terman had noted that some students seemed to sail through all of their classes, while other students always struggled. He wanted to find mental tests that would distinguish one group of students from the other. To do this, he gave a series of tests to 14 schoolboys—seven of whom had been singled out by their teachers as exceptionally bright, and seven of whom had been singled out as exceptionally dull. Although Terman was still unaware of Binet’s work, the tests he chose were more similar to those of Binet than to those of Galton or Cattell. The tests involved creative imagination, logic, mathematical ability, language mastery, interpretation of fables, the game of chess, memory, and motor skill.
As Terman had expected, the bright boys did better, on average, than the dull boys on all the tests except those for motor skill. There was some overlap, however. On most of the tests, the best of the “dull” boys outdid the worst of the “bright” boys. As a result, Terman was disappointed by his findings. Yet the results only seemed like a failure because Terman had downplayed a key factor: The dull boys were almost a full year older, on average, than the bright ones. Had the two groups been the same age, the differences in their performance would have been greater. At the time, however, Binet had not yet pointed out the critical need for age standards in intelligence testing. Terman had failed to appreciate just how important age was.
In 1910, Terman accepted a teaching position at Stanford University. Around this time, he also learned about the Binet-Simon Scale. He immediately saw the advantage of using age standards. When age was taken into account, both his test items and those on the Binet-Simon Scale did a relatively good job of predicting school success. However, Terman also saw that the Binet-Simon Scale needed to be adapted for a U.S. audience. Terman showed that, in its original form, the Binet test seriously overestimated intelligence in young American children, but underestimated it in older children. Clearly, some of the test items and scoring needed to be adjusted.
Terman set out to assess Binet’s test items on a large number of American children. Several new items, some of which were based on Terman’s doctoral research, were assessed as well. Since Terman used better methods for choosing children on whom to try out the test, his results were more accurate than those of Binet. In 1916, Terman published his Stanford Revision and Extension of the Binet-Simon Scale, an unwieldy name that was quickly shortened to Stanford-Binet. The new test was more than a mere translation of the Binet-Simon Scale, however—it was a big leap forward. Forty new test items had been added, and some of the less reliable original items had been dropped. In addition, Terman had borrowed Stern’s idea of expressing results on the test as an IQ score.
The Stanford-Binet was an advance in other ways as well. For example, it was the first published intelligence test to include very specific, detailed instructions on test giving and scoring. It also offered alternate items to be used under certain circumstances; for example, if the examiner made a mistake when giving the regular item.
The Stanford-Binet quickly became the best intelligence test in the world and the gold standard by which future tests would be judged. It included six tasks at each age level. Following are two examples.
- Age four: Saying which of two horizontal lines is longer; matching shapes; counting four pennies; copying a square; repeating a string of four numbers; answering a question such as: “What must you do when you are sleepy?”
- Age nine: Knowing the current day of the week and year; arranging five weights from heaviest to lightest; doing mental arithmetic; repeating a string of four numbers backward; producing a sentence using three specified words; finding rhymes.
In 1926, Terman began working on a revision of the test with his colleague Maude Merrill. The project took them 11 years to complete. The 1937 revision offered two equivalent forms of the test. It also added new types of tasks for preschool and adult test takers.
Another revision of the test was already well under way at the time of Terman’s death in 1956. Published in 1960, this third edition of the Stanford-Binet offered only one form of the test, composed of the best items from the two earlier forms. No new items were added. There was one big change, however: the introduction of a new way of calculating IQ. No longer was it simply a matter of dividing mental age by chronological age, then multiplying by 100. Instead, a deviation IQ was used. The deviation IQ was based on a comparison of an individual’s performance with the performance of a group of same-aged people who took the test during its development phase. Test performance was converted to a score on a scale where the average was always 100, and the standard deviation, a measure of how widely the scores are spread around the average, was 16. In the current version of the Stanford-Binet, the standard deviation is 15, but the average is still 100.
To understand how this works, it helps to picture the range of scores fitting neatly into a bell-shaped curve, with the average at the top of the bell. About two-thirds of all scores fall within one standard deviation of the average. In other words, with a standard deviation of 15, about two-thirds of all people have IQ scores between 85 and 115. About 95% of all scores fall within two standard deviations of the average. In other words, only about 5% of all people have IQ scores lower than 70 or higher than 130. This type of test, with an average of 100 and a standard deviation of 15, has become the industry standard in intelligence testing.
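The deviation IQ and the bell-curve proportions described above can be sketched as follows. This is a minimal illustration, not the Stanford-Binet scoring procedure: the raw score, norm-group average, and norm-group standard deviation are hypothetical inputs, the function names are chosen here for convenience, and scores are assumed to follow a normal distribution.

```python
import math


def deviation_iq(raw_score: float, group_mean: float, group_sd: float,
                 iq_mean: float = 100.0, iq_sd: float = 15.0) -> float:
    """Rescale a raw score so the same-age comparison group averages iq_mean."""
    if group_sd <= 0:
        raise ValueError("group standard deviation must be positive")
    # Distance from the comparison group's average, in standard deviations.
    z = (raw_score - group_mean) / group_sd
    return iq_mean + iq_sd * z


def share_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical norm group: average raw score 40, standard deviation 10.
    print(deviation_iq(50, 40, 10))             # one SD above average, SD of 15
    print(deviation_iq(50, 40, 10, iq_sd=16))   # same result on the 1960 scale
    print(f"{share_within(1):.1%} of scores within one SD (85 to 115)")
    print(f"{share_within(2):.1%} of scores within two SDs (70 to 130)")
```

Under the normal-curve assumption the two shares come out at roughly 68% and 95%, which matches the "about two-thirds" and "95%" figures in the text.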
Binet Compared to Terman
Binet’s method of intelligence testing was an excellent match for Terman’s own interests and background. With Binet’s work as a starting point, Terman made great strides in refining the intelligence test. One way he did this was by focusing on standardization, the process of test development in which a test is given to a representative sample of individuals under clearly specified conditions, and the results are scored and interpreted according to set criteria. The goal is to spell out a standardized method of giving, scoring, and interpreting the test in the future. This approach helps to ensure that as much as possible of the variance in scores is caused by true differences in individual ability, and not by differences in the testing situation.
A key part of this process is the selection of the standardization sample: the group of people on whom the test is tried out during the development phase. The underlying assumption is that this group is representative of the whole population of people who will eventually take the test. Terman’s sample was much larger than Binet’s, and he went to what were then unprecedented lengths to select his standardization sample. By modern standards, however, Terman’s sample still fell short. It was not representative of the full spectrum of people living in the United States.
It is not just the standardization sample that needs to reflect the whole test population, however. The test materials need to do so as well. Otherwise, the test may be biased against those who find the materials less familiar or relevant. Binet recognized that his own test was valid only for children with a knowledge of mainstream French culture. Terman’s Stanford-Binet also tended to focus heavily on the majority culture in the United States. For example, the pictures in the test kit depicted mainly white people and middle-class situations. Critics argued that the test was biased against members of certain racial, ethnic, and social groups.
In addition, the test seemed to reward conformity. For example, Terman added this item to the test: “An Indian who had come to town for the first time in his life saw a white man riding along the street. As the white man rode by, the Indian said, ‘The white man is lazy; he walks sitting down.’ What was the white man riding on that caused the Indian to say, ‘He walks sitting down’?” The only answer accepted as correct was a bicycle. Cars and other vehicles were considered incorrect, because legs don’t go up and down on them. A horse was considered incorrect, because it was assumed that the Indian would know a horse if he saw one. Creative responses, such as a person riding on someone else’s back, were also marked wrong. Critics noted that the test seemed to measure conventional, rather than creative, thinking.
Both Terman and Binet were interested in educational uses for their tests. Binet was concerned primarily with identifying mentally retarded children who might need special educational programs. Terman, on the other hand, was fascinated by gifted children. In fact, he is remembered as much today for his research on the gifted as for his intelligence test. In the early 1900s, many people believed the popular catchphrase “early ripe, early rot.” In other words, they thought that child prodigies often burned out at an early age. Terman suspected that the reverse was actually true, but he needed evidence. Binet’s testing method seemed tailor-made for such research.
One of the first hurdles Terman faced was showing that high IQ scores in childhood really were good predictors of high achievement in adulthood. One way he tried to address this was by having a graduate student, Catherine Cox, study the childhood biographies of some 300 people rated to be among history’s greatest geniuses. The goal was to estimate their childhood IQs based on reports of the ages at which they had reached various landmarks in mental development. Obviously, this method left a lot to be desired. In many cases, little was known about the childhoods of these geniuses, and the information that was available was clearly not objective. Nevertheless, Terman and Cox believed that the study confirmed their basic point: These geniuses had “ripened” early, but they had certainly not gone to rot as adults.
Encouraged, Terman undertook a more ambitious project. In the early 1920s, his assistants tested more than 250,000 California schoolchildren. From this sample, Terman identified nearly 1,500 children with high IQs of 135 or above. Extensive background information was gathered on these children, nicknamed Terman’s Termites. In what turned out to be the longest-running study ever done, the children have been followed ever since. This study showed that most of the children were normal, happy, and healthy. As the Termites grew up, they continued to thrive as a group. Most of them were also relatively healthy, successful, and content with their lives as adults. Although the study had its flaws, it went a long way toward disproving the “early rot” myth.
When it came to people with less exalted IQs, however, Terman’s views could be less benign. Binet had wanted to identify children with below-normal intelligence so that they could be helped to learn and improve their lot in life. Terman, on the other hand, often seemed more concerned with putting a ceiling on what people could hope to achieve. He believed that, in an ideal world, everyone would be tested and then channeled into a job deemed appropriate for his or her intelligence. In general, he thought, jobs offering much in the way of status or money should be reserved for people with IQs over 100.
On the surface, this might seem logical enough. Yet there were at least two dangerous flaws in Terman’s logic. First, Terman’s IQ test was not a perfect predictor of true ability. Therefore, a low score might have kept someone from getting a good job that he or she would have been quite capable of doing. Second, the test had been criticized as being unfair to members of certain racial, ethnic, and social groups. If the test were indeed biased against individuals from these groups, then they would be likely to get lower-than-average scores for reasons unrelated to their actual intelligence. Yet those same scores could then be used to limit opportunities for advancement. Thus, it had become all too easy to turn IQ scores into a means of perpetuating social inequality. Binet himself would surely have been dismayed by this misuse of his creation.
Yerkes, Brigham, and Group Intelligence Tests
Terman and Goddard had introduced intelligence testing to America. Soon, world events would turn it into a national priority. In 1917, the year after Terman first published the Stanford-Binet, the United States entered World War I. Like many other Americans, psychologist Robert Yerkes was eager to serve his country. As president of the American Psychological Association, he also wanted to show the value of the young science he represented. Yerkes set up committees to explore the military uses of psychology. He made himself chairman of a committee that was charged with developing an intelligence test for matching military recruits to the right jobs. Terman and Goddard were included among the other psychologists named to the committee.
The task Yerkes had taken on was extremely difficult, however. First, given the sheer number of recruits, the individual testing method developed by Binet and refined by Terman would not have been practical. A whole new kind of group intelligence test, which could be given to several people at once, would need to be developed. Second, the test would have to not only screen out those with low ability, but also identify those with high ability who might be officer material. Third, the test would have to be designed specifically for adults, rather than for children. Fourth, the test development would have to be accomplished very quickly, since results were needed right away.
Yerkes’ committee promptly put together two prototype tests: one for recruits who could read English, and another for those who could not. A trial on 80,000 men impressed the Army enough that it authorized the testing of all new recruits by the beginning of 1918. The tests were revised and renamed Army Alpha (for literate recruits) and Army Beta (for illiterate recruits). Soon, the tests were being given to some 200,000 men per month. By the time the war ended in November 1918, about 1,750,000 men had taken one of the tests. This prodigious feat brought intelligence testing to the attention of the public. It introduced the idea of nearly universal testing, and it opened up a huge market for group tests after the war. In addition, the massive amount of data collected on the Army tests became the subject of intense study and led to much public debate about the state of intelligence in American society.
In 1921, Yerkes published Psychological Examining in the United States Army, an 800-page book analyzing the Army test data. Two years later, one of his junior colleagues named Carl Brigham published A Study of American Intelligence, which explored the same topic. The books made several questionable claims. For one thing, they claimed that the average mental age for all Army recruits was about 13 years. At the time, the mental age for an average adult was thought to be 16, and a mental age of 12 in an adult was considered the upper borderline for mild retardation. Therefore, the supposed mental age of the recruits was shockingly low. It might have been logical to conclude that the hastily thrown-together tests had been less than accurate. Yerkes, however, concluded that the results indicated a distressingly low level of intelligence in society at large.
Some of Yerkes’ and Brigham’s other conclusions were even more controversial. For example, the psychologists noted that, compared to native-born whites, immigrants and blacks tended to score lower on the tests. Once again, it might have been sensible to conclude that the tests had been biased toward members of the majority American culture. On the Alpha test, for example, individuals were expected to know that Overland cars were made in Toledo and that Crisco was a food product. On the Beta test, individuals were expected to be familiar with pictures of middle-class objects, such as a tennis court or a phonograph. Yet Yerkes and Brigham instead took the position that the lower scores obtained by immigrants and blacks indicated lower levels of natural mental ability in those groups.
At the time, racial segregation and discrimination were the norms in much of American society. Public sentiment had also turned sharply against immigration. In fact, in 1924, Congress passed a bill that set strict immigration quotas for each national group. This social climate helped to support Yerkes’ and Brigham’s conclusions. Yet, even at the time, there were opposing voices. One belonged to Franz Boas, a German immigrant himself and a leading American anthropologist of the early 1900s. Boas argued that many racial and ethnic characteristics were passed from generation to generation not by heredity, but by culture, through such mechanisms as shared values, language, and child-rearing customs.
American Otto Klineberg, a graduate student in psychology, was one of the first researchers to apply Boas’ ideas to group differences in intelligence test scores. While studying Yakima Indian children in the state of Washington, he noticed that they seemed indifferent to time limits. They took their time, no matter how much they were urged to hurry, but they also made relatively few mistakes. Klineberg noted that, in Yakima culture, speed was not considered a sign of intelligence. On the contrary, it was thought to reflect carelessness. This was clearly a cultural rather than a genetic difference. Yet it put the Yakima children at a disadvantage on timed intelligence tests. Similar observations in other cultures soon added up to a convincing case. By the 1930s, all but the most diehard eugenicists had conceded that culture played an important role in causing group differences in IQ scores.
In 1926, Brigham made his mark on group intelligence testing in another way. He introduced a brand-new kind of standardized test of mental ability. IQ tests looked at general thinking ability. This new type of test, however, looked more specifically at the kinds of word and number skills that were used in school. Brigham’s test became the forerunner of the SAT, a test that is still very familiar to high-school students.
Thurstone and the Structure of Intelligence
Meanwhile, research on the structure of intelligence was moving ahead as well. One of the most important figures in this field was American psychologist Louis Thurstone. He challenged Spearman’s ideas and, in the process, changed the way many psychologists viewed intelligence.
Spearman had proposed the existence of a unifying factor called general intelligence. He believed that all of the variation in intelligence test scores could be explained by the pervasive influence of general intelligence, combined with specific effects that were unique to the particular test activity at hand. Thurstone developed new statistical methods, and when he applied them to intelligence test scores, he noted that mental abilities tended to cluster into several groups rather than just one. In 1938, Thurstone published a book titled Primary Mental Abilities, in which he proposed that there were actually seven clusters of mental abilities. He called the clusters verbal comprehension, word fluency, number facility, spatial visualization, associative memory, perceptual speed, and reasoning.
When originally introduced, Spearman’s and Thurstone’s findings seemed to be directly opposed to each other. One currently popular view of intelligence, however, combines the two theories. Intelligence is often seen as having a three-level, hierarchical structure. General intelligence is on the top. Clusters of mental abilities make up the second level. Although separate from each other, in combination they all form general intelligence. A host of specific mental abilities make up the various clusters on the third level. Even psychologists who accept this structure, however, have different opinions about which level to emphasize. Some still see general intelligence as the most crucial consideration. Others, however, think it is more worthwhile to focus on each person’s distinctive pattern of strengths and weaknesses at the second level. The latter viewpoint echoes the view of intelligence put forth by Binet many decades before.
The Stanford-Binet after Terman
Terman died in 1956, but his legacy lives on. The fourth edition of the Stanford-Binet was introduced in 1986, 30 years after Terman’s death. This version contained major changes. Previous versions of the Stanford-Binet had included age scales, in which test items were grouped together by the age at which most individuals could pass them. The fourth edition, in contrast, introduced a point scale, in which all the test items of a particular type were grouped together. The test was then evaluated in terms of how many items of each type were answered correctly, rather than in terms of an age level. By then, this was a very common test structure. It was also the type of structure used for the Wechsler Intelligence Scales, which had by then eclipsed the Stanford-Binet as the most widely used intelligence tests.
Previous editions of the Stanford-Binet had yielded an overall IQ score, considered to be a measure of general intelligence. The fourth edition, however, went beyond just providing a general IQ. It contained 15 subtests that also yielded scores on four clusters of mental abilities: verbal reasoning, abstract/visual reasoning, quantitative reasoning, and short-term memory. The subtests are as follows.
- Vocabulary. In this subtest, individuals are asked to identify pictured objects and define words.
- Comprehension. These items range from identifying parts of the body to answering more complex questions using social judgment; for example, “Why should people be quiet in a hospital?”
- Absurdities. Individuals are asked to identify what is wrong or silly about a picture.
- Verbal relations. Individuals are given four words; for example, “newspaper,” “magazine,” “book,” “television.” They are asked to state what is similar about the first three things, but different about the fourth.
- Pattern analysis. These tasks, which must be completed within a set time limit, range from putting cutout forms into a form-board to copying complex designs with blocks.
- Copying. Individuals are asked to copy designs with blocks or by drawing.
- Matrices. Individuals are shown an incomplete matrix—a systematic arrangement of geometric symbols, letters, or common objects. They are then asked to pick the object that is needed to complete the matrix.
- Paper folding and cutting. In these multiple-choice items, individuals are asked to decide how a folded and cut piece of paper will look when unfolded.
- Quantitative subtest. These items range from simple counting to knowledge of arithmetic concepts and operations.
- Number series. Individuals are asked to complete a sequence of numbers with the number that would come next.
- Equation building. Individuals are asked to rearrange a scrambled arithmetic equation so that it makes sense.
- Bead memory. In this subtest, individuals study a picture of a bead sequence for five seconds. They are then asked to reproduce the sequence using actual beads of varying color and shape.
- Memory for sentences. Individuals are asked to repeat sentences ranging in length from two to 22 words.
- Memory for digits. Individuals are asked to repeat strings of numbers.
- Memory for objects. In this subtest, familiar objects are presented at one-second intervals. Individuals are then asked to recall the objects in the correct order.
The fifth and latest edition of the Stanford-Binet was published in 2003. It is the most recent attempt to wed the rich tradition of this test with the newest research on mental abilities and intelligence testing. In the fifth edition, the traditional age scale has been brought back. Like the fourth edition, however, the test now yields scores on several clusters of mental abilities as well as on general intelligence.
In recent versions of the Stanford-Binet, developers have tried to weed out any systematic bias against members of particular racial, ethnic, or social groups. For example, the test kit now contains pictures showing children of different races and with disabilities, and the word “brunette” has been cut from the vocabulary test because it was not as meaningful to black children as to whites. A broader standardization sample has been chosen to better reflect the entire population of the nation. And psychologists from various racial and ethnic backgrounds have reviewed the materials for potential problems. Nevertheless, the test remains a product of its culture, and it may well be impossible to eliminate all bias.
A common criticism of previous versions of the Stanford-Binet test was that they relied too heavily on verbal abilities. The test was often unfavorably compared in this regard to the popular Wechsler Intelligence Scales, which David Wechsler first introduced in 1939. The Wechsler tests have two parts: Verbal and Performance. While the Verbal tasks are heavy on word skills, the Performance tasks rely less on language abilities. Instead, they involve nonverbal activities, such as completing pictures, making block designs, solving mazes, and using abstract symbols. This kind of test may be better suited to people for whom language is a barrier, as well as to those who have higher nonverbal abilities. The fifth edition of the Stanford-Binet, for the first time, tries to offer a better balance of verbal and nonverbal items.
One advantage of the Stanford-Binet is that it is an adaptive test, which means it is tailored to each test taker’s individual needs. The examiner uses information about a person to decide where to begin testing. This approach reduces the frustration that the person might feel if he or she were asked to complete tasks that were much too hard or too easy. It also cuts down on wasted time. Nevertheless, the Stanford-Binet is still an individual test, which means it is given by a trained examiner to only one person at a time, rather than to a group. The test usually takes about 45 to 60 minutes to give. As a result, it would usually not be feasible to give it to every student in a school, for example. Like the Wechsler scales and other individual tests, the Stanford-Binet typically is given only to individuals who have already been singled out as needing extra testing.
Theories in Action
Today, psychologists can choose from among many different individual and group intelligence tests. These tests are used for a wide variety of purposes. Indeed, intelligence testing has become one of the most widespread uses of psychology in everyday life. Research using intelligence tests has also helped fuel the ongoing debate over the very nature of intelligence.
Research on the Nature of Intelligence
What is intelligence? Binet struggled with this question in his day, and modern scientists are still grappling with it. In the 1920s, one of the more infamous answers was offered by American psychologist Edwin Boring, who pronounced that “intelligence is what the tests test.” This kind of circular reasoning may be amusing, but it is not very instructive for scientists seeking serious answers.
Whatever it is that intelligence tests measure, though, the tests seem to work best for predicting academic success. In study after study, intelligence test scores have been found to have a correlation of 0.4 to 0.6 (on a 0 to 1 scale) with school grades. Statistically speaking, this is considered a moderate to large correlation. Yet even a test that predicts school grades with a correlation of 0.5 still accounts for only 25% of the variation in school performance among individual students, because the share of variation explained is the square of the correlation. This means that 75% of the variation is due to other factors. Clearly, the kind of intelligence that is measured on IQ tests is not the only predictor of academic performance. Other factors, such as good schools and high individual motivation, also seem to count for a lot.
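The 25% figure above comes from squaring the correlation: a correlation of r accounts for r × r of the variation. The short Python sketch below illustrates that arithmetic for the range of correlations mentioned in the text. The function name variance_explained is a hypothetical helper chosen for this illustration, and the sketch assumes a simple linear relationship between test scores and grades.

```python
def variance_explained(r: float) -> float:
    """Return the share of variation accounted for by a correlation r.

    Assumes a simple linear relationship, where the share is r squared.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Correlations between IQ scores and school grades cited in the text.
for r in (0.4, 0.5, 0.6):
    explained = variance_explained(r)
    other = 1 - explained
    print(f"r = {r:.1f}: {explained:.0%} explained, {other:.0%} due to other factors")
```

Under these assumptions, correlations of 0.4 to 0.6 correspond to roughly 16% to 36% of the variation in grades, so most of the variation is left to other factors even at the top of the range.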
Once researchers moved beyond the classroom and into the workplace, the predictive power of intelligence tests grew even weaker. In general, studies have found correlations between IQ scores and work performance of about 0.3. This means that the tests accounted for only about 9% of the variation in performance among individual workers; therefore, roughly 90% of the variation must be explained by other factors. In 1990, American psychologists Peter Salovey and John Mayer coined the term “emotional intelligence” to describe the emotional abilities and interpersonal skills that may play a critical role in workplace success.
This idea raises a related question: Is intelligence really just one thing or is it many? Some modern theorists have suggested that there may actually be several types of intelligence, some of which are not assessed by standard intelligence tests at all. One of these theorists is Howard Gardner, a professor of education at Harvard University. In 1983, Gardner published a book called Frames of Mind, in which he introduced his theory of multiple intelligences. Gardner thinks there are several different “intelligences” that are separate but equal in the mind. Some people learn more easily by using one kind of intelligence; others, by using another. So far, Gardner has described eight intelligences: linguistic, logical-mathematical, spatial, musical, bodily-kinesthetic, naturalist, interpersonal, and intrapersonal. Of these, linguistic and logical-mathematical are most similar to the kinds of word and number skills used in school and assessed on IQ tests.
Fresh ideas such as Salovey and Mayer’s emotional intelligence and Gardner’s multiple intelligences are intriguing. Yet research on these alternative intelligences has been hampered by the lack of well-validated tests to measure them. Only time will tell whether these new concepts will hold up to rigorous testing as well as general intelligence has.
Research on the Nature-Nurture Debate
Another research question remains as relevant today as it was in Galton’s time: Is intelligence mainly the result of nature or nurture? Modern research methods are shedding some new light on this old puzzle. Some of the most interesting findings have come from the Minnesota Study of Twins Reared Apart. For this study, the researchers brought together from all over the world sets of twins who had been separated during childhood and, in most cases, had lived apart ever since. The twins were then put through a week of intense psychological and medical testing, including intelligence tests.
Since Galton’s time, scientists have realized that twin studies present a unique opportunity for exploring the genetic basis of intelligence. Identical twins share exactly the same genetic makeup. Therefore, their inherited intelligence should theoretically be the same. When identical twins are reared apart, the resulting differences in their intelligence should be largely due to differences in their separate environments. Of course, this is not perfectly true. For one thing, twins do share at least one crucial part of their lives: the prenatal part in the womb. For another thing, even when twins have been reared apart, they may have been placed in similar homes. Nevertheless, twin studies are one of the best tools psychologists have for separating the effects of nature and nurture.
The Minnesota study showed that identical twins who had been raised apart grew up to be almost as similar in intelligence as identical twins who had been raised together. The degree of similarity was impressive. For example, one test the twins took was the Wechsler Adult Intelligence Scale, currently the most widely used IQ test for adults. The scores of identical twins reared apart correlated at 0.69, a high correlation that was not much different from the 0.88 correlation in the scores of identical twins reared together. On some other tests of mental ability, the correlations were even closer. For example, on a test called Raven’s Progressive Matrices, the correlation for the reared-apart identical twins was 0.78. For the reared-together identical twins, it was 0.76.
Overall, the Minnesota study and others like it have found that about half of the differences in intelligence within a group of people may be due to differences in genes. Of course, this also means that half of the differences are due to other things. In addition, what is true for a group of people is not always true for a particular individual. In the Minnesota study, for example, one pair of twins scored almost 30 points apart in IQ.
Studies of extreme cases are another way of exploring the nature and limits of intelligence. The nearly 1,500 high-IQ children who took part in Lewis Terman’s long-running study of giftedness have become one of the best-researched groups in history. Over the years, some participants have chosen to reveal their identities, and reports have been published documenting their personal triumphs and tribulations. The life stories of Terman’s Termites, as the study’s participants came to be called, reveal a lot about the benefits and limitations of having a high IQ.
As a group, the Termites have fared relatively well. Although no world-class geniuses emerged from the group, some members achieved success and even a measure of fame as adults. For example, Jess Oppenheimer became the creator, producer, and head writer of I Love Lucy, one of the best-loved television shows of all time. Ancel Keys discovered the link between cholesterol and heart disease. Others in the group included Norris Bradbury, former director of the Los Alamos National Laboratory, and Shelley Smith Mydans, a one-time journalist for Life magazine. Yet none of the Termites ever won a Nobel or Pulitzer Prize. It is an interesting footnote that two of the children who were tested for the study but whose IQs failed to make the cut did go on to win the Nobel Prize in Physics: William Shockley in 1956 and Luis Alvarez in 1968.
Another Termite who eventually made a name for himself was Edward Dmytryk. At age 14, Edward was picked up by the authorities as a runaway. He had reportedly left home to escape an abusive father, who was said to have torn up his schoolbooks and clubbed him with a board. The father wanted Edward returned, although a caseworker suspected that this might have been only because Edward brought home income. Terman wrote a letter to the authorities on Edward’s behalf, and the boy wound up being placed in a good foster home instead. This kind of meddling by a researcher in his subjects’ lives was typical of Terman. It may have affected the results of his study, thereby undermining the data from a scientific perspective, but it also demonstrated the deep interest that Terman took in the children. As an adult, Dmytryk went on to direct 23 movies, including The Caine Mutiny, a classic 1954 film starring Humphrey Bogart.
Of course, not all of Terman’s Termites achieved happiness and success as adults. For example, the study included two half-sisters raised by the same mother, both of whom went to college at Stanford University. One became well-known as a freelance writer. The other died of alcoholism. Terman’s study showed that high IQ was helpful in adulthood, but, by itself, it was clearly no guarantee of the good life. Among the personal traits that seemed to be associated with adult success were the ability to set goals and the perseverance to achieve them. In addition, a stable marriage and a satisfying job also were related to happiness as an adult. If nothing else, then, the study underscored the fact that people with high IQs have basically the same needs and desires as everyone else. At best, they may just have a running start at fulfilling those needs.
Relevance to Modern Readers
Until recently, two-thirds of school districts in the United States used group intelligence tests on a routine basis to screen 90% of their students. The remaining 10% were given individual tests. Over the last few decades, though, concerns over the potential for error and bias have curtailed the routine use of group tests. Many states have passed laws banning the use of group test scores alone for placing children in different educational tracks. Nevertheless, group intelligence test scores are still sometimes used by school districts for educational planning. The scores can also identify children who might need more detailed assessment with individual tests.
Such children include not only those with developmental and learning disabilities but also those with special gifts. Thanks to Terman, the idea of IQ as an index of giftedness is firmly rooted in American society. Yet Binet himself had doubts about the ability of intelligence tests to identify gifted or talented individuals, and some modern psychologists share his concerns. It seems that highly creative thinking might, by its very nature, defy conventional testing. In addition, there are many kinds of valuable abilities—including musical, artistic, athletic, and leadership skills—only a few of which are tapped by standard IQ tests. Today, various states and school districts use a wide range of methods for identifying gifted students. Many use standardized tests. Along with intelligence tests, however, assessment for gifted programs may involve tests for creative thinking, artistic ability, leadership, or motivation. In addition, screening might involve non-test measures, such as teacher checklists, teacher or parent recommendations, or the student’s work in a portfolio.
Another common use for group tests is to help predict which high school students are likely to do well in college. The SAT is one familiar example of this kind of scholastic aptitude test. The scores from such tests tend to be highly correlated with IQ, and many psychologists regard scholastic aptitude tests as just another kind of intelligence test. Along with high school grades, SAT scores often play a big role in deciding who gets into a particular college and who does not. In 2003, 1.4 million high-school students took the SAT. Once those students are in college, similar tests are often used to help determine which of them will be admitted to graduate, medical, law, or business school.
The use of the SAT and similar tests to make educational decisions has long been the subject of controversy. The main advantage of the tests is that they make it easier to compare people coming from different schools and backgrounds. Grades are less comparable, since they reflect not only a student’s ability, but also the difficulty of the courses the student has taken and the standards of the school. On the other hand, SAT scores show nothing about factors such as a student’s motivation and work habits. Most psychologists now agree that, even when SAT scores are used, grades and other evidence of past performance also need to be considered.
Individual tests, such as the Stanford-Binet and Wechsler Intelligence Scales, are still widely used as well. They are often used for diagnosing specific educational or developmental problems. Public Law 94-142, the Education for All Handicapped Children Act of 1975, helped solidify this role in the United States for intelligence tests. This law, and others that followed it, required the development of individualized educational plans for children with learning, mental, and physical disabilities. A key step in developing such plans is evaluating each disabled child’s level of mental functioning.
Interestingly, the latter use for intelligence testing is very close to the one originally envisioned by Binet. In his critically acclaimed 1981 book The Mismeasure of Man, American paleontologist and author Stephen Jay Gould commented:
Ironically, many American school boards have come full cycle, and now use IQ tests only as Binet originally recommended: as instruments for assessing children with specific learning problems. Speaking personally, I feel that tests of the IQ type were helpful in the proper diagnosis of my own learning-disabled son. His average score, the IQ itself, meant nothing, for it was only an amalgam of some very high and very low scores; but the pattern of low values indicated his areas of deficit.
Indeed, many of Binet’s ideas still seem timely today, a century after he first stated them. Binet stressed that intelligence test results should never be used to label people as innately incapable. Instead, the results should be used to help people make the most of their inborn mental abilities. Some of Binet’s earliest followers failed to heed this part of his message. Over the years, however, society has learned from its mistakes. Most modern psychologists have come to appreciate the wisdom of Binet’s views.