Linking Theory to Evidence in International Relations

Richard Herrmann. Handbook of International Relations. Editors: Walter Carlsnaes, Thomas Risse, Beth A Simmons. Sage Publications. 2002.

As is clear in this handbook, the study of international relations includes diverse theories purporting to explain substantive patterns in world politics. The field is also characterized by different perspectives on how to defend these claims. One strategy, of course, is to connect the concepts that constitute a theory to observable indicators, spell out what expectations follow from the theory, and then demonstrate whether these expectations materialize or not. Although this positivist strategy sounds straightforward, its implementation is anything but simple. Demonstrating that the chosen indicators validly connect to the abstract concept is difficult, as is determining the specific expectations that should follow. The subsidiary theories, data generating methods and analytic techniques associated with these tasks also provoke controversy. Of course, some scholars conclude that trying to link theory and evidence in a positivist fashion is misguided. It confuses constructed concepts and categories with natural categories, treats created data as if they were facts, and employs the label of science in an effort to empower particular political preferences.

Despite the disagreements over how theory and evidence should be related there are benefits to attempting the endeavor. The process compels the production of specific definitions and concrete expectations. By taking the empirical step, it is easier to identify at what stage of the enterprise disagreement is most clear and what the substance of this disagreement is. One criticism of both rational choice and constructivist theorizing is that too little of it has made the connection to evidence. The confusion that can be generated when theories are not clearly linked to evidence is easy to spot. Fearon and Wendt, for instance, make a convincing case that the main differences between rationalist and constructivist theories are often misunderstood or misidentified as related to assumptions about the causal role of ideas or substantive assumptions about the motives of actors.

Connecting theory to evidence not only sharpens the understanding of theory, it also creates a common ground across the boundaries established by disciplines, sub-disciplines and intellectual communities. Most substantive questions such as why wars happen and what causes collective identities to form provoke research in many fields. Although scholars have reasons to claim their work is different and to align with people using similar languages and methods, focusing on the substantive question and the empirical evidence connected to theoretical answers can promote common ground. One purpose of this chapter is to explore this common ground and to review the progress and problems in connecting theory to evidence. This will entail crossing sub-community boundaries. For instance, it will include bridging rationalist interest in subjective expectation with social psychological research on perception, two literatures that have heretofore typically been seen as offering antithetical rational and irrational explanations. It will also include connecting the constructivist focus on collective identity formation with the literature on the emergence of nationalism.

Although rationalist and constructivist labels are relatively recent and often associated with a rather thin empirical record, the substantive research linking theories and evidence in international relations is large. Far too large to summarize in a single chapter. That is the task of the entire handbook. Instead, this chapter concentrates on the major efforts and problems encountered as rationalists and constructivists have attempted to link theory and evidence. I start with the rationalist tradition in its objectivist version. This has generated a huge body of work. I concentrate primarily on that portion of this research that focuses on war and security. Most of it takes as a starting assumption that there are rational actors sensitive to distributions of power. Following this, I turn to theories that abide by the most general form of rationalist models offered by Fearon and Wendt (that is Desire + Belief = Action). These are theories of purposeful action. Most of them relax both the assumptions of substantive and procedural rationality. They also emphasize the agent’s subjective understanding of the environment as distinct from the scholar’s objective description of the environment.

Following the discussion of how ideas have been studied at the level of agents, I turn to the study of ideas as constituting an element in the structure of the international system. I use the study of ideas and phenomenological perspectives as a bridge to cross from rationalist to constructivist endeavors. I pay special attention throughout to the tension within the constructivist umbrella between objectivist and phenomenological perspectives. I do this in some detail in the context of research on norms. Finally, I take up the question of where collective identities come from. This involves linking the constructivist approach to this question to the long-standing efforts to explain identity in comparative politics, sociology and history.

Rationalist Theories

Many theories of international relations assume that actors have a set of desires or motives and pursue these according to beliefs about the environment. Various forms of realism, for instance, accept the formula Desire + Beliefs = Action. Of course, determining an actor’s desires or motives is a difficult task. Hans Morgenthau (1973) argued that it was an impossible task. He explained that single motives, like national security or the desire for wealth, did not associate with single behaviors but could lead to many different behaviors. Specific behaviors, like defense spending or military intervention, also did not associate with only a single motive. The same behavior could be attributed to diverse motives. With no empirical way to infer an actor’s motives, Morgenthau suggested that motives be held constant and variation in action be explained by variation in the other variable, that is beliefs, especially beliefs about power. Assuming that whatever an actor’s desires were they would need power to achieve them, Morgenthau defined interests in terms of power.

The central realist simplification, treating means as a common aim, led to a focus upon beliefs about what was possible. Of course, determining an actor’s beliefs is also difficult. This information is central to strategic bargaining and actors have many incentives to disguise their real beliefs and to manipulate what other actors think their beliefs are (Jervis, 1972). Two approaches to the problem developed: I will label them objectivist and phenomenological.

The objectivist strategy assumes the external environment can be described by the scholar in terms that are objectively accurate. It then assumes that actors correctly see objective power distributions and incentives in the environment. This strategy leads to the study of the environment, especially the distribution of power. It also leads to the prediction of outcomes more than actions. Realists accept that states may misread the situation and make mistakes. The objective distribution of power, however, is assumed to determine the outcome of these actions. In addition, some realists add a social Darwinian notion, suggesting that actors that misread the situation are over time eliminated from the system, leaving actors that can be assumed mostly to understand objective reality.

The phenomenological strategy doubts that the scholar’s view of the situation and the actor’s view of the situation are likely to be the same. It also assumes that an actor’s action will follow from the actor’s perceptions not the scholar’s perception, no matter how objective scholars claim their view to be. The phenomenological strategy, therefore, puts primary emphasis upon the empirical identification of the perceptions and world-views held by actors. It seeks to explain action by referring to the cognitive understandings and ideas actors have, rather than searching for primary explanatory leverage in the objective structure of the environment. The phenomenological strategy focuses on action but also doubts the objectivist claim that environment determines outcomes. The main constraints in the system are typically seen to be the actions of the great powers. If these actions are determined by the perceptions held by the great powers, then predicting outcomes of interaction also requires knowing the perceptions of the key actors.

In both phenomenological and objectivist strategies, it is necessary to introduce auxiliary assumptions about what percentage of the opportunity (or perceived opportunity) that is available actors will seize. Often a maximizing assumption serves this purpose.

Although objectivist and phenomenological strategies are ideal-types and practical research often combines elements of both strategies, the differentiation captures an important distinction in efforts to link theory and evidence. I will look first at research in the more objectivist tradition and then turn to the phenomenological efforts. This sequencing should not be interpreted as suggesting one perspective supersedes another. Contemporary empirical research in both veins has proceeded simultaneously for many years.

Objectivist Strategies

Hans Morgenthau doubted that international relations could be dealt with appropriately by adopting the logic of science employed in the physical sciences. Despite the insight in his work, his theoretical formulations were inconsistent with positivist methods. Concepts were defined loosely, often without empirical referents, and causal claims were sometimes contradictory. For instance, power was defined in multiple dimensions with no strategy for aggregating a net power score, and while Morgenthau (1973: 4-16) argued that states defined their interests in terms of power and behaved accordingly, he also argued that balance of power systems were impossible in an era of mass politics (1973: 241-56, 327-37). Mass politics, Morgenthau argued, led states not to pursue their power interests but instead to pursue normative crusades, what he called nationalistic universalism.

Power determinism 

Scholars like William Riker (1962) recognized scientific shortcomings in traditional realist formulations and proposed more precise renditions that clarified the concepts and teased out individual causal claims. This led, at first, to stark power determinist models with explicit maximizing assumptions. The ideal-typical formulation paralleled B.F. Skinner’s (1960) model of personal behavior. That is, it assumed that actors had similar motives, mostly to survive and satisfy material needs, and that actions responded to objective incentives in the environment. The environment was characterized as anarchic, leading to a concentration on the distribution of power among actors assumed to follow self-help strategies. The empirical task then was to operationally define the variation in power and see if this corresponded to predicted variation in behavior.

Measuring power was not easy but indicators were devised (Cline, 1975, 1994; Knorr, 1975). The best known of these are associated with the Correlates of War (COW) project headed by J. David Singer. COW identified a set of indicators and treated them as objective measures of power (Singer and Diehl, 1991). Data sets were also created that measured the conflict and cooperation between states, which was taken to be the dependent variable in these models. The Correlates of War project focused on states involved in war. Subsequently, in the 1980s, these data were expanded to include states involved in militarized disputes (Gochman and Maoz, 1984). A number of events data projects sought to study a wider set of countries and a wider range of behaviors. They typically arrayed behaviors on a scale ranging from very cooperative behaviors (like unifying two states into a single state) to very conflictual behavior (like military attack). Events data sets like COPDAB (Azar, 1980), WEIS (McClelland, 1976, 1983) and KEDS (Gerner et al., 1994) drew information from news reporting services, papers and the wire, and expanded their coverage of sources as funds and technology permitted.

The expansion of sources reduced concerns that the editorial selection of news biased badly the events and behaviors reported. It did not, however, solve the problem of translating categories of behaviors into a scale that associated numbers with each category. To translate this categorical information into a scale across which interval distances would be substantively meaningful (that is a conflictual behavior coded as 4 points from the neutral point would be substantively twice as conflictual as a conflictual behavior coded as 2 points from the neutral point), scholars endeavored to create a weighted conflict-to-cooperation scale reflecting their agreed upon judgments about the relative cooperativeness and conflictualness of behaviors (Azar, 1980).
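The scaling step just described can be sketched in a few lines of Python. This is a minimal illustration only: the category names and weights below are hypothetical placeholders, not the published COPDAB values, and the helper names are invented for the example. The sketch shows what the interval assumption commits the analyst to once each category carries an agreed weight.

```python
# Hypothetical weights for illustration only; NOT the published COPDAB scale.
# Positive values are cooperative, negative values conflictual, 0 is neutral.
HYPOTHETICAL_WEIGHTS = {
    "voluntary_unification": 10.0,
    "economic_agreement": 4.0,
    "neutral_statement": 0.0,
    "verbal_threat": -2.0,
    "troop_mobilization": -4.0,
    "military_attack": -10.0,
}


def net_score(events):
    """Sum the weighted scores of coded events for one directed dyad."""
    return sum(HYPOTHETICAL_WEIGHTS[event] for event in events)


def relative_intensity(category_a, category_b):
    """Ratio of the two categories' distances from the neutral point.

    The ratio is substantively meaningful only if the weights really are
    interval-level, which is exactly the validity question at issue.
    """
    return abs(HYPOTHETICAL_WEIGHTS[category_a]) / abs(HYPOTHETICAL_WEIGHTS[category_b])


# A behavior coded 4 points from neutral counts as twice as conflictual
# as one coded 2 points from neutral.
ratio = relative_intensity("troop_mobilization", "verbal_threat")
period_total = net_score(["economic_agreement", "verbal_threat"])
```

Changing any weight changes every net score computed from it, which is why the judgments behind the weights carry so much of the burden in analyses built on such scales.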

The COW power measure and the events data sets refined increasingly reliable methods so that measures were reproducible and consistent across cases. This did not, however, eliminate concerns about the validity of the measures. In terms of power, the indicators originally employed by COW emphasized material resources and industrial capacity, leaving aside aspects of power associated with the government’s ability to mobilize people, lead wisely and take advantage of geostrategic bargaining leverage (Baldwin, 1979). The weighted COPDAB scores also raised concerns about validity. Moreover, events data sets concentrated only on bilateral directed behaviors and resisted reading three-way significance into behavioral moves. Consequently, positive moves toward one country would not be coded as negative moves toward a third country, making it difficult to capture political moves like Washington’s playing of the China Card against the Soviet Union (Goldstein and Freeman, 1990). Reading in meaning of this sort required substantial area expertise coders may not have had, and more importantly, it injected still more subjective interpretation into the collection of what was seen as objective data.

Empirical tests of power determinist models did not affirm the accuracy of these models (Ferris, 1973; James, 1995; Sullivan, 1990). Behavior was not predictable from power distributions alone. To refine the basic theory, attention turned to the concentration of power in the system (Mansfield, 1994: 71-116) and to the changing distribution of power. Theoretically, the basic model was adjusted to expect conflict as more likely when relative power between states was in transition and uncertainty about the likely outcome of conflict was high (Gilpin, 1981; Organski, 1968). These theoretical modifications were also coupled with attempts to improve the measure of power, for instance by including indicators of the government’s ability to tax and mobilize the polity (Kugler, 1996; Organski and Kugler, 1980). These models did better empirically but still fell short of aspirations.

Power activation

Power was found to limit a state’s options but typically not determine its behavior. Within the parameters of the options available, there evidently was still substantial choice. Also, leaders of states may have understood the power circumstances differently than the objective measures indicated. Theoretically, power theory could take this into account by concentrating on perceived power, that is the power situation as understood by the actor (Christensen, 1997; Wohlforth, 1993). When models highlighted perceptions of shifting relative power, they seemed to produce more empirically accurate predictions about behavior (Ferris, 1973). Although some realists made this adjustment as if it were consistent with the basic objectivist paradigm, others saw the contradiction and resisted the reformulation on essentially phenomenological terms (Powell, 1991). Of course, shifting from a conception of objective power to a conception of perceived power raises difficult empirical problems. The perception of power must be determined in operational terms independent of the observed behavior, otherwise the explanation is tautological. It is not clear how to establish these perceptions for many states across the last two centuries and produce a measure commensurate with the scope of the existing COW data.

Rather than shift to perceived power and phenomenological premises, some power-based theories revised basic assumptions about the motivation of actors. In ideal-typical power determinism, the motives of actors are held constant as power maximization. This allows the model to make a prediction about behavior from the empirical analysis of relative power and objective opportunities in the environment. If the state does not behave as expected, or the expected outcome does not occur, for instance the much more powerful state concedes to the much weaker state, this can be attributed in a post-hoc way to a lack of desire, will or insufficient motivation stemming from a substantive understanding of interests. States, after all, may not exert 100 per cent of their capability in all situations and instead activate different amounts of their power depending on the interests at stake.

By moving toward power activation theories, as James March (1966) called them, we can capture this possibility in our models, but as Robert Keohane (1984: 35) points out, only at a substantial cost. When motivation level varies, as well as beliefs about power, we cannot solve the equation, that is predict action from the empirical estimate of relative power. Instead, we can explain any action post hoc by referring to various degrees of power activation. To avoid this problem an actor’s motivation needs to be set by assumption or identified empirically. Neorealist theorizing has pursued both avenues, devoting most attention to the former.
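One way to see the identification problem is to write the stylized formula as a single equation. The additive form and the symbols A (action), D (desire or motivation level) and B (beliefs about power) are only shorthand for the chapter's Desire + Belief = Action; they are an assumption made for illustration, not a model proposed by any of the authors cited.

```latex
\[ A = D + B \]
\[ D = \bar{D} \;\Rightarrow\; A = \bar{D} + B
   \quad \text{(motive held constant: $A$ is predicted from $B$)} \]
\[ D = A - B
   \quad \text{(motive free to vary: any observed $A$ is matched by some $D$)} \]
```

With D fixed by assumption, an empirical estimate of B yields a prediction that can fail. With D free, the same estimate of B is compatible with every observed action, so the account can only be applied after the fact.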

Kenneth Waltz (1979) revised power theory by substituting the power maximizing motivational assumption with a security maximizing assumption. Determining the objectively best way to maximize security is, of course, no easier than establishing objectively the best way to maximize power, but as long as the maximizing assumption is in place empirical attention is focused on relative power. Waltz’s formulation, however, also acknowledged the power activation issue and argued that states did not always seek maximum security. For instance, states did not always pursue relative gains when this meant forgoing absolute gains. Waltz (1979: 102-28) spelled out some of the conditions that lead states to pursue relative gains. Joseph Grieco (1990) and Duncan Snidal (1991) have refined the theory further, paying special attention to the delineation of these conditions. If these conditions were defined primarily in objective terms, then the basic scientific perspective could be sustained. Powell (1991) has tried to remain within these parameters, associating the activation of relative gains behavior with situations in which the use of force is involved. Whether these situations, especially in a nuclear age when force is often symbolically engaged, can be determined in any agreed upon way remains to be seen.

Joseph Grieco (1990: 40-50) associated the activation of relative gains behavior with perceptions of future power, common enemies and past relationships. In this regard, his reformulation of neorealist theory is similar to Stephen Walt’s. Walt (1987: 22-6, 263-6) built a model of alliance formation that attributed behavior to aggregate power, proximity of actors, offensive capability and the perceived intentions of other actors. This balance of threat theory allowed the activation of power to vary, but like Grieco’s formulation rested the operational identification of the factors predicting this variation on phenomenological factors. There is nothing wrong with this theoretically, but it has serious implications for empirical testing. To make the model produce predictions, we must operationally and empirically identify these actor perceptions. On this front, neorealists offered little instruction. Rather, they frequently illustrated the basic function of their theory, treating phenomenological variables as mostly non-controversial facts. In other words, rather than inventing a method for determining what an actor’s perceptions of threat might be, they assumed these perceptions were known to area experts and/or historians.

Another way to introduce variation in power activation is to assume states are of different types. Some are offensively motivated and others are defensively motivated. Morgenthau made this distinction, although he quickly pointed out that differentiating one from the other in practice is nearly impossible. Subsequently, neorealists have returned to differentiating between offensive and defensive states but have not addressed the central empirical dilemma of how to tell one from the other (Glaser, 1997; Rose, 1998; Schweller, 1996; Snyder, 1992: 11-12).

If offensive and defensive distinctions are to be central parts of international relations models, then we need to have methods for empirically distinguishing one type from the other. James Fearon (1995) has shown in formal terms why this endeavor will not be a simple one. Although propagandists may assert with great confidence what the motives of other states are, Fearon shows that careful observers will always entertain doubts. This is because actors have incentives to mislead other actors with regard to what motivational type they are and because actors face serious limits in the credible commitments they can make and thus face inherent limits in their ability to signal what type they are.

The most successful effort to explain variation in conflict and cooperative behavior that abides by the core premises of the scientific paradigm is democratic peace theory. In this paradigm, relative power is controlled for statistically and the conflict and cooperation among a pair of states is attributed theoretically to the types of governments in the interacting states. The distinction is between democratic and non-democratic governments. The indicators of each are specified and fairly large data sets have been collected, categorizing states accordingly. The Polity data is best known in this regard (Jaggers and Gurr, 1995).

Democratic peace theory predicts war will not occur between two democracies or at minimum will not occur as often, and if it does occur will remain at a lower level of violence, than between other pairs of states (Russett, 1993). The empirical test of the theory involves associating the type of government in states and the wars between states. Questions have been raised about the validity and reliability of measures of democracy (Gleditsch and Ward, 1997). Scholars have debated whether democracy should be measured in terms of elections or civil liberties (Owen, 1997) and whether differentiations should be made between consolidated democracies and democratizing states (Mansfield and Snyder, 1995). Questions have also been raised about the association between regime type and war, focusing attention on the sample of pairs of states that have been studied and the statistical implications of different sampling choices in this regard (Russett, 1995; Spiro, 1994). Although these debates have been important, they have taken place within the basic parameters of the normal positivist paradigm. If we agree for sake of argument that the statistical relationship exists, we are still left with substantial uncertainty about how the mechanisms inside this relationship operate.
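The basic association at stake can be illustrated with a minimal Python sketch. The dyad-year records below are invented for the example and are not drawn from Polity, COW or any other data set; the function name and record layout are likewise assumptions. The sketch omits the statistical control for relative power that the published studies apply, and it shows only the raw cross-tabulation of dyad type against war.

```python
from collections import defaultdict


def war_rate_by_dyad_type(dyad_years):
    """Share of dyad-years with war, split by whether both states are democratic.

    dyad_years: iterable of (a_is_democracy, b_is_democracy, war_occurred).
    """
    counts = defaultdict(lambda: [0, 0])  # dyad type -> [wars, dyad-years]
    for dem_a, dem_b, war in dyad_years:
        key = "joint_democracy" if (dem_a and dem_b) else "other"
        counts[key][0] += int(war)
        counts[key][1] += 1
    return {key: wars / total for key, (wars, total) in counts.items()}


# Invented records for illustration only.
hypothetical_dyad_years = [
    (True, True, False),
    (True, True, False),
    (True, True, False),
    (True, True, False),
    (True, False, True),
    (True, False, False),
    (False, False, False),
    (False, False, True),
]

rates = war_rate_by_dyad_type(hypothetical_dyad_years)
```

Which dyad-years enter the list, and how each state is coded as democratic or not, are precisely the sampling and measurement choices debated in the literature cited above.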

The most common form of the democratic peace theory argues that it is as if democratic states share norms of compromise and expect peaceful and fair outcomes when dealing with other democracies (Russett, 1993). It could also be that trust develops because of strategic associations that vary with regime type in the period after the Second World War (Gowa, 1999). Moreover, it could be that leaders of certain personality types prevail more often in democracies (Hermann and Kegley, 1995). As might be expected in positivist science, models of sub-mechanisms consistent with the overall theory have led to additional empirical investigation. They have also, however, evoked concepts like trust, shared norms and expectations, whose verification at the micro-level requires investigation into the black box of actor decision-making. Unlike neorealists, the democratic peace theorists have not imported into their objectivist theory phenomenological concepts or concepts that are not operationally defined, but their research agenda has demonstrated the need for a bridge between objectivist and phenomenological perspectives.

Phenomenological Strategies

In the 1950s and early 1960s, while scientific realists operating in the objectivist perspective sought to refine power theory other scholars developed an essentially phenomenological perspective. These phenomenological scholars conceived of the relations between states as the product of the actions of individual states and they believed the values, mindsets and beliefs of actors guided these actions (Kelman, 1965; Sprout and Sprout, 1965). Scholars therefore turned to the study of foreign policy decision-making (Hudson, 1995) and in particular to the identification of the cognitive lenses through which actors understood the world. Richard Snyder, H.W. Bruck and Burton Sapin (1962) offered a framework identifying key concepts that could be used to describe such a mediated decision-making process, including the values, mindsets and domestic players that comprise them. Kenneth Boulding (1956, 1959), meantime, argued that the cognitive images leaders have of other countries guide choices about action and that the two most important components of this image are perceptions of the threat or opportunity the other country poses and the perception of the other country’s capability. He argued that by identifying the factors empirically, foreign policy action could be explained.

In the 1970s, Michael Brecher (1972) offered an elaborate conceptual framework with which to study Israeli decision-making, illustrating how the basic phenomenological argument could be connected to empirical evidence in a specific case. Robert Jervis (1976) illustrated how a phenomenological perspective modified international relations theory and drew attention to the advances made in social psychology. He identified substantive common misperceptions and a host of common perceptual tendencies that could guide the empirical study of world-views and beliefs. Robert Axelrod (1976) proposed a strategy for mapping an actor’s cognition, including the actor’s central concepts, objectives and causal beliefs.

The focus on cognitive and decision-making variables in international relations ran parallel to the cognitive revolution in social psychology and the social sciences more generally (Gardner, 1985). This revolution advanced the proposition that human cognition was not predictable from environmental factors alone. It did not contend that actor cognition is unaffected by environmental forces, personality characteristics and personal experience. What it argued was that the processes and factors affecting the formation of an actor’s images and understandings of the world are so complicated, with so many possible causes, that it is not adequate in scientific terms to assume the scholar can know what the actor thinks without direct empirical investigation of this matter. In psychology, this meant including manipulation checks in experiments to tap directly what participants were thinking rather than just assuming they were apprehending the experimenter’s manipulation as planned. In international relations and other natural settings, it meant devising strategies for identifying the mindset of leaders and other actors rather than assuming scholars could predict what these must be from the scholar’s construction of the environmental situation.

Phenomenological models connect cognitive and decisional concepts to predicted international actions. Alexander George made an initial effort to spell out the causal nexus between operational codes and action (George, 1979). Stephen Walker (1977) and Harvey Starr (1984) went further in this vein. A cognitive model that has been developed extensively is the inherent bad faith model (Holsti, 1967; Stuart and Starr, 1982). In this model, an enemy image with relatively few sub-concepts is predicted to produce aggressive action toward the country seen in enemy terms. The enemy image is also predicted to be invulnerable, or at least highly resistant, to disconfirming information. The internal validity of the model has been shown to be quite strong empirically and it has been applied in a number of concrete analytic settings (Herrmann et al., 1997; Silverstein, 1989).

The most influential framework in a basically phenomenological perspective is the subjective expected utility (SEU) paradigm. In this framework, the most important concepts are the actor’s values and the actor’s perceptions of the situation which give rise to the actor’s expectations about the utility of action. As Herbert Simon (1985, 1995) has argued, if this framework is to say more than that people have reasons for what they do, then it is necessary to identify empirically both an actor’s values (motives from which utility is established) and an actor’s perceptions (subjective constructions of situations and beliefs about causes). It is also necessary to introduce a subsidiary theory specifying how calculations are made, such as (1) a statistically rational maximizing theory, (2) a satisficing theory that assumes actors will not search indefinitely for the optimal choice but take the first one that is satisfactory (Simon, 1979, 1982), and (3) a prospect theory that argues that the framing of probabilities in terms of gains and losses affects choice (Kahneman et al., 1982; Levy, 1992a, 1992b, 1997).
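The three subsidiary calculation rules can be contrasted in a short Python sketch. Everything here is an illustrative assumption: the option names, probabilities and payoffs are invented, and the value-function parameters (0.88 and 2.25) are the commonly cited estimates from the prospect theory literature, used only as defaults. The sketch also leaves out probability weighting, which full prospect theory includes.

```python
def expected_utility(option):
    """Probability-weighted sum of outcome values for one option."""
    return sum(p * value for p, value in option["outcomes"])


def maximizing_choice(options):
    """Rule (1): choose the option with the highest expected utility."""
    return max(options, key=expected_utility)["name"]


def satisficing_choice(options, aspiration):
    """Rule (2): take the first option, in search order, that meets the
    aspiration level; return None if no option does."""
    for option in options:
        if expected_utility(option) >= aspiration:
            return option["name"]
    return None


def prospect_value(x, alpha=0.88, loss_aversion=2.25):
    """Value of a gain or loss x measured from a reference point of 0.
    Losses are weighted more heavily than equal-sized gains."""
    if x >= 0:
        return x ** alpha
    return -loss_aversion * ((-x) ** alpha)


def prospect_choice(options):
    """Rule (3), simplified: rank options by probability-weighted
    prospect values instead of raw outcome values."""
    def score(option):
        return sum(p * prospect_value(value) for p, value in option["outcomes"])
    return max(options, key=score)["name"]


# Invented example: both options are framed as losses from the status quo.
options = [
    {"name": "accept_settlement", "outcomes": [(1.0, -50.0)]},
    {"name": "escalate", "outcomes": [(0.5, -105.0), (0.5, 0.0)]},
]

by_maximizing = maximizing_choice(options)         # compares -50.0 with -52.5
by_satisficing = satisficing_choice(options, -60)  # first option clearing -60
by_prospect = prospect_choice(options)             # risk-seeking in losses
```

With identical values and perceptions, the maximizing rule selects the settlement while the simplified prospect rule selects escalation, and the satisficing rule's answer depends on the order in which options are considered. This is why the subsidiary theory has to be specified before the framework yields a prediction.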

It is important not to confuse subjective expected utility theory with an objectivist version of scientific realist theory. In the latter, objective incentive structures are presumed to exist in the environment and the behavior of actors is predicted from a theory that says they will behave ‘as if’ they understood these incentive structures and calculated in a statistically rational power maximizing fashion (Lake and Powell, 1999). SEU theory emphasizes the subjective character of utility calculations, allowing actors to define different utility hierarchies and to operate with different cognitive constructions of the environment. This distinction in international relations research has been blurred.

The language of a phenomenological subjective expected utility framework has been adopted but often coupled with an empirical strategy that remains essentially objectivist and non-phenomenological. For instance, in Bruce Bueno de Mesquita’s (1981) influential book The War Trap, and later in his book with David Lalman (1986), War and Reason: Domestic and International Imperatives, the central variables of the expected utility model, that is utilities and expectations, are measured with objectivist indicators. Utilities are estimated by comparing the objective configuration of alliance portfolios and expectations are measured by assuming they equal the objective measure of power as indicated in the COW data with an objective discount factor built in to account for logistic complications that increase as the proximity of the combatants decreases. The conceptual discussion organizes the traditional power determinist data into a decision-making language, but the empirical test still relies on associations of COW measures of power complemented by the contribution made by an ally which is now treated as utility rather than as a modifier of power (for treatment of alliances in COW, see Singer, 1990).

Other studies, like Dan Reiter’s Crucible of Beliefs: Learning, Alliances, and World Wars, have also clearly adopted the language and labels of phenomenological perspectives but in operational terms treat these concepts as if they were determined objectively by the environment as the scholar understands it. Reiter (1996: 85), for instance, assumes that an actor’s perception of a great power’s intentions, that is, revisionist or status quo, corresponds to whether the great power initiated a militarized crisis with another great power or regional ally of the great power in the previous year, with this objective picture of the previous period constructed by the scholar. Similarly, Reiter (1996: 86) determines an actor’s perception of the probability that war will break out by examining the relative share of great power capabilities held by the potential revisionist, using COW data to determine capabilities.

Another method for linking cognitive concepts to empirical evidence is to study an actor’s statements and choices. The study of statements has been done with systematic content analysis, linguistic discourse analysis, closed-ended survey instruments, structured interviews, free-wheeling interviews, as well as through focus groups and interactive dialogues. Although some of these methods have produced reliable measures for an actor’s values and perceptions, they have not been employed to generate large cross-national data sets that are commensurate with the Correlates of War coverage of states over time. Most of these methods remain labor-intensive and often require highly skilled labor at that, meaning people who have language and area expertise sufficient to carry out the interviews and who can construct instruments embedded in the context and vernacular of the actor.

The construction of valid observable indicators for phenomenological variables cannot be accomplished by improving the reproducibility of measures alone. It is naive both for philosophical and political reasons to believe that a scholar can simply listen to what an actor says or watch what they do and know how to describe the actor’s values and world-views. The very ideas of values and world-view are concepts that belong to the scholar, not the observed actor. Unless we believe in pure induction unmediated by language and concepts, it is impossible to approach this task except in a deductive manner. That is, the scholar needs to develop concepts, perhaps in an interactive fashion with the actor, and then devise propositions about what concepts might accurately describe the mindset of the actor, and finally employ questions designed to probe these possibilities or watch for choices deemed by the scholar to reveal these mental dispositions. The investigation can be carried out with different techniques as noted above but typically follows this basically positivist logic. Perhaps several illustrations will make the point clearer.

For example, if Gorbachev is the actor under study, then competing models of his values and world-views could be constructed. In the case of each model of Gorbachev, the scholar could reason that if this model is an accurate representation, then we ought to see Gorbachev saying and doing a set of predicted things. William Riker (1995) suggested we would be wise to systematize in formal language the logic of these deductions and then examine the empirical record to see which model made the best predictions. In this way, international relations research could parallel economic research using a modified version of searching for revealed preferences. Of course, as in economics, if these preferences are then used to explain the action from which they are inferred, the enterprise becomes tautological. Consequently, the inferred values and perceptions need to be derived from statements and actions different from those they are being used to explain. This can be done by focusing on statements in a different domain than international relations, or by using the values and perceptions inferred in one time period to predict action in another time period, the way economists use revealed preferences to predict future market trends.

Another approach to the Gorbachev task would be to conduct in-depth interviews and analyze text. Rather than spelling out models, scholars might prefer to reproduce the narrative Gorbachev offers. There will be by necessity selection of text in this process. Only part of the narrative will be reproduced. Inevitably, this will include the part of the text the scholar believes is most reflective of Gorbachev’s values and world-view. Typically, scholars will spell out the inferential logic they have employed, but, at times, they may simply assert expertise and their feel for the region and subject. This, of course, does not make their representation invalid, but it does make it impossible to reproduce. Because pure induction is impossible, and, unless we accept illogical reification claiming that the mental concepts used by the scholar are actually in the head of Gorbachev, there is no way to avoid the deductive leap whether it is spelled out or not.

Many phenomenological approaches rely on concepts and theories drawn from psychology, not because they believe actors are irrational as much as they believe that to understand action we need to appreciate the actor’s point of view not only the scholar’s. Instead of comparing the actor’s construction of reality to the scholar’s and declaring deviations irrational, scholars operating in a phenomenological perspective forgo this comparison and use psychology and the cognitive sciences to refine the conceptual apparatus they use to represent actor values and world-views (Cottam, 1977; Herrmann, 1985). By anticipating that actors will organize their thinking in schematic and script-like fashion, use existing information to bias incoming information, allow emotion to affect memory and cognitive processing, and combine information in ways that do not follow statistical rules, scholars in this tradition hope to devise conceptual models that are more accurate and useful representations of the observed world (Herrmann et al., 1997; Kahneman and Tversky, 2000; Vertzberger, 1990). The phenomenological tradition in international relations does not abandon the positivist approach to science. It applies it to identify actor values and beliefs instead of the nature of the international environment and builds a picture of the environment by combining the pictures drawn of individual actors.

Treating international actors as anthropomorphic agents with psychological properties requires a simplification that presents formidable obstacles in the path of empirical research. First, these approaches need to define operationally whose values and mindsets matter and, equally, whose voice and choices should be studied. Does the top leader matter or are the beliefs and world-views of other people in the polity relevant? Second, if multiple people matter, then how should we aggregate and construct a picture of them as a collective actor? How can we decide what portion of whose point of view to include in the overall representation? Likewise, how can we guard against picking and choosing from among many different people’s statements and choices and constructing a representation of the group that is driven by our pre-existing biases and impressions? Moreover, how should we generalize to the group from the individual data we have? The problems with the stereotyping of out-groups, attributing to them essential and immutable features, are well-known.

Although rationalist perspectives conceive of agents taking purposeful action, they recognize that any single actor is part of a social system which entails multiple actors interacting. There are several ways that rationalist theories have studied the properties of systems and interaction.

Rationalist Theories and Interaction

Both objectivist and phenomenological perspectives have developed theories that explain the interaction between multiple actors and, at the same time, describe the system as a whole. Modeling behavior such as arms races provided an early application of game theory to international relations (Richardson, 1960). These models posited decision rules for two actors and then predicted the pattern of interaction in the system. By constructing multiple models that made distinct predictions about the pattern in the interactive acquisition of weapons, these models could then be compared to the historical record to see which models were most accurate.

For example, the interactive process could be modeled as one of reciprocity in which one actor’s acquisition of weapons was met by an equivalent counter-move by the other actor. Alternatively, a model of inverse reciprocity would predict that one actor’s acquisition or demonstration of strength would lead to appeasement and compromise on the part of the other actor. Russell Leng (1984, 1993) used such models to study the Soviet–American Cold War and other bilateral crises. He concluded that the Cold War evidenced more tit-for-tat spiral escalation than peace-through-strength inverse reciprocity. William Gamson and Andre Modigliani (1971) employed a similar strategy to test alternative theories of the Cold War that were explicitly phenomenological in form.
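The logic of comparing rival interaction models against a record of moves can be sketched in a few lines of Python. This is an illustration only and not a reconstruction of the procedures used in the studies cited: the response rule, the gain parameter `k`, and the two short series of coded moves are hypothetical, and fit is scored with a simple sum of squared errors.

```python
def predict_responses(moves_a, k):
    """Predict B's move in each period as k times A's move in the prior period.

    k > 0 models reciprocity (B matches A); k < 0 models inverse
    reciprocity (B backs down when A shows strength).
    """
    return [k * move for move in moves_a[:-1]]


def squared_error(predicted, observed):
    """Sum of squared differences between predicted and observed moves."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


# Hypothetical coded moves: positive = hostile or arming, negative = conciliatory.
moves_a = [1.0, 2.0, 2.0, 3.0, 1.0]
observed_b = [1.0, 2.0, 1.5, 3.0]  # B's moves in periods 2 to 5

models = {"reciprocity": 1.0, "inverse reciprocity": -1.0}
errors = {
    name: squared_error(predict_responses(moves_a, k), observed_b)
    for name, k in models.items()
}
best_model = min(errors, key=errors.get)  # model with the smallest error
```

For this invented record the reciprocity model fits far better than the inverse reciprocity model. With real data, the coding of moves and the choice of lag would themselves be contestable steps in linking the theory to evidence.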

Game theorists have explored the logical dynamics of various bargaining relationships, identifying in formal mathematical terms how different relationships work, for example, how a prisoners’ dilemma game differs from a chicken or stag hunt game. A number of game theorists have tried to gauge the resemblance between the observed world and their formal models of various systems of interaction (for example, Bueno de Mesquita and Lalman, 1986; Mansfield et al., 2000; Niou et al., 1989). Of course, judgments about this empirical resemblance can produce as much controversy as the explication of the logic of the model (Walt, 1999).
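One common way to state the formal difference between these games is through the rank ordering of each player's payoffs in a symmetric two-by-two game. The sketch below assumes the conventional textbook labels T, R, P and S, which do not appear in the text above, and handles strict orderings only; the function name is hypothetical.

```python
def classify_game(T, R, P, S):
    """Classify a symmetric 2x2 game by the ordering of one player's payoffs.

    T: payoff for defecting while the other cooperates (temptation)
    R: payoff for mutual cooperation (reward)
    P: payoff for mutual defection (punishment)
    S: payoff for cooperating while the other defects (sucker's payoff)
    """
    if T > R > P > S:
        return "prisoners' dilemma"  # defection dominates
    if T > R > S > P:
        return "chicken"  # mutual defection is the worst outcome
    if R > T > P > S:
        return "stag hunt"  # mutual cooperation is the best outcome
    return "other"
```

For example, `classify_game(5, 3, 1, 0)` returns "prisoners' dilemma", while swapping the two lowest payoffs, as in `classify_game(5, 3, 0, 1)`, returns "chicken". The empirical difficulty noted in the text is deciding which ordering, if any, describes the preferences of the actors in an observed relationship.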

Employing simulations is another strategy for studying whole systems as patterns of interaction. Simulations allow scholars to re-run their experiments many times to see if their models consistently produce similar outcomes and to identify the consequences of manipulating different possible causal factors. This experimental strategy is not available to history-based research in any fashion other than counterfactual arguments. Simulations provide a systematic method for running such counterfactual thought experiments. For example, Robert Axelrod (1984) used simulations to explore the logical outcomes expected of tit-for-tat strategies given different types of actors. Lars-Erik Cederman (1997) built a computer model to represent the formation and interaction of nations. In Cederman’s simulations, distributions of national strength emerge over time as the actors interact. These emergent structures describe the history of the particular simulation. Cederman uses this method to test whether structural distributions occur and associate with war in the fashion realists expect. Just as with game-theoretic models, scholars could compare the predicted patterns Cederman’s models produce to a historical record of the observed world, and, in this fashion, judge the accuracy of competing models.
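A toy version of this kind of simulation can be written in a few lines of Python. The sketch below is not Axelrod's tournament code or Cederman's model. It simply pits a tit-for-tat strategy against two hypothetical types of opponent in a repeated game, using payoff values that are assumed here for illustration.

```python
# Hypothetical payoffs for (own move, other's move); "C" = cooperate, "D" = defect.
PAYOFFS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the other side's previous move."""
    return other_history[-1] if other_history else "C"


def always_defect(own_history, other_history):
    """Defect in every round regardless of history."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Run a repeated game and return each side's total payoff."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFFS[(move_a, move_b)]
        score_b += PAYOFFS[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

Calling `play(tit_for_tat, tit_for_tat)` and `play(tit_for_tat, always_defect)` re-runs the same interaction with a different type of opponent, which is the kind of controlled manipulation that history-based research can approximate only through counterfactual argument.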

Game theory and simulations include sub-models of actors, but they typically focus on evidence related to interaction. In other words, they do not establish operational indicators for each actor’s motives and perceptions, but instead posit these in a model and deduce from the model what the interaction would look like if these assumptions about motives and perceptions were accurate. Scholars proceeding this way then compare the type and amount of conflict and cooperation expected in the model of the systemic relationship to empirical evidence regarding these matters. It is also possible to identify empirically the values and world-views of two or more actors and then predict interaction from these estimates. Herrmann and Fischerkeller (1995), for instance, have argued for a cognitive approach of this type. They identify empirically the world-views in Iran, Iraq, the United States and the Soviet Union, use these to predict action and use the simultaneous and lagged prediction of strategic action to predict the pattern of interaction. To avoid some of the problems of trying to associate a motive and world-view with a single act, they employ a concept of a strategic script grouping events into sets that have strategic meaning.

Although interaction and the whole system of relations can be built by aggregating up from models that start with agents, it is also possible to concentrate on the character of the whole and consider the effect of the system on individual units. Given how complex the system of interaction can be in international relations, just describing the constitutive parts of the system can be a demanding theoretical task. Constructivists have emphasized at least two tasks that are part of examining the whole: (1) identifying the most important ideas that actors share and which thus define relationships, and (2) critically examining the power relationships that are embedded in the language scholars use to describe the whole and the political stage the scholar is acting upon.

Constructivist Theories

Critical Theory

Both objectivist and phenomenological theories of purposeful choice operate as if the scholar is not part of the political process being examined. Phenomenological approaches, for example, although not assuming that an objective reality leads to common perceptions among actors, treat the description of actors with distinct world-views interacting with one another as an objective description of the relationship. Conceiving of the scientific enterprise as comprising two distinct worlds, one that is conceptual and theoretical and the other which is empirical and observed, is at the core of positivist perspectives but it is not immune from criticism. A very common criticism is that the pictures scholars draw become part of the political process they are describing and thus affect the process being explained, either reinforcing it or changing it.

Phenomenologists argue that the social sciences are not similar to the physical sciences because subjects can be creative and proactive in ways objects cannot be. Critical theorists go further and suggest that the social sciences are also different than the physical sciences because the subjects of study are affected by the knowledge about them that is created by the scholars. Typically, patterns found in the physical movement of objects are not changed by the scholarly theories that explain these patterns. Human subjects, on the other hand, can learn from social science and this learning can produce change in subsequent behavior.

Because the production and dissemination of social science theory can affect the future behavior of those who come to believe it is true, the scholarly enterprise becomes part of the political interaction between actors. This leads to a concern that the construction of concepts, models and empirical testing is part of a strategic agenda serving material self-interests not simply academic ends. For instance, the conceptualization of the environment as anarchic, governed only by rules of self-help, may appeal more to powerful states than to weaker ones and make realist conceptions more popular in superpowers like the United States. Conceptions that argue that societal bonds and norms govern behavior in the environment, on the other hand, may be more popular in less powerful states, for instance, the contemporary United Kingdom (Bull, 1977; Buzan, 1993).

The effects of theory as it becomes part of the mental world of contemporary actors can be diverse. When leaders reify models, they turn them into self-fulfilling prophecies. When they decide to undo the previously observed pattern, now that they understand what it is, they create a new reality. For example, if the Cold War was a spiral model of mutual suspicion, then once this neorealist insight became part of the mindset of leaders they could see the dynamic of the security dilemma and act to change it (Osgood, 1962; Wendt, 1992). This philosophical point, perhaps most associated with Hegel (1952) and the early work of Marx (Marx, 1975: 57–198), has concrete implications for international relations theories. For instance, it leads to the expectation that ideas that become institutionalized change actors. They can, therefore, persist beyond the confines of the original agent-based calculations that led to their initial creation.

Institutions and Ideas

Ernst Haas (1990) has shown how this process can work. He argues that international organizations can promote certain ideas and establish a way of thinking about issues that then affects the way states come to understand the issues and identify their own interests. Haas begins with organizations coordinating affairs in technical domains where scientific expertise is often respected and shows how the adoption of the technical language and mindset common in the international organization can affect processes in the state. The evidence used to support this theoretical claim often includes a set of case studies of international organizations and states.

The empirical strategy typically involves showing that ideas popular in the organization come to be accepted in later periods by key leaders in the participating states. Often the causal claim rests primarily on the presentation of a sequential time-line emphasizing that the idea was evident in the international organization before it was evident in the top leadership circles of the state (Finnemore, 1996). This method can include an effort to trace the process by which the idea moved from the international organization to state-level discussions about interests. One way to strengthen the causal logic that is not always a part of these efforts would be to include a correlational logic. This would explore whether states that belong to an organization adopt different ideas than states that do not belong to the organization.

A somewhat different neoliberal theory of institutional effects has been developed by Robert Keohane (1989, 1993). In this theory, institutions promote cooperation by managing both communication inefficiencies and risks that are inherent in international relationships. By providing verification of compliance with agreements, early-warning of defections and sanctions of some sort for violation as well as mechanisms for adjudication, some institutions help actors overcome security dilemmas (Keohane and Martin, 1995). Neoliberal institutional theories have been tested empirically quite often in the economic realm. Some efforts have also been undertaken in the security realm (Wallander, 2000). The evidence in these studies typically involves a measure of institutionalization and a measure of cooperation. Both are typically treated in objectivist terms and a correlation is sought between higher and deeper levels of institutionalization and higher levels of cooperation.

Of course, a relationship between institutionalization and cooperation does not necessarily sustain the causal claim that institutions cause cooperation. The causal arrow could point in the opposite direction. The argument stressing the causal significance of institutionalization and membership in shared ideational communities often rests on a claim regarding the effect institutions have on agents, particularly on the ideas, identity and understanding of self-interest that drives agent behavior. How much institutions can change agents is a question that sits at the crux of contemporary debates between realists and neoliberal institutionalists (Jervis, 1999).


Theoretical development 

The broad constructivist argument that the ideas instantiated in institutions affect the identity of members has been investigated empirically in the domain of norms.

The theoretical argument is that, unlike coercive material power that can change behavior by compulsion, norms affect behavior by changing an actor’s motives and beliefs, that is, their understanding of their interests. Norms produce, therefore, not only a logic that spells out the consequences of what will happen if they are violated but also a logic of what behavior is appropriate, that is, what someone ought to do. The instantiation of norms in institutions is expected to socialize actors, both those in the institution and those who want to join, and produce in these actors a sense of what they ought to do, and, in turn, affect how they behave. The strongest test for this constructivist theory is to show that in the consciousness of actors the logic of appropriateness is operative more than a utilitarian logic calculating material consequences.

Martha Finnemore (1996) examines UNESCO and the creation of state science bureaucracies, the Red Cross and the operation of the Geneva Conventions, and the norms established in the World Bank and strategies for dealing with poverty. She argues across these cases that the norms instantiated in the international institution led states to re-evaluate what their national interests were and to adopt the behavior identified by the institution as appropriate even when there was no compelling material reason for this choice. Finnemore’s empirical strategy is to trace the evolution in thinking about these matters inside the states and to demonstrate both that the ideas for change came from the international institutions and that the sort of reasoning they led to was not simply utilitarian but deontic, meaning states came to believe that certain behaviors were appropriate. The evidential base includes statements made by officials taken to represent the state, funding decisions taken by the state, and changes in state-level bureaucratic organization and rules taken as instantiated norms.

Richard Price (1997) with regard to chemical weapons and Nina Tannenwald (1999) with regard to nuclear weapons investigate the effect of the norms that proscribe the use of weapons of mass destruction. In these cases, theory is connected to evidence by demonstrating that states do not even consider using weapons of mass destruction. The case is strongest when there is a practical battlefield value that could be achieved by the employment of such weapons and states still do not even consider their use. In these case studies of decision-making, Price and Tannenwald trace the process by which war-fighting decisions were taken and try to demonstrate that taboo weapons were not used because they were seen as inappropriate on normative grounds. The evidence includes interviews with policy-makers asked to provide retrospective pictures of the decision process, archival documents when available, and memoir literature. Of course, the non-use of these weapons is behavioral evidence consistent with the argument but indeterminate with regard to why these choices were eschewed. The central interpretative claim that system-wide normative ideas coupled with conceptions of state identity led to this behavioral outcome raises methodological problems not unique to constructivist endeavors. It is quite parallel to the motivational attribution problem that has been prominent in rationalist theories as discussed above.

Empirical obstacles 

Efforts to link constructivist theories regarding the role of norms and empirical evidence face a number of obstacles just as rationalist efforts to link theory and evidence do. One obstacle relates to the definition of a norm and the relationship between norms and interests. On one hand, the resurgence of research on norms has been fueled by arguments about how much norms matter compared to other motives. Martha Finnemore and Stephen Krasner, for example, have argued at length about the relative importance of logics of appropriateness and logics of consequences, with Finnemore (1996: 31, 87-9) contending norms are a potent motive and Krasner (1999: 6, 40-2, 66, 72, 238) arguing interests are trump. On the other hand, constructivists deny that norms and interests should be opposed to each other as distinct motives, arguing that norms shape interests.

The argument that norms shape interests, of course, can be taken as either a truism that refers to the definition of these concepts or as an empirical claim. If a scholar defines interests and norms as indistinguishable, for example, by defining normative desires as interests, then the two cannot be opposed as alternatives. This need not be the case, however. It is possible to define norms and interests as distinct notions at the level of foreign policy motivation and then to recognize that interests are based on different norms. If the norms underpinning interests (for example, a norm that wealth is good) are different norms from those directly relevant to foreign policy (for example, a norm that sovereignty should be respected), then the norms and interests at the foreign policy level could be treated as independent factors. Assuming we are dealing with distinct concepts that have operational meaning at the same level of analysis, the relationship between norms and interests becomes an empirical question. Do norms shape interests? If so, how much and when? And do they also affect behavior?

There is not much controversy that norms affect verbal and rhetorical behavior. For realists, like Hans Morgenthau, however, the effect of norms was to generate the need for ideological disguises. Norms established desired practices, not the practices that actually prevailed. They gave rise to justifications, excuses and denials; ‘organized hypocrisy’, to use Krasner’s label. Constructivists do not disagree that norms have discursive importance, but they go farther. They argue that the study of discourse provides insight into the meaning of norms and action. Moreover, they argue that norms constrain states from doing what otherwise would make utilitarian sense. To link this theoretical debate to evidence is complicated. It requires identifying what a state would do if it were motivated by material concerns that is different from what it would do if it were motivated by normative ideas. Although it is possible to identify violations of normative principles, it is more difficult to demonstrate compliance. The empirical problem is quite similar in this domain to the problem plaguing the identification of successful deterrence. When a state complies with the normative principle, it might be doing this for several reasons. One of these reasons may be that it saw no material payoff for violating the norm. Politicians have many incentives to mislead observers on this score. For bargaining purposes, for example, leaders may want to claim they gave up an easy gain in the name of justice and now want reciprocation. As Morgenthau argued, leaders also have plenty of reasons to mislead themselves and to believe in normatively self-serving stories. How to establish what sort of mindset was active in decision-making and which beliefs were decisive is an empirical challenge quite similar to that faced by rationalists when attempting to determine motives and beliefs.

Empirical strategies and evidence

There are several methods for determining what norms are shared in a community. One way is to look for evidence of the norm in codified laws. Another strategy is to examine patterns in behavior and to argue these patterns reveal certain norms. This process of attaching meaning to observed behavior, of course, can be controversial, as is evident in efforts to determine proclivities to racism. A third strategy is to examine the discourse in a community by some form of content analysis, discourse analysis, survey, or in-depth interviews.

Regardless of which method constructivists employ to identify empirically the norms that are shared in a community, they need to guard against over-generalizing and evoking essentialist stereotypes. Because they often aim to describe the ideas that are common in a community, the task of generalization is central to the constructivist enterprise. Of course, sub-communities can be identified, but the problem still remains. When constructivists draw a picture of shared norms and beliefs at the national and even world-wide level, the magnitude of this empirical challenge is clearly large. For instance, Alastair Iain Johnston (1995), in his study of Chinese culture and its effect on Chinese strategic ideas, uses texts that are many centuries old to draw a picture of an essential Chinese culture. Price (1997) draws on legal texts, the discourse of leaders and state behavior to draw inferences about the norm vis-à-vis chemical weapons that is shared world-wide. It is possible to identify different zones of the world in which certain norms are shared, but this distinction drawn from limited evidence in each zone has political implications. This is especially true when zones of war and zones of peace are identified, or more pointedly when this identification of zones really means the identification of those people who share peaceful and good norms like us and those who do not.

The risks of essentialist and self-serving biases are certainly not unique to constructivist research. However, given this tradition’s emphasis on generalization to the level of shared ideas and many of its practitioners’ preference for ethnographic in-depth interviews and broad-ranging discourse analysis, the risks are worth considering in some depth. They are also raised by the constructivist interest in constitutive theory (Wendt, 1999), that is, theory that describes the component parts of a system and the essential elements that make up the political phenomena and entities under investigation. The empirical challenge inherent in trying to describe the elemental ideational parts of a social system may be illustrated by considering the case of US hegemony. Because prominent constructivists operate from an objectivist perspective vis-à-vis the ideas that define the ideational structure of an international system (Jepperson et al., 1996), they examine the nature of US hegemony and the shared norms and ideas that it represents (Cronin, 2001; Ruggie, 1996, 1997). The shared ideas in this system might be identified as norms of free trade, liberal civil rights and democratic governance.

In defining the ideational character of a system defined by US hegemony, the ideas that are shared among those who oppose US hegemony are also described. This often includes characterizing this opposition as opposed to the ideas the United States promotes, including free trade, civil rights and democracy. Islamic fundamentalists are sometimes identified as concrete examples of this ideational opposition. The problem, of course, is in defending empirically the picture drawn of the United States and the ideas it is presumed to represent. The contrast between the picture of a democratic human-rights promoting United States and the authoritarian human-rights denying other is of course very reminiscent of the imperial and colonial stereotypes examined in some detail in the debate over Orientalist essentialism (Halliday, 1995; Said, 1978). Although this picture of a benign and worthy hegemon opposed by an unreasonable and unworthy opposition may be popular in the United States, it is a description of the United States that many people in the Third World and Europe find unpersuasive. They do not believe that the United States promotes human rights, democracy, or non-proliferation for that matter. In their picture, the United States has not promoted norms of free trade, liberalism, democracy or non-proliferation, but has instead instantiated a system of norms that give priority to the pursuit of self-interest, wealth as the arbiter of truth and justice, and dictatorship where expedient.

The key point here is not to debate the substantive reality of US hegemony. Rather, it is to emphasize the challenge facing the effort to describe empirically the norms extant in a system. Whether there is US hegemony, and, if there is, what norms it empowers may be seen as especially controversial. Other claims about the ideas that prevail in a system, however, raise similar issues and the general problem cannot be avoided. For example, Jepperson, Wendt and Katzenstein (1996) point to the idea that Moscow and Washington were locked in a competitive relationship as a system-wide belief that was extant during the Cold War. This may be true, but voices in Beijing and throughout the Third World often doubted it. How to establish the prevalence of ideas or norms in a system without engaging in ethnocentric and essentialist stereotyping is a question both constructivists and rationalists need to pursue.

Norms may be conceived of as part of a ‘supra-personal objective order’ (Heider, 1958: 219), but they are enacted at the level of individual agents. Constructivists explain variation in enactment mostly as a result of whether an actor shares the norm or not. This interpretation, however, may underestimate the importance of variation across situations. Most norms take the form: moral people do X, in situations A, B or C unless Q and/or R prevail. In other words, situations are part of the definition of the norm and so are the exceptional conditions that define exemptions from the moral obligation. This is obvious in the literature on just wars (Walzer, 1977). Situations, however, are not necessarily objective givens. If perceptions of situations are integral to the process of norm enactment, then so are the cognitive and political processes that affect actor-level perception. In this regard, a number of constructivists have indicated the need for a complementary theory of agency (Adler, 1991; Checkel, 1998; DiMaggio, 1997).


Constructivists, of course, have not confined their research to the system level exclusively, or at least they have not always defined the international system as the system they are investigating. A number of constructivist efforts have explored the processes of identity formation. Collective identity can be thought of in at least three different ways. First, it can refer to the boundaries of the group and explore who is considered a part of the group. Second, it can refer to the attributes of a prototypical group member or to the features and values shared by the modal member. Third, identity can refer to the relationship a collective actor assumes vis-à-vis other collective actors. This third usage of the word treats identity as quite analogous to role or to the combination of self-image and image of other. In this regard, it generates empirical research parallel to that done in role theory (Walker, 1987) and image theory (Herrmann and Fischerkeller, 1995). Identity has also been used to refer to both the attributes of the collectivity and its role. Robert Herman (1996) uses this method to interpret change in Soviet foreign policy.

Using identity to mean the construction of group boundaries, or the answer to the question ‘who is us?’, opens the long-standing question of why collective identities form. In other words, why do people come to understand themselves as part of a nation or other political entity? And why does this become an important part of their conception of themselves? Social identity theorists (Tajfel, 1981; Tajfel and Turner, 1986) have provided psychological reasons for why people attach a part of their understanding of self to groups, but they do not explain why nations and certain other groups take on such political importance. Political scientists and historians often explain this by the rise of mass politics and the emergence of nationalism. From the French and American revolutions onwards, the idea of popular sovereignty and mass-based legitimation of political authority has played an important role in world politics. The study of why nationalism forms has produced a very large literature.

Although the literature on nationalism cannot be reviewed here, four causal factors have received a great deal of attention. They are: (1) the importance of nationalism to leaders; (2) the character of the mass public, particularly its attentiveness to politics; (3) the viability and functional advantage of a nation-state, concentrating especially on the economic base, the communication base and the attitudes of neighboring states; and (4) the commonality of features shared by members of the in-group and the uniqueness of these features vis-à-vis out-groups. Shared language that is unique to the in-group and common memories of group history are often pointed to in connection with this fourth factor.

The empirical literature investigating the relationship between these four factors and the development of nationalism is large and covers a diverse set of communities. Rupert Emerson’s (1960) classic study, From Empire to Nation, outlines part of the story for each factor. Benedict Anderson (1991) and John Breuilly (1982) have developed the empirical case for the importance of leaders. Richard Cottam (1964) made a case for the importance of mass politics and attentiveness, while Karl Deutsch (1953) built a theory based on the importance of language and the viability of communication. Ernest Gellner (1983) developed an elaborate economies-of-scale argument and empirical test. Constructivist scholars like Geoff Eley and Ronald Suny (1996) have traced the emergence of common memory. All four factors have been found to relate to the emergence of common identity, but no overarching, all-inclusive integrative theory has been successfully proposed and empirically defended.

Very few scholars treat collective identity as a primordial given. Most accept the conventional wisdom that these categories are social constructions (Hall, 1999; Spruyt, 1994). There is debate, however, over how malleable and flexible these identities are (Connor, 1994). Once constructed and institutionalized, they might be very difficult to change. This is likely to be especially true when states derive their legitimacy from these mass beliefs and, consequently, work hard at preserving (or establishing) them. Many of the early studies of nationalism also explored internationalism. Given that nation-states were relatively recent historical constructions, those students of nationalism who investigated the combination of previous units into nations wondered if still larger units were likely to form.

This question becomes more pressing when it is connected to the observation that conceptions of nationalism and in-group versus out-group discrimination play a role in war. David Mitrany (1966) and Ernst Haas (1964) argued that moving beyond the nation-state was not simply an academic question, but a vital objective in conflict resolution. By promoting identification with a superordinate collective, cross-group hostility and conflict between two nations might be reduced. This was the hope in neofunctionalist strategies. The idea was to promote functional cooperation in technical areas in the anticipation that over time this narrow common ground would spill over to involve a broader array of people and functions. By combining this functional notion with Deutsch’s emphasis upon communication, it might also be possible to promote larger security communities over time.

Theories of regional integration generated empirical research, especially in the context of Western Europe. More recently, predictions of an emerging European identity have led to substantial empirical research (Inglehart and Reif, 1991; Niedermayer and Sinnott, 1995). Studying mass identities through survey instruments, scholars have found that national identities remain even if European identities increase. In fact, it appears that in many European countries the people who say Europe is an important part of their identity also say that their nation is an important part of their identity (Martinotti and Stefanizzi, 1995). The two identities do not seem to be mutually exclusive nor is it clear that a stronger identity with Europe is associated with a less negative view of other nations in Europe. Substantially more investigation is needed on these questions.

Emanuel Adler and Michael Barnett have revitalized the study of security communities from a constructivist perspective, looking at more than half-a-dozen inter-state cases. It is not possible to summarize the evidential argument they mount but it is possible to note a continuing dilemma in linking theory and evidence. Barnett and Gause (1998), for example, examine the relative lack of success in building community in part of the Arab world. They identify a number of factors that explain this lack of success, poor Arab leadership playing a significant role for instance. Ian Lustick (1997), in contrast, explains the failure of an Arab great state to emerge largely in terms of the intervention of outside powers. He contends that, unlike in Europe where nation-states were built with substantial coercion and force, in the Arab world the European powers have intervened to prevent any Bismarck-like regional hegemon from establishing the great Arab state. Obviously, we cannot settle this interpretative debate here, but it does remind us that the interpretative task is inevitably tied to contemporary politics. If we want to persuade other scholars that our picture is warranted by empirical investigation and not simply national bias, then we will need to defend quite explicitly how the key concepts are linked to indicators and how we established the relative importance of various factors.

The Road Ahead

This chapter has reviewed progress and problems faced in linking theory and evidence. Four broad lessons will serve to bring this chapter to a close.

First, the form in which theory is presented is less important than its substance, and the language in which a theory is expressed should not be confused with that substance. For example, formal theory is not synonymous with rational choice theory. Both the rational choice calculating engine associated with micro-economics and other calculating engines can be represented in formal terms. Computational models use formal theory and advanced computing technologies to represent rational choice, other types of choice, and variations on these themes. Barry O’Neill (1999) has used formal language to present a phenomenological theory of symbols and honor in international relations that is very different from the rational choice theory associated with Bueno de Mesquita (1981), even though both use formal language to organize their theoretical ideas.

A second overarching lesson is that attention needs to be devoted to the validity of the indicators used to operationalize concepts. Although a good bit of debate has focused on the methods of data creation, this has often concentrated on techniques for improving reliability. Often this means systematizing the data generation and quantifying the evaluations. Clearly, reproducibility is a valuable feature of evidence, but it is not the most essential feature. The most essential feature is the relevance of the evidence and its appropriateness to the argument.

For example, the crux of the debate about attitudinal evidence is not whether structured survey instruments tap attitudes more reliably than do open-ended extensive dialogues, but whether either strategy of interviewing has a persuasive and valid theory for interpreting what the responses and verbal input from the participants mean. What strategy of interpretation underpins the logic of the data generation? Are verbal statements, however lengthy, taken at face value or is an inferential theory employed to translate the statements into meaningful data that is taken to be the operational measure of a concept?

Third, although rationalist theories have adopted the language of phenomenological subjective expectations and constructivist theories emphasize the importance of shared ideas and consciousness, the empirical challenge posed by this phenomenological shift has not been addressed adequately. Operational measures for these concepts still often rely on objective factors and the assumption that the subjects under study must see the world the way the scholar does. What other strategies might be used to create operational estimates for phenomenological concepts is a question that has not received the attention it deserves. Too often theorists assume that area specialists can simply provide these variables as if they were facts.

Finally, international relations scholars have made substantial strides in connecting theory to past history but have made rather little use of theory to predict the future. On the one hand, there is a widespread recognition that the future cannot be predicted from international relations theory, yet, on the other hand, there is apparent confidence that empirical patterns found statistically are sound and inform our understanding of causation. Because we cannot re-run history and create a true control condition, however, the validity of the causal claim remains suspect. Looking forward, scholars recognize the importance of contingency, path sequencing and stochastic events. When looking backward, these complications play a less prominent role in the evaluation of causal tests. Logically, they should play the same role as when thinking about the future (Dawes, 1993).

Predictive accuracy is surely not the only way to judge theory, but making future-oriented predictions and evaluating the outcome over time may be a very effective way to improve theorizing and improve the linkage of theory to evidence. The exercise will surely humble any theorists who exaggerate the success international relations theorists have had in linking theory and evidence, but this is not an altogether negative outcome. Overconfidence in what we already take to be demonstrated can be a serious impediment to improving theory and empirical knowledge (Tetlock, 1999). The predictive task has had a positive effect on other fields such as meteorology, where experts are now quite well calibrated, that is, aware of how confident they ought to be in their theory and empirical tests. A similar improvement in the calibration of international relations theorists could be one positive outcome of a broader practice of making predictions. Others could be greater sophistication in the theories purporting to explain international relations and a tighter linkage between those theories and empirical evidence.
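The notion of calibration mentioned above can be made concrete with a small sketch. The example below is illustrative only and is not drawn from the chapter or from the works it cites: the forecasts and outcomes are invented, and the function names `brier_score` and `calibration_table` are hypothetical. It assumes that a forecaster states a probability for each predicted event and that each event is later coded as having occurred (1) or not (0). The Brier score, a standard measure from forecast verification, summarizes overall accuracy, while the table compares each stated probability with the observed frequency of the events forecast at that level.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared gap between stated probability and outcome (0 or 1).

    Lower is better; 0.0 would mean every forecast was certain and correct.
    """
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes):
    """Map each stated probability to the observed frequency of the event.

    A well-calibrated forecaster's stated probabilities roughly match
    the observed frequencies.
    """
    groups = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        groups[p].append(o)
    return {p: sum(os) / len(os) for p, os in sorted(groups.items())}


# Invented data: ten predictions with stated probabilities and coded outcomes.
forecasts = [0.9, 0.9, 0.9, 0.9, 0.9, 0.5, 0.5, 0.1, 0.1, 0.1]
outcomes = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]

if __name__ == "__main__":
    print("Brier score:", brier_score(forecasts, outcomes))
    for stated, observed in calibration_table(forecasts, outcomes).items():
        print(f"stated {stated:.1f} -> observed {observed:.2f}")
```

In this invented record, events forecast at 0.9 occurred only 60 percent of the time, which is the pattern one would expect from an overconfident forecaster. Tracking such a record over many predictions is one way, among others, that a field might learn how much confidence its theories warrant.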