Why the logic of selecting least-likely cases to make stronger generalizations makes little sense in case study research, by Derek Beach, University of Aarhus

Scholars justifying selection of cases often utilize a ‘least-likely’ case selection strategy. As commonly used, a least-likely case is one that, ‘on all dimensions except the dimension of theoretical interest, is predicted not to achieve a certain outcome, and yet does.’ (Gerring and Seawright, 2007: 115; also King, Keohane and Verba, 1994: 209). This means that we find least-likely cases in adverse contexts in which the causal relationship should not work but does. If one finds evidence that the theory works in a least-likely case, the argument is that we then can make a ‘Sinatra inference’, claiming that if it makes it there, the theory can make it everywhere (Levy, 2008: 12). The cross-case generalization is made with greater confidence because we found that the relationship worked even in adverse circumstances.

What is problematic is that likelihood here relates to a hypothesis about the incidence rates of the causal relationship across cases in the population. The least-likely case is only ‘least-likely’ in relation to what we otherwise know about incidence rates of the relationship across the population. Therefore, in Bayesian terms, confirmation of a theory in a low likelihood case makes us more confident of the relationship being present across the whole population.

But the only reason we can make this inference is that we assume the individual case tells us something meaningful about incidence rates across the population. However, when we engage in a case study, the evidence that is produced only tells us about whether the relationship worked as theorized in a particular case (Beach and Pedersen, 2016). In case-based research we evaluate evidence that updates our knowledge about what happened in particular cases, potentially followed by cautious and bounded generalizations to causally similar cases that have been compared in detail with the studied cases. The evidence that we utilize to make inferences is within-case, relating to the observable manifestations left by the operation of causal processes (i.e. mechanisms) in a particular case (Beach and Pedersen, 2016; Collier, Brady and Seawright, 2010).
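To make the within-case nature of this updating concrete, the following minimal Python sketch applies Bayes' rule to a single hypothesis about a single case. The function name `update_confidence` and all of the numbers are hypothetical illustrations, not values taken from any study. The point is only that the hypothesis being updated is "the relationship worked in this case", not a claim about incidence rates across the population.

```python
def update_confidence(prior: float,
                      p_evidence_if_true: float,
                      p_evidence_if_false: float) -> float:
    """Posterior confidence that the relationship worked in the studied case.

    Applies Bayes' rule to one within-case hypothesis. All inputs are
    probabilities between 0 and 1.
    """
    for p in (prior, p_evidence_if_true, p_evidence_if_false):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    if denominator == 0.0:
        raise ValueError("the evidence is impossible under both hypotheses")
    return numerator / denominator


# Hypothetical values: low prior confidence because the context is adverse,
# and evidence that is far more probable if the mechanism actually operated.
posterior = update_confidence(prior=0.3,
                              p_evidence_if_true=0.8,
                              p_evidence_if_false=0.1)
print(f"Posterior confidence for this case: {posterior:.2f}")
```

With these assumed inputs the posterior rises to roughly 0.77, but nothing in the calculation refers to any other case: the update concerns the studied case alone.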

The only type of inference that is possible after we have studied a particular case is whether the causal relationship worked in the case. Full stop. If we find the relationship worked in a case where we did not expect it based on prior knowledge of the context (a least-likely case), this finding should result in a revision of the theory about the context in which the relationship holds. We basically learn that it worked in the case, meaning that our knowledge about the contextual (also termed scope) conditions in which the relationship works was incorrect. But finding within-case evidence that a causal relationship worked where we did not think it would work tells us nothing about whether it works in other types of contexts.

For example, when studying elite decision-making, we might select what we think is a least-likely case for the causal relationship between presidential leadership (C) and rational decision-making (O) by choosing a case where the context is one of severe crisis. Conditions like the extreme stakes involved and the short time frame for decisions lead us to expect that, irrespective of C being present, we would find poor decision-making processes (~O). However, finding confirmatory within-case evidence that C was linked with O in the case despite the adverse context would not enable us to generalize to other cases of C->O in other contexts. All we learn is that C->O worked under conditions of severe crisis, which should lead us to revise our expectations about the contextual conditions under which the theory can work. Studying adverse context cases can tell us something about the outer bounds in which a relationship might work, but nothing about whether it works in more typical contexts. This logic can be seen in car manufacturers who take their vehicles to Northern Finland in the winter to test them in adverse conditions, enabling them to understand the outer bounds of the contextual conditions in which the vehicles can function. But finding that the vehicle works in driving snow and -40°C should not make us more confident that it will always start on a rainy day at 10°C. In terms of research design, this also implies that we should only start exploring the outer bounds of the context after we have explored how the relationship works in more ‘normal’ contexts.

Case studies produce what can be termed within-case ‘mechanistic evidence’ that enables us to update our posterior confidence in a causal relationship being present in the studied case. Inferring beyond the single case to the rest of the population requires that we can make the claim that because we found it in case A, and because case A and B share the cause and outcome, and are similar on all (or almost all) causally relevant contextual conditions, we should expect to find similar things going on in case B as in the studied case A.
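The generalization rule described here can be sketched as a simple check. This is only an illustration under stated assumptions: the `Case` class, the `can_generalize` function, the tolerance parameter and every score below are hypothetical, and real comparisons of contextual conditions require detailed qualitative judgment rather than boolean scores.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Case:
    name: str
    cause: bool                 # is the cause (C) present?
    outcome: bool               # is the outcome (O) present?
    context: Tuple[bool, ...]   # scores on causally relevant contextual conditions


def can_generalize(studied: Case, target: Case,
                   max_context_differences: int = 0) -> bool:
    """True only if both cases share C and O and (almost) all contextual conditions."""
    if len(studied.context) != len(target.context):
        raise ValueError("cases must be scored on the same contextual conditions")
    shares_cause_and_outcome = (studied.cause and studied.outcome
                                and target.cause and target.outcome)
    if not shares_cause_and_outcome:
        return False
    differences = sum(s != t for s, t in zip(studied.context, target.context))
    return differences <= max_context_differences


# Hypothetical scores on three contextual conditions.
case_a = Case("A", cause=True, outcome=True, context=(True, False, False))
case_b = Case("B", cause=True, outcome=True, context=(True, False, False))
adverse = Case("adverse", cause=True, outcome=True, context=(False, True, True))

print(can_generalize(case_a, case_b))    # same context: bounded generalization is defensible
print(can_generalize(adverse, case_a))   # very different context: no basis for inference
```

The default tolerance of zero differences is a deliberately strict choice. Loosening it corresponds to the "almost all" qualification, and how far it can be loosened is a substantive judgment about the theory at hand.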

However, given the large differences in the contextual conditions present in least-likely, most-likely, and more ‘normal’ cases, and given the sensitivity of causal processes to contextual conditions that is typically assumed in case-based research, we have strong reasons to expect that different types of cases would exhibit high degrees of causal heterogeneity. This is illustrated very simply in Table 1, where four cases are scored on values of a cause, three contextual conditions, and the outcome.







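[Table 1: Cases A, B, C and D scored on the cause, three contextual conditions (C1–C3) and the outcome. Cell values are not preserved in this copy.]

Table 1 – Any least-likely cases?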

In the example, existing research suggests that the cause is linked to the outcome through causal mechanism 1 (CM1), and that the relationship only works when C1 is present and C2 and C3 are absent. However, in exploring other cases we find case D, where we would not expect the relationship to be present because it is an adverse (least-likely) context. Upon further probing, we find that the cause is linked to the outcome, but through another causal mechanism (CM2). Here finding confirming evidence in the ‘least-likely’ (adverse context) case tells us nothing about the other cases. Instead, it tells us that the cause can also produce the outcome in other contexts, but that it happens through different causal mechanisms. In other words, finding within-case evidence of a process in case D, which is a ‘least-likely’ case, tells us nothing about whether the relationship is present in cases A – C, where there are very different contextual conditions present.

This means that we cannot just infer that because we found confirming evidence of a causal mechanism in a ‘least-likely’ case that it should also be present in other, potentially causally dissimilar cases throughout the population (e.g., in most-likely cases). This point is particularly salient when discussing causal mechanisms in process-tracing, where finding within-case evidence of particular mechanisms linking C to O in a chosen case requires a strong assumption of causal homogeneity across cases to be able to infer from the studied cases to other similar cases (see Beach and Pedersen, 2016, chapters 3 and 9).

Therefore, the ‘Sinatra inference’ that states that if “you can make it there, you can make it everywhere” (Levy, 2008: 12) does not hold for case-based research. Irrespective of the musical virtues of the song, Sinatra was basically wrong in claiming that just because you ‘make it’ as a crooner in a major venue in New York you would ‘make it’ everywhere. The reason for this is that he ignores the importance of context for the causal relationship. For example, the style of music might matter, leading us to expect that irrespective of his level of talent, Sinatra probably would not ‘make it’ in a bluegrass club in Nashville or in a Chinese opera in Beijing. The type of audience would probably matter also, meaning that just because Sinatra made it in a night club in New York and Las Vegas, we should not infer that he would also rock an audience that is hard of hearing in a nursing home in Albuquerque. Because context matters in case studies, we should be very cautious in inferring across causally heterogeneous sets of cases.

In conclusion, least-likely logic applied to case selection conflates within-case and cross-case evidence in ways that are impossible to untangle. I therefore recommend the use of the term typical case without any qualifications about likelihood, referring to cases where the theorized causal relationship is possible because C, O, and the requisite contextual conditions for the relationship to occur are present. I also recommend thinking about whether one has access to empirical material or not when selecting cases, and whether the theory is predicted to leave empirical fingerprints that can be observed in a particular case. For example, choosing to study bureaucratic politics by selecting a case from an organizational setting where most deliberations are not recorded in written form would be very difficult, as the theory would leave few empirical fingerprints that could be assessed. Additionally, accessibility concerns are paramount when selecting cases. There is a reason historians typically study only cases where there is a rich documentary record that becomes accessible once archives are opened after thirty or more years. Unfortunately, many of the research questions we want to investigate are too contemporary for archives to be opened and/or are so politically sensitive that we can have difficulties gaining access to impartial sources. Given that good case-based research is data-demanding, selecting typical cases where we expect to be able to access as much information as possible is a wise case selection strategy.


Beach, Derek and Rasmus Brun Pedersen. 2016. Causal Case Studies. Ann Arbor: University of Michigan Press.

Collier, David, Henry E. Brady, and Jason Seawright. 2010. Sources of Leverage in Causal Inference: Toward an Alternative View of Methodology. In Rethinking Social Inquiry: Diverse Tools Shared Standards, 2nd ed., ed. Henry E. Brady and David Collier, 161–200. Lanham: Rowman and Littlefield.

Gerring, John, and Jason Seawright. 2007. Techniques for Choosing Cases. In Case Study Research. Cambridge: Cambridge University Press, pp. 86–150.

King, Gary, Robert O. Keohane, and Sidney Verba. 1994. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton University Press.

Levy, Jack. 2008. Case Studies: Types, Designs, and Logics of Inference. Conflict Management and Peace Science 25 (1): 1–18.


Derek Beach (derek@ps.au.dk) is a professor of Political Science at the University of Aarhus, Denmark, where he teaches case study methodology, international relations, and European integration. He has authored articles, chapters, and books on case study research methodology, international negotiations, referendums, and European integration, and co-authored the books Process-tracing Methods: Foundations and Guidelines and Causal Case Study Methods (both with University of Michigan Press). He has taught qualitative case study methods at ECPR and IPSA summer and winter schools, held short courses at the APSA annual meeting on Process-tracing and case-based research, and numerous workshops and seminars on qualitative methods throughout the world. He is also an academic co-convenor of the ECPR Methods Schools.
