The language we use in the social sciences to talk about research studies is terribly confusing. Often, we talk in terms of relationships; in other words, we talk as though our interests were associational. Yet most social scientists are interested in social problems, and social problems must have causes. That is the ultimate point of the social scientific endeavor: to explain, predict, and ultimately control social phenomena. We don’t want to know the correlates of crime; we want to know what causes crime so we can intervene and keep it from happening. We don’t want to know what correlates with math anxiety; we want to know what causes it so we can intervene, stop it, and help our students do better on tests.
The key to determining causal relationships is assessing change. Most conventional statistics do a decent job of describing static things, such as population parameters from a sample. Traditionally, researchers have regarded true experiments as the ultimate tool for making causal statements precisely because true experiments include change in the design—we often refer to this change in terms of a “manipulation of the independent variable by the researcher.”
Every student has heard the mantra “correlation does not imply causation.” This truism leads us to an important principle: You cannot substantiate causal claims from associations alone. At the foundation of every causal conclusion, there must lie some causal assumption that is not testable in observational studies. An associational concept is any relationship that can be defined in terms of a joint distribution (think of a covariance matrix) of observed variables. Examples of associational concepts are correlation, regression, likelihood, odds ratio, “controlling for,” and so on. Examples of causal concepts are randomization, influence, effect, confounding, “holding constant,” spurious correlation, intervention, explanation, attribution, and so on. The former can be defined in terms of the characteristics of observational sample data, while the latter cannot.
Simply put, social scientists must make causal assertions. A few bold researchers make such claims without hesitation, using phrases such as the IV “causes,” “predicts,” “affects,” “influences,” or “explains” the DV, or simply that “Y depends on X.” Many researchers shun such explicit language, choosing instead to couch their underhanded causal claims in suggestive language. Phrases such as “Y is associated with X” or “Y is related to X” are all too common. This is a sad state of affairs; researchers must not shy away from making causal claims to the detriment of society. Causal claims are important for society, and it is crucial to understand when researchers can safely make them. This is a critical skillset for researchers and consumers of research alike.
We will follow Pearl’s (2009) simple definition of an effect “as a general capacity to transmit changes among variables.” Note that this definition pointedly leaves out any claim that the effect can be conveniently described by a linear equation.
Much has been written about the philosophical and logical underpinnings of determining causation. We will not devote any time to these weighty matters. From a more practical perspective, we are concerned more with measuring the effect of a cause. We’ll also adopt the classic logic that three conditions must be met before we can make a causal statement about “X causing Y”:
- X must come before Y in time (called temporal precedence)
- X must be reliably correlated with Y (i.e., chance must be ruled out)
- The observed relation between X and Y must not be explained by other causes
The first condition is a matter of common sense and seems logical in our everyday experiences. If I hypothesized that hitting a ball with a bat made it fly out of the ballpark, you wouldn’t see anything wrong with my logic. If, however, I theorized that balls flying out of the park made them hit bats, you’d think I was rather illogical. The problem in research situations is that we often do not know which came first and cannot easily determine it. Often, researchers turn to theory to bridge this gap. This approach is especially problematic when there are competing theories that postulate opposite causal relationships. In other words, one theory can specify that X causes Y, but another may specify that Y causes X. If temporal precedence can be established, the theoretical conflict can be resolved with conviction.
The second condition also makes a lot of practical sense. As every skeptic asks, “What does that have to do with the price of tea in China?” The very idea behind that colloquial wisdom is that two things that are not related can’t possibly have a causal relationship. Note that a correlation is necessary but not sufficient to infer causation. From a mathematical perspective, a correlation between X and Y will be the same in three cases:
- X causes Y
- Y causes X
- Z causes both X and Y
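A small simulation can make this point concrete. The sketch below uses only the Python standard library and generates data under each of the three structures. The function names, the sample size, and the coefficients are illustrative assumptions (they are chosen so that the population correlation is 0.5 in every case); nothing here comes from a real data set.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation computed from its textbook definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(structure, n=20000, seed=1):
    """Draw n (x, y) pairs under one of three hypothetical causal structures."""
    rng = random.Random(seed)
    noise_sd = math.sqrt(0.75)  # keeps Var(X) = Var(Y) = 1 in the first two cases
    load = math.sqrt(0.5)       # effect of Z on X and on Y in the third case
    xs, ys = [], []
    for _ in range(n):
        if structure == "x_causes_y":
            x = rng.gauss(0, 1)
            y = 0.5 * x + rng.gauss(0, noise_sd)
        elif structure == "y_causes_x":
            y = rng.gauss(0, 1)
            x = 0.5 * y + rng.gauss(0, noise_sd)
        elif structure == "z_causes_both":
            z = rng.gauss(0, 1)  # the omitted common cause
            x = load * z + rng.gauss(0, load)
            y = load * z + rng.gauss(0, load)
        else:
            raise ValueError(f"unknown structure: {structure}")
        xs.append(x)
        ys.append(y)
    return xs, ys


results = {}
for structure in ("x_causes_y", "y_causes_x", "z_causes_both"):
    xs, ys = simulate(structure)
    results[structure] = pearson_r(xs, ys)
    print(f"{structure:>14}: r = {results[structure]:.2f}")
```

All three correlations should land close to 0.5, apart from sampling error, so the correlation by itself gives no way of telling the three structures apart.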
The third condition is the one that poses the most difficulties and has to do with the exogeneity of X. That is, X must vary randomly and must not be correlated with omitted causes. An apparent correlation that is produced by some omitted cause is sometimes referred to as a spurious correlation.
Suppose that we have conducted an experiment in which individuals were assigned randomly (say, using the randomization function in Excel) to either an experimental or a control condition, and that we call this grouping variable X. The manipulation of the IV came before the outcome (Y) in time, and temporal precedence is thus established. The grouping variable (X) correlates reliably with the outcome. How do we rule out other (Z) causes? There could be an infinite number of potential explanations as to why the proposed “cause” correlates with the “effect.”
One approach to testing the veracity of a causal claim made in this situation is to examine the counterfactual model. The counterfactual asks two important questions:
- If the individuals who received the treatment had in fact not received it, what would we observe on the DV for those individuals?
- If the individuals who did not receive the treatment had, in fact, received it, what would we have observed on the DV?
And that, in a nutshell, is why researchers make a big deal about random assignment to groups. If random assignment is used, then the individuals in the control and treatment groups are roughly identical at the start of the experiment (that is, before the IV is manipulated by the researcher). At that point (at least in theory) the two groups are interchangeable. In other words, each group is the counterfactual for the other group.
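The counterfactual logic can be sketched with a simulation in which, unlike in real research, both potential outcomes are known for every individual. The code below is a minimal illustration under assumed values: a hypothetical pre-existing trait that influences the outcome, a constant treatment effect of 2.0, and normally distributed noise. All names and numbers are invented for the example.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # assumed constant treatment effect (illustrative only)


def make_population(n, rng):
    """Create people with a trait and BOTH potential outcomes."""
    people = []
    for _ in range(n):
        trait = rng.gauss(0, 1)                       # pre-existing characteristic
        y_control = 10 + 3 * trait + rng.gauss(0, 1)  # outcome if untreated
        y_treated = y_control + TRUE_EFFECT           # outcome if treated
        people.append((trait, y_control, y_treated))
    return people


def randomized_experiment(n=10000, seed=7):
    """Randomly split the population and return (trait gap, effect estimate)."""
    rng = random.Random(seed)
    people = make_population(n, rng)
    rng.shuffle(people)  # random assignment to conditions
    half = n // 2
    treated, control = people[:half], people[half:]

    # Balance check: the groups should look alike before treatment.
    trait_gap = (statistics.fmean(p[0] for p in treated)
                 - statistics.fmean(p[0] for p in control))

    # In practice only one potential outcome is observed per person:
    # y_treated for the treated group, y_control for the control group.
    estimate = (statistics.fmean(p[2] for p in treated)
                - statistics.fmean(p[1] for p in control))
    return trait_gap, estimate


trait_gap, estimate = randomized_experiment()
print(f"pre-treatment trait gap: {trait_gap:.3f}")
print(f"estimated effect: {estimate:.3f} (assumed true effect: {TRUE_EFFECT})")
```

Because assignment is random, the trait gap should be near zero, and the observed difference between the groups should sit close to the assumed effect even though no individual's counterfactual outcome is ever observed.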
The treatment effect is simply the difference in mean Y between the treatment and control groups. In a randomized experiment, the treatment effect is correctly estimated when using regression analysis (as well as ANOVA models, because ANOVA is just a special case of regression).
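The link between the group difference and regression can be checked directly: when Y is regressed on a 0/1 group indicator, the least-squares slope equals the difference between the two group means. The sketch below shows this with a handful of made-up scores; the data and the helper name `ols_fit` are hypothetical and exist only for illustration.

```python
import statistics


def ols_fit(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical outcome scores for two small groups.
control = [4.0, 5.0, 6.0, 5.0]
treated = [7.0, 8.0, 6.0, 9.0]

# Dummy-code the grouping variable: 0 = control, 1 = treatment.
x = [0] * len(control) + [1] * len(treated)
y = control + treated

intercept, slope = ols_fit(x, y)
mean_difference = statistics.fmean(treated) - statistics.fmean(control)

print(f"control mean (intercept): {intercept}")
print(f"regression slope:         {slope}")
print(f"difference in means:      {mean_difference}")
```

The intercept reproduces the control group mean and the slope reproduces the difference in means, which is the sense in which the two analyses give the same answer in the two-group case.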
An important takeaway from all this is that the use of an “Experimental Design” doesn’t do anything for the veracity of causal statements. You don’t get any causal accuracy points for using an ANOVA instead of a regression model. The veracity has everything to do with research design and nearly nothing to do with the statistical analysis.
When the assumptions of randomized assignment do not hold, such as when we are using preexisting groups, there is no way to directly observe the counterfactuals. There may or may not be differences on some important variable; we simply don’t know.
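To illustrate what can go wrong, the sketch below compares the naive group difference under random assignment with the same difference when group membership depends on a pre-existing trait. It is a simulation under assumed values (a trait that raises the outcome, a constant treatment effect of 2.0, and a hypothetical rule in which people above average on the trait end up in the treatment group); in real preexisting groups the selection mechanism is usually unknown.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # assumed constant treatment effect (illustrative only)


def random_assignment(trait, rng):
    """Coin-flip assignment that ignores the trait."""
    return rng.random() < 0.5


def self_selection(trait, rng):
    """Hypothetical preexisting groups: high-trait people are 'treated'.

    The rng argument is unused; it only keeps the two rules interchangeable.
    """
    return trait > 0


def naive_estimate(assign, n=10000, seed=11):
    """Difference in observed group means under a given assignment rule."""
    rng = random.Random(seed)
    treated_y, control_y = [], []
    for _ in range(n):
        trait = rng.gauss(0, 1)                       # omitted cause of Y
        y_control = 10 + 3 * trait + rng.gauss(0, 1)  # outcome if untreated
        if assign(trait, rng):
            treated_y.append(y_control + TRUE_EFFECT)
        else:
            control_y.append(y_control)
    return statistics.fmean(treated_y) - statistics.fmean(control_y)


randomized = naive_estimate(random_assignment)
preexisting = naive_estimate(self_selection)
print(f"assumed true effect:          {TRUE_EFFECT}")
print(f"estimate, random assignment:  {randomized:.2f}")
print(f"estimate, preexisting groups: {preexisting:.2f}")
```

Under these assumptions the preexisting-groups estimate is far larger than the assumed effect, because it mixes the treatment effect with the trait difference between the groups. With real data that trait may be unmeasured, so the size of the distortion cannot be read off the observed scores.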
Last Modified: 02/18/2019