# Advanced Statistics | Section 1.2

This content is released as a draft version for comment by the scholarly community.  Please do not distribute as is.

## Section 1.2:  What Are Linear Models?

Perhaps the most basic element of social scientific inquiry is the search for cause-and-effect relationships.  That is, for one reason or another, we want to be able to explain why some phenomenon (usually a human behavior) happens.  Of course, our reverence and respect for the scientific method cause complications.  Social scientists tend to avoid the terms “cause and effect” and prefer to talk about the Independent Variable (IV) and Dependent Variable (DV), or we simply say that the two (or more) variables of interest are “correlated.”

Let us for a few moments consider something of a philosophical abstraction:  What does it take to make a causal statement that social scientists can respect as rigorous and valid?  The most common way to talk about these things is to symbolize the thing doing the causing (the IV) as “X” and the thing being caused (the DV) as “Y.”  While the average researcher wouldn’t be so crass as to come out and say it, the point of the researcher’s efforts is to demonstrate that X causes Y.  For this statement to meet the standard of social scientific rigor, we must first demonstrate temporal precedence.  Temporal precedence is just a \$64,000 word meaning that X comes before (precedes) Y in time.

There are two common ways to establish temporal precedence.  The first is by design:  It is a built-in element of a true experiment.  The researcher knows the cause (the “treatment” in an experiment) comes before the outcome that is measured (the DV) because the treatment was manipulated by the researcher.  The second is to use logic.  This usually requires that a well-accepted theory be employed to refute the possibility that Y causes X.  A researcher, for example, could advance a theory that being hit by bats is what causes balls to fly out of the park.  We would reject the alternative thesis that flying out of the park is what causes balls to be hit by bats.  The latter statement is judged invalid because it goes against our knowledge of physics, and on a more visceral level, it goes against a lifetime of empirical observations.

Another requirement of a causal statement is that we rule out the possibility that some variable outside of our consideration is actually doing the causing, so that it only seems like X causes Y.  In our research, we often treat our variables as a closed system; we are only worried about X and Y.  We often do not consider some third variable, Z, and have not measured it.  Since it hasn’t been considered or measured, it isn’t included in any of our analyses.  Consider what happens if Z and X are related (correlated) such that when Z goes up, X also goes up (and vice versa).  Let us further suppose that Z and Y are correlated such that when Z goes up, Y goes up (and vice versa).  If we observe only X and Y without the benefit of Z, we note that whenever X goes up, Y also goes up.  They are correlated.  It would be wrong to conclude, however, that X causes Y, because the entire association between them may be produced by Z.
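The third-variable problem described above can be made concrete with a small simulation.  What follows is only an illustrative sketch under assumed conditions, not an analysis of real data:  Z is drawn at random, and X and Y are each built as Z plus independent noise, so by construction X has no effect on Y.  The function names (`pearson`, `residuals`), the sample size, and the noise levels are all hypothetical choices made for this example.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(v, z):
    """What is left of v after removing its straight-line relationship with z."""
    n = len(v)
    mean_v = sum(v) / n
    mean_z = sum(z) / n
    slope = (sum((zi - mean_z) * (vi - mean_v) for zi, vi in zip(z, v))
             / sum((zi - mean_z) ** 2 for zi in z))
    return [(vi - mean_v) - slope * (zi - mean_z) for zi, vi in zip(z, v)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# Assumption: Z drives both X and Y; X never appears in the recipe for Y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

# Looking only at X and Y, the two appear clearly related.
r_xy = pearson(x, y)

# Once Z is accounted for, the apparent relationship should largely vanish.
r_partial = pearson(residuals(x, z), residuals(y, z))

print(f"correlation of X and Y, ignoring Z:        {r_xy:.2f}")
print(f"correlation of X and Y, after removing Z:  {r_partial:.2f}")
```

Under these assumptions, the first correlation should come out moderately positive even though X does nothing to Y, while the second should sit close to zero.  A researcher who never measured Z would see only the first number.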

```File Created: 08/24/2018