In statistics and research, when we use the term “effect” we are talking about the effect of an independent variable on a dependent variable. Similarly, when we talk about the “effect size” we are talking about the magnitude of that effect. Most of the language that social scientists use comes from experimental research in medicine and psychology. In true experiments, the independent variable is manipulated by the researcher; such researchers are thus interested in treatment effects. Most of the time, they can examine the existence of a treatment effect using two basic methods (which are really a single method; the difference is one of language and vantage point).
The first method is to have an experimental group and a control group. The experimental group gets the treatment, and the control group gets a placebo (or the status quo treatment). One approach is to look at the mean difference between the experimental group and the control group. A difference in the means suggests that the treatment worked. One of the most widely used statistics to test such research designs is the t-test. Recall from your earlier statistics courses that t-tests are used when the independent variable is the grouping variable; in other words, the value of the independent variable for each participant is whether or not they got the treatment. Also, recall that we can expand this basic idea to look at more than two groups using ANOVA designs.
You may never have considered exactly what t represented when you computed it in the past. (You needed to know what t was rather desperately, not what it meant!) The computation of t is really just a ratio, expressed as a fraction. The top of the t equation is just the difference between the means. The bottom is an estimate of the standard deviation of the sampling distribution of the difference between means, known as the standard error of the difference. That quantity estimates how much difference in means we can expect from purely random selection effects. So, t gives us a ratio of the observed effect from the experiment to the likely effects of random chance. In other words, it gives us a “signal to noise” ratio.
t = experimental difference / likely random chance difference (standard error of the differences)
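To make the ratio concrete, here is a minimal sketch in Python. It assumes the familiar independent-samples t-test that pools the two group variances. The function name `pooled_t` and the scores are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(treatment, control):
    """Return (mean difference, standard error, t) for two independent groups."""
    n1, n2 = len(treatment), len(control)
    # Top of the ratio: the observed difference between the group means.
    diff = mean(treatment) - mean(control)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(treatment)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    # Bottom of the ratio: the standard error of the difference between means.
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return diff, se, diff / se

# Hypothetical scores, made up for this example only.
treatment = [6, 7, 8, 9, 10]
control = [3, 4, 5, 6, 7]

diff, se, t = pooled_t(treatment, control)
print(diff, se, t)
```

With these made-up scores, the mean difference is 3 and the standard error is 1, so the observed difference is three times what chance alone would typically produce.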
If the observed difference is no bigger than the difference we would expect from random chance (a ratio of about one), then we can’t make any statements about the treatment working. When the ratio starts to get bigger than one, we start to have more evidence that the effect was real and not due to random chance. We could (if we weren’t trying to be scientific) draw arbitrary lines in the sand and specify how many times larger than the effect of chance the observed (experimental) effect must be before we are willing to say that the treatment very likely worked and that random chance is very unlikely as an explanation of the experimental results (recall that such a statement is the idea of statistical significance). Thanks to the properties of the normal curve, we can instead draw statistical lines in the sand based on probabilities. These lines shift with the number of participants, which is one reason researchers like large sample sizes: the power of a statistical test increases as sample size increases. As you can see from the ratio that makes up t, the bigger the effect size, the bigger the ratio will be; this means that the bigger the effect size is, the easier it is to find statistical significance (reject the null hypothesis).
When we choose an alpha of .05, we are essentially specifying that if the observed difference is about twice the size of the likely difference due to chance (as estimated by the standard error), then we will reject the null hypothesis and conclude that the difference between the population means is not likely to be zero (p < .05).
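The “about twice” figure can be checked with a short sketch. It assumes a two-tailed test and uses the normal curve as a large-sample stand-in for the t distribution; with small samples the cutoff from the t distribution is somewhat larger. The helper `is_significant` is a hypothetical name used only for illustration.

```python
from statistics import NormalDist

alpha = 0.05

# Two-tailed test: put alpha / 2 in each tail of the standard normal curve.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
print(round(z_crit, 2))  # 1.96

def is_significant(ratio, critical=z_crit):
    """Large-sample approximation: is the ratio beyond the cutoff in either direction?"""
    return abs(ratio) > critical

print(is_significant(3.0))  # True
print(is_significant(1.0))  # False
```

The cutoff of roughly 1.96 is where the “about twice the standard error” rule of thumb comes from.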
Let’s pause for a minute and clarify some terminology that may be different in “mean difference tests” and in regression analysis. As we previously stated, the independent variable has an “effect” on the dependent variable. If we make our analysis more complex, such as with a two-way ANOVA, then we’ll have two different independent variables. This complicates the analysis because we have two effects to analyze. To make matters worse, since the two factors can interact, we have to examine interaction effects. So, in the language of ‘experimental designs,’ we have two main effects and one interaction effect to analyze when we are doing a two-way ANOVA. Note that the two independent variables in such an analysis are known as factors in the language of experimental design. We can do some pretty complicated designs (given the right software) that go beyond the basic two-way ANOVA in complexity. These are considered under the heading of “factorial designs” in most experimental design textbooks. In the language of regression analysis, factors are just called independent variables. The differences in terminology are historical accidents. When you understand the idea of the General Linear Model (GLM), the distinctions evaporate and understanding what is going on becomes merely a matter of keeping a bunch of synonyms straight.
When you hear the idea of an “effect” in regression analysis, you are usually dealing with someone steeped in the methods of experimental designs using regression analysis to analyze experimental results.
Note that a basic assumption of Pearson’s r is that both the dependent variable and the independent variable are measured at least at the interval level of measurement. This seems to mean that you can’t do a Pearson’s r on experimental data because the IV (the grouping variable) is discrete: you are either in group 1 or group 2. Simple linear regression has the same apparent limitation. However, brilliant mathematical statisticians figured out that if you dummy code the independent variables, everything works out okay. The way dummy coding works is simple: if a participant is in the category of interest, you score them as 1. If they are not in the category of interest, you score them as 0. For example, if a participant got the pill in a medical experiment, then you’d code them as 1. If they got the placebo, you’d code them as 0. You can also code naturally occurring categories like this, but you have to be careful to remember your coding scheme when you are analyzing results, so you know which results belong to which category. For instance, if you were interested in a gender effect, you could code female participants as 1, and then code male participants as 0.
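Dummy coding is simple enough to sketch in a couple of lines of Python. The function name `dummy_code` and the group labels below are hypothetical, chosen only to mirror the pill and placebo example.

```python
def dummy_code(labels, category_of_interest):
    """Score 1 for participants in the category of interest, 0 for everyone else."""
    return [1 if label == category_of_interest else 0 for label in labels]

# Hypothetical group assignments for five participants.
groups = ["pill", "placebo", "pill", "placebo", "pill"]

treated = dummy_code(groups, "pill")
print(treated)  # [1, 0, 1, 0, 1]
```

Passing the category of interest explicitly keeps the coding scheme visible in the code, which helps when it is time to interpret the results.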
Using dummy coding, then, you can get the exact same results and reach the exact same conclusions using regression as you would get using a t-test (the standard independent-samples version that pools the two group variances). In a regression analysis, B gives you the mean difference between the two groups. The standard error of B in the regression analysis is the same as the standard error in the t-test. And, of course, the value of t for the coefficient in the regression analysis is the same as the value of t in the t-test. The intercept in the regression analysis is the mean of the group that you coded zero. Think about the implications of that given the way the regression equation works. The predicted value of Y will be the intercept (the mean of the group you coded zero) plus the slope of the line, which is B, multiplied by the dummy code. B is the mean difference, so you get the mean of the group coded 1 by adding the mean difference to the mean of the other group. It makes perfect sense in the simple case of a binary (only two values, such as experimental and control group) independent variable. Note that if you code things the other way around, the conclusions will stay the same, but the sign of B will reverse since the slope of the line will be going the other way, and the intercept will become the mean of the other group.
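The following sketch illustrates these claims with a hand-rolled simple regression on a dummy-coded predictor. It assumes ordinary least squares with a single predictor. The function name `simple_regression` and the scores are hypothetical, invented for illustration.

```python
from math import sqrt
from statistics import mean

def simple_regression(x, y):
    """Ordinary least squares with one predictor: (intercept, B, SE of B, t)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                        # slope
    intercept = y_bar - b * x_bar
    # Residual variance uses n - 2 degrees of freedom (two estimated coefficients).
    residual_ss = sum((yi - (intercept + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = sqrt(residual_ss / (n - 2) / sxx)
    return intercept, b, se_b, b / se_b

# Hypothetical scores, made up for this example only.
control = [3, 4, 5, 6, 7]
treatment = [6, 7, 8, 9, 10]

y = control + treatment
x = [0] * len(control) + [1] * len(treatment)   # dummy code: 1 = treatment

intercept, b, se_b, t = simple_regression(x, y)
print(intercept, mean(control))                  # intercept vs. mean of group coded 0
print(b, mean(treatment) - mean(control))        # B vs. the mean difference
print(se_b, t)

# Reversing the coding flips the sign of B and moves the intercept to the other mean.
x_rev = [1 - xi for xi in x]
intercept_rev, b_rev, se_rev, t_rev = simple_regression(x_rev, y)
print(intercept_rev, b_rev, t_rev)
```

For these made-up scores, the intercept is the control mean (5), B is the mean difference (3), and the t for the coefficient is 3. Reversing the codes gives an intercept of 8 and a B of -3, with the same standard error.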
I will not belabor the point here, but we can extend this logic to one-way ANOVA designs, two-way ANOVA designs, and unique models with mixed types of independent variables that fit our research situation.
multiple regression analysis, regression coefficients, partial regression coefficients, intercept, constant, R-square, dummy code
Last Modified: 02/14/2019