Logistic regression (a.k.a. logit regression) allows us to use the same logic (but with different math) for research situations where the dependent variable is measured at the categorical level. The most common version of this is when the outcome is binary; that is, there are only two possible categories. These research situations arise often in the social sciences. Pass/fail, conviction/acquittal, and prison/probation are examples of such binary pairs. There are methods that extend this idea beyond two categories (e.g., multinomial logistic regression), but these are beyond the scope of this text.
In ordinary least squares regression (OLS regression, the stuff we already talked about), the prediction equation is based on a combination of the effects of predictor variables. You can visualize this as various pulls and pushes on a playground seesaw. Logistic regression uses a fundamentally different approach: In logistic regression, predictions are made by looking at the probabilities associated with a particular predictor. These probabilities are often expressed as odds, a closely related but not identical quantity: the probability that something happens divided by the probability that it does not. Note that I’ve used the term binary several times. Binary, you probably recall from a computer class, means information made up entirely of zeros and ones. Keep that idea in mind, and recall the levels of measurement we discussed way back in the first chapter. Remember the nominal level? Most of the time, the binary Y variable that we want to predict with logistic regression represents a state of something; we usually enter that into our statistical software using zeros and ones rather than names. For example, a researcher could code “fail” as 1 and “pass” as 0.
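Probability and odds are closely related, and converting between them is simple arithmetic. The short Python sketch below shows the conversion along with the zero/one coding described above. The function names and the five outcomes are made up purely for illustration; they do not come from any real dataset or statistical package.

```python
def probability_to_odds(p):
    """Convert a probability (0 <= p < 1) into odds in favor of the event."""
    return p / (1.0 - p)


def odds_to_probability(odds):
    """Convert odds (>= 0) back into a probability."""
    return odds / (1.0 + odds)


# Hypothetical outcomes for five students (illustrative only).
outcomes = ["pass", "fail", "pass", "pass", "fail"]

# Code "fail" as 1 and "pass" as 0, as in the example above.
coded = [1 if outcome == "fail" else 0 for outcome in outcomes]

# With 0/1 coding, the mean of the variable is the proportion of 1s.
p_fail = sum(coded) / len(coded)

# Odds of failing: probability of failing divided by probability of passing.
odds_fail = probability_to_odds(p_fail)

print(coded, p_fail, odds_fail)
```

A handy side effect of the zero/one coding is that the average of the coded variable is simply the proportion of cases in the state coded as 1.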
In OLS regression, a linear equation is built using the relative influence of the X scores to predict the value of Y. In logistic regression, we are not predicting Y per se, but rather we are predicting the probability of Y taking on a specific state. You can think of the null hypothesis for such a test as being like a coin toss. With a fair coin, the chance of the coin coming up tails is 50%. In a research situation, if the X variable is related to the probability of Y, then the odds start to move in a particular direction. This means that we have a better chance of predicting the value of Y given the value of X (this part, at least, is the same as OLS regression). In other words, the underlying logic of OLS regression and logistic regression is very similar. The math behind the results, however, is very different.
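To make the coin toss idea concrete, here is a minimal sketch of how a logistic model turns an X score into a probability. The standard logistic function converts a linear combination of the predictor (the log of the odds) into a value between 0 and 1. The intercept and slope values below are invented for illustration and are not estimates from any data.

```python
import math


def predicted_probability(x, intercept, slope):
    """Probability that Y = 1 for a given x under a one-predictor logistic model."""
    log_odds = intercept + slope * x          # the linear part, on the log-odds scale
    return 1.0 / (1.0 + math.exp(-log_odds))  # logistic function: maps to (0, 1)


x_values = (0, 5, 10)

# Intercept and slope both zero: X tells us nothing, so every case is a coin toss.
null_probs = [predicted_probability(x, 0.0, 0.0) for x in x_values]

# A positive (made-up) slope: the probability of Y = 1 rises as X increases.
related_probs = [predicted_probability(x, -2.0, 0.5) for x in x_values]

print(null_probs)
print(related_probs)
```

When the slope is zero, knowing X does not change the predicted probability at all. When the slope is positive, larger X scores push the probability above the starting value, which is the sense in which the odds "move."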
In previous sections, we’ve considered how to analyze dichotomous independent variables (IVs) using the technique of dummy coding. Often, researchers want to answer questions about dependent variables that are also dichotomous. Logistic regression estimates the probability of an outcome. Using this method, DVs are coded as binary variables with a value of 1 representing the occurrence of a target outcome, and a value of zero representing its absence.
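For readers curious about what the software is doing, the sketch below estimates a one-predictor logistic regression on a tiny invented dataset. Logistic regression is typically estimated by maximum likelihood; the plain gradient ascent loop here is just one simple way to do that and is not necessarily the algorithm your statistical package uses. The variable names, the data, the learning rate, and the number of steps are all assumptions chosen for illustration.

```python
import math

# Hypothetical data: hours studied (X) and whether the student passed (Y).
# The DV is coded 1 for the target outcome ("pass") and 0 for its absence.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 1, 0, 1, 0, 1, 1]


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=20000):
    """Estimate intercept and slope by gradient ascent on the mean log-likelihood."""
    intercept, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residual for each case: observed 0/1 outcome minus predicted probability.
        errors = [y - logistic(intercept + slope * x) for x, y in zip(xs, ys)]
        intercept += learning_rate * sum(errors) / n
        slope += learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return intercept, slope


intercept, slope = fit_logistic(hours, passed)

# Estimated probability of passing after 1 hour and after 8 hours of study.
p_low = logistic(intercept + slope * 1)
p_high = logistic(intercept + slope * 8)

print(intercept, slope, p_low, p_high)
```

Notice that the model never outputs a zero or a one. It outputs an estimated probability of the outcome coded as 1, which the researcher can then interpret or convert into a predicted category.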
Last Modified: 02/14/2019