In social research, we are often interested in complex, latent constructs that cannot be directly observed. Imagine, for example, that a political scientist is interested in “conservatism.” It would be a poor research project indeed if our researcher used the question “Are you a Democrat or a Republican?” to measure this variable. This would make the measure binary: you either are or are not conservative. Yet most political scientists would agree that conservatism is a matter of degree. Some people are much more conservative than others. To really get at “conservatism,” the researcher will need to include several facets of what it means to be a conservative. There are different aspects to the idea; foreign policy beliefs, religious beliefs, criminal justice beliefs, and economic beliefs (just to name a few) all factor into the concept. Now, back up and reread that last sentence, taking note of how the word “factor” is used. In this context, it means “a component” of something (conservatism in our example).
Factor analysis is a family of statistical methods that examine the degree of relatedness of measured variables thought to indicate a common, underlying construct.
We use the logic of these “factors” all the time; we just don’t think about it in those exact terms. Let’s say, for example, that your statistics professor gives you a quiz on central tendency that contains ten multiple-choice questions. Her logic in doing so is not to see if you can answer those ten particular questions per se. The intent, rather, is to assess the underlying construct of “central tendency knowledge.” Which questions she chooses to use is based largely on professional judgment. The logic is that statistics professors are experts in statistics, and that expertise informs them as to which items are good measures of your knowledge of the material. But what if our professor is in a hurry and accidentally includes a question about variability? As a student, you would be upset. After all, you have not read the material on variability yet; you’ve studied central tendency. The question is not fair because it does not measure central tendency knowledge. To fix this problem, you would have to appeal to your professor’s expertise again, hoping that she would judge the item not to measure the underlying construct that it was supposed to measure (and not include it in your grade).
In our statistics quiz example, the issue is simple. The construct “central tendency knowledge” is well defined, and it is relatively easy to see that an item concerning variability does not belong in the list of measured variables (test questions). In social science research, it is common for researchers to deal with constructs (factors) that are neither so well understood nor so easily quantified. Factor analysis provides a method of judging how closely the items in a group (e.g., questions) are related to one another. The logic is very simple: If we measure the same thing several different ways, then all of those measurements should be highly correlated. Factor analysis can also help us figure out how many underlying constructs there are in a given set of data. Let’s return to our political scientist studying conservatism.
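The quiz logic can be sketched with simulated data. The example below is only an illustration under stated assumptions: the latent traits, the noise level, and the continuous item scores are all invented here (real multiple-choice items would be scored right or wrong), and the variable names are hypothetical. Three items are driven by one trait and a fourth, stray item is driven by an unrelated one.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n_students = 500

# Hypothetical latent traits; in real research these are never observed directly.
central_tendency = rng.normal(size=n_students)
variability = rng.normal(size=n_students)

# Three items driven by central tendency knowledge plus item-specific noise.
items = [central_tendency + rng.normal(scale=0.6, size=n_students) for _ in range(3)]
# One stray item driven by a different, unrelated trait.
items.append(variability + rng.normal(scale=0.6, size=n_students))

scores = np.column_stack(items)            # rows are students, columns are items
corr = np.corrcoef(scores, rowvar=False)   # 4 x 4 item correlation matrix
print(np.round(corr, 2))
```

In the printed matrix, the first three items should correlate strongly with one another, while the fourth should correlate only weakly with the rest. That weak correlation is the numerical counterpart of the professor's judgment that the variability question does not belong on the quiz.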
After an extensive review of the literature, our scholar may identify 25 questions thought to measure conservatism. But is it best to treat all of these items as measures of such a general construct as “conservatism”? Or would our researcher achieve more accurate results if the construct were broken down into subcategories such as “religious conservatism” and “fiscal conservatism”? Since the ideas are related, all of the items will be correlated to some degree. Factor analytic techniques can break out these subcategories for us, based on the pattern of stronger and weaker correlations between the items.
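One common rule of thumb for deciding how many constructs underlie a set of items is to count the eigenvalues of the item correlation matrix that exceed 1 (often called the Kaiser criterion). The sketch below applies that rule to simulated data. Everything in it is an assumption made for illustration: eight items rather than 25, two hypothetical traits correlated at about 0.3, and an arbitrary noise level. Other rules exist and may disagree with this one on real data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_respondents = 1000

# Two hypothetical latent traits that are themselves related (correlation about 0.3).
religious = rng.normal(size=n_respondents)
fiscal = 0.3 * religious + np.sqrt(1 - 0.3**2) * rng.normal(size=n_respondents)

# Four survey items per trait: the latent trait plus item-specific noise.
traits = [religious] * 4 + [fiscal] * 4
responses = np.column_stack(
    [trait + rng.normal(scale=0.7, size=n_respondents) for trait in traits]
)

corr = np.corrcoef(responses, rowvar=False)      # 8 x 8 item correlation matrix
eigenvalues = np.linalg.eigvalsh(corr)[::-1]     # sorted largest first
n_factors = int(np.sum(eigenvalues > 1.0))       # eigenvalue-greater-than-one rule

print(np.round(eigenvalues, 2))
print("eigenvalues above 1:", n_factors)
```

Every pair of items is positively correlated here because the two traits are related, yet two eigenvalues stand well above the rest. That pattern is what suggests two subcategories rather than one general construct.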
What factor analysis does, then, is identify a cluster of measurements that have high correlations between all of the items in the cluster. The computer will cluster the items; the subjective process of identifying and naming those clusters is still up to the researcher. For example, our political scientist friend may find that items 3, 6, 9, and 12 form a cluster (factor). The nature of that cluster may or may not be apparent. The items must be examined from a theoretical perspective to provide meaning to the numerical results. If all of them obviously have to do with social spending, then the interpretation is easy. If no such relationship is apparent, then interpreting the results can be quite difficult.
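The clustering step can be sketched as follows. This is a simplified stand-in for what statistical packages do, not a full factor analysis: it uses principal-component loadings in place of a proper factor extraction, and it finds a varimax-style rotation by brute-force search over a single angle, which only works when there are exactly two factors. The data, trait names, and item numbering are hypothetical.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum, over factors, of the variance of the squared loadings."""
    return float(np.sum(np.var(loadings**2, axis=0)))

def rotate(loadings, angle):
    """Rotate a two-column loading matrix by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return loadings @ np.array([[c, -s], [s, c]])

rng = np.random.default_rng(2)
n_respondents = 1000

# Hypothetical latent traits, correlated about 0.3 with each other.
spending = rng.normal(size=n_respondents)
religion = 0.3 * spending + np.sqrt(1 - 0.3**2) * rng.normal(size=n_respondents)

# Items 0-3 are driven by spending attitudes, items 4-7 by religious attitudes.
traits = [spending] * 4 + [religion] * 4
responses = np.column_stack(
    [trait + rng.normal(scale=0.7, size=n_respondents) for trait in traits]
)

# Unrotated loadings from the two largest principal components.
corr = np.corrcoef(responses, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)  # eigenvalues in ascending order
loadings = eigenvectors[:, [-1, -2]] * np.sqrt(eigenvalues[[-1, -2]])

# Brute-force search for the rotation angle that maximizes the criterion.
angles = np.linspace(0.0, np.pi / 2, 181)
best_angle = max(angles, key=lambda a: varimax_criterion(rotate(loadings, a)))
rotated = rotate(loadings, best_angle)

# Assign each item to the factor on which it loads most strongly.
cluster = np.argmax(np.abs(rotated), axis=1)
for item, (row, label) in enumerate(zip(rotated, cluster)):
    print(f"item {item}: loadings {np.round(row, 2)} -> cluster {label}")
```

The output labels the clusters only as 0 and 1. Deciding that one of them means “social spending” and the other “religious conservatism” still requires reading the items themselves, which is the subjective step described above.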
Last Modified: 02/14/2019