Advanced Statistical Analysis:
Adam J. McKee, Ph.D.
This content is released as a draft version for comment by the scholarly community. Please do not distribute as is.
When a researcher computes a set of regression coefficients, the accuracy of those estimates should be a concern. Because the coefficients are computed from sample data, they are only estimates of the corresponding population values. It is helpful to consider the most common sources of error in regression applications. Allison (1999) lists his top three:
- Measurement Error: Few variables can be measured with perfect accuracy, especially in the social sciences.
- Sampling Error: In many cases, our data are only a sample of some larger population, and the sample will never be exactly like the population.
- Uncontrolled Variation: Age and schooling are certainly not the only variables that affect a person’s income, and these uncontrolled variables may “disturb” the relationship between age and income (p. 14).
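To make these sources of error concrete, the following is a minimal simulation sketch, not a method taken from Allison. Every number in it is a hypothetical assumption chosen for illustration: the "true" age slope of 500, the intercept, the spread of the disturbance term, and the amount of noise in recorded age. It repeatedly draws samples from the same invented population and records the estimated age slope each time.

```python
import random
import statistics

TRUE_SLOPE = 500.0  # hypothetical: change in income per year of age


def ols_slope(x, y):
    """Return the least-squares slope from regressing y on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def one_sample_slope(rng, n=200, measurement_sd=0.0):
    """Draw one sample of size n and return its estimated age slope."""
    # Sampling error: each call draws a different set of n people.
    age = [rng.uniform(20, 65) for _ in range(n)]
    # Uncontrolled variation: the gauss() term stands in for everything
    # other than age that affects income (all values are assumptions).
    income = [20000 + TRUE_SLOPE * a + rng.gauss(0, 8000) for a in age]
    # Measurement error: age is recorded with random noise.
    recorded_age = [a + rng.gauss(0, measurement_sd) for a in age]
    return ols_slope(recorded_age, income)


rng = random.Random(42)
clean = [one_sample_slope(rng) for _ in range(500)]
noisy = [one_sample_slope(rng, measurement_sd=10.0) for _ in range(500)]

print("accurately measured age: mean slope", round(statistics.fmean(clean), 1),
      "SD", round(statistics.stdev(clean), 1))
print("noisily measured age:    mean slope", round(statistics.fmean(noisy), 1),
      "SD", round(statistics.stdev(noisy), 1))
```

Under these assumptions, no single sample reproduces the slope of 500 exactly, even when age is measured accurately. In this particular setup, random noise in the recorded predictor also pulls the average estimate toward zero.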
Given these sources of error, we should expect the values of coefficients to fluctuate randomly from one sample to the next. To assess how likely it is that chance alone produced an observed coefficient, we can conduct hypothesis tests and construct confidence intervals.
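As an illustration of both tools, here is a minimal sketch for a simple regression with one predictor. It rests on stated assumptions: the data are simulated from an invented population, the function name `slope_inference` is hypothetical, and the test uses a large-sample normal approximation. The exact small-sample test would use a t distribution with n - 2 degrees of freedom instead.

```python
import math
import random
import statistics


def slope_inference(x, y, level=0.95):
    """Estimate the slope of y on x with a test of H0: slope = 0 and a CI.

    Uses a large-sample normal approximation (an assumption of this sketch).
    """
    n = len(x)
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance, dividing by n - 2 because two coefficients are estimated.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_error = math.sqrt(rss / (n - 2) / sxx)

    # Test statistic and two-sided p-value for the null hypothesis slope = 0.
    normal = statistics.NormalDist()
    z = slope / std_error
    p_value = 2 * (1 - normal.cdf(abs(z)))

    # Confidence interval: estimate plus or minus critical value times SE.
    critical = normal.inv_cdf(0.5 + level / 2)
    interval = (slope - critical * std_error, slope + critical * std_error)
    return slope, std_error, z, p_value, interval


# Hypothetical data: income depends on age plus a random disturbance.
rng = random.Random(0)
age = [rng.uniform(20, 65) for _ in range(500)]
income = [20000 + 500 * a + rng.gauss(0, 8000) for a in age]

slope, std_error, z, p_value, (lower, upper) = slope_inference(age, income)
print("slope:", round(slope, 1), "standard error:", round(std_error, 1))
print("z:", round(z, 2), "p-value:", p_value)
print("95% confidence interval:", (round(lower, 1), round(upper, 1)))
```

A small p-value indicates that a coefficient this far from zero would rarely arise by chance if the population slope were zero. The confidence interval gives a range of population slopes that are consistent with the sample.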
References and Further Reading
Allison, P. D. (1999). Multiple regression: A primer. Thousand Oaks, CA: Sage.
Fox, J. (1997). Applied regression analysis, linear models, and related methods. Thousand Oaks, CA: Sage.
File Created: 08/24/2018 Last Modified: 08/27/2019
This work is licensed under an Open Educational Resource-Quality Master Source (OER-QMS) License.