Abstract

Methods textbooks in sociology and other social sciences routinely recommend the use of the logit or probit model when an outcome variable is binary, an ordered logit or ordered probit when it is ordinal, and a multinomial logit when it has more than two categories. But these methodological guidelines take little or no account of a body of work that, over the past 30 years, has pointed to problematic aspects of these nonlinear probability models and, particularly, to difficulties in interpreting their parameters. In this review, we draw on that literature to explain the problems, show how they manifest themselves in research, discuss the strengths and weaknesses of alternatives that have been suggested, and point to lines of further analysis.

/content/journals/10.1146/annurev-soc-073117-041429
2018-07-30
2024-04-19
  • Article Type: Review Article