
Abstract

Diagnostic classification tests are designed to assess examinees’ discrete mastery status on a set of skills or attributes. Such tests have gained increasing attention in educational and psychological measurement. We review diagnostic classification models and their applications to testing and learning, discuss their statistical and machine learning connections and related challenges, and introduce some contemporary and future extensions.

/content/journals/10.1146/annurev-statistics-033021-111803
2023-03-10
2024-04-15
  • Article Type: Review Article