Graphical models can represent a multivariate distribution in a convenient and accessible form as a graph. Causal models can be viewed as a special class of graphical models that represent not only the distribution of the observed system but also the distributions under external interventions. They hence enable predictions under hypothetical interventions, which is important for decision making. The challenging task of learning causal models from data always relies on some underlying assumptions. We discuss several recently proposed structure learning algorithms and their assumptions, and we compare their empirical performance under various scenarios.






  • Article Type: Review Article