Causal Structure Learning

Christina Heinze-Deml; Marloes H. Maathuis; Nicolai Meinshausen

doi:10.1146/annurev-statistics-031017-100630

Annual Review of Statistics and Its Application

Volume 5, 2018

Review Article

Affiliations: Seminar for Statistics, Department of Mathematics, ETH Zurich, CH-8092 Zurich, Switzerland
Vol. 5:371-391 (Volume publication date March 2018) https://doi.org/10.1146/annurev-statistics-031017-100630
First published as a Review in Advance on December 08, 2017
© Annual Reviews

Abstract

Graphical models can represent a multivariate distribution in a convenient and accessible form as a graph. Causal models can be viewed as a special class of graphical models that represent not only the distribution of the observed system but also the distributions under external interventions. They hence enable predictions under hypothetical interventions, which is important for decision making. The challenging task of learning causal models from data always relies on some underlying assumptions. We discuss several recently proposed structure learning algorithms and their assumptions, and we compare their empirical performance under various scenarios.

Keywords: causal model, directed graphs, feedback, interventions, latent variables
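
The abstract's distinction between the distribution of the observed system and the distributions under external interventions can be illustrated with a small simulation. The following is a minimal sketch, not taken from the article: the three-variable linear model (Z -> X, Z -> Y, X -> Y), its coefficients, and the function names `simulate` and `ols_slope` are all hypothetical choices made for illustration. It compares the observational regression slope of Y on X with the effect of setting X by intervention.

```python
import random


def simulate(n, do_x=None, seed=0):
    """Draw n (x, y) samples from a toy linear structural causal model.

    Assumed (hypothetical) graph: Z -> X, Z -> Y, X -> Y.
    If do_x is given, the structural equation for X is replaced by X := do_x,
    which mimics an external intervention on X.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # common cause of X and Y
        if do_x is None:
            x = 2.0 * z + rng.gauss(0.0, 1.0)  # observational mechanism for X
        else:
            x = do_x  # intervention: X no longer depends on Z
        y = 1.5 * x + 3.0 * z + rng.gauss(0.0, 1.0)  # true effect of X on Y is 1.5
        samples.append((x, y))
    return samples


def mean(values):
    return sum(values) / len(values)


def ols_slope(samples):
    """Least-squares slope of y regressed on x."""
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in samples)
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


# Observational data: the slope mixes the causal effect with confounding via Z.
observational_slope = ols_slope(simulate(20000))

# Interventional data: difference in mean Y between do(X = 1) and do(X = 0).
y_do1 = mean([y for _, y in simulate(20000, do_x=1.0, seed=1)])
y_do0 = mean([y for _, y in simulate(20000, do_x=0.0, seed=2)])
interventional_effect = y_do1 - y_do0

print(f"observational slope:   {observational_slope:.2f}")
print(f"interventional effect: {interventional_effect:.2f}")
```

Under these assumed coefficients the population regression slope is 13.5 / 5 = 2.7, while the interventional contrast equals the structural coefficient 1.5. The observational distribution alone therefore does not determine the effect of the intervention unless further assumptions about the causal structure are made.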

Supplemental Material

A Supplemental Appendix (PDF) accompanies this article.
