Abstract

The data science of networks is a rapidly developing field with myriad applications. In neuroscience, the brain is commonly modeled as a connectome, a network of nodes connected by edges. While there have been thousands of papers on connectomics, the statistics of networks remains limited and poorly understood. Here, we provide an overview from the perspective of statistical network science of the kinds of models, assumptions, problems, and applications that are theoretically and empirically justified for analysis of connectome data. We hope this review spurs further development and application of statistically grounded methods in connectomics.

/content/journals/10.1146/annurev-statistics-042720-023234
2021-03-07
2024-03-29
  • Article Type: Review Article