
Abstract

The recent striking success of deep neural networks in machine learning raises profound questions about the theoretical principles underlying their success. For example, what can such deep networks compute? How can we train them? How does information propagate through them? Why can they generalize? And how can we teach them to imagine? We review recent work in which methods of physical analysis rooted in statistical mechanics have begun to provide conceptual insights into these questions. These insights yield connections between deep learning and diverse physical and mathematical topics, including random landscapes, spin glasses, jamming, dynamical phase transitions, chaos, Riemannian geometry, random matrix theory, free probability, and nonequilibrium statistical mechanics. Indeed, the fields of statistical mechanics and machine learning have long enjoyed a rich history of strongly coupled interactions, and recent advances at the intersection of statistical mechanics and deep learning suggest these interactions will only deepen going forward.

/content/journals/10.1146/annurev-conmatphys-031119-050745
2020-03-10
2024-10-03

  • Article Type: Review Article