
Abstract

We review the problem of defining and inferring a state for a control system based on complex, high-dimensional, highly uncertain measurement streams, such as videos. Such a state, or representation, should contain all and only the information needed for control and discount nuisance variability in the data. It should also have finite complexity, ideally modulated depending on available resources. This representation is what we want to store in memory in lieu of the data, as it separates the control task from the measurement process. For the trivial case with no dynamics, a representation can be inferred by minimizing the information bottleneck Lagrangian in a function class realized by deep neural networks. The resulting representation has much higher dimension than the data (already in the millions) but is smaller in the sense of information content, retaining only what is needed for the task. This process also yields representations that are invariant to nuisance factors and have maximally independent components. We extend these ideas to the dynamic case, where the representation is the posterior density of the task variable given the measurements up to the current time, which is in general much simpler than the prediction density maintained by the classical Bayesian filter. Again, this can be finitely parameterized using a deep neural network, and some applications are already beginning to emerge. No explicit assumption of Markovianity is needed; instead, complexity trades off approximation of an optimal representation, including the degree of Markovianity.
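In the static (no-dynamics) case described above, the representation is obtained by minimizing the information bottleneck Lagrangian. As a minimal illustrative sketch (not the deep variational relaxation used in practice), one common convention for discrete variables writes the objective as L = I(X;Z) − β·I(Z;Y), where Z is the representation produced by a stochastic encoder p(z|x); for finite alphabets it can be evaluated in closed form. The function names below are illustrative, not from the reviewed work:

```python
import math

def mutual_info(pxy):
    """I(X;Y) in nats for a discrete joint distribution given as a nested list."""
    px = [sum(row) for row in pxy]
    py = [sum(col) for col in zip(*pxy)]
    return sum(
        pxy[i][j] * math.log(pxy[i][j] / (px[i] * py[j]))
        for i in range(len(px)) for j in range(len(py))
        if pxy[i][j] > 0
    )

def ib_lagrangian(pxy, pz_given_x, beta):
    """Information bottleneck Lagrangian L = I(X;Z) - beta * I(Z;Y)
    for a stochastic encoder pz_given_x[i][k] = p(z=k | x=i)."""
    px = [sum(row) for row in pxy]
    nz = len(pz_given_x[0])
    # joint p(x,z) induced by the encoder
    pxz = [[pz_given_x[i][k] * px[i] for k in range(nz)]
           for i in range(len(px))]
    # joint p(z,y), marginalizing x out
    pzy = [[sum(pz_given_x[i][k] * pxy[i][j] for i in range(len(px)))
            for j in range(len(pxy[0]))] for k in range(nz)]
    return mutual_info(pxz) - beta * mutual_info(pzy)
```

With Y = X uniform binary, a lossless encoder scores (1 − β)·ln 2 while a constant encoder scores 0, so for β > 1 the Lagrangian favors retaining task information over compressing it away — the trade-off the abstract describes between complexity and sufficiency.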

DOI: 10.1146/annurev-control-060117-105140
Published online: 2018-05-28
  • Article Type: Review Article