Abstract

Machine learning is rapidly advancing nearly every field of science and engineering, and control theory is no exception. In particular, it has shown incredible promise for handling several of the main challenges facing modern dynamics and control, including complexity, unmodeled dynamics, strong nonlinearity, and hidden variables. However, machine learning models are often expensive to train and deploy, fail to generalize beyond the training data, and suffer from a lack of explainability, interpretability, and guarantees, all of which limit their use in real-world and safety-critical control applications. Sparse nonlinear modeling and control techniques are a powerful class of machine learning methods that promote parsimony through sparse optimization, providing data-efficient models that are more interpretable and generalizable and that have proven effective for control. In this review, we explore the use of sparse optimization in the context of machine learning to develop compact models and controllers that are easy to train, require significantly less data, and make low-latency predictions. In particular, we focus on applications in model predictive control and reinforcement learning, two foundational algorithmic frameworks in control theory.
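To make the sparsity-promoting regression described above concrete, the following is a minimal, self-contained sketch of sequentially thresholded least squares, the workhorse algorithm behind SINDy-style model discovery, applied to toy data from a damped oscillator. The library, threshold, and toy system below are illustrative assumptions, not the article's implementation; mature implementations are available in packages such as PySINDy.

```python
# A minimal sketch of sparse model discovery in the spirit of SINDy
# (sequentially thresholded least squares, STLSQ). The library and
# toy system here are illustrative assumptions, not the article's code.
import numpy as np

def build_library(X):
    """Candidate feature library Theta(X) = [1, x1, x2, x1^2, x1*x2, x2^2]
    for a two-state system (an assumed polynomial library)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x1 * x2, x2**2])

def stlsq(Theta, dXdt, threshold=0.05, n_iter=10):
    """STLSQ: regress, zero out coefficients below `threshold`, and
    refit on the surviving terms to promote a parsimonious model."""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):        # refit each state equation
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], dXdt[:, k],
                                             rcond=None)[0]
    return Xi

# Toy trajectory from a damped linear oscillator, dx/dt = A x
A = np.array([[-0.1, 2.0], [-2.0, -0.1]])
dt, T = 0.01, 10.0
t = np.arange(0.0, T, dt)
X = np.zeros((len(t), 2))
X[0] = [2.0, 0.0]
for i in range(len(t) - 1):                   # simple forward-Euler rollout
    X[i + 1] = X[i] + dt * (A @ X[i])
dXdt = (X[1:] - X[:-1]) / dt                  # finite-difference derivatives

Xi = stlsq(build_library(X[:-1]), dXdt)
# Rows of Xi correspond to [1, x1, x2, x1^2, x1*x2, x2^2]; the nonzero
# entries should recover the linear terms of A, with the rest zeroed out.
print(Xi)
```

Once identified, a sparse model of this form can serve as the compact, low-latency prediction model inside a model predictive control loop or a model-based reinforcement learning agent, which are the two application settings this review surveys.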

