Has Dynamic Programming Improved Decision Making?

John Rust

doi:10.1146/annurev-economics-080218-025721


Affiliations: Department of Economics, Georgetown University, Washington, DC 20057, USA
Vol. 11:833-858 (Volume publication date August 2019) https://doi.org/10.1146/annurev-economics-080218-025721
First published as a Review in Advance on June 12, 2019
Copyright © 2019 by Annual Reviews. All rights reserved

Abstract

Dynamic programming (DP) is a powerful tool for solving a wide class of sequential decision-making problems under uncertainty. In principle, it enables us to compute optimal decision rules that specify the best possible decision in any situation. This article reviews developments in DP and contrasts its revolutionary impact on economics, operations research, engineering, and artificial intelligence with the comparative paucity of its real-world applications to improve the decision making of individuals and firms. The fuzziness of many real-world decision problems and the difficulty in mathematically modeling them are key obstacles to a wider application of DP in real-world settings. Nevertheless, I discuss several success stories, and I conclude that DP offers substantial promise for improving decision making if we let go of the empirically untenable assumption of unbounded rationality and confront the challenging decision problems faced every day by individuals and firms.

Keyword(s): actor-critic algorithms, artificial intelligence, behavioral economics, bounded rationality, computational complexity, curse of dimensionality, dynamic programming, JEL C5, JEL C6, JEL C8, JEL C9, JEL D9, JEL L2, neural networks, reinforcement learning, revenue management




Article Type: Review Article


Annual Review of Economics

Volume 11, 2019
