A Perspective on Incentive Design: Challenges and Opportunities

Lillian J. Ratliff; Roy Dong; Shreyas Sekar; Tanner Fiez

doi:10.1146/annurev-control-053018-023634

Annual Review of Control, Robotics, and Autonomous Systems

Volume 2, 2019

Review Article

Free

A Perspective on Incentive Design: Challenges and Opportunities

Lillian J. Ratliff¹, Roy Dong², Shreyas Sekar¹, and Tanner Fiez¹
View Affiliations Hide Affiliations

Affiliations: ¹Department of Electrical Engineering, University of Washington, Seattle, Washington 98195, USA; email: [email protected] ²Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign, Urbana, Illinois 61801, USA
Vol. 2:305-338 (Volume publication date May 2019) https://doi.org/10.1146/annurev-control-053018-023634
First published as a Review in Advance on December 11, 2019
Copyright © 2019 by Annual Reviews. All rights reserved

Abstract

The increasingly tight coupling between humans and system operations in domains ranging from intelligent infrastructure to e-commerce has led to a challenging new class of problems founded on a well-established area of research: incentive design. There is a clear need for a new tool kit for designing mechanisms that help coordinate self-interested parties while avoiding unexpected outcomes in the face of information asymmetries, exogenous uncertainties from dynamic environments, and resource constraints. This article provides a perspective on the current state of the art in incentive design from three core communities—economics, control theory, and machine learning—and highlights interesting avenues for future research at the interface of these domains.

Keyword(s): control theory, economics, incentive design, machine learning

Article metrics loading...

/content/journals/10.1146/annurev-control-053018-023634

2019-05-03

2024-04-26

Full text loading...

/deliver/fulltext/control/2/1/annurev-control-053018-023634.html?itemId=/content/journals/10.1146/annurev-control-053018-023634&mimeType=html&fmt=ahah

Literature Cited

1. Campaigne C, Balandat M, Ratliff LJ 2016. Welfare effects of dynamic electricity pricing Work. Pap., Univ. Calif., Berkeley
2. Ge Y, Knittel CR, MacKenzie D, Zoepf S 2016. Racial and gender discrimination in transportation network companies NBER Work. Pap. 22776
3. Yuan C, Thai J, Bayen AM 2016. ZUbers against ZLyfts apocalypse: an analysis framework for DoS attacks on mobility-as-a-service systems. 2016 ACM/IEEE 7th International Conference on Cyber-Physical Systems (ICCPS) New York: IEEE https://doi.org/10.1109/ICCPS.2016.7479132
[Crossref] [Google Scholar]
4. Westenbroek T, Dong R, Ratliff LJ, Sastry SS 2017. Statistical estimation in competitive settings with strategic data sources. 2017 IEEE 56th Annual Conference on Decision and Control4994–99 New York: IEEE
[Google Scholar]
5. Ho CJ, Slivkins A, Vaughan JW 2016. Adaptive contract design for crowdsourcing markets: bandit algorithms for repeated principal-agent problems. J. Artif. Intell. Res. 55:317–59
[Google Scholar]
6. Singla A, Krause A 2013. Truthful incentives in crowdsourcing tasks using regret minimization mechanisms. Proceedings of the 22nd International Conference on World Wide Web1167–78 New York: ACM
[Google Scholar]
7. Cai Y, Daskalakis C, Papadimitriou CH 2015. Optimum statistical estimation with strategic data sources. Proceedings of the 28th Conference on Learning Theory P Grünwald, E Hazan, S Kale280–96 Proc. Mach. Learn. Res. 40 N.p.: PMLR
[Google Scholar]
8. Goodfellow IJ, Shlens J, Szegedy C 2014. Explaining and harnessing adversarial examples. arXiv:1412.6572 [stat.ML]
9. Laffont JJ, Martimort D 2002. The Theory of Incentives: The Principal-Agent Model Princeton, NJ: Princeton Univ. Press
10. Bolton P, Dewatripont M 2005. Contract Theory Cambridge, MA: MIT Press
11. Weber T 2011. Optimal Control Theory with Applications in Economics Cambridge, MA: MIT Press
12. Antmann P 2009. Reducing technical and non-technical losses in the power sector Rep. 92639, World Bank, Washington, DC
13. Amin S, Schwartz G, Cardenas A, Sastry S 2015. Game-theoretic models of electricity theft detection in smart utility networks. IEEE Control Syst. Mag. 35:166–81
[Google Scholar]
14. Akerlof GA 1970. The market for “lemons”: quality uncertainty and the market mechanism. Q. J. Econ. 84:488–500
[Google Scholar]
15. Mussa M, Rosen S 1978. Monopoly and product quality. J. Econ. Theory 18:301–17
[Google Scholar]
16. Maskin E, Riley J 1984. Monopoly with incomplete information. RAND J. Econ. 15:171–96
[Google Scholar]
17. Rothschild M, Stiglitz J 1976. Equilibrium in competitive insurance markets: an essay on the economics of imperfect information. Q. J. Econ. 90:629–49
[Google Scholar]
18. Spence A 1974. Market Signaling: Informational Transfer in Hiring and Related Screening Processes Cambridge, MA: Harvard Univ. Press
19. Alchian AA, Demsetz H 1972. Production, information costs, and economic organization. Am. Econ. Rev. 62:777–95
[Google Scholar]
20. Hölmstrom B 1979. Moral hazard and observability. Bell J. Econ. 10:74–91
[Google Scholar]
21. Dirk B, Juuso V 2010. The dynamic pivot mechanism. Econometrica 78:771–89
[Google Scholar]
22. Susan A, Ilya S 2013. An efficient dynamic mechanism. Econometrica 81:2463–85
[Google Scholar]
23. Courty P, Hao L 2000. Sequential screening. Rev. Econ. Stud. 67:697–717
[Google Scholar]
24. Battaglini M 2005. Long-term contracting with Markovian consumers. Am. Econ. Rev. 95:637–58
[Google Scholar]
25. Esö P, Szentes B 2007. Optimal information disclosure in auctions and the handicap auction. Rev. Econ. Stud. 74:705–31
[Google Scholar]
26. Board S 2007. Selling options. J. Econ. Theory 136:324–40
[Google Scholar]
27. Kakade SM, Lobel I, Nazerzadeh H 2013. Optimal dynamic mechanism design and the virtual-pivot mechanism. Oper. Res. 61:837–54
[Google Scholar]
28. Alessandro P, Ilya S, Juuso T 2014. Dynamic mechanism design: a Myersonian approach. Econometrica 82:601–53
[Google Scholar]
29. Borenstein S 2005. The long-run efficiency of real-time electricity pricing. Energy J. 26:93–116
[Google Scholar]
30. Jónsson T, Pinson P, Madsen H 2010. On the market impact of wind energy forecasts. Energy Econ. 32:313–20
[Google Scholar]
31. Başar T, Olsder GJ 1995. Dynamic Noncooperative Game Theory Philadelphia: Soc. Ind. Appl. Math.
32. Ho YCH, Luh PB, Olsder GJ 1980. A control-theoretic view on incentives. 1980 19th IEEE Conference on Decision and Control Including the Symposium on Adaptive Processes1160–70 New York: IEEE
[Google Scholar]
33. Groot N, Schutter BD, Hellendoorn H 2012. Reverse Stackelberg games, part I: basic framework. 2012 IEEE International Conference on Control Applications421–26 New York: IEEE
[Google Scholar]
34. Zheng YP, Başar T, Cruz JB 1984. Stackelberg strategies and incentives in multiperson deterministic decision problems. IEEE Trans. Syst. Man Cybernet. SMC-14:10–24
[Google Scholar]
35. Ho YC, Luh P, Muralidharan R 1981. Information structure, Stackelberg games, and incentive controllability. IEEE Trans. Autom. Control 26:454–60
[Google Scholar]
36. Zheng YP, Başar T 1982. Existence and derivation of optimal affine incentive schemes for Stackelberg games with partial information: a geometric approach. Int. J. Control 35:997–1011
[Google Scholar]
37. Ratliff LJ, Coogan S, Calderone D, Sastry SS 2012. Pricing in linear-quadratic dynamic games. 2012 50th Annual Allerton Conference on Communication, Control, and Computing1798–805 New York: IEEE
[Google Scholar]
38. Olsder GJ 2009. Phenomena in inverse Stackelberg games, part 1: static problems. J. Optim. Theory Appl. 143:589
[Google Scholar]
39. Olsder GJ 2009. Phenomena in inverse Stackelberg games, part 2: dynamic problems. J. Optim. Theory Appl. 143:601
[Google Scholar]
40. Liu X, Zhang S 1992. Optimal incentive strategy for leader-follower games. IEEE Trans. Autom. Control 37:1957–61
[Google Scholar]
41. Ho YC 1983. On incentive problems. Syst. Control Lett. 3:62–68
[Google Scholar]
42. Cruz J 1978. Leader-follower strategies for multilevel systems. IEEE Trans. Autom. Control 23:244–55
[Google Scholar]
43. Başar T, Selbuz H 1979. Closed-loop Stackelberg strategies with applications in optimal control or multilevel systems. IEEE Trans. Autom. Control 24:166–79
[Google Scholar]
44. Tolwinski B 1981. Closed-loop Stackelberg solution to a multistage linear-quadratic game. J. Optim. Theory Appl. 34:485–501
[Google Scholar]
45. Banerjee S, Johari R, Riquelme C 2015. Pricing in ride-sharing platforms: a queueing-theoretic approach. Proceedings of the Sixteenth ACM Conference on Economics and Computationp 639 New York: ACM
[Google Scholar]
46. Calderone D, Ratliff LJ, Sastry SS 2014. Pricing for coordination in open-loop differential games. IFAC Proc. Vol. 47:9001–6
[Google Scholar]
47. Jing YW, Zhang SY 1988. The solution to a kind of Stackelberg game systems with multi-follower: coordinative and incentive. Analysis and Optimization of Systems A Bensoussan, JL Lions pp. 593–602. Berlin: Springer
[Google Scholar]
48. Zhang SY 1987. A nonlinear incentive strategy for multi-stage Stackelberg games with partial information. 1986 25th IEEE Conference on Decision and Control1352–57 New York: IEEE
[Google Scholar]
49. Dobakhshari DG, Gupta V 2016. A contract design approach for phantom demand response. arXiv:1611.09788 [math.OC]
50. Vamvoudakis KG, Lewis FL, Dixon WE 2019. Open-loop Stackelberg learning solution for hierarchical control problems. Int. J. Adapt. Control Signal Process. 33:285–99
[Google Scholar]
51. Ratliff LJ 2015. Incentivizing efficiency in societal-scale cyber-physical systems PhD Thesis, Univ. Calif., Berkeley
52. Ratliff LJ, Fiez T 2018. Adaptive incentive design. arXiv:1806.05749 [cs.GT]
53. Auer P, Cesa-Bianchi N, Fischer P 2002. Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47:235–56
[Google Scholar]
54. Arora S, Hazan E, Kale S 2012. The multiplicative weights update method: a meta-algorithm and applications. Theory Comput. 8:121–64
[Google Scholar]
55. Auer P, Cesa-Bianchi N, Freund Y, Schapire RE 2002. The nonstochastic multiarmed bandit problem. J. Comput. 32:48–77
[Google Scholar]
56. Bubeck S, Cesa-Bianchi N 2012. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends Mach. Learn. 5:1–122
[Google Scholar]
57. Bubeck S 2011. Introduction to online optimization Lect. Notes, Dep. Oper. Res. Financ. Eng., Princeton Univ., Princeton, NJ http://sbubeck.com/BubeckLectureNotes.pdf
58. Hazan E 2016. Introduction to online convex optimization. Found. Trends Optim. 2:157–325
[Google Scholar]
59. Ratliff LJ, Sekar S, Zheng L, Fiez T 2018. Incentives in the dark: multi-armed bandits for evolving users with unknown type. arXiv:1803.04008 [cs.LG]
60. Fiez T, Sekar S, Zheng L, Ratliff LJ 2018. Combinatorial bandits for incentivizing agents with dynamic preferences. Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference A Globerson, R Silva693–703 Corvallis, OR: AUAI Press
[Google Scholar]
61. Jain S, Gujar S, Bhat S, Zoeter O, Narahari Y 2018. A quality assuring, cost optimal multi-armed bandit mechanism for expert sourcing. Artif. Intell. 254:44–63
[Google Scholar]
62. Jain S, Narayanaswamy B, Narahari Y 2014. A multiarmed bandit incentive mechanism for crowdsourcing demand response in smart grids. Twenty-Eighth AAAI Conference on Artificial Intelligence721–27 Menlo Park, CA: AAAI Press
[Google Scholar]
63. Mansour Y, Slivkins A, Syrgkanis V, Wu ZS 2016. Bayesian exploration: incentivizing exploration in Bayesian games. Proceedings of the 2016 ACM Conference on Economics and Computationp 661 New York: ACM
[Google Scholar]
64. Li L, Chu W, Langford J, Schapire RE 2010. A contextual-bandit approach to personalized news article recommendation. Proceedings of the 19th International Conference on World Wide Web667–70 New York: ACM
[Google Scholar]
65. Wais P, Lingamneni S, Cook D, Fennell J, Goldenberg B 2010. Towards building a high-quality workforce with Mechanical Turk Paper presented at the Computational Social Science and the Wisdom of Crowds Workshop, Neural Information Processing Systems Conference, Whistler, Can., Dec. 10 https://people.cs.umass.edu/∼wallach/workshops/nips2010css/papers/wais.pdf
66. Huang JL, Liu M, Bowling NA 2015. Insufficient effort responding: examining an insidious confound in survey data. J. Appl. Psychol. 100:828–45
[Google Scholar]
67. Lovett M, Bajaba S, Lovett M, Simmering MJ 2018. Data quality from crowdsourced surveys: a mixed method inquiry into perceptions of Amazon's Mechanical Turk Masters. Appl. Psychol. 67:339–66
[Google Scholar]
68. Guha S, Munagala K 2007. Approximation algorithms for budgeted learning problems. Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing104–13 New York: ACM
[Google Scholar]
69. Babaioff M, Sharma Y, Slivkins A 2014. Characterizing truthful multi-armed bandit mechanisms. SIAM J. Comput. 43:194–230
[Google Scholar]
70. Einav L, Farronato C, Levin J, Sundaresan N 2018. Auctions versus posted prices in online markets. J. Political Econ. 126:178–215
[Google Scholar]
71. Blum A, Hartline JD 2005. Near-optimal online auctions. Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms1156–63 Philadelphia: Soc. Ind. Appl. Math.
[Google Scholar]
72. Hajiaghayi MT, Kleinberg RD, Parkes DC 2004. Adaptive limited-supply online auctions. Proceedings of the 5th ACM Conference on Electronic Commerce71–80 New York: ACM
[Google Scholar]
73. Cesa-Bianchi N, Gentile C, Mansour Y 2015. Regret minimization for reserve prices in second-price auctions. IEEE Trans. Inf. Theory 61:549–64
[Google Scholar]
74. Babaioff M, Dughmi S, Kleinberg R, Slivkins A 2015. Dynamic pricing with limited supply. ACM Trans. Econ. Comput. 3:4
[Google Scholar]
75. Badanidiyuru A, Kleinberg R, Slivkins A 2018. Bandits with knapsacks. J. ACM 65:13
[Google Scholar]
76. Roth A, Ullman J, Wu ZS 2016. Watch and learn: optimizing from revealed preferences feedback. Proceedings of the 48th Annual ACM Symposium on Theory of Computing949–62 New York: ACM
[Google Scholar]
77. Frazier PI, Kempe D, Kleinberg JM, Kleinberg R 2014. Incentivizing exploration. Proceedings of the Fifteenth ACM Conference on Economics and Computation5–22 New York: ACM
[Google Scholar]
78. Han L, Kempe D, Qiang R 2015. Incentivizing exploration with heterogeneous value of money. Proceedings of the 11th International Conference on Web and Internet Economics370–83 Berlin: Springer
[Google Scholar]
79. Singla A, Santoni M, Bartók G, Mukerji P, Meenen M, Krause A 2015. Incentivizing users for balancing bike sharing systems. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence723–29 Menlo Park, CA: AAAI Press
[Google Scholar]
80. Papanastasiou Y, Bimpikis K, Savva N 2017. Crowdsourcing exploration. Manag. Sci. 64:1727–46
[Google Scholar]
81. Liu Y, Ho C 2018. Incentivizing high quality user contributions: new arm generation in bandit learning. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence1146–53 Menlo Park, CA: AAAI Press
[Google Scholar]
82. Liu Y, Chen Y 2016. A bandit framework for strategic regression. Advances in Neural Information Processing Systems 29 DD Lee, M Sugiyama, UV Luxburg, I Guyon, R Garnett1813–21 Red Hook, NY: Curran
[Google Scholar]
83. Roughgarden T, Schrijvers O 2017.Online prediction with selfish experts Advances in Neural Information Processing Systems 30 I Guyon, UV Luxburg, S Bengio, H Wallach, R Fergus1300–10 Red Hook, NY: Curran:
84. Amin K, Rostamizadeh A, Syed U 2013. Learning prices for repeated auctions with strategic buyers. Advances in Neural Information Processing Systems 26 CJC Burges, L Bottou, M Welling, Z Ghahramani, KQ Weinberger1169–77 Red Hook, NY: Curran:
[Google Scholar]
85. Braverman M, Mao J, Schneider J, Weinberg SM 2017. Multi-armed bandit problems with strategic arms. arXiv:1706.09060 [cs.GT]
86. Tversky A, Kahneman D 1974. Judgment under uncertainty: heuristics and biases. Science 185:1124–31
[Google Scholar]
87. Brown PN, Marden JR 2017. Studies on robust social influence mechanisms: incentives for efficient network routing in uncertain settings. IEEE Control Syst. 37:98–115
[Google Scholar]
88. Croci E 2016. Urban road pricing: a comparative study on the experiences of London, Stockholm and Milan. Transp. Res. Procedia 14:253–62
[Google Scholar]
89. Kahneman D, Tversky A 1979. Prospect theory: an analysis of decision under risk. Econometrica 47:263–91
[Google Scholar]
90. Tang CK 2016. Traffic externalities and housing prices: evidence from the London congestion charge Discuss. Pap. 205, Spatial Econ. Res. Cent., London
91. Tversky A, Kahneman D 1992. Advances in prospect theory: cumulative representation of uncertainty. J. Risk Uncertain. 5:297–323
[Google Scholar]
92. Simon HA 1955. A behavioral model of rational choice. Q. J. Econ. 69:99–118
[Google Scholar]
93. Cohen A, Einav L 2007. Estimating risk preferences from deductible choice. Am. Econ. Rev. 97:745–88
[Google Scholar]
94. Gershman SJ, Horvitz EJ, Tenenbaum JB 2015. Computational rationality: a converging paradigm for intelligence in brains, minds, and machines. Science 349:273–78
[Google Scholar]
95. Ratliff LJ, Mazumdar E 2017. Inverse risk-sensitive reinforcement learning. arXiv:1703.09842v3 [cs.LG]
96. Mazumdar E, Ratliff LJ, Fiez T, Sastry SS 2017. Gradient-based inverse risk-sensitive reinforcement learning. 2017 IEEE 56th Annual Conference on Decision and Control5796–801 New York: IEEE
[Google Scholar]
97. Shen Y, Tobia MJ, Sommer T, Obermayer K 2014. Risk-sensitive reinforcement learning. Neural Comput. 26:1298–328
[Google Scholar]
98. Majumdar A, Singh S, Mandlekar A, Provone M 2017. Risk-sensitive inverse reinforcement learning via coherent risk models. Robotics: Science and Systems XIII N Amato, S Srinivasa, N Ayanian, S Kuindersma chap. 69. N.p.: Robot. Sci. Syst. Found.
99. Acemoglu D, Makhdoumi A, Malekian A, Ozdaglar A 2018. Informational Braess’ paradox: the effect of information on traffic congestion. Oper. Res. 66:893–917
[Google Scholar]
100. Sekar S, Zheng L, Ratliff LJ, Zhang B 2018. Uncertainty in multi-commodity routing networks: When does it help. 2018 Annual American Control Conference6553–58 New York: IEEE:
[Google Scholar]
101. Wu M, Liu J, Amin S 2017. Informational aspects in a class of Bayesian congestion games. 2017 American Control Conference3650–57 New York: IEEE
[Google Scholar]
102. Braess D, Nagurney A, Wakolbinger T 2005. On a paradox of traffic planning. Transp. Sci. 39:446–50
[Google Scholar]
103. Lo C 2012. Game theory: introducing randomness to airport security. Airport Technology July 25 http://www.airport-technology.com/features/featuregame-theory-airport-security-teamcore-Stackelberg
104. Li P, Sekar S, Zhang B 2018. A capacity-price game for uncertain renewables resources. Proceedings of the Ninth International Conference on Future Energy Systems119–33 New York: ACM
[Google Scholar]
105. Kamenica E, Gentzkow M 2011. Bayesian persuasion. Am. Econ. Rev. 101:2590–615
[Google Scholar]
106. Bergemann D, Morris S 2018. Information design: a unified perspective. J. Econ. Lit. In press
107. Vakili S, Zhao Q 2016. Risk-averse multi-armed bandit problems under mean-variance measure. J. Sel. Top. Signal Process. 10:1093–111
[Google Scholar]
108. Even-Dar E, Kearns MJ, Wortman J 2006. Risk-sensitive online learning. Algorithmic Learning Theory: 17th International Conference, ALT 2006 JL Balcázar, PM Long, F Stephan199–213 Berlin: Springer:
[Google Scholar]
109. Haws KL, Bearden WO 2006. Dynamic pricing and consumer fairness perceptions. J. Consum. Res. 33:304–11
[Google Scholar]
110. Irwin N 2017. Why surge prices make us so mad: what Springsteen, Home Depot and a Nobel winner know. New York Times Oct. 14 https://www.nytimes.com/2017/10/14/upshot/why-surge-prices-make-us-so-mad-what-springsteen-home-depot-and-a-nobel-winner-know.html
111. Varian HR 1974. Equity, envy, and efficiency. J. Econ. Theory 9:63–91
[Google Scholar]
112. Berliant M, Thomson W, Dunz K 1992. On the fair division of a heterogeneous commodity. J. Math. Econ. 21:201–16
[Google Scholar]
113. Dwork C, Hardt M, Pitassi T, Reingold O, Zemel R 2012. Fairness through awareness. Proceedings of the 3rd Innovations in Theoretical Computer Science Conference214–26 New York: ACM
[Google Scholar]
114. Joseph M, Kearns M, Morgenstern JH, Roth A 2016.Fairness in learning: classic and contextual bandits Advances in Neural Information Processing Systems 29 DD Lee, M Sugiyama, UV Luxburg, I Guyon, R Garnett325–33 Red Hook, NY: Curran
115. Nace D, Pioro M 2008. Max-min fairness and its applications to routing and load-balancing in communication networks: a tutorial. IEEE Commun. Surv. Tutor. 10:5–17
[Google Scholar]
116. Gillen S, Jung C, Kearns M, Roth A 2018. Online learning with an unknown fairness metric. arXiv:1802.06936 [cs.LG]
117. Bertsimas D, Farias VF, Trichakis N 2012. On the efficiency-fairness trade-off. Manag. Sci. 58:2234–50
[Google Scholar]
118. Hu L, Chen Y 2018. A short-term intervention for long-term fairness in the labor market. Proceedings of the 2018 World Wide Web Conference1389–98 Geneva: Int. World Wide Web Conf. Comm.
[Google Scholar]
119. Benade G, Kazachkov AM, Procaccia AD, Psomas CA 2018. How to make envy vanish over time. Proceedings of the 2018 ACM Conference on Economics and Computation593–610 New York: ACM
[Google Scholar]
120. Lorini E, Mühlenbernd R 2015. The long-term benefits of following fairness norms: a game-theoretic analysis. PRIMA 2015: Principles and Practice of Multi-Agent Systems Q Chen, P Torroni, S Villata, J Hsu, A Omicini301–18 Cham, Switz.: Springer
[Google Scholar]
121. Corbett-Davies S, Pierson E, Feller A, Goel S, Huq A 2017. Algorithmic decision making and the cost of fairness. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining797–806 New York: ACM
[Google Scholar]
122. Friedler SA, Scheidegger C, Venkatasubramanian S 2016. On the (im)possibility of fairness. arXiv:1609.07236 [cs.CY]
123. Smith M, Patil D, Munoz C 2016. Big data: a report on algorithmic systems, opportunity, and civil rights Rep., Exec. Off. Pres., Washington, DC
124. Aghion P, Bloom N, Blundell R, Griffith R, Howitt P 2005. Competition and innovation: an inverted-U relationship. Q. J. Econ. 120:701–28
[Google Scholar]
125. Anandkumar A, Michael N, Tang A 2010. Opportunistic spectrum access with multiple users: learning under competition. 2010 Proceedings IEEE INFOCOM New York: IEEE: https://doi.org/10.1109/INFCOM.2010.5462144
[Crossref] [Google Scholar]
126. Mansour Y, Slivkins A, Wu ZS 2018. Competing bandits: learning under competition. Proceedings of the 9th Innovations in Theoretical Computer Science Conference AR Karlin48 Saarbrücken, Ger.: Dagstuhl
[Google Scholar]
127. Ben-Porat O, Tennenholtz M 2018. Competing prediction algorithms. arXiv:1806.01703 [cs.GT]
128. Fed. Trade Comm 2017. Algorithms and collusion - note by the United States Doc. DAF/COMP/WD(2017)41, Fed. Trade Comm., Washington, DC
129. Mehra SK 2015. Antitrust and the robo-seller: competition in the time of algorithms. Minn. Law Rev. 100:1323–75
[Google Scholar]
130. Dani V, Hayes TP, Kakade SM 2008. Stochastic linear optimization under bandit feedback. Proceedings of the 21st Annual Conference on Learning Theory355–66 Madison, WI: Omnipress
[Google Scholar]
131. Abbasi-Yadkori Y, Pál D, Szepesvári C 2011. Improved algorithms for linear stochastic bandits. Advances in Neural Information Processing Systems 24 J Shawe-Taylor, RS Zemel, PL Bartlett, F Pereira, KQ Weinberger2312–20 Red Hook, NY: Curran
[Google Scholar]
132. Slivkins A 2011. Multi-armed bandits on implicit metric spaces. Advances in Neural Information Processing Systems 24 J Shawe-Taylor, RS Zemel, PL Bartlett, F Pereira, KQ Weinberger1602–10 Red Hook, NY: Curran
[Google Scholar]
133. Kleinberg R, Slivkins A, Upfal E 2013. Bandits and experts in metric spaces. arXiv:1312.1277 [cs.DS]
134. Slivkins A, Radlinski F, Gollapudi S 2013. Ranked bandits in metric spaces: learning diverse rankings over large document collections. J. Mach. Learn. Res. 14:399–436
[Google Scholar]
135. Slivkins A 2014. Contextual bandits with similarity information. J. Mach. Learn. Res. 15:2533–68
[Google Scholar]
136. Chapelle O, Li L 2011. An empirical evaluation of Thompson sampling. Advances in Neural Information Processing Systems 24 J Shawe-Taylor, RS Zemel, PL Bartlett, F Pereira, KQ Weinberger2249–57 Red Hook, NY:Curran
[Google Scholar]
137. Agrawal S, Goyal N 2012. Analysis of Thompson sampling for the multi-armed bandit problem. Proceedings of the 25th Annual Conference on Learning Theory S Mannor, N Srebro, RC Williamson39.1–26 Proc. Mach. Learn. Res. 23. N.p.: PMLR
138. Srinivas N, Krause A, Kakade SM, Seeger M 2009. Gaussian process optimization in the bandit setting: no regret and experimental design. Proceedings of the 27th International Conference on International Conference on Machine Learning1015–22 Madison, WI: Omnipress
[Google Scholar]
139. Srinivas N, Krause A, Kakade SM, Seeger MW 2012. Information-theoretic regret bounds for Gaussian process optimization in the bandit setting. IEEE Trans. Inf. Theory 58:3250–65
[Google Scholar]
140. Langford J, Zhang T 2008. The epoch-greedy algorithm for multi-armed bandits with side information. Advances in Neural Information Processing Systems 20 JC Platt, D Koller, Y Singer, ST Roweis817–24 Red Hook, NY: Curran
[Google Scholar]
141. Chu W, Li L, Reyzin L, Schapire R 2011. Contextual bandits with linear payoff functions. Proceedings of the 14th International Conference on Artificial Intelligence and Statistics G Gordon, D Dunson, M Dudík208–14 Proc. Mach. Learn. Res. 15. N.p.: PMLR
142. Agrawal S, Goyal N 2013. Thompson sampling for contextual bandits with linear payoffs. Proceedings of the 30th International Conference on Machine Learning S Dasgupta, D McAllester127–35 Proc. Mach. Learn. Res. 28(3). N.p.: PMLR
143. Eghbali R, Fazel M 2016. Designing smoothing functions for improved worst-case competitive ratio in online optimization. Advances in Neural Information Processing Systems 29 DD Lee, M Sugiyama, UV Luxburg, I Guyon, R Garnett3287–95 Red Hook, NY: Curran
[Google Scholar]
144. Eghbali R, Fazel M, Mesbahi M 2016. Worst case competitive analysis for online conic optimization. 2016 IEEE 55th Conference on Decision and Control1945–50 New York: IEEE
[Google Scholar]
145. Eghbali R, Saunderson J, Fazel M 2018. Competitive online algorithms for resource allocation over the positive semidefinite cone. Math. Program. 170:267–92
[Google Scholar]
146. Mansour Y, Slivkins A, Syrgkanis V 2015. Bayesian incentive-compatible bandit exploration. Proceedings of the 16th ACM Conference on Economics and Computation565–82 New York: ACM
[Google Scholar]
147. Kannan S, Kearns M, Morgenstern J, Pai M, Roth A 2017. Fairness incentives for myopic agents. Proceedings of the 2017 ACM Conference on Economics and Computation369–86 New York: ACM
[Google Scholar]
148. Ghalme G, Jain S, Gujar S, Narahari Y 2017. Thompson sampling based mechanisms for stochastic multi-armed bandit problems. Proceedings of the 16th Conference on Autonomous Agents and Multiagent Systems87–95 Richland, SC: Int. Found. Auton. Agents Multiagent Syst.
[Google Scholar]
149. Kohavi R, Longbotham R, Sommerfield D, Henne RM 2009. Controlled experiments on the web: survey and practical guide. Data Min. Knowl. Discov. 18:140–81
[Google Scholar]
150. Swaminathan A, Krishnamurthy A, Agarwal A, Dudík M, Langford J 2017. Off-policy evaluation for slate recommendation. Advances in Neural Information Processing Systems 30 I Guyon, UV Luxburg, S Bengio, H Wallach, R Fergus3635–45 Red Hook, NY: Curran
[Google Scholar]
151. Strehl AL, Langford J, Li L, Kakade S 2010.Learning from logged implicit exploration data Advances in Neural Information Processing Systems 23 JD Lafferty, CKI Williams, J Shawe-Taylor, RS Zemel, A Culottapp2217–25 Red Hook, NY: Curran
152. Bottou L, Peters J, Candela JQ, Charles DX, Chickering M 2013. Counterfactual reasoning and learning systems: the example of computational advertising. J. Mach. Learn. Res. 14:3207–60
[Google Scholar]
153. Bareinboim E, Forney A, Pearl J 2015. Bandits with unobserved confounders: a causal approach. Advances in Neural Information Processing Systems 28 C Cortes, ND Lawrence, DD Lee, M Sugiyama, R Garnett1342–50 Red Hook, NY: Curran
[Google Scholar]
154. Alon N, Cesa-Bianchi N, Dekel O, Koren T 2015. Online learning with feedback graphs: beyond bandits. Proceedings of the 28th Conference on Learning Theory P Grünwald, E Hazan, S Kalepp23–35 Proc. Mach. Learn. Res. 40. N.p.: PMLR
155. Hu H, Li Z, Vetta AR 2014. Randomized experimental design for causal graph discovery. Advances in Neural Information Processing Systems 27 Z Ghahramani, M Welling, C Cortes, ND Lawrence, KQ Weinberger2339–47 Red Hook, NY: Curran
[Google Scholar]
156. Lattimore F, Lattimore T, Reid MD 2016. Causal bandits: learning good interventions via causal inference. Advances in Neural Information Processing Systems 29 DD Lee, M Sugiyama, UV Luxburg, I Guyon, R Garnett1181–89 Red Hook, NY: Curran
[Google Scholar]

/content/journals/10.1146/annurev-control-053018-023634

Article Type: Review Article

Most Cited Most Cited RSS feed

- Planning and Decision-Making for Autonomous Vehicles
  
  Wilko Schwarting, Javier Alonso-Mora, and Daniela Rus
  
  Vol. 1 (2018), pp. 187–210
- Learning-Based Model Predictive Control: Toward Safe Learning in Control
  
  Lukas Hewing, Kim P. Wabersich, Marcel Menner, and Melanie N. Zeilinger
  
  Vol. 3 (2020), pp. 269–296
- Recent Advances in Robot Learning from Demonstration
  
  Harish Ravichandar, Athanasios S. Polydoros, Sonia Chernova, and Aude Billard
  
  Vol. 3 (2020), pp. 297–330
- A Tour of Reinforcement Learning: The View from Continuous Control
  
  Benjamin Recht
  
  Vol. 2 (2019), pp. 253–279
- Haptics: The Present and Future of Artificial Touch Sensation
  
  Heather Culbertson, Samuel B. Schorr, and Allison M. Okamura
  
  Vol. 1 (2018), pp. 385–409
- Magnetic Methods in Robotics
  
  Jake J. Abbott, Eric Diller, and Andrew J. Petruska
  
  Vol. 3 (2020), pp. 57–90
- A Century of Robotic Hands
  
  C. Piazza, G. Grioli, M.G. Catalano, and A. Bicchi
  
  Vol. 2 (2019), pp. 1–32
- Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning
  
  Lukas Brunke, Melissa Greeff, Adam W. Hall, Zhaocong Yuan, Siqi Zhou, Jacopo Panerati, and Angela P. Schoellig
  
  Vol. 5 (2022), pp. 411–444
- Distributed Optimization for Control
  
  Angelia Nedić, and Ji Liu
  
  Vol. 1 (2018), pp. 77–103
- Soft Micro- and Nanorobotics
  
  Chengzhi Hu, Salvador Pané, and Bradley J. Nelson
  
  Vol. 1 (2018), pp. 53–75
More Less

Annual Review of Control, Robotics, and Autonomous Systems

Volume 2, 2019

Review Article

Free

A Perspective on Incentive Design: Challenges and Opportunities

Abstract

Most Read This Month

Most Cited Most Cited RSS feed