
Abstract

In the context of robotics and automation, learning from demonstration (LfD) is the paradigm in which robots acquire new skills by learning to imitate an expert. The choice of LfD over other robot learning methods is compelling when ideal behavior can be neither easily scripted (as is done in traditional robot programming) nor easily defined as an optimization problem, but can be demonstrated. While there have been multiple surveys of this field in the past, there is a need for a new one given the considerable growth in the number of publications in recent years. This review aims to provide an overview of the collection of machine-learning methods used to enable a robot to learn from and imitate a teacher. We focus on recent advancements in the field and present an updated taxonomy and characterization of existing methods. We also discuss mature and emerging application areas for LfD and highlight the significant challenges that remain to be overcome both in theory and in practice.
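To make the paradigm concrete, a minimal sketch of the simplest form of learning from demonstration — fitting a policy directly to an expert's state-action pairs (often called behavioral cloning). This is an illustrative toy example, not a method from the review: the linear policy, the `fit_policy` helper, and the synthetic expert are all assumptions made for the sketch.

```python
# Illustrative behavioral-cloning sketch (hypothetical, not from the review):
# imitate an expert by fitting a 1-D linear policy a = w * s to demonstrated
# state-action pairs via ordinary least squares.

def fit_policy(demos):
    """demos: list of (state, expert_action) pairs; returns the weight w
    minimizing sum((a - w * s)**2) over the demonstrations."""
    num = sum(s * a for s, a in demos)
    den = sum(s * s for s, _ in demos)
    return num / den

def policy(w, state):
    """Apply the learned linear policy to a new state."""
    return w * state

# Synthetic expert demonstrations: the expert's action is 1.5 * state.
demos = [(s, 1.5 * s) for s in (-2.0, -0.5, 1.0, 3.0)]
w = fit_policy(demos)
print(round(w, 6))     # 1.5  (recovers the expert's behavior)
print(policy(w, 2.0))  # 3.0  (generalizes to an unseen state)
```

Real LfD systems replace the linear map with richer policy classes (dynamical systems, movement primitives, neural networks) and must also handle noisy, suboptimal, or scarce demonstrations, which is where most of the methods surveyed here come in.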

doi: 10.1146/annurev-control-100819-063206 · published 2020-05-03

    [Google Scholar]
  144.
    Takano W, Nakamura Y 2015. Statistical mutual conversion between whole body motion primitives and linguistic sentences for human motions. Int. J. Robot. Res. 34:1314–28
  145.
    Nicolescu MN, Mataric MJ 2003. Natural methods for robot task learning: instructive demonstrations, generalization and practice. Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, 241–48 New York: ACM
  146.
    Pardowitz M, Knoop S, Dillmann R, Zollner RD 2007. Incremental learning of tasks from user demonstrations, past experiences, and vocal comments. IEEE Trans. Syst. Man Cybernet. B 37:322–32
  147.
    Jäkel R, Schmidt-Rohr SR, Rühl SW, Kasper A, Xue Z, Dillmann R 2012. Learning of planning models for dexterous manipulation based on human demonstrations. Int. J. Soc. Robot. 4:437–48
  148.
    Kroemer O, Daniel C, Neumann G, Van Hoof H, Peters J 2015. Towards learning hierarchical skills for multi-phase manipulation tasks. 2015 IEEE International Conference on Robotics and Automation, 1503–10 Piscataway, NJ: IEEE
  149.
    Gombolay M, Jensen R, Stigile J, Golen T, Shah N 2018. Human-machine collaborative optimization via apprenticeship scheduling. J. Artif. Intell. Res. 63:1–49
  150.
    Schmidt-Rohr SR, Lösch M, Jäkel R, Dillmann R 2010. Programming by demonstration of probabilistic decision making on a multi-modal service robot. 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 784–89 Piscataway, NJ: IEEE
  151.
    Butterfield J, Osentoski S, Jay G, Jenkins OC 2010. Learning from demonstration using a multi-valued function regressor for time-series data. 2010 10th IEEE-RAS International Conference on Humanoid Robots, 328–33 Piscataway, NJ: IEEE
  152.
    Niekum S, Osentoski S, Konidaris G, Chitta S, Marthi B, Barto AG 2015. Learning grounded finite-state representations from unstructured demonstrations. Int. J. Robot. Res. 34:131–57
  153.
    Hausman K, Chebotar Y, Schaal S, Sukhatme G, Lim JJ 2017. Multi-modal imitation learning from unstructured demonstrations using generative adversarial nets. Advances in Neural Information Processing Systems 30 I Guyon, UV Luxburg, S Bengio, H Wallach, R Fergus, 1235–45 Red Hook, NY: Curran
  154.
    Krishnan S, Garg A, Liaw R, Thananjeyan B, Miller L 2019. SWIRL: a sequential windowed inverse reinforcement learning algorithm for robot tasks with delayed rewards. Int. J. Robot. Res. 38:126–45
  155.
    Krishnan S, Garg A, Liaw R, Miller L, Pokorny FT, Goldberg K 2016. HIRL: hierarchical inverse reinforcement learning for long-horizon tasks with delayed rewards. arXiv:1604.06508 [cs.RO]
  156.
    Hirzinger G, Heindl J 1983. Sensor programming, a new way for teaching a robot paths and forces/torques simultaneously. Proceedings of the 3rd International Conference on Robot Vision and Sensory Control B Rooks, 549–58 Oxford, UK: Cotswold
  157.
    Asada H, Izumi H 1987. Direct teaching and automatic program generation for the hybrid control of robot manipulators. 1987 IEEE International Conference on Robotics and Automation, Vol. 4, 1401–6 Piscataway, NJ: IEEE
  158.
    Kronander K, Khansari M, Billard A 2015. Incremental motion learning with locally modulated dynamical systems. Robot. Auton. Syst. 70:52–62
  159.
    Kent D, Chernova S 2014. Construction of an object manipulation database from grasp demonstrations. 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 3347–52 Piscataway, NJ: IEEE
  160.
    Bhattacharjee T, Lee G, Song H, Srinivasa SS 2019. Towards robotic feeding: role of haptics in fork-based food manipulation. IEEE Robot. Autom. Lett. 4:1485–92
  161.
    Fong J, Tavakoli M 2018. Kinesthetic teaching of a therapist's behavior to a rehabilitation robot. 2018 International Symposium on Medical Robotics Piscataway, NJ: IEEE. https://doi.org/10.1109/ISMR.2018.8333285
  162.
    Wang H, Chen J, Lau HYK, Ren H 2016. Motion planning based on learning from demonstration for multiple-segment flexible soft robots actuated by electroactive polymers. IEEE Robot. Autom. Lett. 1:391–98
  163.
    Najafi M, Sharifi M, Adams K, Tavakoli M 2017. Robotic assistance for children with cerebral palsy based on learning from tele-cooperative demonstration. Int. J. Intell. Robot. Appl. 1:43–54
  164.
    Moro C, Nejat G, Mihailidis A 2018. Learning and personalizing socially assistive robot behaviors to aid with activities of daily living. ACM Trans. Human-Robot Interact. 7:15
  165.
    Ma Z, Ben-Tzvi P, Danoff J 2015. Hand rehabilitation learning system with an exoskeleton robotic glove. IEEE Trans. Neural Syst. Rehabil. Eng. 24:1323–32
  166.
    Strabala K, Lee MK, Dragan A, Forlizzi J, Srinivasa SS 2013. Toward seamless human-robot handovers. J. Hum.-Robot Interact. 2:112–32
  167.
    Pomerleau DA 1991. Efficient training of artificial neural networks for autonomous navigation. Neural Comput. 3:88–97
  168.
    Boularias A, Krömer O, Peters J 2012. Structured apprenticeship learning. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 227–42 Berlin: Springer
  169.
    Ross S, Gordon G, Bagnell D 2011. A reduction of imitation learning and structured prediction to no-regret online learning. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics G Gordon, D Dunson, M Dudik, 627–35 Proc. Mach. Learn. Res. Vol. 15. N.p.: PMLR
  170.
    Silver D, Bagnell JA, Stentz A 2012. Active learning from demonstration for robust autonomous navigation. 2012 IEEE International Conference on Robotics and Automation, 200–7 Piscataway, NJ: IEEE
  171.
    Li Y, Song J, Ermon S 2017. InfoGAIL: interpretable imitation learning from visual demonstrations. Advances in Neural Information Processing Systems 30 I Guyon, UV Luxburg, S Bengio, H Wallach, R Fergus, 3812–22 Red Hook, NY: Curran
  172.
    Pan Y, Cheng CA, Saigol K, Lee K, Yan X 2018. Agile autonomous driving using end-to-end deep imitation learning. Robotics: Science and Systems XIV H Kress-Gazit, S Srinivasa, T Howard, N Atanasov pap. 56. N.p.: Robot. Sci. Syst. Found.
  173.
    Kuderer M, Gulati S, Burgard W 2015. Learning driving styles for autonomous vehicles from demonstration. 2015 IEEE International Conference on Robotics and Automation, 2641–46 Piscataway, NJ: IEEE
  174.
    Ross S, Melik-Barkhudarov N, Shankar KS, Wendel A, Dey D 2013. Learning monocular reactive UAV control in cluttered natural environments. 2013 IEEE International Conference on Robotics and Automation, 1765–72 Piscataway, NJ: IEEE
  175.
    Kaufmann E, Loquercio A, Ranftl R, Dosovitskiy A, Koltun V, Scaramuzza D 2018. Deep drone racing: learning agile flight in dynamic environments. Proceedings of the 2nd Conference on Robot Learning A Billard, A Dragan, J Peters, J Morimoto, 133–45 Proc. Mach. Learn. Res. Vol. 87. N.p.: PMLR
  176.
    Loquercio A, Maqueda AI, Del-Blanco CR, Scaramuzza D 2018. DroNet: learning to fly by driving. IEEE Robot. Autom. Lett. 3:1088–95
  177.
    Farchy A, Barrett S, MacAlpine P, Stone P 2013. Humanoid robots learning to walk faster: from the real world to simulation and back. Proceedings of the 2013 International Conference on Autonomous Agents and Multiagent Systems, 39–46 Richland, SC: Int. Found. Auton. Agents Multiagent Syst.
  178.
    Meriçli Ç, Veloso M 2010. Biped walk learning through playback and corrective demonstration. Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 1594–99 Palo Alto, CA: AAAI Press
  179.
    Calandra R, Gopalan N, Seyfarth A, Peters J, Deisenroth MP 2014. Bayesian gait optimization for bipedal locomotion. International Conference on Learning and Intelligent Optimization, 274–90 Cham, Switz.: Springer
  180.
    Kolter JZ, Abbeel P, Ng AY 2008. Hierarchical apprenticeship learning with application to quadruped locomotion. Advances in Neural Information Processing Systems 20 JC Platt, D Koller, Y Singer, ST Roweis, 769–76 Red Hook, NY: Curran
  181.
    Kalakrishnan M, Buchli J, Pastor P, Mistry M, Schaal S 2011. Learning, planning, and control for quadruped locomotion over challenging terrain. Int. J. Robot. Res. 30:236–58
  182.
    Nakanishi J, Morimoto J, Endo G, Cheng G, Schaal S, Kawato M 2004. Learning from demonstration and adaptation of biped locomotion. Robot. Auton. Syst. 47:79–91
  183.
    Carrera A, Palomeras N, Ribas D, Kormushev P, Carreras M 2014. An intervention-AUV learns how to perform an underwater valve turning. OCEANS 2014 – TAIPEI Piscataway, NJ: IEEE. https://doi.org/10.1109/OCEANS-TAIPEI.2014.6964483
  184.
    Havoutis I, Calinon S 2017. Supervisory teleoperation with online learning and optimal control. 2017 IEEE International Conference on Robotics and Automation, 1534–40 Piscataway, NJ: IEEE
  185.
    Birk A, Doernbach T, Mueller C, Łuczynski T, Chavez AG 2018. Dexterous underwater manipulation from onshore locations: streamlining efficiencies for remotely operated underwater vehicles. IEEE Robot. Autom. Mag. 25:424–33
  186.
    Somers T, Hollinger GA 2016. Human–robot planning and learning for marine data collection. Auton. Robots 40:1123–37
  187.
    Sun W, Venkatraman A, Gordon GJ, Boots B, Bagnell JA 2017. Deeply AggreVaTeD: differentiable imitation learning for sequential prediction. Proceedings of the 34th International Conference on Machine Learning D Precup, YW Teh, 3309–18 Proc. Mach. Learn. Res. Vol. 70. N.p.: PMLR
  188.
    Kober J, Peters JR 2009. Policy search for motor primitives in robotics. Advances in Neural Information Processing Systems 21 D Koller, D Schuurmans, Y Bengio, L Bottou, 849–56 Red Hook, NY: Curran
  189.
    Taylor ME, Suay HB, Chernova S 2011. Integrating reinforcement learning with human demonstrations of varying ability. The 10th International Conference on Autonomous Agents and Multiagent Systems, Vol. 2, 617–24 Richland, SC: Int. Found. Auton. Agents Multiagent Syst.
  190.
    Pastor P, Kalakrishnan M, Chitta S, Theodorou E, Schaal S 2011. Skill learning and task outcome prediction for manipulation. 2011 IEEE International Conference on Robotics and Automation, 3828–34 Piscataway, NJ: IEEE
  191.
    Kim B, Farahmand A, Pineau J, Precup D 2013. Learning from limited demonstrations. Advances in Neural Information Processing Systems 26 CJC Burges, L Bottou, M Welling, Z Ghahramani, KQ Weinberger, 2859–67 Red Hook, NY: Curran
  192.
    Vecerik M, Hester T, Scholz J, Wang F, Pietquin O 2017. Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards. arXiv:1707.08817 [cs.AI]
  193.
    Vecerik M, Sushkov O, Barker D, Rothörl T, Hester T, Scholz J 2019. A practical approach to insertion with variable socket position using deep reinforcement learning. 2019 International Conference on Robotics and Automation, 754–60 Piscataway, NJ: IEEE
  194.
    Nair A, McGrew B, Andrychowicz M, Zaremba W, Abbeel P 2018. Overcoming exploration in reinforcement learning with demonstrations. 2018 IEEE International Conference on Robotics and Automation, 6292–99 Piscataway, NJ: IEEE
  195.
    Sermanet P, Lynch C, Chebotar Y, Hsu J, Jang E 2018. Time-contrastive networks: self-supervised learning from video. 2018 IEEE International Conference on Robotics and Automation, 1134–41 Piscataway, NJ: IEEE
  196.
    Brown DS, Niekum S 2017. Toward probabilistic safety bounds for robot learning from demonstration. Tech. Rep. FS-17-01, Assoc. Adv. Artif. Intell., Palo Alto, CA
  197.
    Laskey M, Staszak S, Hsieh WYS, Mahler J, Pokorny FT 2016. SHIV: reducing supervisor burden in DAgger using support vectors for efficient learning from demonstrations in high dimensional state spaces. 2016 IEEE International Conference on Robotics and Automation, 462–69 Piscataway, NJ: IEEE
  198.
    Zhou W, Li W 2018. Safety-aware apprenticeship learning. Computer Aided Verification: 30th International Conference, CAV 2018 H Chockler, G Weissenbacher, 662–80 Cham, Switz.: Springer
  199.
    Gupta A, Eppner C, Levine S, Abbeel P 2016. Learning dexterous manipulation for a soft robotic hand from human demonstrations. 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 3786–93 Piscataway, NJ: IEEE
  200.
    Ogrinc M, Gams A, Petrič T, Sugimoto N, Ude A 2013. Motion capture and reinforcement learning of dynamically stable humanoid movement primitives. 2013 IEEE International Conference on Robotics and Automation, 5284–90 Piscataway, NJ: IEEE
  201.
    Lee H, Kim H, Kim HJ 2016. Planning and control for collision-free cooperative aerial transportation. IEEE Trans. Autom. Sci. Eng. 15:189–201
  202.
    Coates A, Abbeel P, Ng AY 2008. Learning for control from multiple demonstrations. Proceedings of the 25th International Conference on Machine Learning, 144–51 New York: ACM
  203.
    Choi S, Lee K, Oh S 2016. Robust learning from demonstration using leveraged Gaussian processes and sparse-constrained optimization. 2016 IEEE International Conference on Robotics and Automation, 470–75 Piscataway, NJ: IEEE
  204.
    Shepard RN 1987. Toward a universal law of generalization for psychological science. Science 237:1317–23
  205.
    Ghirlanda S, Enquist M 2003. A century of generalization. Anim. Behav. 66:15–36
  206.
    Bagnell JA 2015. An invitation to imitation. Tech. Rep. CMU-RI-TR-15-08, Robot. Inst., Carnegie Mellon Univ., Pittsburgh, PA
  207.
    Calinon S, Bruno D, Caldwell DG 2014. A task-parameterized probabilistic model with minimal intervention control. 2014 IEEE International Conference on Robotics and Automation, 3339–44 Piscataway, NJ: IEEE
  208.
    Finn C, Yu T, Zhang T, Abbeel P, Levine S 2017. One-shot visual imitation learning via meta-learning. arXiv:1709.04905 [cs.LG]
  209.
    Corduneanu A, Bishop CM 2001. Variational Bayesian model selection for mixture distributions. Artificial Intelligence and Statistics 2001 T Jaakkola, T Richardson, 27–34 San Francisco: Morgan Kaufmann
  210.
    Ketchen DJ, Shook CL 1996. The application of cluster analysis in strategic management research: an analysis and critique. Strateg. Manag. J. 17:441–58
  211.
    Shams L, Seitz AR 2008. Benefits of multisensory learning. Trends Cogn. Sci. 12:411–17
  212.
    Sung J, Jin SH, Saxena A 2018. Robobarista: object part based transfer of manipulation trajectories from crowd-sourcing in 3D pointclouds. Robotics Research A Bicchi, W Burgard, 701–20 Cham, Switz.: Springer
  213.
    Castro PS, Li S, Zhang D 2019. Inverse reinforcement learning with multiple ranked experts. arXiv:1907.13411 [cs.LG]