Collision-Free Deep Reinforcement Learning for Mobile Robots using Crash-Prevention Policy
2021 (English) In: 2021 7th International Conference on Control, Automation and Robotics, ICCAR 2021, Institute of Electrical and Electronics Engineers Inc., 2021, p. 52-59. Conference paper, Published paper (Refereed)
Abstract [en]
In this paper, we propose a crash-prevention policy for autonomous, collision-free mobile robot navigation based on deep reinforcement learning. The objective is to safely reach a random location in a limited workspace. We go beyond the well-treated navigation paradigm by introducing a crash-prevention policy, derived from action-sensor-space characteristics, to achieve collision-free learning. This approach enables efficient and safe exploration by guaranteeing continuously collision-free actions, which is especially important for agents learning on physical systems. We use Deep Deterministic Policy Gradient as the base method to evaluate the proposed crash-prevention policy in a mobile robot environment. Experiments show that our approach maintains or even slightly improves training results while collisions are avoided entirely. © 2021 IEEE.
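The record contains no code; as a rough illustration of the general idea the abstract describes, the Python sketch below filters a learned actor's proposed action through a safety check before execution and substitutes a conservative fallback when a collision is predicted. All names, the one-step clearance heuristic, the safety margin, and the fallback action are hypothetical assumptions for illustration, not the authors' method.

```python
import numpy as np

# Hypothetical sketch of a crash-prevention filter: before executing the
# action proposed by the learned (e.g., DDPG) actor, check whether it is
# predicted to keep the robot a safe distance from obstacles given the
# current range-sensor readings, and fall back to a conservative action
# (stop and rotate in place) otherwise. All constants are illustrative.

SAFE_DISTANCE = 0.3  # metres; assumed safety margin


def predicted_min_clearance(action, ranges):
    """Crude one-step prediction: moving forward by action[0] reduces
    the smallest measured clearance by roughly that amount."""
    forward_speed = max(float(action[0]), 0.0)
    return float(np.min(ranges)) - forward_speed


def crash_prevention_filter(actor_action, ranges):
    """Return the actor's action if it is predicted collision-free,
    otherwise a conservative fallback action [v, omega]."""
    if predicted_min_clearance(actor_action, ranges) > SAFE_DISTANCE:
        return actor_action
    return np.array([0.0, 0.5])  # stop and rotate in place


# Usage: filter the actor's proposal before stepping the environment.
ranges = np.array([0.8, 0.45, 1.2])   # simulated range-sensor readings
proposal = np.array([0.6, 0.0])       # actor's proposed [v, omega]
safe_action = crash_prevention_filter(proposal, ranges)
```

Because the filter only ever replaces unsafe proposals, exploration proceeds with the actor's own actions whenever they are predicted safe, which is consistent with the abstract's claim that training results are maintained while collisions are avoided.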
Place, publisher, year, edition, pages: Institute of Electrical and Electronics Engineers Inc., 2021, p. 52-59
Keywords [en]
Collision-Free Learning, Continuous Control, Deep Reinforcement Learning, Mobile Robots, Safe Reinforcement Learning, Agricultural robots, Reinforcement learning, Robotics, Collision-free, Limited workspaces, Physical systems, Policy gradient, Prevention policy, Random location, Robot environment, Deep learning
Identifiers
URN: urn:nbn:se:miun:diva-43419
DOI: 10.1109/ICCAR52225.2021.9463474
Scopus ID: 2-s2.0-85114471635
ISBN: 9781665449861 (print)
OAI: oai:DiVA.org:miun-43419
DiVA, id: diva2:1604117
Conference 7th International Conference on Control, Automation and Robotics, ICCAR 2021, 23 April 2021 through 26 April 2021
Note: Conference code: 171166
Available from: 2021-10-18 Created: 2021-10-18 Last updated: 2021-10-18 Bibliographically approved