Off-Policy Reinforcement Learning for Bipedal Robot Locomotion
Publisher: The Ohio State University
Series/Report no.: The Ohio State University. Department of Mechanical and Aerospace Engineering Honors Theses; 2021
Reinforcement Learning (RL) is a developing learning-based approach that has shown potential to be instrumental in the programming of bipedal robots. RL allows robots to learn ideal behaviors through many iterations of trial and error. The objective of RL is to learn an optimal policy, which is a mathematical function that takes an agent's state as input and outputs an optimal action. Traditional RL approaches have been on-policy in nature, meaning they attempt to improve the same policy that is used to make decisions. However, the inability of these methods to use training data not generated by that policy often leads to a data-inefficient training process and results in policies that generalize poorly to unseen data. These limitations have led researchers to pursue off-policy algorithms. One such method, Deep Deterministic Policy Gradients (DDPG), uses an experience buffer and multiple neural networks to learn ideal actions in environments with continuous action spaces. While DDPG methods have been successful in some robotic control problems, they are prone to converging to suboptimal solutions when high-reward actions are not discovered early in training. In particular, when used for bipedal walking, they have been unable to learn policies that produce stable walking movements. In this work, a revised DDPG RL approach is proposed that incorporates physical insights of robot walking and utilizes previously collected actions. The proposed framework is tested on the RABBIT robot model in OpenAI Gym with the MuJoCo physics engine. This approach is successful in training RABBIT to complete tasks such as walking at desired velocities and walking up hills of varying grade. This work exhibits the value of utilizing external data in developing data-efficient and generalizable RL approaches.
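As a rough illustration of the experience buffer mentioned in the abstract, the following Python sketch shows one common way such a buffer can be written, together with the slow target-network update typically paired with it in DDPG-style methods. It is not taken from the thesis: the names `ReplayBuffer` and `soft_update`, the tuple layout, and the parameter `tau` are assumptions made here for illustration. Because stored transitions can come from any behavior policy, including previously collected data, a buffer like this is what makes the training off-policy.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples.

    Hypothetical helper for illustration; not the thesis implementation.
    """

    def __init__(self, capacity):
        # A bounded deque silently discards the oldest transition when full.
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutive transitions.
        batch = random.sample(list(self.storage), batch_size)
        # Transpose rows of transitions into five columns.
        return tuple(zip(*batch))

    def __len__(self):
        return len(self.storage)


def soft_update(target_params, online_params, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_params, online_params)]
```

In a sketch like this, externally collected transitions would simply be passed to `add` before training begins, so that early updates are not limited to what the untrained policy happens to discover.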
Received award in Engineering: Physical Sciences category at Denman Undergraduate Research Forum
Academic Major: Computer Science and Engineering
Items in Knowledge Bank are protected by copyright, with all rights reserved, unless otherwise indicated.