Seminar in Computer Engineering

Bilkent University
Department of Computer Engineering
CS 590/690 SEMINAR

Sample-Efficient Safe Policy Adaptation for Robot Joint Failures

Kutay Demiray
Master's Student
(Supervisor: Asst. Prof. Özgür S. Öğüz)

Computer Engineering Department
Bilkent University

Abstract: In many real-life robotics applications, we want robots to be robust to changes in dynamics such as malfunctions. For example, a space rover must be able to continue performing its tasks to some extent even after one or more of its joints break, since repair is very costly or even impossible. Safe reinforcement learning algorithms can be used to learn a new policy for the malfunctioning robot, but existing methods often require discarding old experiences. Model-based control and reinforcement learning methods can reuse experience to some extent, but modeling the dynamics accurately and efficiently may be difficult and computationally costly, which makes model-free reinforcement learning an appealing alternative. While a variety of off-policy model-free reinforcement learning methods can learn efficiently from samples obtained under a different policy, the analogous approach of reusing samples obtained under different environment dynamics remains underexplored. In this work, we propose a novel algorithm to efficiently adapt a policy to joint failures. Our method reuses old samples alongside new ones with an importance sampling–based approach that accounts for failure probabilities, enabling efficient policy tuning via interior point policy optimization with failure-adaptive constraints.

DATE: March 17, Monday @ 14:50
PLACE: EA 502