Invariance in Policy Optimisation and Partial Identifiability in Reward Learning

Abstract

It is challenging to design reward functions for complex, real-world tasks. Reward learning lets one instead infer reward functions from data. However, multiple reward functions often fit the data equally well, even in the infinite-data limit. Prior work often treats the reward function as uniquely recoverable by imposing additional assumptions on the data source. By contrast, we formally characterise the partial identifiability of the reward function given popular data sources, including demonstrations and trajectory preferences, under multiple common sets of assumptions. We analyse the impact of this partial identifiability on downstream tasks such as policy optimisation, including under changes in environment dynamics. We unify our results in a framework for comparing data sources and downstream tasks by their invariances, with implications for the design and selection of data sources for reward learning.
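To make the phenomenon concrete, here is a minimal illustrative sketch (not taken from the paper, and simpler than its formal results): potential-based reward shaping changes the reward function yet leaves the optimal policy unchanged, so demonstrations from an optimal agent cannot distinguish the original reward from its shaped counterpart. The random tabular MDP, the potential `Phi`, and the helper `optimal_policy` below are all hypothetical choices for illustration.

```python
# Sketch of partial identifiability from demonstrations: a reward R and its
# potential-shaped version R'(s,a,s') = R(s,a,s') + gamma*Phi(s') - Phi(s)
# induce the same optimal policy, so optimal demonstrations cannot tell them apart.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 3, 0.9

# Random tabular MDP: transitions P[s, a, s'] and reward R[s, a, s'].
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)
R = rng.normal(size=(n_states, n_actions, n_states))

def optimal_policy(reward):
    """Greedy policy obtained by value iteration on the given reward."""
    V = np.zeros(n_states)
    for _ in range(1000):
        Q = (P * (reward + gamma * V)).sum(axis=2)  # Q[s, a]
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < 1e-10:
            break
        V = V_new
    return Q.argmax(axis=1)

# Shape the reward with an arbitrary state potential Phi.
Phi = rng.normal(size=n_states)
R_shaped = R + gamma * Phi[None, None, :] - Phi[:, None, None]

# The two distinct reward functions yield identical optimal behaviour.
pi, pi_shaped = optimal_policy(R), optimal_policy(R_shaped)
print("optimal policy under R:       ", pi)
print("optimal policy under shaped R:", pi_shaped)
print("identical:", np.array_equal(pi, pi_shaped))
```

The paper goes further, characterising exactly which transformations of the reward each data source (e.g. demonstrations vs. trajectory preferences) is invariant to, and which of those invariances matter for downstream tasks like policy optimisation under changed dynamics.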

Publication
International Conference on Machine Learning
Adam Gleave
Founder & CEO at FAR AI

Founder of FAR AI, an alignment research non-profit working to incubate and accelerate new alignment research agendas. Previously: PhD @ UC Berkeley; Google DeepMind. Research interests include adversarial robustness and interpretability.