Adam Gleave
Adversarial Policies Beat Superhuman Go AIs
We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies that play against frozen KataGo victims. …
Tony Wang, Adam Gleave, Nora Belrose, Tom Tseng, Joseph Miller, Michael Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell
Invariance in Policy Optimisation and Partial Identifiability in Reward Learning
It’s challenging to design reward functions for complex, real-world tasks. Reward learning lets one instead infer reward …
Joar Skalse, Matthew Farrugia-Roberts, Stuart Russell, Alessandro Abate, Adam Gleave
Preprocessing Reward Functions for Interpretability
In many real-world applications, the reward function is too complex to be manually specified. In such cases, reward functions must …
Erik Jenner, Adam Gleave
Quantifying Differences in Reward Functions
For many tasks, the reward function is inaccessible to introspection or too complex to be specified procedurally, and must instead be …
Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike
DERAIL: Diagnostic Environments for Reward And Imitation Learning
The objective of many real-world tasks is complex and difficult to procedurally specify. This makes it necessary to use reward or …
Pedro Freire, Adam Gleave, Sam Toyer, Stuart Russell
Understanding Learned Reward Functions
In many real-world tasks, it is not possible to procedurally specify an RL agent’s reward function. In such cases, a reward …
Eric J. Michaud, Adam Gleave, Stuart Russell
Adversarial Policies: Attacking Deep Reinforcement Learning
Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to …
Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell
Inverse Reinforcement Learning for Video Games
Deep reinforcement learning achieves superhuman performance in a range of video game environments, but requires that a designer …
Aaron Tucker, Adam Gleave, Stuart Russell
Multi-task Maximum Causal Entropy Inverse Reinforcement Learning
Multi-task Inverse Reinforcement Learning (IRL) is the problem of inferring multiple reward functions from expert demonstrations. Prior …
Adam Gleave, Oliver Habryka
Active Inverse Reward Design
Reward design, the problem of selecting an appropriate reward function for an AI system, is both critically important, as it encodes …
Sören Mindermann, Rohin Shah, Adam Gleave, Dylan Hadfield-Menell