Occam’s Razor Is Insufficient to Infer the Preferences of Irrational Agents - 2019

Details

Title: Occam’s Razor Is Insufficient to Infer the Preferences of Irrational Agents

Author(s): Armstrong, Stuart and Mindermann, Sören

Link(s): http://arxiv.org/abs/1712.05812

Rough Notes

Suppose we observe human behaviour generated by some policy. The paper argues that inferring the reward function from sub-optimal behaviour requires inferring the human's reward function and their planning algorithm (called a planner) simultaneously; the authors call this the problem of decomposing the human policy. The main result is that there is no free lunch for this decomposition problem: the observed policy is compatible with many different planner-reward pairs, so we cannot recover a unique decomposition and hence cannot recover a unique human reward function. This implies that if IRL is applied to a human policy, the regret could be near-maximal. The additional assumptions we need to bake in go beyond regularization-based approaches. Specifically, we would need "normative assumptions": key assumptions about the reward function and/or planner that cannot be deduced from observations.
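To make the non-uniqueness concrete, here is a minimal sketch in Python. It is a toy construction for these notes, not code from the paper: the two-state environment, the reward values, and all function names are assumptions chosen for illustration. It shows three planner-reward pairs (a rational planner with reward R, an anti-rational planner with reward -R, and a planner that ignores a zero reward) that all produce the same observed policy.

```python
from typing import Callable, Dict, Tuple

State = str
Action = str
Reward = Dict[Tuple[State, Action], float]
Policy = Dict[State, Action]
Planner = Callable[[Reward], Policy]

# Hypothetical toy environment: two states, two actions, one-step rewards.
STATES = ["s0", "s1"]
ACTIONS = ["left", "right"]


def rational_planner(reward: Reward) -> Policy:
    """Pick the reward-maximising action in every state."""
    return {s: max(ACTIONS, key=lambda a: reward[(s, a)]) for s in STATES}


def anti_rational_planner(reward: Reward) -> Policy:
    """Pick the reward-minimising action in every state."""
    return {s: min(ACTIONS, key=lambda a: reward[(s, a)]) for s in STATES}


def make_indifferent_planner(fixed_policy: Policy) -> Planner:
    """Build a planner that ignores its reward and always returns fixed_policy."""
    def planner(reward: Reward) -> Policy:
        return dict(fixed_policy)
    return planner


# Illustrative reward values (assumed, not taken from the paper).
reward: Reward = {
    ("s0", "left"): 1.0, ("s0", "right"): 0.0,
    ("s1", "left"): 0.0, ("s1", "right"): 1.0,
}
negated_reward: Reward = {k: -v for k, v in reward.items()}
zero_reward: Reward = {k: 0.0 for k in reward}

# The behaviour we actually get to observe.
observed: Policy = rational_planner(reward)

# Three very different explanations of the same behaviour.
candidates = [
    ("rational, R", rational_planner, reward),
    ("anti-rational, -R", anti_rational_planner, negated_reward),
    ("indifferent, 0", make_indifferent_planner(observed), zero_reward),
]

for name, planner, r in candidates:
    print(f"{name}: matches observed policy = {planner(r) == observed}")
```

All three pairs fit the observations equally well, yet they disagree completely about what the agent wants. Acting on the second pair would mean optimising -R, which is the near-maximal regret case mentioned above. Nothing in the behaviour itself separates them, which is why an extra normative assumption is needed.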