Occam’s Razor Is Insufficient to Infer the Preferences of Irrational Agents - 2019
Details
Title: Occam’s Razor Is Insufficient to Infer the Preferences of Irrational Agents
Author(s): Armstrong, Stuart and Mindermann, Sören
Link(s): http://arxiv.org/abs/1712.05812
Rough Notes
Suppose we observe human behaviour generated by some policy. This paper argues that inferring the reward function from sub-optimal behaviour requires inferring the human's reward function and their planning algorithm (called a planner) simultaneously; the authors call this the problem of decomposing the human policy. The main result is that there is no free lunch for this decomposition problem: the observed policy does not determine a unique (planner, reward) pair, so we cannot recover a unique human reward function from behaviour alone. This implies that if IRL is applied to a human policy without further assumptions, the regret could be near-maximal. The additional assumptions we need to bake in go beyond regularization-based approaches such as a simplicity prior (the Occam's razor of the title). Specifically, we would need to add "normative assumptions": key assumptions about the reward function and/or planner that cannot be deduced from observations.
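To make the non-uniqueness concrete, here is a minimal sketch in a one-step toy setting (not the paper's formal setup). The states, actions, observed policy, and all function names are hypothetical and chosen only for illustration. It shows three different (planner, reward) pairs, a rational planner with reward R, an anti-rational planner with reward -R, and an indifferent planner with zero reward, that all reproduce the same observed policy.

```python
from typing import Dict, Tuple

State = str
Action = str
Reward = Dict[Tuple[State, Action], float]
Policy = Dict[State, Action]

# Hypothetical toy problem: two states, two actions, one-step decisions.
STATES = ["s0", "s1"]
ACTIONS = ["left", "right"]

# The behaviour we observe (an assumption made up for this example).
observed_policy: Policy = {"s0": "left", "s1": "right"}


def rational_planner(reward: Reward) -> Policy:
    """Pick the reward-maximising action in every state."""
    return {s: max(ACTIONS, key=lambda a: reward[(s, a)]) for s in STATES}


def anti_rational_planner(reward: Reward) -> Policy:
    """Pick the reward-minimising action in every state."""
    return {s: min(ACTIONS, key=lambda a: reward[(s, a)]) for s in STATES}


def indifferent_planner(reward: Reward) -> Policy:
    """Ignore the reward entirely and always output the observed policy."""
    return dict(observed_policy)


# Reward R: 1 for the observed action, 0 otherwise.
reward_r: Reward = {
    (s, a): 1.0 if observed_policy[s] == a else 0.0
    for s in STATES
    for a in ACTIONS
}
reward_neg: Reward = {k: -v for k, v in reward_r.items()}  # -R
reward_zero: Reward = {k: 0.0 for k in reward_r}           # constant 0

decompositions = [
    ("rational, R", rational_planner, reward_r),
    ("anti-rational, -R", anti_rational_planner, reward_neg),
    ("indifferent, 0", indifferent_planner, reward_zero),
]

# Every pair that reproduces the observed policy is "compatible" with the data.
compatible = [
    name for name, planner, reward in decompositions
    if planner(reward) == observed_policy
]


def value(policy: Policy, reward: Reward) -> float:
    """Total reward collected by following `policy` once in each state."""
    return sum(reward[(s, policy[s])] for s in STATES)


# If the true reward is R but we infer -R and then optimise it,
# we choose the worst action in every state.
best_value = value(rational_planner(reward_r), reward_r)
misled_value = value(rational_planner(reward_neg), reward_r)
regret = best_value - misled_value

if __name__ == "__main__":
    print("compatible decompositions:", compatible)
    print("regret when optimising -R under true R:", regret)
```

In this sketch all three pairs are compatible with the data, even though their reward functions disagree completely. Optimising the inferred reward -R when the true reward is R yields the largest regret this toy problem allows, which is the sense in which behaviour alone leaves the reward unidentified.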