Inverse Reinforcement Learning (IRL)

[Definition from here] Given:

Learn:

Then use the learnt reward function to learn \(\pi^*(a\mid s)\).
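A minimal sketch of this second stage, assuming a small tabular MDP with known, deterministic transitions. The names `learned_reward`, `transitions`, and `value_iteration` are hypothetical, and the reward values are a made-up stand-in for the output of an IRL step. Value iteration is used here only as one simple way to recover a policy from a reward function; the lecture may use a different method.

```python
def value_iteration(transitions, reward, gamma=0.9, tol=1e-8):
    """Compute state values and a greedy policy for a tabular MDP.

    transitions[s][a] is a list of (probability, next_state) pairs.
    reward[s] is the (assumed already learned) reward for being in state s.
    """
    n_states = len(transitions)
    values = [0.0] * n_states

    def q_values(s):
        # Q(s, a) = r(s) + gamma * E[V(s')]
        return [
            reward[s] + gamma * sum(p * values[s2] for p, s2 in outcomes)
            for outcomes in transitions[s]
        ]

    while True:
        delta = 0.0
        for s in range(n_states):
            best = max(q_values(s))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    # Deterministic greedy policy: pi*(a | s) puts all mass on argmax_a Q(s, a).
    policy = []
    for s in range(n_states):
        q = q_values(s)
        policy.append(q.index(max(q)))
    return values, policy


# Hypothetical 4-state chain: action 0 moves left, action 1 moves right,
# and moving past either end leaves the agent where it is.
transitions = [
    [[(1.0, max(s - 1, 0))], [(1.0, min(s + 1, 3))]]
    for s in range(4)
]

# Stand-in for a reward function produced by IRL (illustrative values only).
learned_reward = [0.0, 0.0, 0.0, 1.0]

values, policy = value_iteration(transitions, learned_reward)
print(policy)
```

In this toy setting the recovered policy moves right in every state, since the stand-in reward is concentrated on the last state of the chain.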

Some papers mentioned in the lecture linked above:
