Inverse Reinforcement Learning (IRL)

[Definition from here] Given:

Learn:

Then use the learnt reward function to learn the optimal policy \(\pi^*(a\mid s)\).
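The second stage (recovering a policy from an already-learnt reward) can be sketched on a toy problem. This is only a minimal illustration under stated assumptions: the 3-state chain MDP, its deterministic transitions, the discount factor, and the `learned_reward` table are all hypothetical stand-ins for whatever an IRL procedure would actually produce, and tabular value iteration is just one of many ways to obtain a policy from a reward. The resulting greedy policy is a deterministic special case of \(\pi^*(a\mid s)\).

```python
# Hypothetical 3-state chain MDP; every number here is an illustrative assumption.
N_STATES, N_ACTIONS = 3, 2  # actions: 0 = left, 1 = right
GAMMA = 0.9


def step(s, a):
    """Deterministic transition: move left or right, clipped to the chain."""
    return max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)


# Stand-in for the output of an IRL procedure: learned_reward[s][a].
learned_reward = [
    [0.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
]


def value_iteration(reward, gamma=GAMMA, tol=1e-8):
    """Return (V, policy), where policy[s] is the greedy action under `reward`."""
    V = [0.0] * N_STATES
    while True:
        # One-step lookahead using the current value estimate.
        Q = [[reward[s][a] + gamma * V[step(s, a)] for a in range(N_ACTIONS)]
             for s in range(N_STATES)]
        V_new = [max(q) for q in Q]
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            policy = [q.index(max(q)) for q in Q]
            return V_new, policy
        V = V_new


V, policy = value_iteration(learned_reward)
print(policy)  # greedy action per state
```

In this toy setting the reward table favours moving right, so the greedy policy picks action 1 in every state. With a learnt reward that is a neural network and an unknown or continuous MDP, this step would instead be carried out with a standard RL algorithm.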

Some papers mentioned in the lecture linked above:
