Policy Improvement Theorem

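Let \(\pi\) and \(\pi'\) be deterministic policies. If \(Q^{\pi}(s,a)\geq V^{\pi}(s)\) for some action \(a\) in a state \(s\), then the policy \(\pi'\) that takes action \(a\) in state \(s\) and otherwise agrees with \(\pi\) is at least as good as \(\pi\) in every state. More generally, \(\forall s\in S\: Q^{\pi}(s,\pi'(s))\geq V^{\pi}(s) \implies \forall s\in S\: V^{\pi'}(s)\geq V^{\pi}(s)\). The single-state case follows from the general one because \(Q^{\pi}(s',\pi(s'))=V^{\pi}(s')\) in every state \(s'\) where the policy is unchanged.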
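The theorem can be checked numerically on a small example. The sketch below is an illustration only, under assumptions that are not part of the note: a finite MDP with randomly generated transitions and rewards, a discount factor of 0.9, and the hypothetical helper names `make_random_mdp`, `policy_value`, and `action_values`. It evaluates a policy exactly, builds \(\pi'\) by acting greedily with respect to \(Q^{\pi}\) (one way to satisfy the premise), and compares the two value functions.

```python
import numpy as np


def make_random_mdp(n_states, n_actions, seed):
    """Hypothetical helper: random transition tensor P[s, a, s'] and rewards R[s, a]."""
    rng = np.random.default_rng(seed)
    P = rng.random((n_states, n_actions, n_states))
    P /= P.sum(axis=2, keepdims=True)  # each P[s, a, :] sums to 1
    R = rng.random((n_states, n_actions))
    return P, R


def policy_value(P, R, policy, gamma):
    """Exact V^pi for a deterministic policy: solve (I - gamma * P_pi) V = R_pi."""
    n_states = P.shape[0]
    idx = np.arange(n_states)
    P_pi = P[idx, policy]  # shape (S, S): transitions under the policy
    R_pi = R[idx, policy]  # shape (S,): rewards under the policy
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)


def action_values(P, R, V, gamma):
    """Q(s, a) = R(s, a) + gamma * sum_s' P(s' | s, a) V(s')."""
    return R + gamma * P @ V  # shape (S, A)


gamma = 0.9  # assumed discount factor
n_states, n_actions = 4, 3
P, R = make_random_mdp(n_states, n_actions, seed=0)

pi = np.zeros(n_states, dtype=int)  # arbitrary starting policy: always action 0
V_pi = policy_value(P, R, pi, gamma)
Q_pi = action_values(P, R, V_pi, gamma)

# Greedy choice guarantees Q^pi(s, pi'(s)) >= Q^pi(s, pi(s)) = V^pi(s).
pi_new = Q_pi.argmax(axis=1)
V_new = policy_value(P, R, pi_new, gamma)

tol = 1e-12  # tolerance for floating-point round-off
premise_holds = bool(np.all(Q_pi[np.arange(n_states), pi_new] >= V_pi - tol))
conclusion_holds = bool(np.all(V_new >= V_pi - tol))
print("premise holds:", premise_holds)
print("conclusion holds:", conclusion_holds)
```

Any \(\pi'\) satisfying the premise would do here; the greedy choice is used only because it satisfies the premise by construction. A passing check on one random MDP is a sanity test, not a proof.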
