Markov Game/Stochastic Game
Introduced by Shapley (1953), the following definition is from Chapter 2 of (…).
A Markov game is a tuple \((N, S, \mathbf{A}, \mathbf{R}, T)\) where:
- \(N = \{1, \cdots, n\}\) is the set of agents.
- \(S\) is a set of \(n\)-agent stage games.
- \(\mathbf{A} = A_1 \times \cdots \times A_n\), where \(A_i\) is the set of actions (pure strategies) of agent \(i\). (Here we assume each agent has the same action set in all stage games.)
- \(\mathbf{R} = \{R_1, \cdots, R_n\}\), where \(R_i : S \times \mathbf{A} \to \mathbb{R}\) is the reward function of agent \(i\), mapping the current stage game and the joint action to a real-valued reward.
- \(T : S \times \mathbf{A} \to \Pi(S)\) is a stochastic transition function, where \(\Pi(S)\) denotes the set of probability distributions over \(S\); it specifies the distribution of the next stage game given the game just played and the joint action taken in it.
Notice that a Markov Decision Process (MDP) is exactly a Markov game with a single agent (\(n = 1\)).
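To make the tuple concrete, here is a minimal sketch of a Markov game as a data structure, with a `step` function that collects each agent's reward and samples the next stage game from \(T\). All names here (`MarkovGame`, `step`, and the toy two-agent example at the bottom) are illustrative assumptions, not from the definition above or any particular library.

```python
# Minimal sketch of a Markov game (N, S, A, R, T); names are illustrative.
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = int                    # index of a stage game in S
JointAction = Tuple[int, ...]  # one action per agent: an element of A_1 x ... x A_n

@dataclass
class MarkovGame:
    n_agents: int                                         # N = {1, ..., n}
    states: List[State]                                   # S: the stage games
    actions: List[List[int]]                              # actions[i] = A_i for agent i
    rewards: Callable[[State, JointAction], List[float]]  # (R_1(s,a), ..., R_n(s,a))
    transition: Callable[[State, JointAction], Dict[State, float]]  # T(s, a) in Pi(S)

    def step(self, s: State, a: JointAction) -> Tuple[List[float], State]:
        """Play joint action a in stage game s; return rewards and a sampled next game."""
        dist = self.transition(s, a)  # distribution over the next stage game
        next_s = random.choices(list(dist), weights=list(dist.values()))[0]
        return self.rewards(s, a), next_s

# Toy example (assumed): two agents, two stage games, matching-pennies-style rewards.
game = MarkovGame(
    n_agents=2,
    states=[0, 1],
    actions=[[0, 1], [0, 1]],
    rewards=lambda s, a: [1.0, -1.0] if a[0] == a[1] else [-1.0, 1.0],
    transition=lambda s, a: {0: 0.5, 1: 0.5},  # uniform over next stage games
)
r, s_next = game.step(0, (0, 1))
```

With `n_agents = 1` and single-element joint actions, the same structure reduces to an MDP, matching the remark above.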