Cooperative Markov decision processes: Time consistency, greedy players satisfaction, and cooperation maintenance

Avrachenkov, Konstantin; Cottatellucci, Laura; Maggi, Lorenzo
Research report RR-11-248

We consider multi-agent Markov Decision Processes (MDPs) in which cooperation among players is allowed. We find a cooperative payoff distribution procedure (MDP-CPDP) that distributes, over the course of the game, the payoff that players would get in the long-run static game. We show under which conditions such an MDP-CPDP fulfills a time consistency property, satisfies greedy players, and strengthens coalition cohesiveness throughout the game.


Type:
Report
Date:
2012-01-05
Department:
Systèmes de Communication
Eurecom Ref:
3601
Copyright:
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Research report RR-11-248 and is available at:

PERMALINK : https://www.eurecom.fr/publication/3601