Recursive games: uniform value, Tauberian theorem and the Mertens conjecture
Short link to this article: http://bit.ly/1W6L8hX
Xiaoxi Li and Xavier Venel
The model known as “stochastic games” was introduced by Shapley in 1953 to represent the interaction between several agents in a dynamic setting. In numerous economic situations, agents’ decisions have a double influence: on their immediate gains and on the gains they might obtain in the future. In financial terms, agents face a dilemma: they must choose between maximising their income, added value, etc. each day, and ensuring that they will be able to keep obtaining what they deem a sufficient level of these gains in the days that follow.
In this article, Xavier Venel and Xiaoxi Li look at the specific case of two-player zero-sum recursive games (in which whatever one player wins, the other necessarily loses). In this model, there are only two agents, with perfectly opposing aims, who interact repeatedly. Moreover, the game can evolve either towards absorbing states (1) that the process never leaves, or towards non-absorbing states in which the payment is “0 dollars” for both players. Thus the dilemma mentioned above is simplified because the agents do not have to worry about the payment in the current state: each tries to reach the absorbing states that are favourable to him.
There are several ways in which to study these games. The first is to fix a time limit and to examine the average payments. A second way is to fix a discount rate and investigate the average discounted payments: a gain today is more interesting than the same gain tomorrow (2). In each of these cases, there is a unique Nash equilibrium (3) which depends on the initial state of the world. In these two approaches, the players effectively consider only a finite number of steps. Indeed, even if all the steps are taken into account in the second case, the weight of the distant future becomes negligible. A third, stronger approach consists in defining uniform equilibria, where each player seeks to guarantee her maximum payment over the long term without knowing the length of the game.
Venel and Li analyse these models and show that the three approaches are linked by the notion of “uniform convergence”. The values of the game with average payments converge when the number of steps tends towards infinity if and only if the values of the discounted game converge when the players become patient (4). In other words, provided that the future has sufficient weight, the exact way (average or discounted average) in which the agents evaluate the gains has little influence on the equilibrium. Moreover, when there is a uniform equilibrium, agents can make optimal decisions without knowing the exact number of steps or the discount rate.
(1) Once reached, the state of the world no longer changes and the gains are fixed.
(2) The discount rate δ is fixed between 0 and 1. In the discounted average, the weight of the gain at step t is (1 - δ)δ^t; thus a gain of one dollar tomorrow is equal to a gain of δ dollars today.
(3) An equilibrium point at which no player changes their choice individually, since doing so would risk weakening their personal position.
(4) We speak of “uniform convergence” because the speed of convergence should be independent of the initial state.
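The two ways of evaluating gains described above can be illustrated with a small sketch. It is not taken from the article: it assumes a single hypothetical stream of payments that is worth 0 until an absorbing state is reached at step k and a fixed gain afterwards, and the names average_payoff, discounted_payoff and absorbed_at are invented for this illustration. The discounted sum uses the weights (1 - δ)δ^t from note (2) and is truncated at a large horizon, which is an approximation.

```python
def average_payoff(stream, n):
    """Average of the payments received during the first n steps."""
    return sum(stream(t) for t in range(n)) / n


def discounted_payoff(stream, delta, horizon=100_000):
    """Discounted average with weight (1 - delta) * delta**t at step t.

    The infinite sum is truncated at `horizon` steps (an assumption of this
    sketch); the neglected tail is of order delta**horizon.
    """
    return sum((1 - delta) * delta**t * stream(t) for t in range(horizon))


def absorbed_at(k, gain):
    """Hypothetical stream: 0 before step k, then `gain` at every later step."""
    def stream(t):
        return gain if t >= k else 0.0
    return stream


if __name__ == "__main__":
    stream = absorbed_at(k=10, gain=1.0)
    # Longer and longer games: the average approaches the absorbing gain.
    for n in (100, 1_000, 10_000):
        print("n =", n, "average =", average_payoff(stream, n))
    # More and more patient players: the discounted average does the same.
    for delta in (0.9, 0.99, 0.999):
        print("delta =", delta, "discounted =", discounted_payoff(stream, delta))
```

For this particular stream, the average over n steps is gain × (n - k)/n and the untruncated discounted average is gain × δ^k, so both tend to the absorbing gain as n grows or as δ approaches 1. This only shows the two evaluations agreeing on one fixed stream of payments; the article's result concerns the values of the games themselves, where both players choose their strategies.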
Original title of the article: “Recursive games: uniform value, Tauberian theorem and the Mertens conjecture ‘Maxmin = lim vn(p) = lim vδ(p)’”
Published in: International Journal of Game Theory
Available at: http://arxiv.org/abs/1506.00949