Sunday, May 12, 2013

A refinement of approximate correlated equilibrium

One reason to prefer correlated equilibrium (CE) over Nash equilibrium (NE) is that correlated equilibrium is easier to compute. We have various PPAD/PLS-hardness results for exact Nash equilibria (NE), and it is possible that approximate Nash equilibria may also be similarly ‘hard’ to compute. In contrast, approximate correlated equilibria (CE) are not just easy to compute, but are obtainable by simple ‘learning dynamics’ type algorithms. The price you pay is that it’s a much weaker solution concept: every Nash equilibrium is a correlated equilibrium, and there may be CEs that are far from any NEs. For example some congestion games have CEs with social cost much higher than the worst NE. Some games have unrealistic Nash equilibria that motivate the study of various refinements of NE, restrictions that hopefully weed out some or all of the unnatural ones.

Given that CE is a relaxation of NE, it looks like there is even more motivation to impose restrictions on them so as to get rid of unnatural equilibria. For CE, I reckon that any such restriction should try to chip away at the set of approximate correlated equilibria, via some mathematical restriction that gets rid of undesirable ones while preserving the property that the remaining solutions are still easy to find, preferably via simple algorithms. The following definition is derived from the notion of approximate well-supported Nash equilibrium, but the version for correlated equilibrium doesn’t seem to have been considered previously. Here I just give the definition and a motivating example, and leave it as a guess that the restriction makes it not too much harder to compute.

In a well-supported ε-NE, it is disallowed for a player to allocate positive probability to a pure strategy whose payoff is worse than the best response by more than ε. (In contrast, an ε-NE that is not well-supported allows a player to play a really bad response, provided that he does so with such low probability that his average payoff does not fall short of the best response by more than ε. For bimatrix games with payoffs in [0,1], it is currently known how to — in poly time — compute ε-NE for ε just over ⅓, but for well-supported ε-NE, it is only know how to do this for ε slightly less than ⅔.

Applying that “well-supported” principle to ε-correlated equilibria, it corresponds to a requirement that when a player observes his action (a sample of his marginal distribution derived from the approximate CE), he does not gain more than ε by switching to some other pure strategy. Here’s an example of how it makes a difference. We start with a trivial example and then make it more interesting. The trivial example has one player with actions {0,1}, who gets paid 0 to play 0, and 1 to play 1. In a standard ε-approximate CE, he can play 0 with probability up to ε, but if the approximate CE should be well-supported, he must play 1 with probability 1. To make the example more interesting, suppose you have a sequence of players 1,2,...,n, all with 2 pure-strategies 0 and 1, where player 1 is like the original player in the trivial 1-player game, and player i > 1 gets paid 1 to copy player i−1 and 0 to play the opposite strategy of player i−1. In a well-supported ε-CE, all players must play 1 with probability 1. By contrast, in a standard ε-CE, the probability that player i+1 plays 1, may differ by up to ε from the corresponding probability for player i. Consequently, the probability that player i plays 1, may fluctuate arbitrarily between 0 and 1 as i increases, subject to the constraint that it changes by at most ε each time i increases by 1. Thus the “well-supported” constraint seems to rule out a large number of outcomes. Whether you think this “well-supported” constraint is a good one, depends on whether you think these outcomes really ought to be ruled out...

No comments: