
BUG: `liap_value` incorrect for mixed strategy profile on an extensive game

Opened by tturocy

Overview

The value of liap_value() is not being reported correctly for at least some mixed strategy profiles defined on a game that has an underlying extensive representation.

Steps to reproduce

Consider the game below, which is Figure 4.2 from Myerson's 1991 textbook:

[Figure: Myerson (1991), Figure 4.2 — extensive form of the game]

This is an example of a game that has an equilibrium in the agent form that is not an equilibrium in behaviors. The .efg representation of the game is included at the bottom of this note.

In [1]: import pygambit as gbt

In [2]: efg = gbt.read_efg("myerson_fig_4_2.efg")

In [3]: efg_eqa = gbt.nash.enumpure_solve(efg, use_strategic=False).equilibria

In [4]: for eqm in efg_eqa:
   ...:     print(eqm)
   ...: 
[[[Rational(1, 1), Rational(0, 1)], [Rational(0, 1), Rational(1, 1)]], [[Rational(0, 1), Rational(1, 1)]]]
[[[Rational(0, 1), Rational(1, 1)], [Rational(0, 1), Rational(1, 1)]], [[Rational(1, 1), Rational(0, 1)]]]

In [5]: for eqm in efg_eqa:
   ...:     print(eqm.max_regret())
   ...: 
0
0

In [6]: for eqm in efg_eqa:
   ...:     print(eqm.liap_value())
   ...: 
0
0

enumpure_solve returns two equilibria. This is expected: with use_strategic=False, enumpure_solve is documented to find agent-form equilibria (clarifying this nomenclature is the subject of a separate set of issues). Insofar as liap_value is reporting the value for the multiagent (agent) form, a value of zero for both profiles is also as expected.
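As a hedged cross-check (not part of the original report), one can also enumerate pure equilibria over the reduced strategic form by flipping the use_strategic flag; only profiles that are Nash in strategies should survive this:

# Assumption: same enumpure_solve signature as above, but solving over the
# reduced strategic form of the extensive game rather than the agent form.
nfg_eqa = gbt.nash.enumpure_solve(efg, use_strategic=True).equilibria
for eqm in nfg_eqa:
    print(eqm, eqm.max_regret())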

However, only the first of these profiles is a Nash equilibrium of the game in behavior strategies (as Myerson points out); the second is not. Indeed, if we convert the profiles to mixed strategies and look at regret, we find that the second one has positive regret (for the first player):

In [12]: for eqm in efg_eqa:
    ...:     print(eqm.as_strategy().max_regret())
    ...: 
0
1

However, when we check liap_value on the corresponding mixed strategy profile, the value reported is incorrect:

In [13]: for eqm in efg_eqa:
    ...:     print(eqm.as_strategy().liap_value())
    ...: 
0
0

The liap_value of this profile must be positive (because its max_regret is clearly positive), but it is reported as zero.
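As a sanity check, here is a minimal sketch (ours, not part of the original report) that recomputes a Lyapunov-style value by hand from pure-strategy payoffs against the profile. It assumes the value is built from the positive parts of pure-strategy payoff gains over the profile payoff, and that MixedStrategyProfile exposes game, payoff(player) and strategy_value(strategy); the exact normalisation Gambit uses may differ, but any positive regret should make the value strictly positive:

# Hedged sketch: hand-rolled Lyapunov-style value for a mixed strategy profile.
# The exact formula Gambit uses may differ in normalisation, but this quantity
# should be zero exactly when max_regret() is zero.
def manual_liap(profile):
    total = 0
    for player in profile.game.players:
        base = profile.payoff(player)
        for strategy in player.strategies:
            gain = profile.strategy_value(strategy) - base
            total += max(gain, 0) ** 2
    return total

for eqm in efg_eqa:
    print(manual_liap(eqm.as_strategy()))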

If instead we build the normal-form representation and define the same mixed strategy profile on the game read in as a normal form, liap_value is reported correctly:

In [14]: nfg = gbt.read_nfg("myerson_fig_4_2.nfg")

In [15]: profile = nfg.mixed_strategy_profile([[0, 0, 1], [1, 0]])

In [16]: profile.max_regret()
Out[16]: 1.0

In [17]: profile.liap_value()
Out[17]: 1.0
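A further cross-check (our assumption, not in the original report): defining the same strategy weights directly on the extensive game, rather than via as_strategy(), should still exercise the tree-backed profile representation and, if the diagnosis below is right, show the same incorrect zero. This assumes the reduced strategic form of the .efg lists Player 1's strategies in the same order as the .nfg above:

# Hedged check: same weights, defined directly on the extensive game
# (which keeps its tree representation underneath).
profile_efg = efg.mixed_strategy_profile([[0, 0, 1], [1, 0]])
print(profile_efg.max_regret(), profile_efg.liap_value())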

Contents of myerson_fig_4_2.efg:

EFG 2 R "Untitled Extensive Game" { "Player 1" "Player 2" }
""

p "" 1 1 "" { "A1" "B1" } 0
p "" 2 1 "" { "W2" "X2" } 0
p "" 1 2 "" { "Y1" "Z1" } 0
t "" 1 "" { 3, 0 }
t "" 2 "" { 0, 0 }
p "" 1 2 "" { "Y1" "Z1" } 0
t "" 3 "" { 2, 3 }
t "" 4 "" { 4, 1 }
p "" 2 1 "" { "W2" "X2" } 0
t "" 5 "" { 2, 3 }
t "" 6 "" { 3, 2 }

Contents of myerson_fig_4_2.nfg:

NFG 1 R "Untitled Extensive Game" { "Player 1" "Player 2" }

{ { "11" "12" "2*" }
{ "1" "2" }
}
""

3 0
0 0
2 3
2 3
4 1
3 2

— tturocy, Nov 17 '25

The issue is with ComputePayoffs in MixedStrategyProfile<T>::GetLiapValue() of game.cc. For the "strategy" payoffs that populate map_strategy_payoffs, we call GetPayoff(strategy), which in turn uses GetPayoffDeriv.

When starting from an NFG, we end up using GetPayoffDeriv from gametable.cc and get what (at least appears to be) the correct computation. When starting from an EFG and a behavior profile, and using as_strategy to go via a MixedStrategyProfile, we instead go through TreeMixedStrategyProfileRep, which falls back on the behavior profile: TreeMixedStrategyProfileRep<T>::GetPayoffDeriv(int pl, const GameStrategy &strategy) in gametree.cc is used, and then:

template <class T> T TreeMixedStrategyProfileRep<T>::GetPayoff(int pl) const
{
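  // MakeBehavior() builds the equivalent mixed behavior profile (held in
  // mixed_behav_profile_sptr), so the payoff is evaluated via the tree-based
  // behavior representation rather than the strategic form.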
  MakeBehavior();
  return mixed_behav_profile_sptr->GetPayoff(pl);
}

Thus, this is indeed intimately related to issue #617, and one option is to fix it as part of fixing that: ensure there is a liap_value for mixed behaviors of the normal (reduced strategic) game, as opposed to the agent form, and probably provide separate agent-form functions, suitably named and called only when we really want them.

— rahulsavani, Nov 20 '25

This turned out to be a caching-related issue in the implementation of the strategy profile. The ultimate reason for the discrepancy between regret and the Lyapunov value was the order in which the various player/strategy payoffs were being computed.
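Given that diagnosis, a hedged regression sketch (ours, not part of the fix) is to evaluate liap_value() both before and after another payoff-dependent query, since stale cached payoffs would show up as order-dependent results:

# Hedged regression sketch: if stale cached payoffs were the culprit, the
# reported value could depend on which query runs first. After the fix,
# both orderings should agree (and be positive for the non-Nash profile).
for eqm in efg_eqa:
    p1 = eqm.as_strategy()
    p2 = eqm.as_strategy()
    first = p1.liap_value()      # liap_value queried first
    _ = p2.max_regret()          # trigger other payoff computations first
    second = p2.liap_value()     # then liap_value
    print(first, second)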

— tturocy, Dec 04 '25