Multi-agent-RL

Reward list entries are mismatched in TD-Linear

Open dalton-ly opened this issue 1 year ago • 2 comments

image

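The reward list in TD-Linear is initialized incorrectly: its order does not match the order of the reward list used when GridEnv initializes the PSA matrix: (image) As a result, the policy_evaluation function in TD-Linear does not produce the correct state values.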
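To illustrate why the ordering matters, here is a minimal sketch, not the repository's actual code. The names `CELL_TYPES`, `REWARD_BY_TYPE`, `build_reward_list`, and `expected_reward`, as well as the reward values, are assumptions made up for this example. It shows that the same probability vector gives a different expected reward when the reward list is built in a different order from the one the probabilities are indexed by.

```python
import numpy as np

# Hypothetical cell types and reward values (assumptions for illustration only;
# the real GridEnv may use different categories, values, and ordering).
CELL_TYPES = ("normal", "forbidden", "target")  # single shared ordering
REWARD_BY_TYPE = {"normal": 0.0, "forbidden": -10.0, "target": 1.0}


def build_reward_list(order=CELL_TYPES):
    """Return rewards as an array whose index i corresponds to order[i]."""
    return np.array([REWARD_BY_TYPE[name] for name in order])


def expected_reward(prob_over_types, reward_list):
    """Expected reward when prob_over_types[i] is the probability of reward_list[i]."""
    return float(np.dot(prob_over_types, reward_list))


# Probabilities indexed by CELL_TYPES: the transition surely reaches the target.
p = np.array([0.0, 0.0, 1.0])

consistent = build_reward_list()                                   # same order as p
mismatched = build_reward_list(("normal", "target", "forbidden"))  # different order

print(expected_reward(p, consistent))   # reward of the target cell
print(expected_reward(p, mismatched))   # silently picks the forbidden-cell reward
```

Building both lists from one shared ordering constant is one possible way to keep the environment and the evaluation code in agreement.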

dalton-ly avatar Dec 14 '24 08:12 dalton-ly

(image) Also, the original code is missing a multiplication by $\phi(s_t)$ here.
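For reference, the standard TD update with linear function approximation is $w_{t+1} = w_t + \alpha_t \left[ r_{t+1} + \gamma \phi(s_{t+1})^T w_t - \phi(s_t)^T w_t \right] \phi(s_t)$, where the trailing $\phi(s_t)$ is the gradient of $\hat{v}(s_t, w) = \phi(s_t)^T w$ with respect to $w$. The sketch below is a minimal illustration of that formula, not the repository's code; the function name `td_linear_update` and its parameters are assumptions chosen for this example.

```python
import numpy as np


def td_linear_update(w, phi_s, phi_s_next, reward, alpha=0.1, gamma=0.9):
    """One TD(0) step for a linear value estimate v(s) = phi(s)^T w.

    All names here are hypothetical; only the update rule itself is standard.
    """
    # TD error: r + gamma * v(s') - v(s)
    td_error = reward + gamma * (phi_s_next @ w) - (phi_s @ w)
    # The TD error must be multiplied by phi(s_t), the gradient of v(s_t) w.r.t. w.
    return w + alpha * td_error * phi_s


w = np.zeros(2)
phi_s = np.array([1.0, 2.0])
phi_s_next = np.array([0.0, 1.0])

w_new = td_linear_update(w, phi_s, phi_s_next, reward=1.0, alpha=0.5)
print(w_new)
```

Without the factor $\phi(s_t)$, the scalar TD error would be added to every component of $w$ equally, regardless of which features are active in $s_t$.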

dalton-ly avatar Dec 14 '24 11:12 dalton-ly

Thank you for pointing out these issues!

Ronchy2000 avatar Jan 06 '25 06:01 Ronchy2000