leepengcheng
Results: 1 comment by leepengcheng
> Sorry about that, I have re-edited the problem description.

-------------

Big brother, I am also confused by the code. If we follow the paper, it should be

```python
policy_loss = log_prob - expected_new_q_value  # when alpha=1
```
...
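To make the "when alpha=1" remark concrete, here is a minimal sketch of the same expression with the temperature written out, assuming the usual SAC policy objective (batch mean of `alpha * log_prob - Q`). The helper name `sac_policy_loss` and the sample numbers are hypothetical and are not taken from the code under discussion. Averaging over the batch is also an assumption, since the one-liner above leaves it out.

```python
from statistics import fmean


def sac_policy_loss(log_probs, q_values, alpha=1.0):
    """Batch mean of alpha * log_prob - Q (hypothetical helper, plain Python floats)."""
    if len(log_probs) != len(q_values):
        raise ValueError("log_probs and q_values must have the same length")
    return fmean(alpha * lp - q for lp, q in zip(log_probs, q_values))


# Made-up values, only to show how alpha changes the result.
log_prob = [-1.0, -2.0]
expected_new_q_value = [0.5, 1.5]

# alpha = 1 reduces to the mean of log_prob - expected_new_q_value.
loss_alpha_1 = sac_policy_loss(log_prob, expected_new_q_value)

# A smaller alpha down-weights the entropy (log-probability) term.
loss_alpha_02 = sac_policy_loss(log_prob, expected_new_q_value, alpha=0.2)
```

With `alpha=1.0` the helper computes the batch mean of the expression in the comment. Any other value only rescales the `log_prob` term and leaves the Q term unchanged.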