leepengcheng

Results: 1 comment by leepengcheng

> Sorry about that, I have re-edited the problem description.
>
> -------------
>
> Big brother, I am also confused by the code. If we follow the paper, it should be:
>
> ```python
> policy_loss = log_prob - expected_new_q_value  # when alpha=1
> ```
> ...
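
For reference, here is a minimal sketch of the loss the comment describes, assuming it refers to the Soft Actor-Critic policy objective, which minimizes the batch mean of `alpha * log_prob - Q`. The function name `sac_policy_loss` and the sample numbers are hypothetical and do not come from the repository under discussion. Plain Python lists stand in for tensors so the arithmetic is easy to check by hand.

```python
def sac_policy_loss(log_probs, q_values, alpha=1.0):
    """Batch mean of alpha * log_prob - Q (hypothetical helper, not repo code)."""
    if len(log_probs) != len(q_values):
        raise ValueError("log_probs and q_values must have the same length")
    # Per-sample term: entropy-weighted log-probability minus the Q estimate.
    terms = [alpha * lp - q for lp, q in zip(log_probs, q_values)]
    return sum(terms) / len(terms)


# Illustrative values only: two sampled actions and their Q estimates.
log_prob = [-1.0, -0.5]
expected_new_q_value = [2.0, 1.0]

# With alpha=1 this reduces to mean(log_prob - expected_new_q_value).
loss = sac_policy_loss(log_prob, expected_new_q_value)
print(loss)  # -2.25
```

With `alpha=1` the helper matches the one-line expression in the comment, averaged over the batch. A smaller `alpha` shrinks only the log-probability term.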