Yuanzhao Zhai issues

Results: 9 issues by Yuanzhao Zhai

A question about the approach

Hello! Thanks for your wonderful work. You predicted the next states of humans with the relational graph containing the robot. However, you also assumed that the robot is invisible to...

I am reading your excellent code and found a small bug: the imitation model fails to load. I suggest replacing model.load_state_dict_rl(torch.load(il_weight_file)) with policy.load_state_dict_rl(torch.load(il_weight_file)) on line 138 of train.py. Thanks!

A question about backpropagation

Hello! Thanks for your wonderful work. The Optimization section of the paper says: "In order to optimize (2) via backpropagation, we need to compute a subgradient of...

Release of the Chinese Dataset.

Hello! Thanks for sharing the valuable experience. Do you plan to release the Chinese datasets?

Error for references with more than three authors

Thanks for sharing! I have a question: when using \printbib and \printbibtabular[title={~}], the following error is reported whenever there are more than three references: .10 \printbibtabular[title={~}] ./Tex/6_references.tex:10: Missing \cr inserted. \cr l.10 \printbibtabular[title={~}]

Zero win rate in SMAC scenario

Hello, thanks for your elegant code. When I directly run python dicg_ce_smac_runner.py --map 3s_vs_5z and python dicg_ce_smac_runner.py --map 6h_vs_8z, the win rate is always zero. When I directly run: python dicg_ce_smac_runner.py...

Llama2 as actor using zero_stage3

Hello! Has anyone encountered the following error when using zero_stage3 with Llama2? step3_rlhf_finetuning/rlhf_engine.py:61 in __init__ 58 self.num_total_iters = num_total_iters 59...

How to run generalization tasks?

Hello! Have you implemented halfcheetah-jump and ant-angle tasks proposed in MOPO?

Average reward of gpt-3.5-turbo

Hello! When I run the following command: "python -m eval_agent.main --agent_config openai --exp_config alfworld --split test --verbose" I got the results: All tasks done. Output saved to outputs/gpt-3.5-turbo/alfworld Average reward:...