DDQ
What is the pre-training of the DQN model and the world model? ("Initialize Q(s, a; θ_Q) and M(s, a; θ_M) via pre-training on human conversational data"?)
Hi, I don't understand the pre-training of the world model, because I cannot find the pre-training process in your code. Can you explain what it is? And where is the pre-training of the DQN model and the world model in your repo? Thanks.
I have the same question. lol