dfp
Could you advise on basing the DFP algorithm on actor-critic (or DDPG, PPO) for continuous action spaces?
Hi, I'm a graduate student, and I want to say thank you for explaining DFP in detail.
It is a very interesting algorithm. However, since I am majoring in robotics with a learning-based approach, I want to make it work in a continuous action space. I've therefore tried basing it on DDPG and actor-critic. Unfortunately, it doesn't work. Could you give me some advice, please?
Wonchul Kim