Giacomo comments

Have you tried to use the best CG approximation among iterations?

Nope, but I have used simple alternatives like plain gradient descent and steepest descent. The latter is also affected by negative alphas, although it performs better than CG in my...
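For readers unfamiliar with the idea raised in the question, here is a minimal sketch of conjugate gradient that keeps the best iterate seen so far and stops when it meets non-positive curvature (the situation that produces a negative alpha). It is an illustration only, not the code discussed in this thread: the function names are hypothetical, and "best" is assumed to mean the smallest residual norm.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg_best_iterate(A, b, iters=10, tol=1e-10):
    """Run CG on A x = b and return (best_x, best_squared_residual).

    Assumption: "best" means the iterate with the smallest squared residual.
    The loop stops as soon as p^T A p <= 0, since the step size alpha
    would then be negative or undefined.
    """
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the starting point x = 0
    p = list(r)          # first search direction
    rr = dot(r, r)
    best_x, best_rr = list(x), rr
    for _ in range(iters):
        if rr < tol:
            break
        Ap = matvec(A, p)
        curvature = dot(p, Ap)
        if curvature <= 0.0:
            break        # non-positive curvature: keep the best iterate so far
        alpha = rr / curvature
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        if rr_new < best_rr:
            best_x, best_rr = list(x), rr_new
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return best_x, best_rr


# Positive-definite example: CG reaches the exact solution.
x_spd, _ = cg_best_iterate([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])

# Indefinite example: the first direction has negative curvature,
# so the starting point is returned unchanged.
x_indef, rr_indef = cg_best_iterate([[1.0, 0.0], [0.0, -1.0]], [0.0, 1.0])
```

Whether falling back to the best iterate helps in practice depends on the problem; the sketch only shows the bookkeeping involved.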

Loss normalization

You are indeed correct, thanks for pointing it out! The correct version should be (specifying the reduction axes): `cross_entropy = -tf.reduce_mean(tf.reduce_sum(y_tgt*tf.log(y+1e-04) + (1.-y_tgt)*tf.log(1.-y+1e-04), 1), 0)`. However, this...
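To make the reduction order concrete, here is a small plain-Python sketch of the same computation: sum the per-unit terms over axis 1 (the output units), then average over axis 0 (the batch). The function name and the nested-list inputs are assumptions made for illustration; only the formula and the `1e-04` offset come from the expression above.

```python
import math


def binary_cross_entropy(y_pred, y_tgt, eps=1e-4):
    """Sum over output units (axis 1), then mean over the batch (axis 0).

    y_pred and y_tgt are lists of rows, one row per example.
    eps mirrors the 1e-04 offset used to keep log() away from zero.
    """
    per_example = []
    for pred_row, tgt_row in zip(y_pred, y_tgt):
        total = 0.0
        for y, t in zip(pred_row, tgt_row):
            total += t * math.log(y + eps) + (1.0 - t) * math.log(1.0 - y + eps)
        per_example.append(total)
    return -sum(per_example) / len(per_example)


preds = [[0.5, 0.5], [0.5, 0.5]]
targets = [[1.0, 0.0], [0.0, 1.0]]
loss = binary_cross_entropy(preds, targets)

# Duplicating the batch leaves the loss unchanged, because the outer
# reduction is a mean over examples rather than a sum.
loss_doubled = binary_cross_entropy(preds + preds, targets + targets)
```

With this ordering the loss scales with the number of output units but not with the batch size.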

Loss normalization

Note that in the paper they use Adam instead of SGD. I have slightly more complex code locally that actually computes the delta(t) weights changes, instead of simply replacing it...

Loss normalization

Yep. I just wanted to do it automatically. The code I'm using on my computer computes the delta by adding dependencies and grouping a few different operations under the train...

Will removing batch normalization significantly hurt performance?

I didn't remember it was the only algorithm to solve it. :P I am pretty sure that TRPO would do better anyway. In any case, I did not use batch...

Will removing batch normalization significantly hurt performance?

Hmm my MuJoCo license has expired so I can't test it now apparently. xD Did you try changing the discount factor or the other parameters? Also note that the 25.000...

Will removing batch normalization significantly hurt performance?

Awesome. :)

Could we specify the TensorFlow and gym versions? Thanks!

Could you please post the full error you get? I have recently tried the code (albeit a local version, not the git one, so it may be slightly different) on...

AttributeError: 'TimeLimit' object has no attribute 'monitor'

Unfortunately this code is a bit old and deprecated. You can solve the problem by removing all references to 'monitor' from the code.

Thanks!

That's really cool! Congratulations! :)