
Cache the number of elements in the action space

Open thatguy11325 opened this issue 1 year ago • 3 comments

You probably don't need to dispatch to numpy every time you call split to calculate the number of elements in the space. This PR caches the sizes (in a less-than-nice way, imo) as an example. Before and after pictures below.

[Image: Screenshot 2024-01-28 at 9 12 20 PM]
[Image: Screenshot 2024-01-28 at 9 12 52 PM]

thatguy11325 • Jan 29 '24 02:01
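For readers without the screenshots, here is a minimal sketch of the caching idea described above. It is not PufferLib's actual code: `make_split`, `flat_space` as a dict of shape tuples, and the `(batch, total_size)` layout of `stacked_sample` are all assumptions made for illustration. The element counts are computed once with `np.prod`, so each call to `split` only slices and reshapes.

```python
import numpy as np

def make_split(flat_space):
    """Build a split function with per-leaf sizes cached up front.

    flat_space: hypothetical dict mapping leaf name -> shape tuple,
    standing in for a flattened action/observation space.
    """
    # Computed once here, not on every call to split().
    sizes = {name: int(np.prod(shape)) for name, shape in flat_space.items()}

    def split(stacked_sample):
        # Assumed layout: stacked_sample has shape (batch, total_size).
        batch = stacked_sample.shape[0]
        leaves, ptr = {}, 0
        for name, shape in flat_space.items():
            end = ptr + sizes[name]
            leaves[name] = stacked_sample[:, ptr:end].reshape(batch, *shape)
            ptr = end
        return leaves

    return split

# Usage with made-up leaves: one of 2 elements, one of 6 elements.
flat_space = {"move": (2,), "grid": (2, 3)}
split = make_split(flat_space)
sample = np.arange(16).reshape(2, 8)
leaves = split(sample)
```

The closure is just one way to hold the cached sizes. Storing them on the object that owns the space when it is constructed would work as well.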

This looks reasonable, waiting for hardware to test end to end. Any other optimization ideas for split? It's the main bottleneck right now. From before your patch:

🐡 python tests/test_extensions.py
0.00000032: Flatten time
0.00000294: Concatenate time
0.00001958: Split time
0.00000056: Unflatten time

jsuarez5341 • Feb 06 '24 13:02

You could try to vectorize the samps -> leaves generation, à la what was done in evaluate? Though I'm unsure that'll work if the sz values can vary.

thatguy11325 • Feb 06 '24 16:02

I think it'd look like

leaves = stacked_sample.reshape(len(flat_space), batch, *next(iter(flat_space.values())).shape)

I assume since sz is the same across all flat spaces, shape will be the same too.

thatguy11325 • Feb 06 '24 17:02
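A runnable sketch of the vectorized reshape suggested above, under stated assumptions: every leaf has the same shape, `flat_space` is a hypothetical dict of shape tuples (the thread's version holds space objects with a `.shape` attribute), and `stacked_sample` is laid out as `(batch, n_leaves * sz)` with each leaf contiguous within a row. Under that layout a direct reshape to `(n_leaves, batch, ...)` would mix rows, so the sketch reshapes to `(batch, n_leaves, ...)` first and then moves the leaf axis to the front. If the data were already leaf-major, the single reshape from the comment would be enough.

```python
import numpy as np

# Hypothetical flat space: every leaf has the same shape, hence the same sz.
flat_space = {"a": (3,), "b": (3,), "c": (3,)}
batch = 2

leaf_shape = next(iter(flat_space.values()))
sz = int(np.prod(leaf_shape))
n_leaves = len(flat_space)

# Assumed layout: (batch, n_leaves * sz), leaves contiguous within each row.
stacked_sample = np.arange(batch * n_leaves * sz).reshape(batch, -1)

# One reshape plus an axis move replaces the per-leaf slicing loop.
stacked = stacked_sample.reshape(batch, n_leaves, *leaf_shape)
leaves_array = np.moveaxis(stacked, 1, 0)  # (n_leaves, batch, *leaf_shape)
leaves = dict(zip(flat_space, leaves_array))

# Cross-check against plain slicing for the second leaf.
expected_b = stacked_sample[:, sz:2 * sz].reshape(batch, *leaf_shape)
```

This only applies when all leaves share one shape. With varying sz values, a slicing loop over cached offsets is still needed.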