Is there a plan for a pytorch wrapper?

Open tonylins opened this issue 8 years ago • 9 comments

tonylins avatar Dec 10 '17 09:12 tonylins

I don't think I'll have time to do this myself, but someone else is welcome to. Nvidia is likely also looking to formalize blocksparse primitives in their libraries.

scott-gray avatar Dec 15 '17 01:12 scott-gray

Hello @scott-gray, I would like to try porting it to PyTorch. I know the procedure for incorporating C++ and CUDA extensions in PyTorch through setuptools.

Any advice on which op to start with, and how I should go about testing and validating that everything works as envisioned? I only have a Colab instance, which should be good enough, but something that trains quickly (small dataset, small model) would be appreciated.
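One common way to validate a ported op is to compare it against a dense reference on small random inputs. Here is a minimal sketch of that idea in plain Python. Everything in it is hypothetical: `blocksparse_matmul` is only a stand-in for whatever wrapped op is being tested, and `layout`, `bs`, and the helper names are made up for illustration, not taken from blocksparse.

```python
import random

def dense_matmul(x, w):
    """Reference: plain dense product of x (m x k) and w (k x n)."""
    m, k, n = len(x), len(w), len(w[0])
    return [[sum(x[i][p] * w[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

def apply_layout(w, layout, bs):
    """Zero every bs x bs block of w that the layout marks as absent."""
    return [[w[p][j] if layout[p // bs][j // bs] else 0.0
             for j in range(len(w[0]))] for p in range(len(w))]

def blocksparse_matmul(x, w, layout, bs):
    """Hypothetical stand-in for the op under test: visits only kept blocks."""
    m, n = len(x), len(w[0])
    y = [[0.0] * n for _ in range(m)]
    for bi, row in enumerate(layout):
        for bj, keep in enumerate(row):
            if not keep:
                continue  # skipped blocks contribute nothing
            for i in range(m):
                for p in range(bi * bs, (bi + 1) * bs):
                    for j in range(bj * bs, (bj + 1) * bs):
                        y[i][j] += x[i][p] * w[p][j]
    return y

def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two matrices."""
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))

random.seed(0)
bs = 2                              # block size (assumed for the example)
layout = [[1, 0, 1], [0, 1, 0]]     # 2 x 3 blocks, so w is 4 x 6
k, n, m = len(layout) * bs, len(layout[0]) * bs, 3
x = [[random.uniform(-1, 1) for _ in range(k)] for _ in range(m)]
w = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(k)]

expected = dense_matmul(x, apply_layout(w, layout, bs))
actual = blocksparse_matmul(x, w, layout, bs)
error = max_abs_diff(expected, actual)
print("max abs error:", error)
```

The same pattern should carry over to a real port: mask a dense weight with the block layout, run the framework's dense matmul as the reference, and check that the wrapped op agrees within a tolerance suited to the dtype (looser for fp16).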

Thanks !

karanchahal avatar May 24 '19 09:05 karanchahal

Hey @karanchahal, your efforts to help port these ops to pytorch would be more than welcome (both internally here at OpenAI and likely by the wider community). I think right now the demand is highest for the sparse transformer primitives. I'd look here to find anything not yet supported in pytorch:

https://github.com/openai/blocksparse/blob/master/examples/transformer/enwik8.py

You might also look at the layer_norm implementation. I'm pretty sure it's significantly faster than anything else I've seen out there. Also, my clip_by_global_norm and fused optimizer ops make training in fp16 rather easy.
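For anyone unfamiliar with clipping by global norm, here is a rough sketch of the standard definition (the one TensorFlow's `tf.clip_by_global_norm` documents) in plain Python. It is not the fused blocksparse kernel, and the list-of-lists gradient representation is just an assumption to keep the example self-contained.

```python
import math

def clip_by_global_norm(grads, clip_norm):
    """Rescale all gradients together so their joint L2 norm is at most clip_norm.

    grads is a list of flat gradient lists (one per parameter);
    clip_norm must be positive.
    """
    # The global norm treats every gradient entry as part of one long vector.
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The scale is 1 when the norm is already within bounds, below 1 otherwise.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, global_norm

grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
print("global norm before clipping:", norm)
```

Because every gradient is scaled by the same factor, the update direction is preserved and only its magnitude shrinks.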

scott-gray avatar May 24 '19 19:05 scott-gray

Thanks! I'll look into it.

karanchahal avatar May 25 '19 03:05 karanchahal

Over the weekend, @soumith put together this PR that adds support for one of our ops: https://github.com/soumith/blocksparse/commit/4071232a4a73a441424434ca2e81b1e4fd4e836c

We should be able to follow this example to add the other ops as well. Thanks @soumith!

nottombrown avatar May 29 '19 23:05 nottombrown

Is there an update on this issue?

reactivetype avatar Aug 21 '19 22:08 reactivetype

We have some pytorch coverage of the ops, and work is ongoing to make it more complete. We'll release this code sometime soon (along with new ops/modes).

scott-gray avatar Aug 21 '19 22:08 scott-gray

Is there an update on this issue?

msharmavikram avatar Nov 02 '19 19:11 msharmavikram

Is there an update on this issue? Thanks!

shizhediao avatar Feb 28 '20 01:02 shizhediao