Upgrade to tensorflow 2.0
Hi all,
Has anybody tried to upgrade this project to tensorflow 2.0?
AFAIK one of the main issues is that the cuda_stream.h header was removed in TF 2.0 (also see #40 ). Now, instead of passing a CUstream directly when writing an op, users must pass a GPUDevice object (probably to decouple from the CUDA dependency).
I tried to patch it with this change but failed. Has anyone else had any luck?
I have tried to build this against TF 1.14 with no success; I recall an issue related to cuda_stream.h, among others. I don't know enough about TF to fix these myself.
I would certainly like to see this library updated to the latest TF and CUDA 10, and even ported to PyTorch if possible. There are many interesting applications for bsmm that I am very keen to try.
This issue is currently blocking 1.14 support:
https://github.com/tensorflow/tensorflow/issues/31349
Otherwise, I can update the code that grabs the cu_stream to the new way (it is stupidly awkward to get a hold of this handle in TensorFlow).
We have lots of people at OpenAI who are making the switch to PyTorch. Some of the ops have already been ported over. I think we should be able to fully support both frameworks in the future. Relative attention, new convolution primitives, more learned-sparsity support, fast product-key memory ops, among other things, will be released soon. The priority now is to finish up our paper on learned sparsity and dump a lot of this code.
Hi Scott, great to hear that there is a plan to support TF 2.0 and PyTorch.
Is there any progress on this?
The same question: Is there any progress on this?
Hi Scott, Is there any progress on this?
Hi @georgepar, have you solved this problem? If you have, please give me some advice. Thanks.
Hi @lhl2017, unfortunately no. I ended up using other alternatives like the Reformer. You can check out a recent block-sparse implementation in PyTorch, available through https://github.com/ptillet/torch-blocksparse
You might also want to give Longformer a shot, especially if you are working on an NLP task, as it includes a pretrained model for long documents: https://github.com/allenai/longformer (self-promotion :D)
I ended up using a sparsity constraint on the weights of my kernel (a custom TensorFlow/Keras constraint that just multiplies the weight matrix by a sparse mask).
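A minimal sketch of the idea looks something like this (the class name, shapes, and the example layer are just illustrative, not my original code):

```python
import tensorflow as tf


class SparseMask(tf.keras.constraints.Constraint):
    """Re-applies a fixed binary mask to a layer's kernel after every weight update."""

    def __init__(self, mask):
        # mask: binary array with the same shape as the layer's kernel
        self.mask = tf.constant(mask, dtype=tf.float32)

    def __call__(self, w):
        # Zero out the masked entries; gradients still flow through the kept ones.
        return w * self.mask


# Example: a Dense layer whose 128x64 kernel keeps only ~20% of its entries.
mask = (tf.random.uniform((128, 64)) > 0.8).numpy().astype("float32")
layer = tf.keras.layers.Dense(64, kernel_constraint=SparseMask(mask))
```

Note that this only enforces the sparsity pattern; the matmul itself stays dense, so unlike blocksparse it does not save any compute or memory.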
@georgepar Thank you! I will try this version. Actually, I wish to use the official version of BlockSparse to reproduce the Sparse Transformer paper. In addition, I want to compare against cuBLAS and cuSPARSE to check the results they reported.