Colab Notebook to run blocksparse
I would like to understand this awesome work better!
I opened a Colab Notebook to test blocksparse: https://colab.research.google.com/drive/1F7VofDAAXhwi46DX-HTmk1Hhq6XZB69g
Unfortunately, I ran into two problems:
1. When I run ./blocksparse/examples/transformer/enwik8.py, the loss goes to NaN:
    Starting epoch 0
    Not including 1 sequences
    Number of minibatches this epoch: 8789
    train iteration: 0, loss: 5.53922, bits per byte: 7.99140 ns:0.16947 gn:5.90084
    train iteration: 200, loss: nan, bits per byte: nan ns:0.00000 gn:nan
    train iteration: 400, loss: nan, bits per byte: nan ns:0.00000 gn:nan
    train iteration: 600, loss: nan, bits per byte: nan ns:0.00000 gn:nan
2. When I run ./blocksparse/test/blocksparse_matmul_test.py, one test errors out:
    ERROR: testBlocksparseMatMul (__main__.BlocksparseMatMulTest)
    Traceback (most recent call last):
      File "./blocksparse/test/blocksparse_matmul_test.py", line 346, in testBlocksparseMatMul
        y = bsmm(y, w2, dw_dtype=dtF, bench=repeat) # (bench and j==depth-1) (bench and j==0)
    TypeError: __call__() got an unexpected keyword argument 'dw_dtype'

    Ran 2 tests in 1.389s

    FAILED (errors=1, skipped=1)
Can anyone help?
It's hard to say what the problem is without knowing more about your setup. This code hasn't gotten much testing outside of our research env, which is all V100 based now.
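For anyone reporting a similar problem, here is a minimal sketch of how the relevant setup details could be collected. It assumes an NVIDIA driver with `nvidia-smi` on the PATH and Python 3.7+; `describe_gpu_setup` is a hypothetical helper written for this illustration, not part of blocksparse.

```python
import shutil
import subprocess
import sys


def describe_gpu_setup():
    """Collect the Python version and, when nvidia-smi is available, GPU details.

    Hypothetical helper for bug reports; not part of blocksparse.
    """
    info = {"python": sys.version.split()[0], "gpus": None}
    if shutil.which("nvidia-smi") is None:
        return info  # no NVIDIA driver tools found on PATH
    try:
        result = subprocess.run(
            [
                "nvidia-smi",
                "--query-gpu=name,driver_version,memory.total",
                "--format=csv,noheader",
            ],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return info  # the tool exists but the query failed; leave gpus as None
    # One CSV line per visible GPU.
    info["gpus"] = [ln.strip() for ln in result.stdout.splitlines() if ln.strip()]
    return info


if __name__ == "__main__":
    print(describe_gpu_setup())
```

Pasting that output into the report, together with the installed TensorFlow and CUDA versions, should make it easier to tell whether the setup differs from a V100 environment.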
As far as the test files go, I need to refresh those. I'm frequently tweaking this lib and doing one-off tests, so my local copies can diverge from what is checked in. I'll clean things up today and check in versions where everything should work.
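In the meantime, a rough way to check whether a test file and the installed library are out of sync is to inspect the signature of the callable the test invokes. The sketch below uses only the standard `inspect` module; `accepts_keyword` and `FakeMatMul` are hypothetical names made up for this illustration, with `FakeMatMul` standing in for whatever object the test calls as `bsmm`.

```python
import inspect


def accepts_keyword(func, name):
    """Return True/False if `func` can/cannot take keyword `name`.

    Returns None when the signature cannot be introspected
    (for example, for some C-extension callables).
    """
    try:
        params = inspect.signature(func).parameters.values()
    except (TypeError, ValueError):
        return None
    for p in params:
        if p.kind is inspect.Parameter.VAR_KEYWORD:
            return True  # **kwargs swallows any keyword
        if p.name == name and p.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        ):
            return True
    return False


class FakeMatMul:
    """Hypothetical stand-in for a callable whose __call__ lacks dw_dtype."""

    def __call__(self, x, w, bench=0):
        return x


bsmm = FakeMatMul()
print(accepts_keyword(bsmm, "dw_dtype"))  # False
print(accepts_keyword(bsmm, "bench"))     # True
```

If the check returns False for a keyword a test passes, the test and the installed build most likely come from different revisions.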