Colab Notebook to run blocksparse

Open ecchochan opened this issue 6 years ago • 1 comments

I'd like to understand this awesome work better!

I opened a Colab Notebook to test blocksparse: https://colab.research.google.com/drive/1F7VofDAAXhwi46DX-HTmk1Hhq6XZB69g

Unfortunately, I ran into two problems:

1. When I run ./blocksparse/examples/transformer/enwik8.py, the loss goes to NaN:

    Starting epoch 0
    Not including 1 sequences
    Number of minibatches this epoch: 8789
    train iteration: 0, loss: 5.53922, bits per byte: 7.99140 ns:0.16947 gn:5.90084
    train iteration: 200, loss: nan, bits per byte: nan ns:0.00000 gn:nan
    train iteration: 400, loss: nan, bits per byte: nan ns:0.00000 gn:nan
    train iteration: 600, loss: nan, bits per byte: nan ns:0.00000 gn:nan

2. When I run ./blocksparse/test/blocksparse_matmul_test.py, the test fails with a TypeError:

    ERROR: testBlocksparseMatMul (__main__.BlocksparseMatMulTest)

    Traceback (most recent call last):
      File "./blocksparse/test/blocksparse_matmul_test.py", line 346, in testBlocksparseMatMul
        y = bsmm(y, w2, dw_dtype=dtF, bench=repeat) # (bench and j==depth-1) (bench and j==0)
    TypeError: __call__() got an unexpected keyword argument 'dw_dtype'

Ran 2 tests in 1.389s

FAILED (errors=1, skipped=1)

Can anyone help?

May 02 '19 10:05 ecchochan

It's hard to say what the problem is without knowing more about your setup. This code hasn't gotten much testing outside of our research env, which is all V100 based now.
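For anyone reporting a similar problem, a rough sketch of how the setup details could be gathered is shown here. It uses only the Python standard library plus the `nvidia-smi` tool that ships with the NVIDIA driver. The helper name `collect_setup_report` is made up for illustration and is not part of blocksparse, and the fields it collects are only a guess at what would be useful.

```python
import platform
import shutil
import subprocess
import sys


def collect_setup_report():
    """Gather basic environment details worth pasting into a bug report.

    Hypothetical helper, not part of blocksparse.
    """
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    # nvidia-smi ships with the NVIDIA driver; it may be absent on CPU-only hosts.
    nvidia_smi = shutil.which("nvidia-smi")
    if nvidia_smi is None:
        report["gpu"] = "nvidia-smi not found"
        return report
    try:
        result = subprocess.run(
            [nvidia_smi, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=False,
            timeout=30,
        )
        report["gpu"] = result.stdout.strip() or result.stderr.strip()
    except (OSError, subprocess.TimeoutExpired) as exc:
        report["gpu"] = f"nvidia-smi failed: {exc}"
    return report


if __name__ == "__main__":
    for key, value in collect_setup_report().items():
        print(f"{key}: {value}")
```

Pasting that output alongside the TensorFlow and CUDA versions in use would make it easier to compare the Colab environment with a V100 machine.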

As far as the test files go, I need to refresh those. I'm frequently tweaking this lib and doing one-off tests so the local state of them can diverge. I'll clean things up today and check in versions where everything should work.
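In the meantime, a quick way to confirm that a test file and the installed library have drifted apart is to inspect the callable's signature before calling it. The sketch below is generic Python, not blocksparse code: `OldStyleOp` and `NewStyleOp` are hypothetical stand-ins, and the real op's signature may look different.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldStyleOp:
    """Hypothetical stand-in for an installed op that lacks `dw_dtype`."""

    def __call__(self, x, w, bench=0):
        return x


class NewStyleOp:
    """Hypothetical stand-in for an op whose __call__ accepts `dw_dtype`."""

    def __call__(self, x, w, dw_dtype=None, bench=0):
        return x


if __name__ == "__main__":
    print(accepts_keyword(OldStyleOp(), "dw_dtype"))
    print(accepts_keyword(NewStyleOp(), "dw_dtype"))
```

Running the same check against the op object built in the test (the `bsmm` variable in the traceback) would show whether the installed build knows about `dw_dtype` at all.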

May 03 '19 16:05 scott-gray