Yoni Kremer
In my experience with TensorFlow, the time estimation is pretty accurate, so I think a best-effort estimate should give accurate results.
I'm pretty sure dataset size means the size of the raw text.
Seems like the problem is loading `data/saved_embeddings/train/c6-whitened-256_4.parquet.gzip` into a dataframe.
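For reference, a minimal sketch of what that load might look like (the reader choice is my assumption; the thread doesn't say how the file is opened):

```python
import pandas as pd

# Load the gzip-compressed parquet file into a dataframe.
# Parquet records its compression in the file's own metadata,
# so the ".gzip" suffix is just part of the filename and no
# explicit compression argument is needed.
df = pd.read_parquet("data/saved_embeddings/train/c6-whitened-256_4.parquet.gzip")
print(df.shape)
```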
Two tests fail due to the issue: `TestArrayObjectComparison::test_eq_object` and `TestArrayObjectComparison::test_ne_object`.
@kmaehashi @leofang I get why you don't want to implement it that way, but cupy is supposed to be NumPy-compatible. In addition, some tests fail due to this issue: In...
In numpy 2.1, I get:
```
>>> x_np = np.array([4]).astype(np.float32)
>>> y1_np = np.array([2])  # int64
>>> y2_np = np.array(2)  # int64
>>> y3_np = 2
>>> x_np / y1_np...
```
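As an aside (my own illustration, not from the original comment), `np.result_type` shows the promotion rules behind these cases without running the divisions:

```python
import numpy as np

# Under NEP 50 (numpy >= 2.0), arrays (including 0-d arrays) carry their
# dtype, while a bare Python int is a "weak" scalar that defers to the
# other operand's dtype.
print(np.result_type(np.float32, np.int64))  # float64: float32 array / int64 array
print(np.result_type(np.float32, 2))         # float32: float32 array / Python int
```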
@takagi Can you close the issue?
I started thinking about it: in most cases, top-k is very small compared to the vocab size (100 vs 100k), so maybe storing the results as a sparse tensor would...
I think that, later on, computing softmax and sampling from a sparse tensor should be much, much faster.
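A rough sketch of that idea (the function name and the numpy-based implementation are my own illustration, not from the thread): the softmax and the sampling touch only the k kept logits instead of the full vocab-sized vector.

```python
import numpy as np

def sample_top_k(logits, k, rng=None):
    """Sample one index from the top-k entries of a logits vector."""
    rng = rng or np.random.default_rng()
    # Indices of the k largest logits (their relative order doesn't matter).
    top_idx = np.argpartition(logits, -k)[-k:]
    top_logits = logits[top_idx]
    # Softmax over just k values; subtract the max before exp() for stability.
    z = np.exp(top_logits - top_logits.max())
    probs = z / z.sum()
    return rng.choice(top_idx, p=probs)

# Example: vocab of 100k, sample from the top 100.
logits = np.random.default_rng(0).standard_normal(100_000)
print(sample_top_k(logits, k=100))
```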
How can I check the numeric stability of the kernel?
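One common approach (an assumption on my part, not an answer from the thread) is to run the kernel in float32 and compare it against a float64 reference on the same input; the helper below is hypothetical, with softmax standing in for the kernel:

```python
import numpy as np

def softmax(v):
    # Max-subtraction keeps exp() in range regardless of dtype.
    z = np.exp(v - v.max())
    return z / z.sum()

def max_relative_error(kernel, reference, x):
    """Worst-case relative error of a float32 kernel vs. a float64 reference."""
    got = np.asarray(kernel(x.astype(np.float32)), dtype=np.float64)
    want = np.asarray(reference(x.astype(np.float64)))
    return np.max(np.abs(got - want) / np.maximum(np.abs(want), np.finfo(np.float64).tiny))

x = np.random.default_rng(1).standard_normal(10_000) * 50.0
# A stable kernel should stay near float32 precision (~1e-7) rather than blowing up.
print(max_relative_error(softmax, softmax, x))
```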