zhuangbility

Results 5 issues of zhuangbility

Hi! Do you utilize the sector cache in the dgemm kernel? And could you please tell me how to control the sector cache with assembly? According to the documentation, it...

Hi! Thank you for your nice paper. I'm interested in reproducing the accuracy of full-batch GraphSAGE on ogbn-papers100M reported in your paper (Table 1: 65.8% and 66.3%)....

Hi! Your work is pretty good! I have a question about irregular-shaped GEMM: how can irregular-shaped GEMM be implemented more efficiently? What's the main idea?

Hi! I'm trying to download the `IGB-large` dataset, but I found that it's very slow to download the raw feature file (e.g., the .npy file). Do you have any plan about...

enhancement
question

Hi! I use `dist.all_to_all_single` with torch.distributed and torch.ccl. I found that when the send and recv buffers are large (several gigabytes), the problem of `segment fault`...