FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
Differential Revision: D38716708
Add AVX-512 transposes specialized for Mx2, Mx4, 2xN, and 4xN shapes to improve transpose performance for those shapes. * When the shape is Mx2 or...
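For orientation, here is a minimal scalar sketch of what a transpose of a tall, narrow matrix (for example Mx2) computes. It is not the AVX-512 kernel from this change, and `transpose_ref` is a hypothetical name rather than an FBGEMM function; a vectorized version would have to produce the same output as this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference, not FBGEMM's implementation.
// Transposes a row-major rows x cols matrix into a row-major cols x rows matrix.
std::vector<float> transpose_ref(const std::vector<float>& src,
                                 std::size_t rows,
                                 std::size_t cols) {
  assert(src.size() == rows * cols);  // input must be a full rows x cols matrix
  std::vector<float> dst(src.size());
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      // Element (i, j) of the source becomes element (j, i) of the result.
      dst[j * rows + i] = src[i * cols + j];
    }
  }
  return dst;
}
```

With `cols` equal to 2 or 4, each source row holds only a few elements, which is presumably why shape-specific kernels can outperform a generic blocked transpose.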
Building FBGEMM with clang 14 fails. The errors look like this: `In file included from /home/juanpaez/FBGEMM/src/EmbeddingSpMDM.cc:11: In file included from /home/juanpaez/FBGEMM/third_party/asmjit/src/asmjit/asmjit.h:27: In file included from /home/juanpaez/FBGEMM/third_party/asmjit/src/asmjit/./core.h:2009: In file included from...
Make @weihanmines's PR https://github.com/ROCmSoftwarePlatform/FBGEMM/pull/13 upstreamable. @sryap, would you please review the PR and consider converting it to a draft? Thank you.
Summary: In the current code, we generate a different input batch for each iteration. For training benchmarks, we use a global batch size of >= 64K. So, it becomes expensive (in...
We are trying to compile FBGEMM_gpu from source and are running into some errors during the build process that we suspect are related to an unsupported CUDA version. Please let...
On line 641 of QuantUtilsTest.cc I am getting the error "error: comparison of integer expressions of different signedness: ‘std::vector::size_type’ {aka ‘long unsigned int’} and ‘int’ [-Werror=sign-compare] [build] 641 |...
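The warning above fires whenever an unsigned `size_type` is compared with a signed `int`. The sketch below shows one common way to silence it with an explicit cast. The function `has_expected_size` is a hypothetical illustration and is not taken from QuantUtilsTest.cc, whose line 641 is not shown in the report.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only.
// Returns true when `values` holds exactly `expected` elements.
bool has_expected_size(const std::vector<float>& values, int expected) {
  // A negative int would wrap to a huge value when converted to size_t,
  // so reject it before casting.
  assert(expected >= 0);
  // `values.size() == expected` compares size_t with int and triggers
  // -Wsign-compare; the explicit cast makes both operands unsigned.
  return values.size() == static_cast<std::size_t>(expected);
}
```

An alternative is to declare the count or loop index as `std::size_t` in the first place, which avoids the cast entirely.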
Does anyone have a conanfile.py created for this dependency?
Sharing optimized prototype kernel for 2D dense_to_jagged operations using vectorized operations. CC: @mjanderson09
Summary: Add the APIs for using UVM where the preferred location is on GPU device instead of on CPU device. Differential Revision: D36657705