QPyTorch
Enablement on ROCm
This PR contains the following two changes:
- __forceinline__ needs both inline and always_inline on ROCm.
- extra_include_paths is required for the hipification of the quant_cuda header files.
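For reviewers, the first change can be pictured roughly as below. This is a minimal host-only sketch, not the code in this PR: the #ifndef guard and the helper names float_to_bits and bits_to_float are hypothetical, and the macro body assumes the intended expansion is inline plus the always_inline attribute, as described above.

```cpp
// Sketch only: the guard and the helper names are illustrative assumptions.
#include <cstdint>
#include <cstring>

// Provide __forceinline__ only when the toolchain has not already defined it
// as a macro. Assumed expansion: inline plus the always_inline attribute.
#ifndef __forceinline__
#define __forceinline__ inline __attribute__((always_inline))
#endif

// Hypothetical helper: reinterpret the bits of a float as a 32-bit integer.
__forceinline__ std::uint32_t float_to_bits(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));  // well-defined type punning
    return bits;
}

// Hypothetical helper: reinterpret a 32-bit integer as a float.
__forceinline__ float bits_to_float(std::uint32_t bits) {
    float x;
    std::memcpy(&x, &bits, sizeof(x));
    return x;
}
```

The guard keeps the sketch from redefining the macro when a toolchain header already supplies it.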
Hi @Tiiiger, could you please review this PR?
Hi @rraminen, +1 for this PR. I have one proposal: instead of adding a definition of __forceinline__, we can add
diff --git a/qtorch/quant/quant_cuda/bit_helper.cu b/qtorch/quant/quant_cuda/bit_helper.cu
index 794255f..c741d58 100644
--- a/qtorch/quant/quant_cuda/bit_helper.cu
+++ b/qtorch/quant/quant_cuda/bit_helper.cu
@@ -1,3 +1,5 @@
+#include <cuda.h>
+
#define FLOAT_TO_BITS(x) (*reinterpret_cast<unsigned int*>(x))
#define BITS_TO_FLOAT(x) (*reinterpret_cast<float*>(x))
diff --git a/qtorch/quant/quant_cuda/sim_helper.cu b/qtorch/quant/quant_cuda/sim_helper.cu
index d165793..5a81493 100644
--- a/qtorch/quant/quant_cuda/sim_helper.cu
+++ b/qtorch/quant/quant_cuda/sim_helper.cu
@@ -1,3 +1,4 @@
+#include <cuda.h>
#include "quant_kernel.h"
#include <cmath>
so that the definition from the HIP library is used instead.
Hi @stevenygd @Tiiiger, could you please look at this PR? Thanks in advance.