Latest libnvinfer-headers-dev breaks CUDA compilation

Open gslin opened this issue 8 months ago • 2 comments

I tried to rebuild KataGo after a routine NVIDIA driver upgrade, but the CMake configuration step failed:

-- Building 'katago' executable for GTP engine and other tools.
-- -DUSE_BACKEND=TENSORRT, using TensorRT backend.
-- Including Git revision in the compiled executable, specify -DNO_GIT_REVISION=1 to disable
-- Found Git: /usr/bin/git (found version "2.49.0")
-- Found CUDAToolkit: /usr/local/cuda/targets/x86_64-linux/include (found version "12.3.107")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Error at CMakeLists.txt:354 (message):
  TensorRT 8.5 or greater is required but ..  was found.


-- Configuring incomplete, errors occurred!

After some digging, I found the cause: /usr/include/x86_64-linux-gnu/NvInferVersion.h (installed by NVIDIA's official libnvinfer-headers-dev package on Ubuntu 22.04) has changed. Its version macros are now defined in two tiers:

#define TRT_MAJOR_ENTERPRISE 10
#define TRT_MINOR_ENTERPRISE 11
#define TRT_PATCH_ENTERPRISE 0
#define TRT_BUILD_ENTERPRISE 33
#define NV_TENSORRT_MAJOR TRT_MAJOR_ENTERPRISE //!< TensorRT major version.
#define NV_TENSORRT_MINOR TRT_MINOR_ENTERPRISE //!< TensorRT minor version.
#define NV_TENSORRT_PATCH TRT_PATCH_ENTERPRISE //!< TensorRT patch version.
#define NV_TENSORRT_BUILD TRT_BUILD_ENTERPRISE //!< TensorRT build number.

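But KataGo's CMakeLists.txt searches for numeric literals directly after each NV_TENSORRT_* macro name, so the patterns no longer match and the detected version comes out empty: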

  string(REGEX MATCH "#define NV_TENSORRT_MAJOR ([0-9]+)" tensorrt_version_macro ${tensorrt_version_header})
  set(TENSORRT_VERSION_MAJOR ${CMAKE_MATCH_1})
  string(REGEX MATCH "#define NV_TENSORRT_MINOR ([0-9]+)" tensorrt_version_macro ${tensorrt_version_header})
  set(TENSORRT_VERSION_MINOR ${CMAKE_MATCH_1})
  string(REGEX MATCH "#define NV_TENSORRT_PATCH ([0-9]+)" tensorrt_version_macro ${tensorrt_version_header})
  set(TENSORRT_VERSION_PATCH ${CMAKE_MATCH_1})
  set(TENSORRT_VERSION "${TENSORRT_VERSION_MAJOR}.${TENSORRT_VERSION_MINOR}.${TENSORRT_VERSION_PATCH}")

I found that CMake provides CUDAToolkit_VERSION, but I'm not sure whether it is okay to use here.

gslin, May 15 '25 10:05

You can try editing CMakeLists.txt, replacing:

  • NV_TENSORRT_MAJOR with TRT_MAJOR_ENTERPRISE
  • NV_TENSORRT_MINOR with TRT_MINOR_ENTERPRISE
  • NV_TENSORRT_PATCH with TRT_PATCH_ENTERPRISE

foxrainowo, May 23 '25 19:05

That would break older versions, because TRT_MAJOR_ENTERPRISE was introduced in 10.11 and does not exist in earlier releases:

  • https://docs.nvidia.com/deeplearning/tensorrt/10.11.0/_static/c-api/_nv_infer_version_8h.html
  • https://docs.nvidia.com/deeplearning/tensorrt/10.10.0/_static/c-api/_nv_infer_version_8h.html
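
One possible way to support both header formats is to try the numeric form first and, only if that fails, follow the macro alias one level. The following is a rough sketch, not a tested patch: extract_tensorrt_version_part is a hypothetical helper name, and it assumes the alias (e.g. TRT_MAJOR_ENTERPRISE) is defined as a plain number in the same NvInferVersion.h, as in the excerpt above. It reuses the tensorrt_version_header variable from the existing CMakeLists.txt.

```cmake
# Hypothetical helper (not part of KataGo): resolve NV_TENSORRT_<part> from
# the header text, whether it is a numeric literal (older headers) or an
# alias to another macro (the two-tier format seen with 10.11).
function(extract_tensorrt_version_part header part out_var)
  set(value "")
  if(header MATCHES "#define NV_TENSORRT_${part}[ \t]+([0-9]+)")
    # Old format: "#define NV_TENSORRT_MAJOR 10"
    set(value "${CMAKE_MATCH_1}")
  elseif(header MATCHES "#define NV_TENSORRT_${part}[ \t]+([A-Za-z_][A-Za-z0-9_]*)")
    # New format: "#define NV_TENSORRT_MAJOR TRT_MAJOR_ENTERPRISE"
    set(alias "${CMAKE_MATCH_1}")
    # Follow the alias one level: "#define TRT_MAJOR_ENTERPRISE 10"
    # Assumption: the alias is defined as a number in the same header.
    if(header MATCHES "#define ${alias}[ \t]+([0-9]+)")
      set(value "${CMAKE_MATCH_1}")
    endif()
  endif()
  # Empty result means neither format was recognized.
  set(${out_var} "${value}" PARENT_SCOPE)
endfunction()

# Quote the header text so it is passed as a single argument.
extract_tensorrt_version_part("${tensorrt_version_header}" MAJOR TENSORRT_VERSION_MAJOR)
extract_tensorrt_version_part("${tensorrt_version_header}" MINOR TENSORRT_VERSION_MINOR)
extract_tensorrt_version_part("${tensorrt_version_header}" PATCH TENSORRT_VERSION_PATCH)
set(TENSORRT_VERSION "${TENSORRT_VERSION_MAJOR}.${TENSORRT_VERSION_MINOR}.${TENSORRT_VERSION_PATCH}")
```

Because the numeric pattern is tried first, headers from before 10.11 should behave as they do today; the alias lookup only runs when no number follows the macro name directly.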

gslin, May 24 '25 01:05