The latest TensorRT version fails to load. Can the b28 network be shared in advance? I want to test it.

Open awsjgy opened this issue 2 years ago • 7 comments

awsjgy avatar Dec 30 '23 15:12 awsjgy

It's not trained enough, but you can get early versions of that at the discord https://discord.gg/bqkZAz3

lightvector avatar Dec 30 '23 18:12 lightvector

I can only use the latest OpenCL version. The TensorRT version still doesn't work after I download it.

Quoting lightvector: It's not trained enough, but you can get early versions of it on the Discord: https://discord.gg/bqkZAz3

awsjgy avatar Dec 30 '23 18:12 awsjgy

The problem could be in your GPU drivers, or the versions of CUDA or TensorRT you have or something else. Are you using all the recommended versions of CUDA and TensorRT? Do you have the DLLs for those installations in your search path (easiest way to work around if not would be to copy them to the katago exe directory)? If you're in the discord, you can try asking for help. There are people there with technical experience, and there are also a few people there that speak Chinese if it would make things easier to debug.
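As a rough illustration of the search-path check described above, the sketch below lists which directories contain a file matching each DLL name pattern, looking in the katago exe directory first and then in every `PATH` entry. The function name `find_dlls` and the example patterns (`nvinfer*.dll`, `cudart64_*.dll`) are assumptions made for illustration; the exact files required depend on the CUDA and TensorRT versions your KataGo build expects.

```python
import fnmatch
import os
from pathlib import Path


def find_dlls(patterns, exe_dir, path_env=None):
    """Return {pattern: [directories containing a matching file]}."""
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    # The executable's own directory is checked first, then each PATH entry.
    candidates = [Path(exe_dir)] + [Path(p) for p in path_env.split(os.pathsep) if p]
    found = {pattern: [] for pattern in patterns}
    for directory in candidates:
        try:
            names = [e.name.lower() for e in directory.iterdir() if e.is_file()]
        except OSError:
            continue  # missing or unreadable directory
        for pattern in patterns:
            hit = any(fnmatch.fnmatchcase(n, pattern.lower()) for n in names)
            if hit and str(directory) not in found[pattern]:
                found[pattern].append(str(directory))
    return found


if __name__ == "__main__":
    # Hypothetical patterns and path; adjust to the files your build needs.
    report = find_dlls(["nvinfer*.dll", "cudart64_*.dll"], exe_dir=".")
    for pattern, dirs in report.items():
        print(pattern, "->", dirs if dirs else "NOT FOUND")
```

A pattern that reports no directory is a candidate for the "copy it next to the exe" workaround. A pattern that reports several directories may point to conflicting installed versions.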

lightvector avatar Dec 30 '23 18:12 lightvector

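Quoting lightvector: The problem could be in your GPU drivers, or in the versions of CUDA or TensorRT you have, or something else. Are you using all the recommended versions of CUDA and TensorRT? Do you have the DLLs for those installations in your search path (if not, the easiest workaround is to copy them to the katago exe directory)? If you're in the Discord, you can try asking for help. There are people there with technical experience, and a few who speak Chinese, if that would make debugging easier.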

I downloaded the recommended CUDA and TensorRT versions you listed above, but KataGo still fails to load TensorRT. Maybe the DLLs are the cause, but the previous TensorRT release I downloaded worked normally. Where can I download the DLLs? Also, if I only have the latest TensorRT engine but not b28, is its playing strength stronger than the original engine?

awsjgy avatar Dec 30 '23 18:12 awsjgy

After downloading TensorRT, you will find a "lib" folder inside it. Copy the following two files from that lib folder into the same folder as katago.exe. [image: untitled]
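The copy step can also be scripted. This is a minimal sketch, assuming you already know the names of the two files (they appear only in the screenshot, so the script takes them as a parameter). The function name `copy_to_exe_dir` and all paths are hypothetical.

```python
import shutil
from pathlib import Path


def copy_to_exe_dir(lib_dir, exe_dir, filenames):
    """Copy the named files from the TensorRT lib folder next to katago.exe."""
    lib_dir, exe_dir = Path(lib_dir), Path(exe_dir)
    copied = []
    for name in filenames:
        src = lib_dir / name
        if not src.is_file():
            # Fail loudly so a mistyped name is not silently skipped.
            raise FileNotFoundError(f"{name} not found in {lib_dir}")
        dst = exe_dir / name
        shutil.copy2(src, dst)  # copy2 also preserves file timestamps
        copied.append(dst)
    return copied
```

Call it with your own TensorRT lib path, your katago.exe folder, and the two file names. Copying leaves the TensorRT installation untouched, so the workaround is easy to undo by deleting the copies.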

hope366 avatar Dec 30 '23 22:12 hope366

Can anyone send me the latest 28b, and the compiled 1.4 engine?

awsjgy avatar Dec 31 '23 16:12 awsjgy

The error you posted https://github.com/lightvector/KataGo/issues/877#issue-2061113738 looks tricky to diagnose - the backend is starting and then failing in an unknown way presumably when trying to initialize tensorrt. Have you tried @hope366's suggestion? Additionally, does the problem happen only for the b28, or does it also happen for smaller networks? And does the problem happen if you only use 1 GPU, or 2 GPUs, and choose much smaller settings for the number of threads and other parameters, or does it still happen then?
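One way to work through these questions systematically is to list every combination of network, GPU count, and thread count, then try the cheapest combinations first. The sketch below only builds that list and does not launch KataGo. The function name `isolation_runs`, the network labels, and the numeric values are placeholders, not settings taken from this issue.

```python
from itertools import product


def isolation_runs(networks, gpu_counts, thread_counts):
    """Return every combination, ordered from fewest GPUs and threads upward."""
    runs = [
        {"network": n, "gpus": g, "threads": t}
        for n, g, t in product(networks, gpu_counts, thread_counts)
    ]
    # sorted() is stable, so networks keep their given order within each tier.
    return sorted(runs, key=lambda r: (r["gpus"], r["threads"]))


if __name__ == "__main__":
    # Placeholder values: a smaller network first, then b28.
    for run in isolation_runs(["smaller network", "b28"], [1, 2], [2, 16]):
        print(run)
```

If the smallest run already fails, the network size and multi-GPU settings are unlikely to be the cause. If only the larger runs fail, the first failing combination shows which variable to look at.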

lightvector avatar Dec 31 '23 19:12 lightvector