[Nvidia Jetson TX2] LogicError: cuMemHostRegister failed: operation not supported
I have an Nvidia Jetson TX2 (ARM64/aarch64, 256-core GPU). I installed PyCUDA on the board following the instructions by @santhosh.dc in https://forums.developer.nvidia.com/t/is-the-memory-management-method-of-tx1-and-tx2-different/50650/14.
But test_driver.py fails with the error in the issue title: "LogicError: cuMemHostRegister failed: operation not supported".
```
uniskytx2@uniskytx2:~/pycuda_install/pycuda-2019.1.2/test$ python3 test_driver.py 'TestDriver().test_register_host_memory()'
Traceback (most recent call last):
  File "test_driver.py", line 965, in <module>
```
Here is the failure output from test_driver.py:
```
===================================================== FAILURES ======================================================
_______________________________________ TestDriver.test_register_host_memory ________________________________________

args = (<test_driver.TestDriver object at 0x7fa393c748>,), kwargs = {}
pycuda = <module 'pycuda' from '/usr/local/lib/python3.6/dist-packages/pycuda-2019.1.2-py3.6-linux-aarch64.egg/pycuda/__init__.py'>
ctx = <pycuda._driver.Context object at 0x7fa457dd40>
clear_context_caches = <function clear_context_caches at 0x7fab66fe18>, collect =

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
            inner_f(*args, **kwargs)

/usr/local/lib/python3.6/dist-packages/pycuda-2019.1.2-py3.6-linux-aarch64.egg/pycuda/tools.py:462:

self = <test_driver.TestDriver object at 0x7fa393c748>

    @mark_cuda_test
    def test_register_host_memory(self):
        if drv.get_version() < (4,):
            from py.test import skip
            skip("register_host_memory only exists on CUDA 4.0 and later")
        import sys
        if sys.platform == "darwin":
            from py.test import skip
            skip("register_host_memory is not supported on OS X")
        a = drv.aligned_empty((2**20,), np.float64)
>       a_pin = drv.register_host_memory(a)
E       pycuda._driver.LogicError: cuMemHostRegister failed: operation not supported

test_driver.py:832: LogicError
================================================= warnings summary ==================================================
/usr/local/lib/python3.6/dist-packages/pycuda-2019.1.2-py3.6-linux-aarch64.egg/pycuda/tools.py:477
  /usr/local/lib/python3.6/dist-packages/pycuda-2019.1.2-py3.6-linux-aarch64.egg/pycuda/tools.py:477: PytestUnknownMarkWarning: Unknown pytest.mark.cuda - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    return mark_test.cuda(f)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
============================================== short test summary info ==============================================
FAILED test_driver.py::TestDriver::test_register_host_memory - pycuda._driver.LogicError: cuMemHostRegister failed...
===================================== 1 failed, 28 passed, 1 warning in 15.12s ======================================
```
This is a CUDA limitation. register_host_memory won't work on ARM; the rest of PyCUDA should work fine. I'd welcome a patch detecting this and xfailing the test.
https://forums.developer.nvidia.com/t/cudaerrornotsupported-when-calling-cv-cudahostregister-on-nvidia-tx2/60236
As suggested in that thread, specify the device mapping at allocation time instead of trying to register already-allocated host memory.
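A hedged sketch of what that could look like with PyCUDA (untested on a TX2 here; `mapped_empty` is a name I made up, and the idea is that `DEVICEMAP` asks the driver to map the page-locked allocation into the device address space at alloc time, so no later `cuMemHostRegister` call is needed):

```python
import numpy as np


def mapped_empty(shape, dtype=np.float64):
    """Page-locked host array that is device-mapped at allocation time,
    sidestepping cuMemHostRegister (unsupported on Tegra)."""
    # Deferred import so the helper is importable on machines without CUDA.
    import pycuda.driver as drv
    return drv.pagelocked_empty(
        shape, dtype,
        mem_flags=drv.host_alloc_flags.DEVICEMAP,  # map on alloc
    )
```

If I read the PyCUDA docs correctly, kernels can then reach the array through `a.base.get_device_pointer()`, provided the context was created with the `MAP_HOST` flag.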