
[Nvidia Jetson TX2] LogicError: cuMemHostRegister failed: operation not supported

Open jangminhyeok opened this issue 5 years ago • 1 comments

I have an Nvidia Jetson TX2 (ARM64/aarch64, 256-core GPU). I installed pycuda on the board following the instructions posted by @santhosh.dc in https://forums.developer.nvidia.com/t/is-the-memory-management-method-of-tx1-and-tx2-different/50650/14.

But when running test_driver.py, I hit the error in the issue title: "LogicError: cuMemHostRegister failed: operation not supported".

```
uniskytx2@uniskytx2:~/pycuda_install/pycuda-2019.1.2/test$ python3 test_driver.py 'TestDriver().test_register_host_memory()'
Traceback (most recent call last):
  File "test_driver.py", line 965, in <module>
    exec(sys.argv[1])
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/pycuda-2019.1.2-py3.6-linux-aarch64.egg/pycuda/tools.py", line 462, in f
    inner_f(*args, **kwargs)
  File "test_driver.py", line 832, in test_register_host_memory
    a_pin = drv.register_host_memory(a)
pycuda._driver.LogicError: cuMemHostRegister failed: operation not supported
```

And here are the failure details from the full test_driver.py run:

```
===================================================== FAILURES ======================================================
_______________________________________ TestDriver.test_register_host_memory ________________________________________

args = (<test_driver.TestDriver object at 0x7fa393c748>,), kwargs = {}
pycuda = <module 'pycuda' from '/usr/local/lib/python3.6/dist-packages/pycuda-2019.1.2-py3.6-linux-aarch64.egg/pycuda/__init__.py'>
ctx = <pycuda._driver.Context object at 0x7fa457dd40>
clear_context_caches = <function clear_context_caches at 0x7fab66fe18>, collect = <built-in function collect>

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()

        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python3.6/dist-packages/pycuda-2019.1.2-py3.6-linux-aarch64.egg/pycuda/tools.py:462:

self = <test_driver.TestDriver object at 0x7fa393c748>

    @mark_cuda_test
    def test_register_host_memory(self):
        if drv.get_version() < (4,):
            from py.test import skip
            skip("register_host_memory only exists on CUDA 4.0 and later")

        import sys
        if sys.platform == "darwin":
            from py.test import skip
            skip("register_host_memory is not supported on OS X")

        a = drv.aligned_empty((2**20,), np.float64)
>       a_pin = drv.register_host_memory(a)
E       pycuda._driver.LogicError: cuMemHostRegister failed: operation not supported

test_driver.py:832: LogicError
================================================= warnings summary ==================================================
/usr/local/lib/python3.6/dist-packages/pycuda-2019.1.2-py3.6-linux-aarch64.egg/pycuda/tools.py:477
  /usr/local/lib/python3.6/dist-packages/pycuda-2019.1.2-py3.6-linux-aarch64.egg/pycuda/tools.py:477: PytestUnknownMarkWarning: Unknown pytest.mark.cuda - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    return mark_test.cuda(f)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
============================================== short test summary info ==============================================
FAILED test_driver.py::TestDriver::test_register_host_memory - pycuda._driver.LogicError: cuMemHostRegister failed...
===================================== 1 failed, 28 passed, 1 warning in 15.12s ======================================
```

jangminhyeok avatar Mar 20 '20 17:03 jangminhyeok

This is a CUDA limitation. register_host_memory won't work on ARM; the rest of PyCUDA should work fine. I'd welcome a patch detecting this and xfailing the test.
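One possible shape for such a detection (a sketch only; `host_register_supported` is a hypothetical helper, not an existing PyCUDA API):

```python
import platform


def host_register_supported():
    """Guess whether cuMemHostRegister works on this platform.

    cuMemHostRegister is reported as unsupported on ARM boards such as
    the Jetson TX2, so treat any ARM machine as unsupported.
    """
    return platform.machine() not in ("aarch64", "arm64", "armv7l")
```

The test could then call `pytest.xfail("cuMemHostRegister unsupported on ARM")` when this returns `False`.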

https://forums.developer.nvidia.com/t/cudaerrornotsupported-when-calling-cv-cudahostregister-on-nvidia-tx2/60236

As suggested in that thread, specify the mapped flag at allocation time instead of registering already-allocated host memory.
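For reference, a sketch of that workaround using PyCUDA's pagelocked allocator with the `DEVICEMAP` flag (untested on a TX2 here; `mapped_empty` is a hypothetical helper name):

```python
def mapped_empty(shape, dtype):
    """Allocate host memory that is GPU-mapped at allocation time,
    sidestepping the unsupported cuMemHostRegister path on Tegra.
    """
    import pycuda.driver as drv

    # pagelocked_empty with DEVICEMAP allocates via cuMemHostAlloc
    # with CU_MEMHOSTALLOC_DEVICEMAP, so the array is mapped into the
    # device address space from the start; no later registration needed.
    return drv.pagelocked_empty(
        shape, dtype, mem_flags=drv.host_alloc_flags.DEVICEMAP)
```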

inducer avatar Mar 23 '20 17:03 inducer