[BUGFIX] Fix out-of-range access in the NMS kernel
Description
This fixes an error found in an object detection case, which failed with the following error:
Traceback (most recent call last):
File "test_with_network.py", line 27, in <module>
print(arr)
File "/opt/mxnet/python/mxnet/gluon/block.py", line 825, in __call__
out = self.forward(*args)
File "/opt/mxnet/python/mxnet/gluon/block.py", line 1684, in forward
return self._call_cached_op(x, *args)
File "/opt/mxnet/python/mxnet/gluon/block.py", line 1233, in _call_cached_op
out = self._cached_op(*cargs)
File "/opt/mxnet/python/mxnet/_ctypes/ndarray.py", line 148, in __call__
check_call(_LIB.MXInvokeCachedOpEx(
File "/opt/mxnet/python/mxnet/base.py", line 246, in check_call
raise get_last_ffi_error()
mxnet.base.MXNetError: Traceback (most recent call last):
File "../include/mshadow/././././cuda/tensor_gpu-inl.cuh", line 147
Name: Check failed: err == cudaSuccess (700 vs. 0) : MapPlanKernel ErrStr:an illegal memory access was encountered
[16:13:24] ../src/resource.cc:306: Ignore CUDA Error [16:13:24] ../src/storage/././storage_manager_helpers.h:135: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: CUDA: an illegal memory access was encountered
Checklist
Essentials
- [x] PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
- [x] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage
- [x] Code is well-documented
Changes
- [x] Remove the element_width limitation of 20 in CalculateGreedyNMSResultsKernel
Comments
- Credit for this fix goes to Przemyslaw Tredak.
Hey @TristonC, thanks for submitting the PR. All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]
CI supported jobs: [windows-cpu, unix-gpu, centos-cpu, windows-gpu, centos-gpu, clang, unix-cpu, sanity, edge, website, miscellaneous]
Note: Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin. All CI tests must pass before the PR can be merged.
@ptrendx @josephevans
@mxnet-bot run ci [centos-gpu, unix-cpu, unix-gpu, windows-gpu]
Jenkins CI successfully triggered : [unix-cpu, windows-gpu, centos-gpu, unix-gpu]
This is a legitimate failure. We are using C++17, which does not require a message in static_assert, but earlier versions of the standard do, and 1.x uses an older C++ standard. You need to add the message to the static_assert. This is the error message:
error: expected a comma (the one-argument version of static_assert is not enabled in this mode)
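For illustration, here is a minimal sketch of the two forms of static_assert. The names `kExampleWidth` and `is_valid_width` are hypothetical and are not taken from this PR; the condition actually checked in the kernel may differ.

```cpp
#include <cstddef>

// Hypothetical constant, used only to have something to check.
constexpr std::size_t kExampleWidth = 4;

// Hypothetical compile-time predicate (single return, so it is valid C++11).
constexpr bool is_valid_width(std::size_t width) {
  return width > 0;
}

// Two-argument form: accepted by C++11 and later.
static_assert(is_valid_width(kExampleWidth), "width must be positive");

// One-argument form: accepted only from C++17 onward. Older modes reject it
// with an error like the one quoted above.
// static_assert(is_valid_width(kExampleWidth));
```

The two-argument form compiles under both the older and the newer standard, so it should be safe to use in code shared between branches.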
@mxnet-bot run ci [unix-gpu]
Jenkins CI successfully triggered : [unix-gpu]