
Unsigned int overflow in PoolingDepthfirstGeneric

Open alvoron opened this issue 2 years ago • 6 comments

Output of 'strings libarm_compute.so | grep arm_compute_version':
arm_compute_version=v23.02
Build options: {'neon': '1', 'opencl': '0', 'openmp': '0', 'cppthreads': '1', 'examples': '0', 'Werror': '0', 'gemm_tuner': '0', 'reference_openmp': '0', 'validation_tests': '0', 'benchmark_tests': '0', 'data_layout_support': 'all', 'build_dir': '<project_dir>/thirdparty/ComputeLibrary', 'install_dir': '<project_dir>/thirdparty/ComputeLibrary/install', 'arch': 'armv8.2-a', 'debug': '1', 'asserts': '1', 'logging': '1', 'os': 'macos', 'build': 'native', 'compiler_prefix': '/usr/bin/', 'extra_cxx_flags': '-fPIC -fsigned-char -ffunction-sections -fdata-sections -fdiagnostics-show-option -Wundef -Wreturn-type -Wunused-variable -Wswitch -Wno-macro-redefined -Wno-undef -Wno-missing-declarations -fvisibility-inlines-hidden -Wall -Wno-unknown-pragmas -fvisibility=internal -mcpu=native -Wno-undef -Wno-error=return-stack-address'}
Git hash=b'f8f7ede7a01eb5cd9d06060b4d2f2d1404d93f29'

Platform: Apple M1

Operating System: macOS 12.6

Problem description: I hit an EXC_BAD_ACCESS crash in PoolingDepthfirstGeneric::compute_tile_padded() while using NEPoolingLayer with the NHWC layout. The unsigned variables valid_rows and valid_cols in PoolingDepthfirstGeneric::compute_tile_padded() wrap around when the padding sum is greater than the corresponding pool_window dimension (rows or cols):

const auto valid_rows = this->m_args.pool_window.rows - (pad_top + pad_bottom);
const auto valid_cols = this->m_args.pool_window.cols - (pad_left + pad_right); // 2 - (0 + 4) = 4294967294
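The wraparound in the snippet above can be reproduced in isolation. The following is a minimal sketch, assuming the operands are 32-bit unsigned integers (as the issue title and the value 4294967294 suggest). The helper names `valid_extent_unsigned` and `valid_extent_clamped` are hypothetical and not part of Compute Library, and the clamped variant is only one possible guard, not necessarily what the upstream fix does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not ACL code): mirrors the unsigned subtraction from
// the issue. When the padding sum exceeds the window size, the result wraps
// modulo 2^32 instead of going negative.
std::uint32_t valid_extent_unsigned(std::uint32_t window,
                                    std::uint32_t pad_before,
                                    std::uint32_t pad_after)
{
    return window - (pad_before + pad_after);
}

// Hypothetical guard (an assumption, not the upstream patch): perform the
// subtraction in a wider signed type and clamp negative results to zero.
std::uint32_t valid_extent_clamped(std::uint32_t window,
                                   std::uint32_t pad_before,
                                   std::uint32_t pad_after)
{
    const std::int64_t diff = static_cast<std::int64_t>(window)
                            - (static_cast<std::int64_t>(pad_before)
                               + static_cast<std::int64_t>(pad_after));
    return static_cast<std::uint32_t>(std::max<std::int64_t>(diff, 0));
}
```

With the values from the comment above (window 2, padding 0 and 4), the unsigned version yields 4294967294, while the clamped version yields 0. A loop bounded by the wrapped value would run far past the end of the tile, which is consistent with the reported EXC_BAD_ACCESS.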

Before running the NEPoolingLayer kernel, the validate() method was run to check the configuration.

alvoron, Mar 17 '23 16:03

The issue is not reproducible on Raspberry Pi.

alvoron, Jan 16 '24 12:01

Hi @alvoron

To help, I'll need more details.

Could you please share more information about the workload configuration that caused the problem on macOS? If you build ACL with logging=1, the library will print the arguments passed to ::configure().

morgolock, Jan 29 '24 15:01

[ComputeLibrary][31-01-2024 11:17:58][INFO]  arm_compute::cpu::CpuPool2d::configure() : 
 src: Shape=112,112,64,1,DataLayout=NHWC,DataType=F32
 dst: Shape=56,56,64,1,DataLayout=NHWC,DataType=F32
 pool_info: {Type=MAX,DataLayout=NHWC,IsGlobalPooling=0,PoolSize=3,3,PadStride=2,2;1,1,1,1}
 indices: nullptr

alvoron, Jan 31 '24 10:01

Hi @alvoron

Thanks, we managed to reproduce the problem and are working on a fix.

morgolock, Mar 13 '24 08:03

Hi @alvoron

This patch fixes the problem: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11290

The fix will be included in 24.05.

Hope this helps

morgolock, Mar 14 '24 11:03

Thank you for the patch. I'll test it as soon as we upgrade ACL to 24.05.

alvoron, Mar 15 '24 17:03

Closing as this was already delivered in the last release. Please reopen if you still require support.

morgolock, Jun 03 '24 10:06