Hang issue while loading openblas.so on low memory systems

Open vinithakv opened this issue 7 months ago • 4 comments

Hi, when loading libopenblas.so built with OpenMP support on a low-memory host, the program hangs in a loop trying to mmap buffers (in blas_memory_alloc). This can be reproduced with the following test case on a Linux system.

import os
import resource
import multiprocessing
from ctypes import cdll  # needed for cdll.LoadLibrary below

def limitMemory(maxRAM):
    bytesLimit = maxRAM * 1024 ** 3
    resource.setrlimit(resource.RLIMIT_AS, (bytesLimit, bytesLimit))

def limitCPU(numOfCPUs):
    try:
        CPUs = list(range(numOfCPUs))
        os.sched_setaffinity(0, CPUs)
    except AttributeError:
        print("Error restricting CPU resources")

limitMemory(2)
print("Memory limit set")
limitCPU(1)
print("CPU limit set")
pid = os.getpid()
print("The PID of the current process is: ", pid)
openblas_so = cdll.LoadLibrary("/usr/lib64/libopenblaso.so.0")
#import numpy as np
print(openblas_so)
#print(np)
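
Since the hang blocks the interpreter that loads the library, it may be easier to observe from a parent process that enforces a timeout. The following is a minimal sketch and not part of the original report; the helper name `run_snippet` and the timeout values are illustrative assumptions.

```python
import subprocess
import sys

def run_snippet(code, timeout_s):
    """Run `code` in a fresh Python interpreter.

    Returns 'ok' on exit status 0, 'error' on a non-zero exit status,
    and 'hang' if the child is still running after `timeout_s` seconds.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising the timeout.
        return "hang"
    return "ok" if proc.returncode == 0 else "error"
```

Passing the source of the test case above as `code` would let a script distinguish a load that hangs from one that fails or succeeds, without having to interrupt it by hand.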

vinithakv, Jun 02 '25 04:06

gdb backtrace

#1  0x00007ffff7688110 in sysmalloc () from /lib64/glibc-hwcaps/power10/libc.so.6
#2  0x00007ffff768930c in _int_malloc () from /lib64/glibc-hwcaps/power10/libc.so.6
#3  0x00007ffff768a194 in malloc () from /lib64/glibc-hwcaps/power10/libc.so.6
#4  0x00007ffff5b3b240 in alloc_malloc () from /home/builder/test-env/lib/python3.12/site-packages/openblas/lib/libopenblas.so.0
#5  0x00007ffff5b3c060 in blas_memory_alloc () from /home/builder/test-env/lib/python3.12/site-packages/openblas/lib/libopenblas.so.0
#6  0x00007ffff5b3dfa0 in blas_thread_init () from /home/builder/test-env/lib/python3.12/site-packages/openblas/lib/libopenblas.so.0
#7  0x00007ffff586e794 in gotoblas_init () from /home/builder/test-env/lib/python3.12/site-packages/openblas/lib/libopenblas.so.0
#8  0x00007ffff7f97ba0 in call_init (env=0x7ffffffff510, argv=0x7ffffffff4f8, argc=2, l=<optimized out>) at dl-init.c:70
#9  _dl_init (main_map=0x1001b1380, argc=<optimized out>, argv=0x7ffffffff4f8, env=0x7ffffffff510) at dl-init.c:117
#10 0x00007ffff7faa368 in call_dl_init (closure=<optimized out>) at dl-open.c:528
#11 0x00007ffff7781cc8 in _dl_catch_exception () from /lib64/glibc-hwcaps/power10/libc.so.6
#12 0x00007ffff7faa52c in dl_open_worker (a=<optimized out>) at dl-open.c:822
#13 dl_open_worker (a=<optimized out>) at dl-open.c:785
#14 0x00007ffff7781c4c in _dl_catch_exception () from /lib64/glibc-hwcaps/power10/libc.so.6
#15 0x00007ffff7fabe9c in _dl_open (file=0x7ffff7069ed0 "/home/builder/test-env/lib/python3.12/site-packages/openblas/lib/libopenblas.so.0",
    mode=<optimized out>, caller_dlopen=0x7ffff71db758 <py_dl_open+168>, nsid=-2, argc=<optimized out>, argv=0x7ffffffff4f8, env=0x7ffffffff510)
    at dl-open.c:898
#16 0x00007ffff766c750 in dlopen_doit () from /lib64/glibc-hwcaps/power10/libc.so.6
#17 0x00007ffff7781c4c in _dl_catch_exception () from /lib64/glibc-hwcaps/power10/libc.so.6
#18 0x00007ffff7781d40 in _dl_catch_error () from /lib64/glibc-hwcaps/power10/libc.so.6
#19 0x00007ffff7fc0ae8 in _rtld_catch_error (objname=<optimized out>, errstring=<optimized out>, mallocedp=<optimized out>, operate=<optimized out>,
    args=<optimized out>) at dl-error-skeleton.c:260
#20 0x00007ffff766bfb4 in _dlerror_run () from /lib64/glibc-hwcaps/power10/libc.so.6
#21 0x00007ffff766c834 in dlopen@GLIBC_2.17 () from /lib64/glibc-hwcaps/power10/libc.so.6
#22 0x00007ffff71db758 in py_dl_open () from /usr/lib64/python3.12/lib-dynload/_ctypes.cpython-312-powerpc64le-linux-gnu.so
#23 0x00007ffff79fc7b8 in cfunction_call () from /lib64/libpython3.12.so.1.0
#24 0x00007ffff79b5c58 in _PyObject_MakeTpCall () from /lib64/libpython3.12.so.1.0
#25 0x00007ffff79c40ac in _PyEval_EvalFrameDefault () from /lib64/libpython3.12.so.1.0
#26 0x00007ffff79bb978 in _PyObject_FastCallDictTstate () from /lib64/libpython3.12.so.1.0
#27 0x00007ffff7a14738 in slot_tp_init () from /lib64/libpython3.12.so.1.0
#28 0x00007ffff79b5c04 in _PyObject_MakeTpCall () from /lib64/libpython3.12.so.1.0
#29 0x00007ffff79c40ac in _PyEval_EvalFrameDefault () from /lib64/libpython3.12.so.1.0
#30 0x00007ffff7ae34d4 in PyEval_EvalCode () from /lib64/libpython3.12.so.1.0
#31 0x00007ffff7b26c60 in run_eval_code_obj () from /lib64/libpython3.12.so.1.0
#32 0x00007ffff7b1da24 in run_mod () from /lib64/libpython3.12.so.1.0
#33 0x00007ffff7b4b7c0 in pyrun_file () from /lib64/libpython3.12.so.1.0
#34 0x00007ffff7b4a808 in _PyRun_SimpleFileObject () from /lib64/libpython3.12.so.1.0
#35 0x00007ffff7b49808 in _PyRun_AnyFileObject () from /lib64/libpython3.12.so.1.0
#36 0x00007ffff7b3bb84 in Py_RunMain () from /lib64/libpython3.12.so.1.0
#37 0x00007ffff7ac18d4 in Py_BytesMain () from /lib64/libpython3.12.so.1.0
#38 0x0000000100000938 in main ()

vinithakv, Jun 02 '25 04:06

On low-memory systems, you will need to use the BUFFERSIZE option at build time to limit memory use to something the system can handle. This will affect the maximum matrix size that can be processed, but a low-memory system will usually have limited computing capacity anyway.
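
To get a rough feel for how a BUFFERSIZE value relates to an address-space limit like the 2 GiB one in the test case, here is a small arithmetic sketch. It assumes that BUFFERSIZE is a shift count n giving a per-buffer size of 32 << n bytes; that interpretation should be checked against Makefile.rule for the OpenBLAS version in use. The function names and the `num_buffers` parameter are hypothetical, since the real number of buffers depends on the build configuration.

```python
def buffer_bytes(buffersize_shift):
    # ASSUMPTION: per-buffer size is 32 << BUFFERSIZE bytes.
    return 32 << buffersize_shift

def fits_in_limit(buffersize_shift, num_buffers, limit_bytes):
    # True if `num_buffers` buffers of that size fit under the limit.
    # `num_buffers` is a hypothetical input, not a value read from OpenBLAS.
    return buffer_bytes(buffersize_shift) * num_buffers <= limit_bytes

limit = 2 * 1024 ** 3  # the 2 GiB RLIMIT_AS used in the test case
print(buffer_bytes(20))            # bytes per buffer for a shift of 20
print(fits_in_limit(20, 4, limit))
print(fits_in_limit(26, 2, limit))
```

This ignores everything else mapped into the process (the interpreter, shared libraries, thread stacks), so it only gives an upper bound on what could fit.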

martin-frbg, Jun 02 '25 05:06

Hi @martin-frbg, thank you for your response. I understand the current approach, but would it be possible to fail gracefully after a few failed mmap() or malloc() attempts, instead of retrying indefinitely? This might help avoid hangs on low-memory systems. Thanks, Vinitha
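
The behaviour being requested can be illustrated with a bounded retry loop. This is only a sketch of the idea in Python, not OpenBLAS code; `alloc_with_retry`, `try_alloc`, and the default of three attempts are hypothetical names and values chosen for illustration.

```python
def alloc_with_retry(try_alloc, max_attempts=3):
    """Call try_alloc() until it returns a buffer or the attempts run out.

    `try_alloc` is any callable that returns a buffer on success and
    None on failure. After `max_attempts` failures a MemoryError is
    raised instead of retrying forever.
    """
    for _ in range(max_attempts):
        buf = try_alloc()
        if buf is not None:
            return buf
    raise MemoryError(f"allocation failed after {max_attempts} attempts")
```

With a cap like this, a caller gets an error it can report or handle, rather than a process that never returns from the library load.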

vinithakv, Jun 02 '25 10:06

20250418

johnaAr555, Jun 05 '25 02:06