Crash issue due to refcount error with clang compiler
bool AbstractBuffersPool<PoolT, BufferType, BufferParentType>::isPoolBuffer(const BufferParentType *buffer) const {
    static_assert(std::is_base_of_v<BufferParentType, BufferType>);
    // With clang (libc++), the unique_ptr destructor sets its pointer to nullptr before the deleter runs,
    // so this comparison fails while the pool is being destroyed.
    return (buffer && this->mainStorage.get() == buffer);
}
ref: https://stackoverflow.com/questions/54237128/does-stdunique-ptr-set-its-underlying-pointer-to-nullptr-inside-its-destructor
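To make the ordering concrete, here is a minimal sketch of what a check like the one above can observe while the owned buffer is being deleted. `Pool`, `Buffer`, and both helper functions are hypothetical stand-ins, not the driver's classes. For an explicit `reset()`, the standard requires the stored pointer to be replaced before the deleter is called, so the check fails on any conforming library. For implicit destruction the result is left to the implementation, which is why the second helper documents no expected value.

```cpp
#include <cassert>
#include <memory>

struct Pool; // hypothetical stand-in for the buffers pool

// Hypothetical stand-in for a pooled buffer that asks its pool
// "am I your main storage?" from inside its own destructor.
struct Buffer {
    Buffer(const Pool *pool, bool *result) : pool(pool), result(result) {}
    ~Buffer();
    const Pool *pool;
    bool *result; // where the answer observed during deletion is stored
};

struct Pool {
    std::unique_ptr<Buffer> mainStorage;

    // Same comparison as in the snippet above.
    bool isPoolBuffer(const Buffer *buffer) const {
        return buffer && mainStorage.get() == buffer;
    }
};

Buffer::~Buffer() { *result = pool->isPoolBuffer(this); }

// Explicit reset(): the stored pointer is already nullptr when the
// deleter runs, so the buffer is not recognized.
bool recognizedDuringReset() {
    bool result = true;
    Pool pool;
    pool.mainStorage = std::make_unique<Buffer>(&pool, &result);
    assert(pool.isPoolBuffer(pool.mainStorage.get())); // recognized while alive
    pool.mainStorage.reset();
    return result;
}

// Implicit destruction: the answer depends on whether the library's
// ~unique_ptr() clears the pointer before or after calling the deleter.
bool recognizedDuringDestruction() {
    bool result = false;
    {
        Pool pool;
        pool.mainStorage = std::make_unique<Buffer>(&pool, &result);
    } // ~Pool() destroys mainStorage here
    return result;
}
```

Printing `recognizedDuringDestruction()` from builds against libstdc++ and libc++ is one way to compare the two libraries. `recognizedDuringReset()` should return false with either.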
Hi @Yanfeng-Mi, could you share more details about the issue, including the call stack?
The call stack is as follows:
#00 pc 0000000000067c0e /apex/com.android.runtime/lib64/bionic/libc.so (abort+206) (BuildId: 3f70d7b54a58b7ab204a797b00a4a7cb)
#01 pc 0000000000432da8 /vendor/lib64/libigdrcl.so (NEO::abortExecution()+8) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#02 pc 0000000000432e87 /vendor/lib64/libigdrcl.so (NEO::abortUnrecoverable(int, char const*)+55) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#03 pc 000000000050b1e5 /vendor/lib64/libigdrcl.so (NEO::ReferenceTrackedObject<NEO::Context>::decRefInternal()+117) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#04 pc 0000000000598d52 /vendor/lib64/libigdrcl.so (NEO::MemObj::~MemObj()+1458) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#05 pc 00000000005817d4 /vendor/lib64/libigdrcl.so (NEO::Buffer::~Buffer()+20) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#06 pc 0000000000603844 /vendor/lib64/libigdrcl.so (NEO::BufferHw<NEO::XeHpcCoreFamily>::~BufferHw()+20) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#07 pc 0000000000603868 /vendor/lib64/libigdrcl.so (NEO::BufferHw<NEO::XeHpcCoreFamily>::~BufferHw()+24) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#08 pc 000000000052c551 /vendor/lib64/libigdrcl.so (std::__1::default_delete<NEO::Buffer>::operator()(NEO::Buffer*) const+49) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#09 pc 0000000000520683 /vendor/lib64/libigdrcl.so (std::__1::unique_ptr<NEO::Buffer, std::__1::default_delete<NEO::Buffer> >::reset(NEO::Buffer*)+99) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#10 pc 0000000000526ac8 /vendor/lib64/libigdrcl.so (std::__1::unique_ptr<NEO::Buffer, std::__1::default_delete<NEO::Buffer> >::~unique_ptr()+24) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#11 pc 00000000005207a6 /vendor/lib64/libigdrcl.so (NEO::AbstractBuffersPool<NEO::Context::BufferPool, NEO::Buffer, NEO::MemObj>::~AbstractBuffersPool()+54) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#12 pc 0000000000520c14 /vendor/lib64/libigdrcl.so (NEO::Context::BufferPool::~BufferPool()+20) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#13 pc 0000000000522378 /vendor/lib64/libigdrcl.so (std::__1::allocator<NEO::Context::BufferPool>::destroy(NEO::Context::BufferPool*)+24) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#14 pc 000000000052234c /vendor/lib64/libigdrcl.so (void std::__1::allocator_traits<std::__1::allocator<NEO::Context::BufferPool> >::__destroy<NEO::Context::BufferPool>(std::__1::integral_constant<bool, true>, std::__1::allocator<NEO::Context::BufferPool>&, NEO::Context::BufferPool*)+28) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#15 pc 000000000052231c /vendor/lib64/libigdrcl.so (void std::__1::allocator_traits<std::__1::allocator<NEO::Context::BufferPool> >::destroy<NEO::Context::BufferPool>(std::__1::allocator<NEO::Context::BufferPool>&, NEO::Context::BufferPool*)+28) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#16 pc 00000000005222cf /vendor/lib64/libigdrcl.so (std::__1::__vector_base<NEO::Context::BufferPool, std::__1::allocator<NEO::Context::BufferPool> >::__destruct_at_end(NEO::Context::BufferPool*)+95) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#17 pc 0000000000522217 /vendor/lib64/libigdrcl.so (std::__1::__vector_base<NEO::Context::BufferPool, std::__1::allocator<NEO::Context::BufferPool> >::clear()+23) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#18 pc 0000000000528aad /vendor/lib64/libigdrcl.so (std::__1::vector<NEO::Context::BufferPool, std::__1::allocator<NEO::Context::BufferPool> >::clear()+45) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#19 pc 000000000051cff8 /vendor/lib64/libigdrcl.so (NEO::AbstractBuffersAllocator<NEO::Context::BufferPool, NEO::Buffer, NEO::MemObj>::releaseSmallBufferPool()+24) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#20 pc 000000000051ca58 /vendor/lib64/libigdrcl.so (NEO::Context::~Context()+168) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#21 pc 000000000051d228 /vendor/lib64/libigdrcl.so (NEO::Context::~Context()+24) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#22 pc 0000000000514ebd /vendor/lib64/libigdrcl.so (NEO::unique_ptr_if_unused<NEO::Context>::doDelete(NEO::Context*)+45) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#23 pc 000000000047bdd3 /vendor/lib64/libigdrcl.so (std::__1::unique_ptr<NEO::Context, void (*)(NEO::Context*)>::reset(NEO::Context*)+99) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#24 pc 000000000047bd68 /vendor/lib64/libigdrcl.so (std::__1::unique_ptr<NEO::Context, void (*)(NEO::Context*)>::~unique_ptr()+24) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#25 pc 0000000000439754 /vendor/lib64/libigdrcl.so (NEO::unique_ptr_if_unused<NEO::Context>::~unique_ptr_if_unused()+20) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
#26 pc 000000000043944b /vendor/lib64/libigdrcl.so (clReleaseContext+571) (BuildId: 5f66ed8b63a2c1f5fbd41c24c6ddde9c)
@Yanfeng-Mi Thanks for reporting the issue. Could you share repro steps?
@JablonskiMateusz To reproduce this issue, you need to rebuild the OCL RT driver with the clang toolchain. I found it on Android, where clang is the compiler in use. Rebuilding the OCL RT with clang on Ubuntu is not easy, because many libc++-related build errors have to be resolved first. The root cause is that unique_ptr destruction behaves differently in gcc (libstdc++) and clang (libc++). You can refer to my WA patch in the Android Celadon project: https://github.com/projectceladon/compute-runtime/commit/dcc1b6fc60518b25bb81cfd6450c129e089016b9
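Frames #09 and #10 of the call stack above show libc++'s `~unique_ptr()` going through `reset()`, which matches the root cause described here. As an illustration only, the sketch below shows one way such a check could be made independent of that ordering: the pool keeps a separate raw identity pointer that `unique_ptr` never clears. All names (`Pool`, `Buffer`, `mainStorageId`, `setMainStorage`, `recognizedWithIdentityPointer`) are hypothetical. This is not necessarily what the linked WA patch does, and it may not fit the driver's ownership rules.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Pool; // hypothetical stand-in for the buffers pool

// Hypothetical buffer that queries its pool from its destructor.
struct Buffer {
    Buffer(const Pool *pool, bool *result) : pool(pool), result(result) {}
    ~Buffer();
    const Pool *pool;
    bool *result; // where the answer observed during deletion is stored
};

struct Pool {
    // Declared before mainStorage: members are destroyed in reverse
    // declaration order, so this value is still intact while
    // mainStorage (and the buffer it owns) is being destroyed.
    const Buffer *mainStorageId = nullptr;
    std::unique_ptr<Buffer> mainStorage;

    void setMainStorage(std::unique_ptr<Buffer> storage) {
        mainStorage = std::move(storage);   // a previous buffer, if any, dies here
        mainStorageId = mainStorage.get();  // then the identity is updated
    }

    // Compares against the raw identity instead of mainStorage.get(),
    // so the library's clearing order no longer matters.
    bool isPoolBuffer(const Buffer *buffer) const {
        return buffer && mainStorageId == buffer;
    }
};

Buffer::~Buffer() { *result = pool->isPoolBuffer(this); }

// Returns what the buffer observed while the pool was being destroyed.
bool recognizedWithIdentityPointer() {
    bool result = false;
    {
        Pool pool;
        pool.setMainStorage(std::make_unique<Buffer>(&pool, &result));
        assert(pool.isPoolBuffer(pool.mainStorage.get())); // recognized while alive
    } // ~Pool() destroys mainStorage; mainStorageId is still readable
    return result;
}
```

The trade-off is that `mainStorageId` can go stale: after an explicit `mainStorage.reset()` it still points at the deleted buffer, so it should only be compared, never dereferenced.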
Hi @Yanfeng-Mi ,
We’d like to know if this issue is still affecting you. If so, please provide an update or any additional information. If you have identified a solution, we kindly ask that you create a proper pull request (PR) with the necessary changes for review (https://github.com/intel/compute-runtime/blob/master/CONTRIBUTING.md). Otherwise, we’ll close this issue after 30 days of inactivity. Your feedback is appreciated!
@kgibala The issue is still reproducible. I will create a PR for it.
@kgibala Could you help review the PR?
@Yanfeng-Mi Thank you for your contribution! We appreciate your effort in submitting this pull request. Your changes will be reviewed and evaluated through our standard process. We’ll keep you updated on any progress or feedback.