Finalizers cause exception_access_violation while using HDF5 from a thread.
When running long jobs, I'm occasionally running into this traces like this. What I think is happening is that the finalizer is running to close out a property while another thread is trying to access the HDF5 library. This can be be mitigated by running GC.gc() when between jobs where threads are used.
Note I'm doing this while using HDF5.jl on a single spawned thread.
Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ffb692e3cac -- H5FL_reg_calloc at C:\Users\simview\.julia\artifacts\a285947f59f2fa90c9ac36239548c5af147d0da5\bin\libhdf5-0.dll (unknown line)
in expression starting at REPL[1]:1
H5FL_reg_calloc at C:\Users\simview\.julia\artifacts\a285947f59f2fa90c9ac36239548c5af147d0da5\bin\libhdf5-0.dll (unknown line)
H5CX_push at C:\Users\simview\.julia\artifacts\a285947f59f2fa90c9ac36239548c5af147d0da5\bin\libhdf5-0.dll (unknown line)
H5Iis_valid at C:\Users\simview\.julia\artifacts\a285947f59f2fa90c9ac36239548c5af147d0da5\bin\libhdf5-0.dll (unknown line)
h5i_is_valid at C:\Users\simview\.julia\packages\HDF5\7zvRl\src\api\functions.jl:1354
isvalid at C:\Users\simview\.julia\packages\HDF5\7zvRl\src\properties.jl:19 [inlined]
close at C:\Users\simview\.julia\packages\HDF5\7zvRl\src\properties.jl:11
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
run_finalizer at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:278
jl_gc_run_finalizers_in_list at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:365
run_finalizers at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:394 [inlined]
run_finalizers at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:372
jl_gc_collect at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:3267
maybe_collect at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:882 [inlined]
jl_gc_pool_alloc at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:1209
jl_gc_alloc_ at /cygdrive/c/buildbot/worker/package_win64/build/src\julia_internal.h:339 [inlined]
_new_array_ at /cygdrive/c/buildbot/worker/package_win64/build/src\array.c:137 [inlined]
_new_array at /cygdrive/c/buildbot/worker/package_win64/build/src\array.c:193 [inlined]
jl_alloc_array_3d at /cygdrive/c/buildbot/worker/package_win64/build/src\array.c:455
Array at .\boot.jl:461 [inlined]
Array at .\boot.jl:468 [inlined]
Array at .\boot.jl:474
unknown function (ip: 0000000063a9c088)
macro expansion at E:\Mirror\FND red stitching test\chunked\downsampling.jl:178 [inlined]
macro expansion at .\timing.jl:220 [inlined]
macro expansion at E:\Mirror\FND red stitching test\chunked\downsampling.jl:177 [inlined]
macro expansion at .\task.jl:399 [inlined]
downsample_to_memory_multithreaded_channel at E:\Mirror\FND red stitching test\chunked\downsampling.jl:136
I guess the solution here is to close all properties objects when they are done?
Following up on https://github.com/JuliaIO/HDF5.jl/pull/1013#discussion_r1003866863, how about we define a function
function with(fn, args...)
try
fn(args...)
finally
for arg in args
finalize(arg)
end
end
end
so e.g. https://github.com/JuliaIO/HDF5.jl/blob/6022be0e1e44073152a84b5063baf5bed6eaf19b/src/groups.jl#L24-L35 could be written as
function create_group(parent::Union{File,Group}, path::AbstractString; pv...)
with(_link_properties(path), GroupCreateProperties()) do lcpl, gcpl
pv = setproperties!(lcpl, gcpl; pv...)
isempty(pv) || error("invalid keyword options $pv")
return create_group(parent, path, lcpl, gcpl)
end
end
or perhaps a macro to do the same, e.g.:
function create_group(parent::Union{File,Group}, path::AbstractString; pv...)
@with lcpl=_link_properties(path) gcpl=GroupCreateProperties() begin
pv = setproperties!(lcpl, gcpl; pv...)
isempty(pv) || error("invalid keyword options $pv")
return create_group(parent, path, lcpl, gcpl)
end
end
I was thinking something similar, but calling close rather than finalize.
@simonbyrne that sounds a lot better than the current setup and remind me of something similar in the D language for finalizers. The function form seems simpler than going with a macro, unless there is an intrinsic benefit of the macro.
Does calling finalize run the finalizers on the current task/thread or does it schedule it on another task/thread?
I'm pretty sure it runs on the current task (i.e. it calls it immediately). finalize also has the advantages of working on any object (even immutable ones), and being cleared once called (i.e. the finalizers from the object aren't later called again by the GC), e.g.:
julia> mutable struct Foo
end
julia> close(f::Foo) = println("done")
close (generic function with 1 method)
julia> f = Foo()
Foo()
julia> finalizer(close, f)
Foo()
julia> close(f)
done
julia> f = nothing; GC.gc() # close called again
done
julia> f = Foo()
Foo()
julia> finalizer(close, f)
Foo()
julia> finalize(f)
done
julia> f = nothing; GC.gc() # close not called again
I'm pretty sure it runs on the current task (i.e. it calls it immediately)
https://github.com/JuliaLang/julia/blob/8512dd2a570c1f181df96529d73512e8c766e33c/base/gcutils.jl#L72-L73
This seems more general than HDF5.jl. While I'm happy to experiment with it here, perhaps we should consider putting it in a small package at some point? Maybe we should incubate it as a subdirectory package or at least put it within it's own module.
The @with macro looks quite like a let statement. Perhaps the macro should act on a let statement rather than a begin block?
Closed by #1021