
[Distributed.jl] inconsistent serialization of closures over global vars

Open kleinschmidt opened this issue 2 years ago • 2 comments

Here's my MWE (tested on Julia 1.9.2):

using Distributed, Serialization
y = 3
f = x -> x + y

worker = only(addprocs(1))
@everywhere worker using Serialization

fs = let
    io = IOBuffer()
    serialize(io, f)
    take!(io)
end;

# error: UndefVarError: `y` not defined
remotecall_fetch(worker, fs, 2) do fs, x
    f = deserialize(IOBuffer(fs))
    invokelatest(f, x)
end

# succeeds
remotecall_fetch(f, worker, 2)

# now succeeds
remotecall_fetch(worker, fs, 2) do fs, x
    f = deserialize(IOBuffer(fs))
    invokelatest(f, x)
end

I understand why the first invocation of my manually deserialized function doesn't work: y is a non-const global, so it is referenced by f rather than captured, and the serialized bytes don't carry its value. It works as I'd hoped if I do

f = let
    y = 3
    x -> x + y
end
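For reference, a minimal sketch of the full round trip with the let-bound version, assuming a single fresh local worker. The names g, bytes, and result are only illustrative; the rest mirrors the MWE above.

```julia
using Distributed, Serialization

worker = only(addprocs(1))
@everywhere worker using Serialization

# y is a local captured by the closure, so its value is stored in the
# closure object itself and travels with the serialized bytes.
g = let
    y = 3
    x -> x + y
end

# Serialize with the plain Serialization machinery (no ClusterSerializer).
bytes = let
    io = IOBuffer()
    serialize(io, g)
    take!(io)
end

# Deserialize and call on the worker; no global y is needed there.
result = remotecall_fetch(worker, bytes, 2) do bytes, x
    h = deserialize(IOBuffer(bytes))
    invokelatest(h, x)
end

rmprocs(worker)
```

Here the first remote call should already succeed, without a prior remotecall_fetch(f, worker, 2).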

What's troubling me is that when you remotecall f itself, y somehow gets defined as a global on the worker, so the second time I deserialize and invoke f on the remote worker, it succeeds.
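One way to observe the side effect directly is to ask the worker whether Main.y exists before and after the remotecall. This is only a sketch, assuming a fresh worker on which y has never been defined; before, result, and after are illustrative names. Passing isdefined itself, rather than a closure, avoids shipping any extra globals as part of the check.

```julia
using Distributed

worker = only(addprocs(1))

y = 3
f = x -> x + y

# Ask the worker whether Main.y is bound, without sending a closure.
before = remotecall_fetch(isdefined, worker, Main, :y)

# Sending f through remotecall_fetch also ships the Main globals it references.
result = remotecall_fetch(f, worker, 2)

after = remotecall_fetch(isdefined, worker, Main, :y)

println("y defined on worker before: ", before, ", after: ", after)

rmprocs(worker)
```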

kleinschmidt avatar Aug 11 '23 21:08 kleinschmidt

I am not quite sure what bug is being reported here. The remotecall_fetch code extends the serialization code to support moving global variables between compute nodes. That is not part of the standard serialization definition.

vtjnash avatar Aug 14 '23 18:08 vtjnash

> The remotecall_fetch code extends the serialization code to support moving global variables between compute nodes. That is not part of the standard serialization definition.

Yeah, once I dug more into the ClusterSerializer I saw that pretty quickly. At this point I think this is more of a documentation issue than anything else.

kleinschmidt avatar Aug 14 '23 18:08 kleinschmidt