Setfield.jl icon indicating copy to clipboard operation
Setfield.jl copied to clipboard

WIP: Implement mutating versions of @set and @lens (#71)

Open colinxs opened this issue 6 years ago • 31 comments

colinxs avatar Oct 08 '19 02:10 colinxs

Here is the diff wrt #91 (for easy review): https://github.com/jw3126/Setfield.jl/compare/constbase...colinxs:bangbang

tkf avatar Oct 08 '19 02:10 tkf

I think the part where it gets really challenging is to implement set on a composition. A composition that involves !! or ! lens applied to a nested struct should only mutate the innermost possible mutable in a nested struct update and not reconstruct any other objects.

jw3126 avatar Oct 08 '19 04:10 jw3126

I was assuming that the answer was that "users should be careful." Why not just say that you'd loose the pureness property if there is at least one mutating-lens? That is to say, isn't it OK to have

x = (a=[1],)
y = set(x, (@lens _.a) ∘ (@lens! _[1]), 2)
@assert x.a === y.a
@assert x.a[1] == 2

tkf avatar Oct 08 '19 04:10 tkf

Yes you example is the only reasonable behavior I can imagine. What I was talking about is

struct A; a end
x = A([1])
y = set(x, @lens!! _.a[1], 2)
@assert x.a === y.a
@assert x.a[1] == 2

Does this do a constructor call to A? I would prefer it if that was optimized away. But if this makes compose implementation too involved, I can live with it not being optimized out.

Edit: About composition of @lens! I am fine with if that just mutates the innermost (and errors if the innermost is immutable).

jw3126 avatar Oct 08 '19 04:10 jw3126

We had some half baked per set call mutation support in the beginning. The implementation is probably not interesting, but maybe you can salvage some of the tests. https://github.com/jw3126/Setfield.jl/commit/1ac915347ba6086d548ae1d11df9eef357ff7802

jw3126 avatar Oct 08 '19 04:10 jw3126

Does this do a constructor call to A?

Ah, that's a good point. Maybe something like this would work?

function set(obj, l::ComposedLens, val)
    inner_obj = get(obj, l.outer)
    if MutationPolicy(l.inner) isa AlwaysMutate
        set(inner_obj, l.inner, val)
        obj
    else
        # Do what we are doing:
        inner_val = set(inner_obj, l.inner, val)
        set(obj, l.outer, inner_val)
    end
end

But, if it made the code better, it would mean that there might be some non-trivial invariant checking in the constructor. Maybe we shouldn't be skipping it? If you know that it is safe to skip those checks then you probably know the type of the outer object reasonably well so that you can directly modify mutable inner object?

tkf avatar Oct 08 '19 05:10 tkf

But, if it made the code better, it would mean that there might be some non-trivial invariant checking in the constructor.

A plain a.b.c = d does no invariant checking, so I think it would be a sane choice if we allow ourselfs to optimize that away as well.

If you know that it is safe to skip those checks then you probably know the type of the outer object reasonably well so that you can directly modify mutable inner object?

That is an interesting point. Maybe we should examine some performance/IR and check if in the case of plain structs without inner constructor julia manages to elide the reconstruction and just mutates the innermost. If julia manages to optimize that well, that would be great. If this is the case, then I agree the invariant checking is a nice feature.

jw3126 avatar Oct 08 '19 11:10 jw3126

Actually, I know from experience that re-constructing structs sometimes helps improving code-gen. There is a function (adequately) called darkritual in Transducers.jl which just re-constructs all nested immutable structs. It is useful when one of the struct has an Array as a member and IIRC it was impossible to generate SIMD'ed loop without it.

tkf avatar Oct 09 '19 00:10 tkf

So I'm almost done incorporating the above feedback, but ran across the following which I thought was a bug. Using your example:

struct A; a end
x = A([1])
y = set(x, @lens!! _.a[1], 2)
@assert x.a === y.a
@assert x.a[1] == 2

You also have (without making the above modification to set(obj, l::ComposedLens, val)): @assert x === y

Edit: This behavior stems from setproperty!!:

x = A([1])
a = x.a
y = setproperty!!(x, :a, a)
@assert x === y

colinxs avatar Oct 09 '19 01:10 colinxs

Isn't it expected? That's equivalent to

julia> a = [1];

julia> x = (a=a,);

julia> y = (a=a,);

julia> x === y
true

Right?

tkf avatar Oct 09 '19 02:10 tkf

Ahh I guess so! Nevermind :)

colinxs avatar Oct 09 '19 02:10 colinxs

I think we should also deprecate the current @set! macro into something else. Maybe @reset? The new @set! macro can live in Setfield.Future for a while.

jw3126 avatar Oct 09 '19 07:10 jw3126

A bit crazy plan is to rename the whole package :)

tkf avatar Oct 09 '19 08:10 tkf

Since you pointed out that my favorite name is already taken in I am somewhat depressed about it. But yeah the package needs to be renamed, we can discuss that further in #54.

jw3126 avatar Oct 09 '19 08:10 jw3126

As of 503599149031ac613c4fbf34816da79a2044ac5f, we have

https://github.com/jw3126/Setfield.jl/blob/503599149031ac613c4fbf34816da79a2044ac5f/src/lens.jl#L142-L143

and

https://github.com/jw3126/Setfield.jl/blob/503599149031ac613c4fbf34816da79a2044ac5f/src/lens.jl#L212-L213

I should have noticed this earlier, but it means that @set! x.a[1] = 2 is equivalent to x.a[1] = 2, right? It seems like a redundant syntax? I think @lens! x.a[1] may be useful though.

FYI what I call mutate-if-mutable in https://github.com/jw3126/Setfield.jl/pull/71#issuecomment-513859965 would be something like this

if ismutablestruct(obj)
    setproperty!(obj, name, val)
    obj
else
    set(obj, PropertyLens(name, NeverMutate()), val)
end

We could use this for @set! but it is only slightly different from @set!! so it may be confusing.

tkf avatar Oct 10 '19 03:10 tkf

I was thinking how to make @set!! implementable outside Setfield.jl. I think it is somewhat possible already. @colinxs Maybe you can use it if you don't like waiting for this PR to be merged?

Here is an implementation that "replaces" PropertyLens with !!-variant

using Setfield
using Setfield: ComposedLens, PropertyLens, parse_obj_lens
using BangBang

struct Lens!!{T <: Lens} <: Lens
    lens::T
end

Setfield.get(obj, lens::Lens!!) = get(obj, lens.lens)

# Default to immutable:
Setfield.set(obj, lens::Lens!!, value) =
    set(obj, lens.lens, value)

Setfield.set(obj, lens::Lens!!{<:ComposedLens}, value) =
    set(obj, Lens!!(lens.lens.outer) ∘ Lens!!(lens.lens.inner), value)

Setfield.set(obj, ::Lens!!{<:PropertyLens{fieldname}}, value) where fieldname =
    setproperty!!(obj, fieldname, value)

macro set!!(ex)
    ref, val = ex.args
    obj, lens = parse_obj_lens(ref)
    val = esc(val)
    quote
        $obj = let compose = $Setfield.compose,
            IndexLens = $Setfield.IndexLens
            $Setfield.set($obj, $Lens!!($lens), $val)
        end
    end
end

mutable struct Mutable
    a
    b
end

x = orig = Mutable(1, 2)
@set!! x.a = 10
@assert orig.a == 10

x = orig = Mutable((1, 2), 3)
@set!! x.a[1] = 10
@assert orig.a == (10, 2)

There is a bit of hack needed (let in @set!!) but this can actually be easily solved by embedding function/type objects in the expression returned by parse_obj_lens. Also, it indicates that making @set "overloadable" is as easy as

diff --git a/src/sugar.jl b/src/sugar.jl
index d735047..8a5b4f2 100644
--- a/src/sugar.jl
+++ b/src/sugar.jl
@@ -133,7 +133,7 @@ struct _UpdateOp{OP,V}
 end
 (u::_UpdateOp)(x) = u.op(x, u.val)

-function atset_impl(ex::Expr; overwrite::Bool)
+function atset_impl(ex::Expr; overwrite::Bool, preprocess=identity)
     @assert ex.head isa Symbol
     @assert length(ex.args) == 2
     ref, val = ex.args
@@ -142,14 +142,14 @@ function atset_impl(ex::Expr; overwrite::Bool)
     val = esc(val)
     ret = if ex.head == :(=)
         quote
-            lens = $lens
+            lens = $preprocess($lens)
             $dst = set($obj, lens, $val)
         end
     else
         op = get_update_op(ex.head)
         f = :(_UpdateOp($op,$val))
         quote
-            $dst = modify($f, $obj, $lens)
+            $dst = modify($f, $obj, $preprocess($lens))
         end
     end
     ret

We can then do this:

macro set!!(ex)
    atset_impl(ex; overwrite=true, preprocess=Lens!!)
end

tkf avatar Oct 10 '19 07:10 tkf

Nice @tkf I actually had very much the same idea, coming from a different angle. I thought about splitting @set in a "parse" step and an "interpretation" step. Now the output of the parse step would be some kind of "Setfield.AST", with objects like:

AST.ComposedLens(AST.IndexLens(:myindex), AST.PropertyLens(:myproperty))

But this structure is (almost) isomorphic to its standard interpretation (AST.ComposedLens -> Setfield.ComposedLens). So it feels somewhat redundant to have it in the first place.

So the outcome is very much the same as yours. The only thing I would have done differently is that instead of $preprocess($lens) there would be a call to interpret($interpreter, $lens)

jw3126 avatar Oct 10 '19 16:10 jw3126

Is interpret something like a map over "lens/AST tree?" And the reasoning was that users don't have to write their recursive transformation? But maybe it makes sense to allow users to transform ComposedLens to a completely different representation? Like fusing some combinations of lenses? Not that I have a good example ATM, though.

But this structure is (almost) isomorphic to its standard interpretation (AST.ComposedLens -> Setfield.ComposedLens).

I'm just curious. What will be lost when going from AST to the standard interpretation?

tkf avatar Oct 10 '19 22:10 tkf

It's not for @set macro but I have one example of lens tree transformation I'm using for some private codebase. What I'm doing is, given a lens like

(@lens _.a.b.c) ∘ settingas𝕀

insert "non-constant annotation" for Zygote (see this uncut function) like this:

(@lens _.a.b.c) ∘ converting(fromfield=unwrap, tofield=uncut) ∘ settingas𝕀

That is to say, insert converting lens before the left-most "non-field" lens. This is an example lens transformation that has to look at entire tree.

The question is if it makes sense to do this kind of transformation for @set macro. I don't know if this is the case, but it seems harmless to allow arbitrary transformation as it does not complicate the API.

tkf avatar Oct 11 '19 05:10 tkf

Is interpret something like a map over "lens/AST tree?" And the reasoning was that users don't have to write their recursive transformation?

We would have preprocess == lens -> interpret(interpreter, lens) so there is no technical advantage of either. Just a spelling to be bike shedded. I think I like passing around a single function more then an interpreter object. Though I don't like the name preprocess much.

But maybe it makes sense to allow users to transform ComposedLens to a completely different representation? Like fusing some combinations of lenses? Not that I have a good example ATM, though.

Yes I think giving a lot of freedom makes sense, even if I also have no application in mind.

But this structure is (almost) isomorphic to its standard interpretation (AST.ComposedLens -> Setfield.ComposedLens).

I'm just curious. What will be lost when going from AST to the standard interpretation?

I am also curious. This is a bit fuzzy in my head. I cannot define in full rigor, what I mean by "this structure", "isomorphic" or "standard interpretation". But one difference is that AST would be a compile time object that would be interpreted at compile time. E.g. in the code we would have $(preprocess(lens_ast)) instead of $preprocess($lens).

jw3126 avatar Oct 11 '19 05:10 jw3126

E.g. in the code we would have $(preprocess(lens_ast)) instead of $preprocess($lens).

Oh, I see. I was thinking that what you meant was $preprocess($lens_ast); i.e., the evaluation would happen at run-time, rather than at macro expansion time. So I suppose the difference is that $preprocess($lens) approach would have access to the values of lenses (e.g., the index value of IndexLens) while $(preprocess(lens_ast)) approach would have access to lens expressions (e.g., function expression of the anonymous function wrapped by DynamicIndexLens). So, both seem to gain and lose information compared to the alternative.

Though I don't like the name preprocess much.

Me too :) It can be better. transform? convert? It also can have lens prefix/suffix, like lenstransform. Or maybe something related to physical lenses... like polish? (OK that's maybe too much of a pun.)

tkf avatar Oct 11 '19 06:10 tkf

Oh, I see. I was thinking that what you meant was $preprocess($lens_ast); i.e., the evaluation would happen at run-time, rather than at macro expansion time. ... So, both seem to gain and lose information compared to the alternative.

Yeah. Though of course a compile time transform can just insert a runtime transform, but not the other way round. But I think we should stick with the runtime version, it is more convenient. And only switch to the compile time version if we have a good reason why we need that.

Me too :) It can be better. transform? convert? It also can have lens prefix/suffix, like lenstransform. Or maybe something related to physical lenses... like polish? (OK that's maybe too much of a pun.)

I like lenstransform. Since this is a rare operation, a long descriptive name seems the way to go. Pushing the lens analogy e.g. polish is a fun thing to do, but I think it hurts understanding. If none of you guys is faster, I will make a PR next week.

jw3126 avatar Oct 11 '19 06:10 jw3126

Hey sorry all! I was pretty swamped these last two days. Catching up with your comments now.

colinxs avatar Oct 11 '19 17:10 colinxs

Though of course a compile time transform can just insert a runtime transform, but not the other way round.

@jw3126 Oh, that's right. I missed that.

These C structs are nested 3-4 structs in, and the inner most field may be an StaticArray/NTuple.

@colinxs Re https://github.com/JuliaObjects/ConstructionBase.jl/pull/25#issuecomment-541163575, I think my approach above https://github.com/jw3126/Setfield.jl/pull/92#issuecomment-540434449 can already be used for this.

tkf avatar Oct 11 '19 21:10 tkf

@tkf I'll test that out later today! Thanks for taking the time to put that together. If it works then all my needs are satisfied for now. I'll keep an eye on this to see what cool implementation you come up with.

colinxs avatar Oct 11 '19 22:10 colinxs

Thanks for all the hard work! With #95 I think my use-case is satisfied.

colinxs avatar Oct 14 '19 02:10 colinxs

@tkf @jw3126 in reference to https://github.com/JuliaLang/julia/pull/21912, Setfield.jl implements the @setfield macro described in that PR which creates a copy of an immutable with the respective field changed. My concern is the same as yuyichao, however, in that I may have concurrent threads modifying the same immutable and thus would prefer the RefField implementation that does the unsafe_store! (rather than relying on the compiler optimizing out the copy).

Using #95, this would be pretty straightforward to implement using Lens. I was going to take a stab at that in the next day or so, but wanted to check with you to see if that's something you may have already considered/started implemented to avoid duplicated work.

colinxs avatar Oct 14 '19 22:10 colinxs

I may have concurrent threads modifying the same immutable and thus would prefer the RefField implementation that does the unsafe_store!

I'm a bit confused here. If you have multiple threads, don't you want to keep objects immutable so that you don't need to worry about data races?

BTW, here is a PR that implements @set!!: https://github.com/tkf/BangBang.jl/pull/25

tkf avatar Oct 15 '19 00:10 tkf

@tkf If I have a struct with a StaticArray field (which has to be immutable to get the C equivalent inline memory layout) and want threads to be operating on different parts of the data, then @set as currently implemented would create a data race (dependent on whether the compiler optimizes out the creation of the new struct or does an unsafe_store!.

I'll checkout the PR! Thanks again for all your help.

colinxs avatar Oct 15 '19 00:10 colinxs

Hmm... Is it OK to call it a data race though? I thought data race needs one or more thread(s) writing to some memory location. There is no "write" in set. Isn't it just using invalid update mechanism?

By the way, did you check https://github.com/RelationalAI-oss/Blobs.jl? It seems to be more appropriate for mutating heap-allocated structs, if your main goal is interop with C?

tkf avatar Oct 15 '19 00:10 tkf