WIP: Implement mutating versions of @set and @lens (#71)
Here is the diff wrt #91 (for easy review): https://github.com/jw3126/Setfield.jl/compare/constbase...colinxs:bangbang
I think the part where it gets really challenging is to implement set on a composition. A composition that involves !! or ! lens applied to a nested struct should only mutate the innermost possible mutable in a nested struct update and not reconstruct any other objects.
I was assuming that the answer was that "users should be careful." Why not just say that you'd loose the pureness property if there is at least one mutating-lens? That is to say, isn't it OK to have
x = (a=[1],)
y = set(x, (@lens _.a) ∘ (@lens! _[1]), 2)
@assert x.a === y.a
@assert x.a[1] == 2
Yes you example is the only reasonable behavior I can imagine. What I was talking about is
struct A; a end
x = A([1])
y = set(x, @lens!! _.a[1], 2)
@assert x.a === y.a
@assert x.a[1] == 2
Does this do a constructor call to A? I would prefer it if that was optimized away. But if this makes compose implementation too involved, I can live with it not being optimized out.
Edit: About composition of @lens! I am fine with if that just mutates the innermost (and errors if the innermost is immutable).
We had some half baked per set call mutation support in the beginning. The implementation is probably not interesting, but maybe you can salvage some of the tests.
https://github.com/jw3126/Setfield.jl/commit/1ac915347ba6086d548ae1d11df9eef357ff7802
Does this do a constructor call to
A?
Ah, that's a good point. Maybe something like this would work?
function set(obj, l::ComposedLens, val)
inner_obj = get(obj, l.outer)
if MutationPolicy(l.inner) isa AlwaysMutate
set(inner_obj, l.inner, val)
obj
else
# Do what we are doing:
inner_val = set(inner_obj, l.inner, val)
set(obj, l.outer, inner_val)
end
end
But, if it made the code better, it would mean that there might be some non-trivial invariant checking in the constructor. Maybe we shouldn't be skipping it? If you know that it is safe to skip those checks then you probably know the type of the outer object reasonably well so that you can directly modify mutable inner object?
But, if it made the code better, it would mean that there might be some non-trivial invariant checking in the constructor.
A plain a.b.c = d does no invariant checking, so I think it would be a sane choice if we allow ourselfs to optimize that away as well.
If you know that it is safe to skip those checks then you probably know the type of the outer object reasonably well so that you can directly modify mutable inner object?
That is an interesting point. Maybe we should examine some performance/IR and check if in the case of plain structs without inner constructor julia manages to elide the reconstruction and just mutates the innermost. If julia manages to optimize that well, that would be great. If this is the case, then I agree the invariant checking is a nice feature.
Actually, I know from experience that re-constructing structs sometimes helps improving code-gen. There is a function (adequately) called darkritual in Transducers.jl which just re-constructs all nested immutable structs. It is useful when one of the struct has an Array as a member and IIRC it was impossible to generate SIMD'ed loop without it.
So I'm almost done incorporating the above feedback, but ran across the following which I thought was a bug. Using your example:
struct A; a end
x = A([1])
y = set(x, @lens!! _.a[1], 2)
@assert x.a === y.a
@assert x.a[1] == 2
You also have (without making the above modification to set(obj, l::ComposedLens, val)):
@assert x === y
Edit: This behavior stems from setproperty!!:
x = A([1])
a = x.a
y = setproperty!!(x, :a, a)
@assert x === y
Isn't it expected? That's equivalent to
julia> a = [1];
julia> x = (a=a,);
julia> y = (a=a,);
julia> x === y
true
Right?
Ahh I guess so! Nevermind :)
I think we should also deprecate the current @set! macro into something else. Maybe @reset?
The new @set! macro can live in Setfield.Future for a while.
A bit crazy plan is to rename the whole package :)
Since you pointed out that my favorite name is already taken in I am somewhat depressed about it. But yeah the package needs to be renamed, we can discuss that further in #54.
As of 503599149031ac613c4fbf34816da79a2044ac5f, we have
https://github.com/jw3126/Setfield.jl/blob/503599149031ac613c4fbf34816da79a2044ac5f/src/lens.jl#L142-L143
and
https://github.com/jw3126/Setfield.jl/blob/503599149031ac613c4fbf34816da79a2044ac5f/src/lens.jl#L212-L213
I should have noticed this earlier, but it means that @set! x.a[1] = 2 is equivalent to x.a[1] = 2, right? It seems like a redundant syntax? I think @lens! x.a[1] may be useful though.
FYI what I call mutate-if-mutable in https://github.com/jw3126/Setfield.jl/pull/71#issuecomment-513859965 would be something like this
if ismutablestruct(obj)
setproperty!(obj, name, val)
obj
else
set(obj, PropertyLens(name, NeverMutate()), val)
end
We could use this for @set! but it is only slightly different from @set!! so it may be confusing.
I was thinking how to make @set!! implementable outside Setfield.jl. I think it is somewhat possible already. @colinxs Maybe you can use it if you don't like waiting for this PR to be merged?
Here is an implementation that "replaces" PropertyLens with !!-variant
using Setfield
using Setfield: ComposedLens, PropertyLens, parse_obj_lens
using BangBang
struct Lens!!{T <: Lens} <: Lens
lens::T
end
Setfield.get(obj, lens::Lens!!) = get(obj, lens.lens)
# Default to immutable:
Setfield.set(obj, lens::Lens!!, value) =
set(obj, lens.lens, value)
Setfield.set(obj, lens::Lens!!{<:ComposedLens}, value) =
set(obj, Lens!!(lens.lens.outer) ∘ Lens!!(lens.lens.inner), value)
Setfield.set(obj, ::Lens!!{<:PropertyLens{fieldname}}, value) where fieldname =
setproperty!!(obj, fieldname, value)
macro set!!(ex)
ref, val = ex.args
obj, lens = parse_obj_lens(ref)
val = esc(val)
quote
$obj = let compose = $Setfield.compose,
IndexLens = $Setfield.IndexLens
$Setfield.set($obj, $Lens!!($lens), $val)
end
end
end
mutable struct Mutable
a
b
end
x = orig = Mutable(1, 2)
@set!! x.a = 10
@assert orig.a == 10
x = orig = Mutable((1, 2), 3)
@set!! x.a[1] = 10
@assert orig.a == (10, 2)
There is a bit of hack needed (let in @set!!) but this can actually be easily solved by embedding function/type objects in the expression returned by parse_obj_lens. Also, it indicates that making @set "overloadable" is as easy as
diff --git a/src/sugar.jl b/src/sugar.jl
index d735047..8a5b4f2 100644
--- a/src/sugar.jl
+++ b/src/sugar.jl
@@ -133,7 +133,7 @@ struct _UpdateOp{OP,V}
end
(u::_UpdateOp)(x) = u.op(x, u.val)
-function atset_impl(ex::Expr; overwrite::Bool)
+function atset_impl(ex::Expr; overwrite::Bool, preprocess=identity)
@assert ex.head isa Symbol
@assert length(ex.args) == 2
ref, val = ex.args
@@ -142,14 +142,14 @@ function atset_impl(ex::Expr; overwrite::Bool)
val = esc(val)
ret = if ex.head == :(=)
quote
- lens = $lens
+ lens = $preprocess($lens)
$dst = set($obj, lens, $val)
end
else
op = get_update_op(ex.head)
f = :(_UpdateOp($op,$val))
quote
- $dst = modify($f, $obj, $lens)
+ $dst = modify($f, $obj, $preprocess($lens))
end
end
ret
We can then do this:
macro set!!(ex)
atset_impl(ex; overwrite=true, preprocess=Lens!!)
end
Nice @tkf I actually had very much the same idea, coming from a different angle. I thought about splitting @set in a "parse" step and an "interpretation" step. Now the output of the parse step would be some kind of "Setfield.AST", with objects like:
AST.ComposedLens(AST.IndexLens(:myindex), AST.PropertyLens(:myproperty))
But this structure is (almost) isomorphic to its standard interpretation (AST.ComposedLens -> Setfield.ComposedLens). So it feels somewhat redundant to have it in the first place.
So the outcome is very much the same as yours. The only thing I would have done differently is that instead of $preprocess($lens) there would be a call to interpret($interpreter, $lens)
Is interpret something like a map over "lens/AST tree?" And the reasoning was that users don't have to write their recursive transformation? But maybe it makes sense to allow users to transform ComposedLens to a completely different representation? Like fusing some combinations of lenses? Not that I have a good example ATM, though.
But this structure is (almost) isomorphic to its standard interpretation (
AST.ComposedLens->Setfield.ComposedLens).
I'm just curious. What will be lost when going from AST to the standard interpretation?
It's not for @set macro but I have one example of lens tree transformation I'm using for some private codebase. What I'm doing is, given a lens like
(@lens _.a.b.c) ∘ settingas𝕀
insert "non-constant annotation" for Zygote (see this uncut function) like this:
(@lens _.a.b.c) ∘ converting(fromfield=unwrap, tofield=uncut) ∘ settingas𝕀
That is to say, insert converting lens before the left-most "non-field" lens. This is an example lens transformation that has to look at entire tree.
The question is if it makes sense to do this kind of transformation for @set macro. I don't know if this is the case, but it seems harmless to allow arbitrary transformation as it does not complicate the API.
Is interpret something like a map over "lens/AST tree?" And the reasoning was that users don't have to write their recursive transformation?
We would have preprocess == lens -> interpret(interpreter, lens) so there is no technical advantage of either. Just a spelling to be bike shedded. I think I like passing around a single function more then an interpreter object. Though I don't like the name preprocess much.
But maybe it makes sense to allow users to transform ComposedLens to a completely different representation? Like fusing some combinations of lenses? Not that I have a good example ATM, though.
Yes I think giving a lot of freedom makes sense, even if I also have no application in mind.
But this structure is (almost) isomorphic to its standard interpretation (AST.ComposedLens -> Setfield.ComposedLens).
I'm just curious. What will be lost when going from AST to the standard interpretation?
I am also curious. This is a bit fuzzy in my head. I cannot define in full rigor, what I mean by "this structure", "isomorphic" or "standard interpretation". But one difference is that AST would be a compile time object that would be interpreted at compile time. E.g. in the code we would have $(preprocess(lens_ast)) instead of $preprocess($lens).
E.g. in the code we would have
$(preprocess(lens_ast))instead of$preprocess($lens).
Oh, I see. I was thinking that what you meant was $preprocess($lens_ast); i.e., the evaluation would happen at run-time, rather than at macro expansion time. So I suppose the difference is that $preprocess($lens) approach would have access to the values of lenses (e.g., the index value of IndexLens) while $(preprocess(lens_ast)) approach would have access to lens expressions (e.g., function expression of the anonymous function wrapped by DynamicIndexLens). So, both seem to gain and lose information compared to the alternative.
Though I don't like the name preprocess much.
Me too :) It can be better. transform? convert? It also can have lens prefix/suffix, like lenstransform. Or maybe something related to physical lenses... like polish? (OK that's maybe too much of a pun.)
Oh, I see. I was thinking that what you meant was
$preprocess($lens_ast); i.e., the evaluation would happen at run-time, rather than at macro expansion time. ... So, both seem to gain and lose information compared to the alternative.
Yeah. Though of course a compile time transform can just insert a runtime transform, but not the other way round. But I think we should stick with the runtime version, it is more convenient. And only switch to the compile time version if we have a good reason why we need that.
Me too :) It can be better. transform? convert? It also can have lens prefix/suffix, like lenstransform. Or maybe something related to physical lenses... like polish? (OK that's maybe too much of a pun.)
I like lenstransform. Since this is a rare operation, a long descriptive name seems the way to go. Pushing the lens analogy e.g. polish is a fun thing to do, but I think it hurts understanding.
If none of you guys is faster, I will make a PR next week.
Hey sorry all! I was pretty swamped these last two days. Catching up with your comments now.
Though of course a compile time transform can just insert a runtime transform, but not the other way round.
@jw3126 Oh, that's right. I missed that.
These C structs are nested 3-4 structs in, and the inner most field may be an StaticArray/NTuple.
@colinxs Re https://github.com/JuliaObjects/ConstructionBase.jl/pull/25#issuecomment-541163575, I think my approach above https://github.com/jw3126/Setfield.jl/pull/92#issuecomment-540434449 can already be used for this.
@tkf I'll test that out later today! Thanks for taking the time to put that together. If it works then all my needs are satisfied for now. I'll keep an eye on this to see what cool implementation you come up with.
Thanks for all the hard work! With #95 I think my use-case is satisfied.
@tkf @jw3126 in reference to https://github.com/JuliaLang/julia/pull/21912, Setfield.jl implements the @setfield macro described in that PR which creates a copy of an immutable with the respective field changed. My concern is the same as yuyichao, however, in that I may have concurrent threads modifying the same immutable and thus would prefer the RefField implementation that does the unsafe_store! (rather than relying on the compiler optimizing out the copy).
Using #95, this would be pretty straightforward to implement using Lens. I was going to take a stab at that in the next day or so, but wanted to check with you to see if that's something you may have already considered/started implemented to avoid duplicated work.
I may have concurrent threads modifying the same immutable and thus would prefer the
RefFieldimplementation that does theunsafe_store!
I'm a bit confused here. If you have multiple threads, don't you want to keep objects immutable so that you don't need to worry about data races?
BTW, here is a PR that implements @set!!: https://github.com/tkf/BangBang.jl/pull/25
@tkf If I have a struct with a StaticArray field (which has to be immutable to get the C equivalent inline memory layout) and want threads to be operating on different parts of the data, then @set as currently implemented would create a data race (dependent on whether the compiler optimizes out the creation of the new struct or does an unsafe_store!.
I'll checkout the PR! Thanks again for all your help.
Hmm... Is it OK to call it a data race though? I thought data race needs one or more thread(s) writing to some memory location. There is no "write" in set. Isn't it just using invalid update mechanism?
By the way, did you check https://github.com/RelationalAI-oss/Blobs.jl? It seems to be more appropriate for mutating heap-allocated structs, if your main goal is interop with C?