DataFramesMeta.jl

Request @rsubset_rtransform

Open Lincoln-Hannah opened this issue 2 years ago • 7 comments

Apply an @rtransform block to a restricted set of rows. The other rows keep their existing values, or get missing in newly created columns.

@chain begin

    DataFrame( A = 1:4, B=[1,1,2,2] )

    @rsubset_rtransform (:B == 1) begin

        :A = 10
        :C = 20

    end

end


#result
DataFrame( A=[10,10,3,4], B=[1,1,2,2], C=[20,20,missing,missing])

Lincoln-Hannah, Nov 29 '23 05:11

Current:

julia> @chain DataFrame(A=1:4, B=[1,1,2,2]) begin
           @rsubset begin
               :B == 1
               @kwarg view=true
           end
           @rtransform! begin
               :A = 10
               :C = 20
           end
           parent
       end
4×3 DataFrame
 Row │ A      B      C
     │ Int64  Int64  Int64?
─────┼───────────────────────
   1 │    10      1       20
   2 │    10      1       20
   3 │     3      2  missing
   4 │     4      2  missing
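
For comparison, a minimal sketch of the same view-then-mutate pattern in plain DataFrames.jl, without the macros. It assumes a DataFrames.jl version in which transform! accepts a SubDataFrame (1.3 or later, as far as I know); the variable names are illustrative only. Note that the original data frame is modified in place.

```julia
using DataFrames

df = DataFrame(A = 1:4, B = [1, 1, 2, 2])

# Row-restricted view of df; no data is copied.
sdf = subset(df, :B => ByRow(==(1)); view = true)

# Transforming through the view writes into the parent's rows.
# Rows outside the view get `missing` in the new column :C.
transform!(sdf,
           :A => ByRow(_ -> 10) => :A,
           :B => ByRow(_ -> 20) => :C)

# The view's parent is the very same object as df.
result = parent(sdf)
```

Because `result === df`, a pipeline that must leave its input untouched would need to start from a copy.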

bkamins, Nov 29 '23 09:11

Thanks Bogumil. I'll probably lay it out as:

@chain DataFrame(A=1:4, B=[1,1,2,2]) begin

    @rsubset(:B == 1; view=true ); @rtransform! begin

        :A = 10
        :C = 20

    end; parent
    
end

I have 5 or so of these subset blocks within a larger @rtransform block, so I don't want to spend too many lines opening and closing each block.

Lincoln-Hannah, Nov 29 '23 11:11

Can you do the same with a group-by, i.e. have an @rtransform block for each value of :B? That might reduce the amount of code needed to begin each block.

lincolnhannah, Nov 29 '23 11:11

@rsubset(:B == 1; view=true )

This does not work AFAICT. You need @kwarg, but please double check.

Group-by should work, but could you please show the code and output you have in mind, to make sure we are on the same page?

bkamins, Nov 29 '23 13:11

I absolutely want to implement this, ideally with the syntax

@rtransform df @when(:x == 1) :y = :z * 100

or

@rtransform df @when(:x == 1) begin
    :y = :z * 100
    :a = :b * 5
end

In the background we would have something along the lines of

@chain df begin 
    copy
    @rsubset(:x == 1; view = true)
    @rtransform! begin 
      :y = :z * 100
      :a = :b * 5
    end
    parent
end

This is very feasible, I just haven't written it.
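
As a rough illustration of that pipeline, here is a sketch written as an ordinary function over DataFrames.jl. The name `transform_when` and its signature are hypothetical; this is not an existing DataFramesMeta.jl API, and an eventual @when implementation may look quite different.

```julia
using DataFrames

# Hypothetical helper, not part of DataFramesMeta.jl.
# `cond` is a `cols => predicate` pair as accepted by `subset`;
# `transforms` are ordinary `transform!` operations.
function transform_when(df::AbstractDataFrame, cond, transforms...)
    out = copy(df)                        # leave the caller's data untouched
    sdf = subset(out, cond; view = true)  # view of the matching rows only
    transform!(sdf, transforms...)        # writes through the view into `out`
    return out
end

df = DataFrame(x = [1, 2, 1], z = [1, 2, 3], b = [10, 20, 30])

res = transform_when(df, :x => ByRow(==(1)),
                     :z => ByRow(z -> z * 100) => :y,
                     :b => ByRow(b -> b * 5) => :a)
```

In `res`, the rows where :x == 1 carry the computed :y and :a, the remaining row holds missing in both new columns, and `df` itself is unchanged because of the initial copy.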

pdeffebach, Nov 29 '23 15:11

@when is a good name I think.

bkamins, Nov 29 '23 19:11

That's awesome, Peter. It shortens the code a lot.

Lincoln-Hannah, Nov 29 '23 22:11