DataFramesMeta.jl

suggestion - @chainWithRTransform

Open Lincoln-Hannah opened this issue 3 years ago • 9 comments

A macro similar to @chain, but one that treats any line that isn't itself a macro call as being within an @rtransform @astable block. So what would currently be written as:

@chain DataFrame( A = 1:10 ) begin

    @rtransform @astable begin
        :B =  mod(:A,3)
        :E =  :B * 2
    end

    @rsubset :B == 1
    
    @rtransform @astable begin
        :F =  :B * 2
        :G =  :F + 1
    end

    @orderby :A

    @rtransform   :H =  :G * 2

end

Could be written as:

  @chainWithrTransform DataFrame( A = 1:10 ) begin
        
        :B =  mod( :A, 3 )
        :E =  :B * 2
        
        @rsubset :B == 1
    
        :F =  :B * 2
        :G =  :F + 1
    
        @orderby :A
    
        :H =  :G * 2
  
  end

Feb 17 '22 23:02 Lincoln-Hannah

I'm coming around to this being a good idea. It would certainly cut down on lots of typing and it definitely seems to be true that 90% of commands are @rtransform.

Maybe @nalimilan can chime in and give their thoughts, because this would be a pretty non-standard syntax transformation.

Dec 22 '23 21:12 pdeffebach

How would it combine with grouping/ungrouping data frames in the process? (I think it would be OK, but I want to make sure.)

Dec 23 '23 09:12 bkamins

Technically, would it be possible to support passing macro calls like @rsubset or @orderby inside @rtransform df begin... end, so that we don't need a new macro like @chainWithrTransform?

That would make https://github.com/JuliaData/DataFramesMeta.jl/pull/376/ a bit less ad-hoc.

Dec 23 '23 21:12 nalimilan

@bkamins I think it would only apply to row-wise operations, so grouping would in general be ignored.

@nalimilan I'm not sure how that would work. Are you suggesting something like

@chain_r_transform df begin
    :y = :x * 2
    @rsubset ...
    @orderby ...
end

or something else?

I also disagree that #376 (@when) is that ad hoc. It honestly seems like one of the simpler ways to provide the complicated functionality that imitates Stata's if.

Jan 02 '24 20:01 pdeffebach

@bkamins Maybe a better way to think of it:

  1. First change any line of the form :ColumnName = ... to @rtransform :ColumnName = ...
  2. Then proceed as per a usual @chain block.

So this

  @chainWithrTransform DataFrame( A = 1:10 ) begin
        
        :B =  mod( :A, 3 )
        :E =  :B * 2
        
        @rsubset :B == 1
    
        :F =  :B * 2
        :G =  :F + 1
    
        @orderby :A
    
        :H =  :G * 2
  
  end

becomes

  @chain DataFrame( A = 1:10 ) begin
        
        @rtransform :B =  mod( :A, 3 )
        @rtransform :E =  :B * 2
        
        @rsubset :B == 1
    
        @rtransform :F =  :B * 2
        @rtransform :G =  :F + 1
    
        @orderby :A
    
        @rtransform :H =  :G * 2
  
  end

It's just a bit of syntactic sugar. As @pdeffebach says, 90% of lines are @rtransform, so this saves you from having to keep writing it.

Jan 09 '24 03:01 Lincoln-Hannah

So the first assignment operation present in the block drops grouping.

Jan 09 '24 08:01 bkamins

Lines of the form :columnName = ... are changed to @rtransform :columnName = ..., but not if they are within a sub-block (e.g. @by or @transform).

@chainWithrTransform begin
    
    DataFrame(A=1:4)

    :B = mod(:A,2)

    @by :B   begin
           :A = mean(:A)          
    end

    :C = :B / :A

    @transform :sumA = sum(:A) 

end

becomes

@chain begin

    DataFrame(A=1:4)

    @rtransform :B = mod(:A,2)

    @by :B  begin 
            :A = mean(:A)                #unchanged since it is within the @by block
    end

    @rtransform :C = :B / :A

    @transform :sumA = sum(:A)          #unchanged since it is within the @transform block
end
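
The rewriting rule described above can be sketched in plain Julia as an expression transformation. This is only an illustrative sketch, not an existing DataFramesMeta.jl API: the names `is_column_assignment` and `add_rtransform` are hypothetical, and it assumes that only the top-level statements of the block are rewritten, so assignments nested inside `@by` or `@transform` calls are left alone.

```julia
# Hypothetical helpers (not part of DataFramesMeta.jl) illustrating the rule:
# top-level `:col = ...` statements get wrapped in `@rtransform`.

# True for expressions such as `:B = mod(:A, 2)`, whose left-hand side is a
# quoted symbol (parsed as a QuoteNode).
is_column_assignment(ex) =
    ex isa Expr && ex.head === :(=) && ex.args[1] isa QuoteNode

function add_rtransform(block::Expr)
    block.head === :block || throw(ArgumentError("expected a begin ... end block"))
    new_args = map(block.args) do stmt
        if is_column_assignment(stmt)
            # Build `@rtransform <stmt>`; `nothing` fills the line-number slot.
            Expr(:macrocall, Symbol("@rtransform"), nothing, stmt)
        else
            # Macro calls, line-number nodes and everything else pass through.
            stmt
        end
    end
    return Expr(:block, new_args...)
end

# Quoting does not expand macros, so this runs without loading any package.
input = quote
    DataFrame(A = 1:4)
    :B = mod(:A, 2)
    @by :B begin
        :A = mean(:A)
    end
    :C = :B / :A
    @transform :sumA = sum(:A)
end

println(add_rtransform(input))
```

A real macro would presumably hand the rewritten block on to @chain. How that should behave on grouped data frames is the open question in this thread, and the sketch does not address it.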

Jan 09 '24 10:01 Lincoln-Hannah

So the first assignment operation present in the block drops grouping.

Yeah. It would drop grouping.

Jan 09 '24 14:01 pdeffebach