Suggestion: @chainWithrTransform
A macro similar to @chain that treats any line that isn't itself a macro call as being within an @rtransform @astable block. So what would currently be written as:
@chain DataFrame( A = 1:10 ) begin
    @rtransform @astable begin
        :B = mod(:A,3)
        :E = :B * 2
    end
    @rsubset :B == 1
    @rtransform @astable begin
        :F = :B * 2
        :G = :F + 1
    end
    @orderby :A
    @rtransform :H = :G * 2
end
Could be written as:
@chainWithrTransform DataFrame( A = 1:10 ) begin
    :B = mod( :A, 3 )
    :E = :B * 2
    @rsubset :B == 1
    :F = :B * 2
    :G = :F + 1
    @orderby :A
    :H = :G * 2
end
I'm coming around to this being a good idea. It would certainly cut down on lots of typing and it definitely seems to be true that 90% of commands are @rtransform.
Maybe @nalimilan can chime in and give their thoughts, because this would be a pretty non-standard syntax transformation.
How would it combine with grouping/ungrouping data frames in the process? (I think it would be OK, but I want to make sure.)
Technically, would it be possible to support passing macro calls like @rsubset or @orderby inside @rtransform df begin... end, so that we don't need a new macro like @chainWithrTransform?
That would make https://github.com/JuliaData/DataFramesMeta.jl/pull/376/ a bit less ad-hoc.
@bkamins I think it would only apply to row-wise operations, so grouping would in general be ignored.
@nalimilan I'm not sure how that would work, are you saying something like
@chain_r_transform df begin
    :y = :x * 2
    @rsubset ...
    @orderby ...
end
or something else?
I also disagree that #376 (@when) is that ad hoc. It honestly seems like one of the simpler ways to provide the complicated functionality that imitates Stata's if.
@bkamins Maybe a better way to think of it:
- First change any line of the form :ColumnName = ... to @rtransform :ColumnName = ...
- Then proceed as per a usual @chain block.
So this
@chainWithrTransform DataFrame( A = 1:10 ) begin
    :B = mod( :A, 3 )
    :E = :B * 2
    @rsubset :B == 1
    :F = :B * 2
    :G = :F + 1
    @orderby :A
    :H = :G * 2
end
becomes
@chain DataFrame( A = 1:10 ) begin
    @rtransform :B = mod( :A, 3 )
    @rtransform :E = :B * 2
    @rsubset :B == 1
    @rtransform :F = :B * 2
    @rtransform :G = :F + 1
    @orderby :A
    @rtransform :H = :G * 2
end
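To make the rewrite rule concrete, here is a minimal sketch that uses only Base Julia expression manipulation, with no DataFramesMeta or Chain dependency. The helper names (is_column_assignment, wrap_line, rewrite_block) are hypothetical and exist only for illustration. It produces the rewritten block as an expression and does not hand it to @chain, so it shows the syntax transformation only, not how a real macro would be implemented.

```julia
# Hypothetical helpers, not part of DataFramesMeta.

# True when `ex` looks like `:col = rhs`, i.e. an assignment whose
# left-hand side is a quoted symbol.
is_column_assignment(ex) =
    ex isa Expr && ex.head === :(=) && ex.args[1] isa QuoteNode

# Wrap a `:col = rhs` line in an `@rtransform` macro call; return anything
# else (other macro calls, line number nodes) unchanged.
function wrap_line(ex)
    is_column_assignment(ex) || return ex
    return Expr(:macrocall, Symbol("@rtransform"), LineNumberNode(0, :none), ex)
end

# Rewrite only the top-level statements of a `begin ... end` block.
function rewrite_block(block::Expr)
    block.head === :block || error("expected a begin ... end block")
    return Expr(:block, map(wrap_line, block.args)...)
end

# Parsing does not expand macros, so @rsubset and @orderby need not be defined.
input = Meta.parse("""
begin
    :B = mod(:A, 3)
    @rsubset :B == 1
    @orderby :A
    :H = :B * 2
end""")

output = rewrite_block(input)

# Drop line number nodes to leave just the four statements.
statements = filter(x -> !(x isa LineNumberNode), output.args)
```

Here statements holds four macro calls: the two assignment lines are now wrapped in @rtransform, while @rsubset and @orderby pass through untouched.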
It's just a bit of syntactic sugar.
As @pdeffebach says, 90% of lines are @rtransform.
It allows you to not have to keep writing it.
So the first assignment operation present in the block drops grouping.
Lines of the form :columnName = ... are changed to @rtransform :columnName = ...
But not if they are within a sub-block (e.g. @by or @transform).
@chainWithrTransform begin
    DataFrame(A=1:4)
    :B = mod(:A,2)
    @by :B begin
        :A = mean(:A)
    end
    :C = :B / :A
    @transform :sumA = sum(:A)
end
becomes
@chain begin
    DataFrame(A=1:4)
    @rtransform :B = mod(:A,2)
    @by :B begin
        :A = mean(:A) # unchanged since it is within the @by block
    end
    @rtransform :C = :B / :A
    @transform :sumA = sum(:A) # unchanged since it is within the @transform block
end
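One way to check the "not within a sub-block" rule is to compare the assignments that are direct children of the outer block with every assignment found anywhere in the expression tree. The sketch below is Base Julia only, and the helper names (top_level_assignments, all_assignments) are hypothetical. It assumes the rule is simply "only direct children of the outer block are wrapped", which is my reading of the example above rather than a confirmed design.

```julia
# Hypothetical helpers, not part of DataFramesMeta.

# True when `ex` looks like `:col = rhs`.
is_column_assignment(ex) =
    ex isa Expr && ex.head === :(=) && ex.args[1] isa QuoteNode

# Assignments that are direct children of the outer block: these are the
# lines that would be wrapped in @rtransform.
top_level_assignments(block::Expr) = filter(is_column_assignment, block.args)

# Every column assignment at any depth, found by walking the whole tree.
function all_assignments(ex, found = Expr[])
    if ex isa Expr
        is_column_assignment(ex) && push!(found, ex)
        foreach(arg -> all_assignments(arg, found), ex.args)
    end
    return found
end

# Parsing does not expand macros, so @by and @transform need not be defined.
block = Meta.parse("""
begin
    DataFrame(A = 1:4)
    :B = mod(:A, 2)
    @by :B begin
        :A = mean(:A)
    end
    :C = :B / :A
    @transform :sumA = sum(:A)
end""")

wrapped = top_level_assignments(block)

# Assignments that sit inside another macro call and would be left alone
# (compared by identity, so equal-looking lines are not confused).
skipped = [a for a in all_assignments(block) if !any(w -> w === a, wrapped)]
```

With this input, wrapped contains the :B and :C lines, and skipped contains the :A line inside @by and the :sumA line that is already an argument of @transform.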
So the first assignment operation present in the block drops grouping.
Yeah. It would drop grouping.