Request - grouped by columns available as single values rather than vectors
Would it be possible, within a @by block, to make the grouped-by columns available as single values rather than vectors?
In the example below, I'd like to create a column of myCurve structs, but because the :name column comes through as a vector, it only works for the myCurve_name_vec structs. I could convert it, but that wouldn't be as clean.
More generally, if you are grouping by a column, any related calculations would likely use that column as a single value.
using DataFrames, DataFramesMeta, Parameters

@with_kw struct myCurve
    name::Symbol
    curve::Vector{Int64}
end
@with_kw struct myCurve_name_vec
    name::Vector{Symbol}
    curve::Vector{Int64}
end
d = DataFrame( name=[:a,:a,:a,:b,:b,:b], curve=[1,2,3,11,12,13] )
@by d :name :x = myCurve_name_vec( AsTable(:)... )  # works: :name arrives as a Vector{Symbol}
@by d :name :x = myCurve( AsTable(:)... )           # doesn't work: :name is a vector, not a Symbol
@Lincoln-Hannah - indeed, I also often need this. I understand that this is a request for DataFramesMeta.jl.
The only issue is mixing grouping and non-grouping columns. Maybe something like @val(:name) inside @by could be better instead (to distinguish taking :name as a column and @val(:name) as a value).
The name @val is tentative.
What you can currently do is use first(:name) to get it, so maybe that would be enough for you (and it would just require documentation)?
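For reference, a minimal sketch of the first(:name) workaround applied to the example from the original post. It assumes DataFrames.jl and DataFramesMeta.jl are loaded, and it uses a plain struct as a stand-in for the @with_kw version, since only the positional constructor is needed here.

```julia
using DataFrames, DataFramesMeta

# Plain-struct stand-in for the @with_kw struct in the original post.
struct myCurve
    name::Symbol
    curve::Vector{Int64}
end

d = DataFrame(name = [:a, :a, :a, :b, :b, :b], curve = [1, 2, 3, 11, 12, 13])

# Inside @by, :name is a vector holding the same value for every row of the
# group, so first(:name) recovers the single group value.
# collect(:curve) copies the group's view into a plain Vector{Int64}.
res = @by d :name :x = myCurve(first(:name), collect(:curve))
```

Each row of `res` then holds one myCurve in the :x column, with a scalar `name` and a vector `curve`.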
@Lincoln-Hannah Can I have more information on your use-case?
I also do this all the time, but first(:name) is enough for me.
See related request: https://github.com/mauro3/Parameters.jl/issues/153
I'd like to move between DataFrames and arrays of structs as effortlessly as possible.
If I create a struct with field names matching the columns of a database query, I'd like to convert the query result into an array of structs in one line. Something like:
@rtransform df :mystruct = mystruct(; AsTable(:)... )
A struct derived from a grouped DataFrame will have single-value fields for the group-by columns and vector fields for the non-group-by columns.
Okay so you would like
@rtransform df :mystruct = mystruct(; AsTable(:)... )
to not return a DataFrame? Rather, you want it to return a Vector?
I still need more information on what you want. What is the output you desire? Give it as a Julia object, not a description.
Sorry Peter. My bad. I was trying to isolate the key line. To get to a vector there would be an additional line.
@chain begin
    @rtransform df :mystruct = mystruct(; AsTable(:)... )
    _.mystruct
end
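For comparison, here is a rough sketch of the same row-to-struct conversion written with plain DataFrames.jl. The struct `MyRow` and the sample data are hypothetical, and `Base.@kwdef` stands in for `@with_kw` to provide the keyword constructor.

```julia
using DataFrames

# Hypothetical struct whose field names match the columns of `df`.
Base.@kwdef struct MyRow
    a::Int
    b::Int
end

df = DataFrame(a = [1, 2, 3], b = [11, 12, 13])

# AsTable(:) with ByRow passes each row to the function as a NamedTuple;
# splatting it after `;` turns the column names into keyword arguments.
with_structs = transform(df, AsTable(:) => ByRow(nt -> MyRow(; nt...)) => :mystruct)

# Pull the new column out as a vector of structs.
structs = with_structs.mystruct
```

This only works when the struct's field names line up exactly with the column names; an extra or missing column raises an error in the keyword constructor.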
Actually, more often I'd put the result in a Dictionary. Example.
using Dictionaries
@with_kw struct myStruct
    a::Int64
    b::Int64
    c::Vector{Int64}
    d::Vector{Int64}
end
dict_of_structs = @chain begin
    DataFrame( a=[1,1,2,2], b=[11,11,12,12], c=1:4, d=11:14 )
    @by [:a,:b] :x = myStruct(; AsTable(:)... )
    Dictionary( _.a, _.x )
end
Here AsTable(:) would produce one named tuple per group (that is, per output row), in which the group-by columns are single values and the other columns are vectors or subarrays (as usual).
[ (a=1,b=11,c=[1,2],d=[11,12]),
(a=2,b=12,c=[3,4],d=[13,14]) ]
Each row becomes a myStruct. The last line creates a dictionary.
Dictionary
1 | myStruct(a=1,b=11,c=[1,2],d=[11,12])
2 | myStruct(a=2,b=12,c=[3,4],d=[13,14])
We can then apply a function to any element
myFunc( dict_of_structs[1] )
or broadcast over all
myFunc.( dict_of_structs )
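Until something like this is supported directly, the grouped result described above can be approximated by hand. The following is only a sketch under a few assumptions: it uses plain DataFrames.jl rather than @by, `Base.@kwdef` in place of `@with_kw`, and a Base `Dict` in place of `Dictionary` from Dictionaries.jl.

```julia
using DataFrames

Base.@kwdef struct myStruct
    a::Int64
    b::Int64
    c::Vector{Int64}
    d::Vector{Int64}
end

df = DataFrame(a = [1, 1, 2, 2], b = [11, 11, 12, 12], c = 1:4, d = 11:14)
gdf = groupby(df, [:a, :b])

# For each group, the scalar fields come from the group key and the vector
# fields come from the remaining (non-grouping) columns of that group.
structs = [
    myStruct(; NamedTuple(key)...,
               (col => collect(group[!, col]) for col in valuecols(gdf))...)
    for (key, group) in pairs(gdf)
]

# Key the structs by their :a value, as in the example above.
dict_of_structs = Dict(s.a => s for s in structs)
```

Keying by `s.a` alone mirrors the original example. If :a did not identify a group uniquely, a tuple such as `(s.a, s.b)` would be the safer key.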