RData.jl

Import R data frame attributes as metadata

Open nalimilan opened this issue 3 years ago • 2 comments

Use the metadata functions that are going to be added to DataAPI and DataFrames (https://github.com/JuliaData/DataFrames.jl/pull/3055) to import R data.frame attributes and set them as DataFrame metadata. R stores per-column attributes in the vector objects themselves, while DataFrames.jl stores them in the DataFrame object, as there is no generic mechanism to attach metadata to an AbstractVector object.

The row.names attribute is skipped as it is not appropriate to store it as global metadata given that it will get out of sync after subsetting rows. We could provide a way to turn row names into a column instead.
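As a rough illustration of the mapping described above (not the code in this PR), here is a minimal sketch assuming DataFrames.jl 1.4 or later, where `metadata!` and `colmetadata!` are available. The keys `"comment"` and `"label"`, the values, and the `rownames` column name are made up for the example; they are not necessarily what RData.jl uses.

```julia
using DataFrames  # assumes DataFrames.jl >= 1.4, which provides the metadata API

df = DataFrame(id = [1, 2, 3], score = [4.5, 3.0, 5.0])

# Table-level metadata, analogous to an attribute set on the R data.frame itself.
# The key "comment" is a hypothetical example.
metadata!(df, "comment", "survey wave 1"; style = :note)

# Column-level metadata is stored in the DataFrame and keyed by column,
# since a plain Vector has nowhere to carry attributes.
colmetadata!(df, :score, "label", "Satisfaction score"; style = :note)

comment = metadata(df, "comment")
label = colmetadata(df, :score, "label")

# Row names turned into an ordinary column (hypothetical name `rownames`),
# so that they stay aligned with the data when rows are subset.
insertcols!(df, 1, :rownames => ["a", "b", "c"])
sub = df[df.score .> 4, :]
```

Keeping row names as a column means `sub.rownames` follows the selected rows, which global metadata could not do.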

Also add methods to check equality between two DictoVec objects, as these are useful for tests (haven commonly sets named numeric vectors to store value labels, so this case deserves testing).

(Tests pass locally when using the DataAPI and DataFrames branches.)

nalimilan avatar May 26 '22 11:05 nalimilan

Thanks! I'll try to review it in the next few days.

alyst avatar May 26 '22 18:05 alyst

@alyst - we still need to decide on the best API in https://github.com/JuliaData/DataAPI.jl/pull/48 before finalizing this PR, so if you have any thoughts on this please comment there. Thank you!

bkamins avatar May 26 '22 18:05 bkamins

Codecov Report

Base: 85.75% // Head: 86.64% // Increases project coverage by +0.88% :tada:

Coverage data is based on head (2724f62) compared to base (5df74cf). Patch coverage: 97.14% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #93      +/-   ##
==========================================
+ Coverage   85.75%   86.64%   +0.88%     
==========================================
  Files          14       14              
  Lines         646      689      +43     
==========================================
+ Hits          554      597      +43     
  Misses         92       92              
Impacted Files Coverage Δ
src/convert.jl 95.73% <96.42%> (+0.76%) :arrow_up:
src/DictoVec.jl 96.20% <100.00%> (-1.02%) :arrow_down:
src/RData.jl 100.00% <100.00%> (ø)
src/context.jl 100.00% <0.00%> (ø)
src/sxtypes.jl 76.13% <0.00%> (+0.27%) :arrow_up:
src/identifier.jl 92.59% <0.00%> (+0.28%) :arrow_up:
src/readers.jl 74.28% <0.00%> (+0.29%) :arrow_up:
... and 3 more

codecov[bot] avatar Oct 09 '22 20:10 codecov[bot]

Ah, DataFrames 1.4 requires Julia 1.6, so we have to drop support for older Julia versions here too. But the convention is to bump the minor release when doing that, so that we would still be able to tag a bugfix release on 1.0 if necessary. Maybe take the occasion to tag RData 1.0?

nalimilan avatar Oct 09 '22 20:10 nalimilan

I think tagging 1.0.0 is a good idea.

alyst avatar Oct 09 '22 20:10 alyst

Thanks! See https://github.com/JuliaRegistries/General/pull/69824

nalimilan avatar Oct 10 '22 06:10 nalimilan