readr
Suggestion: parse_factor() with labels
base::factor() has a labels parameter, which defaults to levels.
This is quite convenient when the data contains coded values, such as 1 for male and 2 for female:
x <- c("1", "2", "2", "1", "1")
factor(x, levels = 1:2, labels = c("male", "female"))
#> [1] male female female male male
#> Levels: male female
parse_factor() and col_factor() have no such option, so the relabeling has to be done after the data is loaded.
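For reference, the current two-step workaround might look like the sketch below. The temporary file and its contents are made up for illustration, and the levels are given as character strings because that is what readr's col_factor() expects today.

```r
library(readr)

# Write a small illustrative file to a temporary location (made-up data).
path <- tempfile(fileext = ".csv")
writeLines(c("id,sex", "1,1", "2,1", "3,2"), path)

# Step 1: parse the coded column as a factor with character levels.
df <- read_csv(
  path,
  col_types = cols(id = col_integer(), sex = col_factor(levels = c("1", "2")))
)

# Step 2: relabel after loading. This is the extra step that a
# `labels` argument on col_factor()/parse_factor() would make unnecessary.
df$sex <- factor(df$sex, levels = c("1", "2"), labels = c("male", "female"))
```

The second step has to be repeated for every coded column, which is why doing it inside the column specification would be more convenient.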
My suggestion is to add a new labels parameter to parse_factor() and col_factor().
Example usage:
id,sex,...
1,1,...
2,1,...
3,2,...
sex_spec <- col_factor(levels = 1:2, labels = c("male", "female"))
read_csv('mydata.csv', col_types = cols(sex = sex_spec))
#> A tibble: m x n
#> id sex ...
#> <int> <fct> ...
#> 1 1 male ...
#> 2 2 male ...
#> 3 3 female ...
I stumbled upon this issue and realized I had already upvoted it over a year ago. @jimhester Can I ask what is blocking https://github.com/tidyverse/readr/pull/882 from being merged, and whether there is anything that can be done to move it forward?