What is the type of ds.A?
ds = Dataset(A = ["a", "b", "a", "b"], B = [1, 2, 3, 4])
julia> ds.A==ds[:,:A]
true
julia> typeof(ds.A)
DatasetColumn{Dataset, Vector{Union{Missing, String}}}
julia> typeof(ds[:,:A])
Vector{Union{Missing, String}} (alias for Array{Union{Missing, String}, 1})
julia> ds.A
4-element Vector{Union{Missing, String}}:
"a"
"b"
"a"
"b"
I tried to concatenate what I thought were two vectors, using [ds.A; ds.A]:
julia> [ds.A ; ds.A]
2-element Vector{DatasetColumn{Dataset, Vector{Union{Missing, String}}}}:
DatasetColumn{Dataset, Vector{Union{Missing, String}}}(1, 4×3 Dataset
Row │ A B C
│ identity identity identity
│ String? String? String?
─────┼──────────────────────────────
1 │ a no low
2 │ b yes low
3 │ a no hi
4 │ b no hi, Union{Missing, String}["a", "b", "a", "b"])
DatasetColumn{Dataset, Vector{Union{Missing, String}}}(1, 4×3 Dataset
Row │ A B C
│ identity identity identity
│ String? String? String?
─────┼──────────────────────────────
1 │ a no low
2 │ b yes low
3 │ a no hi
4 │ b no hi, Union{Missing, String}["a", "b", "a", "b"])
It is a DatasetColumn, a customised structure that wraps a column of a data set. It exists because we want to track any changes to a data set column. Changing a value in a column can change the following attributes of the data set:
- last modified time
- sorting
- grouping
- format
- ...
Thus, an abstract vector cannot be used for this purpose, and a customised type is used instead. Generally, we recommend using ds[:, :A] to extract columns, and the provided APIs to manipulate them.
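For the concatenation in the question, a minimal sketch of this recommendation might look like the following. It assumes the package is loaded with `using InMemoryDatasets` (the package name is not stated in this thread) and relies only on the fact, shown above, that ds[:, :A] returns a plain Vector. The names `col` and `doubled` are illustrative only.

```julia
using InMemoryDatasets  # assumption: the package that provides Dataset

ds = Dataset(A = ["a", "b", "a", "b"], B = [1, 2, 3, 4])

# ds[:, :A] gives a plain Vector{Union{Missing, String}},
# not a DatasetColumn wrapper, so Base concatenation works on it.
col = ds[:, :A]

# Vertical concatenation of two ordinary vectors (same as vcat(col, col)).
doubled = [col; col]

length(doubled)  # 8
```

Because `col` is an ordinary vector, `[col; col]` produces a flat 8-element vector rather than a 2-element vector of DatasetColumn wrappers.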
However, if you think a method must be defined for DatasetColumn, you are welcome to open a PR for it. The right location to add such methods is src/abstractdataset/dscol.jl.
As a side note: to repeat rows you can use repeat! or repeat, and to append data sets you can use append!.