setindex! incorrect for non-UTF-8 strings?
These two lines don't seem correct to me for non-UTF-8 AbstractString types:
https://github.com/JuliaData/WeakRefStrings.jl/blob/caf4ed477e493309d12502ab0984eec157120925/src/WeakRefStrings.jl#L369-L370
Indeed this will copy the contents of the string even if it uses a different encoding from existing data.
What would you suggest, though? Is there a standard API for getting the encoding of a string? Converting it to UTF-8? Or maybe, if it's not in the encoding of the rest of the array, we reject it?
AFAIK there's no API to get the encoding of a string, but that would be a logical complement to codeunit/codeunits. BTW, there's no guarantee that you can call pointer on an AbstractString and get a pointer to the data: one would need to use codeunits anyway, even if the encoding matched.
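For reference, a small sketch of what codeunit/codeunits report for a plain String. It only illustrates the existing Base functions; note that the code unit type alone does not identify the encoding, since a hypothetical Latin-1 string type would report UInt8 as well.

```julia
s = "é"                        # a String, which Julia stores as UTF-8

unit_type = codeunit(s)        # code unit type of the string: UInt8
units = collect(codeunits(s))  # raw code units, without calling pointer
n = ncodeunits(s)              # number of code units (2 here, for 1 character)
```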
Until a better API exists, I guess the only solution is to have a fast method for String with StringArray{<:Union{Missing, String}}, and a slower method that iterates over characters for other cases.
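A minimal sketch of that fast/slow split, assuming the goal is to obtain UTF-8 bytes before storing them. The helper name utf8_bytes is hypothetical and not part of WeakRefStrings.jl; the actual setindex! would need to append these bytes to the array's buffer.

```julia
# Hypothetical helper, for illustration only.

# Fast path: String is UTF-8, so its code units can be copied directly.
utf8_bytes(s::String) = Vector{UInt8}(codeunits(s))

# Slow path: any other AbstractString, whose encoding is unknown.
# Iterating yields characters, and writing a Char to an IO emits UTF-8.
function utf8_bytes(s::AbstractString)
    io = IOBuffer()
    for c in s
        write(io, c)
    end
    return take!(io)
end
```

The slow path relies only on character iteration, which every AbstractString must support, so it never touches pointer or assumes a particular code unit layout.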