ShortStrings.jl

Deprecate for WeakRefString's InlineString

Open oxinabox opened this issue 4 years ago • 4 comments

I don't care to duplicate efforts. InlineStrings is just a more complete implementation of this idea, with a few extra clever tricks.

cc @quinnj

oxinabox, Sep 21 '21 16:09

I think that deprecating ShortStrings should wait. InlineString doesn't completely replace ShortStrings (for example, it doesn't have a string macro to make using them more convenient), and it really needs some performance testing to see how it compares to other string types, such as String and ShortString. (I've already seen a few cases where it is substantially slower; in one simple case, 273x slower!)

ScottPJones, Sep 30 '21 00:09

Please share the performance checking you've done; I tested every function defined in the InlineStrings.jl package and it was faster or on par with everything in ShortStrings. I'll find a link to the perf testing I did, but it was pretty substantial.

quinnj, Sep 30 '21 02:09

I just did some simple tests of ==, comparing different sized InlineStrings to each other, to Strings, and to SubStrings. In many places, it looks like operations fall back to the ones in Base. In others, I believe the issue may be that when a Ref{T} is created (so that you can get a pointer and then call memcmp), some memory has to be allocated (where?) and the InlineString then stored there.

ScottPJones, Oct 01 '21 15:10
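
For readers who want to check the Ref{T}/memcmp hypothesis from the comment above, here is a minimal sketch using only Base. The helper name `memcmp_equal` is hypothetical and is not taken from InlineStrings.jl or ShortStrings.jl; it only shows the general pattern of boxing an isbits value in a `Ref` to obtain a pointer. Whether the `Ref` actually costs a heap allocation depends on the compiler, so it should be measured rather than assumed.

```julia
# Hypothetical helper (not the InlineStrings.jl implementation): compare two
# isbits values of the same type byte-for-byte by taking pointers to Refs.
# Note: for struct types with padding bytes, a raw memcmp may not be reliable;
# plain integer types such as UInt128 have no padding.
function memcmp_equal(a::T, b::T) where {T}
    ra, rb = Ref(a), Ref(b)
    # Keep the Refs rooted while raw pointers to them are in use.
    GC.@preserve ra rb begin
        pa = Base.unsafe_convert(Ptr{T}, ra)
        pb = Base.unsafe_convert(Ptr{T}, rb)
        ccall(:memcmp, Cint, (Ptr{Cvoid}, Ptr{Cvoid}, Csize_t),
              pa, pb, sizeof(T)) == 0
    end
end

a = UInt128(42)
b = UInt128(42)
c = UInt128(43)

@assert memcmp_equal(a, b)
@assert !memcmp_equal(a, c)

# Warm up first so compilation is not counted, then report bytes allocated.
memcmp_equal(a, b)
println("bytes allocated: ", @allocated memcmp_equal(a, b))
```

Running the same `@allocated` check on the package's own `==` methods would show whether the suspected allocation really occurs there.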

julia> x = InlineString("hey")
"hey"

julia> y = "hey"
"hey"

julia> typeof(x)
String3

julia> typeof(y)
String

julia> @btime x == y
  15.004 ns (0 allocations: 0 bytes)
true

julia> @btime y == y
  12.275 ns (0 allocations: 0 bytes)
true

julia> @btime x == x
  11.479 ns (0 allocations: 0 bytes)
true

julia> z = ShortString("hey")
"hey"

julia> typeof(z)
ShortString3 (alias for ShortString{UInt32})

julia> @btime y == z
  15.442 ns (0 allocations: 0 bytes)
true

julia> @btime y == y
  11.114 ns (0 allocations: 0 bytes)
true

julia> x = InlineString("a"^255)
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

julia> y = "a"^255
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

julia> z = ShortString("a"^255)
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

julia> @btime x == y
  34.052 ns (0 allocations: 0 bytes)
true

julia> @btime z == y
  925.439 ns (0 allocations: 0 bytes)
true

Sampling other sizes of InlineStrings/ShortStrings seems to show similar results. ShortStrings gets significantly slower, for some reason, in the 128/255-byte cases; otherwise, performance seems about equal for all other sizes.

quinnj, Oct 02 '21 05:10
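
One caveat when reproducing the timings above: the BenchmarkTools documentation recommends interpolating non-constant global variables into the benchmarked expression with `$`, since otherwise the cost of accessing the global is measured too. The sketch below repeats the short-string comparison with interpolation. It assumes the BenchmarkTools, InlineStrings, and ShortStrings packages are installed, and uses only the constructors already shown in the session above. No timings are given here because they depend on the machine and on package versions.

```julia
using BenchmarkTools   # provides @btime
using InlineStrings    # provides InlineString
using ShortStrings     # provides ShortString

x = InlineString("hey")
y = "hey"
z = ShortString("hey")

# Sanity check before timing: all three compare equal to the plain String.
@assert x == y
@assert z == y

# `$` splices the current values into the benchmark, so the measurement
# covers the comparison itself rather than the global variable lookup.
@btime $x == $y
@btime $z == $y
@btime $y == $y
```

For sub-20 ns measurements like the three-byte case, the overhead of an uninterpolated global can be a large share of the reported time, so the interpolated form may be the fairer comparison between the string types.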