DocOpt.jl
DocOpt.jl copied to clipboard
Performance/safety considerations for `partition`
Changes
- Performance: made
partitionaround 50 times faster, and considerably better memory usage; - String safety: use
previndandnextindinstead of- 1and+ 1respectively; - A Julian touch: return an
NTuple{3, SubString{String}}, rather thanNTuple{3, String}.
P.S., I know this package isn't worked on an awful lot, but any performance improvements are good, right?...
Benchmarking results:
For benchmarking, the original partitions function has kept its name, and my updated partitions function is called my_partition. For benchmarking, I have generated a random string 3 times, and for each string I choose a random character, and benchmark both functions with the character and string version of that character.
[ Info: Benchmarking methods for random string of length 100000
[ Info: 1/3: Checking partition for character C
6.913 μs (4 allocations: 97.86 KiB)
[ Info: 1/3: Checking partition for string C
7.285 μs (4 allocations: 97.86 KiB)
[ Info: 1/3: Checking my_partition for character C
146.461 ns (2 allocations: 112 bytes)
[ Info: 1/3: Checking my_partition for string C
146.576 ns (2 allocations: 112 bytes)
[ Info: 2/3: Checking partition for character t
7.097 μs (4 allocations: 97.86 KiB)
[ Info: 2/3: Checking partition for string t
6.822 μs (4 allocations: 97.86 KiB)
[ Info: 2/3: Checking my_partition for character t
146.713 ns (2 allocations: 112 bytes)
[ Info: 2/3: Checking my_partition for string t
145.491 ns (2 allocations: 112 bytes)
[ Info: 3/3: Checking partition for character t
7.213 μs (4 allocations: 97.84 KiB)
[ Info: 3/3: Checking partition for string t
7.149 μs (4 allocations: 97.84 KiB)
[ Info: 3/3: Checking my_partition for character t
148.481 ns (2 allocations: 112 bytes)
[ Info: 3/3: Checking my_partition for string t
147.291 ns (2 allocations: 112 bytes)
Benchmarking script:
function partition(s::AbstractString, delim::AbstractString)
range = findfirst(delim, s)
if range == nothing
# no match
return s, "", ""
elseif length(range) == 1
# delim is a single character
return s[1:range[1]-1], delim, s[range[1]+1:end]
else
start, stop = range
return s[1:start-1], delim, s[stop+1:end]
end
end
partition(s::AbstractString, delim::Char) = partition(s::AbstractString, string(delim))
function my_partition(str::AbstractString, delim::Union{AbstractString, AbstractChar})
indices = findfirst(delim, str)
if isnothing(indices)
empty_substring = SubString("")
return SubString(str), empty_substring, empty_substring
else
return SubString(str, 1, prevind(str, first(indices))),
SubString(delim isa AbstractChar ? string(delim) : delim),
SubString(str, nextind(str, last(indices)))
end
end
using BenchmarkTools, Random
str_len = 100_000
ntries = 3
@info "Benchmarking methods for random string of length $str_len\n"
for i in 1:ntries
global str = randstring(str_len)
global rand_c = rand(str)
global rand_str = string(rand_c)
for f in [:partition, :my_partition]
@info "$i/$ntries: Checking $f for character $rand_c"
@eval @btime $f(str, rand_c)
@info "$i/$ntries: Checking $f for string $rand_c"
@eval @btime $f(str, rand_c)
end
end