Improve `String#casecmp?` and `Symbol#casecmp?` performance
Background
String#casecmp? method was suggested to RuboCop Performance to use as a fast method.
https://github.com/rubocop-hq/rubocop-performance/issues/100
However String#casecmp? is always slower than String#casecmp("arg").zero?.
Summary
This PR improves String#casecmp? and Symbol#casecmp? performance when comparing ASCII characters.
OTOH, in the case of Non-ASCII characters, performance is slightly degraded by added condition.
And I found #1668 which is essentially the same.
Benchmark
The following is a benchmark case for String#casecmp? with similar cases.
require 'benchmark/ips'
puts '* ASCII'
Benchmark.ips do |x|
x.report('upcase') { 'String'.upcase == 'string' }
x.report('downcase') { 'String'.downcase == 'string' }
x.report('casecmp.zero?') { 'String'.casecmp('string').zero? }
x.report('casecmp?') { 'String'.casecmp?('string') }
x.compare!
end
puts '* Non-ASCII'
Benchmark.ips do |x|
x.report('upcase') { 'äö '.upcase == ('ÄÖÜ') }
x.report('downcase') { 'äö '.downcase == ('ÄÖÜ') }
x.report('casecmp.zero?') { 'äö '.casecmp('ÄÖÜ').zero? }
x.report('casecmp?') { 'äö '.casecmp?('ÄÖÜ') }
x.compare!
end
Before
* ASCII
Warming up --------------------------------------
upcase 200.428k i/100ms
downcase 197.503k i/100ms
casecmp.zero? 216.244k i/100ms
casecmp? 171.257k i/100ms
Calculating -------------------------------------
upcase 4.326M (± 2.9%) i/s - 21.646M in 5.008511s
downcase 4.350M (± 2.3%) i/s - 21.923M in 5.042694s
casecmp.zero? 4.998M (± 3.7%) i/s - 25.084M in 5.025253s
casecmp? 3.090M (± 2.9%) i/s - 15.584M in 5.047497s
Comparison:
casecmp.zero?: 4998357.4 i/s
downcase: 4349766.0 i/s - 1.15x slower
upcase: 4325765.3 i/s - 1.16x slower
casecmp?: 3090194.2 i/s - 1.62x slower
* Non-ASCII
Warming up --------------------------------------
upcase 137.352k i/100ms
downcase 136.735k i/100ms
casecmp.zero? 206.341k i/100ms
casecmp? 96.326k i/100ms
Calculating -------------------------------------
upcase 2.215M (± 2.7%) i/s - 11.126M in 5.027582s
downcase 2.199M (± 2.5%) i/s - 11.076M in 5.038781s
casecmp.zero? 4.709M (± 1.8%) i/s - 23.729M in 5.041362s
casecmp? 1.315M (± 1.6%) i/s - 6.646M in 5.054479s
Comparison:
casecmp.zero?: 4708502.7 i/s
upcase: 2214590.5 i/s - 2.13x slower
downcase: 2199478.9 i/s - 2.14x slower
casecmp?: 1315326.8 i/s - 3.58x slower
After
casecmp? performance for ASCII characters will be improved.
* ASCII
Warming up --------------------------------------
upcase 206.368k i/100ms
downcase 205.971k i/100ms
casecmp.zero? 225.636k i/100ms
casecmp? 232.030k i/100ms
Calculating -------------------------------------
upcase 4.337M (± 1.3%) i/s - 21.875M in 5.045187s
downcase 4.297M (± 1.5%) i/s - 21.627M in 5.033750s
casecmp.zero? 5.240M (± 1.4%) i/s - 26.399M in 5.038613s
casecmp? 5.548M (± 1.5%) i/s - 27.844M in 5.019895s
Comparison:
casecmp?: 5547954.3 i/s
casecmp.zero?: 5240466.7 i/s - 1.06x slower
upcase: 4336522.3 i/s - 1.28x slower
downcase: 4297370.8 i/s - 1.29x slower
* Non-ASCII
Warming up --------------------------------------
upcase 141.841k i/100ms
downcase 141.233k i/100ms
casecmp.zero? 215.401k i/100ms
casecmp? 94.625k i/100ms
Calculating -------------------------------------
upcase 2.238M (± 2.1%) i/s - 11.205M in 5.010247s
downcase 2.243M (± 1.3%) i/s - 11.299M in 5.038967s
casecmp.zero? 4.611M (± 2.6%) i/s - 23.048M in 5.001970s
casecmp? 1.255M (± 2.4%) i/s - 6.340M in 5.055611s
Comparison:
casecmp.zero?: 4610838.3 i/s
downcase: 2242621.9 i/s - 2.06x slower
upcase: 2237553.1 i/s - 2.06x slower
casecmp?: 1254739.1 i/s - 3.67x slower
Symbol#casecmp? have similar results.
Other Information
This is an off-topic for this PR.
I inform RuboCop Performance that String#casecmp and String#casecmp? behave differently when using Non-ASCII characters.
'äöü'.casecmp('ÄÖÜ').zero? #=> false
'äöü'.casecmp?('ÄÖÜ') #=> true
You can add a benchmark script under benchmark/, like 5e897227ff3d37a36be96bb2c082370d437058ea, instead of writing in a commit log, and -omarkdown options makes a markdown table.
Iteration per second (i/s)
| compare-ruby | built-ruby | |
|---|---|---|
| casecmp_p-1 | 2.245M | 11.624M |
| casecmp_p-10 | 1.436M | 1.955M |
| casecmp_p-100 | 283.525k | 214.075k |
| casecmp_p-1000 | 30.240k | 21.550k |
| casecmp_p-nonascii1 | 669.295k | 656.785k |
| casecmp_p-nonascii10 | 112.033k | 111.158k |
| casecmp_p-nonascii100 | 12.075k | 11.985k |
| casecmp_p-nonascii1000 | 1.201k | 1.192k |
Related bug tracker issue https://bugs.ruby-lang.org/issues/13750
Rebased this PR, and run with adding a benchmark as the following,
diff --git i/benchmark/string_casecmp_p.yml w/benchmark/string_casecmp_p.yml
index a790ce7d55..8bb581aed7 100644
--- i/benchmark/string_casecmp_p.yml
+++ w/benchmark/string_casecmp_p.yml
@@ -1,2 +1,4 @@
prelude: |
+ cstr = 'String'
+ lstr = 'string'
lstr1 = [*"a".."z",*"0".."9"].join("")
@@ -18,2 +20,3 @@ prelude: |
benchmark:
+ casecmp_p-ascii: cstr.casecmp?(lstr)
casecmp_p-1: lstr1.casecmp?(ustr1)
but can't see significant differences now...
Iteration per second (i/s)
| compare-ruby | built-ruby | |
|---|---|---|
| casecmp_p-ascii | 7.607M | 7.601M |
| 1.00x | - | |
| casecmp_p-1 | 2.238M | 2.281M |
| - | 1.02x | |
| casecmp_p-10 | 1.399M | 1.414M |
| - | 1.01x | |
| casecmp_p-100 | 271.704k | 274.164k |
| - | 1.01x | |
| casecmp_p-1000 | 29.003k | 30.158k |
| - | 1.04x | |
| casecmp_p-nonascii1 | 655.054k | 672.244k |
| - | 1.03x | |
| casecmp_p-nonascii10 | 108.929k | 111.451k |
| - | 1.02x | |
| casecmp_p-nonascii100 | 11.656k | 11.808k |
| - | 1.01x | |
| casecmp_p-nonascii1000 | 1.163k | 1.180k |
| - | 1.01x |
It's still would be useful for me. Pretty common case, especially when File::FNM_CASEFOLD is broken for Dir.glob in Ruby 3.1 on Linux.