`memchr` vs `stringzilla` performance comparison

Open RoloEdits opened this issue 1 year ago • 2 comments

Came across a benchmarking comparison between the two.

Notably the results:

| | ASCII ⏩ | ASCII ⏪ | UTF8 ⏩ | UTF8 ⏪ |
|---|---|---|---|---|
| Intel: | | | | |
| memchr | 5.89 GB/s | 1.08 GB/s | 8.73 GB/s | 3.35 GB/s |
| stringzilla | 8.37 GB/s | 8.21 GB/s | 11.21 GB/s | 11.20 GB/s |
| Arm: | | | | |
| memchr | 6.38 GB/s | 1.12 GB/s | 13.20 GB/s | 3.56 GB/s |
| stringzilla | 6.56 GB/s | 5.56 GB/s | 9.41 GB/s | 8.17 GB/s |
| Average | 1.2x faster | 6.2x faster | - | 2.8x faster |

It's noted that the Rust crate doesn't cover the full C++ API yet, though that is planned. In the interest of performance, I thought I would share the benchmark results so that anyone interested can explore them and see whether the potential gains make one crate or the other a better fit.

RoloEdits avatar Feb 24 '24 06:02 RoloEdits

Very interesting! If I understand correctly, these are results of benchmarking the crates themselves, and you didn't integrate stringzilla into quick-xml, right? I'm always open to performance improvements, so if you like, you can create a PR with a replacement so everyone can experiment with the change. Note, however, that these results probably come from searching for small patterns in long strings. XML usually has a different access pattern: many searches for small patterns in small strings. As you can see from quick-xml's own benchmarks, maybe_xml is even faster than quick-xml in most cases, although it does not use any SIMD libs. quick-xml wins only on very long XMLs (several megabytes), which I think is usually a rare case.
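To make the difference between those two access patterns concrete, here is a rough, std-only sketch rather than a real benchmark. It runs the same byte search over a haystack where the needle is rare (long scans per call) and over an XML-like haystack where `<` appears every few bytes (many short scans). The names `FindByte`, `naive_find`, `count_hits`, and `timed` are made up for illustration, and the inputs are synthetic. `naive_find` is only a stand-in: it has the same shape as `memchr::memchr`, so a candidate search routine with that signature could be passed in its place.

```rust
use std::time::{Duration, Instant};

/// Shape of the search routine under test: index of the first `needle`
/// in `haystack`, if any. (Hypothetical alias, for illustration only.)
type FindByte = fn(u8, &[u8]) -> Option<usize>;

/// Std-only stand-in so the sketch builds without extra crates.
fn naive_find(needle: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

/// Count occurrences of `needle` by restarting the search after each hit,
/// roughly the way a tokenizer hops from one delimiter to the next.
fn count_hits(find: FindByte, needle: u8, mut haystack: &[u8]) -> usize {
    let mut hits = 0;
    while let Some(i) = find(needle, haystack) {
        hits += 1;
        haystack = &haystack[i + 1..];
    }
    hits
}

/// Run `count_hits` once and report how long it took.
fn timed(find: FindByte, needle: u8, haystack: &[u8]) -> (usize, Duration) {
    let start = Instant::now();
    let hits = count_hits(find, needle, haystack);
    (hits, start.elapsed())
}

fn main() {
    // Synthetic inputs, assumed for illustration; not from the linked benchmark.
    // Sparse: a `<` roughly every 64 KiB, so each call scans a long run.
    let sparse = format!("{}<", " ".repeat(65_536)).repeat(32).into_bytes();
    // Dense: XML-like, a `<` every few bytes, so each call scans a short run.
    let dense = "<a>x</a>".repeat(262_144).into_bytes();

    for (name, doc) in [("sparse", &sparse), ("dense", &dense)] {
        let (hits, took) = timed(naive_find, b'<', doc);
        println!("{}: {} bytes, {} hits, {:?}", name, doc.len(), hits, took);
    }
}
```

Both inputs are about 2 MiB, so any gap in the timings comes from how the work is split across calls, not from the amount of data. A single run like this is noisy; a proper comparison would use a benchmark harness and real documents.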

Mingun avatar Feb 24 '24 13:02 Mingun

BurntSushi provided a response on Reddit. It seems the benchmarks are a bit misleading: there are some circumstances in which StringZilla is faster, but on average it seems to be slower.

https://www.reddit.com/r/rust/comments/1ayngf6/memchr_vs_stringzilla_benchmarks_up_to_7x/

dralley avatar Feb 25 '24 00:02 dralley