
[SPARK-49490][SQL] Add benchmarks for initCap

mrk-andreev opened this pull request 1 year ago • 3 comments

What changes were proposed in this pull request?

Add benchmarks for all code paths of initCap, namely the paths that call:

  • execBinaryICU
  • execBinary
  • execLowercase
  • execICU

Why are the changes needed?

Requested by Jira ticket SPARK-49490.

Does this PR introduce any user-facing change?

No

How was this patch tested?

The benchmark was tested locally by performing a manual run.

Was this patch authored or co-authored using generative AI tooling?

No

mrk-andreev avatar Oct 16 '24 16:10 mrk-andreev

Results of a local run: InitCapBenchmark-local.txt

Sample

Running benchmark: InitCap evaluation [wc=1000, wl=16, capitalized=false]
  Running case: execICU
  Stopped after 8978 iterations, 2000 ms
  Running case: execBinaryICU
  Stopped after 6235 iterations, 2000 ms
  Running case: execBinary
  Stopped after 28374 iterations, 2000 ms
  Running case: execLowercase
  Stopped after 8839 iterations, 2000 ms

OpenJDK 64-Bit Server VM 17.0.2+8-86 on Linux 5.15.0-122-generic
Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
InitCap evaluation [wc=1000, wl=16, capitalized=false]:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------
execICU                                                             0              0           0     432768.3           0.0       1.0X
execBinaryICU                                                       0              0           0     285450.1           0.0       0.7X
execBinary                                                          0              0           0    1494256.8           0.0       3.5X
execLowercase                                                       0              0           0     415082.4           0.0       1.0X
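
For readers of the table above: the Relative column is consistent with each case's Rate(M/s) divided by the first case's rate, rounded to one decimal. This is a reading of the output, not a quote of Spark's benchmark code; the check can be sketched as:

```java
// Verify the Relative column of the table above: each case's rate divided by
// the baseline (first case, execICU) rate, rounded to one decimal place.
public class RelativeCheck {
    public static void main(String[] args) {
        double baseline = 432768.3; // execICU rate, M/s
        double[] rates = {432768.3, 285450.1, 1494256.8, 415082.4};
        String[] names = {"execICU", "execBinaryICU", "execBinary", "execLowercase"};
        for (int i = 0; i < rates.length; i++) {
            System.out.printf("%s: %.1fX%n", names[i], rates[i] / baseline);
        }
    }
}
```

This reproduces the 1.0X / 0.7X / 3.5X / 1.0X values shown in the table.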

Open questions

  1. Should we place the benchmark code in the same package, unsafe, or at the SQL level? If it's in unsafe, should we extract the shared benchmark code into a shared library?
  2. The benchmark output expects each measurement to be at least 1 ms, but this isn't the case here. Should we align the rounding to the first non-zero digit after the decimal point?
  3. How detailed do we expect the benchmarks to be? Do we want different axes of variation, or should we stick with the default parameters?
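
Regarding question 2, one possible rounding scheme is to render sub-millisecond timings at their first non-zero decimal digit instead of truncating them to 0. This is a hypothetical helper, not Spark's Benchmark formatting code:

```java
// Hypothetical formatter: render a sub-millisecond timing at its first
// non-zero decimal digit instead of rounding it down to "0" ms.
public class SignificantMs {
    static String formatMs(double ms) {
        if (ms >= 1.0) return String.format("%.0f", ms);
        if (ms <= 0.0) return "0";
        // Number of decimals needed to reach the first non-zero digit.
        int digits = (int) Math.ceil(-Math.log10(ms));
        return String.format("%." + digits + "f", ms);
    }

    public static void main(String[] args) {
        System.out.println(formatMs(0.0023)); // 0.002 rather than 0
        System.out.println(formatMs(0.5));    // 0.5
    }
}
```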

mrk-andreev avatar Oct 16 '24 17:10 mrk-andreev

Can we include the benchmark result files too? See also "Testing with GitHub Actions workflow" at https://spark.apache.org/developer-tools.html

HyukjinKwon avatar Oct 17 '24 00:10 HyukjinKwon

Let's place the benchmark at the SQL level for now.

Done

Can we include the benchmark result files too?

Done

mrk-andreev avatar Oct 19 '24 10:10 mrk-andreev

@uros-db @mihailom-db @viktorluc-db Could you review this PR, please?

MaxGekk avatar Oct 22 '24 12:10 MaxGekk

@mrk-andreev Could you integrate your benchmark into CollationBenchmark, please, as @uros-db pointed out in https://github.com/apache/spark/pull/48501#pullrequestreview-2385543767? Otherwise we might forget to re-run your benchmark while benchmarking collation-related code.

MaxGekk avatar Nov 07 '24 19:11 MaxGekk

@mrk-andreev Could you integrate your benchmark into CollationBenchmark, please, as @uros-db pointed out in https://github.com/apache/spark/pull/48501#pullrequestreview-2385543767? Otherwise we might forget to re-run your benchmark while benchmarking collation-related code.

@MaxGekk, done.

mrk-andreev avatar Nov 12 '24 21:11 mrk-andreev

cc: @MaxGekk

Related work

This is not related to my code changes but rather to the benchmarks we are modifying. It might be worth starting a separate thread in the dev mailing list or creating an additional ticket in Jira, which I would be happy to handle.

Blackhole

I would like to point out that the current implementation of org.apache.spark.benchmark.Benchmark::addCase does not use any form of blackhole (such as JMH's Blackhole), which could lead to dead-code elimination. However, I have not observed this issue in the existing tests, likely because the complexity and side effects of the benchmarked code prevent such elimination.

Would it be a good idea to consider adding this as a feature in the future?

Context

org.apache.spark.benchmark.Benchmark::addCase

  def addCase(name: String, numIters: Int = 0)(f: Int => Unit): Unit = {
    addTimerCase(name, numIters) { timer =>
      timer.startTiming()
      f(timer.iteration)
      timer.stopTiming()
    }
  }
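
To make the concern concrete, here is a minimal, hypothetical Java sketch (not Spark code) of how an unconsumed result can become dead code, and how a simple volatile-write sink defeats elimination. JMH's Blackhole is a more robust version of the same idea:

```java
// Sketch of the dead-code-elimination concern: if the benchmarked value is
// never consumed, the JIT may remove the computation entirely. Writing the
// result to a volatile field is a simple blackhole-style sink.
public class BlackholeSketch {
    private static volatile Object sink; // consuming into a volatile field defeats elimination

    // Hypothetical stand-in for the code under test.
    static String initCapLike(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        String result = null;
        for (int i = 0; i < 1_000; i++) {
            result = initCapLike("spark");
            sink = result; // without this line, the loop body could be eliminated
        }
        System.out.println(result);
    }
}
```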

Async-profiler

I suggest adding async-profiler, a low-overhead sampling profiler, to all benchmark runs. This would help us identify the causes of performance degradation.

Would it also be worth considering adding this as a feature in the future?
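
As a sketch of what that could look like, async-profiler can be attached at JVM startup via its agent. The library path, jar name, and benchmark class below are assumptions for illustration; check the async-profiler README for the options supported by your version:

```shell
# Hypothetical invocation: attach async-profiler to a benchmark JVM at startup.
# Adjust the agent path to wherever async-profiler is installed locally.
java -agentpath:/path/to/libasyncProfiler.so=start,event=cpu,file=initcap-profile.html \
     -cp spark-benchmarks.jar org.apache.spark.sql.execution.benchmark.CollationBenchmark
```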

mrk-andreev avatar Nov 16 '24 16:11 mrk-andreev

Hi @MaxGekk, @stevomitric,

Does this PR need any additional changes? Are there any blockers we should address? Let me know how I can help to move it forward!

mrk-andreev avatar Nov 19 '24 21:11 mrk-andreev

+1, LGTM. Merging to master. Thank you, @mrk-andreev and @stevomitric @uros-db for review.

MaxGekk avatar Nov 21 '24 08:11 MaxGekk