adjust Silver Job Runs module configuration
enable auto-optimized shuffle for module 2011
originally implemented for Spark 3.1.2 in commit https://github.com/databrickslabs/overwatch/commit/d751d5fc75c939892b73f877cb0e5542eb2cc030 on branch 1228-silver-job-runs-spark312-r0812 as part of #1253.
This PR removes all of the new utilities and transformation refactoring that were only aids to development and testing. They did not impact performance in any significant way.
The essential change brought to this branch (1228-optimization-only) is entirely expressed in commit https://github.com/databrickslabs/overwatch/commit/8c9ee79d20a4904ecd5aa2908715179c58e615e1. The new code introduced is here:
https://github.com/databrickslabs/overwatch/blob/8c9ee79d20a4904ecd5aa2908715179c58e615e1/src/main/scala/com/databricks/labs/overwatch/pipeline/Silver.scala#L271-L274
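In rough terms, the change amounts to attaching a per-module Spark override rather than flipping the flag globally. The sketch below is illustrative only (the literal patch is at the link above); the `withSparkOverrides` helper and the `Module(...)` argument list shown here are assumptions, not verified signatures:

```scala
// Hypothetical sketch of enabling Databricks auto-optimized shuffle (AOS)
// for module 2011 only, via a module-scoped Spark override instead of the
// global setting that #790 removed. Names and signatures are illustrative.
lazy private[overwatch] val jobRunsModule =
  Module(2011, "Silver_JobsRuns", this, Array(1004, 2010, 2014), 12.0)
    .withSparkOverrides(Map(
      // DBR-specific AQE feature; scoping it here avoids the regressions
      // other modules saw when it was enabled globally
      "spark.databricks.adaptive.autoOptimizeShuffle.enabled" -> "true"
    ))
```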
The background and analysis of the optimization presented in the description of #1253 remain representative of the performance improvements realized by this change.
proof notebook (IN PROGRESS)
Corresponding job runs for before/after comparison of this change:
| 0.8.1.2 | 0.8.2.0-SNAPSHOT |
|---|---|
| Run 647559332994892 (210402 rows in 19.2 mins) | Run 455980391763738 (210402 rows in 8.13 mins) |
| Run 265434635290698 (483230 rows in 20.53 mins) | Run 635176874378143 (483230 rows in 8.9 mins) |
Quality Gate passed

Issues
- 0 New issues
- 0 Accepted issues

Measures
- 0 Security Hotspots
- 0.0% Coverage on New Code
- 0.0% Duplication on New Code
Added row counts and timings for the second set of comparison runs to the table in the description. ☝️
[ The following content is copy-pasted from #1253. ]
Introduction
A typical run of Overwatch module 2011 (JobsRuns_Silver, "JR") using release 0.8.1.2 to process 10 days of data in workspace e2-demo-west-ws produces this console output:
```
. . .
COMPLETED: 2011-Silver_JobsRuns --> Workspace ID: 2556758628403379
TIME RANGE: 2024-05-09 00:00:00 -> 2024-05-19 00:00:00
SUCCESS! Silver_JobsRuns
OUTPUT ROWS: 210402
RUNTIME MINS: 16.25 --> Workspace ID: 2556758628403379
. . .
```
The corresponding run of the code from this feature branch (1228-silver-job-runs-spark312-r0812) in an otherwise identical environment against the same upstream data goes like this:
```
. . .
COMPLETED: 2011-Silver_JobsRuns --> Workspace ID: 2556758628403379
TIME RANGE: 2024-05-09 00:00:00 -> 2024-05-19 00:00:00
SUCCESS! Silver_JobsRuns
OUTPUT ROWS: 210402
RUNTIME MINS: 5.25 --> Workspace ID: 2556758628403379
. . .
```
This ~67% reduction in runtime minutes is characteristic of the changes introduced here. Below are some details on how this was achieved, but first a little background.
Background
Databricks AQE shuffle auto-optimization
As part of Overwatch release 0.7.1.1, #675 introduced the following Spark configuration globally for all modules:
https://github.com/databrickslabs/overwatch/blob/ae56406ea5aef1e1e8a40b3653a9e6df70922458/src/main/scala/com/databricks/labs/overwatch/utils/SparkSessionWrapper.scala#L20
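The linked line boils down to setting a single Databricks-specific Spark configuration on the shared session. A minimal sketch, assuming it is applied once on the session used by all modules:

```scala
// Sketch of the global setting introduced in #675 (and later reverted in
// #790): enables Databricks' AQE auto-optimized shuffle for every module
// sharing this SparkSession.
spark.conf.set("spark.databricks.adaptive.autoOptimizeShuffle.enabled", "true")
```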
This is a component of a larger set of features called Adaptive Query Execution (AQE) that was released with Databricks Runtime (DBR) 7.0 and Spark 3.0.0. See this Databricks Engineering Blog article for more details on AQE in general.
Soon thereafter performance degradation in certain Overwatch modules was reported in https://github.com/databrickslabs/overwatch/issues/794#issue-1607557972:
> In an attempt to increase performance -- a regression was made that results in the larger modules (i.e. spark table merges) to have WAY too few tasks resulting in extremely long -- never finishing runtimes.
>
> Solution -- disable this flag.
. . . and the autoOptimizeShuffle feature was disabled in #790 for release 0.7.1.2:
https://github.com/databrickslabs/overwatch/blob/f9c8dd088ea0e81d4a311368a5c47b1a4f9ad375/src/main/scala/com/databricks/labs/overwatch/utils/SparkSessionWrapper.scala#L20
JR shuffle factor
Prior to the AQE shuffle auto-optimization described in the previous section, Overwatch release 0.7.1.0 (#563, #661) added the shuffleFactor parameter to the definition of jobRunsModule:
https://github.com/databrickslabs/overwatch/blob/bc136594b1e9d08c77339d6a31f7c87bfe2e7202/src/main/scala/com/databricks/labs/overwatch/pipeline/Silver.scala#L246
Based on this shuffleFactor value of 12.0 and the job cluster configuration used for these tests, Module.optimizeShufflePartitions() currently sets spark.sql.shuffle.partitions to 9600:
https://github.com/databrickslabs/overwatch/blob/3f4cdca4d8438550b13c38fec43eea60bdbee877/src/main/scala/com/databricks/labs/overwatch/pipeline/Module.scala#L79-L102
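The relationship between `shuffleFactor` and the resulting partition count can be illustrated with a simplified model. The actual logic lives in `Module.optimizeShufflePartitions()` at the link above; the formula and the 800-core figure below are assumptions for illustration only:

```scala
// Simplified, illustrative model: scale the cluster's core count by the
// module's shuffleFactor, never dropping below one partition per core.
// The real Module.optimizeShufflePartitions() weighs additional cluster state.
def targetShufflePartitions(clusterCoreCount: Int, shuffleFactor: Double): Int =
  math.max((clusterCoreCount * shuffleFactor).toInt, clusterCoreCount)

// Under this model, shuffleFactor = 12.0 on a cluster presenting 800 cores
// would yield 800 * 12.0 = 9600 partitions, matching the value observed
// for these test runs.
```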
Results
With AQE shuffle auto-optimization disabled in 0.8.1.2, many Spark stages for JR are forced to run 9600 tasks or more. This results in extended stage durations (grey verticals mark minutes) and the recruitment of additional executors (i.e. cluster auto-scaling, blue verticals), as seen in the following chart from the Spark UI:
This PR introduced the following change in the Silver JR module's configuration in commit https://github.com/databrickslabs/overwatch/commit/d751d5fc75c939892b73f877cb0e5542eb2cc030: https://github.com/databrickslabs/overwatch/blob/d751d5fc75c939892b73f877cb0e5542eb2cc030/src/main/scala/com/databricks/labs/overwatch/pipeline/Silver.scala#L271-L274
When 0.8.1.2 runs with the enhancements in this PR, not only are the corresponding Spark job durations decreased, but the recruitment of additional executors is deferred to later phases of the module:
Note that the differences in Spark job labels in these screenshots are due to other provisional features, unrelated to Spark configuration or performance, that were included in the Overwatch snapshot JAR used for these test runs. Those special labels are aids to development, testing, and troubleshooting to be introduced in #1223. Despite other minor refactoring in the JR module logic, the overall contours of the progression of Spark jobs are recognizable because these runs were restricted to only this module for the same date range. For example, Spark job 64, running from 20:28 until 20:36 in the upper timeline, corresponds to jobRunsAppendMeta (label truncated) at 20:23. This phase in particular seems to be where the overall duration of the module decreased most significantly.
Closer examination of the task statistics for corresponding jobs shows that many tasks created in the absence of shuffle auto-optimization are effectively doing nothing!
With shuffle auto-optimization relaxed we see a much more favorable distribution of task durations and bytes processed:
This trend is similarly apparent in other phases of the JR module.
Conclusion
This PR has produced demonstrable performance improvements for the following dataframe transformations:
- `SilverTransforms.filterAuditLog`
- `WorkflowsTransforms.jobRunsDeriveRunsBase`
- `SilverTransforms.jobRunsAppendJobMeta`
Next steps
Further gains in resource utilization and time efficiency may be possible in the subsequent phases of the JR module (2011).

Closes #1228