Add `selected_message_queues` flag to filter message queues and improve performance

Open amenasria opened this issue 3 years ago • 4 comments

What does this PR do?

This PR adds a `selected_message_queues` flag that restricts monitoring to a specific list of message queues, which improves performance.

Motivation

(Customer request) Sometimes you only want to monitor messages on specific IBM i system message queues. As currently implemented, the query runs against every MESSAGE_QUEUE_NAME instead of first filtering on the queues of interest and only then fetching the message queue info we want. We suspect that the missing filter is responsible for the high CPU usage on IBM i hosts, so we need to fix that.

Benchmark

To check that the feature really improves performance, I ran some performance tests:

We monitor the QSYSOPR message queue and the CECUSER (the user) message queue (the names are not important, we just need two different message queues). We create messages in the CECUSER message queue and measure the query execution time for three variants: filtered on QSYSOPR, filtered on CECUSER, and without a filter. We will be satisfied if the time for QSYSOPR remains nearly constant while the times for CECUSER and for the unfiltered query increase.

The query is

SELECT MESSAGE_QUEUE_NAME, MESSAGE_QUEUE_LIBRARY, COUNT(*), SUM(CASE WHEN SEVERITY >= 50 THEN 1 ELSE 0 END) FROM QSYS2.MESSAGE_QUEUE_INFO {message_queues_filter} GROUP BY MESSAGE_QUEUE_NAME, MESSAGE_QUEUE_LIBRARY

And message_queues_filter takes the value:

  • WHERE MESSAGE_QUEUE_NAME IN ('QSYSOPR') for the QSYSOPR query.
  • WHERE MESSAGE_QUEUE_NAME IN ('CECUSER') for the CECUSER query.
  • (empty) for the unfiltered query.
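
As an illustration of how the three filter values above could be derived from a list of selected queues, here is a minimal Python sketch. The function name `build_message_queues_filter` and the constant `QUERY_TEMPLATE` are hypothetical and are not taken from the PR's diff; the quote-doubling step is an added precaution, not something the description mentions.

```python
def build_message_queues_filter(selected_message_queues):
    """Return a WHERE clause restricting the query to the given queues.

    Hypothetical helper: an empty or missing list yields an empty string,
    which corresponds to the unfiltered query.
    """
    if not selected_message_queues:
        return ""
    # Double any single quote so a queue name cannot break out of the SQL literal.
    quoted = ", ".join(
        "'{}'".format(name.replace("'", "''")) for name in selected_message_queues
    )
    return "WHERE MESSAGE_QUEUE_NAME IN ({})".format(quoted)


# Same query text as in the description, split across lines for readability.
QUERY_TEMPLATE = (
    "SELECT MESSAGE_QUEUE_NAME, MESSAGE_QUEUE_LIBRARY, COUNT(*), "
    "SUM(CASE WHEN SEVERITY >= 50 THEN 1 ELSE 0 END) "
    "FROM QSYS2.MESSAGE_QUEUE_INFO {message_queues_filter} "
    "GROUP BY MESSAGE_QUEUE_NAME, MESSAGE_QUEUE_LIBRARY"
)

query = QUERY_TEMPLATE.format(
    message_queues_filter=build_message_queues_filter(["QSYSOPR"])
)
```

Passing `["QSYSOPR"]`, `["CECUSER"]`, or an empty list reproduces the three variants used in the benchmark.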

With the setup described, here are the measured query times:

(Tests run on IBM i 7.4, PowerVM POWER9 LPAR, 1 vCPU, 2048 MB RAM)

| # CECUSER jobs | QSYSOPR query time | CECUSER query time | No filter query time |
|---|---|---|---|
| 0 | 0.67s | 0.36s | 0.45s |
| 220 | 0.40s | 0.53s | 0.48s |
| 960 | 0.60s | 0.62s | 0.62s |
| 2150 | 0.43s | 0.72s | 0.72s |
| 3600 | 0.38s | 0.97s | 0.91s |
| 6700 | 0.43s | 1.24s | 1.20s |
| 9300 | 0.48s | 1.70s | 1.57s |
| 10830 | 0.35s | 1.66s | 1.64s |
| 12000 | 0.46s | 1.86s | 1.85s |
| 66400 | 0.86s | 8.46s | 8.64s |

The results match our expectations: the QSYSOPR query time stays roughly constant, while the CECUSER and unfiltered queries slow down as the CECUSER queue fills up. This means less CPU load for the filtered query. Mission accomplished!
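
To make the comparison concrete, the following Python sketch computes the fastest and slowest measurement for each query variant from the benchmark figures above. The data is copied from the results; the helper `spread` is a hypothetical name used only for this illustration.

```python
# Each row: (# CECUSER jobs, QSYSOPR time, CECUSER time, no-filter time), times in seconds.
RESULTS = [
    (0, 0.67, 0.36, 0.45),
    (220, 0.40, 0.53, 0.48),
    (960, 0.60, 0.62, 0.62),
    (2150, 0.43, 0.72, 0.72),
    (3600, 0.38, 0.97, 0.91),
    (6700, 0.43, 1.24, 1.20),
    (9300, 0.48, 1.70, 1.57),
    (10830, 0.35, 1.66, 1.64),
    (12000, 0.46, 1.86, 1.85),
    (66400, 0.86, 8.46, 8.64),
]


def spread(column):
    """Return (min, max) of one timing column across all measurements."""
    values = [row[column] for row in RESULTS]
    return min(values), max(values)


for label, column in (("QSYSOPR", 1), ("CECUSER", 2), ("No filter", 3)):
    low, high = spread(column)
    print("{:<10} min={:.2f}s max={:.2f}s ratio={:.1f}x".format(label, low, high, high / low))
```

The QSYSOPR column stays between 0.35s and 0.86s, whereas the other two columns range from under half a second to more than eight seconds.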

Additional Notes - Scripts for reproducibility

IBM i is not a Unix-like OS, so I think it's important to detail the scripts used, both for rigor and for reproducibility purposes.

To create the jobs on the VM the following command was used:

max=1000
index=0
while [ $index -lt $max ] ; do
 let index+=1
 echo Job $index
 system "SBMJOB JOBD(QBATCH) JOB(WSYS) JOBQ(QBATCH) CMD(WRKSYSSTS)"
done

To measure the time taken during a query the following command was used:

qsh -c "db2 -t  \"select distinct current_timestamp from sysibm.sysdummy1;\";" | head -n 4 |tail -n 1; qsh -c "db2 \"$query\""; qsh -c "db2 -t  \"select distinct current_timestamp from sysibm.sysdummy1;\";" | head -n 4 |tail -n 1
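
The command above prints one timestamp before and one after the query, so the query time is the difference between the two. A minimal Python sketch of that subtraction follows; it assumes the timestamps are printed in the form `YYYY-MM-DD-HH.MM.SS.ffffff`, and the function name and sample values are hypothetical.

```python
from datetime import datetime

# Assumed textual form of the printed timestamp: YYYY-MM-DD-HH.MM.SS.ffffff
DB2_TIMESTAMP_FORMAT = "%Y-%m-%d-%H.%M.%S.%f"


def elapsed_seconds(before, after):
    """Return the number of seconds between two printed timestamps."""
    start = datetime.strptime(before.strip(), DB2_TIMESTAMP_FORMAT)
    end = datetime.strptime(after.strip(), DB2_TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


# Made-up sample values, for illustration only.
elapsed = elapsed_seconds("2022-08-23-15.08.00.120000", "2022-08-23-15.08.00.550000")
print("{:.2f}s".format(elapsed))
```

Note that a measurement taken this way also includes the startup cost of the `qsh` and `db2` invocations, so it is best read as an upper bound on the query time.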

Review checklist (to be filled by reviewers)

  • [ ] Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • [ ] PR title must be written as a CHANGELOG entry (see why)
  • [ ] File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • [ ] PR must have changelog/ and integration/ labels attached

amenasria avatar Aug 23 '22 15:08 amenasria

The validations job has failed; please review the Files changed tab for possible suggestions to resolve.

github-actions[bot] avatar Aug 23 '22 16:08 github-actions[bot]

The validations job has failed; please review the Files changed tab for possible suggestions to resolve.

github-actions[bot] avatar Sep 02 '22 14:09 github-actions[bot]

The validations job has failed; please review the Files changed tab for possible suggestions to resolve.

github-actions[bot] avatar Sep 05 '22 12:09 github-actions[bot]

Codecov Report

Merging #12808 (f8ddbea) into master (eecf6d8) will increase coverage by 0.00%. The diff coverage is 100.00%.

Flag Coverage Δ
ibm_i 82.28% <100.00%> (+0.70%) :arrow_up:

Flags with carried forward coverage won't be shown.

codecov[bot] avatar Sep 05 '22 13:09 codecov[bot]

The validations job has failed; please review the Files changed tab for possible suggestions to resolve.

github-actions[bot] avatar Sep 26 '22 17:09 github-actions[bot]