
[v8r0] Improve getTransformationFiles performance

Open chrisburr opened this issue 1 year ago • 1 comments

This significantly improves the performance of getTransformationFiles by:

  • Using a JOIN instead of manually looking up the LFNs from file IDs
  • Removing the batching that was needed due to the use of __getFileIDsForLfns
    • This is doubly helpful as the use of OFFSET N LIMIT 10000 made the function O(N^2) due to the database having to rebuild and scan the results for every batch.
  • Giving the option to only get the columns you need

For example, in LHCb this reduces a ~1200 second call to ~20 seconds.
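
As a rough illustration of the three changes listed above, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for the real database. The table layout, column names, and the helper functions get_files_batched and get_files_joined are assumptions made for this example; they are not the actual DIRAC schema or the code in this PR.

```python
import sqlite3

# Hypothetical two-table layout, assumed for illustration only.
def build_demo_db(n_files=25):
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE DataFiles (FileID INTEGER PRIMARY KEY, LFN TEXT);
        CREATE TABLE TransformationFiles (
            TransformationID INTEGER,
            FileID INTEGER,
            Status TEXT,
            PRIMARY KEY (TransformationID, FileID)
        );
        """
    )
    for file_id in range(1, n_files + 1):
        conn.execute(
            "INSERT INTO DataFiles VALUES (?, ?)",
            (file_id, f"/lhcb/demo/file_{file_id}.dst"),
        )
        conn.execute(
            "INSERT INTO TransformationFiles VALUES (?, ?, ?)",
            (1, file_id, "Unused"),
        )
    return conn


def get_files_batched(conn, trans_id, batch_size=10):
    """Old pattern: page with LIMIT/OFFSET, then resolve LFNs per batch."""
    results = []
    offset = 0
    while True:
        # Each page forces the database to skip `offset` rows again,
        # which is where the quadratic cost comes from.
        rows = conn.execute(
            "SELECT FileID, Status FROM TransformationFiles "
            "WHERE TransformationID = ? ORDER BY FileID LIMIT ? OFFSET ?",
            (trans_id, batch_size, offset),
        ).fetchall()
        if not rows:
            break
        ids = [row[0] for row in rows]
        placeholders = ",".join("?" * len(ids))
        # Separate lookup to translate file IDs into LFNs.
        lfns = dict(
            conn.execute(
                f"SELECT FileID, LFN FROM DataFiles WHERE FileID IN ({placeholders})",
                ids,
            ).fetchall()
        )
        results.extend(
            {"FileID": fid, "Status": status, "LFN": lfns[fid]}
            for fid, status in rows
        )
        offset += batch_size
    return results


# Whitelist of selectable columns, so caller input never reaches the SQL text.
ALLOWED_COLUMNS = {"FileID": "tf.FileID", "Status": "tf.Status", "LFN": "df.LFN"}


def get_files_joined(conn, trans_id, columns=("FileID", "Status", "LFN")):
    """New pattern: one JOIN query, returning only the requested columns."""
    select = ", ".join(f"{ALLOWED_COLUMNS[c]} AS {c}" for c in columns)
    rows = conn.execute(
        f"SELECT {select} FROM TransformationFiles tf "
        "JOIN DataFiles df ON df.FileID = tf.FileID "
        "WHERE tf.TransformationID = ? ORDER BY tf.FileID",
        (trans_id,),
    ).fetchall()
    return [dict(zip(columns, row)) for row in rows]
```

Both functions return the same rows for the demo data; the joined version issues a single query and lets the caller pass, say, columns=("LFN",) when only the LFNs are needed.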

BEGINRELEASENOTES

*TransformationSystem NEW: CHANGE: Improve getTransformationFiles performance

ENDRELEASENOTES

chrisburr, Oct 02 '24 11:10

I've rolled back the incorrect commit and added the comment as suggested in https://github.com/DIRACGrid/DIRAC/pull/7812#discussion_r1786417196, so this should be good to go.

chrisburr, Oct 08 '24 20:10

Sweep summary

Sweep ran in https://github.com/DIRACGrid/DIRAC/actions/runs/11324634755

Successful:

  • integration

DIRACGridBot, Oct 14 '24 09:10