Restructure skipExisting checks to reuse previous values
Another bottleneck : D
For a large db.xml + a sizeable gamelist this got really expensive to keep doing the same checks. Restructured it to save information that can be reused.
This took my local skipExisting check from ~5mins to ~5seconds
Also, there is still room to improve it but I can't quite figure out the algorithm.
Right now, we iterate over the queue for every gamelist entry, and we call N removes on the queue for each found game. I think It would be better to do a full iteration of the queue, building a cache of "index N has this filepath", simliar to my previous PR. Then do a full cycle of the gamelist to find all indexes to remove. Then remove those in a tight loop (or in batch? is that possible? I havent done c++ in 20 years and never messed with QT).
That must be some gamelist!
Your suggested change of this PR saves one if. I tested it but noticed no measurable difference between the "now" and your changes. However, my gamelist for the platform only has just below 700 entries.
Can you provide evidence for your claim (5min --> 5secs)? Do you compile against Qt6 standard or against Qt5 standard (see manual installation in the readme)? What hardware do you use, incl. storage interface, is the storage network attached?
Are you generating for ES? I didn’t mention but thats what I was fixing, the other frontends I just updated for consistency.
Ive been doing a lot of updating so Ill have to see if I still have a copy of the cache that was showing the pathological case. Note that its not so much that we are saving an if as much as we are saving an if N times (the size of the queue) as well as handling worst case misses (ie the scenarios where we have to run through the whole queue). My gamelist was ~1500 entries with a lot of misses, resource cache probably 100k entries, and ~7k quickida. My only evidence is that it was doing it before I updated it and now its not : D
Im running on mac via docker (dockerfile taken from other PR). I believe thats QT5.
On Mon, Aug 12, 2024 at 5:33 AM Gemba @.***> wrote:
That must be some gamelist! Your suggested change of this PR saves one if. I tested it but noticed no measurable difference between the "now" and your changes. However, my gamelist for the platform only has just below 700 entries.
Can you provide evidence for your claim (5min --> 5secs)? Do you compile against Qt6 standard or against Qt5 standard (see manual installation in the readme)? What hardware do you use, incl. storage interface, is the storage network attached?
— Reply to this email directly, view it on GitHub https://github.com/Gemba/skyscraper/pull/83#issuecomment-2283503448, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFTHTTG7YEQG4RMYLC2EODZRB6OBAVCNFSM6AAAAABMJ5OKOOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEOBTGUYDGNBUHA . You are receiving this because you authored the thread.Message ID: @.***>
Yes I did try the ES frontend to reproduce your issue. I saw also your other minor changes.
And yes, the docker file / update_skyskraper.sh script uses Qt5 standard. This takes the qt6 changes on QList/QVector out of the mix.
Seems it is primary caused by a large quickids.xml, db.xml and resource cache. If you have this context it would help to find the real culprit.
The nested loop remove.at() is another issue -as you spotted- which can be optimized, from M*N/2 (in average) to M+N (worst case).
Sorry about the delay, got sidetracked on other projects. I dont have the backup to compare any more
Thanks for keeping me posted. I will close the PR now. Feel free to reopen, when you have evidence.