Push Evaluation Strategy performance issue
If I configure a repository with Push enabled, the SPARQL query is extremely slow and may not return before the timeout. If I create the same repository with Push disabled, performance is reasonable.
Background:
- The SPARQL query is very complex (paths, sub-queries etc): would this impact the Push strategy?
- I have Halyard running in a single node configuration: would this also impact use of the Push strategy?
- Versions: HBase 1.2.6 / Hadoop 2.7.4 / Halyard 2.5
The Push strategy is beneficial when:
- there is latency in communication with the HBase region servers (a networked, multi-node cluster installation)
- dataset regions are balanced and evenly distributed across multiple region servers, so that parallel asynchronous communication does not create hot spots (use pre-splitting and keep control over HBase traffic)
Other points to check:
- the join optimizer in the Push strategy needs computed stats to work optimally (outdated stats are still much better than none)
- some complex patterns perform better than others (for example, use filtering rather than MINUS, or organize subqueries so they can be chained)
- if all of the above fails, the query profiler can be used to analyze why a query runs long
Adam
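The first two conditions in the reply can be illustrated with a toy model. The sketch below is not Halyard or HBase code: `fake_scan` is a hypothetical stand-in for one request to a region server, and the 50 ms latency is an assumed figure. It only shows that issuing requests in parallel pays off when each request carries latency, and buys little when latency is close to zero, as on a single node. It does not explain why Push could be slower in that case.

```python
import asyncio
import time


async def fake_scan(latency_s):
    """Hypothetical stand-in for one region server request (not an HBase API)."""
    await asyncio.sleep(latency_s)  # simulated network round trip
    return 1  # pretend one row came back


async def sequential(n, latency_s):
    """Issue n requests one after another and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(n):
        await fake_scan(latency_s)
    return time.perf_counter() - start


async def concurrent(n, latency_s):
    """Issue n requests in parallel and return the elapsed time."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_scan(latency_s) for _ in range(n)))
    return time.perf_counter() - start


def compare(n, latency_s):
    """Return (sequential_seconds, concurrent_seconds) for the same workload."""
    return asyncio.run(sequential(n, latency_s)), asyncio.run(concurrent(n, latency_s))


if __name__ == "__main__":
    # Assumed latencies: 50 ms for a networked cluster, 0 for a single node.
    for label, latency in (("multi-node", 0.05), ("single node", 0.0)):
        seq, par = compare(8, latency)
        print(f"{label}: sequential {seq:.3f}s, concurrent {par:.3f}s")
```

With a non-zero latency the concurrent run finishes in roughly the time of one request, while the sequential run pays for every request in turn. With zero latency the two timings are about the same, so any coordination overhead of the parallel approach has nothing to offset it.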