Poor performance
I have benchmarked jsonpath.value(obj, "$.a.b.c") with 100K objects. It took > 40 seconds to compute. This looked way too slow for such a simple lookup. Looking at the code it is obvious that it is computing the same values over and over again. E.g. functions: parse, stringify, normalize all could be cached (memoized) with the computed values reused.
A quick prototype showed > 10 times performance gains. Instead of 40+ seconds for 100K objects, the benchmark took < 4 seconds.
The only dependency I introduced was the package underscore for _.memoize() function.
I am going to submit a pull request with my changes and hope they would be quickly approved. One of my large customers is using this package and they could definitely use the speedup.
Isn't there a simpler solution here: allow jsonpath.value() (etc.) to accept an already-parsed query? This should give the same performance win as the proposed memoization PR, without this module having to worry about being the cause of memory problems. In this scheme, it is up to the caller to pass either a string or a pre-parsed string; and, for the latter, it is up to the caller to store that parsed form wherever makes sense.
I notice that jsonpath.value(obj, jsonpath.parse(myQueryString)) actually works, but jsonpath ignores the fact that I'm passing it a pre-parsed query string (turning my parsed query back into a query string! and then re-parsing it!)
The fix, as far as I can tell, only involves three changes:
- in
value(), avoid the call to stringify if the givenpathis not a string (or array) - in
query(), the assert needs to perhaps check thattypeof string !== 'undefined'rather than its current check_is_string(string) - in
nodes(), only parse strings:
var path = typeof string === 'string' ? this.parser.parse(string) : string;
(oh, and separately, @types/jsonpath will need to have value and query accept the string | ReturnType<typeof JSONpath.prototype.parse> instead of only string)
Sounds like a better solution to me.
Vassili Gorshkov
On Mon, Feb 15, 2021 at 8:24 AM Nick Mitchell [email protected] wrote:
Isn't there a simpler solution here? I notice that jsonpath.value(obj, jsonpath.parse(myQueryString)) actually works, but jsonpath ignores the fact that i'm passing it a pre-parsed query string (oddly jsonpath.stringify(jsonpath.parse(qs)) == qs)
With a few simple updates, we could allow jsonpath.value() (etc.) to accept a parsed query. THe fix, as far as I can tell, only involves three changes:
- in value(), avoid the call to stringify if the given path is not a string
- in query(), the assert needs to perhaps change that typeof string !== 'undefined' rather than its current check _is_string(string)
- in nodes(), only parse strings:
var path = typeof string === 'string' ? this.parser.parse(string) : string;
(oh, and @types/jsonpath will need to have value and query accept the string | ReturnType
instead of only `string) This should give the same performance win as the proposed memoization PR, without this module having to worry about being the cause of memory problems. It is up to the caller to pass either a string or a pre-parsed string; and, for the latter, it is up to the caller to store that parsed form wherever makes sense.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/dchester/jsonpath/issues/74#issuecomment-779327748, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC3RENMVQSXWMQEPFXAYDYDS7FDCNANCNFSM4D2CI7UA .
-- Vassili Gorshkov Atlas Consultants Group, Inc.