hive icon indicating copy to clipboard operation
hive copied to clipboard

HIVE-28978:Remove Json-Path from Hive

Open devaspatikrishnatri opened this issue 8 months ago • 14 comments

What changes were proposed in this pull request?

json-smart comes as a part of json-path , and the latest release of json-path was 2.9.0 in Jan 2024 , which brings in json-smart 2.5.0.This drops json-path explicitly from hive as it is not used.

Why are the changes needed?

fix CVEs

Does this PR introduce any user-facing change?

No

How was this patch tested?

Built locally, relying on pre-commits.

devaspatikrishnatri avatar May 29 '25 15:05 devaspatikrishnatri

Hmm. https://github.com/json-path/JsonPath/pull/1030

okumin avatar Jun 01 '25 08:06 okumin

Reviewing. I verified net.minidev:json-smart:jar:2.5.0 and net.minidev:accessors-smart:jar:2.5.0 come from the json-path library on the master branch. https://github.com/apache/hive/actions/runs/15371374316/job/43250990748

[INFO] |  +- com.jayway.jsonpath:json-path:jar:2.9.0:runtime
[INFO] |  |  \- net.minidev:json-smart:jar:2.5.0:runtime
[INFO] |  |     \- net.minidev:accessors-smart:jar:2.5.0:runtime

okumin avatar Jun 01 '25 08:06 okumin

I verified this patch upgrades all places using net.minidev:json-smart and net.minidev:accessors-smart. It also upgrade org.ow2.asm:asm:jar. https://github.com/apache/hive/actions/runs/15340007646/job/43164237620?pr=5831

[INFO] +- com.jayway.jsonpath:json-path:jar:2.9.0:runtime
[INFO] +- net.minidev:json-smart:jar:2.5.2:runtime
[INFO] |  \- net.minidev:accessors-smart:jar:2.5.2:runtime
[INFO] |     \- org.ow2.asm:asm:jar:9.7.1:runtime

okumin avatar Jun 01 '25 09:06 okumin

Nashorn uses asm and Nashorn was included by the JDK upgrade. https://github.com/apache/hive/commit/413069ea0faeb893fe78faa215b08a70ece80595#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8

[INFO] |  +- org.openjdk.nashorn:nashorn-core:jar:15.4:compile
[INFO] |  |  +- org.ow2.asm:asm:jar:7.3.1:compile
[INFO] |  |  +- org.ow2.asm:asm-commons:jar:7.3.1:compile
[INFO] |  |  |  \- org.ow2.asm:asm-analysis:jar:7.3.1:compile
[INFO] |  |  +- org.ow2.asm:asm-tree:jar:7.3.1:compile
[INFO] |  |  \- org.ow2.asm:asm-util:jar:7.3.1:compile

I wonder if we are allowed to include Nashorn or if we can remove it. Anyway, I'm asking it first. https://lists.apache.org/thread/yyy1fckg2bjvs309bcvszbcr9jtqxvn2

okumin avatar Jun 01 '25 09:06 okumin

@zabetak I think hive-exec directly references json-path. https://github.com/apache/hive/blob/81c02a72e88dee8bf28a88b23401e9c3f1fad4a7/ql/pom.xml#L853 But I can try this upgrade in calcite as well.

devaspatikrishnatri avatar Jun 03 '25 08:06 devaspatikrishnatri

Have tried removing json-path completely from hive. DepTree. Now this dependency comes only from Calcite(1.33) at a version of 2.7.0. If pre-commits pass and this is indeed not required then maybe we can drop this dependency from hive completely and just keep the upgrades on calcite side , as after Jan 2024 there does not seem to be any new version of json-path.

devaspatikrishnatri avatar Jun 04 '25 12:06 devaspatikrishnatri

@devaspatikrishnatri By removing the json-path, I was thinking actually excluding it from calcite and not having it at all in the Hive dependency tree.

zabetak avatar Jun 05 '25 11:06 zabetak

@zabetak Let me try that. I was under wrong impression.

devaspatikrishnatri avatar Jun 05 '25 15:06 devaspatikrishnatri

Dep Tree. This now completely removes it from Hive , waiting for pre-commits.

devaspatikrishnatri avatar Jun 10 '25 08:06 devaspatikrishnatri

@zabetak I do not think we can completely drop json-path , there are multiple test failures like this "missing com/jayway/jsonpath/spi/mapper/MappingProvider".

We can remove all direct references from hive and keep it coming from caclite , and maintain this dependency there itself. Or We can keep updating the json-path and json-smart versions explicitly in hive.

devaspatikrishnatri avatar Jun 10 '25 14:06 devaspatikrishnatri

@zabetak Any suggestions on which of the above two is the preferred approach ?

devaspatikrishnatri avatar Jun 12 '25 07:06 devaspatikrishnatri

@devaspatikrishnatri The way it seems is that we have a problem in Calcite and by pinning the dependency in Hive we are making it a Hive problem as well. Probably the best would be to let Calcite it deal with its transitive dependencies. Anyways pinning version is in general risky especially when we start touching transitively L+3 (calcite->json-path->json-smart) dependencies.

Summing up I am leaning towards removing all direct references of json-path from hive.

zabetak avatar Jun 17 '25 16:06 zabetak

Got it . Will do the needful. Dep-Tree after dropping json-path from hive.

devaspatikrishnatri avatar Jun 18 '25 07:06 devaspatikrishnatri

@devaspatikrishnatri is there a plan to fix failed tests for this change.

arorasimran0309 avatar Jun 23 '25 10:06 arorasimran0309

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the [email protected] list if the patch is in need of reviews.

github-actions[bot] avatar Sep 12 '25 00:09 github-actions[bot]