
HERMES produces incorrect license output when harvested metadata contains multiple licenses

Open Aidajafarbigloo opened this issue 4 months ago • 2 comments

When harvested metadata includes more than one license, the merging process (hermes process command) does not handle it correctly and produces malformed output.

Action: Harvest metadata from CITATION.cff and codemeta.json.

Two scenarios:

Scenario 1 (works as expected): Both cff and codemeta contain only one license each, e.g.:

"license": [ [ "https://spdx.org/licenses/Apache-2.0", { "plugin": "cff", "local_path": "CITATION.cff", "timestamp": "2025-09-29T11:15:38.975971", "harvester": "cff" } ] ]

and

"license": [ [ "https://spdx.org/licenses/Apache-2.0", { "plugin": "codemeta", "local_path": "codemeta.json", "timestamp": "2025-09-29T11:15:41.690517", "harvester": "codemeta" } ] ]

The merging works correctly in this scenario.

Scenario 2 (problematic):

  • cff contains one license.
  • codemeta contains multiple licenses (array).

Example:

"license": [ [ "https://spdx.org/licenses/Apache-2.0", { "plugin": "cff", "local_path": "CITATION.cff", "timestamp": "2025-09-29T11:15:38.975971", "harvester": "cff" } ] ]

and

"license[0]": [ [ "https://spdx.org/licenses/Apache-2.0", { "plugin": "codemeta", "local_path": "codemeta.json", "timestamp": "2025-09-29T11:15:41.690517", "harvester": "codemeta" } ] ],
"license[1]": [ [ "https://spdx.org/licenses/CC-BY-4.0", { "plugin": "codemeta", "local_path": "codemeta.json", "timestamp": "2025-09-29T11:15:41.690517", "harvester": "codemeta" } ] ],
"license[2]": [ [ "https://spdx.org/licenses/CC0-1.0", { "plugin": "codemeta", "local_path": "codemeta.json", "timestamp": "2025-09-29T11:15:41.690517", "harvester": "codemeta" } ] ]
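To make the shape of the Scenario 2 input easier to reason about, here is a minimal Python sketch that groups indexed keys such as license[0], license[1] under the plain key license. The helper name group_indexed_keys is hypothetical and is not part of HERMES; it only assumes the harvested structure shown above, where each key maps to a list of [value, source] pairs.

```python
import re
from collections import defaultdict

# Matches keys of the form "name[3]"; plain keys such as "license" do not match.
_INDEXED_KEY = re.compile(r"^(?P<name>[^\[\]]+)\[(?P<index>\d+)\]$")


def group_indexed_keys(harvested):
    """Hypothetical helper: collect 'license[0]', 'license[1]', ... under 'license'.

    `harvested` maps a key to a list of [value, source] pairs, as in the
    examples above. Source information is dropped to keep the sketch short.
    """
    indexed = defaultdict(dict)
    grouped = {}
    for key, entries in harvested.items():
        match = _INDEXED_KEY.match(key)
        if match:
            indexed[match["name"]][int(match["index"])] = entries
        else:
            grouped[key] = [value for value, _source in entries]
    # Append indexed values in index order after any plain-key values.
    for name, by_index in indexed.items():
        values = grouped.setdefault(name, [])
        for index in sorted(by_index):
            values.extend(value for value, _source in by_index[index])
    return grouped
```

Applied to the codemeta example, this would yield a single license list with the three SPDX URLs in index order.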

The resulting hermes.json output is broken:

"license": [ "h","t","t","p","s",":","/","/","s","p","d","x",".","o","r","g","/","l","i", "c","e","n","s","e","s","/","A","p","a","c","h","e","-","2",".","0" ]
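The character-by-character output is what Python produces when a string is iterated where a list was expected. The sketch below reproduces that pattern in isolation and shows one way to guard against it. This is only a guess at the kind of bug involved, not a confirmed diagnosis of the HERMES code, and the function names naive_merge, as_list, and safe_merge are hypothetical.

```python
def naive_merge(values):
    """Merge values by extending a list; breaks when a value is a plain string."""
    merged = []
    for value in values:
        # A str is iterable, so extend() adds it one character at a time.
        merged.extend(value)
    return merged


def as_list(value):
    """Wrap scalars in a list; copy lists and tuples unchanged."""
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def safe_merge(values):
    """Merge scalars and lists into one list, dropping duplicates in order."""
    merged = []
    for value in values:
        for item in as_list(value):
            if item not in merged:
                merged.append(item)
    return merged
```

With the single cff license as a string and the codemeta licenses as a list, naive_merge splits the string into characters, while safe_merge returns the distinct license URLs.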

Aidajafarbigloo avatar Sep 29 '25 09:09 Aidajafarbigloo

Hi @Aidajafarbigloo, thanks for the report! I can reproduce this problem 👍🏻

Since DLR is currently refactoring the data model, we should definitely add this as a test case. I don't know what the current state of the refactoring is, so I'm not sure whether it makes sense to fix this issue now or just wait for the new data model. 🤔

zyzzyxdonta avatar Oct 02 '25 06:10 zyzzyxdonta

Hi @zyzzyxdonta, thanks for the response.

Aidajafarbigloo avatar Oct 02 '25 13:10 Aidajafarbigloo