openeo-python-client

STAC Job Manager: item_id must be string

Open Pratichhya opened this issue 5 months ago • 8 comments

I recently ran into the following error when creating a STAC job database. The collection is created as expected, but item creation fails if my pandas dataframe has no field named "item_id".

Expecting HTTP status to be any of [200, 201, 202] but received 500 - Internal Server Error, request method=POST
response body:
{"code":"ValidationError","description":"1 validation error for Item\nid\n  Input should be a valid string [type=string_type, input_value=0, input_type=int]\n    For further information visit https://errors.pydantic.dev/2.11/v/string_type"}
Expecting HTTP status to be any of [200, 201, 202] but received 500 - Internal Server Error, request method=POST
response body:
{"code":"ValidationError","description":"1 validation error for Item\nid\n  Input should be a valid string [type=string_type, input_value=500, input_type=int]\n    For further information visit https://errors.pydantic.dev/2.11/v/string_type"}
Traceback (most recent call last):
  File "/data/users/Private/pratixa/nifi/cropsar_px_nifi/create_job_database.py", line 151, in <module>
    main()
  File "/data/users/Private/pratixa/nifi/cropsar_px_nifi/create_job_database.py", line 140, in main
    job_db.initialize_from_df(jobs_df, on_exists='error')
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 106, in initialize_from_df
    self.persist(df)
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 211, in persist
    self._upload_items_bulk(self.collection_id, all_items)
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 247, in _upload_items_bulk
    self._ingest_bulk(chunk)
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 230, in _ingest_bulk
    _check_response_status(response, _EXPECTED_STATUS_POST)
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 311, in _check_response_status
    response.raise_for_status()
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://stac-api-dev.vgt.vito.be/collections/cropsar2d_jobdb_test05082025/bulk_items

When item_id is not available, the code falls back to the dataframe index as the item_id, but that index is an integer. https://github.com/Open-EO/openeo-python-client/blob/66499f9da31fb967411efb09e0ecd9075caf2e6f/openeo/extra/job_management/stac_job_db.py#L63

So, in this case, shouldn't we either document that a string-typed item_id column is mandatory, or adapt the code so that the fallback value is converted to a string?
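As a possible user-side workaround until this is settled, the dataframe can be given a string-typed item_id column before it is passed to initialize_from_df. This is only a sketch under assumptions: `ensure_string_item_id` is a hypothetical helper (not part of openeo), and the example dataframe is made up for illustration.

```python
import pandas as pd


def ensure_string_item_id(df: pd.DataFrame, column: str = "item_id") -> pd.DataFrame:
    """Return a copy of df whose item_id column exists and holds Python strings.

    Hypothetical helper for illustration only, not part of the openeo package.
    """
    out = df.copy()
    if column in out.columns:
        # Existing ids (possibly integers) are converted to strings.
        out[column] = [str(value) for value in out[column]]
    else:
        # No id column: derive string ids from the index labels.
        out[column] = [str(label) for label in out.index]
    return out


# Made-up example dataframe without an item_id column.
jobs_df = pd.DataFrame({"year": [2021, 2022]})
jobs_df = ensure_string_item_id(jobs_df)
```

The resulting jobs_df could then be passed to job_db.initialize_from_df(jobs_df, on_exists='error') as in the traceback above.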

Pratichhya avatar Aug 05 '25 13:08 Pratichhya

Not sure if I could fully express what I meant here. @HansVRP some help please.

fyi @VincentVerelst

Pratichhya avatar Aug 05 '25 13:08 Pratichhya

There seems to be an issue (possibly a regression?) when no item_id is provided to the STAC-based job manager at initialization. Without an item_id, the index is used as an int, which gets rejected with:

{"code":"ValidationError","description":"1 validation error for Item\nid\n Input should be a valid string [type=string_type, input_value=0, input_type=int]\n For further information visit https://errors.pydantic.dev/2.11/v/string_type"}

HansVRP avatar Aug 05 '25 15:08 HansVRP

While working on #736 I encountered a related problem.

I think this indeed comes from here, where there is a fallback to numerical item_ids:

https://github.com/Open-EO/openeo-python-client/blob/4dfdf77ca8bb1158b00562d67591752ef9845054/openeo/extra/job_management/stac_job_db.py#L63-L65
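To illustrate the idea of a string-safe fallback (not the actual code at that link), a minimal sketch could look like the following. `resolve_item_id` and its parameters are hypothetical names chosen for this example; it accepts any mapping with a `.get` method, which a dict and a pandas Series both provide.

```python
from typing import Any, Hashable, Mapping


def resolve_item_id(row: Mapping[str, Any], index_label: Hashable) -> str:
    """Pick the row's "item_id" if present, else the index label, always as str.

    Hypothetical helper for illustration; not the library's implementation.
    """
    return str(row.get("item_id", index_label))


# Row without an item_id: the integer index label 0 becomes the string "0".
fallback_id = resolve_item_id({"year": 2021}, index_label=0)

# Row with an integer item_id: it is converted to the string "500".
explicit_id = resolve_item_id({"item_id": 500}, index_label=0)
```

Converting at this single point would mean the STAC API always receives a string id, whether it came from the column or from the index.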

soxofaan avatar Aug 07 '25 19:08 soxofaan

@soxofaan I believe we can close this one?

HansVRP avatar Oct 02 '25 12:10 HansVRP

I'm not sure, needs verification.

could you try to reproduce the original issue @Pratichhya ?

soxofaan avatar Oct 03 '25 10:10 soxofaan

I still receive the same error:

Waiting 5 seconds to make sure the database is deleted
Job database cropsar2d_jobdb_03102025 deleted
Expecting HTTP status to be any of [200, 201, 202] but received 500 - Internal Server Error, request method=POST
response body:
{"code":"ValidationError","description":"1 validation error for Item\nid\n Input should be a valid string [type=string_type, input_value=0, input_type=int]\n For further information visit https://errors.pydantic.dev/2.11/v/string_type"}
Traceback (most recent call last):
  File "/data/users/Private/pratixa/nifi/cropsar_px_stac/create_job_database_kanxa.py", line 150, in <module>
    main()
  File "/data/users/Private/pratixa/nifi/cropsar_px_stac/create_job_database_kanxa.py", line 139, in main
    job_db.initialize_from_df(jobs_df, on_exists='error')
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 106, in initialize_from_df
    self.persist(df)
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 220, in persist
    self._upload_items_bulk(self.collection_id, all_items)
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 257, in _upload_items_bulk
    self._ingest_bulk(chunk)
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 240, in _ingest_bulk
    _check_response_status(response, _EXPECTED_STATUS_POST)
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 321, in _check_response_status
    response.raise_for_status()
  File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://stac-api-dev.vgt.vito.be/collections/cropsar2d_jobdb_03102025/bulk_items

Pratichhya avatar Oct 03 '25 12:10 Pratichhya

My openeo version is: openeo-0.45.0

Pratichhya avatar Oct 03 '25 12:10 Pratichhya

This is my test code: https://git.vito.be/users/sharmap/repos/cropsar_px_nifi/browse/create_job_database.py?at=refs%2Ftags%2F10.03.2025#55

Pratichhya avatar Oct 03 '25 12:10 Pratichhya