STAC Job Manager: item_id must be string
I recently ran into the following error when creating a STAC job database. The collection is created as expected, but creating the items fails if my pandas dataframe has no column named "item_id".
Expecting HTTP status to be any of [200, 201, 202] but received 500 - Internal Server Error, request method=POST
response body:
{"code":"ValidationError","description":"1 validation error for Item\nid\n Input should be a valid string [type=string_type, input_value=0, input_type=int]\n For further information visit https://errors.pydantic.dev/2.11/v/string_type"}
Expecting HTTP status to be any of [200, 201, 202] but received 500 - Internal Server Error, request method=POST
response body:
{"code":"ValidationError","description":"1 validation error for Item\nid\n Input should be a valid string [type=string_type, input_value=500, input_type=int]\n For further information visit https://errors.pydantic.dev/2.11/v/string_type"}
Traceback (most recent call last):
File "/data/users/Private/pratixa/nifi/cropsar_px_nifi/create_job_database.py", line 151, in <module>
main()
File "/data/users/Private/pratixa/nifi/cropsar_px_nifi/create_job_database.py", line 140, in main
job_db.initialize_from_df(jobs_df, on_exists='error')
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 106, in initialize_from_df
self.persist(df)
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 211, in persist
self._upload_items_bulk(self.collection_id, all_items)
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 247, in _upload_items_bulk
self._ingest_bulk(chunk)
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 230, in _ingest_bulk
_check_response_status(response, _EXPECTED_STATUS_POST)
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 311, in _check_response_status
response.raise_for_status()
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://stac-api-dev.vgt.vito.be/collections/cropsar2d_jobdb_test05082025/bulk_items
If item_id is not available, the code falls back to using the dataframe index as the item id, but that is then an integer: https://github.com/Open-EO/openeo-python-client/blob/66499f9da31fb967411efb09e0ecd9075caf2e6f/openeo/extra/job_management/stac_job_db.py#L63
So, in this case, shouldn't we either document that an item_id column of string type is mandatory, or adapt the code to convert the id to a string?
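As a user-side workaround in the meantime, something like the following minimal sketch might help: it adds an explicit string-typed "item_id" column derived from the index before the dataframe is handed to the job database. The dataframe contents and column names other than "item_id" are made up for illustration.

```python
import pandas as pd

# Hypothetical job dataframe without an "item_id" column
# (the "tile" and "year" columns are purely illustrative).
jobs_df = pd.DataFrame({"tile": ["31UFS", "31UES"], "year": [2024, 2025]})

# Derive a string-typed "item_id" from the integer index, so the ids
# end up as strings instead of ints when items are built from the rows.
jobs_df["item_id"] = jobs_df.index.astype(str)
```

After this, the dataframe can be passed to job_db.initialize_from_df(jobs_df, on_exists='error') as in the traceback above. Whether this fully avoids the error presumably depends on the client version in use.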
Not sure if I could fully express what I meant here. @HansVRP some help please.
fyi @VincentVerelst
There seems to be an issue (possibly a regression?) when no item_id is provided to the STAC-based job manager on initialization. Without an item_id, the index is used as an int, which gets rejected with:
{"code":"ValidationError","description":"1 validation error for Item\nid\n Input should be a valid string [type=string_type, input_value=0, input_type=int]\n For further information visit https://errors.pydantic.dev/2.11/v/string_type"}
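To catch this before the request reaches the STAC API, a small pre-flight check along these lines could be used. This is only a sketch: find_non_string_ids is a hypothetical helper, not part of the openeo client, and it assumes the items are available as plain dicts with an "id" key.

```python
from typing import Any, List, Mapping, Tuple


def find_non_string_ids(items: List[Mapping[str, Any]]) -> List[Tuple[int, Any]]:
    """Hypothetical helper: return (position, id) pairs for items whose "id" is not a str."""
    return [
        (position, item.get("id"))
        for position, item in enumerate(items)
        if not isinstance(item.get("id"), str)
    ]


# Illustrative item dicts mimicking the ids from the error messages above.
items = [{"id": 0}, {"id": "job-1"}, {"id": 500}]
bad = find_non_string_ids(items)
```

Here bad lists the offending positions and ids, which could be reported in a clear client-side error instead of surfacing as a 500 from the server.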
While working on #736 I encountered a related problem.
I think this indeed comes from here, where there is a fallback to numerical item_ids:
https://github.com/Open-EO/openeo-python-client/blob/4dfdf77ca8bb1158b00562d67591752ef9845054/openeo/extra/job_management/stac_job_db.py#L63-L65
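One possible direction for a fix, as a rough sketch rather than the actual client code: keep the fallback to the row index, but always coerce the result to a string. The function name resolve_item_id and its signature are hypothetical and only illustrate the idea.

```python
from typing import Any, Mapping


def resolve_item_id(row: Mapping[str, Any], index: Any) -> str:
    """Hypothetical helper: prefer an explicit "item_id", else use the row index.

    The result is always converted to str, since the STAC API
    rejects integer item ids.
    """
    item_id = row.get("item_id")
    if item_id is None:
        # No explicit id given: fall back to the (possibly integer) index.
        item_id = index
    return str(item_id)
```

With this, an integer index like 0 would become "0", and an explicit item_id that happens to be numeric would be converted as well.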
@soxofaan I believe we can close this one?
I'm not sure, needs verification.
could you try to reproduce the original issue @Pratichhya ?
I still receive the same error:
Waiting 5 seconds to make sure the database is deleted
Job database cropsar2d_jobdb_03102025 deleted
Expecting HTTP status to be any of [200, 201, 202] but received 500 - Internal Server Error, request method=POST
response body:
{"code":"ValidationError","description":"1 validation error for Item\nid\n Input should be a valid string [type=string_type, input_value=0, input_type=int]\n For further information visit https://errors.pydantic.dev/2.11/v/string_type"}
Traceback (most recent call last):
File "/data/users/Private/pratixa/nifi/cropsar_px_stac/create_job_database_kanxa.py", line 150, in <module>
main()
File "/data/users/Private/pratixa/nifi/cropsar_px_stac/create_job_database_kanxa.py", line 139, in main
job_db.initialize_from_df(jobs_df, on_exists='error')
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 106, in initialize_from_df
self.persist(df)
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 220, in persist
self._upload_items_bulk(self.collection_id, all_items)
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 257, in _upload_items_bulk
self._ingest_bulk(chunk)
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 240, in _ingest_bulk
_check_response_status(response, _EXPECTED_STATUS_POST)
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/openeo/extra/job_management/stac_job_db.py", line 321, in _check_response_status
response.raise_for_status()
File "/home/pratixa/.conda/envs/nifi/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://stac-api-dev.vgt.vito.be/collections/cropsar2d_jobdb_03102025/bulk_items
My openeo version is: openeo-0.45.0
This is my test code: https://git.vito.be/users/sharmap/repos/cropsar_px_nifi/browse/create_job_database.py?at=refs%2Ftags%2F10.03.2025#55