Phase out read_vector usage
When passing a string as the `polygon` argument to `ImageCollectionClient.mask`:
https://github.com/Open-EO/openeo-python-client/blob/b5f3b4725ed54a7ab1522992b330221c07f6f287/openeo/rest/imagecollectionclient.py#L777-L784
or in `_get_geometry_argument`: https://github.com/Open-EO/openeo-python-client/blob/2ce9d3be823ff2201e55d2e96e82811714164e9a/openeo/rest/datacube.py#L900
This implementation assumes the process `read_vector`, which is currently a VITO-specific process; I'm not aware of anything like it in the official process collection.
Background: at VITO we currently use this to work with very large polygon files (> 100k polygons) stored on the backend side, which we don't want to pass along with the openEO request for obvious reasons.
In the client we should of course avoid hardcoding non-official processes.
Solution:
- check which processes the backend supports: `read_vector`, `load_url`, `load_geojson`
- check whether the string is an HTTP(S) URL, a path to a GeoJSON file that exists locally, or a path that does not exist locally
- depending on the combination of available processes and the type of string, do something more sensible than the current implementation
- take some care to avoid passing a huge GeoJSON object with the request (e.g. throw an exception above a size threshold)
- still allow the possibility to inject a `read_vector`-based argument to support the VITO use cases
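The dispatch logic sketched in the bullets above could look roughly like this. This is only an illustration, not the actual client implementation: the function name, the size threshold, and the fallback order are all assumptions.

```python
from pathlib import Path
from urllib.parse import urlparse

# Illustrative size limit for inlining GeoJSON into the request
# (not an actual client constant, just a placeholder threshold).
MAX_INLINE_GEOJSON_BYTES = 10 * 1024 * 1024


def pick_geometry_process(geometry: str, backend_processes: set) -> str:
    """Pick an openEO process id for a geometry given as a string (sketch).

    Raises ValueError when no sensible option is available.
    """
    parsed = urlparse(geometry)
    if parsed.scheme in ("http", "https"):
        # HTTP(S) URL: prefer the official load_url process.
        if "load_url" in backend_processes:
            return "load_url"
        raise ValueError("Backend does not support loading geometries from a URL")
    path = Path(geometry)
    if path.exists():
        # Local file: guard against inlining huge GeoJSON payloads.
        if path.stat().st_size > MAX_INLINE_GEOJSON_BYTES:
            raise ValueError("GeoJSON file too large to inline in the request")
        if "load_geojson" in backend_processes:
            return "load_geojson"
    # Path does not exist locally: assume it refers to a backend-side file.
    if "read_vector" in backend_processes:
        return "read_vector"
    raise ValueError(f"Don't know how to handle geometry argument {geometry!r}")
```

The key design point is that the non-official `read_vector` only gets used as a last resort, and only when the backend actually advertises it.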
cc @jdries
A fix for this is being proposed in this pull request: https://github.com/Open-EO/openeo-processes/pull/106
ah nice, I somehow missed that thread
However, `load_uploaded_files` is meant for user-uploaded files, which is not exactly what we currently do in the VITO use cases that depend on `read_vector`.
A closer proposal is probably the `import_nfs` process from Open-EO/openeo-processes#105
way forward:
- https://github.com/Open-EO/openeo-python-client/issues/457
Another use case that has to be updated: this suggestion was made in a user support channel to do `aggregate_spatial` with a (large) geometry from a URL:
```python
datacube = datacube.aggregate_spatial(
    geometries="https://example.com/path/to/geometries.json",
    reducer="mean",
)
```
This will currently produce a process graph using `read_vector`, so it is not future-proof at the moment.
(cc @EmileSonneveld)
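For illustration, the difference between the current and a future-proof graph for that snippet could look roughly like the hand-written fragments below. Node ids and the exact `load_url` argument names are assumptions based on the openeo-processes proposals, not verified client output.

```python
url = "https://example.com/path/to/geometries.json"

# Current behavior: non-official, VITO-specific read_vector node
# (the "filename" parameter name is an assumption).
read_vector_node = {
    "process_id": "read_vector",
    "arguments": {"filename": url},
}

# Future-proof alternative: official load_url process
# (argument names assumed from the openeo-processes proposal).
load_url_node = {
    "process_id": "load_url",
    "arguments": {"url": url, "format": "GeoJSON"},
}

# Either node's result would then feed the "geometries" argument
# of aggregate_spatial (node ids here are made up for illustration).
aggregate_node = {
    "process_id": "aggregate_spatial",
    "arguments": {
        "data": {"from_node": "loadcollection1"},
        "geometries": {"from_node": "loadgeometries1"},
        "reducer": {"process_graph": {"mean1": {
            "process_id": "mean",
            "arguments": {"data": {"from_parameter": "data"}},
            "result": True,
        }}},
    },
}
```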
done:
- remove `read_vector` usage from default geometry handling (but documented how to reconstruct it for workflows that still need it)
- add support for passing a geometry URL
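For workflows that still depend on the VITO-specific process, one way to reconstruct it is to build the `read_vector` node by hand as a raw process-graph fragment. A minimal dict-level sketch (the real client would typically go through its process-graph helpers, and the `filename` parameter name is an assumption):

```python
def read_vector_node(path: str) -> dict:
    """Build a raw process-graph node for the non-official read_vector process (sketch)."""
    return {
        "process_id": "read_vector",
        "arguments": {"filename": path},
    }

# Hypothetical backend-side path, for illustration only.
node = read_vector_node("/data/backend-side/large_polygons.shp")
```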