
Phase out read_vector usage

Open soxofaan opened this issue 6 years ago • 4 comments

When a string is passed as the polygon argument to ImageCollectionClient.mask: https://github.com/Open-EO/openeo-python-client/blob/b5f3b4725ed54a7ab1522992b330221c07f6f287/openeo/rest/imagecollectionclient.py#L777-L784

or in _get_geometry_argument: https://github.com/Open-EO/openeo-python-client/blob/2ce9d3be823ff2201e55d2e96e82811714164e9a/openeo/rest/datacube.py#L900

the implementation assumes the read_vector process, which is currently a VITO-specific process. I'm not aware of anything like it in the official process collection.

Background: at VITO we currently use this to work with very large polygon files (> 100k polygons) stored on the backend side, which we don't want to pass along with the openEO request for obvious reasons.

In the client, we should of course avoid hardcoding non-official processes.

Solution:

  • check which processes are supported by the backend: read_vector, load_url, load_geojson
  • check whether the string is an HTTP URL, a path to a GeoJSON file that exists locally, or a path that does not exist locally
  • depending on the combination of available processes and the type of string, do something more sensible than the current implementation
  • take some care to avoid passing a huge GeoJSON object with the request (e.g. raise an exception when it is above a threshold)
  • still allow the possibility to inject a read_vector based argument to support the VITO use cases
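The decision logic in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the client's actual implementation: the function names, the returned strategy labels and the 1 MB threshold are all hypothetical, and the set of available process ids is assumed to have been fetched from the backend beforehand.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical threshold: refuse to inline GeoJSON files larger than this.
MAX_INLINE_BYTES = 1_000_000


def classify_geometry_string(value: str) -> str:
    """Classify a string as 'url', 'local_file' or 'backend_path'."""
    scheme = urlparse(value).scheme.lower()
    if scheme in ("http", "https"):
        return "url"
    if Path(value).is_file():
        return "local_file"
    # Not a URL and not found locally: assume it refers to a file at the backend.
    return "backend_path"


def choose_strategy(value: str, available_processes: set,
                    max_inline_bytes: int = MAX_INLINE_BYTES) -> str:
    """Pick a (hypothetical) strategy label for handling a geometry string."""
    kind = classify_geometry_string(value)
    if kind == "url":
        if "load_url" in available_processes:
            return "load_url"
        raise ValueError("Backend does not support load_url")
    if kind == "local_file":
        size = Path(value).stat().st_size
        if size > max_inline_bytes:
            raise ValueError(
                f"GeoJSON file is {size} bytes, above the limit of {max_inline_bytes}"
            )
        return "inline_geojson"
    # Path that does not exist locally: only read_vector can resolve it.
    if "read_vector" in available_processes:
        return "read_vector"
    raise ValueError("Path not found locally and backend lacks read_vector")
```

For example, `choose_strategy("https://example.com/path/to/geometries.json", {"load_url"})` would select the `load_url` strategy, while a path that is missing locally falls back to `read_vector` only when the backend advertises it.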

cc @jdries

soxofaan (Dec 17 '19 10:12)

A fix for this is being proposed in this pull request: https://github.com/Open-EO/openeo-processes/pull/106

jdries (Dec 17 '19 10:12)

ah nice, I somehow missed that thread

soxofaan (Dec 17 '19 10:12)

However, load_uploaded_files is meant for user-uploaded files, which is not exactly what we currently do in the VITO use cases that depend on read_vector.

A closer proposal is probably the import_nfs process from Open-EO/openeo-processes#105

soxofaan (Dec 17 '19 11:12)

way forward:

  • https://github.com/Open-EO/openeo-python-client/issues/457

soxofaan (Jul 18 '24 12:07)

Another use case that has to be updated: this suggestion was made in a user support channel to do aggregate_spatial with a (large) geometry from a URL:

datacube = datacube.aggregate_spatial(
    geometries="https://example.com/path/to/geometries.json",
    reducer="mean",
)

This currently produces a process graph that uses read_vector, so it is not future-proof at the moment.
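To make the difference concrete, here is a rough sketch of the two kinds of process graph node involved, written as plain dictionaries rather than with the client's own classes. The helper names are hypothetical, the `filename` argument of read_vector is an assumption based on the client code linked earlier in this thread, and the `load_url` arguments (`url`, `format`) follow my understanding of the openEO process definition.

```python
import json


def read_vector_node(path: str) -> dict:
    """Node for the non-official, VITO-specific read_vector process (assumed shape)."""
    return {"process_id": "read_vector", "arguments": {"filename": path}}


def load_url_node(url: str, fmt: str = "GeoJSON") -> dict:
    """Node for the load_url process, which loads vector data from a URL."""
    return {"process_id": "load_url", "arguments": {"url": url, "format": fmt}}


if __name__ == "__main__":
    url = "https://example.com/path/to/geometries.json"
    # Node that depends on a backend-specific process:
    print(json.dumps(read_vector_node(url), indent=2))
    # Node based on load_url instead:
    print(json.dumps(load_url_node(url), indent=2))
```

The point of the comparison is only that the URL stays the same while the process id and argument names change, so the same user-facing call could be mapped to either node.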

(cc @EmileSonneveld)

soxofaan (Oct 24 '24 09:10)

done:

  • removed read_vector usage from the default geometry handling (but documented how to reconstruct it for workflows that still need it)
  • added support for passing a geometry URL

soxofaan (Nov 27 '24 15:11)