BBOX for Postgres returning unwanted results
Description Using the bbox intersection with a postgres backend results in items being returned that do not match the filter
Steps to Reproduce Steps to reproduce the behavior:
Go to https://reference.geoconnex.us/collections/mainstems/items?bbox=-109.448547,36.611118,-107.668762,37.322120&properties=uri&limit=1000
Notice that https://geoconnex.us/ref/mainstems/29559 is included in the result although it does not intersect the bbox. It seems the bbox of the feature might intersect with the input bbox.
Expected behavior Only items that intersect with the input bbox are returned as in returned for the CQL request: https://reference.geoconnex.us/collections/mainstems/items?filter=INTERSECTS(geom,%20POLYGON((-109.448547%2036.611118,%20-109.448547%2037.322120,%20-107.668762%2037.322120,%20-107.668762%2036.611118,%20-109.448547%2036.611118)))&limit=1000
Screenshots/Tracebacks
Environment
- OS: All
- Python version: All
- pygeoapi version: 0.17.dev0
Additional context Related to https://github.com/DOI-USGS/nhdplusTools/issues/386
cc @KoalaGeo
Upon further inspection (thanks @webb-ben for the test data) I'm able to reproduce.
In the PostgreSQL provider, the intersects query differs in CQL mode vs bbox mode.
CQL query (via pygeofilter)
WHERE ST_Intersects(public.mainstems.geom, ST_GeomFromEWKT(%(ST_GeomFromEWKT_1)s))) AS anon_1
2024-07-25 07:50:15,984 INFO sqlalchemy.engine.Engine [no key 0.00515s] {'ST_GeomFromEWKT_1': 'SRID=4326;POLYGON ((-109.448547 36.611118, -109.448547 37.32212, -107.668762 37.32212, -107.668762 36.611118, -109.448547 36.611118))'}
bbox query (via SQLAlchemy direct)
WHERE public.mainstems.geom && ST_MakeEnvelope(%(ST_MakeEnvelope_1)s, %(ST_MakeEnvelope_2)s, %(ST_MakeEnvelope_3)s, %(ST_MakeEnvelope_4)s) ORDER BY public.mainstems.id ASC
LIMIT %(param_1)s OFFSET %(param_2)s
2024-07-25 07:52:03,748 INFO sqlalchemy.engine.Engine [no key 0.00020s] {'ST_MakeEnvelope_1': -109.448547, 'ST_MakeEnvelope_2': 36.611118, 'ST_MakeEnvelope_3': -107.668762, 'ST_MakeEnvelope_4': 37.32212, 'param_1': 1000, 'param_2': 0}
[2024-07-25T07:52:03Z] {/Users/tomkralidis/Dev/pygeoapi/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1577} INFO - [no key 0.00020s] {'ST_MakeEnvelope_1': -109.448547, 'ST_MakeEnvelope_2': 36.611118, 'ST_MakeEnvelope_3': -107.668762, 'ST_MakeEnvelope_4': 37.32212, 'param_1': 1000, 'param_2': 0}
The query is slightly different, and more importantly pygeofilter is setting SRID=4326 for the bbox/geometry.
Updating the PostgreSQL provider per below gives consistent results with the CQL code path for bbox/geometry:
def _get_bbox_filter(self, bbox):
if not bbox:
return True # Let everything through
# Convert bbox to SQLAlchemy clauses
geom_column = getattr(self.table_model, self.geom)
envelope = ST_MakeEnvelope(*bbox, 4326)
bbox_filter = ST_Intersects(geom_column, envelope)
return bbox_filter
Resulting SQL:
WHERE ST_Intersects(public.mainstems.geom, ST_MakeEnvelope(%(ST_MakeEnvelope_1)s, %(ST_MakeEnvelope_2)s, %(ST_MakeEnvelope_3)s, %(ST_MakeEnvelope_4)s, %(ST_MakeEnvelope_5)s)) ORDER BY public.mainstems.id ASC
LIMIT %(param_1)s OFFSET %(param_2)s
2024-07-25 08:01:02,503 INFO sqlalchemy.engine.Engine [no key 0.00111s] {'ST_MakeEnvelope_1': -109.448547, 'ST_MakeEnvelope_2': 36.611118, 'ST_MakeEnvelope_3': -107.668762, 'ST_MakeEnvelope_4': 37.32212, 'ST_MakeEnvelope_5': 4326, 'param_1': 1000, 'param_2': 0}
[2024-07-25T08:01:02Z] {/Users/tomkralidis/Dev/pygeoapi/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1577} INFO - [no key 0.00111s] {'ST_MakeEnvelope_1': -109.448547, 'ST_MakeEnvelope_2': 36.611118, 'ST_MakeEnvelope_3': -107.668762, 'ST_MakeEnvelope_4': 37.32212, 'ST_MakeEnvelope_5': 4326, 'param_1': 1000, 'param_2': 0}
We could hardcode 4326 for the incoming bbox parameter (which arrives via the provider's query() function), however the bbox may or may not have had CRS treatment depending on the storage CRS.
Perhaps the best way to determine and set the srid of the bbox is to set it to the srid of the geometry column as follows:
def _get_bbox_filter(self, bbox):
if not bbox:
return True # Let everything through
# Convert bbox to SQLAlchemy clauses
geom_column = getattr(self.table_model, self.geom)
geom_column_srid = TODO
envelope = ST_MakeEnvelope(*bbox, geom_column_srid)
bbox_filter = ST_Intersects(geom_column, envelope)
return bbox_filter
Any ideas on how to derive would be valuable (Find_SRID doesn't seem to work).
@volcan01010 & @ximenesuk ideas/suggestions?
This Issue has been inactive for 90 days. As per RFC4, in order to manage maintenance burden, it will be automatically closed in 7 days.
This Issue has been inactive for 90 days. As per RFC4, in order to manage maintenance burden, it will be automatically closed in 7 days.
@tomkralidis @webb-ben sorry, we've not been able to resource looking at this.
Can I add a warning in the docs so users are aware of the issue?
@KoalaGeo since fixed in #2051 -- thanks @webb-ben !