
libpostal: discard parse containing single "suburb" label

Open missinglink opened this issue 3 years ago • 1 comments

It seems that libpostal gets confused by airports, returning the label "suburb":

curl --get http://localhost:4400/parse \
  --data-urlencode 'address=john f kennedy international airport'

[{"label":"suburb","value":"john f kennedy international airport"}]

However, it returns the correct label for a neighbourhood:

curl --get http://localhost:4400/parse \
  --data-urlencode 'address=soho'

[{"label":"suburb","value":"soho"}]

It also returns "suburb" when the text could refer to either an airport or a neighbourhood:

curl --get http://localhost:4400/parse \
  --data-urlencode 'address=tegel'

[{"label":"suburb","value":"tegel"}]

I think the best course of action here is to check for any parse containing only a single label of type "suburb" and discard it. While this will certainly discard some valid neighbourhood parses, those should be adequately handled by the fallback parsing.
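A minimal sketch of that check, written in Python purely for illustration. It assumes the parse has the same shape as the JSON responses shown above (a list of objects with "label" and "value" keys). The names `is_single_suburb` and `choose_parse` are hypothetical, and `fallback` is a stand-in for whatever fallback parser is used, not a real API.

```python
from typing import Callable, Dict, List

# A parse is a list of {"label": ..., "value": ...} components,
# matching the JSON returned by the /parse endpoint above.
Parse = List[Dict[str, str]]


def is_single_suburb(parse: Parse) -> bool:
    """Return True when the parse is exactly one component labelled 'suburb'."""
    return len(parse) == 1 and parse[0].get("label") == "suburb"


def choose_parse(libpostal_parse: Parse, fallback: Callable[[], Parse]) -> Parse:
    """Keep the libpostal parse unless it is a lone 'suburb' component.

    `fallback` is a hypothetical callable standing in for the fallback
    parser; it is only invoked when the libpostal parse is discarded.
    """
    if is_single_suburb(libpostal_parse):
        return fallback()
    return libpostal_parse
```

With this check, all three responses above (the airport, "soho" and "tegel") would be discarded and handed to the fallback, while a parse with more than one component would be kept even if one of its labels is "suburb".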

missinglink, Jun 21 '22 14:06

Falling back to the pelias/parser in these cases seems preferable:

[Screenshot 2022-06-21 at 17 00 29] [Screenshot 2022-06-21 at 17 00 50]

missinglink, Jun 21 '22 15:06