fix: Sanitize OSM height values with comma decimal separators
The problem is showcased here: https://github.com/protomaps/basemaps/discussions/339
Problem
The parseHeight method in Buildings.java did not handle common OSM tagging mistakes where commas are used as decimal separators instead of periods (eg. height=89,10 instead of height=89.10).
This caused incorrect parsing:
-
"89,10"was parsed as8910meters instead of89.10meters -
"89,1 m"was parsed as891meters instead of89.1meters
According to the OSM height documentation, commas are incorrect but appear frequently in real-world OSM data.
Problem example as in discussion 339:
Solution
Note: this is similar to the solution that JOSM used here: https://josm.openstreetmap.de/ticket/15719
Added a sanitizeHeightValue() method that:
- Detects commas followed by 1-2 digits (with optional unit suffix like
m) - Replaces the comma with a period to create valid decimal notation
- Preserves actual thousand separators (e.g.,
"1,234"remains unchanged)
Solution example:
Changes
Modified Files
-
src/main/java/com/protomaps/basemap/layers/Buildings.java- Added
sanitizeHeightValue() - Updated
parseHeight()to sanitize bothosmHeightandosmMinHeightparameters
- Added
Test Coverage
-
src/test/java/com/protomaps/basemap/layers/BuildingsTest.java- Added
sanitizeHeightValue()test with proper test cases - Added
parseHeightWithCommaDecimalSeparator()integration test
- Added
Examples
| Input | Before | After |
|---|---|---|
"89,10" |
8910.0 | 89.10 |
"89,1" |
891.0 | 89.1 |
"89,1 m" |
891.0 | 89.1 |
"89,1m" |
891.0 | 89.1 |
"89.10" |
89.10 | 89.10 (unchanged) |
"1,234" |
1234.0 | 1234.0 (unchanged) |
@bdon Hello, do you have any time to review this change? Thank you 👍