osm2pgsql

In-band metadata in middle layer

Open mmd-osm opened this issue 5 years ago • 2 comments

Let's assume I've created an Overpass extract which doesn't include metadata (i.e. out; instead of out meta;). With GDPR in mind, some metadata fields like user, uid and changeset may also be missing from such files in the future.

Now, when processing such a file via osm2pgsql with extra attributes enabled, a mapper (!) can control the metadata fields due to the logic in taglist_t::add_attributes. This way, random values can be written to the metadata fields. Even worse, setting some special purpose keys to invalid values can abort osm2pgsql altogether.

Example test data:

<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API 0.7.56.3 eb200aeb">
 <node id="25477618" lat="49.1837670" lon="-122.9572343">
  <tag k="highway" v="traffic_signals"/>
  <tag k="osm_uid" v="x"/>
 </node>
</osm>

Result: Osm2pgsql failed due to ERROR: illegal user id: 'x'

./osm2pgsql --create --slim -O flex -S ../flex-config/simple.lua demo.osm -x
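
To make the failure mode concrete, here is a minimal sketch in Python of what happens when attributes share a namespace with tags. It is a simplified model, not the actual C++ code in taglist_t::add_attributes: the function names store_with_pseudo_tags and load_attributes are hypothetical, and the assumption that every attribute is stored under an "osm_" prefixed key is extrapolated from the osm_uid key in the test data above.

```python
def store_with_pseudo_tags(tags, attrs):
    """Hypothetical model: merge object attributes into the tags as osm_* keys."""
    stored = dict(tags)
    for name, value in attrs.items():
        # Assumption: every attribute becomes a pseudo-tag named "osm_<name>".
        stored["osm_" + name] = str(value)
    return stored


def load_attributes(stored):
    """Hypothetical model: read attributes back out of the stored tags."""
    attrs = {}
    uid = stored.get("osm_uid")
    if uid is not None:
        if not uid.isdigit():
            # Mirrors the error message reported in this issue.
            raise ValueError(f"illegal user id: '{uid}'")
        attrs["uid"] = int(uid)
    return attrs


# The extract carries no metadata, so the only osm_uid is the mapper's tag.
mapper_tags = {"highway": "traffic_signals", "osm_uid": "x"}
stored = store_with_pseudo_tags(mapper_tags, {})
```

In this model, load_attributes(stored) raises for the value "x", and a mapper who tags osm_uid=42 instead silently sets the uid, because the reader cannot tell a real attribute from an ordinary tag.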

mmd-osm May 17 '20 07:05

Yes, the middle should not store the attributes as pseudo-tags. This is on my list of things to refactor. Unfortunately, the old outputs depend on this behaviour, and changing it breaks the data format the middle uses in the database, which would break updates. So any solution here has to be carefully considered.

joto May 17 '20 09:05

Tag keys and values need to be valid UTF-8 strings. What I’ve seen elsewhere is to use a 0xff prefix for special internal keys, which turns them into invalid UTF-8, since the byte 0xff never occurs in a valid UTF-8 sequence. A mapper wouldn’t be able to upload such keys.
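
A minimal sketch of that idea in Python, for illustration only: the helper names internal_key and is_valid_utf8 are hypothetical, and nothing here reflects how osm2pgsql encodes its keys. It only shows that a key starting with the byte 0xff can never collide with a key that arrived as valid UTF-8.

```python
INTERNAL_PREFIX = b"\xff"  # 0xff is not a legal byte anywhere in UTF-8


def internal_key(name: str) -> bytes:
    """Build a reserved key by prepending the 0xff marker byte (hypothetical helper)."""
    return INTERNAL_PREFIX + name.encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_internal(raw_key: bytes) -> bool:
    """A stored key is internal exactly when it carries the marker byte."""
    return raw_key.startswith(INTERNAL_PREFIX)
```

Here is_valid_utf8(internal_key("uid")) is False, while any key a mapper can upload, such as "osm_uid", passes validation and is therefore never reported as internal.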

mmd-osm May 17 '20 10:05

This seems to be covered by #1966 now.

mmd-osm May 25 '23 07:05

> This seems to be covered by #1966 now.

Not quite. There are still ways this could create some problems. The real solution is coming up. The new database middle implementation will store the attributes somewhere else, not in the tags.
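
As a rough illustration of why separate storage removes the ambiguity, here is a small Python sketch. The StoredObject record and the store function are hypothetical and say nothing about the actual schema of the new middle; they only model keeping attributes in their own field next to the tags.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Hypothetical record: attributes live beside the tags, not inside them."""
    osm_id: int
    tags: dict = field(default_factory=dict)
    attrs: dict = field(default_factory=dict)


def store(osm_id, tags, attrs):
    """Copy tags and attributes into separate fields; no key can cross over."""
    return StoredObject(osm_id, dict(tags), dict(attrs))


# Same input as the test data in this issue: no metadata, one odd-looking tag.
obj = store(25477618, {"highway": "traffic_signals", "osm_uid": "x"}, {})
```

With this layout the mapper's osm_uid=x remains an ordinary tag in obj.tags, and obj.attrs stays empty, so nothing tries to parse "x" as a user id.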

joto May 25 '23 13:05