Figure out a way to use fastupdate on indexes

Open pnorman opened this issue 12 years ago • 7 comments

fastupdate does not work well with osm2pgsql: with a high work_mem, the index builds up a large pending part that has to be scanned sequentially, which slows down updates significantly.

It would be nice if we could use fastupdate with a smaller temporary part of the index.

pnorman avatar Jun 10 '13 21:06 pnorman

What if you lower work_mem just in the session that is doing the updates?
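A minimal sketch of this suggestion, assuming the update runs on its own database connection. The function name and the `4MB` default are placeholders chosen for illustration, not osm2pgsql options; only the `SET` and `RESET` statements are standard PostgreSQL.

```python
def session_work_mem_statements(work_mem="4MB"):
    """Return (before, after) SQL lists for an update session.

    Hypothetical helper: the caller would run `before` on the update
    connection ahead of the update and `after` once it is done.
    """
    # SET changes the value for the current session only, so other
    # connections (e.g. the renderer) keep the server-wide work_mem.
    before = ["SET work_mem = '{}'".format(work_mem)]
    # RESET restores the session to the configured default.
    after = ["RESET work_mem"]
    return before, after
```

Any sorts or bitmap scans running in that same session would also see the lower limit.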

jeffjanes avatar Nov 27 '13 21:11 jeffjanes

What if you lower work_mem just in the session that is doing the updates?

I'm not sure. If any sorts or bitmap scans are being done, lowering work_mem will hurt those.

pnorman avatar Sep 23 '14 17:09 pnorman

Do you have a specific use case in mind? e.g. an example command line to use?

There is a proposed change in PostgreSQL to allow the size of the fastupdate pending list to be set separately from work_mem (https://commitfest.postgresql.org/action/patch_view?id=1536). I tested it with various settings that I thought were reasonable, but I couldn't find a meaningful improvement, because the pending list didn't seem to be much of a bottleneck.

But I only tried initial loads, not incremental loads, because that is what I am familiar with.

jeffjanes avatar Sep 23 '14 17:09 jeffjanes

Do you have a specific use case in mind?

The reason we currently can't use fastupdate is that a rendering server needs a high work_mem, which lets the pending-entry list grow very large. Updates then become extremely slow, because scanning a 128MB pending-entry list is slow.

Ideally, we'd keep the pending list small, and after an osm2pgsql command has completed, force it to process the pending list, even if it's small.

This won't impact imports at all, because the GIN index isn't created until the very end and no updates occur to that table after the index is created. That column also isn't used in importing at all, just populated.

e.g. an example command line to use?

Just normal slim mode updates.

pnorman avatar Sep 23 '14 18:09 pnorman

Ideally, we'd keep the pending list small, and after an osm2pgsql command has completed, force it to process the pending list, even if it's small.

And with 9.6 we will have a way to do this!

My thoughts are:

  • Detect the PostgreSQL version
  • On 9.6, create GIN indexes with fastupdate on
  • On 9.6, run gin_clean_pending_list when processing finishes during updates

If running on a DB created by an earlier osm2pgsql version (with fastupdate=off indexes), this would still work, because gin_clean_pending_list can still be safely called.
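The steps above could be sketched roughly as follows. This is only an illustration, not osm2pgsql code: the function names and the index, table, and column names in the usage note are hypothetical, and treating "9.6" as "9.6 or later" is an assumption. `server_version_num` and `gin_clean_pending_list(regclass)` are the PostgreSQL facilities the sketch relies on.

```python
PG_9_6 = 90600  # server_version_num reported by PostgreSQL 9.6.0


def create_gin_index_sql(server_version_num, index, table, column):
    """Build the CREATE INDEX statement for a GIN index.

    fastupdate is only switched on when the pending list can later be
    flushed explicitly (assumed here: 9.6 or later).
    """
    fastupdate = "on" if server_version_num >= PG_9_6 else "off"
    return (
        "CREATE INDEX {} ON {} USING gin ({}) WITH (fastupdate = {})"
        .format(index, table, column, fastupdate)
    )


def post_update_sql(server_version_num, index):
    """Return the statement to run after an update, or None.

    gin_clean_pending_list is not available before 9.6, so nothing is
    issued on older servers.
    """
    if server_version_num < PG_9_6:
        return None
    return "SELECT gin_clean_pending_list('{}'::regclass)".format(index)
```

A caller would read the version once (for example with `SHOW server_version_num`), then call `create_gin_index_sql(version, "example_ways_nodes_idx", "example_ways", "nodes")` at import time and `post_update_sql(version, "example_ways_nodes_idx")` at the end of each update run.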

The migration is also easy.
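One way the migration might look, assuming the existing indexes were created with fastupdate=off and that changing the storage parameter in place is acceptable. The helper name and index name are hypothetical; `ALTER INDEX ... SET (fastupdate = on)` is standard PostgreSQL and does not rebuild the index.

```python
def migration_sql(index_names):
    """Return one ALTER INDEX statement per existing GIN index.

    Hypothetical helper: the statements only flip the storage
    parameter, so later insertions go through the pending list.
    """
    return [
        "ALTER INDEX {} SET (fastupdate = on)".format(name)
        for name in index_names
    ]
```

For example, `migration_sql(["example_ways_nodes_idx"])` yields a single statement to run once against the database.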

pnorman avatar Sep 09 '16 22:09 pnorman