import endpoints which take a shard should optionally allocate the shard for you
Description
There are many situations where records in a data set do not have natural sequential integer IDs and we want to simply assign them at ingest time (rather than using column translation). In these cases, a single client importing into an empty Pilosa can fairly easily generate sequential IDs and import the data with good performance.
But what if there are multiple clients? What if there is already data in Pilosa? Clients need some way of coordinating which IDs each one will allocate, and of understanding what data is already in Pilosa so it doesn't get overwritten. One could imagine any variety of schemes where clients communicate with each other (or through another service) to synchronize, or are configured ahead of time not to produce overlapping IDs, and can interrogate Pilosa about what data it already has, but these all seem messy and like a lot of work.
What if the import endpoints which take a shard parameter simply allowed you to leave that parameter off? If you don't specify which shard the data should go into, Pilosa knows it should use a new, empty shard to ingest the data. This would allow any number of concurrent clients to throw shardfuls of data at Pilosa without having to do any synchronization or pre-communication!
Pilosa will have to take care that concurrent requests don't end up in the same shard, but this is purely internal to Pilosa, and seems a lot easier than trying to coordinate with clients. The node receiving the request can look at its availableShards data and select from among the first few empty shards. It would then send a request to all the owners of that shard letting them know it is reserving it. If the owners respond that they have no data for that shard, the node can continue with the import. The owners would have to reject any other modifications to that shard until the import is finished.
I'm unclear on how the messy client communication differs from the "simple" internal coordination described at the end. I feel that coordination at the time of delivery of a shard would probably be more expensive than coordination in the planning stage of shard creation, at least for a managed batch ingest like Spark.
How would your proposal handle, say, the same binary shard imported twice in a row?
My thought was that Pilosa's internal communication is simpler because the infrastructure to do it is already in place, but I see your point about the same shard arriving more than once. Clients would need to add some kind of transaction ID to each set of records they import so that Pilosa can check whether it has already imported that set in the case of a client retry. Pilosa would then need to keep track of those transaction IDs as well, which seems kind of ugly.
I'll icebox this for now...