Range scans in transactional context
Use Case:
Key range scans in a transactional context that take into account keys added or removed in the current transaction, similar to TransactionalTree.get().
Without this, it is impossible to emulate even simple SQL SELECT-like scenarios in a transactional context: even if you do a range scan outside the transaction and pass a copy inside, you can't merge it with the keys modified by the current transaction.
Proposed Change (tentative):
- change the transaction caches from HashMap to BTreeMap, to allow for key ordering
- implement an iterator which does an ordered merge of the Tree iterator and the transactional cache
- implement TransactionalTree.range(), which should produce such an iterator, and possibly other shortcut methods like get_lt/get_gt/...
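To make the ordered-merge idea concrete, here is a minimal sketch using only the standard library. It is not sled code: a plain BTreeMap stands in for the Tree, and `WriteCache`, `MergedRange` and the `Option<Value>` encoding of pending removals are hypothetical names and choices made up for illustration.

```rust
use std::cmp::Ordering;
use std::collections::btree_map::{self, BTreeMap};
use std::iter::Peekable;
use std::ops::RangeBounds;

type Key = Vec<u8>;
type Value = Vec<u8>;

/// Hypothetical write cache of a transaction:
/// `Some(v)` is a pending insert/overwrite, `None` is a pending removal.
type WriteCache = BTreeMap<Key, Option<Value>>;

/// Ordered merge of a committed snapshot (stand-in for a Tree range
/// iterator) with the pending writes of the current transaction.
struct MergedRange<'a> {
    base: Peekable<btree_map::Range<'a, Key, Value>>,
    cache: Peekable<btree_map::Range<'a, Key, Option<Value>>>,
}

impl<'a> MergedRange<'a> {
    fn new<R>(base: &'a BTreeMap<Key, Value>, cache: &'a WriteCache, range: R) -> Self
    where
        R: RangeBounds<Key> + Clone,
    {
        MergedRange {
            base: base.range::<Key, _>(range.clone()).peekable(),
            cache: cache.range::<Key, _>(range).peekable(),
        }
    }
}

impl<'a> Iterator for MergedRange<'a> {
    type Item = (&'a Key, &'a Value);

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            // Decide which side holds the smaller next key.
            let order = match (self.base.peek(), self.cache.peek()) {
                (None, None) => return None,
                (Some(_), None) => Ordering::Less,
                (None, Some(_)) => Ordering::Greater,
                (Some((bk, _)), Some((ck, _))) => bk.cmp(ck),
            };
            match order {
                // Key exists only in the committed data: yield it as is.
                Ordering::Less => return self.base.next(),
                // Key exists on both sides: the transaction's view wins,
                // so drop the committed entry and fall through to the cache.
                Ordering::Equal => {
                    self.base.next();
                }
                // Key exists only in the cache.
                Ordering::Greater => {}
            }
            // Pending inserts are yielded; pending removals are skipped.
            if let Some((key, Some(value))) = self.cache.next() {
                return Some((key, value));
            }
        }
    }
}
```

With committed keys a, b, d and a cache that removes b, inserts c and overwrites d, iterating `MergedRange::new(&base, &cache, ..)` yields a, c and the new value of d, in key order. This only covers the read side; it does nothing about conflicts with concurrent writers.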
Benefits
- TransactionalTree and Tree APIs will be more consistent
- Eliminates the need to access Trees directly from within a transactional context
- An API for read access to the transactional write cache (= modified keys) may be useful by itself
See #382. It's also necessary to detect conflicts between range scans and inserts of keys that should have been included in the range scan.
I understand that a fully serializable isolation level is not achievable this way, but it is not necessary for all use cases, and the limitation should be documented as such. I see it as a relatively simple-to-implement provisional measure until the transactions overhaul (#382 was created almost two years ago...)
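As a rough illustration of the scan/insert conflict described here (not how sled or #382 implements it), a transaction could log every range it scanned and, at commit time, check whether any key written by a transaction that committed in the meantime falls inside one of those ranges. `ScanLog`, `record` and `conflicts_with` are hypothetical names invented for this sketch.

```rust
use std::ops::Bound;

type Key = Vec<u8>;

/// Hypothetical per-transaction log of the key ranges it has scanned.
#[derive(Default)]
struct ScanLog {
    ranges: Vec<(Bound<Key>, Bound<Key>)>,
}

/// Returns true if `key` lies within the given lower/upper bounds.
fn in_range(key: &Key, (lo, hi): &(Bound<Key>, Bound<Key>)) -> bool {
    let above_lower = match lo {
        Bound::Included(l) => key >= l,
        Bound::Excluded(l) => key > l,
        Bound::Unbounded => true,
    };
    let below_upper = match hi {
        Bound::Included(h) => key <= h,
        Bound::Excluded(h) => key < h,
        Bound::Unbounded => true,
    };
    above_lower && below_upper
}

impl ScanLog {
    /// Remember that the transaction observed every key in this range.
    fn record(&mut self, lo: Bound<Key>, hi: Bound<Key>) {
        self.ranges.push((lo, hi));
    }

    /// At commit time: did another, already committed transaction insert
    /// or remove a key that one of our scans should have seen?
    fn conflicts_with<'k>(&self, committed_writes: impl IntoIterator<Item = &'k Key>) -> bool {
        committed_writes
            .into_iter()
            .any(|key| self.ranges.iter().any(|range| in_range(key, range)))
    }
}
```

If `conflicts_with` returns true, the scanning transaction would have to be retried. The sketch assumes the set of concurrently committed writes is available at commit time, which is exactly the bookkeeping the current transaction implementation would need to gain.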
I don't want to spend effort on a bug-prone version of this. I want to either support it with full serializability or not at all, as anything in-between will lead to people experiencing bugs without realizing it, because most people don't understand the different isolation levels. A big goal for this project is to make it harder if possible to misuse by accident. The current transaction API is generally not acceptable for this goal in my opinion, but I'd like to avoid making it more complex until sturdier foundations are put into place.
Thanks, this makes sense. It would be very helpful if you could outline your vision of how the transactional API will evolve (from the user's perspective), so people have a better idea of what to expect and can make more meaningful contributions.
A few more thoughts:
- the current behavior when accessing Tree methods from a transactional context in the same thread is to silently deadlock, which is not pretty. At the very least, this should be mentioned in the documentation. Another option is to use the `deadlock_detection` feature of `parking_lot` as a provisional measure.
- in the future, what about permitting access to the non-isolated Tree API from a transactional context, but making it `unsafe`?
I need this feature for my project. This is the only reason why I cannot use sled and must stay on LMDB :/
> A big goal for this project is to make it harder if possible to misuse by accident.
I am of the opinion that until this feature can be provided in a robust way, it would be nice to provide it in a less robust way, but behind an unsafe block. If the docs explain why it's unsafe and what the risks are, the user cannot misuse sled by accident :)