horaedb icon indicating copy to clipboard operation
horaedb copied to clipboard

Proposal: make usage of `ScanIter` same as `RocksLogIterator`

Open ygf11 opened this issue 3 years ago • 4 comments

Description The usage of RocksLogIterator:

  1. seek or next
  2. visit key and value

The usage of ScanIter:

  1. visit key and value
  2. next

It is ok in main branch, since it will decode value immediately。 As I introduce #119, the value will be decoded later, and the item will be &[u8], it works in RocksLogIterator. But TableLogIterator will violate the borrow checker.

https://github.com/CeresDB/ceresdb/blob/21bc3e51bd03890a2998919b2d405ab3adc1b8b4/wal/src/table_kv_impl/region.rs#L686-L708

What do you think? @waynexia @ShiKaiWi @chunshao90

ygf11 avatar Jul 21 '22 04:07 ygf11

@ygf11 I have fetched your PR changes and compiler don't make any complaint. So could you give more explanation about TableLogIterator will violate the borrow checker.

ShiKaiWi avatar Jul 21 '22 10:07 ShiKaiWi

I have fetched your PR changes and compiler don't make any complaint. So could you give more explanation about TableLogIterator will violate the borrow checker.

@ShiKaiWi Sorry, I change a litter to make compiler happy, but it is not correct.

I reset this in my pr, you can run cargo build to see the error.

ygf11 avatar Jul 21 '22 10:07 ygf11

ScanIter is constructed by TableLogIterator::scan_buckets(). Its semantic is "the first call to next() will drop the first entry and move iter to point to the second entry". This makes us consume the "current" entry before calling next(). And this leads to the lifetime problem we are facing.

I haven't inspected how wide this semantic of ScanIter is used. If it only presents in this one place we can change it (but at a glance, I can see many references to that trait...). So I'm afraid we need some workarounds:

  • Clone the bytes and keep it in TableLogIterator to decouple the lifetime.
  • Wrap the return type of ScanIter::value() with a ref count. This may change lots of places either but logic is simpler.
  • Wrap over the ScanIter to change its semantics only in this place.

I plan to dig deeper in the next few days. Maybe we can simply make a clone first to unblock the whole progress.

waynexia avatar Jul 21 '22 14:07 waynexia

@waynexia Thanks for your response.

Its semantic is "the first call to next() will drop the first entry and move iter to point to the second entry". This makes us consume the "current" entry before calling next(). And this leads to the lifetime problem we are facing.

That's what problem I mean.

Wrap the return type of ScanIter::value() with a ref count. This may change lots of places either but logic is simpler.

Do you mean Arc<Vec<u8>>? If so, the value type of RocksLogIterator should also be Arc<Vec<u8>>, then clone becomes inevitable, which we may care about.

I plan to dig deeper in the next few days. Maybe we can simply make a clone first to unblock the whole progress

Let's clone it first, and make it better later.

ygf11 avatar Jul 22 '22 01:07 ygf11