miniob icon indicating copy to clipboard operation
miniob copied to clipboard

[BUG]索引扫描删除记录不能完全删除

Open warr99 opened this issue 2 years ago • 6 comments

Describe the bug 走索引扫描删除记录时删除不能全部删除

Environment

  • OS Version: ubuntu22.04
  • CPU Arch(x86/arm): x86
  • Compiler: GCC 11.4.0 x86_64-linux-gnu
  • Others:

Fast Reproduce Steps(Required) Steps to reproduce the behavior: miniob> create table t (id int,age int); SUCCESS miniob> create index t_id_idx on t (id); SUCCESS miniob> insert into t values (1,10); SUCCESS miniob > insert into t values (1,11); SUCCESS miniob > insert into t values (1,12); SUCCESS miniob > select from t; id|age 1|10 1|11 1|12 miniob > delete from t where id 1; SUCCESS Expected behavior 记录应该全部被删除 miniob > select from t; id|age Actual Behavior What is the result? picture is allowed 中间的记录被跳过没有被删除 miniob > select from t; id|age 1|11 c4e7acf9a76a5e413ece81d0a113aa9 Additional context

  • 导致该问题的原因是在RC BplusTreeScanner::next_entry(RID& rid, bool idx_need_increase)这个函数中查找下一个entry时,会把iter_index_++,但删除一条记录会将其对应的索引向前覆盖memmove(__item_at(index), __item_at(index + 1), (size() - index - 1) * item_size())iter_index_++就直接跳过了下一条要获取的记录了。

warr99 avatar Sep 24 '23 05:09 warr99

Thanks. Do you have some suggestions? Or would you like to create a pull request?

hnwyllmm avatar Sep 25 '23 01:09 hnwyllmm

Thanks. Do you have some suggestions? Or would you like to create a pull request?

To minimize code structure modifications as much as possible, my current solution is to add a boolean property called idx_need_increase_ within the IndexScanPhysicalOperator class, with a default value of true. I would also provide methods set_idx_need_increase() and get_idx_need_increase(). In the DeletePhysicalOperator::next() function, if the record deletion is successful, I would attempt to cast the PhysicalOperator* child into an IndexScanPhysicalOperator*. If the casting is successful, I would then call set_idx_need_increase() to set it to false. In the next_entry() function, I would use the provided idx_need_increase parameter to decide whether to perform the iter_index_++ operation.

If you think this approach is feasible, I can submit a pull request (PR), or you can provide me with some suggestions.

warr99 avatar Sep 25 '23 05:09 warr99

Thanks for your suggestion. It looks not a perfect solution.

  1. Your solution only suit for deletion and cannot used in insertion scenario
  2. Cast the child to specific type is not a generic method.

Do you know the method that some real-database uses, such as mysql, postgresql? And it also looked like std::iterator with insert/delete.

hnwyllmm avatar Sep 25 '23 05:09 hnwyllmm

Do you mean that, similar to using erase() for deleting elements in a vector, it returns a valid iterator? I'm not very familiar with how MySQL/PostgreSQL implement this feature. Could you provide me with some keywords so that I can try to implement this solution?

warr99 avatar Sep 25 '23 06:09 warr99

std::iterator cannot deal with erase or insert correctly. I'm sorry that I am not familiar with mysql/postgresql too and I cannot provide any useful information too. I will trace this issue and if you have any other ideas, discussion is welcome.

hnwyllmm avatar Sep 25 '23 06:09 hnwyllmm

Is it possible to fix this bug? The test samples in the competition involve this issue, and it won't pass the test if it's not fixed.

0130w avatar Oct 21 '23 03:10 0130w