DataStructures.jl icon indicating copy to clipboard operation
DataStructures.jl copied to clipboard

In indexing data, RTtree search may be far inferior to AVLTree

Open Klaixiya opened this issue 5 years ago • 1 comments

I had done some tests to find out the performance of indexing data using RBTree or AVLTree, the results is following:


rbtree = RBTree{Int}()
avltree = AVLTree{Int}()
sets = rand(collect(1:10^6), 10000)

for k in sets
   insert!(rbtree, k)
end


for k in sets
   insert!(avltree, k)
end

testset = collect(1:10^6)
ind = collect(1:9000)

@benchmark rbtree[rand(ind)]

BenchmarkTools.Trial: 
  memory estimate:  2.44 MiB
  allocs estimate:  27788
  --------------
  minimum time:     3.007 ms (0.00% GC)
  median time:      3.379 ms (0.00% GC)
  mean time:        3.816 ms (7.58% GC)
  maximum time:     18.757 ms (61.41% GC)
  --------------
  samples:          1306
  evals/sample:     1



@benchmark avltree[rand(ind)]
BenchmarkTools.Trial: 
  memory estimate:  86 bytes
  allocs estimate:  5
  --------------
  minimum time:     924.000 ns (0.00% GC)
  median time:      1.114 μs (0.00% GC)
  mean time:        1.191 μs (0.00% GC)
  maximum time:     4.403 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     14

I want to know why RBTree is so ineffecient, that is so weird!

Klaixiya avatar Jan 24 '21 05:01 Klaixiya

Hi! Well indexing in RBTree is supposed to take more time as it has been implemented in linear time complexity by collecting the array done by a inorder traversal of the tree. Meanwhile, in AVLTree, the data structure has been augmented to provide operations related to order-statistics in logarithmic time.

eulerkochy avatar Jan 24 '21 13:01 eulerkochy