badger
badger copied to clipboard
[Hackathon2020 - Do not merge] Greedy-piecewise-linear-regression (PLR) to speed up int type key search for LSM-tree (bourbon idea)
Disclaimer
Original implementation made by @spongedu at here https://github.com/spongedu/badger/tree/hackathon_go_plr_with_pointget_uint64key I ported over and did some clean up work.
Idea from paper: From WiscKey to Bourbon: A Learned Index for Log-Structured Merge Trees, by Yifan Dai
Rust code source:
- binary
plr(which used in badger code) code here https://github.com/liufuyang/plr-test/blob/master/src/main.rs -
plrlibrary code here https://github.com/RyanMarcus/plr
Benchmark code here:
https://github.com/liufuyang/tbadger/blob/master/badger_test.go
With PLR, with pointGet
(fuyang-go-bench-badger *%) 🦀 👉 go test -bench=BenchmarkBadger -benchtime=30s -run=^s
2021/01/16 18:36:38 plr opening file: data/00000004.sst
2021/01/16 18:36:38 data/00000004.sst
2021/01/16 18:36:38 plrSegment loaded: [{0 43371 0.002763512135689483 1.301027126230357} {43371 173143 0.001957717574102289 32.139598333991884} {173143 436947 0.0016452183457147617 82.19281535952956} {436947 833712 0.0014415577306141915 167.1788822889456} {833712 1.393807e+06 0.0013017392682151607 279.7244833388081} {1.393807e+06 2.144502e+06 0.0012110451657496644 402.0680961828125} {2.144502e+06 3.243884e+06 0.001120694858783678 591.6862955068373} {3.243884e+06 4.252725e+06 0.001064653793312162 769.4048384938992} {4.252725e+06 5.489191e+06 0.0010158619299836292 972.7984525022184} {5.489191e+06 7.260306e+06 0.0009696070011254119 1222.5316082781937} {7.260306e+06 8.683148e+06 0.000932753999857703 1485.8753435962026} {8.683148e+06 1.0541136e+07 0.0009026047455994166 1743.6066808302094} {1.0541136e+07 1.2621543e+07 0.0008782587667118885 1996.1946893039676} {1.2621543e+07 1.4617779e+07 0.0008541220294640251 2296.6850507190775} {1.4617779e+07 1.7976931348623157e+308 0.0008369829862892688 2543.212360918902}]
-------> Hit rate: 0.000000
goos: darwin
goarch: amd64
pkg: tbadger
BenchmarkBadger-12 -------> Hit rate: 0.040000
-------> Hit rate: 0.042700
-------> Hit rate: 0.039912
-------> Hit rate: 0.039828
-------> Hit rate: 0.039901
40003507 985 ns/op
PASS
ok tbadger 69.299s
WithoutPLR, with pointGet
(fuyang-go-bench-badger *%) 🦀 👉 go test -bench=BenchmarkBadger -benchtime=30s -run=^s
2021/01/16 18:40:05 plr opening file: data/00000004.sst
2021/01/16 18:40:05 data/00000004.sst
2021/01/16 18:40:05 plrSegment loaded: [{0 43371 0.002763512135689483 1.301027126230357} {43371 173143 0.001957717574102289 32.139598333991884} {173143 436947 0.0016452183457147617 82.19281535952956} {436947 833712 0.0014415577306141915 167.1788822889456} {833712 1.393807e+06 0.0013017392682151607 279.7244833388081} {1.393807e+06 2.144502e+06 0.0012110451657496644 402.0680961828125} {2.144502e+06 3.243884e+06 0.001120694858783678 591.6862955068373} {3.243884e+06 4.252725e+06 0.001064653793312162 769.4048384938992} {4.252725e+06 5.489191e+06 0.0010158619299836292 972.7984525022184} {5.489191e+06 7.260306e+06 0.0009696070011254119 1222.5316082781937} {7.260306e+06 8.683148e+06 0.000932753999857703 1485.8753435962026} {8.683148e+06 1.0541136e+07 0.0009026047455994166 1743.6066808302094} {1.0541136e+07 1.2621543e+07 0.0008782587667118885 1996.1946893039676} {1.2621543e+07 1.4617779e+07 0.0008541220294640251 2296.6850507190775} {1.4617779e+07 1.7976931348623157e+308 0.0008369829862892688 2543.212360918902}]
-------> Hit rate: 0.000000
goos: darwin
goarch: amd64
pkg: tbadger
BenchmarkBadger-12 -------> Hit rate: 0.060000
-------> Hit rate: 0.039500
-------> Hit rate: 0.039985
-------> Hit rate: 0.039894
-------> Hit rate: 0.039881
38633751 1025 ns/op
PASS
ok tbadger 60.518s
WithPLR, without pointGet
(master %) 🦀 👉 go test -bench=BenchmarkBadger -benchtime=30s -run=^s
2021/01/16 18:52:33 plr opening file: data/00000004.sst
2021/01/16 18:52:33 data/00000004.sst
2021/01/16 18:52:33 plrSegment loaded: [{0 43371 0.002763512135689483 1.301027126230357} {43371 173143 0.001957717574102289 32.139598333991884} {173143 436947 0.0016452183457147617 82.19281535952956} {436947 833712 0.0014415577306141915 167.1788822889456} {833712 1.393807e+06 0.0013017392682151607 279.7244833388081} {1.393807e+06 2.144502e+06 0.0012110451657496644 402.0680961828125} {2.144502e+06 3.243884e+06 0.001120694858783678 591.6862955068373} {3.243884e+06 4.252725e+06 0.001064653793312162 769.4048384938992} {4.252725e+06 5.489191e+06 0.0010158619299836292 972.7984525022184} {5.489191e+06 7.260306e+06 0.0009696070011254119 1222.5316082781937} {7.260306e+06 8.683148e+06 0.000932753999857703 1485.8753435962026} {8.683148e+06 1.0541136e+07 0.0009026047455994166 1743.6066808302094} {1.0541136e+07 1.2621543e+07 0.0008782587667118885 1996.1946893039676} {1.2621543e+07 1.4617779e+07 0.0008541220294640251 2296.6850507190775} {1.4617779e+07 1.7976931348623157e+308 0.0008369829862892688 2543.212360918902}]
-------> Hit rate: 0.000000
goos: darwin
goarch: amd64
pkg: tbadger
BenchmarkBadger-12 -------> Hit rate: 0.030000
-------> Hit rate: 0.039600
-------> Hit rate: 0.039574
-------> Hit rate: 0.039863
-------> Hit rate: 0.039875
18224655 1865 ns/op
PASS
ok tbadger 60.895s
WithoutPLR, without pointGet
(master %) 🦀 👉 go test -bench=BenchmarkBadger -benchtime=30s -run=^s
2021/01/16 18:50:51 plr opening file: data/00000004.sst
2021/01/16 18:50:51 data/00000004.sst
2021/01/16 18:50:51 plrSegment loaded: [{0 43371 0.002763512135689483 1.301027126230357} {43371 173143 0.001957717574102289 32.139598333991884} {173143 436947 0.0016452183457147617 82.19281535952956} {436947 833712 0.0014415577306141915 167.1788822889456} {833712 1.393807e+06 0.0013017392682151607 279.7244833388081} {1.393807e+06 2.144502e+06 0.0012110451657496644 402.0680961828125} {2.144502e+06 3.243884e+06 0.001120694858783678 591.6862955068373} {3.243884e+06 4.252725e+06 0.001064653793312162 769.4048384938992} {4.252725e+06 5.489191e+06 0.0010158619299836292 972.7984525022184} {5.489191e+06 7.260306e+06 0.0009696070011254119 1222.5316082781937} {7.260306e+06 8.683148e+06 0.000932753999857703 1485.8753435962026} {8.683148e+06 1.0541136e+07 0.0009026047455994166 1743.6066808302094} {1.0541136e+07 1.2621543e+07 0.0008782587667118885 1996.1946893039676} {1.2621543e+07 1.4617779e+07 0.0008541220294640251 2296.6850507190775} {1.4617779e+07 1.7976931348623157e+308 0.0008369829862892688 2543.212360918902}]
-------> Hit rate: 0.000000
goos: darwin
goarch: amd64
pkg: tbadger
BenchmarkBadger-12 -------> Hit rate: 0.030000
-------> Hit rate: 0.037500
-------> Hit rate: 0.039512
-------> Hit rate: 0.039923
14039182 2240 ns/op
PASS
ok tbadger 34.574s
Summary:
- With PLR, with pointGet 985 ns/op
- W/ot PLR, with pointGet 1025 ns/op (Baseline)
Turn off badger index/pointGet to see the effect of PLR vs bare binary search
- With PLR, w/ot pointGet 1865 ns/op
- W/ot PLR, w/ot pointGet 2240 ns/op