consider alternative KV store
I noticed in a Safenet discussion a mention that sled "is buggy and doesn’t seem to be actively maintained", with Persy and Cacache being their current candidates for replacement.
I haven't encountered any bugs in sled in the past two years, so I don't think it's worryingly buggy for my use case. But I do have some worries about active maintenance. I reached out to the maintainer some time ago, and he told me he's working on a large low-level library that he'll integrate into sled too.
Some requirements for alternatives:
- Embeddable KV database
- Fast
- range and prefix queries possible
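To make the last requirement concrete, here is a minimal sketch of what range and prefix queries mean on an ordered KV store. It uses `std::collections::BTreeMap` as a stand-in for the real database (an assumption for illustration, not sled's or any other store's API); `Store`, `range_query` and `prefix_query` are hypothetical names.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Stand-in for an ordered, embedded KV store: keys sort bytewise.
type Store = BTreeMap<Vec<u8>, Vec<u8>>;

/// Range query: every entry with `start <= key < end`.
/// Note: `BTreeMap::range` panics if `start > end`.
fn range_query<'a>(store: &'a Store, start: &[u8], end: &[u8]) -> Vec<(&'a [u8], &'a [u8])> {
    store
        .range::<[u8], _>((Bound::Included(start), Bound::Excluded(end)))
        .map(|(k, v)| (k.as_slice(), v.as_slice()))
        .collect()
}

/// Prefix query: seek to the first key >= `prefix`, then read
/// until a key no longer starts with `prefix`.
fn prefix_query<'a>(store: &'a Store, prefix: &[u8]) -> Vec<(&'a [u8], &'a [u8])> {
    store
        .range::<[u8], _>((Bound::Included(prefix), Bound::Unbounded))
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, v)| (k.as_slice(), v.as_slice()))
        .collect()
}
```

Any candidate that keeps keys sorted can answer both queries with one seek plus a sequential read, which is why an unordered hash-based store would not meet this requirement.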
Some options:
- ReDB. Impressive benchmarks, faster than pretty much anything else! But it's quite new and doesn't have a stable format yet. Definitely keep an eye on this one!
- Reddit thread with more contenders
- TiKV (multi-node!)
- OpenDAL (supports sled, tikv, redb, so users can choose for themselves)
I walked through a similarly impressive list of contenders for an embedded DB before finding the Atomic Data server, and the choice is not clear-cut: for example, IndraDB has sled or Postgres as a dependency. While doing that research I started working on a test bench for the common Rust data structures used for this, but I ran out of steam.
If we want to consider moving to a different database, I would start with benchmarks: expand our criterion benchmarks to cover as many cases as possible, then select a handful to try with the new backend. I believe we can get a lot of improvement from in-memory data structures for the cache, like dashmap, before we need to move away from sled.
One more thought: the team at https://github.com/Synerise/cleora found it faster to rebuild the graph structure from data (they use a sparse matrix of nodes and edges stored in FxHash) than to deserialize it with serde.
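As a rough illustration of the in-memory cache idea, here is a sketch of a read-through cache placed in front of a slower store. It assumes a `std::sync::RwLock<HashMap>` as a stand-in for a concurrent map such as dashmap, and a hypothetical `load` closure standing in for the sled lookup; none of these names come from the Atomic-Server codebase.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Read-through cache in front of a slower backing store.
/// `F` is a hypothetical loader, e.g. a lookup in sled.
struct CachedStore<F>
where
    F: Fn(&str) -> Option<String>,
{
    cache: RwLock<HashMap<String, String>>,
    load: F,
}

impl<F> CachedStore<F>
where
    F: Fn(&str) -> Option<String>,
{
    fn new(load: F) -> Self {
        Self {
            cache: RwLock::new(HashMap::new()),
            load,
        }
    }

    /// Returns the cached value if present; otherwise asks the backing
    /// store and remembers the answer. Misses are not cached.
    fn get(&self, key: &str) -> Option<String> {
        if let Some(hit) = self.cache.read().unwrap().get(key) {
            return Some(hit.clone());
        }
        // The read guard is released here, before taking the write lock.
        let value = (self.load)(key)?;
        self.cache
            .write()
            .unwrap()
            .insert(key.to_string(), value.clone());
        Some(value)
    }
}
```

This sketch ignores invalidation on writes, which a real cache in front of the database would need; it only shows why repeated reads can skip the storage engine entirely.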
Update on sled: maintainer of sled is working on it in the background, mostly on a new storage engine. So sled ain't dead, baby.
Some other thoughts:
- Switching to Redis might help to achieve multi-node setup #213, although it is not embeddable. Maybe we need some sort of abstraction that allows users to switch KV store? Wouldn't be too complex, I think.
- Cloudflare's KV store might be interesting, too, as it allows for an edge deploy. Would probably involve rewriting far more, though.
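The abstraction mentioned in the first point could be sketched as a small trait that the server codes against, with one implementation per backend. Everything here is hypothetical (`KvStore`, `MemoryStore`, `open_store`), and only an in-memory backend is shown; sled, Redis or other backends would each be another `impl`.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical minimal interface the server would depend on.
trait KvStore {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn insert(&mut self, key: &[u8], value: &[u8]);
    fn remove(&mut self, key: &[u8]) -> Option<Vec<u8>>;
    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
}

/// In-memory backend, used here only to show the shape of an implementation.
#[derive(Default)]
struct MemoryStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KvStore for MemoryStore {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }

    fn insert(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }

    fn remove(&mut self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.remove(key)
    }

    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        self.map
            .range::<[u8], _>((Bound::Included(prefix), Bound::Unbounded))
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

/// Picks a backend by name at startup; unknown names yield `None`.
fn open_store(backend: &str) -> Option<Box<dyn KvStore>> {
    match backend {
        "memory" => Some(Box::new(MemoryStore::default())),
        _ => None,
    }
}
```

A synchronous trait like this fits an embedded store; a networked backend such as Redis would likely push the interface towards async methods and fallible return types, which is part of what makes the abstraction less trivial than it first looks.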
Do note that the new engine is licensed under GPL3. I'm not familiar with how sled is being used in your project, but it may be incompatible with your MIT license.
https://github.com/komora-io/marble/blob/main/Cargo.toml#L7
@netthier That could very well be a problem, thanks! I've sent a mail to sled's maintainer.
Relevant issue in marble: https://github.com/komora-io/marble/issues/7
I propose to hook into Apache OpenDAL (Data Access Library). I was going to use it to handle S3 uploads and writes, but it also supports in-memory, sled, dashmap and Redis backends, in addition to all major cloud services and IPFS. Fully functional example:
```rust
use std::collections::HashMap;
use std::env;

use log::{debug, info};
use opendal::layers::LoggingLayer;
use opendal::services;
use opendal::{Operator, Result, Scheme};

#[tokio::main]
async fn main() -> Result<()> {
    let _ = tracing_subscriber::fmt()
        .with_env_filter("info")
        .try_init();

    // Run the same write/read/stat round trip against every backend.
    let schemes = [Scheme::S3, Scheme::Memory, Scheme::Dashmap, Scheme::Sled, Scheme::Redis];
    for scheme in schemes.iter() {
        info!("scheme: {:?}", scheme);
        read_and_write(*scheme).await?;
    }
    Ok(())
}

async fn read_and_write(scheme: Scheme) -> Result<()> {
    // Build an operator for the requested backend.
    let op = match scheme {
        Scheme::S3 => {
            let op = init_operator_via_map()?;
            debug!("operator: {op:?}");
            op
        }
        Scheme::Dashmap => {
            let builder = services::Dashmap::default();
            // Init an operator with the logging layer enabled.
            let op = Operator::new(builder)?
                .layer(LoggingLayer::default())
                .finish();
            debug!("operator: {op:?}");
            op
        }
        Scheme::Sled => {
            let mut builder = services::Sled::default();
            builder.datadir("/tmp/opendal/sled");
            // Init an operator with the logging layer enabled.
            let op = Operator::new(builder)?
                .layer(LoggingLayer::default())
                .finish();
            debug!("operator: {op:?}");
            op
        }
        Scheme::Redis => {
            let builder = services::Redis::default();
            // Init an operator with the logging layer enabled.
            let op = Operator::new(builder)?
                .layer(LoggingLayer::default())
                .finish();
            debug!("operator: {op:?}");
            op
        }
        // Scheme::Memory and any other scheme fall back to the in-memory service.
        _ => {
            let builder = services::Memory::default();
            // Init an operator with the logging layer enabled.
            let op = Operator::new(builder)?
                .layer(LoggingLayer::default())
                .finish();
            debug!("operator: {op:?}");
            op
        }
    };

    // Write data into object "test".
    let test_string = format!("Hello, World! {scheme}");
    op.write("test", test_string).await?;

    // Read data back from the object.
    let bs = op.read("test").await?;
    info!("content: {}", String::from_utf8_lossy(&bs));

    // Get object metadata.
    let meta = op.stat("test").await?;
    info!("meta: {:?}", meta);
    Ok(())
}

fn init_operator_via_map() -> Result<Operator> {
    // Read the credentials from the environment.
    let access_key_id =
        env::var("AWS_ACCESS_KEY_ID").expect("AWS_ACCESS_KEY_ID is set and a valid String");
    let secret_access_key =
        env::var("AWS_SECRET_ACCESS_KEY").expect("AWS_SECRET_ACCESS_KEY is set and a valid String");

    let mut map = HashMap::default();
    map.insert("bucket".to_string(), "test".to_string());
    map.insert("region".to_string(), "us-east-1".to_string());
    map.insert("endpoint".to_string(), "http://rpi4node3:8333".to_string());
    map.insert("access_key_id".to_string(), access_key_id);
    map.insert("secret_access_key".to_string(), secret_access_key);

    let op = Operator::via_map(Scheme::S3, map)?;
    Ok(op)
}
```
This one: https://github.com/apache/incubator-opendal
Wow @AlexMikhalev, that looks really promising! It seems to support [scan], so that's good, although that is missing in the sled connector. I'm also wondering whether it has Tree support; see these issues:
https://github.com/apache/incubator-opendal/issues/2498
https://github.com/apache/incubator-opendal/issues/2497
Hi, I'm the maintainer of OpenDAL. Thanks for @AlexMikhalev's sharing and @joepio's contact!
I'm here to bring some updates from the OpenDAL side:
- Sled scan KV query support (https://github.com/apache/incubator-opendal/issues/2497) is fixed.
- Named KV stores / trees for sled (https://github.com/apache/incubator-opendal/issues/2498) is almost done and waiting for some review; I expect it will be released soon.
Apart from the existing issues, I'm interested in adding support for more services so our users can have more choices:
- Add new service support: mini_moka https://github.com/apache/incubator-opendal/issues/2518
- Add new service support: persy https://github.com/apache/incubator-opendal/issues/2522
- Add new service support: cacache https://github.com/apache/incubator-opendal/issues/2523
- Add new service support: ReDB https://github.com/apache/incubator-opendal/issues/2524
Please feel free to let me know if there is anything I can help you with!
@Xuanwo awesome. A small example (like example 2 in your plans) of how to use OpenDAL from tokio async functions would help me personally. I am building a product complementary to Atomic, https://terraphim.ai/, and I want to plug in an OpenDAL operator instead of the redis.rs KV.
Thanks for the feedback! I will write one tomorrow 🤪
OpenDAL now also supports TiKV! https://github.com/apache/incubator-opendal/issues/2533
This opens up multi-node setups for Atomic-Server.
One month later, the OpenDAL community has implemented all of these issues :rocket:!