kiddo icon indicating copy to clipboard operation
kiddo copied to clipboard

Why must Content be Zero and One?

Open wentasah opened this issue 2 years ago • 8 comments

From a quick look, it seems that this is because KdTree::size is T, which might not be necessary. I'd like to store a simple data structure as Content, and having this structure represent size makes no sense. Or is there anything else I don't see?

wentasah avatar May 10 '23 07:05 wentasah

The intent was that this would nudge users towards storing their structs in a Vec rather than the tree, and storing indexes into that Vec in the KdTree itself. This keeps the size of the items down, helping to keep the leaf nodes smaller, which helps with fitting more into the CPU cache. Since the intent was to store index values, the max value returnable by size() would be the max value that this index could take, hence the return type being T.

On reflection, this can be a bit counter-intuitive. I'd consider changing the return type of size() to usize since this is really what you'd normally expect to get back from the size() method of containers in Rust generally.

sdd avatar May 11 '23 06:05 sdd

This makes sense.

Actually, what I really need is to get coordinates of the nearest point. It seems it should be possible to get them from the tree directly, but there is no API for that. Now, I need to store the points separately in a Vec and get the coordinates from there. Since the size of my point is 64 bits, storing the point in the tree would make no difference in size. If I can get coordinates from the tree, I could have content () and the tree would be even smaller.

wentasah avatar May 11 '23 06:05 wentasah

It depends on how many points you have. Indexing into a Vec is a usize by default (sounds like you're on a 64 bit arch so 64 bits in this case) but i'm guessing that you're not storing > 2 billion points and so a u32 would do, which can then be cast back to a usize when accessing the Vec. This is the approach that I use.

Clearly though this is not as useful as simply having the ability to return the point when making the query. We could add a method called nearest_one_point or similar that returns the point rather than the content. That would be fairly straightforward. This could be more efficient and performant if T could also be (), which would be more work and potentially a breaking change.

I'm open to adding the nearest_one_point method for now, and then addressing the ability to have T as () in the upcoming major release I have planned. How does that sound for you?

sdd avatar May 12 '23 10:05 sdd

That sounds perfect. My current application is not performance critical so I don't need T to be () right now.

wentasah avatar May 12 '23 12:05 wentasah

I'm open to adding the nearest_one_point method for now, and then addressing the ability to have T as () in the upcoming major release I have planned. How does that sound for you?

Hello Scott.

Just to inform you of my situation: In a project I'm a part of, we are currently storing RGB color information in the Kdtree as Content, while also using that same color information as coordinates. This means that we're actually storing the color information twice. Letting T be () would therefor be useful to us. This is also not a performance critical application in any way, but it would be nice. I'm looking forward to the next major release.

Cheers, Youser

NeuralModder avatar May 24 '24 03:05 NeuralModder

Just to let you know @NeuralModder that the long-awaited next version has been released as https://crates.io/crates/kiddo/5.0.0. Unfortunately there is still no means to have T as () but now that some long-standing issues with the ImmutableKdTree have been resolved I've got more time to re-look at older feature requests.

sdd avatar Dec 04 '24 21:12 sdd

Thanks for the update. We don't currently utilize ImmutableKdTrees and having T as () is what I'm waiting for, but I appreciate it nonetheless.

NeuralModder avatar Dec 11 '24 06:12 NeuralModder