Remove redundent information and optimize dynamic allocations in `Table`
Objective
- fix #12853
- Make
Table::allocatefaster
Solution
The PR consists of multiple steps:
- For the component data: create a new data-structure that's similar to
BlobVecbut doesn't storelen&capacityinside of it: "BlobArray" (name suggestions welcome) - For the
Tickdata: create a new data-structure that's similar toThinSlicePtrbut supports dynamic reallocation: "ThinArrayPtr" (name suggestions welcome) - Create a new data-structure that's very similar to
Columnthat doesn't storelen&capacityinside of it: "ThinColumn" - Adjust the
Tableimplementation to useThinColumninstead ofColumn
The result is that only one set of len & capacity is stored in Table, in Table::entities
Notes Regarding Performance
Apart from shaving off some excess memory in Table, the changes have also brought noteworthy performance improvements:
The previous implementation relied on Vec::reserve & BlobVec::reserve, but that redundantly repeated the same if statement (capacity == len). Now that check could be made at the Table level because the capacity and length of all the columns are synchronized; saving N branches per allocation. The result is a respectable performance improvement per every Table::reserve (and subsequently Table::allocate) call.
I'm hesitant to give exact numbers because I don't have a lot of experience in profiling and benchmarking, but these are the results I got so far:
add_remove_big/table benchmark after the implementation:
add_remove_big/table benchmark in main branch (measured in comparison to the implementation):
add_remove_very_big/table benchmark after the implementation:
add_remove_very_big/table benchmark in main branch (measured in comparison to the implementation):
cc @james7132 to verify
Changelog
- New data-structure that's similar to
BlobVecbut doesn't storelen&capacityinside of it:BlobArray - New data-structure that's similar to
ThinSlicePtrbut supports dynamic allocation:ThinArrayPtr - New data-structure that's very similar to
Columnthat doesn't storelen&capacityinside of it:ThinColumn - Adjust the
Tableimplementation to useThinColumninstead ofColumn - New benchmark:
add_remove_very_bigto benchmark the performance of spawning a lot of entities with a lot of components (15) each
Migration Guide
Table now uses ThinColumn instead of Column. That means that methods that previously returned Column, will now return ThinColumn instead.
ThinColumn has a much more limited and low-level API, but you can still achieve the same things in ThinColumn as you did in Column. For example, instead of calling Column::get_added_tick, you'd call ThinColumn::get_added_ticks_slice and index it to get the specific added tick.
looks like a segfault in run-examples-macos-metal ci? needs investigation
e: fixed
For the component data: create a new data-structure that's similar to
BlobVecbut doesn't storelen&capacityinside of it: "BlobArray" (name suggestions welcome) (This type makes a compile-time destinction between ZST types and non-ZST types)
I like UntrackedBlobVec for this concept -- "thin" in Rust contexts typically refers to pointer sizes, but the data structures aren't pointers. And IMO, "untracked" is clearer about the intention. By the same logic, I like UntrackedVec and UntrackedColumn over ThinArrayPtr and ThinColumn (assuming ThinArrayPtr is implemented like a Vec).
I'm not sure thee ZST separation is actually worth it. The cost of the branch is only needed up on
BlobVec::reserve_exact, which is only called when the entities Vec is reallocated, which occurs at typical power-of-two capacity threshold. The extra SparseSet likely negates any memory savings from the redundant length and capacity cuts. It might be worth benchmaking.
I did some benchmarking and the ZST separation indeed has negligible performance differences, it looks like the branch predictor is really pulling it's weight. So I merged the ZST and non-ZST columns. Thanks for calling this out!
Due to the amount of unsafe in this PR, merging now is a potential risk for 0.14, moving this to merge early on in the 0.15 cycle.
Unblocked 🎉
@Adamkob12 Do you have anytime to resolve conflicts to help this pr moving forword ? I don't want to miss out on such an excellent PR in 0.15 especially it has been approved by james7132 and james
@Adamkob12 sorry to leave you waiting. Once merge conflicts are resolved I'll get this in :)
No worries! I'll try to resolve the conflicts soon.
Fixed the conflicts, note for reviewers: pay extra attention to #[cfg(feature = "track_change_detection")] adjacent code, It's the first time I've seen this feature and integrating it into the PR was the majority of the work so it's very possible that I missed some small details in documentation or something like that.
Also we should probably bench this again to be safe
Your PR increases Bevy Minimum Supported Rust Version. Please update the rust-version field in the root Cargo.toml file.