Add z-index support with a predictable UI stack
Objective
Add consistent UI rendering and interaction where deep nodes inside two different hierarchies will never render on top of one-another by default and offer an escape hatch (z-index) for nodes to change their depth.
The problem with current implementation
The current implementation of UI rendering is broken in that regard, mainly because it sets the Z value of the Transform component based on a "global Z" space shared by all nodes in the UI. This doesn't account for the fact that each node's final GlobalTransform value will be relative to its parent. This effectively makes the depth unpredictable when two deep trees are rendered on top of one-another.
At the moment, it's also up to each part of the UI code to sort all of the UI nodes. The solution that's offered here does the full sorting of UI node entities once and offers the result through a resource so that all systems can use it.
Solution
New ZIndex component
This adds a new optional ZIndex enum component for nodes which offers two mechanism:
-
ZIndex::Local(i32): Overrides the depth of the node relative to its siblings. -
ZIndex::Global(i32): Overrides the depth of the node relative to the UI root. This basically allows any node in the tree to "escape" the parent and be ordered relative to the entire UI.
Note that in the current implementation, omitting ZIndex on a node has the same result as adding ZIndex::Local(0). Additionally, the "global" stacking context is essentially a way to add your node to the root stacking context, so using ZIndex::Local(n) on a root node (one without parent) will share that space with all nodes using Index::Global(n).
New UiStack resource
This adds a new UiStack resource which is calculated from both hierarchy and ZIndex during UI update and contains a vector of all node entities in the UI, ordered by depth (from farthest from camera to closest). This is exposed publicly by the bevy_ui crate with the hope that it can be used for consistent ordering and to reduce the amount of sorting that needs to be done by UI systems (i.e. instead of sorting everything by global_transform.z in every system, this array can be iterated over).
New z_index example
This also adds a new z_index example that showcases the new ZIndex component. It's also a good general demo of the new UI stack system, because making this kind of UI was very broken with the old system (e.g. nodes would render on top of each other, not respecting hierarchy or insert order at all).

Changelog
- Added the
ZIndexcomponent to bevy_ui. - Added the
UiStackresource to bevy_ui, and added implementation in a newstack.rsmodule. - Removed the previous Z updating system from bevy_ui, because it was replaced with the above.
- Changed bevy_ui rendering to use UiStack instead of z ordering.
- Changed bevy_ui focus/interaction system to use UiStack instead of z ordering.
- Added a new z_index example.
ZIndex demo
Here's a demo I wrote to test these features https://user-images.githubusercontent.com/1060971/188329295-d7beebd6-9aee-43ab-821e-d437df5dbe8a.mp4
The local and global z-index solution in this PR is an attempt to bring flexibility without adding too much complexity.
Local z-index allows a node to change its depth compared to its siblings regardless of the order they were added to their parent, while the Global z-index allows nodes to "escape" the parent and be ordered relative to the entire UI.
Those two use-cases are the most common in my opinion, but I can see some benefits in other more complex solutions. For example, having the ZIndex::Local depth be relative to the nearest parent node that has a StackingContext component attached to it (effectively creating contexts dynamically instead of basing it solely on siblings).
This basically allows any node in the tree to
The sentence cut off.
I think this system works great with how Bevy UI works at the moment, because it assumes one window and viewport. However, the UiStack will need to change slightly after or before #5892 gets merged. The UiStack struct will be even more useful if it contains one Vec<Entity> per UI root, and each of those roots comes with information on where they're getting rendered to.
For example, the UI interaction system will be able to filter on the root's target information to only update UI that's getting rendered on the current active window. This will also be useful during UI rendering when multiple render targets are supported.
I think that this PR could be improved by
- Addressing how this solution relates to some of the discussion / design from the previous attempt in #1211
- Adding a test. This changeset puts us at -1 tests for "ui z stuff."
- Some benchmarking to demonstrate that there are no major perf regressions.
The included example serves as a good demo, but putting myself in the shoes of a user who is trying to learn how ZIndex works, it seems pretty difficult to follow.
Addressing how this solution relates to some of the discussion / design from the previous attempt in https://github.com/bevyengine/bevy/pull/1211
This was what motivated me to add the Controversial label. So long as we can avoid the problems there, I'm happy to move forward on this. I think we have consensus that this needs to happen, and this seems like a good step forward.
Much of the related camera-driven rendering + hierarchy mess that plagued #1211 has since been dramatically improved.
Addressing how this solution relates to some of the discussion / design from the previous attempt in #1211
This was what motivated me to add the Controversial label. So long as we can avoid the problems there, I'm happy to move forward on this. I think we have consensus that this needs to happen, and this seems like a good step forward.
Much of the related camera-driven rendering + hierarchy mess that plagued #1211 has since been dramatically improved.
I made a summary of some important points from that PR and how they're improved upon with this one.
We need to ensure that layers never "intersect" with each other.
The UiStack concept helps with this since it's a sorted vector built from the hierarchies (and the z-index escape hatch). You can only iterate over it in one way or the other depending on what you need to do.
I think it probably makes sense to make it relative. Maybe we can do something similar to CSS.
The global and local options helps giving some flexibility here and the result is somewhat similar to CSS with less rules and exceptions.
I pretty strongly feel that it should be a part of Style (defaulting to a None or Auto value). I don't like requiring a separate component for roots because that is "redundant information".
There's nothing root-centric in this PR, so that helps. As for the ZIndex component, it's fully optional. However, we should probably add it to UI bundles as an Option<ZIndex> for clarity/discovery.
Z precision issues
The z precision issues discussed on there are gone with this solution because we don't move ui nodes on the z axis anymore.
WindowId, UI to texture concerns
This PR doesn't change anything in that regard.
My PR also doesn't address the fact that multiple root nodes share the same internal taffy root and will share screen space. However, I have a separate PR #5892 which does.
I just realized that adding Option<ZIndex> to the bundles is not yet supported. Do you think it's important enough that it should be added to the node bundles to justify adding it with a default value of ZIndex::Local(0)? It would remain an optional component, but would be present by default when using bundles to help with discovery.
I just realized that adding Option<ZIndex> to the bundles is not yet supported. Do you think it's important enough that it should be added to the node bundles to justify adding it with a default value of ZIndex::Local(0)? It would remain an optional component, but would be present by default when using bundles to help with discovery.
In this case I think yes. IMO we should prioritize UX over micro-performance pretty heavily for UI.
Remember to hit the other bundles that build off of NodeBundle too.
Alice would this close #1275?
@rparrett
- Some benchmarking to demonstrate that there are no major perf regressions.
Unless I write some new benchmarks, I'm not sure we have any I can run for the UI. I do think they would be beneficial in general, but it's hard for me to benchmark this and compare without doing some full app benchmark that includes a complex nested UI. This moves some z sorting that was done separately in the focus system and render extraction systems into a single, more complex system so that the complexity is only done once. I think we'd need to benchmark a complete app with UI loop to compare the two.
If you have some suggestions for me, please do let me know.
@colepoirier no, because it doesn't work for sprites. It's significant progress though, thanks for linking them.
@colepoirier no, because it doesn't work for sprites. It's significant progress though, thanks for linking them.
Cool. Is there a fundamental reason that this system can’t be the single system for both ui and sprites for z-order?
I have added a unit test for the UI stack system that's similar to the one we had before. I believe this was the last concern for this PR other than for benchmarking, which I've never done with bevy so I wouldn't be sure where to start. Also, I believe there were no benchmarks available for bevy_ui prior to this either.
For me, this looks like ready to go unless other concerns are raised.
Edit:
@colepoirier no, because it doesn't work for sprites. It's significant progress though, thanks for linking them.
Cool. Is there a fundamental reason that this system can’t be the single system for both ui and sprites for z-order?
I think this could work for sprite rendering as well, but it seems to me like it isn't as high of a priority there as it is for UI.
The problem with UI in its current state is that it's simply impossible to have multiple absolute-positioned nested trees existing on top of each other. They will intersect in unpredictable ways and there's no fix for it, so I wanted to make something simple that would at least solve that. With sprites, you need to keep track of your Z values, but at least they can be used consistently.
The solutions described in that thread sound like they could be interesting to bring even more flexibility to the system, because my solution is more UI-centric and simplistic.
Hovering seems broken right now. No hover-state is shown in the button or many_buttons examples.
I believe this was the last concern for this PR other than for benchmarking, which I've never done with bevy so I wouldn't be sure where to start.
I'd probably start by looking at the many_buttons example with tracing / tracy and compare execution times of the involved systems.
Here's ui_focus_system as an example (Yellow is this PR)
And extract_uinodes:
This is not mega-scientific. I attempted to run my mouse over the window in a similar pattern at a similar speed. Probably something going on there though. At a glance, I'd guess that iterating a list of entities and using .get is slower than just iterating a query, which makes sense. Not sure if iter_many would help there or not.
As a review note: performance measurements are good, but not blocking for this sort of feature work. Useful, slightly slower libraries are much better than blazing fast ones that are missing essential functionality (like z-order control).
Excellent advice from @rparrett on casual perf investigations though!
I've made a few changes:
- Improvements to the docs for clarity
- Added a
WorldQueryin the focus system to improve readability - Changed the focus and render extraction systems to use
iter_manyanditer_many_mutto improve readability (and possibly performance)
There's one place in the focus system where it doesn't use iter_many_mut because it doesn't have full Iterator support and that place relies on a filter_map. However, most of it now uses it which makes it look simpler with less indentation as well.
That looks a lot nicer at a glance.
I am still seeing broken "hovering" behavior in the button example (pre-iter-mut as well). I believe that there was something subtle going on in the previous (pre-this-PR) implementation of the focus system because now the
// reset lower nodes to None
code is immediately resetting the nodes that were marked as Hovered in the code above.
edit: reprofiled, and iter_many didn't really seem to make a huge difference perf-wise as far as I can tell, but I agree with Alice that chasing microseconds isn't vital here.
Also, it seems that 4e6a193e202ce9f98fe8fb02579a787cc655605f has broken something, because I am now seeing this in the demo:
I don't think that
for (stack_index, (uinode, transform, color, image, visibility, clip)) in
uinode_query.iter_many(&ui_stack.uinodes).enumerate()
is producing the same indices as enumerateing over the stack first.
I don't know if the iter_many idea is salvageable without bevy providing a variant of iter_many that iterates over Results.
Sorry for pointing you towards that goose-chase.
iter_many is actually perhaps salvageable, but I haven't benchmarked yet to determine if it's worth the extra complexity.
https://github.com/rparrett/bevy/commit/ffbe0dcae7c9c073b25dc363ae42c08625283151
iter_manyis actually perhaps salvageable, but I haven't benchmarked yet to determine if it's worth the extra complexity.
It seems to me like performances for this will probably be at best similar (but possibly worse) than just using .get_mut(entity) in a loop? You're now going through the iterator item by item to find the right one on every loop, whereas the get_mut(entity) option at least relies on bevy's lookup performance.
I implemented iter_many/iter_many_mut quickly, assuming that it would give me results in the order that they are in the provided entities vector, but if it doesn't do that by design then I guess there's not much we can do other than revert this part or implement an iter_many alternative that actually does that.
I implemented
iter_many/iter_many_mutquickly, assuming that it would give me results in the order that they are in the provided entities vector, but if it doesn't do that by design then I guess there's not much we can do other than revert this part or implement aniter_manyalternative that actually does that.
It's not that it doesn't iterate in the correct order (I'm reasonably sure it does but I wish the docs were clearer about that), it's that you're relaying in enumerate to provide the index, and this will be different if you're iterating over only matched entities vs. iterating the entire stack.
It seems to me like performances for this will probably be at best similar (but possibly worse) than just using
.get_mut(entity)in a loop? You're now going through the iterator item by item to find the right one on every loop, whereas theget_mut(entity)option at least relies on bevy's lookup performance.
The iterator's state is preserved and find short-circuits when it finds something, so I'm not iterating the entire iterator on every loop, just skipping ahead to the next matched entity. This working at all is relying on the order being the same.
I agree that any small potential perf increase may not be worth the cost in code readability though. This seems to make extract_uinodes about 10% faster than the ui-stack-pre-iter-mut.
Indeed, I just realized my mistake while looking at the code and trying an alternative. I looked at the internal code for .iter_many() and it essentially does a .get() on every entity in the loop, so my understanding is that there won't be much benefit to using iter_many.
What do you think of this which keeps the enumerate order? It's a bit verbose but I think it may be easier to understand what's going on. Note that the UI focus system can just keep using iter_many_mut since it doesn't need to keep the indices.
for (stack_index, (uinode, global_transform, text, text_layout_info, visibility, clip)) in
ui_stack
.uinodes
.iter()
.enumerate()
.filter_map(|(stack_index, entity)| {
uinode_query
.get(*entity)
.map(|result| (stack_index, result))
.ok()
})
{
// ...
}
I'd like to go back to @alice-i-cecile comment :
performance measurements are good, but not blocking for this sort of feature work.
Can we move ahead with this PR as is, and leave any potential optimization for a later pass? The discussion is very interesting, but there's already a lot of work here with clear value over the existing solution, I wouldn't want to discourage @oceantume nor risk the work getting stuck in PR limbo. I'd prefer we focus on the last items if any (I didn't follow, is there still a bug to fix?) and merge what we have, which fixes a major blocker for UI hierarchies. Thanks ❤️
"as-is" is currently completely broken, so that needs to be resolved.
But yeah if I haven't been clear, my position is that we are chasing microseconds of frame time (but we didn't know that until we measured). Please carry on with .get.
I reverted the render extraction system to the unbroken state it was in before using iter_many, fixed the hover state and resolved the remaining comments.
FYI this is ready for final review for me.
I think I don't understand GH's review system because I keep removing someone's review request when I request a review from someone else.
Updated documentation to resolve comments!
bors try
@alice-i-cecile I would still call this controversial as this is a pretty foundational piece of UI that will be hard to change once people start depending on its behaviors. I'd like to give this a look, but I also don't want to hold back progress. Lets "start the merge clock" on this, but with the agreement that it can be merged for Bevy 0.9 if I don't get to it in time (so slightly accelerated).