Objective

operate on naga IR directly to improve handling of shader modules.

give codespan reporting into imported modules
allow glsl to be used from wgsl and vice-versa

the ultimate objective is to make it possible to

provide user hooks for core shader functions (to modify light behaviour within the standard pbr pipeline, for example)
make automatic binding slot allocation possible

but ... since this is already big, adds some value and (i think) is at feature parity with the existing code, i wanted to push this now.

Solution

i made a crate called naga_oil (https://github.com/robtfm/naga_oil - unpublished for now, could be part of bevy) which manages modules by

building each module independantly to naga IR
creating "header" files for each supported language, which are used to build dependent modules/shaders
make final shaders by combining the shader IR with the IR for imported modules

then integrated this into bevy, replacing some of the existing shader processing stuff. also reworked examples to reflect this.

Migration Guide

shaders that don't use #import directives should work without changes.

the most notable user-facing difference is that imported functions/variables/etc need to be qualified at point of use, and there's no "leakage" of visible stuff into your shader scope from the imports of your imports, so if you used things imported by your imports, you now need to import them directly and qualify them.

the current strategy of including/'spreading' mesh_vertex_output directly into a struct doesn't work any more, so these need to be modified as per the examples (e.g. color_material.wgsl, or many others). mesh data is assumed to be in bindgroup 2 by default, if mesh data is bound into bindgroup 1 instead then the shader def MESH_BINDGROUP_1 needs to be added to the pipeline shader_defs.

Aug 15 '22 12:08 robtfm

I would be happy to help develop and maintain the naga_oil crate.

Aug 15 '22 15:08 MavethGH

I would be happy to help develop and maintain the naga_oil crate.

thanks, i appreciate it - worth noting though that even if it's part of bevy it can still be imported and used as a standalone crate if you find it useful for you outside of bevy. i am mostly concerned with bevy since that's where i want to use it, so if @cart wants to own it that's totally fine with me.

Aug 16 '22 14:08 robtfm

I actually have a use for it outside of Bevy, but I think that integrating it with Bevy would be most useful to the most people.

Aug 16 '22 16:08 MavethGH

added overrides, so functions can be replaced dynamically, and added an example shader_material_override to demonstrate.

todo still: provide a mechanism for general overrides (so you can override the default lighting functions - this only allows for overriding on a per-material basis currently).

Aug 19 '22 10:08 robtfm

and added global/default overrides, with pbr_global_override example

Aug 19 '22 15:08 robtfm

Gave this a very quick once over (this will take a lot of time to review properly), but I now have a decent understanding of the approach taken. Some quick thoughts:

This is awesome and does go generally in the direction I've wanted for Bevy's shaders. Very nice to know that we can rely on Naga here (although I'll admit the naga side ended up being more complex than I expected).
naga_oil covers cases that are "core" enough to the bevy shader experience that I would want that to be under the bevyengine org umbrella and go through our normal governance process (even though that is slower). Controversial changes there are definitely controversial changes within the context of bevy. Although I think theres a reasonable argument to be made to keep it separate from the rest of the bevy codebase (ex: exist in its own repo in the bevy org). We can discuss this.
Forcing shader import consumers to define their own PascalCase aliases for modules feels wrong. Module authors should be defining the "canonical names" used, not module consumers (at least by default ... if there are naming collisions the as syntax provides a nice way to resolve that). And PascalCase makes it seem like modules are struct types. Given that we use Rust-like snake_case::module::syntax elsewhere, this feels confusing. At the very least we should be consistent, and I personally think snake_case::module::syntax is what we should use.
We probably want to support something like this syntax as well: #import bevy_pbr::skinning::{skin_model, skin_normals} (rather than "whole module" imports). Is that compatible with this implementation? How hard would that be? Not saying we need to support it out of the gate, but it would be good to know how possible it is, if this design can accommodate it, and whether we should make that a part of our plan.
Allowing arbitrary function overloads feels a bit dangerous / allows people to take dependencies on internal implementation details. I do like that it lets people insert themselves into pre-built pipelines, but we might want to be a bit more principled about it. We should consider only allowing overriding of functions that have been labeled as "overridable". We should also look in to how other engines expose custom lighting logic in shaders. Its possible that a simple function overloading model isn't the best fit. We might need a bit more structure and/or magic. I don't have strong opinions here yet, but this deserves careful design thought.
Given that this removes the ability to import "partial struct fields" / spread them into functions or structs, I think we should discuss our plans for "hiding" the complexity / boilerplate of specifying the default VertexOutputs from shader authors. The current approach is a bit weird / arcane, but defining this manually (like we do in this pr) is worse. If we choose to take the path in this pr, we should make sure it can accommodate this scenario in a reasonable way.

Aug 22 '22 20:08 cart

2. naga_oil covers cases that are "core" enough to the bevy shader experience that I would want that to be under the bevyengine org umbrella and go through our normal governance process (even though that is slower). Controversial changes there are _definitely_ controversial changes within the context of bevy. Although I think theres a reasonable argument to be made to keep it separate from the rest of the bevy codebase (ex: exist in its own repo in the bevy org). We can discuss this.

thanks for looking it over. i'm very happy for it to be under bevy org or part of bevy core as you prefer, and happy it would benefit from bevy's review process as well.

3. Forcing shader import consumers to define their own `PascalCase` aliases for modules feels wrong. Module authors should be defining the "canonical names" used, not module consumers (at least by default ... if there are naming collisions the `as` syntax provides a nice way to resolve that). And  `PascalCase` makes it seem like modules are struct types. Given that we use Rust-like `snake_case::module::syntax` elsewhere, this feels confusing. At the very least we should be consistent, and I personally think `snake_case::module::syntax` is what we should use.

the as syntax is optional, if you don't specify it then the original/canonical name stands. i chose to capitalize the modules just to get different colours in the editor but i agree it would be better to be consistent with rust and i'll change that in the bevy wgsl files. (anyway if this goes through i expect we could tailor the syntax hilighting for it in the future.)

4. We probably want to support something like this syntax as well: `#import bevy_pbr::skinning::{skin_model, skin_normals}` (rather than "whole module" imports). Is that compatible with this implementation? How hard would that be? Not saying we need to support it out of the gate, but it would be good to know how possible it is, if this design can accommodate it, and whether we _should_ make that a part of our plan.

this could work easily if they are still qualified at point of use, but i guess that's not what you're intending.

in the current impl there's no lexing done outside of naga, it's all regex-based - we prefix all the members of a module with a sentinel string and the (base32 encoded) module name to make a "namespace", and then search/replace module_name:: in the calling scope with the encoded prefix (and otherwise deny use of the sentinel string to ensure there's no accidental use of module items). this works for now because :: can't otherwise appear in wgsl or glsl.

if we want to import individual names and use them without qualification it may need more care to make sure we only modify exactly what we should. it's probably fine to just use a regex with initial whitespace but i haven't fully thought it through. at worst, ::skin_normals would work.

5. Allowing arbitrary function overloads feels a bit dangerous / allows people to take dependencies on internal implementation details. I do like that it lets people insert themselves into pre-built pipelines, but we might want to be a bit more principled about it. We should consider only allowing overriding of functions that have been labeled as "overridable".

marking functions as overridable is straight-forward. personally i would prefer not to have artificial restrictions on the injection points i can choose, but i'm more familiar with the pbr internals than many users and maybe the support burden would be an issue. but perhaps just documenting "safe" entry points for user overrides would be enough?

5. ... We should also look in to how other engines expose custom lighting logic in shaders. Its possible that a simple function overloading model isn't the best fit. We might need a bit more structure and/or magic. I don't have strong opinions here yet, but this deserves careful design thought.

one thing to bear in mind: the "API" that we would be fixing through overridable function signatures is not quite as restrictive as it seems initially - we can also take the vertex/fragment input and store it to a global variable, so that overriding functions have access to whatever data they need. at the extreme we could also do this with other derived data and reduce the overridable function signatures down to practically nothing. this might be inefficient (i'm not sure how good the driver dead-code analysis is wrt calculations that get stored to globals but never read) but the purge module could fix it if it's a problem.

definitely interested to hear other people's thoughts here though.

6. Given that this removes the ability to import "partial struct fields" / spread them into functions or structs, I think we should discuss our plans for "hiding" the complexity / boilerplate of specifying the default VertexOutputs from shader authors. The current approach is a bit weird / arcane, but defining this manually (like we do in this pr) is worse. If we choose to take the path in this pr, we should make sure it can accommodate this scenario in a reasonable way.

note it is only manually defined for vertex output (there's a shared struct for frag input) ... it is grim though. the current restriction seems to be that structs can be used in entry-point IO, but only one level deep. i'm not entirely sure where the restriction comes from. ideally in future wgsl (or naga) will allow us to have entry-point IO where only the leaves of a heirarchy need to be located.

Aug 23 '22 00:08 robtfm

structs can be used in entry-point IO, but only one level deep

i've moved @bulitin(position) into the MeshVertexOutput structs so we can use them directly as the vertex stage output and fragment stage input, without needing to embed them within another struct layer.

this seems good to me ... i'm not sure if there's a reason why we are not doing this already.

edit: but of course it still can't be so easily extended by authors of custom vertex shaders ... oops

Aug 25 '22 16:08 robtfm

further to the above, the struct depth limitation for entry point IO comes from the WGSL spec. i've tried briefly to extend the naga internals to accept nested structs but have hit a point where my lack of spirv knowledge is blocking me and i am not really prepared to learn it well enough to debug. i think it would be possible though since spirv seems to use globals for stage IO.

but i think it's not so bad as it is. depending on if you use custom vertex shaders for

generating extra outputs for use in custom fragment shaders: if you don't use the standard frag shader then you don't care what the standard frag shader uses, and so you should define your own IO struct with exactly what you need.
using funky vertex input (e.g. vertex pulling) to pipe into the standard frag shaders: then your vertex stage must produce exactly the standard output (and can use the standard IO struct), else it won't work anyway.
producing different output, and then calling into the standard fragment shaders: then you will need to construct a MeshVertexOutput in your top level frag shader to call into the standard frag shader, or a PbrInput to call into pbr_functions::pbr. you'll need to ensure your frag shader can derive everything for the pbr entry point you use, but it won't necessarily need all the standard outputs from the vertex shage anyway.

Aug 26 '22 09:08 robtfm

added uint shader def variant
added shader defs to modules (so we can link MAX_DIRECTIONAL_LIGHTS to mesh_view_bindings.wgsl, anything which imports the wgsl will get the def)
added ability to import individual items with #from x import y, z
- note: when importing an item with the #from directive, the final shader will include the required dependencies (bindings, globals, consts, other functions) of the imported item, but will not include the rest of the imported module. it will however still include all of any modules imported by the imported module. this is probably not desired in general and may be fixed in a future version.

Dec 08 '22 01:12 robtfm

the syntax as it stands is just a small extension of the existing syntax for imports. but given that it will already require users to refactor, it might be nice to resolve any syntax changes now.

currently in this pr: #import <mod> #import <mod> as <name> #from <mod> import <item>(, <another_item>)* where each <bracketed thing> is a non-whitespace block. anything after the matched part is ignored, so comments can be added, but an attempt to import multiple items on one line like #import mod1, mod2 will try to import mod1, (with the comma) and silently drop the mod2. the :: that we currently use in bevy for making "paths" in the module names is optional/arbitrary and has no significance to the system. using an imported item does require the :: syntax, respectively for the above import styles: let x = <mod>::member; let x = <name>::member; let x = ::<item>;

JonahPlusPlus on discord suggested a rust-config style of import like #[import(<ident>)] which has the benefit of being clear about what is captured (and could allow whitespace in module names with some extra effort). i guess that would extend to #[include(<mod1>, <mod2>)] #[include(<mod1>=<name1>, <mod2>=<name2>)] i'm not sure what #from would look like in this style. it would also probably mean reworking #ifdef and friends for consistency.

using a rust-import style we could have #use <mod> #use <mod> as <name> #use <mod>::<item> #use <mod>::{<item> (, <another_item>)*} i don't think we need the *, in case 3 we can check for module <mod>::<item> and import if exists, then if that fails look for <item> in <mod>. or we could just force the curly braces for item imports. i would probably just do that initially.

the issue with this approach is that it deviates from rust in a few ways

#import my_lib::my_module still requires the module members to be addressed as my_lib::my_module::member in the body, as opposed to just my_module::member, which is what a rust-user would expect. perhaps it is better to make the :: meaningful in import names, so that this works how people would expect. then as a follow on we could import a whole folder with automatic naming, like rust projects ... that would be quite cool.
you still have to import something to use it, you can't just throw a hitherto::unspecified::member into the shader and expect it to work. though actually that would also be cool, and doesn't seem impossible.
it's line-based so an import declaration can't span lines. this should be fine to fix as well.

currently it isn't that rust-like in behaviour, so like with using the @import style syntax of wgpu, making it similar-but-not-quite-the-same might lead to avoidable confusion. i am not sure i want to do all the above for the initial version, but if we ultimately aim to do all the things to make it behave like rust that's probably ok in the interim.

i'm definitely open to changing it, but i'm not going to implement in any particular direction immediately unless @cart or @superdump weigh in with a strong view.

Jan 10 '23 14:01 robtfm

My main feedback is that I'm not sure I agree with shader overriding, however. On one hand, it's very powerful for the user. On the other hand, lots of shader details are internal only, not well documented for end users, and subject to change at any time. For instance, I have another PR that changes the point light function. I'm not sure what the use case for shader overrides are that can't be better solved with custom materials - especially given that we expose certain shaders as importable for the user.

thanks for looking it over. in case it helps regarding overrides, i have implemented (but not yet released) the suggestion to require overridable functions to be marked (as virtual) in naga_oil. but, i do agree that the override functionality is distinct from the import/scope management functionality and i can remove the overrides from here if needed.

there's a bug in naga_oil regarding importing consts and globals to/from glsl (https://github.com/robtfm/naga_oil/issues/12), so currently this would break people using imports in glsl. it needs a patch to naga, which i hope will go in and be released in time, but if not i think that will probably block the whole pr from 0.10 anyway.

i also would appreciate anybody who is using glsl to give this a try (or poke me and i'll port a public glsl-based bevy project code base myself, since using patched naga plus patched naga_oil plus patched bevy is a pretty big ask...) as the glsl side is relatively untested at the moment.

Jan 14 '23 21:01 robtfm

It needs a patch to naga, which i hope will go in and be released in time, but if not i think that will probably block the whole pr from 0.10 anyway.

Afaik wgpu/naga should have a release in the next few days, before bevy 0.10. Ideally we get the naga fix in before then, especially since this PR fixes https://github.com/bevyengine/bevy/issues/6996.

I have implemented (but not yet released) the suggestion to require overridable functions to be marked (as virtual) in naga_oil.

How would we decide what functions to mark as virtual or not, and what would we do when we need to change them in the future? I'd like to hear other people's opinions, and ideally concrete use cases, as I'm still not convinced that overrides are the way to go. My opinion is still that if users want to change something, then they should copy bevy's shaders and vendor them into their codebase, or import specific functions in a custom material. I'd like to hear more peoples' opinions on this.

but, i do agree that the override functionality is distinct from the import/scope management functionality and i can remove the overrides from here if needed.

:+1:, we can figure it out later.

Jan 14 '23 21:01 JMS55

You added a new example but didn't add metadata for it. Please update the root Cargo.toml file.

Jan 29 '23 23:01 github-actions[bot]

rebased
bumped naga_oil version
updated the new examples (fog, prepass)
used the static preprocessor from naga_oil to avoid duplicating regexes, reworked asset path (#import "my_shader.wgsl") handling to reflect that (naga_oil doesn't treat quoted imports any differently so we have to check/parse them bevy side)
renamed sign_determinant_model_3x3 to sign_determinant_model_3x3m due to an esoteric bug (edit: also inverse_transpose_3x3 in skinning.wgsl)
edit: demoted logs to trace/debug from debug/info

Jan 30 '23 00:01 robtfm

note: the recently added #else if shader def variant hasn't been added to naga_oil yet, so this would remove that feature for now

Feb 08 '23 16:02 robtfm

Sadly punting this to 0.11. This unlocks a lot of potential, but it has serious implications for Bevy's shaders and we should step into this very carefully / after myself and @superdump have done full reviews.

Feb 15 '23 20:02 cart

My opinion is still that if users want to change something, then they should copy bevy's shaders and vendor them into their codebase, or import specific functions in a custom material. I'd like to hear more peoples' opinions on this.

The reason I have personally been looking forward to something like this is things like heightmaps and splatmaps. In those cases, all of the actual shading remains completely identical, all that changes is the source of the shading inputs. Other use case might be things like procedural texture animations and weather effects, e.g. blending between a wet and dry ground material. Vendoring a bunch of copies of the standard material shader and the shader input structs for these small changes and keeping them in sync with every bevy version does get old a bit fast, even with the currently available imports.

However, I have no strong opinions on overrides vs overwritable global variables vs making the standard material shader more importable. All of those would work for me and should produce mostly identical shader code afaict. The latter does seem to have advantages in terms of making it easier to pass custom attributes at the cost of a bit more boilerplate, perhaps?

Mar 07 '23 15:03 NotAFile

there seems to be general consensus that overrides are questionable at best. when i get a bit of time (and probably once 0.10.1 is settled so that it will get a chance for review before too much ground moves underneath it - rebasing is pretty painful due to how much shader code it touches) i'll rework this so that it is purely the module/namespace code and doesn't add any override functionality.

the benefits of naga_oil imo are not so much related to overrides, but cleaner shader code through namespacing, and the ability to better extend things by operating on shaders as data. overrides are one example of that and maybe not a good one.

heightmaps and splatmaps [...] procedural texture animations and weather effects

you might be interested in #7820 which is a smaller pr that would probably work for that kind of behaviour as well.

Mar 07 '23 15:03 robtfm

I like that plan!

Mar 08 '23 00:03 cart

Discussion in #7903 suggests upcoming additions to this PR - or changes made in its wake - may result in a regression to the existing Material API with regard to the shader definitions made available in VertexState::shader_defs and FragmentState::shader_defs.

As a user of the Material API, compromised access to shader definitions would be destructive to my use case, and possibly block it entirely unless a corresponding alternative can be provided.

Mar 25 '23 00:03 ProfLander

updated to latest.

notes:

removed the "shader function overrides" code and examples. users could still specify overrides using the naga_oil override_any feature but engine support has been removed.
#define is now only allowed at the start of the top-level shader. in naga_oil, the shader state is global and invariant, and is used to look up precompiled modules so allowing #defines at other places leads to madness.
- the single existing use in the engine (in pbr_prepass) has been moved rust-side for now
#else if now supports any directive (#else ifndef, #else if x == 3, etc)
this is a minimal port, there is definitely some clean up that could be done following on (particularly the tonemapping shared wgsl has some hacks)
calling wgsl from glsl doesn't seem to work any more (although a minimal test case works fine). maybe due to naga changes, needs investigation
rebasing is a nightmare; if we can't merge it in this iteration i will probably drop it

Apr 10 '23 00:04 robtfm

As a heads up I spent yesterday deep diving on naga_oil and I'm continuing that today. I'll start leaving thoughts and feedback soon.

Apr 11 '23 20:04 cart

You added a new example but didn't add metadata for it. Please update the root Cargo.toml file.

Apr 29 '23 23:04 github-actions[bot]

bumped naga/wgpu version
modified the item import syntax to #import path::to::module Comma, Separated, Items
removed requirement to qualify imported items at point of use

i haven't done #import module *, would prefer to leave this for future work.

i'm not totally convinced with how this reads, i feel like the plain module imports look too similar to item imports now. but if you're sure you prefer this then ok.

it's on a git branch of naga_oil for now, i'll publish once all nailed down (just to avoid version spamming if there's any further iterations)

Apr 29 '23 23:04 robtfm

Awesome I'll give this another pass!

i haven't done #import module *, would prefer to leave this for future work.

I think thats very reasonable.

May 01 '23 19:05 cart

@robtfm can you kick off the naga_oil repo transfer?

Jun 21 '23 20:06 cart

(I've just temporarily given you repo creation perms in the bevy org as I think that is necessary)

Jun 21 '23 22:06 cart

@robtfm can you kick off the naga_oil repo transfer?

done, and resynced this pr. we'll now need to (at least)

merge the changes from the bevy-changes branch into naga_oil/master
release naga_oil
update naga_oil in cargo.toml in this pr

in case it went under the radar i wanted to hilight again that the current naga_oil doesn't work for mixing bevy pbr's wgsl with glsl - this stopped working on the last naga update and i haven't had time to investigate. my tests do work, which suggests it is a specific shader feature that causes it to fail. it's not a regression so for now i'd suggest i just open an issue against the naga_oil repo for that.

Jun 22 '23 09:06 robtfm

Cool cool. Would you like to start the naga_oil bevy-changes PR? Also happy to get that rolling if you're busy.

Jun 23 '23 19:06 cart

improve shader import model

Objective

Solution

Migration Guide