Question about bird_getUVW function.
Given the input argument dist=0, it returns u=0, v=0, w=4294967295.
Is this w an intended value?
Hi @Yishun99 ! Yes, I believe this is correct; the reason is that bird_getUVW(uint dist) returns the position of the bird curve over triangle indices as it extends over the entire UV plane. To get the barycentric coordinates for a triangle at a given subdiv level, you'll need to mask its output with the lowest level bits - here's the relevant code from micromesh_decoder_subtri.glsl:
uvec3 iuvw;
iuvw = bird_getUVW(subTriangleIndex);
uint iu = iuvw.x;
uint iv = iuvw.y;
uint iw = iuvw.z;
uint levelBird = level - levelFmt + 1;
uint edge = 1 << (levelBird - 1);
// we need to only look at "level" bits
iu = iu & ((1u << levelBird) - 1);
iv = iv & ((1u << levelBird) - 1);
iw = iw & ((1u << levelBird) - 1);
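To make the masking step concrete, here is a minimal Python sketch of the same arithmetic applied to the values from the question. It does not reimplement bird_getUVW; it only takes the reported raw output (0, 0, 4294967295) as input. The helper name mask_to_level is hypothetical and not part of the SDK. Note that 4294967295 is 0xFFFFFFFF, i.e. all 32 bits set.

```python
UINT32_MAX = 0xFFFFFFFF  # a GLSL uint is 32 bits wide


def mask_to_level(iuvw, level_bird):
    """Keep only the lowest `level_bird` bits of each component.

    Mirrors `iu & ((1u << levelBird) - 1)` from the GLSL snippet.
    `mask_to_level` is a hypothetical helper name used for illustration.
    """
    low_bits = (1 << level_bird) - 1
    return tuple((c & UINT32_MAX) & low_bits for c in iuvw)


# Raw values reported in the question for dist = 0.
raw = (0, 0, 4294967295)

for level_bird in (1, 2, 3):
    print(level_bird, mask_to_level(raw, level_bird))
```

With levelBird = 1 the masked result is (0, 0, 1), so the large w value collapses to a small per-level index once the upper bits are discarded.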
Hope this helps!
One additional thing I needed to double-check (and got wrong in my first comment, now fixed): bird_getUVW() iterates over micro-triangle indices, not micro-vertex indices. So, for instance, for a subdiv 1 triangle (4 micro-triangles), the micro-triangle indices are (0, 0, 1), (0, 0, 0), (1, 0, 0), (0, 1, 0) if I've done my math right. Note how the second one is a "flipped" or "upper" triangle, and has u + v + w == 2^subdivLevel - 2.
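The flipped-triangle condition can be checked against the four indices quoted above. This is only a sketch: it assumes the level in question is 1, the level at which a triangle has exactly four micro-triangles under the 4^subdiv count used later in this thread, and the function name is_flipped is made up for illustration.

```python
def is_flipped(u, v, w, subdiv_level):
    """Apply the condition from the comment: u + v + w == 2^subdivLevel - 2.

    `is_flipped` is a hypothetical helper, not a function from the SDK.
    """
    return u + v + w == (1 << subdiv_level) - 2


# The four micro-triangle indices quoted in the comment, in order.
indices = [(0, 0, 1), (0, 0, 0), (1, 0, 0), (0, 1, 0)]
LEVEL = 1  # assumption: four micro-triangles correspond to level 1

for uvw in indices:
    kind = "flipped" if is_flipped(*uvw, LEVEL) else "not flipped"
    print(uvw, kind)
```

Only the second index, (0, 0, 0), satisfies the condition, which matches the remark that the second triangle is the flipped one.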
Thank you @NeilBickford-NV ! There are also several variables that still confuse me. Here is my understanding from reading the code; could you please double-check it?
1. meshlet and part
   - meshlet == part
2. pack and part
   - A part (with 64 primitives) is decoded by two subgroups (SUBGROUP_SIZE=32).
   - The threads (or invocations) within a subgroup form exactly one pack.
   - sdec.cfg.packThreadID is the same across the two iterations of the function microdec_getTriangle, which means that the two packs share one packID (although this has no further side effect, since it is then replaced with MicroDecodedTriangle.outIndex).
   - One thread (or invocation) corresponds to one micro-vertex.
3. sub-triangle and part
   - I believe there is only one subTri in a baseTri (by default). Is that right?
I also found some comments in the code that differ from my understanding, such as "packThreads * packID + packThreadID < SUBGROUP_SIZE" (packThreads is assigned SUBGROUP_SIZE in .mesh.glsl).
You're welcome Yishun!
Some of the confusion is probably because the draw_compressed_basic task shader only launches a mesh shader for one, instead of many, base triangles (if MICRO_USE_BASETRIANGLES is set) or subtriangles (if MICRO_USE_BASETRIANGLES is unset). Here's a corresponding comment in draw_compressed_basic.task.glsl:
// This shader does not do any packing of micromeshes with low subdivision
// into a single mesh-shader invocation. This yields less performance as we
// unterutilize the hardware this way (an entire mesh workgroup may generate
// only a single triangle).
// Look at the draw_micromesh_lod shaders which use a more complex setup
// that does this.
The draw_micromesh_lod shaders it refers to, which can rasterize micromeshes at varying levels of detail and pack multiple micromeshes with low subdiv levels into a single meshlet, will be released in an upcoming sample dedicated to micromesh rasterization.
For (3), a subtriangle is (in short) "a set of microtriangles that stores its values in a single compressed block of data" (i.e. a subtriangle corresponds to a block).
For instance, imagine a subdiv 5 triangle (1024 micro-triangles) with data compressed using the BlockFormatDispC1::eR11_unorm_lvl4_pack1024 format (where each 1024-bit block stores the data for a subdiv 4 subtriangle with 256 micro-triangles). This subdiv 5 base triangle would have 4 subtriangles, each of which uses one block of eR11_unorm_lvl4_pack1024 data.
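The arithmetic behind this example can be sketched in a few lines of Python. The function names are hypothetical, and the rule that a base triangle at or below the block's subdiv level occupies a single block is an assumption made for illustration; the only figures taken from this thread are the 4^subdiv micro-triangle counts and the 4-subtriangle result.

```python
def num_microtriangles(subdiv):
    """A triangle at subdivision level `subdiv` has 4^subdiv micro-triangles."""
    return 4 ** subdiv


def num_subtriangles(base_subdiv, block_subdiv):
    """Number of blocks (subtriangles) needed to cover one base triangle.

    Assumption for illustration: a base triangle whose level does not
    exceed the block format's level fits in one block.
    """
    if base_subdiv <= block_subdiv:
        return 1
    return 4 ** (base_subdiv - block_subdiv)


print(num_microtriangles(5))   # subdiv 5 base triangle
print(num_microtriangles(4))   # one subdiv 4 block
print(num_subtriangles(5, 4))  # blocks needed for the example above
```

For the subdiv 5 triangle with subdiv 4 blocks this gives 1024 micro-triangles, 256 per block, and 4 subtriangles.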
It's not the same as draw_compressed_basic.task's part, which is a 64-microtriangle meshlet:
uint subdiv_getNumMeshlets(uint subdiv)
{
  return (1u << ((max(3, subdiv) - 3) * 2));
}
// From draw_compressed_basic.task.glsl:
#if MICRO_USE_BASETRIANGLES
MicromeshBaseTri microBaseTri = microdata.basetriangles.d[min(microID, pc.microMax)];
uint microSubdiv = micromesh_getBaseSubdiv(microBaseTri);
#else
MicromeshSubTri microSubTri = microdata.subtriangles.d[min(microID, pc.microMax)];
uint microSubdiv = micromesh_getSubdiv(microSubTri);
#endif
uint partMicroMeshlets = subdiv_getNumMeshlets(microSubdiv);
So:
- meshlet and part are the same in draw_compressed_basic, but in draw_compressed_lod, a meshlet can contain multiple parts, packed together.
- A meshlet can contain up to 64 microtriangles (e.g. imagine a subdiv 2 base triangle, which only has 16 microtriangles).
- Threads work together using subgroup operations to decode blocks (subtriangles).
- Each mesh shader thread then outputs one vertex and one micro-triangle.
- Base triangles can contain 1, 4, or 16 subtriangles; this depends on the base triangle's subdiv level and the base triangle's compression format.
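As a rough check of the meshlet numbers in this summary, here is a Python transcription of the subdiv_getNumMeshlets formula quoted earlier, together with the resulting micro-triangle count per meshlet. The second helper, microtriangles_per_meshlet, is hypothetical and assumes the micro-triangles are split evenly across the meshlets.

```python
def subdiv_get_num_meshlets(subdiv):
    """Python transcription of the GLSL subdiv_getNumMeshlets formula."""
    return 1 << ((max(3, subdiv) - 3) * 2)


def microtriangles_per_meshlet(subdiv):
    """Hypothetical helper: assumes an even split of 4^subdiv micro-triangles."""
    return (4 ** subdiv) // subdiv_get_num_meshlets(subdiv)


for subdiv in range(6):
    print(subdiv, subdiv_get_num_meshlets(subdiv), microtriangles_per_meshlet(subdiv))
```

Subdiv levels 0 to 3 map to a single meshlet holding 1, 4, 16, and 64 micro-triangles respectively, while levels 4 and 5 split into 4 and 16 meshlets of 64 micro-triangles each.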
Hope this helps!
Just as an FYI, we will remove or replace these decoders in the future. We are still waiting on a new beta driver release for the dedicated rasterization sample, which will explain some of this with slides and so on. It's unfortunate that this delay exists, and I cannot give a definitive ETA, but the earlier advice not to base code on the rasterization here still holds.
Christoph has released a dedicated sample for rasterization here: https://github.com/nvpro-samples/vk_displacement_micromaps
Closing