Question about bird_getUVW function.

Open Yishun99 opened this issue 2 years ago • 6 comments

Given the input argument dist=0, it returns u=0, v=0, w=4294967295. Is this w an intended value?

Yishun99 avatar Jun 29 '23 03:06 Yishun99

Hi @Yishun99! Yes, I believe this is correct. The reason is that bird_getUVW(uint dist) returns the position of the bird curve over triangle indices as it extends over the entire UV plane. To get the barycentric coordinates for a triangle at a given subdiv level, you'll need to mask its output down to the lowest "level" bits. Here's the relevant code from micromesh_decoder_subtri.glsl:

  uvec3 iuvw;
  iuvw    = bird_getUVW(subTriangleIndex);
  uint iu = iuvw.x;
  uint iv = iuvw.y;
  uint iw = iuvw.z;

  uint levelBird = level - levelFmt + 1;
  uint edge      = 1 << (levelBird - 1);

  // we need to only look at "level" bits
  iu = iu & ((1u << levelBird) - 1);
  iv = iv & ((1u << levelBird) - 1);
  iw = iw & ((1u << levelBird) - 1);
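
As a quick check of the masking step against the value from the question, here is a minimal CPU-side sketch in C. It is not toolkit code: mask_level_bits is a hypothetical helper, bird_getUVW itself is not reproduced, and the levelBird values used are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the lowest `levelBird` bits of one coordinate, mirroring
   `iu & ((1u << levelBird) - 1)` from the GLSL snippet above.
   Hypothetical helper for illustration, not part of the toolkit. */
uint32_t mask_level_bits(uint32_t coord, uint32_t levelBird)
{
    assert(levelBird < 32u); /* a shift by 32 or more is undefined in C */
    return coord & ((UINT32_C(1) << levelBird) - 1u);
}
```

With levelBird = 1, the reported w = 4294967295 (0xFFFFFFFF) masks down to 1 while u and v stay 0, so the raw (0, 0, 4294967295) becomes (0, 0, 1).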

Hope this helps!

NBickford-NV avatar Jun 29 '23 03:06 NBickford-NV

One additional thing I needed to double-check (and got wrong in my first comment, now fixed): bird_getUVW() iterates over micro-triangle indices, not micro-vertex indices. So, for instance, for a subdiv 1 triangle (4 micro-triangles), the micro-triangle indices are (0, 0, 1), (0, 0, 0), (1, 0, 0), (0, 1, 0) if I've done my math right. Note how the second one is a "flipped" or "upper" triangle, and has u + v + w == 2^subdivLevel - 2.
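
The sum test in the note above can be written as a small C predicate. This is a sketch, not toolkit code: is_upper_microtriangle and its parameter n are hypothetical names, and the inputs are assumed to be already masked to n bits. For the four indices listed above the sums are 1, 0, 1, 1, so the test singles out (0, 0, 0) when 2^n - 2 = 0, that is, n = 1.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if (u, v, w) satisfies the "flipped"/"upper" condition
   u + v + w == 2^n - 2 quoted in the thread, and 0 otherwise.
   Assumes u, v, w were already masked to n bits, so the sum cannot wrap. */
int is_upper_microtriangle(uint32_t u, uint32_t v, uint32_t w, uint32_t n)
{
    assert(n >= 1u && n < 31u);
    return (u + v + w) == ((UINT32_C(1) << n) - 2u);
}
```

That the other three indices sum to 2^n - 1 is an observation from the listed values, not something the thread states as a rule.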

NBickford-NV avatar Jun 29 '23 04:06 NBickford-NV

Thank you @NeilBickford-NV! Many of the variables still confuse me. Here is my understanding from reading the code; could you please double-check it?

  1. meshlet and part
    • meshlet == part
  2. pack and part
    • A part (with 64 primitives) is decoded by two subgroups (SUBGROUP_SIZE=32).
    • The threads (or invocations) within a subgroup just form one pack.
    • sdec.cfg.packThreadID is the same across the two iterations of the function smicrodec_getTriangle, which means that the two packs share one packID (although this has no further side effect, since it is then replaced with MicroDecodedTriangle.outIndex).
    • One thread (or invocation) corresponds to one micro-vertex.
  3. sub-triangle and part
    • I believe there is only one subTri in a baseTri (by default). Is it right?

I also found some comments that differ from my understanding, such as "packThreads * packID + packThreadID < SUBGROUP_SIZE" (since packThreads is assigned SUBGROUP_SIZE in .mesh.glsl).

Yishun99 avatar Jun 30 '23 12:06 Yishun99

You're welcome Yishun!

Some of the confusion is probably because the draw_compressed_basic task shader launches a mesh shader for only one base triangle (if MICRO_USE_BASETRIANGLES is set) or one subtriangle (if MICRO_USE_BASETRIANGLES is unset), rather than for many. Here's the corresponding comment in draw_compressed_basic.task.glsl:

// This shader does not do any packing of micromeshes with low subdivision
// into a single mesh-shader invocation. This yields less performance as we
// underutilize the hardware this way (an entire mesh workgroup may generate
// only a single triangle).
// Look at the draw_micromesh_lod shaders which use a more complex setup
// that does this.

The draw_micromesh_lod shaders it refers to, which can rasterize micromeshes at varying levels of detail and pack multiple micromeshes with low subdiv levels into a single meshlet, will be released in an upcoming sample dedicated to micromesh rasterization.

For (3), a subtriangle is (in short) "a set of microtriangles that stores its values in a single compressed block of data" (i.e. a subtriangle corresponds to a block).

For instance, imagine a subdiv 5 triangle (1024 micro-triangles) with data compressed using the BlockFormatDispC1::eR11_unorm_lvl4_pack1024 format (where each 1024-bit block stores the data for a subdiv 4 subtriangle with 256 micro-triangles). This subdiv 5 base triangle would have 4 subtriangles, each of which uses one block of eR11_unorm_lvl4_pack1024 data.
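
The arithmetic in this example can be sketched in C as follows. The helper names are hypothetical and this is not how the toolkit computes these values. It assumes that each extra subdivision level quadruples the count and that a base triangle at or below the block's level needs a single block.

```c
#include <assert.h>
#include <stdint.h>

/* Micro-triangles in a triangle subdivided `subdiv` times: 4^subdiv. */
uint32_t num_microtriangles(uint32_t subdiv)
{
    assert(subdiv < 16u); /* keeps the shift within 32 bits */
    return UINT32_C(1) << (2u * subdiv);
}

/* Sub-triangles (blocks) needed when one block covers `blockSubdiv` levels.
   Assumption: a base triangle at or below the block's level uses one block. */
uint32_t num_subtriangles(uint32_t baseSubdiv, uint32_t blockSubdiv)
{
    uint32_t extra = baseSubdiv > blockSubdiv ? baseSubdiv - blockSubdiv : 0u;
    return num_microtriangles(extra);
}
```

Under these assumptions num_microtriangles(5) is 1024, num_microtriangles(4) is 256, and num_subtriangles(5, 4) is 4, matching the figures in the example.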

It's not the same as draw_compressed_basic.task's part, which is a 64-microtriangle meshlet:

uint subdiv_getNumMeshlets(uint subdiv)
{
  return (1u << ((max(3, subdiv) - 3) * 2));
}

// From draw_compressed_basic.task.glsl:
#if MICRO_USE_BASETRIANGLES
  MicromeshBaseTri microBaseTri = microdata.basetriangles.d[min(microID, pc.microMax)];
  uint             microSubdiv  = micromesh_getBaseSubdiv(microBaseTri);
#else
  MicromeshSubTri microSubTri = microdata.subtriangles.d[min(microID, pc.microMax)];
  uint            microSubdiv = micromesh_getSubdiv(microSubTri);
#endif

  uint partMicroMeshlets = subdiv_getNumMeshlets(microSubdiv);
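
To see what subdiv_getNumMeshlets evaluates to, here is a direct C port that can be run on the CPU. The snake_case name is only there to mark it as a port, and the assert is an added guard that the GLSL version does not have.

```c
#include <assert.h>
#include <stdint.h>

/* C port of the GLSL subdiv_getNumMeshlets above: 4^(max(3, subdiv) - 3).
   Subdiv levels 0..3 fit in one 64-microtriangle meshlet; each level above
   3 multiplies the meshlet count by 4. */
uint32_t subdiv_get_num_meshlets(uint32_t subdiv)
{
    uint32_t clamped = subdiv > 3u ? subdiv : 3u;
    assert(clamped - 3u < 16u); /* keeps the shift within 32 bits */
    return UINT32_C(1) << ((clamped - 3u) * 2u);
}
```

For subdiv 5 this gives 16 meshlets, and 16 × 64 = 1024 matches the micro-triangle count of a subdiv 5 triangle.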

So:

  • meshlet and part are the same in draw_compressed_basic, but in draw_compressed_lod, a meshlet can contain multiple parts, packed together.
  • A meshlet can contain up to 64 microtriangles, and may contain fewer (e.g. a subdiv 2 base triangle has only 16 microtriangles).
  • Threads work together using subgroup operations to decode blocks (subtriangles).
  • Each mesh shader thread then outputs one vertex and one micro-triangle.
  • Base triangles can contain 1, 4, or 16 subtriangles; this depends on the base triangle's subdiv level and the base triangle's compression format.
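
To put numbers on the list above, the following C sketch counts the micro-triangles and micro-vertices of a part at subdiv levels 0 to 3. The function names are hypothetical, and the vertex formula is the standard count for a uniformly subdivided triangle rather than something taken from the toolkit. How the shader actually assigns these outputs to threads is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Micro-triangles in a part at subdivision level n: 4^n. */
uint32_t part_triangle_count(uint32_t n)
{
    assert(n < 16u); /* keeps the shift within 32 bits */
    return UINT32_C(1) << (2u * n);
}

/* Micro-vertices of a uniformly subdivided triangle at level n:
   a triangular grid with 2^n + 1 vertices per edge. */
uint32_t part_vertex_count(uint32_t n)
{
    assert(n < 15u); /* keeps the product within 32 bits */
    uint32_t perEdge = (UINT32_C(1) << n) + 1u;
    return perEdge * (perEdge + 1u) / 2u;
}
```

At subdiv 3 this gives 64 micro-triangles and 45 micro-vertices, and at subdiv 2 it gives 16 and 15, so the micro-triangle count is the larger of the two in both cases.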

Hope this helps!

NBickford-NV avatar Jul 01 '23 03:07 NBickford-NV

Just as an FYI, we will remove or replace these decoders in the future. We are still awaiting a new beta driver release for the dedicated rasterization sample, which would explain some of this with slides and so on. It's unfortunate that this delay exists and I cannot give a definitive ETA, but the advice not to base code on the rasterization here still holds.

pixeljetstream avatar Jul 07 '23 08:07 pixeljetstream

Christoph has released a dedicated sample for rasterization here: https://github.com/nvpro-samples/vk_displacement_micromaps

NBickford-NV avatar Oct 17 '23 00:10 NBickford-NV

Closing

pknowlesnv avatar Feb 13 '25 03:02 pknowlesnv