DirectXShaderCompiler Retain alignment when lowering groupshared matrices

When flattening the global for a groupshared matrix, the alignment information was getting lost. As a result, the alignments of the loads and stores were calculating their own alignment based on preferred alignment and trailing zeros of the index. The preferred alignment switched to 16 when the type size was over 128 bits due to a heuristic whose rationale is lost to time. When the global has its own alignment, that gets used, so by retaining it through lowering, the alignments are consistent and reliable.

Includes testing for a few matrix variants

fixes #6416

May 06 '24 17:05 pow2clk

There is a possibility of a perf regression with this change as it may prevent generating larger loads in some backends. Should we set the alignment to what it was assumed to be previously?

It's certainly an option. What I have here preserves the alignment that is present from the start. I'm a bit more nervous about interjecting the alignment that the zero index happened to get when loads and stores were generated from the beginning. I fear that might have more unforeseen consequences than this does, but I can experiment with it.

May 06 '24 17:05 pow2clk

There is a possibility of a perf regression with this change as it may prevent generating larger loads in some backends. Should we set the alignment to what it was assumed to be previously?

It's certainly an option. What I have here preserves the alignment that is present from the start. I'm a bit more nervous about interjecting the alignment that the zero index happened to get when loads and stores were generated from the beginning. I fear that might have more unforeseen consequences than this does, but I can experiment with it.

In the updated PR we are using the datalayout to compute the preferred alignment. This looks like the correct way to do it to me.

May 21 '24 18:05 dmpots