hcman2

13 comments by hcman2

Need to check performance before adding this PR.

Performance checked. Rebase and wait for CI to pass.

Reopen after syncing with the newest codebase.

Looks like the DTVA global read code will be treated as part of LW in SIA.py. The reads are issued back-to-back, so instruction stalls happen quickly. This impacts the...

> In the PrefetchGlobalRead=2 case, global read is scheduled with local write. We have a parameter called LocalWritePerMfma, which determines the number of MFMAs between each GlobalRead (which comes with...
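
To make the spacing idea in the quote concrete, here is a toy Python sketch of interleaving memory operations between MFMAs. It is not the SIA.py scheduler: the function `interleave` and its parameter `mfmas_between_ops` are hypothetical names for illustration, and the model assumes one memory operation is issued after every fixed number of MFMAs, in the spirit of LocalWritePerMfma.

```python
def interleave(num_mfma, memory_ops, mfmas_between_ops):
    """Toy model (hypothetical, not Tensile code): issue one memory op
    after every `mfmas_between_ops` MFMA instructions."""
    if mfmas_between_ops < 1:
        raise ValueError("mfmas_between_ops must be >= 1")
    schedule = []
    pending = list(memory_ops)  # copy so the caller's list is not mutated
    for i in range(num_mfma):
        schedule.append(f"mfma{i}")
        # A slot opens once every `mfmas_between_ops` MFMAs.
        if pending and (i + 1) % mfmas_between_ops == 0:
            schedule.append(pending.pop(0))
    # Ops that found no slot are issued back-to-back at the end;
    # in this toy model that is where stalls would concentrate.
    schedule.extend(pending)
    return schedule


if __name__ == "__main__":
    print(interleave(4, ["gr0", "lw0"], 2))
```

With four MFMAs and a gap of two, each memory operation gets its own slot. When there are more operations than slots, the leftovers end up adjacent to each other, which is the kind of continuous issue that a well-chosen gap is meant to avoid.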

Should we add a test yaml to cover this kind of issue?

> Would this cause LDS bank conflicts?

I don't think this patch will change the LDS bank behavior. **ds_store_b128 addr, vgpr[0:3], offset:0** is changed into **ds_store_b32 addr, vgpr[0], offset:0 ds_store_b32...

[----------] Global test environment tear-down
[==========] 48206 tests from 13 test suites ran. (1332135 ms total)
[ PASSED ] 48206 tests.

gfx90a passed
[----------] Global test environment tear-down
[==========] 13061 tests from 13 test suites ran. (480980 ms total)
[ PASSED ] 13061 tests.

========== 76 passed, 32 skipped, 1258 warnings in 8858.51s (2:27:38) ==========
gfx94x tox passed.