hcman2

Results 13 comments of hcman2

If we split a b128 into 4 x b32, we will have a 4-way bank conflict by default. The extra cycles for one b32 instruction with a 4-way bank conflict are 2, 1 for...
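To illustrate why the split gives a 4-way conflict by default, here is a minimal sketch of a simplified bank model. It assumes 32 banks of 4 bytes each, 32 lanes serviced together, and each lane owning a contiguous 16-byte slot. None of these parameters are stated in the comment, and `conflict_degree` is a hypothetical helper, not part of any real tool.

```python
from collections import Counter

# All three constants are assumptions for illustration only.
NUM_BANKS = 32   # number of local-memory banks
BANK_WIDTH = 4   # bytes per bank (one b32)
LANES = 32       # lanes serviced together


def conflict_degree(stride_bytes, offset_bytes=0, lanes=LANES):
    """Return the largest number of lanes hitting one bank for a b32 access.

    Each lane accesses byte address lane * stride_bytes + offset_bytes.
    """
    banks = Counter(
        ((lane * stride_bytes + offset_bytes) // BANK_WIDTH) % NUM_BANKS
        for lane in range(lanes)
    )
    return max(banks.values())


if __name__ == "__main__":
    # A b128 split into 4 x b32: lanes stay 16 bytes apart, and piece k
    # sits at byte offset 4 * k inside each lane's slot.
    for k in range(4):
        print(f"b32 piece {k}: {conflict_degree(16, 4 * k)}-way")
    # Densely packed b32 accesses (4-byte stride) for comparison.
    print(f"dense b32: {conflict_degree(4)}-way")
```

Under this model, a 16-byte lane stride maps the 32 lanes onto only 8 distinct banks, so each of the four b32 pieces sees 4 lanes per bank. A 4-byte stride spreads the lanes over all 32 banks.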

In a real test, the LW b32 has a 108-cycle latency with a 4-way bank conflict.

> Is there any perf test result proving the most of cases would get uplift?

Yes. My test example is TN MT256x192 with problem size [8192,7296,1,8192]. For this example we...