Tail-agnostic policy violation for mask instruction
Hi @michael-platzer,
Issue
I have found that some mask instructions violate the tail-agnostic policy, namely vmandn.mm, vmand.mm, vmxnor.mm, vmxor.mm, vmnand.mm, and vmnor.mm. As you mentioned (and as stated in the spec), mask instructions are always handled in a tail-agnostic way.
Note that these instructions only violate the tail policy; the new mask itself is calculated correctly.
Below is an example of the warning (and error) that the UVM environment throws for each of these instructions.
# UVM_WARNING uvm/sim/../src/env/cvxif_scoreboard.svh(207) @ 660: uvm_test_top.env.scoreboard [CVXIF_SCOREBOARD] Instr "vmandn.mm v2, v2, v0 m1, tu, mu" in Prog "vmandnot" failed:
# at 660: State Difference (Tail agnostic policy violation in mask instruction wtih ref.vl = 16)
# ref.vproc_register[2]: e8404a51302f2e3247434b3a323b2505, dut.vproc_reg[2]: a0000001202524200501012020212505
#
# UVM_ERROR uvm/sim/../src/env/cvxif_scoreboard.svh(223) @ 820: uvm_test_top.env.scoreboard [CVXIF_SCOREBOARD] Program "vmandnot" failed
How to reproduce
You can reproduce these errors by running the cvxif_test_direct_issue_79 test in the UVM environment.
Side note
Spike handles mask instructions with a tail-undisturbed policy (at least for all the test programs of the cvxif_test_direct_issue_79 test). This means that in the UVM_WARNINGs, ref.vproc_register can be treated as the tail-undisturbed golden reference: to comply with the tail-agnostic policy, every tail bit of the DUT result must either equal the corresponding reference bit or be 1, so the DUT value should be equal to or larger than the reference.
Also, ref.vl in the UVM_WARNING shows the vector length that the reference model used for this instruction.
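To make the check from the side note concrete, here is a minimal Python sketch applied to the values from the warning above. The function name `tail_agnostic_ok` and the 128-bit register width (inferred from the 32 hex digits in the log) are my assumptions; this is not the scoreboard's actual code. It only models the rule as described in this issue (each tail bit is either left undisturbed or set to 1) and ignores any other tail values the spec may permit.

```python
VLEN = 128  # assumed register width in bits (the log shows 32 hex digits)


def tail_agnostic_ok(ref: int, dut: int, vl: int, vlen: int = VLEN) -> bool:
    """Check a DUT mask register against a tail-undisturbed reference.

    Hypothetical helper: bits 0..vl-1 (the body) must match exactly;
    every tail bit must equal the reference bit or be 1.
    """
    full = (1 << vlen) - 1
    body = (1 << vl) - 1
    tail = full & ~body

    # Any difference in the body means the mask itself is wrong.
    if (ref ^ dut) & body:
        return False

    # A tail bit is illegal only when the reference holds 1 and the DUT holds 0.
    return (ref & ~dut & tail) == 0


# Values copied from the UVM_WARNING above (ref.vl = 16).
ref = 0xE8404A51302F2E3247434B3A323B2505
dut = 0xA0000001202524200501012020212505

print(tail_agnostic_ok(ref, dut, vl=16))  # False
```

The body bits (0x2505) match, so the mask result itself is correct, but several tail bits that are 1 in the reference are 0 in the DUT value, which is what the scoreboard flags.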
Hi @moimfeld,
thanks for reporting this. Indeed, these instructions are currently executed in the same fashion as the regular vand, vor, etc., with the EMUL always set to 1. Therefore, the instructions apply the operation to the entire vector register (or at least to the first VL bytes rather than only the first VL bits), which is why the mask result is correct but the tail policy is not respected.
I will think about a way to solve this.
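The difference between updating the first VL bits and the first VL bytes can be sketched as below. This is only an illustrative Python model under assumptions: the function names, the 128-bit VLEN, and the operand values are hypothetical and not taken from the RTL or the test programs.

```python
VLEN = 128                 # assumed register width in bits
FULL = (1 << VLEN) - 1


def vmandn_first_vl_bits(vd_old: int, vs2: int, vs1: int, vl: int) -> int:
    """Tail-undisturbed model: only bits 0..vl-1 of vd are written."""
    region = (1 << vl) - 1
    res = vs2 & ~vs1 & FULL
    return (res & region) | (vd_old & ~region & FULL)


def vmandn_first_vl_bytes(vd_old: int, vs2: int, vs1: int, vl: int) -> int:
    """Assumed model of the current behavior: the first vl bytes are written."""
    nbits = min(8 * vl, VLEN)
    region = (1 << nbits) - 1
    res = vs2 & ~vs1 & FULL
    return (res & region) | (vd_old & ~region & FULL)


# Hypothetical operands for "vmandn.mm v2, v2, v0" with vl = 16 (vd is also vs2).
v2 = 0xFFFFFFFF
v0 = 0x00FF00FF
vl = 16

expected = vmandn_first_vl_bits(v2, v2, v0, vl)
actual = vmandn_first_vl_bytes(v2, v2, v0, vl)

body = (1 << vl) - 1
tail = FULL & ~body

print(hex(expected))                  # 0xffffff00
print(hex(actual))                    # 0xff00ff00
print(hex(expected & ~actual & tail)) # 0xff0000: tail bits cleared from 1 to 0
```

In this model the low VL bits agree in both versions, while the byte-wise version also clears tail bits that were previously 1, which is neither "undisturbed" nor "overwritten with 1s".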