Bug report: stale register value in case of double write
Hi there!
I found a bug in Kronos that in some cases ignores the second consecutive write to a given register in case of a WAW dependency. A problematic example is the following:
- The first instruction writes to, say,
ra, for exampleadd ra, t0, zero. - The second instruction writes to
raagain and has no dependency with the previous instruction, for exampleadd ra, t1, zero, where we assumet0!=t1. - The third instruction depends on
ra, sayadd t2, ra, zero. Then, given the behavior of the current HCU, will not stall because it sees a write-back tora. However, this is the write-back from the first instruction and forwards stale data. The final state of this scenario ist2=t0instead of the expectedt2=t1.
Kind regards, Flavien
@flaviens, thanks for digging into this. I'm off the grid until March 2nd. I'll need to parse and issue the study your fix, once I'm back. If all seems well, I'll pull your fix.
Sounds good! Some more info to help you: the bug is essentially caused by the misuse of the written-back register ID to identify whether the write-back belongs to the instruction currently in the EX stage, if my interpretation of this equality test is correct:
https://github.com/SonalPinto/kronos/blob/13678d47ccae4f889690ec1e0ba648299c7027d8/rtl/core/kronos_hcu.sv#L89
This assumption is incorrect if the instruction preceding the current instruction in the EX stage writes to the same destination register. In that case, regwr_pending is unset after the first EX cycle, but in fact, there is still a register write pending for that instruction in the EX stage. The effect of unsetting regwr_pending is that the control logic misses the information that a register hazard happens.
The solution that I suggested in the PR is to rely on the fact that a write-back happening in the first cycle in which an instruction resides in the EX stage cannot belong to that instruction in the EX stage (because it requires at least one cycle to generate the EX result).
I also suggest in the PR moving the logic for computing regwr_pending into the EX stage, as it feels more understandable.
I remain at your disposal!