End2end rtlsim tests failing
Problem:
In test_end2end_bnn_pynq.py, the "ipstitch_rtlsim" test and the rtlsim-based part of the "validate_top1" test fail. The simulation runs through but yields incorrect results (e.g. a top1-accuracy of ~6% for tfc-1-1). I also tested the cnv-1-1 network and found the same issue. The accuracy at all other build steps, including the run on HW, is not affected.
Possible Cause:
During preparation of the rtlsim model, InsertAndSetFIFODepths() is applied. I noticed that either one of the following 2 changes solves the issue:
- Remove the RemoveShallowFIFOs() transformation as the last step of InsertAndSetFIFODepths().
- Add an additional InsertFIFO() transformation after the existing InsertAndSetFIFODepths().
This suggests the following: The FIFO autosizing determines a shallow depth of 2 for the first and last FIFO of the dataflow partition, so they are removed. I'm still investigating how this could cause the rtlsim to produce incorrect results.
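To make the suspected mechanism concrete, here is a minimal toy sketch in plain Python. It does not use the FINN API: the Node class, the helper functions, and the example partition are all hypothetical stand-ins, and the depth passed to the re-insertion helper is only an illustrative value. It only shows how a removal pass without a boundary special case strips the first and last FIFO once autosizing has set them to depth 2, and how a later re-insertion pass would restore them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a node in a dataflow partition (not a FINN class)."""
    name: str
    is_fifo: bool = False
    depth: int = 0  # only meaningful for FIFO nodes


def remove_shallow_fifos(nodes: List[Node], max_shallow_depth: int = 2) -> List[Node]:
    """Drop every FIFO with depth <= max_shallow_depth; the boundary gets no special case."""
    return [n for n in nodes if not (n.is_fifo and n.depth <= max_shallow_depth)]


def boundary_fifos_present(nodes: List[Node]) -> bool:
    """True if both the first and the last node of the partition are FIFOs."""
    return bool(nodes) and nodes[0].is_fifo and nodes[-1].is_fifo


def reinsert_boundary_fifos(nodes: List[Node], depth: int) -> List[Node]:
    """Re-add a FIFO at the start and end of the partition if one is missing there."""
    result = list(nodes)
    if not result or not result[0].is_fifo:
        result.insert(0, Node("fifo_in", True, depth))
    if not result[-1].is_fifo:
        result.append(Node("fifo_out", True, depth))
    return result


# Assumed outcome of autosizing: depth 2 at both ends, a deeper FIFO in the middle.
partition = [
    Node("fifo_0", True, 2),
    Node("layer_0"),
    Node("fifo_1", True, 64),
    Node("layer_1"),
    Node("fifo_2", True, 2),
]

after_removal = remove_shallow_fifos(partition)
after_reinsert = reinsert_boundary_fifos(after_removal, depth=32)  # illustrative depth

print([n.name for n in after_removal])
print(boundary_fifos_present(after_removal), boundary_fifos_present(after_reinsert))
```

In this toy model, the first workaround corresponds to skipping remove_shallow_fifos(), and the second to calling reinsert_boundary_fifos() afterwards. It says nothing about why the missing boundary FIFOs would corrupt the rtlsim output.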
Notes:
- The additional InsertFIFO() re-adds only the first and last FIFO and forces their depth to 32. RemoveShallowFIFOs(), on the other hand, doesn't handle these FIFOs as a special case.
- The seemingly redundant InsertFIFO() was removed from the rtlsim test in this commit (https://github.com/Xilinx/finn/commit/f00e4f56c92102676cc3463abfeb4760280ea41d). It remains in the general ZynqBuild(), where it works as intended and has no effect because IODMAs are now the first/last nodes of the partition.