Tobias Alonso
My goal was to have a general streamlining transformation. The number of iterations over the sequence of transforms can vary from one net to another. We may set the...
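To illustrate the point about the iteration count varying per net, here is a minimal sketch of a fixed-point loop: it repeats the whole transform sequence until a full pass changes nothing. Everything in it is hypothetical and for illustration only. The model is a plain list of op names, the two toy transforms and the `max_iters` cap are assumptions, and none of this is the actual transformation API.

```python
from typing import Callable, List, Tuple

# Hypothetical transform signature: takes a model, returns (new_model, changed).
Transform = Callable[[List[str]], Tuple[List[str], bool]]


def collapse_adjacent_mul(ops: List[str]) -> Tuple[List[str], bool]:
    """Toy transform: merge runs of consecutive "Mul" ops into a single "Mul"."""
    out: List[str] = []
    for op in ops:
        if op == "Mul" and out and out[-1] == "Mul":
            continue  # absorbed into the previous Mul
        out.append(op)
    return out, len(out) != len(ops)


def drop_identity(ops: List[str]) -> Tuple[List[str], bool]:
    """Toy transform: remove "Identity" ops."""
    out = [op for op in ops if op != "Identity"]
    return out, len(out) != len(ops)


def apply_until_stable(
    model: List[str], transforms: List[Transform], max_iters: int = 20
) -> Tuple[List[str], int]:
    """Run the full sequence repeatedly; stop after the first pass with no change.

    Returns the final model and the number of passes performed, including
    the last pass that confirmed nothing changed.
    """
    for iteration in range(1, max_iters + 1):
        any_change = False
        for transform in transforms:
            model, changed = transform(model)
            any_change = any_change or changed
        if not any_change:
            return model, iteration
    raise RuntimeError(f"no fixed point reached after {max_iters} passes")


final_model, passes = apply_until_stable(
    ["Mul", "Identity", "Mul", "Add"],
    [collapse_adjacent_mul, drop_identity],
)
print(final_model, passes)
```

In this toy case the two "Mul" ops only become adjacent after the "Identity" is dropped, so a single pass is not enough. A different net, or a different ordering of the same transforms, would need a different number of passes, which is why a stopping condition seems more robust than a hard-coded count.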
For my test case, using NUM_DEFAULT_WORKERS = 30 and StreamingFCLayer_Batch with "mem_mode" = "decoupled", these are the ones that take the most time, and I think they are well suited for cache...
Along the same lines, an approach that could provide additional optimization, possibly for all layers, is to keep track of the min and max values of each tensor. Currently, each...
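As a rough sketch of what tracking min and max could look like, the snippet below propagates a value range through an integer matrix multiply and then derives the smallest signed bit width that holds the result. The helper names, the list-of-rows weight layout, and the use of the range to pick a bit width are all assumptions for illustration, not something taken from the existing code.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (min, max), both inclusive


def matmul_range(in_range: Range, weights: List[List[int]]) -> Range:
    """Bounds on x @ W when every element of x lies in in_range.

    weights is a list of rows with shape (num_inputs, num_outputs).
    For each output column, the minimum is reached by putting the lowest
    input on positive weights and the highest input on negative weights;
    the maximum is the mirror case.
    """
    lo, hi = in_range
    col_mins, col_maxs = [], []
    for col in zip(*weights):
        pos = sum(w for w in col if w > 0)
        neg = sum(w for w in col if w < 0)
        col_mins.append(lo * pos + hi * neg)
        col_maxs.append(hi * pos + lo * neg)
    return min(col_mins), max(col_maxs)


def signed_bits_needed(value_range: Range) -> int:
    """Smallest two's complement width n with -2**(n-1) <= min and max <= 2**(n-1) - 1."""
    lo, hi = value_range
    n = 1
    while not (-(1 << (n - 1)) <= lo and hi <= (1 << (n - 1)) - 1):
        n += 1
    return n


# 2-bit unsigned inputs in [0, 3], three inputs, two outputs.
w = [[1, -1],
     [1, 1],
     [-1, 1]]
out_range = matmul_range((0, 3), w)
print(out_range, signed_bits_needed(out_range))
```

The bounds here are worst-case over all inputs in the range, so they are safe but can be pessimistic when the inputs are correlated.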