Chuck Yount
Referring to this README: https://github.com/oneapi-src/oneAPI-samples/blob/master/DirectProgramming/C%2B%2B/StructuredGrids/iso3dfd_omp_offload/README.md
Yes, the problem was as you mentioned: I had used OPT3, then tried to go back to baseline. Maybe this scenario should be better documented.
Old Devito-generated TTI stencils were added in v2.3, but they are very ugly.
Maybe save each tuner setting in a file in `$HOME/.yask/` dir or something like that. Would need to match platform and stencil settings, perhaps using some sort of key. Should...
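One way the keyed-file idea could look, as a rough sketch only: hash the platform and stencil settings into a key and store each tuner result as a JSON file under that key. None of this is existing YASK code. The function names (`tuner_key`, `save_tuned`, `load_tuned`), the JSON layout, and the choice of fields that go into the key are all assumptions made for illustration.

```python
import hashlib
import json
import platform
from pathlib import Path
from typing import Optional


def tuner_key(stencil_settings: dict, platform_info: Optional[dict] = None) -> str:
    """Build a short, stable key from platform and stencil settings (hypothetical scheme)."""
    if platform_info is None:
        # Assumption: these two fields are enough to tell platforms apart.
        platform_info = {
            "machine": platform.machine(),
            "processor": platform.processor(),
        }
    # sort_keys makes the key independent of dict insertion order.
    payload = json.dumps(
        {"platform": platform_info, "stencil": stencil_settings}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def save_tuned(cache_dir: Path, key: str, tuned: dict) -> Path:
    """Write tuned settings to <cache_dir>/<key>.json, creating the dir if needed."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{key}.json"
    path.write_text(json.dumps(tuned, indent=2))
    return path


def load_tuned(cache_dir: Path, key: str) -> Optional[dict]:
    """Return previously saved settings for this key, or None if there are none."""
    path = cache_dir / f"{key}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text())
```

Usage would be something like `load_tuned(Path.home() / ".yask", tuner_key(settings))` before running the tuner, falling back to a fresh tuning run when it returns `None`. The setting names passed in are up to the caller; the sketch does not assume any particular YASK option names.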
Default of 2 on Xeon is correct now when HT is enabled. If HT is disabled, default will cause 2 cores to work on each block, which shouldn't be terrible,...
As a workaround, there are now APIs to allow the user to construct a dependency graph manually.
Step conditions can now use grid vars, but only with const indices.
Also, should only use this when not using temporal tiling.
Consideration: streaming writes would often reduce performance when using temporal tiling.
The per-stage tuning feature has been removed, which should make this a bit easier to implement.