Paul Fultz II
> The original golden number (= 80) for tolerance was chosen in some heuristic way.

The 80 was chosen because `80 * std::numeric_limits<float>::epsilon()` is roughly 1e-5.
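As a quick check on that arithmetic, here is a minimal sketch that scales machine epsilon by a fixed factor. The helper name `scaled_tolerance` and the constant `fp16_epsilon` are hypothetical and are not taken from the code under review; the FP16 epsilon is hard-coded as 2^-10 because standard C++ before C++23 has no half-precision type.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not from the project): a tolerance formed by
// multiplying the machine epsilon of T by a fixed factor.
template <class T>
double scaled_tolerance(double factor)
{
    assert(factor >= 0.0);
    return factor * static_cast<double>(std::numeric_limits<T>::epsilon());
}

// IEEE 754 binary16 has a 10-bit stored mantissa, so its epsilon is 2^-10.
// Hard-coded here as an assumption-free constant, since no standard half
// type is available in this sketch.
inline double fp16_epsilon() { return std::ldexp(1.0, -10); }

// The same factor applied to FP16 instead of FP32.
inline double scaled_tolerance_fp16(double factor)
{
    assert(factor >= 0.0);
    return factor * fp16_epsilon();
}
```

With a factor of 80, `scaled_tolerance<float>(80)` evaluates to about 9.54e-6, while `scaled_tolerance_fp16(80)` evaluates to 0.078125, which illustrates how differently the same multiplier behaves once FP16 is involved.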
> This tolerance suggested value for FP32 can not logically apply to a mixed FP32 precision + FP16 precision operation

Yeah, we could pick the FP based on the lowest...
Either way, I think the test verification updates are out of scope for this. The goal is to improve the RNG for perf runs specifically, so I think this PR...
> Strict tolerances coupled with poorly chosen test vectors!

If we have stricter tolerances, then we have chosen better test data in general. CK and rocblas will use exact matching (so...
> The main issue addressed in this PR are three fold:

Let's make it one fold, and focus on the perf measurements.

> catching bugs in code or in testing...
> Definitely we should be sampling more values than just 32 fixed possible numbers in the range of a datatype.

We may want to tweak this, but for verification we...
> For some background: where are we failing accuracy because of precision changes?

This is related to the fp16 inaccuracy with llamav2 (see #2556). #2883 will use FP32 for large reduce_means,...
I think there might be another package that's needed to install clang-format.
I think rocm-llvm-dev is the package that needs to be installed.
We should disable the `cppcoreguidelines-missing-std-forward`, `modernize-type-traits` and `misc-include-cleaner` tidy checks and then open an issue where we can address this in separate PRs (or perhaps we decide we want to...
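For illustration only, this is roughly how those three checks could be switched off in a `.clang-tidy` file. The file placement is an assumption; the project may instead list its checks elsewhere (for example in its CMake setup), in which case the same three `-check-name` entries would be added there.

```yaml
# Hypothetical .clang-tidy fragment: a leading '-' disables a check.
# Append these entries to whatever Checks list is already in use.
Checks: >-
  -cppcoreguidelines-missing-std-forward,
  -modernize-type-traits,
  -misc-include-cleaner
```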