Dilip Sequeira
The format looks reasonable (I would slightly prefer colors to postfix characters, but I expect there are UI accessibility concerns). A larger problem is having rules for transforming from what...
I agree inference is rarely the last pipeline step. However, if your accelerator is a general-purpose programmable device, it's realistic for it to run post-processing too - for example,...
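To sketch the kind of thing I have in mind (hypothetical PyTorch-style code, not taken from any particular submission): a post-processing step such as a per-voxel argmax can stay on the device, so only the compact label map crosses back to the host.

```python
import torch

def segment(model, volume, device="cuda"):
    # Hypothetical sketch: run both inference and the argmax
    # post-processing step on the accelerator, so only the compact
    # per-voxel label map is copied back to the host.
    with torch.no_grad():
        logits = model(volume.to(device))     # (N, C, D, H, W) logits
        labels = torch.argmax(logits, dim=1)  # per-voxel class ids
    return labels.to("cpu", torch.uint8)      # small host transfer
```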
The timeline for getting that into 1.1 seems quite short, given there's no proposal yet.
And regarding 3DUNet not being in Server... that's correct, but latency is still relevant for 3DUNet in Edge Single Stream.
It's significant only for benchmarks where the output size is large. Today, that's only segmentation.
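To put rough numbers on that (illustrative tensor shapes, not the official benchmark dimensions): a dense volumetric segmentation output is several orders of magnitude larger than a classification output.

```python
# Back-of-the-envelope output sizes; shapes are illustrative only.

def output_bytes(shape, bytes_per_elem=4):
    """Bytes in a dense fp32 output tensor of the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_elem

print(output_bytes((1000,)))             # classification logits: ~4 KB
print(output_bytes((3, 128, 128, 128)))  # 3D segmentation logits: ~25 MB
```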
I'm sure we can, but what are we looking for, and how would we act on it? MLPerf has, historically, set some fairly arbitrary bounds on the timed portion of...
That would be my preference regardless of this question. If we do that, does that mean we should assume the answer is (1) above?
The hyperparameter question is somewhat off-topic here; I've opened a new issue: https://github.com/mlcommons/inference_policies/issues/216
I agree. This class of system-programming optimizations applies to generic (broader-than-ML) accelerators, and it would be counterproductive to exclude them.
Agreed on all counts.