TEST04 Evaluation

Open tjablin opened this issue 3 years ago • 3 comments

TEST04 is only applicable to ResNet50. Is it still worthwhile?

tjablin avatar Oct 03 '22 22:10 tjablin

TEST04 is a good compliance test because it ensures that a submission gained no significant benefit from caching. But it does not make sense to have it only for ResNet50. The current problem is that variable-length inputs prevent us from applying it to other models. Would it be possible to increase the input size in TEST04 (currently just 1) to, say, 8 or 16, which should still ensure caching, and so make the test applicable to other models?

arjunsuresh avatar Oct 08 '22 15:10 arjunsuresh

We agree that it would be desirable for this test to be more widely applicable. In our opinion, the fact that it currently applies only to ResNet50 does not invalidate it, especially since almost all submitters submit this benchmark. For some, it represents 50%-100% of the benchmarks they submit. We welcome more input from the WG on how this test can be improved.

georgelyuan avatar Oct 25 '22 15:10 georgelyuan

For v3.0 there is no change to TEST04. For v3.1, TEST04 will support models beyond ResNet50.

Pablo will create an issue to address the v3.1 requirements, and will work with others to define those requirements and implement them for the v3.1 submission.

rnaidu02 avatar Jan 10 '23 16:01 rnaidu02