ckstanton

Results: 7 comments by ckstanton

Hi @nv-jinhosuh. Thanks for catching this. It turns out that the fix is simple... there's no complicated overflow issue going on. There's currently an [overlatency query bound](https://github.com/mlcommons/inference/blob/r2.0/loadgen/loadgen.cc#L635) that's set...

> That looks like the same bug to me. Agreed. There should be on the order of ~4500 acceptable overlatency queries for this run, which is bigger than the threshold...

> @nv-jinhosuh That's my feeling too. But IIRC we decided that the early estimate would be used as the metric regardless. And it is what the submission checker appears to...

> Currently we set d (tolerance) to zero in our LoadGen. With this I don't think we can have 'third case'. Is it possible if it's the second case, Early...

The effective required min_query_count depends on the underlying overlatency percentile of the system. Based on the observed overlatency percentile of a run, it would be possible to have loadgen estimate...
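To illustrate why the required query count depends on the system's true overlatency rate: a simplified one-sided binomial sketch (not the actual loadgen early-stopping computation, which also involves the tolerance d discussed above; `min_query_count` and `binom_cdf` are hypothetical helper names) shows the run length needed to certify, with 99% confidence, that the true overlatency rate is under the 1% budget of a 99th-percentile latency target.

```python
import math

def binom_cdf(h, n, p):
    """P(X <= h) for X ~ Binomial(n, p), computed via the pmf recurrence."""
    pmf = (1.0 - p) ** n        # P(X = 0)
    cdf = pmf
    for k in range(h):
        # pmf(k+1) = pmf(k) * (n-k)/(k+1) * p/(1-p)
        pmf *= (n - k) / (k + 1) * p / (1.0 - p)
        cdf += pmf
    return cdf

def min_query_count(q, target_rate=0.01, alpha=0.01, step=100, max_n=200_000):
    """Smallest n (on a coarse grid) such that a system whose true overlatency
    rate is q would, at its expected overlatency count, pass a one-sided
    binomial check certifying rate < target_rate with confidence 1 - alpha."""
    for n in range(step, max_n + 1, step):
        h = math.floor(q * n)   # expected overlatency queries at rate q
        if binom_cdf(h, n, target_rate) <= alpha:
            return n
    return None                 # not certifiable within max_n queries
```

A system with a true overlatency rate well below 1% (say 0.2%) needs only a few hundred queries under this bound, while one sitting near the budget (say 0.8%) needs tens of thousands, and the required count diverges as the true rate approaches 1%.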

> Thank you @ckstanton - I think we cannot do anything about this for 2.0, but for 2.1, I think we need to bring the policy requirements of query count...

Thanks, @nv-jinhosuh, for putting this together! The proposal to use early stopping estimates for shorter runs, and to not use them (i.e. report seen percentiles instead) beyond the minimum duration...