
Design test procedures for `examples/`

Open FilipBolt opened this issue 5 years ago • 7 comments

We need a way to automatically test the examples in `examples/` to ensure they keep working as the framework core changes. One option is to mark these tests as `slow` to avoid running them on every test pass. Also, slow operations such as dataset downloading/loading and model training should all be mocked (perhaps a set of mocking tools could be created).

FilipBolt avatar Dec 18 '20 12:12 FilipBolt
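For the mocking idea above, a minimal sketch using `unittest.mock` to replace a slow download with an in-memory fixture. `load_dataset` and `example_pipeline` are hypothetical stand-ins for whatever an example script calls, not actual podium APIs:

```python
import sys
from unittest.mock import patch


def load_dataset(name):
    # In a real example this would download data over the network.
    raise RuntimeError("network access not allowed in tests")


def example_pipeline():
    # A stand-in for an example script's main flow.
    data = load_dataset("imdb")
    return len(data)


def test_example_pipeline_with_mocked_download():
    # Swap the slow download for a tiny in-memory fixture for the
    # duration of the test; the original is restored afterwards.
    with patch.object(sys.modules[__name__], "load_dataset",
                      return_value=["pos", "neg"]):
        assert example_pipeline() == 2
```

The same pattern scales to mocking model training: patch the training entry point with a stub that returns a pre-built tiny model.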

Currently, the examples are not part of the docs, and some of them are outdated.

Once we add them, it shouldn't be too hard to implement automatic execution of these examples as part of the existing test suite. I'm not sure about the mocking tools; I think that's a long-term goal (probably because I'm not a big fan of mocking). Ideally, we would have an external runner (e.g. a TakeLab server) connected to CI that runs the slow examples only when file contents change or new files are added. That way we would need the server only from time to time. IMO, these examples (only BERT comes to mind) are "slow", but not slow enough to justify mocking parts of them.

mariosasko avatar Dec 18 '20 13:12 mariosasko
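Automatic execution of the examples as part of the existing test suite could look roughly like the pytest sketch below, assuming the examples are plain scripts under `examples/` and a `slow` marker is registered in the test config. All names here are illustrative assumptions:

```python
import pathlib
import runpy

import pytest

# Collect every example script; an empty examples/ dir simply yields no tests.
EXAMPLES_DIR = pathlib.Path("examples")
EXAMPLE_SCRIPTS = sorted(EXAMPLES_DIR.glob("*.py"))


@pytest.mark.slow
@pytest.mark.parametrize("script", EXAMPLE_SCRIPTS, ids=lambda p: p.name)
def test_example_runs_to_completion(script):
    # Only assert that the example executes without raising; metrics are
    # deliberately not checked (see the discussion in this thread).
    runpy.run_path(str(script), run_name="__main__")
```

A change-detection step in CI (e.g. diffing paths under `examples/`) could then decide whether to dispatch this slow job to the external runner at all.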

This sounds fantastic. If we could set up a CI job that detects changes and runs the examples only in those cases, we could probably get away without any mocking whatsoever, which sounds great.

I'd say the docs are a separate issue (though a valid point), so I'd rather not expand the scope of this one too much.

FilipBolt avatar Dec 23 '20 17:12 FilipBolt

How would we go about checking the correctness of the non-deterministic parts of the examples, e.g. model training, which should be a fairly common example case? I see breaking examples into subfunctions and testing those subfunctions separately as a good first step, but would that hurt the aesthetics of the examples?

ivansmokovic avatar Dec 23 '20 19:12 ivansmokovic
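To make the subfunction idea concrete, a hedged sketch: factor the example script into small functions plus a thin `main`, so the deterministic pieces can be unit-tested in isolation while the script stays readable. Function names below are illustrative, not podium APIs:

```python
def preprocess(texts):
    # Deterministic step: trivially unit-testable on its own.
    return [t.lower().split() for t in texts]


def train(tokenized):
    # Stand-in for real training; returns a trivial "model" (a vocabulary).
    vocab = set()
    for tokens in tokenized:
        vocab.update(tokens)
    return vocab


def main():
    # Thin driver, the only part a full example run actually needs.
    data = ["Hello world", "Testing examples"]
    return train(preprocess(data))


if __name__ == "__main__":
    print(sorted(main()))
```

Tests can then target `preprocess` and `train` directly, while the slow end-to-end path exercises only `main`.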

We can either fix training of the model (you can make it deterministic at the cost of speed) or simply not care about performance metrics (unless they are relevant) as long as the training completes.

mttk avatar Dec 23 '20 20:12 mttk
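"Making it deterministic at the cost of speed" usually means seeding every RNG the example touches. A sketch of such a helper; the PyTorch flags are the standard reproducibility knobs, guarded here in case an example doesn't use torch:

```python
import os
import random


def set_deterministic(seed=42):
    # Seed the stdlib RNG and, when available, NumPy and PyTorch.
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        # Deterministic (slower) cuDNN kernels; this is the speed cost.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass
```

Calling `set_deterministic()` at the top of an example makes repeated runs bitwise-comparable, so a test can assert on exact losses or metrics if it wants to.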

> We can either fix training of the model (you can make it deterministic at the cost of speed) or simply not care about performance metrics (unless they are relevant) as long as the training completes.

But can we guarantee the correctness of training examples?

ivansmokovic avatar Dec 24 '20 18:12 ivansmokovic

> We can either fix training of the model (you can make it deterministic at the cost of speed) or simply not care about performance metrics (unless they are relevant) as long as the training completes.

> But can we guarantee the correctness of training examples?

Not sure what you mean by this.

I'd defer this until after 1.1.0.

mttk avatar Jan 08 '21 12:01 mttk

Will be closed via #318

mttk avatar Apr 02 '21 11:04 mttk