Refactor how results files produced by ml algorithms are uploaded to lab

Open hjwilli opened this issue 7 years ago • 0 comments

Currently, when an ml algorithm completes, skl_utils.py writes the result data and images as files to a temporary folder. machine.js watches this folder and uses the lab API to upload these files. As soon as skl_utils.py exits, machine.js sets the experiment status to 'finished'; after a 100s timeout, it checks that the lab file upload promises have been resolved and deletes the folder.

One issue that has arisen is that the .catch() blocks for the promises generated when uploading files to lab are not added until the 100s timeout completes. If a file upload fails (for example, the .json is not in an acceptable format), the failure returned by lab is not properly caught and is difficult to link to the particular file that caused the issue (see #61).

Another possible issue is a race condition with the ai: the experiment might be marked as finished before a results file the ai needs is fully uploaded.

Refactor to address these issues. One option would be to refactor how the file promises in machine.js are handled: add .catch() blocks as soon as the promises are created, and set the experiment status to 'finished' only once all promises have resolved or been caught.
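A rough sketch of what the first option might look like, not the actual machine.js code. `uploadFile`, `trackUpload`, `finishExperiment`, and `setStatus` are hypothetical names made up for illustration, and the fake upload only simulates a lab API rejection. The idea is that each promise gets its handlers at creation, so a failure carries its filename, and the status is set only after every upload has settled.

```javascript
// Hypothetical stand-in for the lab API upload call; the real call in
// machine.js may look quite different. Rejects for names containing "bad"
// to simulate lab refusing a malformed .json file.
function uploadFile(filename) {
  return new Promise((resolve, reject) => {
    setImmediate(() => {
      if (filename.includes("bad")) {
        reject(new Error("unacceptable file format"));
      } else {
        resolve({ uploaded: filename });
      }
    });
  });
}

// Attach success and failure handlers when the promise is created, so a
// rejection is never left unhandled and stays linked to its file.
function trackUpload(filename, upload = uploadFile) {
  return Promise.resolve()
    .then(() => upload(filename))
    .then(
      (response) => ({ filename, ok: true, response }),
      (error) => ({ filename, ok: false, error: error.message })
    );
}

// Wait for every upload to settle before marking the experiment finished.
// setStatus is a hypothetical callback that would update the experiment.
async function finishExperiment(filenames, setStatus, upload = uploadFile) {
  const results = await Promise.all(
    filenames.map((filename) => trackUpload(filename, upload))
  );
  const failures = results.filter((result) => !result.ok);
  for (const failure of failures) {
    console.error(`upload failed for ${failure.filename}: ${failure.error}`);
  }
  await setStatus("finished");
  return { results, failures };
}
```

Because `trackUpload` never rejects, `Promise.all` here waits for all files rather than bailing out on the first failure. Whether an experiment with failed uploads should still end up as 'finished' is left open in this sketch.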

Another would be to update skl_utils.py to add the files via the api directly.

hjwilli, Aug 02 '18 16:08