cloud
Cloud instance management for deep learning applications.
TODOs
- Update clean and remove functions to match TPU VM command
- Documentation
Are there any plans to support GCP AI Platform?
Error: `[Errno 2] No such file or directory: '/tmp/sockets/mySocket'`

    Traceback (most recent call last):
      File "/home/stephengou/miniconda3/envs/stephen_dev/lib/python3.7/site-packages/errand_boy/transports/unixsocket.py", line 45, in server_get_connection
        os.remove(self.socket_path)
    FileNotFoundError: [Errno 2] No such file or directory: '/tmp/sockets/mySocket'
Right now, users are very likely to hit a preemption, and the simple `if tpu.usable` check would not be enough to recover. Maybe write a function decorator that will...
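One possible shape for such a decorator, as a rough sketch only: the issue text is cut off, so the intended behavior is not known. Everything named here is hypothetical and not part of this library: `PreemptedError` stands in for whatever signals a preemption, and `recover` is a caller-supplied callback that would bring the TPU back (for example by recreating it).

```python
import functools
import time


class PreemptedError(RuntimeError):
    """Hypothetical signal that the resource was preempted mid-call."""


def retry_on_preemption(recover, max_attempts=3, delay_s=0.0):
    """Re-run the wrapped function after a preemption.

    `recover` is a hypothetical zero-argument callback that restores the
    resource before the next attempt.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except PreemptedError:
                    # Out of attempts: surface the error to the caller.
                    if attempt == max_attempts:
                        raise
                    recover()
                    time.sleep(delay_s)
        return wrapper
    return decorator
```

Unlike a one-time `if tpu.usable` check before the call, this would also cover a preemption that happens while the function is running, at the cost of re-running the function from the start.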
Keep a redundant resource alive at all times in case one of the active ones dies.
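A minimal sketch of the redundancy idea, under stated assumptions: `StandbyPool`, `create`, and `is_alive` are hypothetical names, not part of this library. `create` is assumed to provision a new resource and `is_alive` to report whether one is still usable.

```python
class StandbyPool:
    """Keep `spares` extra resources alive alongside the active ones."""

    def __init__(self, create, is_alive, active=1, spares=1):
        self._create = create        # hypothetical: provisions a resource
        self._is_alive = is_alive    # hypothetical: health check
        self._spare_target = spares
        self.active = [create() for _ in range(active)]
        self.spares = [create() for _ in range(spares)]

    def heal(self):
        """Swap dead active resources for spares, then refill the spares."""
        # Drop spares that died while waiting.
        self.spares = [r for r in self.spares if self._is_alive(r)]
        for i, res in enumerate(self.active):
            if not self._is_alive(res):
                # Prefer a warm spare; fall back to creating a fresh one.
                self.active[i] = self.spares.pop() if self.spares else self._create()
        while len(self.spares) < self._spare_target:
            self.spares.append(self._create())
```

Calling `heal()` periodically would promote the warm spare as soon as an active resource dies, so the replacement does not have to wait for provisioning.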