Zongheng Yang
Hi, I'm interested in running a TC-generated CUDA kernel outside of PyTorch. Currently, I'm using the TC options to specify grid and block dim3. E.g., with ``` .mapToThreads(320) .mapToBlocks(32, 320)...
A TPU user mentioned that a TPU, even an on-demand one, can be killed at any time within every 2-day period, and no logs can be found for the...
Adapted from a programmatic use case from Erick. Here's a CLI repro: ``` # Succeeds. Because we allow lower-case GPUs in launching. sky launch -c myclus --gpus v100 '' #...
``` Error: Error reading Project Service foo/cloudbuild.googleapis.com: googleapi: Error 403: Cloud Resource Manager API has not been used in project 123456789 before or it is disabled. Enable it by visiting...
The test_inline_spot_env smoke test has been failing: ``` » sky spot status | grep test-inline-spot-env-zongheng-4edc-62 428 test-inline-spot-env-zongheng-4edc-62 1x [CPU:0.5] 4 mins ago 2m 27s - 0 FAILED_NO_RESOURCE » sky logs sky-spot-controller 428...
Fixes #1045 - see that issue for the motivation of the change (UX/confusion on the user's side). **This change will halve the maximum number of concurrent tasks**, compared to master. This...
In `sky queue` - `[CPU:0.5]` means this task takes 0.5 CPU for **scheduling purposes**. So e.g., if the VM has 8 cores, this means we can concurrently run at most...
Daniel has run into a back-compat issue, where - he had a cluster launched before our AMI upgrade; it's in stopped state - he upgraded Sky, which now includes the...
Repro: ``` sky launch -c xv2test2 'git clone https://github.com/RitwikGupta/xView2-Vulcan.git && conda install --file xView2-Vulcan/spec-file.txt' ``` On 04b94b9004a, some lines are prefixed and some are not: ``` mkl-service-2.4.0 | ########## | 100%...
``` » sky queue 2 ↵ Fetching and parsing job queue... Cluster bench is not up (status: STOPPED); skipped. Job queue of cluster sky-spot-controller ssh: connect to host 44.199.255.174 port...