Hakan Baba
For some reason this pull request shows many more commits than there are. :( Something may be messed up with the rebase/merge operations.
Dear maintainers, we would very much appreciate it if you could add some comments to this PR.
If I may, could I suggest alternative paths to `/etc`:

- `join(base_dir, "etc")`
- `join(base_dir, "conf")`
- `join(base_dir, "config")`

**Rationale**

`sagemaker-inference` already has a concept of [`base_dir`](https://github.com/aws/sagemaker-inference-toolkit/blob/45fa4fb33c13a70640aa200fbfa576c323f973da/src/sagemaker_inference/environment.py#L30). `base_dir` defaults to...
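A minimal sketch of how that lookup could work; the `resolve_config_dir` helper and its fallback behavior are hypothetical, just to illustrate the proposal, not existing `sagemaker-inference` code:

```python
import os

# Candidate subdirectory names come from the suggestion above.
CANDIDATE_SUBDIRS = ("etc", "conf", "config")

def resolve_config_dir(base_dir):
    """Return the first existing config directory under base_dir, else /etc."""
    for sub in CANDIDATE_SUBDIRS:
        candidate = os.path.join(base_dir, sub)
        if os.path.isdir(candidate):
            return candidate
    # Fall back to the current hard-coded location.
    return "/etc"
```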
I see the defragment error with 0.9.1 as well. (Changed the issue title to reflect 0.9.1)
The assertion comes from [this line](https://github.com/microsoft/DeepSpeed/blame/39b429d56ef12b3dc82fc177e2f0f801db744a3d/deepspeed/runtime/zero/stage3.py#L410). According to the blame, it has not changed for a year or so. Looking further up the call stack, that defragment function is called...
> You can't use bf16 on the V100. Did you make the change in the README? https://github.com/databrickslabs/dolly#v100-gpus

Yes. Otherwise one gets a clear error message for the unsupported bf16. Also the...
How about giving the workaround a try first, @jamesrmccall?

```
"offload_param": {
    "device": "cpu",
    "pin_memory": true
},
```

in the DeepSpeed config? That fixed the issue for me...
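For context, here is a minimal sketch of where that fragment sits in a ZeRO stage-3 config, written as a Python dict; the surrounding values are illustrative placeholders, not taken from the original issue:

```python
# Sketch only: placement of the offload_param workaround inside a ZeRO-3
# DeepSpeed config. Surrounding values are illustrative defaults.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,  # placeholder
    "zero_optimization": {
        "stage": 3,
        "offload_param": {
            "device": "cpu",      # park parameters in CPU memory
            "pin_memory": True,
        },
    },
}
```

A dict like this can be passed to `deepspeed.initialize(..., config=ds_config)` instead of a JSON file.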
> For the JIT version, if you properly warm up the kernel compilation, this would not bring a performance decrease. We have a cache for JIT-compiled kernels.

I am mainly worried about a...
@hsanson Would there be any interest in this capability? If the maintainers are open to it, I could take a stab at it.
The issue here is that vllm_backend does not support the V1 metrics from vLLM (as far as I can tell). At the time of writing, the latest vllm_backend's...