vidhyat98
When can we expect AWQ models to be optimized for inference?
Any updates on this?
@SalmanMohammadi When can we expect the cookbook to be updated? Would the files still work if the vLLM functionality is not used? My training doesn't seem to converge.
Facing the same issue with Llama 3.1 model adapters.
I'm facing the same issue. Is there an ETA on when the fix will be merged?
Observing the same issue while trying to serve on two A100 40GB GPUs.