vladrad comments

Results: 17 comments by vladrad

Are you planning to implement LiveQuery in dotnet sdk?

I would be so down to work on this! I'm just not sure where to start.

Unexpected OOM When Using use_gradient_checkpointing = "unsloth"

Hey! I ran into this in WSL2 as well. I posted in the other thread https://github.com/unslothai/unsloth/issues/600#issuecomment-2181298507 but I think this is due to pinned memory in WSL + CUDA... when...

Unexpected OOM When Using use_gradient_checkpointing = "unsloth"

I guess using unsloth-wsl would solve this issue. I am interested in seeing if we can make it work with unsloth. I'm curious what it's doing at that step...

Redis-backed queues do not consume further messages once a queue is empty

Hello! I have this exact issue and was wondering if you found the solution?

AWQ support

Fine-tuning an AWQ image would be amazing. I see it has support for PEFT in transformers (https://github.com/huggingface/transformers/pull/28987). This would be amazing to have; it would mean everyone can just...

AWQ support

Thank you! Let me know if there is anything I can do to help test. I can write code as well, though this stuff is not my specialty, but I'd...

[Bug] Llama 3.1 Support

Sorry all... this could have been better written, but @Yoosu-L is right. It only happens when you talk to it and it is up and running. Seems like the template is slightly...

Cuda 12.8 Multi GPU Blackwell/A100 Fails

Hi all, I am back. I am seeing people having similar issues on the CUDA forums. I tried all versions of CUDA 12.8/12.9 and the last 2-3 open versions. https://forums.developer.nvidia.com/t/cuda-12-8-with-driver-version-570-124-06-on-b200-hgx-getting-code-3-cudaerrorinitializationerror/331233/3 https://forums.developer.nvidia.com/t/cuda-cant-initialize-after-upgrade/332770...

[Bug] TP=2 fails on dual RTX 5090: TorchInductor compile error or CUDA illegal memory access (TP=1 works)

I have the same issue, and I tried different kernel and driver versions. It loads and then crashes.

[Bug] TP=2 fails on dual RTX 5090: TorchInductor compile error or CUDA illegal memory access (TP=1 works)

I am running

```
docker run --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HUGGING_FACE_HUB_TOKEN=$HF_TOKEN" \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:v0.9.0 \
  --model Qwen/Qwen3-30B-A3B -tp 2
```

I see the memory being allocated and tensors being...