AlexanderZhx
This could be solved fairly easily with something like this (highlighted pseudocode) in `llm_studio/app_utils/sections/experiment.py`.
> Is this happening only when using the weights from an old experiment to continue training with "Use previous experiment weights"?

No, this is happening with a "freshly" trained model, ...
> Thank you for reporting. When pushing the model to huggingface hub or downloading from the UI, these weights will be automatically sharded into smaller chunks (by default safetensors with...
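For context on the sharding behavior mentioned above, here is a minimal sketch of how a checkpoint ends up split into safetensors chunks, assuming the Hugging Face `transformers` `save_pretrained` API (the model name and shard size are illustrative, not LLM Studio's actual defaults):

```python
# Minimal sketch: sharding a checkpoint into safetensors chunks.
# Assumes the Hugging Face `transformers` save_pretrained API; the
# model name and shard size are illustrative, not LLM Studio's defaults.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # any causal LM

# safe_serialization=True writes .safetensors files; max_shard_size
# controls the chunk size, so a large checkpoint becomes several
# model-0000X-of-0000N.safetensors files plus an index JSON.
model.save_pretrained(
    "sharded_checkpoint",
    safe_serialization=True,
    max_shard_size="2GB",
)
```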
> Sorry, can't fully follow. So you are using a local model or a model from Huggingface to start your Experiment?

Starting with a model from HF.

> And what...
Would be awesome! Looking forward to it.
Same. Phind-codellama in fp16 from the repository works; the same model loaded from a GGUF doesn't, though (Ubuntu, Docker, A100, ollama 0.1.37).
Same. Phind-codellama from the repository works; loading from a GGUF doesn't, though.
Same when using the OpenAI API.
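To make the report reproducible, here is a minimal sketch of the call path being described, assuming ollama's OpenAI-compatible endpoint (the base URL is ollama's default; the model tag and prompt are illustrative):

```python
# Minimal sketch: querying an ollama-served model through its
# OpenAI-compatible endpoint. The model tag and prompt are illustrative.
from openai import OpenAI

# ollama exposes an OpenAI-compatible API under /v1; the api_key is
# required by the client library but ignored by ollama.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="phind-codellama",  # fails when the model was created from a GGUF
    messages=[{"role": "user", "content": "Write hello world in Python."}],
)
print(response.choices[0].message.content)
```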
> Edit your post to include your config.json please (just Ctrl+F to replace your username). This looks like a known bug that occurs when using offload, but I'm not...
Yes, setting these two variables to false manually in config.json does work, and the model gets successfully unloaded from the GPU. I haven't figured out a way to turn them off via...
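For anyone else hitting this, a minimal sketch of the manual workaround described above, assuming hypothetical flag names since the exact keys aren't visible in this excerpt:

```python
# Minimal sketch of the manual config.json workaround described above.
# The two key names are HYPOTHETICAL placeholders; substitute the actual
# offload-related keys from your own config.json.
import json
from pathlib import Path

config_path = Path("config.json")
config = json.loads(config_path.read_text())

# Flip both offload-related flags to false so the model can be
# unloaded from the GPU cleanly.
for key in ("offload_flag_a", "offload_flag_b"):  # hypothetical names
    config[key] = False

config_path.write_text(json.dumps(config, indent=2))
```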