nathanodle issues

Results: 8 issues of nathanodle

Segmentation Fault w/ ROCm

Running LLaVA-Lightning-MPT-7B-preview, I get a core dump on the model worker. Using the latest Torch 2.0 with ROCm 5.4.2 on an AMD Radeon 7900 XTX. Tried demo images and user images with various...

Support Torch >2.0

I would like to use Torch >2.0 for use on machines running CUDA 12. I'm not sure which dependency requires Torch

Will you support Intel Arc?

I’m curious whether you will support Arc. Neural Compressor would particularly benefit those platforms! Thanks!

Offer to retrain checkpoint

I'd like to retrain the checkpoint with a larger dataset, but I'm not sure how your training script modifications differ from the lifeiteng repo. Would you be up for discussing...

I am getting an error asking for a Hugging Face token when trying to use -b. The repo does not require a token, and no token I give it enables the download.

vLLM freezes with gpu-memory-utilization > 0.55

Running vLLM according to the instructions. Docker segfaults at startup, so I'm running directly on the machine. Starting the server with the following shell script. As you can see I've tried to...

user issue

vllm converting model to sym_int4 even when --load-in-low-bit sym_int4 not set

Following the directions at https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/vLLM_quickstart.html but omitting the --load-in-low-bit sym_int4 line, the model is still reported as being converted, and the model weight memory reflects this. I do not want...

user issue

Invalid output and errors using model = ipex.optimize(model): split master weight unsupported, Conv BatchNorm folding failed, Linear BatchNorm folding failed

Hi, I'm trying to run inference with a pretrained OFA (OFA-huge) model according to these instructions: https://github.com/OFA-Sys/OFA/blob/feature/add_transformers/transformers.md This runs fine on both CPU and CUDA, but using XPU results in gibberish....

ARC

Correctness