Tianhao Cheng
@lekurile @jeffra @HeyangQin Following https://github.com/microsoft/DeepSpeed/issues/2876 , I tried loading the model in FP16 and then setting dtype=torch.int8 in init_inference, but it still fails: ...
https://github.com/microsoft/DeepSpeed/issues/2865 mentions the same problem.
@HeyangQin Loading the BLOOM model from an FP16 checkpoint and then setting dtype=int8 in init_inference does not work :( Could you please answer this issue: https://github.com/microsoft/DeepSpeed/issues/2923 , and I found some people...
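For reference, a minimal sketch of the setup these reports describe, assuming DeepSpeed's public `init_inference` API; the helper name `int8_inference_kwargs` and the `mp_size` default are illustrative, not from the original thread:

```python
import torch

def int8_inference_kwargs(mp_size=1):
    """Arguments the reports above pass to deepspeed.init_inference:
    the checkpoint itself is loaded in FP16, and torch.int8 is only
    requested at init_inference time."""
    return dict(
        mp_size=mp_size,                  # tensor-parallel degree
        dtype=torch.int8,                 # quantize at inference init
        replace_with_kernel_inject=True,  # use DeepSpeed inference kernels
    )

def build_int8_engine(model):
    # Requires a CUDA-enabled DeepSpeed install; this call is the one
    # the linked issues report as failing for FP16-loaded checkpoints.
    import deepspeed
    return deepspeed.init_inference(model, **int8_inference_kwargs())
```

Whether this path works appears to depend on the DeepSpeed version and model; the issues linked above track the failure.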
I'm hitting the same problem.
+1, I hit the same problem here. I fine-tuned LLaVA with LoRA and want to run inference with ``` python -m llava.serve.cli --model-path /root/code/LLaVA/checkpoints/llava-v1.5-13b-lora --image-file /root/code/LLaVA/pic.png --model-base FlagAlpha/Llama2-Chinese-7b-Chat...
-200 is the image token; don't change that. Is the problem with the weights? Have you tried LoRA, and does it work well?
Nice idea! We'll develop this feature in the near future.
Got it, good idea! We'll think about how to resume an experiment.