Richard Li: 4 issues

![image](https://user-images.githubusercontent.com/16338370/224275035-cb212837-7a32-4beb-b121-cb5dc8539218.png) I used the following command to start 2 processes, but the model loads on only one device: `CUDA_VISIBLE_DEVICES="2,1" torchrun --nproc_per_node 2 example.py --ckpt_dir LLaMA/13B --tokenizer_path LLaMA/tokenizer.model`
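For context: `torchrun` sets the `LOCAL_RANK` environment variable in each spawned process, and each rank must bind to its own logical device (typically via `torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))`) before loading weights, otherwise every rank defaults to device 0. Note that CUDA renumbers the devices listed in `CUDA_VISIBLE_DEVICES` starting from 0, so with `"2,1"` the ranks map back to physical GPUs 2 and 1. A minimal, library-free sketch of that mapping (function name is illustrative):

```python
def physical_gpu_for_rank(local_rank: int, visible: str) -> str:
    """Map a torchrun local rank to the physical GPU it should end up on.

    CUDA renumbers the devices in CUDA_VISIBLE_DEVICES from 0, so logical
    device i (used by local rank i) is the i-th entry of the visible list.
    """
    devices = visible.split(",")
    return devices[local_rank]

# With CUDA_VISIBLE_DEVICES="2,1" and --nproc_per_node 2:
for rank in range(2):
    print(f"local rank {rank} -> physical GPU {physical_gpu_for_rank(rank, '2,1')}")
# local rank 0 -> physical GPU 2
# local rank 1 -> physical GPU 1
```

If both ranks report the same GPU in practice, the script is likely never calling `set_device` per rank.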

The code is mostly modified from [LLaVA](https://github.com/haotian-liu/LLaVA)

Hello, I noticed that the 33B model was pretrained with a sequence length of 512. If I continue with instruction finetuning on top of it, will finetuning data longer than 512 tokens cause problems?

What should I do if I want to use tensor_parallel with a GPTQ-quantized model ([Llama-2-7b-Chat-GPTQ](https://huggingface.co/4bit/Llama-2-7b-Chat-GPTQ), for example) for inference on 2 or more GPUs? Currently, I am using AutoGPTQ to...
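As background on what tensor parallelism does, independent of any library: each weight matrix is sharded across GPUs, each device computes a slice of the output, and the slices are combined (an all-gather). A toy, pure-Python illustration of column-parallel splitting, not the actual GPTQ kernels, with plain lists standing in for GPU tensors:

```python
def matvec(weight_cols, x):
    """y = W @ x, with W given as a list of output-feature columns."""
    return [sum(w * xi for w, xi in zip(col, x)) for col in weight_cols]

def column_parallel_matvec(weight_cols, x, n_devices):
    """Same matvec, but output features sharded across n_devices.

    In a real setup each loop iteration runs on its own GPU and the
    final concatenation is an all-gather across devices.
    """
    shard = (len(weight_cols) + n_devices - 1) // n_devices
    out = []
    for d in range(n_devices):
        cols = weight_cols[d * shard:(d + 1) * shard]  # this device's shard
        out.extend(matvec(cols, x))
    return out

W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]  # 4 output features
x = [3.0, 4.0]
assert column_parallel_matvec(W, x, 2) == matvec(W, x)  # sharding preserves the result
```

The complication with GPTQ is that the quantized weights are packed and tied to per-group scales/zeros, so naive sharding of the packed tensors does not work; whether a given tensor-parallel library handles that packing is the real question to check.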