Richard Li
 I use the following command to start 2 processes, but the model loads on only one device: `CUDA_VISIBLE_DEVICES="2,1" torchrun --nproc_per_node 2 example.py --ckpt_dir LLaMA/13B --tokenizer_path LLaMA/tokenizer.model`
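A minimal sketch of one likely cause (an assumption, not confirmed from the post): `torchrun` sets a `LOCAL_RANK` environment variable for each worker, and if the script never reads it to pick a device, every rank defaults to `cuda:0`. With `CUDA_VISIBLE_DEVICES="2,1"`, local rank 0 sees physical GPU 2 and local rank 1 sees physical GPU 1:

```python
import os

# Hedged sketch: torchrun exports LOCAL_RANK (0..nproc_per_node-1) per worker.
# For a real run this line is a no-op; it only makes the sketch runnable alone.
os.environ.setdefault("LOCAL_RANK", "0")

local_rank = int(os.environ["LOCAL_RANK"])
# Each rank should pin itself to its own device, e.g. model.to(device) or
# torch.cuda.set_device(local_rank); otherwise all ranks share cuda:0.
device = f"cuda:{local_rank}"
print(device)
```

If the example script already calls `torch.distributed.init_process_group` but loads weights with a hard-coded device, moving the load onto `device` per rank is the usual fix.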
The code is mostly modified from [LLaVA](https://github.com/haotian-liu/LLaVA)
Hello, I noticed that the 33B model was pretrained with a sequence length of 512. If I continue with instruction finetuning on top of it, and the finetuning data is longer than 512 tokens, will that cause any problems?
What should I do if I want to use tensor_parallel with a GPTQ-quantized model ([Llama-2-7b-Chat-GPTQ](https://huggingface.co/4bit/Llama-2-7b-Chat-GPTQ), for example) to run inference on 2 or more GPUs? Currently, I am using AutoGPTQ to...