WuNein
@CatherineSue https://github.com/CatherineSue/vllm/blob/1c508625a85449c83c8fc1f2f99d78e7035fcbb6/vllm/model_executor/layers/pooler.py#L13 Last-token pooling is good, but some models use mean pooling as well. "mean", "max", "cls", "weightedmean", and "lasttoken" are all important. https://github.com/xlang-ai/instructor-embedding/blob/5cca65eb0ed78ab354b086a5386fb2c528809caa/InstructorEmbedding/instructor.py#L68
Another question: this push only solves Llama's embedding problem. https://github.com/vllm-project/vllm/blob/1c508625a85449c83c8fc1f2f99d78e7035fcbb6/vllm/model_executor/models/llama_embedding.py#L29 Is support for other models still an open question?
@CatherineSue For me, returning the last hidden state is everything; pooling can be implemented separately. The solution you offer is harder to apply to other models.
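For illustration, the pooling strategies named above could be sketched roughly like this. This is a hypothetical NumPy sketch over a returned last hidden state, not vLLM's or InstructorEmbedding's actual implementation; the "weightedmean" weighting here is an assumed position-weighted mean.

```python
import numpy as np

def pool(hidden, mask, method="mean"):
    """Pool a (seq_len, hidden_dim) last hidden state into one embedding.

    `mask` is the attention mask (1 = real token, 0 = padding).
    """
    valid = hidden[mask.astype(bool)]          # drop padded positions
    if method == "mean":
        return valid.mean(axis=0)
    if method == "max":
        return valid.max(axis=0)
    if method == "cls":
        return hidden[0]                       # first token, e.g. [CLS]
    if method == "lasttoken":
        return valid[-1]                       # last non-padding token
    if method == "weightedmean":
        # assumed scheme: later tokens weighted more heavily
        w = np.arange(1, len(valid) + 1, dtype=float)
        return (valid * w[:, None]).sum(axis=0) / w.sum()
    raise ValueError(f"unknown pooling method: {method}")

# toy example: 3 real tokens + 1 padding token, hidden_dim = 2
hidden = np.array([[1., 2.], [3., 4.], [5., 6.], [0., 0.]])
mask = np.array([1, 1, 1, 0])
print(pool(hidden, mask, "mean"))       # [3. 4.]
print(pool(hidden, mask, "lasttoken"))  # [5. 6.]
```

The point is that every strategy only needs the raw hidden states plus the attention mask, which is why exposing the last hidden state is enough.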
> Hi author, when I continue instruction fine-tuning on my own dataset I run into a loss of 0. I would like to ask whether the author or anyone else has hit the same problem. The question, code, and training arguments are shown below. Code:
>
> ```
> tokenizer = LlamaTokenizer.from_pretrained("ziqingyang/chinese-alpaca-lora-7b", padding_side="left")
> base_model = LlamaForCausalLM.from_pretrained(
>     "decapoda-research/llama-7b-hf",
>     # load_in_8bit=True,
>     load_in_8bit=False,
>     torch_dtype=torch.float16,
>     device_map=device_map,
> )
> base_model.resize_token_embeddings(len(tokenizer))...
> ```
> blender works fine with vertex color models

But when I render the image, the color is missing.
I am rather confused too; consider [this](https://github.com/thunlp/OpenDelta/blob/4cb61b1dcc2032c002be2e5ed858a351a2cfcff0/opendelta/delta_models/prefix.py#L134). Since the attention mask is first fed to BartDecoder and then to BartDecoderLayer, the attention sequence length is always a problem. I deem...
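To make the length mismatch concrete, here is a minimal sketch (hypothetical shapes, not OpenDelta's code): when P prefix key/value vectors are prepended, as in prefix tuning, the attention mask must also be extended by P positions on the key side, otherwise the mask's length no longer matches the attention scores.

```python
import numpy as np

P = 5                                     # assumed number of prefix tokens
attn_mask = np.ones((2, 7), dtype=int)    # (batch, seq_len) original mask
prefix_mask = np.ones((2, P), dtype=int)  # prefix positions are never masked

# extend on the key side so the mask covers prefix + real tokens
extended_mask = np.concatenate([prefix_mask, attn_mask], axis=1)
print(extended_mask.shape)  # (2, 12)
```

If only one of the decoder or the decoder layer performs this extension, the other sees a mask of the wrong length, which is the kind of inconsistency the comment above is pointing at.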
In my case, I suspect the ONNX part is not complete yet. The ONNX model I export for YOLOv8-m is float16 (a small problem), and the output shape...
@phixerino, exactly. Considering the size of torch, it is not fully utilized in this case. My main goal is to adapt this ONNX file to a C# program. Modifying from...
I tested on my RTX 3090 and 4090 inside the NVIDIA Docker container; it is a required setup.
sm_90 is the H100; the RTX 3090 is sm_86 and the 4090 is sm_89. I had to specify the versions explicitly, or it will generate code for every SM, including ones I don't need. This should be noted in the README, I...
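A minimal sketch of how the architectures can be pinned, assuming a PyTorch CUDA-extension build (the exact build entry point in this project may differ):

```shell
# Restrict codegen to the GPUs actually used
# (RTX 3090 = sm_86 / 8.6, RTX 4090 = sm_89 / 8.9, H100 = sm_90 / 9.0)
# instead of letting the build compile for every architecture.
export TORCH_CUDA_ARCH_LIST="8.6;8.9"
python setup.py install

# Or, when invoking nvcc directly:
# nvcc -gencode arch=compute_86,code=sm_86 \
#      -gencode arch=compute_89,code=sm_89 ...
```

Pinning the list both shortens compile time and avoids shipping fatbin code for architectures no one runs.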