罗皓天 (Haotian Luo)

Results: 17 comments of 罗皓天 (Haotian Luo)

Also interested. Is the pretrained model available now?

I've reproduced this work recently. The results on the ROC dataset look really good.

I'm worried that this quick fix might harm the model's capability. Is there another way to fix this problem?

> > I forked the install script and modified it; you can install with my one-click install script: `sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/shadow-boy/MonkeyDev/master/bin/md-install)"`
> >
> > For Xcode 14.3, you need to replace the middle path in lines 355 and 365 with /Developer/Library/Xcode/Plug-ins/XCBSpecifications.ideplugin ![image](https://github.com/AloneMonkey/MonkeyDev/assets/74357444/f48996e1-a258-4a83-8dfb-201c63611317)
>
> Then it installs successfully.

Try `npm audit fix --force` after `npm install`; that helped me.

My issue may be helpful to you: https://github.com/rktamplayo/PlanSum/issues/3.

I've solved this issue. Line 234, `sum_tokens[token_ids] += tokens[tindex]`, is potentially non-deterministic. You should rewrite it to accumulate explicitly, or set `torch.use_deterministic_algorithms(True)`, to avoid this eval fluctuation.
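For context, a minimal sketch of why in-place indexed addition is a source of run-to-run fluctuation when indices repeat, and one deterministic rewrite. The tensor names here are illustrative, not taken from the original code:

```python
import torch

# With duplicate indices, `a[idx] += vals` does NOT accumulate:
# it gathers a[idx], adds vals, then scatter-writes back, so only
# one of the conflicting writes to a[0] survives, and which one
# is not guaranteed.
a = torch.zeros(3)
idx = torch.tensor([0, 0, 2])
vals = torch.tensor([1.0, 2.0, 5.0])
a[idx] += vals  # a[0] ends up 1.0 or 2.0, never 3.0

# `index_add_` accumulates every contribution instead; on CUDA,
# pair it with torch.use_deterministic_algorithms(True) for
# fully reproducible results.
b = torch.zeros(3)
b.index_add_(0, idx, vals)  # b[0] == 3.0, b[2] == 5.0
```

The same pitfall exists in NumPy (`np.add.at` is the accumulating counterpart there).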

I'm not using accelerate or your script; I'm just using the model as a `LlamaForCausalLM` object with bitsandbytes quantization for inference. But I don't think that would cause the problem.

```
import torch
import sys
import random
import numpy as np
from transformers import LlamaTokenizer, LlamaForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    # bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

random.seed(0)
np.random.seed(0)
torch.manual_seed(0)
...
```

```
Nvidia driver version: 525.125.06
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte...
```