Rangehow

Results: 12 comments by Rangehow

+1, it would be great if a metric for measuring throughput could be added to the logs.

Similar issue in text processing:

```python
from functools import partial

import datasets
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_dir[args.model])
train_dataset = datasets.load_from_disk(dataset_dir[args.dataset], keep_in_memory=True)['train']

# Tokenize the whole dataset in parallel and drop the raw columns
train_dataset = train_dataset.map(
    partial(dname2func[args.dataset], tokenizer=tokenizer),
    batched=True,
    num_proc=50,
    remove_columns=train_dataset.features.keys(),
    desc='tokenize',
    keep_in_memory=True,
)
```

After this, `train_dataset` will be like:

```python
Dataset({
    features: ['input_ids', 'labels'],
    num_rows: 51760
})
```

In which input_ids and...

I found the possible cause: in HF we can use either `model.add_adapter(lora_config)` or `model = get_peft_model(model, lora_config)` to convert a model into a PEFT model, but the former causes the error while the latter...
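
For reference, a minimal sketch of the two ways to attach a LoRA adapter mentioned above; the base model name and the LoRA hyperparameters below are placeholder assumptions, not taken from the original code:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model and LoRA settings, for illustration only
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])

# Former: attach the adapter in place via transformers' PEFT integration
model.add_adapter(lora_config)

# Latter: wrap the model into a PeftModel instead
# model = get_peft_model(model, lora_config)
```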

Thanks for your answer :) Is there some reason why the latter implementation is widely used in code instead of the former one?

Open your data and check whether, because of a line break or something similar, it contains only a single entry; I remember someone reported this bug before. This faiss index type has a minimum data requirement before it can be trained.
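
As a rough illustration only (assuming an IVF-style index, one of the faiss index types that must be trained), the training set needs at least on the order of `nlist` vectors, so a dataset collapsed into a single row cannot train it; the dimension and `nlist` below are assumptions:

```python
import numpy as np
import faiss

d = 128        # vector dimension (assumed)
nlist = 100    # number of IVF cells (assumed)

# Toy vectors standing in for the real datastore
xb = np.random.rand(5000, d).astype("float32")

quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Training needs enough points to cluster into nlist cells; with only one row
# (e.g. a file that was never split on newlines) faiss cannot train the index.
assert xb.shape[0] >= nlist, "not enough vectors to train this index type"
index.train(xb)
index.add(xb)
```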

I don't quite remember the exact flow of the source code, but judging from the error log alone, the vectors in your faiss index and self.mem_feat_or_feat_maker are not the same set of vectors: the number of vectors in the faiss index exceeds the number of vectors you hold in memory, so the lookup goes out of bounds and raises the error. You could consider adding a corresponding check.
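
A minimal sketch of such a check, where `index` and `mem_feat` are hypothetical stand-ins for the faiss index and the in-memory features behind `self.mem_feat_or_feat_maker`:

```python
# Number of vectors stored in the faiss index vs. held in memory
n_index = index.ntotal
n_mem = len(mem_feat)

# If the index contains more vectors than memory holds, retrieved ids
# can point past the end of mem_feat and trigger the out-of-bounds error.
assert n_index <= n_mem, (
    f"faiss index has {n_index} vectors but only {n_mem} are in memory"
)
```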

Although this issue is quite old, I have added a simple but comprehensive introduction; anyone interested can take a look: https://zhuanlan.zhihu.com/p/695202364

> Try this:
>
> curl https://r.jina.ai/https://www.neu.edu.cn/xygk/lrld.htm -H 'x-respond-with: markdown'

I don't have a computer at hand right now; I will try this tomorrow. However, just visiting that URL still gives an incomplete result, and when I revisited it I couldn't even get in.

![Screenshot_2024-05-15-18-45-45-175_com.android.chrome.jpg](https://github.com/jina-ai/reader/assets/88258534/85676344-10ba-45ea-ba4a-b532a0a29088)
![Screenshot_2024-05-15-18-49-00-773_com.android.chrome.jpg](https://github.com/jina-ai/reader/assets/88258534/5401a4e8-50cb-44d0-a63b-a192d0682641)

However, the original URL can be accessed normally:

![Screenshot_2024-05-15-18-49-09-039_com.android.chrome.jpg](https://github.com/jina-ai/reader/assets/88258534/bc0bdc83-edaa-4395-a1ee-a5883290f89a)