Rangehow

Results: 12 comments by Rangehow

+1, it would be great if a metric for measuring throughput could be added to the logs.

Similar issue in text processing:

```python
from functools import partial

import datasets
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_dir[args.model])
train_dataset = datasets.load_from_disk(dataset_dir[args.dataset], keep_in_memory=True)['train']

# Tokenize the whole dataset in parallel and drop the raw columns
train_dataset = train_dataset.map(
    partial(dname2func[args.dataset], tokenizer=tokenizer),
    batched=True,
    num_proc=50,
    remove_columns=train_dataset.features.keys(),
    desc='tokenize',
    keep_in_memory=True,
)
```

After this, `train_dataset` will be like:

```python
Dataset({
    features: ['input_ids', 'labels'],
    num_rows: 51760
})
```

In which input_ids and...

I found the possible cause: in HF we can use either `model.add_adapter(lora_config)` or `model = get_peft_model(model, lora_config)` to convert a model into a PEFT model, but the former causes the error while the latter...
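
For reference, a minimal sketch of the two ways to attach a LoRA adapter mentioned above; the base model name and the LoRA hyperparameters below are placeholder assumptions, not taken from the original code:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model and LoRA settings, for illustration only
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])

# Former: attach the adapter in place via transformers' PEFT integration
model.add_adapter(lora_config)

# Latter: wrap the model into a PeftModel instead
# model = get_peft_model(model, lora_config)
```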

Thanks for your answer :) Is there some reason why the latter implementation is widely used in code instead of the former one?

Open your data and check whether, because of a line break or something similar, it contains only a single entry; I remember someone reported this bug before. This faiss index type has a minimum data requirement before it can be trained.
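
As a rough illustration only (assuming an IVF-style index, one of the faiss index types that must be trained), the training set needs at least on the order of `nlist` vectors, so a dataset collapsed into a single row cannot train it; the dimension and `nlist` below are assumptions:

```python
import numpy as np
import faiss

d = 128        # vector dimension (assumed)
nlist = 100    # number of IVF cells (assumed)

# Toy vectors standing in for the real datastore
xb = np.random.rand(5000, d).astype("float32")

quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Training needs enough points to cluster into nlist cells; with only one row
# (e.g. a file that was never split on newlines) faiss cannot train the index.
assert xb.shape[0] >= nlist, "not enough vectors to train this index type"
index.train(xb)
index.add(xb)
```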

I don't quite remember the exact flow of the source code, but judging from the error log alone, the vectors in your faiss index and self.mem_feat_or_feat_maker are not the same set of vectors: the number of vectors in the faiss index exceeds the number of vectors you hold in memory, so the lookup goes out of bounds and raises the error. You could consider adding a corresponding check.
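
A minimal sketch of such a check, where `index` and `mem_feat` are hypothetical stand-ins for the faiss index and the in-memory features behind `self.mem_feat_or_feat_maker`:

```python
# Number of vectors stored in the faiss index vs. held in memory
n_index = index.ntotal
n_mem = len(mem_feat)

# If the index contains more vectors than memory holds, retrieved ids
# can point past the end of mem_feat and trigger the out-of-bounds error.
assert n_index <= n_mem, (
    f"faiss index has {n_index} vectors but only {n_mem} are in memory"
)
```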

Although this issue is quite old, I have added a simple but comprehensive introduction; anyone interested can take a look: https://zhuanlan.zhihu.com/p/695202364

> Try this:
>
> curl https://r.jina.ai/https://www.neu.edu.cn/xygk/lrld.htm -H 'x-respond-with: markdown'

I don't have a computer at hand right now; I will try this tomorrow. However, just visiting that URL still gives an incomplete result, and when I revisited it I couldn't even get in.

![Screenshot_2024-05-15-18-45-45-175_com.android.chrome.jpg](https://github.com/jina-ai/reader/assets/88258534/85676344-10ba-45ea-ba4a-b532a0a29088)
![Screenshot_2024-05-15-18-49-00-773_com.android.chrome.jpg](https://github.com/jina-ai/reader/assets/88258534/5401a4e8-50cb-44d0-a63b-a192d0682641)

However, the original URL can be accessed normally:

![Screenshot_2024-05-15-18-49-09-039_com.android.chrome.jpg](https://github.com/jina-ai/reader/assets/88258534/bc0bdc83-edaa-4395-a1ee-a5883290f89a)