transformers-code issues


Sorry to bother you. While practicing the example I ran into the following problem and got completely stuck. It looks like something went wrong while saving, but it is far beyond what a beginner can handle, so I'm asking for help.

```python
trainer.train()
```

---

```console
-> [2749](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/transformers/modeling_utils.py:2749) safe_save_file(shard, os.path.join(save_directory, shard_file), metadata={"format": "pt"})
......
--> [492](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/safetensors/torch.py:492) "data": _tobytes(v, k),
......
[405](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/safetensors/torch.py:405) if not tensor.is_contiguous():
--> [406](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/safetensors/torch.py:406) raise ValueError(
[407](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/safetensors/torch.py:407)...
```
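The traceback above stops at the `tensor.is_contiguous()` check in safetensors, which suggests that at least one parameter is stored as a non-contiguous view when the checkpoint is written. A minimal sketch of one common workaround follows, assuming the model is an ordinary `torch.nn.Module`. The helper name `make_contiguous` and the toy `nn.Linear` layer are hypothetical stand-ins, not part of the course code.

```python
import torch
from torch import nn


def make_contiguous(model: nn.Module) -> int:
    """Replace every non-contiguous parameter with a contiguous copy.

    Returns the number of parameters that were rewritten.
    """
    fixed = 0
    for param in model.parameters():
        if not param.data.is_contiguous():
            param.data = param.data.contiguous()
            fixed += 1
    return fixed


# Toy stand-in for a real model (hypothetical): give the weight a transposed
# view so that it is non-contiguous, mimicking the state that triggers the error.
layer = nn.Linear(4, 3)
layer.weight.data = torch.randn(4, 3).t()
was_contiguous_before = layer.weight.is_contiguous()

n_fixed = make_contiguous(layer)
```

In a real script the helper would be called on the model before `trainer.train()`, so that every checkpoint save sees contiguous tensors. Whether this is the right fix depends on why the tensors became non-contiguous in the first place.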

smithlai

[Questions] About the model head and LLM training datasets

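Hello, I have two questions.

The first is about `AutoModelForCausalLM.from_pretrained`. I know this was covered in the with/without head chapter of the basic course: AutoModel essentially loads the underlying BERT and outputs hidden states, while ForCausalLM adds something like a fully connected layer on top to produce logits for the downstream task. I can understand this for BERT, since it just embeds text into hidden states and has no downstream task of its own. But for bloom, does AutoModelForCausalLM also add a head on top of it? This is where I keep getting stuck. Or, the other way round, for a model that already has a downstream task, such as llama2-instruct, what is the difference between loading it with AutoModelForCausalLM and with AutoModel? Does AutoModel load the complete model, or does it cut off the head? Or does AutoModelForCausalLM add yet another new head layer?

The second question is about training data: is it better to use causal data (['input_ids'] + DataCollatorForCausalLM) or seq2seq data (['input_ids'] + ['label'] + DataCollatorForSeq2Seq)? What does the choice depend on? Does it depend on the model, so that gpt-2 should use causal data and T5 should use seq2seq data? Or does it depend on the task and the desired result, for example using seq2seq when you want the model to reproduce exactly the same answer? I ask because I saw someone else's code [04-Gemma-2-9b-it peft...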
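To make the second question concrete, here is a minimal pure-Python sketch of how the labels differ between the two data styles. It assumes the "causal" collator in the post means `DataCollatorForLanguageModeling` with `mlm=False`, which copies `input_ids` into the labels, and that `DataCollatorForSeq2Seq` pads separately supplied labels with `-100`. The token ids, `PAD_ID`, and all function names below are hypothetical, and the real collators operate on tensors rather than lists.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss
PAD_ID = 0           # hypothetical pad token id, for illustration only


def causal_lm_labels(input_ids, pad_token_id=PAD_ID):
    """Causal style: labels are a copy of input_ids, with padding ignored."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]


def seq2seq_pad_labels(label_batch):
    """Seq2seq style: labels are given separately and padded with IGNORE_INDEX."""
    max_len = max(len(labels) for labels in label_batch)
    return [labels + [IGNORE_INDEX] * (max_len - len(labels)) for labels in label_batch]


def prompt_masked_example(prompt_ids, answer_ids):
    """Instruction-tuning pattern for decoder-only models:
    the loss is computed on the answer tokens only."""
    input_ids = prompt_ids + answer_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + answer_ids
    return input_ids, labels


causal = causal_lm_labels([11, 12, 13, PAD_ID])
padded = seq2seq_pad_labels([[21, 22, 23], [24]])
input_ids, labels = prompt_masked_example([11, 12], [21, 22])
```

Under these assumptions the choice is less about the model family than about which tokens should contribute to the loss: the causal style trains on every token, while explicit labels let a decoder-only model be trained on the answer portion only.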

smithlai

Error computing f1_metric in the evaluation function of a binary classification model

02-NLP Tasks ---> 08-transformers_slolution ---> classificaion_demo.ipynb

Problem location: f1 = f1_metric.compute(predictions=predictions, references=labels)

Error message: return {"f1": float(score) if score.size == 1 else score} AttributeError: 'float' object has no attribute 'size'

Root cause analysis: def...
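The error message indicates that `score` arrives as a plain Python `float`, which has no `.size` attribute. As a minimal sketch of a workaround, the F1 score for the binary case can be computed directly and wrapped in a way that tolerates both plain floats and array-like scores. The function names `binary_f1` and `to_metric_dict` and the sample data are hypothetical, and this is not the metric library's own implementation.

```python
def binary_f1(predictions, references, positive_label=1):
    """F1 for binary classification, computed from raw counts."""
    pairs = list(zip(predictions, references))
    tp = sum(1 for p, r in pairs if p == positive_label and r == positive_label)
    fp = sum(1 for p, r in pairs if p == positive_label and r != positive_label)
    fn = sum(1 for p, r in pairs if p != positive_label and r == positive_label)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def to_metric_dict(score):
    """Wrap a score without assuming it has a `.size` attribute."""
    size = getattr(score, "size", 1)  # plain floats fall back to size 1
    return {"f1": float(score) if size == 1 else score}


predictions = [1, 0, 1, 1, 0]
references = [1, 0, 0, 1, 1]
result = to_metric_dict(binary_f1(predictions, references))
```

The `getattr` fallback is the relevant design choice: it keeps the original behavior for array-like scores while accepting a plain `float`.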

xiongyan

Hello, what causes a NoneType error when loading a model from Hugging Face?

Hello! These are very detailed and professional videos! Thank you for the explanations, which introduced a beginner like me to large models! My advisor wants me to use someone else's model on HF to run predictions on our own data. The model is `InstaDeepAI/nucleotide-transformer-500m-human-ref`, but we need to fine-tune it on the [dataset](https://huggingface.co/datasets/InstaDeepAI/nucleotide_transformer_downstream_tasks) from its downstream tasks. The authors do provide a corresponding [notebook](https://github.com/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling_with_peft.ipynb), but I get an error when running the cell that loads the dataset:

```
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[7], line 5
3 # Load the promoter dataset from the InstaDeep Hugging Face ressources
4 dataset_name =...
```

nanu23333

Error "Tensors must be contiguous" when running Trainer with accelerate + deepspeed

When running your ddp_trainer.py:

    accelerate launch --config_file defualt_config.yaml ddp_trainer.py

    Traceback (most recent call last):
    [rank0]: File "/root/LLM_test/ddp_trainer.py", line 105, in <module>
    [rank0]: trainer.train()
    [rank0]: File "/root/anaconda3/envs/liuyu_llm/lib/python3.10/site-packages/transformers/trainer.py", line 2164, in train
    [rank0]: return inner_training_loop(
    [rank0]:...

liuyu611