smithlai issues

Results: 5 issues of smithlai

Dependencies not working in langchain:samples

I tried to run the langchain:samples container with `./run.sh $(./autotag langchain:samples)`. According to the README, langchain:samples is based on: > Dependencies > build-essential cuda cudnn python tensorrt numpy cmake...

Chapter 07 - trainer fails to auto-save the model

Sorry to bother you. While working through the example I ran into the problem below and got completely stuck. It looks like something goes wrong while the checkpoint is being saved, but it is far beyond what I can debug as a beginner, so I would appreciate any help.

```python
trainer.train()
```

---

```console
-> [2749](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/transformers/modeling_utils.py:2749) safe_save_file(shard, os.path.join(save_directory, shard_file), metadata={"format": "pt"})
......
--> [492](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/safetensors/torch.py:492) "data": _tobytes(v, k),
......
[405](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/safetensors/torch.py:405) if not tensor.is_contiguous():
--> [406](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/safetensors/torch.py:406) raise ValueError(
[407](file:///C:/Users/aaa/AppData/Local/anaconda3/envs/perft/lib/site-packages/safetensors/torch.py:407)...
```
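The traceback stops at the `tensor.is_contiguous()` check in safetensors, which suggests that at least one tensor being saved is not stored contiguously in memory. A minimal sketch of one workaround that is sometimes used in this situation follows. The helper name `make_parameters_contiguous` is hypothetical, and because the traceback is truncated, whether this resolves the error in this issue is an assumption. The function only relies on the `parameters()`, `is_contiguous()` and `contiguous()` methods that PyTorch modules and tensors provide.

```python
def make_parameters_contiguous(model):
    """Re-lay out every non-contiguous parameter in place.

    `model` is assumed to be a torch.nn.Module (or any object whose
    parameters() yields objects with a tensor-like `.data` attribute).
    Returns the number of parameters that were changed.
    """
    fixed = 0
    for param in model.parameters():
        if not param.data.is_contiguous():
            # contiguous() returns a copy with a dense memory layout
            param.data = param.data.contiguous()
            fixed += 1
    return fixed
```

Under these assumptions, it would be called once on the model before `trainer.train()`, so that checkpoints written during training only see contiguous tensors.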

[Questions] About the model head and LLM training datasets

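Hello, I have two questions. The first is about "AutoModelForCausalLM.from_pretrained". I know this was covered in the with/without Head chapter of the basics course: using AutoModel extracts the underlying BERT and outputs hidden states, while using ForCausalLM adds something like a fully connected layer on top to produce logits for the downstream task. For BERT I can understand this, because BERT itself only embeds text into hidden states and has no downstream task. But in the case of bloom, does AutoModelForCausalLM also add another head on top of it? This is where I keep getting stuck. Or, conversely, for a model that already has a downstream task, such as llama2-instruct, what is the difference between loading it with AutoModelForCausalLM and loading it with AutoModel? Does AutoModel load the complete model, or does it cut off the head? Or does AutoModelForCausalLM add yet another new head layer?

The second question is about training data: is it better to use causal data (['input_ids']+DataCollatorForCausalLM) or seq2seq data (['input_ids']+['label']+DataCollatorFromSeq2Seq)? What does the choice depend on? Does it depend on the model, for example gpt-2 should use causal data and T5 should use seq2seq data? Or does it depend on the task and the desired result, for example using seq2seq when you want the model to reproduce exactly the same answer? I am asking because I saw someone else's code [04-Gemma-2-9b-it peft...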
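To make the second question concrete, the sketch below shows what the two data layouts look like as plain token-id lists. It does not say which one is better. The function names and toy token ids are hypothetical. In the transformers library the corresponding collators are `DataCollatorForLanguageModeling` (with `mlm=False`) and `DataCollatorForSeq2Seq`. The value `-100` is the label that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch cross-entropy by default


def causal_example(prompt_ids, answer_ids, mask_prompt=False):
    """Decoder-only layout: prompt and answer share one sequence.

    Labels are a copy of input_ids (the model shifts them internally).
    With mask_prompt=True the prompt positions do not contribute to the loss.
    """
    input_ids = list(prompt_ids) + list(answer_ids)
    labels = list(input_ids)
    if mask_prompt:
        labels[:len(prompt_ids)] = [IGNORE_INDEX] * len(prompt_ids)
    return {"input_ids": input_ids, "labels": labels}


def seq2seq_example(source_ids, target_ids):
    """Encoder-decoder layout: source and target are separate fields."""
    return {"input_ids": list(source_ids), "labels": list(target_ids)}


# Toy ids, purely illustrative
print(causal_example([1, 2, 3], [4, 5]))
print(causal_example([1, 2, 3], [4, 5], mask_prompt=True))
print(seq2seq_example([1, 2, 3], [4, 5]))
```

A decoder-only model such as gpt-2 consumes the first layout, and an encoder-decoder model such as T5 consumes the second. Masking the prompt in the causal layout is one way to train only on the answer tokens while still using a decoder-only model.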

Anyone successfully build python wheel on desktop?

### OS Platform and Distribution

mediapipe Docker on Ubuntu22.04

### Compiler version

gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

### Programming Language and version

python

### Installed using virtualenv? pip? Conda?(if python)

...

type:build/install

platform:docker

os:linux-non-arm

[Question] Gemma-2b answers keep looping

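Hello, while testing I noticed that the answers loop endlessly. This happens whether I use Gemma-2b, Gemma-1.1-2b, or their -it versions. It only improves with gemma2 (however, Gemma2 is not yet supported by the mediapipe converter, so it cannot be deployed to Android). Below is the result of running the sample code (Colab, T4). I ran it essentially unmodified; the only change is "num_train_epochs=1, ", because the T4 can only be used for 2 hours and one epoch takes about 80 minutes.

```python
from datasets import load_dataset
from random import randint

# Test on sample
prompt = pipe.tokenizer.apply_chat_template(eval_dataset[rand_idx]["messages"][:2], tokenize=False, add_generation_prompt=True)
...
```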
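When comparing models or generation settings for this kind of looping, it can help to detect the loop automatically instead of reading each output by hand. The helper below is a minimal sketch and is not part of the original sample code. The name `find_repeated_ngram` and its thresholds are hypothetical. It reports the first n-gram that occurs several times back to back in a token list.

```python
def find_repeated_ngram(tokens, n=4, min_repeats=3):
    """Return the first n-gram repeated min_repeats times in a row, else None."""
    span = n * min_repeats
    for start in range(len(tokens) - span + 1):
        gram = tokens[start:start + n]
        # Compare each following block of n tokens with the first one
        if all(tokens[start + i * n:start + (i + 1) * n] == gram
               for i in range(1, min_repeats)):
            return gram
    return None


looping = "I am fine . I am fine . I am fine .".split()
print(find_repeated_ngram(looping))
print(find_repeated_ngram("the cat sat on the mat".split(), n=2, min_repeats=2))
```

The `generate` method in transformers also accepts `repetition_penalty` and `no_repeat_ngram_size`, which are commonly tried for repetitive output. Whether they help in this case is an assumption, since the issue text is truncated.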