Haodong Ray

Results 22 comments of Haodong Ray

An answer in English has solved this problem.

I use the dataset given by the author, but the bug still occurs. I don't understand which version of the environment I should change. Could you give more detail about your solution?

Oh, no. Bad news for me. [crying]

Hi. I have another question about the decoder phase. I printed `hidden_states.shape` at https://github.com/LargeWorldModel/LWM/blob/f45d2b70bda27abfa9cf32e228916b2883801366/lwm/llama.py#L977 and I find that the result is sometimes `(512, 8192, 4096)` and sometimes `(2, 385, 4096)`...
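One possible explanation (my assumption, not confirmed from the LWM code) is that the two shapes come from different stages of generation: during prefill the whole prompt passes through the decoder at once, while during incremental decoding only the newest token does, so the sequence axis changes between calls. A minimal numpy sketch of the two cases:

```python
import numpy as np

# Hypothetical illustration (not the actual LWM code): why the
# decoder's hidden_states shape can differ between calls.
batch, hidden = 2, 4096

# Prefill: the full prompt is processed in one pass, so the
# sequence axis equals the prompt length.
prompt_len = 385
prefill_hidden = np.zeros((batch, prompt_len, hidden))

# Incremental decoding: only the single newest token is processed,
# so the sequence axis shrinks to 1 while the KV cache holds history.
decode_hidden = np.zeros((batch, 1, hidden))

print(prefill_hidden.shape)  # (2, 385, 4096)
print(decode_hidden.shape)   # (2, 1, 4096)
```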

Hi. I am not the author of this paper, but it seems you may not know that the size of the KV cache can be kept stable. You just need to change the...

In fact, even if you solve this problem, you will find that the key "lm_head.weight" does not exist, so this code may be custom-made for a specific model. ![image](https://github.com/user-attachments/assets/a29f0579-03a7-4c3b-85c0-c9f0843574e4)

Hi. Could you share your $llama_tokenizer_path setting? Why is there a ".vocab_file"?

Why is the technical documentation on Feishu gone? ![Image](https://github.com/user-attachments/assets/82442229-15db-439e-bd7f-c2c245e66dd5)

I raised this question because I found that the model follows this JSON.

My transformers==4.40.0