aiEngineer

Has the 78k evolved code-instruction dataset been open-sourced, along with the code that generated it?
[Bug] On Windows 11, after downloading the app and filling in the API key, clicking on the selected text does not trigger any translation.
Dear Authors, you have undoubtedly done an excellent job (domain-specific post-pretraining). But I have a small question about the size of the free-law data used in the original paper. I...
Hello, if I don't have root permissions to update GLIBC, how can I solve this problem?
Hello, I was wondering about params.n_chunks. What does the n_chunks variable do, and what does it have to do with the length of the sequence?
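Assuming this refers to the n_chunks parameter used by llama.cpp's perplexity example, the usual idea is that the tokenized input is split into chunks of n_ctx tokens, so the number of available chunks grows with the sequence length, and n_chunks merely caps how many of them are processed. A minimal Python sketch of that relationship (names like n_ctx, n_chunks, and split_into_chunks are illustrative assumptions, not llama.cpp source):

```python
# Illustrative sketch: how an n_chunks-style cap interacts with sequence length.
from typing import List, Sequence


def split_into_chunks(tokens: Sequence[int], n_ctx: int, n_chunks: int = -1) -> List[Sequence[int]]:
    """Split a token sequence into n_ctx-sized chunks.

    The number of available chunks is len(tokens) // n_ctx, which is why the
    chunk count depends on the sequence length; n_chunks (if >= 0) only caps
    how many of those chunks are actually processed.
    """
    total = len(tokens) // n_ctx            # chunks the sequence can supply
    if n_chunks >= 0:
        total = min(total, n_chunks)        # user-imposed cap
    return [tokens[i * n_ctx:(i + 1) * n_ctx] for i in range(total)]


if __name__ == "__main__":
    tokens = list(range(10_000))            # a 10k-token sequence
    chunks = split_into_chunks(tokens, n_ctx=512, n_chunks=4)
    print(len(chunks))                      # 4: capped by n_chunks, not by length
```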
Hello Authors, undoubtedly you have done excellent work on incremental learning for LLMs. (All mentions of "llama" below refer to llama2.) But I have a question about llama-pro. llama-pro adds 8 identity blocks on top of llama and then performs full-parameter training on a general corpus, so I can assume that by that point the 8 identity blocks have degenerated into blocks similar to the other 32 layers, and llama-pro has simply become an ordinary 8.3B llama. In the subsequent code and math pre-training, you merely freeze the 32 blocks of llama-8.3b and fine-tune only the 8 blocks (of course, these 8 blocks remain at the positions where they were inserted). Therefore, the authors should set up an experiment that freezes 24 blocks of llama-7b and fine-tunes 8 blocks (in order, unfreezing one block for every 3 frozen blocks), which would show whether it is actually necessary to add 8 new blocks for fine-tuning.
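A minimal PyTorch sketch of the ablation proposed above (not the authors' code): freeze 24 of llama-7b's 32 decoder blocks and fine-tune every 4th block in place, instead of adding 8 new identity blocks. It assumes a Hugging Face LlamaForCausalLM whose decoder layers live in model.model.layers, and uses a tiny random-weight config so it runs without downloading checkpoints; the layer-selection logic is the part that matters.

```python
from transformers import LlamaConfig, LlamaForCausalLM


def freeze_all_but_every_fourth_block(model: LlamaForCausalLM) -> list:
    """Freeze everything, then unfreeze one decoder block out of every four."""
    for p in model.parameters():
        p.requires_grad = False

    trainable = []
    for idx, layer in enumerate(model.model.layers):
        # "freeze 3 blocks, open 1": unfreeze layers 3, 7, 11, ... (8 of 32)
        if idx % 4 == 3:
            for p in layer.parameters():
                p.requires_grad = True
            trainable.append(idx)
    return trainable


if __name__ == "__main__":
    # Tiny stand-in config with 32 layers, same depth as llama-7b.
    cfg = LlamaConfig(hidden_size=64, intermediate_size=128,
                      num_hidden_layers=32, num_attention_heads=4,
                      num_key_value_heads=4, vocab_size=1000)
    model = LlamaForCausalLM(cfg)
    print(freeze_all_but_every_fourth_block(model))  # 8 trainable block indices
```

The contrast with llama-pro is that there the 8 trainable blocks are newly inserted identity-initialized copies, whereas here 8 existing blocks are updated in place, which is exactly the comparison the question asks for.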