vip

Results 13 comments of vip

I get the same error: `IndexError: Invalid key: 22330 is out of bounds for size 0`

> Or could you try this? It is an alternative we can try, since I agree and believe the problem only exists when we don't have a shared file system. `pip install git+https://github.com/huggingface/transformers@muellerzr-multinode-save`

After updating the code, DeepSpeed starts the cluster, and the checkpoint saved from the node is named tmp-checkpoint-10, while the one on the host is checkpoint-10. After saving the...

> Shown as resolved, with the correct flag.

Has the problem been resolved?

To add: there were no errors when using ZeRO-2, but new errors appeared after training. Does it not support Mixtral? ![企业微信截图_1705731459109](https://github.com/OpenAccess-AI-Collective/axolotl/assets/119389127/11de8eab-3737-4578-9955-6efa3b30f770)

I am now resuming SFT training from a checkpoint and hitting this error again. I have configured these parameters: `use_reentrant: true` and `resume_from_checkpoint: /workspace/axolotl-main/checkpoint-5865` ![image](https://github.com/OpenAccess-AI-Collective/axolotl/assets/119389127/0ca9286c-7515-4dda-aea0-b71ba6632d57) ![image](https://github.com/OpenAccess-AI-Collective/axolotl/assets/119389127/8cc1457a-1f52-4a64-83cc-6b83056339da)

Does Mixtral support AWQ 4-bit?

> Do you encounter the same issue on LLaMA 2-70B?

The current test is on Llama 3; Llama-2-70B has not been tested before. Is this related to int4/AWQ? FP16 works normally.

> I have the same problem, but I still don't know how to solve it ![Error 1](https://private-user-images.githubusercontent.com/86126695/376003770-73652fa1-3fd3-4c0a-92c7-df04a9b812d3.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mjg4NzA2NDYsIm5iZiI6MTcyODg3MDM0NiwicGF0aCI6Ii84NjEyNjY5NS8zNzYwMDM3NzAtNzM2NTJmYTEtM2ZkMy00YzBhLTkyYzctZGYwNGE5YjgxMmQzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDEwMTQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQxMDE0VDAxNDU0NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTE3N2ZkYmY0Y2M3YTFiMjNhMGNhN2M4YTFmMGI3NmMwOGM1YjRhZDFkMWI5MTY5Nzk2YTU1NzIxODI1NGY3NmUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.0tjC7tGHUIAeQDHeRuze86Llo3r49GglyhFI0UZ58NE)

Have you updated to the latest code? Updating to the latest code should solve the problem. If it has already been updated, please check if the GraphRAG index is...

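> > > > > > I'll test this over the next few days to see whether there is an underfitting problem; MoE models may simply be more stable. > > > > > > > > > > > > > > > After downgrading deepspeed, don't you get `ImportError: cannot import name 'log' from 'torch.distributed.elastic.agent.server.api'`? This is only supported in 14.4 >...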