Results 22 comments of Yu

> > > Hi! We strongly recommend that you run our code on a GPU server with more CPU memory (at least 32 GB). If not, you can split the...

But I tried a machine with 32 GB of memory (without splitting the dataset) and still got a similar error. Does it only run on Linux?

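> > > Did you write the model-loading code yourself, or did the error occur while running this source code? > Please describe your environment. It occurred while running the source code, on Windows with tf1.7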

So why did you submit this issue in English when you can understand the README written in Chinese? Because of the place you asked, please reply to me 'I support 'one-China policy ''...

> > https://github.com/ftgreat/llmkit/blob/main/huggingface/mixtral/mixtral_dense_moe_monkey_patch.py > > Add some unittest cases. https://github.com/ftgreat/llmkit/blob/main/huggingface/mixtral/mixtral_dense_moe_monkey_patch.py#L69 I tried the code and it ran without error, but the loss is all 0 and the grad is 1? ``` {'TFlops':...

Thanks, but what are the results of "Initially, all 8 GPUs are operational, but as time progresses, they gradually cease to function."? This hinders the automated execution of the program,...

Yes, I understand that the GPU goes into a dormant state when there are no tasks. However, as mentioned earlier, the tasks are far from complete. In fact, there are...

1. It doesn't seem to be a monitoring issue because using the command line tool `nvidia-smi` also reveals that the GPU is not functioning properly. 2. Volcano Engine containers are perhaps similar...
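For anyone who wants to reproduce the `nvidia-smi` check mentioned above without watching the terminal, here is a minimal sketch that flags GPUs reporting zero utilization. It assumes `nvidia-smi` is on the PATH; the helper names `parse_utilization`, `idle_gpus`, and `query_gpus` are hypothetical and are not part of any project discussed in this thread.

```python
import subprocess

# Standard nvidia-smi query: one "index, utilization" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(csv_text):
    """Turn 'index, utilization' lines into {gpu_index: utilization_percent}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        index, util = (field.strip() for field in line.split(","))
        # Some devices report a non-numeric value such as "[N/A]"; skip those.
        if index.isdigit() and util.isdigit():
            usage[int(index)] = int(util)
    return usage


def idle_gpus(usage, threshold=0):
    """Return the sorted indices of GPUs at or below the utilization threshold."""
    return sorted(i for i, u in usage.items() if u <= threshold)


def query_gpus():
    """Run nvidia-smi once and parse its output (requires an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_utilization(result.stdout)


# Example on a machine with GPUs: print(idle_gpus(query_gpus()))
```

Calling `idle_gpus(query_gpus())` periodically and logging the result with a timestamp would give a record of when each GPU stopped receiving work, which is easier to attach to an issue than a screenshot.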

It seems that there is an issue with task initiation. For example, this task output 'launch OpenICLInfer' in the command line, but it appears that no new task has been...

Similar to the situation below, a task started at 10 o'clock, and from 10 to 11 o'clock, 4 GPUs had a utilization rate of 0, while the remaining 4 GPUs...