Muggle666
Results: 1 issue by Muggle666
I am trying to enable CPU activation offload when training my custom LLAMA model. However, an error occurs. It seems that some inputs to the attention operation are offloaded...
bug
training