logicchains

2 issues by logicchains

I'm running the 65B model on a machine with 256 GB of (CPU) RAM, with the context size set to 2048. The same thing happens with both llama65b and alpaca65b, every...

bug
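
A minimal sketch of a comparable configuration, assuming the llama-cpp-python bindings and a hypothetical quantized model path; the issue itself presumably invoked llama.cpp directly, but the relevant parameter here is the 2048-token context size:

```python
from llama_cpp import Llama

# Hypothetical model path; the key setting from the issue is n_ctx=2048.
llm = Llama(
    model_path="./models/65B/ggml-model-q4_0.bin",  # hypothetical 65B quantized model
    n_ctx=2048,  # context size set to 2048, as in the issue
)

# Run a short completion; a 65B model at this context fits in CPU RAM
# on a 256 GB machine.
out = llm("The quick brown fox", max_tokens=64)
print(out["choices"][0]["text"])
```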

I was training a LLaMA model on GPU with a custom embedding. It worked fine with 12 layers, dim 1024, seq length 256, but the loss would become NaN after the...

bug
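
A minimal sketch of the kind of training setup described, using the issue's dimensions (12 layers, dim 1024, seq length 256) with hypothetical vocabulary size, batch size, and learning rate; the NaN guard and gradient clipping shown are the usual first diagnostics when loss turns NaN partway through a run:

```python
import torch
import torch.nn as nn

# Dimensions from the issue; VOCAB, batch size, and lr are hypothetical.
VOCAB, DIM, LAYERS, SEQ = 32000, 1024, 12, 256
device = "cuda" if torch.cuda.is_available() else "cpu"

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)  # slot for the "custom embedding"
        layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=16, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=LAYERS)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, x):
        return self.head(self.blocks(self.embed(x)))

model = TinyLM().to(device)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(1000):
    # Random tokens stand in for real training data.
    tokens = torch.randint(0, VOCAB, (8, SEQ + 1), device=device)
    x, y = tokens[:, :-1], tokens[:, 1:]
    logits = model(x)
    loss = loss_fn(logits.reshape(-1, VOCAB), y.reshape(-1))
    if torch.isnan(loss):
        # Stop at the first NaN so the offending step and batch can be inspected.
        raise RuntimeError(f"loss became NaN at step {step}")
    opt.zero_grad()
    loss.backward()
    # Clipping often prevents the gradient blow-ups that surface as NaN loss.
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    opt.step()
```

If the NaN only appears at larger depths or sequence lengths, lowering the learning rate or enabling torch.autograd.set_detect_anomaly(True) to locate the first NaN-producing op are common next steps.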