Results: 2 issues by logicchains
I'm running the 65B model on a machine with 256 gigabytes of (CPU) RAM, with context size set to 2048. The same thing happens with both llama65b and alpaca65b, every...
bug
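
The excerpt reads like a llama.cpp-style setup (a 65B model run from CPU RAM with a 2048-token context). A minimal sketch of an equivalent configuration through the llama-cpp-python bindings, assuming that stack; the model path and prompt are hypothetical stand-ins, not taken from the report:

```python
# Minimal sketch, assuming llama-cpp-python; the path and prompt are hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="models/65B/ggml-model-q4_0.bin",  # hypothetical model file
    n_ctx=2048,  # context size quoted in the report
)

out = llm("The quick brown fox", max_tokens=16)
print(out["choices"][0]["text"])
```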
I was training a LLaMA model on GPU, with a custom embedding. It worked fine with 12 layers, dim 1024, seq length 256, but loss would become NaN after the...
bug
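
The symptom described (training runs cleanly at a small scale, then the loss goes NaN) is commonly triaged by enabling anomaly detection and clipping gradients. A minimal sketch of that triage in PyTorch, assuming a transformer-style model; the architecture below is a stand-in built only from the dimensions quoted in the excerpt (12 layers, dim 1024, seq length 256), and the data is random:

```python
import torch
import torch.nn as nn

# Dimensions quoted in the report; everything else here is assumed.
DIM, LAYERS, SEQ_LEN, VOCAB = 1024, 12, 256, 32000

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)  # stand-in for the custom embedding
        block = nn.TransformerEncoderLayer(d_model=DIM, nhead=16, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=LAYERS)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, x):
        return self.head(self.blocks(self.embed(x)))

model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

torch.autograd.set_detect_anomaly(True)  # raise at the op that first yields NaN

tokens = torch.randint(0, VOCAB, (2, SEQ_LEN + 1))  # random toy batch
inp, tgt = tokens[:, :-1], tokens[:, 1:]

logits = model(inp)
loss = loss_fn(logits.reshape(-1, VOCAB), tgt.reshape(-1))
assert torch.isfinite(loss), "loss is already non-finite before backward"
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # clamp exploding grads
opt.step()
```

Anomaly detection pinpoints the first operation that produces a non-finite value, which is usually faster than bisecting layer counts or sequence lengths by hand.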