Symbolic link for ~/.transformerlab is probably causing issues
I use a symbolic link for ~/.transformerlab, and it fails after a few restarts and queued trainings: training stays in the queue indefinitely. It would also be nice to be able to copy the Terminal output.
Attached the log file: local_server.log
Hmmm...it looks like you are right about the symbolic link being the problem.
When the API starts you can see a bunch of confusion about reading from /big/ vs /home:
DEBUG Reading Python requests from version file at `/big/edwin_transformerlab/src/.python-version`
DEBUG Searching for Python 3.11 in virtual environments, managed installations, or search path
DEBUG Found `cpython-3.11.11-linux-x86_64-gnu` at `/home/edwin/.transformerlab/envs/transformerlab/bin/python3`
...
Using default home directory: /home/edwin/.transformerlab
Using default workspace directory: /home/edwin/.transformerlab/workspace
We are working from /big/edwin_transformerlab/src which is not /home/edwin/.transformerlab/src
Later on, that seems to cause issues in a variety of places.
Saving a conversation and getting plugin info both fail with errors that look like:
raise ValueError("{!r} is not in the subpath of {!r}"
ValueError: '/big/edwin_transformerlab/workspace/experiments/alpha/conversations/3jj3jq.json' is not in the subpath of '/home/edwin/.transformerlab/workspace/experiments/alpha' OR one path is relative and the other is absolute.
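For anyone wondering why a symlink produces this particular error: `pathlib`'s `relative_to` compares paths textually and does not follow symlinks. The following is a minimal sketch that reproduces the mismatch in a temporary directory on a POSIX system. The directory names and the `is_inside` helper are made up for illustration and are not Transformer Lab code.

```python
import os
import tempfile
from pathlib import Path


def is_inside(child: Path, parent: Path) -> bool:
    """Hypothetical helper: containment check after resolving symlinks on both sides."""
    try:
        child.resolve().relative_to(parent.resolve())
        return True
    except ValueError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for /big/edwin_transformerlab and ~/.transformerlab (illustrative only).
    real_home = Path(tmp) / "big" / "transformerlab"
    link_home = Path(tmp) / "home" / ".transformerlab"
    (real_home / "workspace").mkdir(parents=True)
    link_home.parent.mkdir(parents=True)
    os.symlink(real_home, link_home, target_is_directory=True)

    # A file addressed through the real path, and a workspace addressed through the link.
    real_file = real_home / "workspace" / "chat.json"
    real_file.touch()
    link_workspace = link_home / "workspace"

    # Purely textual comparison: raises ValueError even though both paths
    # point into the same directory on disk.
    try:
        real_file.relative_to(link_workspace)
        lexical_ok = True
    except ValueError:
        lexical_ok = False

    # Resolving both sides first makes the check succeed.
    resolved_ok = is_inside(real_file, link_workspace)

print(lexical_ok, resolved_ok)
```

Resolving both paths before comparing them is one way such a check could be made symlink-safe. Whether that is the right fix for the app is a separate question.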
After that, training runs seem to fail because they can't find the output logs.
Unfortunately, we don't have an easy way to set the home directory in Transformer Lab right now. We have an issue for this that's been open for a while, but I will try to think of other options:
https://github.com/transformerlab/transformerlab-app/issues/71
If you haven't done so already, you can also move where your Hugging Face Hub cache is located (this is where downloaded models go): https://stackoverflow.com/questions/63312859/how-to-change-huggingface-transformers-default-cache-directory/72703148#72703148
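As a rough sketch of what the linked answer describes: the Hugging Face libraries read the `HF_HOME` environment variable to decide where the cache lives, so it has to be set before the process that downloads models starts. The `/big/hf_cache` path and the `env_with_hf_cache` helper below are assumptions for illustration. This also assumes the server process inherits the environment you launch it with.

```python
import os


def env_with_hf_cache(cache_dir: str, base_env=None) -> dict:
    """Hypothetical helper: copy an environment and point HF_HOME at cache_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["HF_HOME"] = cache_dir  # the Hub cache then lives under <cache_dir>/hub
    return env


# Example with a hypothetical target directory on the larger disk.
env = env_with_hf_cache("/big/hf_cache", base_env={"PATH": "/usr/bin"})
print(env["HF_HOME"])
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting the server. Exporting `HF_HOME` in your shell profile has the same effect.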