Thomas Müller
Thanks! For TabFact we only train the classification layer, so you shouldn't use the `answer_coordinates` or `answers` fields in the CSV (we might actually remove them at some point, but for...
This should also work:

```Dockerfile
ENTRYPOINT ["uvicorn"]
CMD ["app.main:app", "--host", "0.0.0.0", "--port", "6565"]
```

(otherwise you will execute `uvicorn uvicorn app.main:app ...`)
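For anyone wondering why the split works: in exec form, Docker builds the container's final argv by appending the `CMD` array to the `ENTRYPOINT` array. A minimal Python sketch of that concatenation (an illustration of the rule, not Docker itself):

```python
# Illustration: Docker's exec form concatenates ENTRYPOINT + CMD
# to form the process argv inside the container.
entrypoint = ["uvicorn"]
cmd = ["app.main:app", "--host", "0.0.0.0", "--port", "6565"]

# Final argv the container process is started with.
argv = entrypoint + cmd
print(argv)
# ['uvicorn', 'app.main:app', '--host', '0.0.0.0', '--port', '6565']
```

Arguments passed to `docker run` replace `CMD` but are still appended after `ENTRYPOINT`, which is why putting `uvicorn` in both places yields `uvicorn uvicorn ...`.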
I don't think it is compatible. Here is what I tried:

1. Install qlora and all deps
2. `pip install einops`
3. Run training

```shell
python qlora.py \
    --model_name_or_path mosaicml/mpt-7b...
```
Looks like they are working on it: https://huggingface.co/mosaicml/mpt-7b/discussions/23
You can work around the above error using the solution from the discussion above. So you check out the model manually:

```shell
git lfs install
git clone https://huggingface.co/mosaicml/mpt-7b
```

And then...
Turns out it's peft that is adding the `input_embeds` parameter to the call. I accidentally stumbled upon this: https://huggingface.co/cekal/mpt-7b-peft-compatible, which fixes the input embed problem as well as the gradient...
Trying to make this all a bit more straightforward: https://huggingface.co/mosaicml/mpt-7b/discussions/42
The fix has been added to the main branch of `mpt-7b-peft-compatible`, so now you can just run this:

```shell
python qlora.py \
    --model_name_or_path cekal/mpt-7b-peft-compatible \
    --trust_remote_code True \
    --output_dir output...
```
Just in case this is helpful for someone: if you get this error with Docker, make sure to use an image with the CUDA toolkit installed, e.g. `pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel`.
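For reference, a minimal Dockerfile along those lines (an illustrative sketch, not a full setup; the `pip install` line is an assumption standing in for your actual dependencies):

```Dockerfile
# The -devel tag ships nvcc and the full CUDA toolkit, which
# bitsandbytes needs to compile its GPU kernels; -runtime tags do not.
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel

WORKDIR /workspace

# Illustrative only: install your actual training dependencies here.
RUN pip install einops
```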
Implemented this relatively simple Python-based solution, in case it's helpful to anyone: https://github.com/muelletm/alpaca.py