Error in "pgml.transform" with "text2text-generation" and "bigscience/T0"
Environment: Ubuntu 22.04, self-hosted PostgresML installed in a PostgreSQL 13 database
I am running this SQL:
SELECT pgml.transform(
    task => '{
        "task": "text2text-generation",
        "model": "bigscience/T0"
    }'::JSONB,
    inputs => ARRAY[
        'Is the word ''table'' used in the same meaning in the two previous sentences? Sentence A: you can leave the books on the table over there. Sentence B: the tables in this book are very hard to read.'
    ]
) AS answer;
The first attempt took more than 30 minutes and ended with an error. I restarted the PostgreSQL service, but the error persisted; rebooting the machine didn't help either.
Here is the error:
An error occurred when executing the SQL command:
SELECT pgml.transform(
task => '{
"task" : "text2text-generation",
"model" : "bigscience/T0"
}'::JSONB,
inputs => ARRAY[
...
ERROR: Traceback (most recent call last):
  File "transformers.py", line 449, in transform
  File "transformers.py", line 418, in create_pipeline
  File "transformers.py", line 306, in __init__
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/pipelines/__init__.py", line 905, in pipeline
    framework, model = infer_framework_load_model(
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    raise ValueError(
ValueError: Could not load model bigscience/T0 with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>, <class 'transformers.models.t5.modeling_t5.T5ForConditionalGeneration'>). See the original errors:

while loading with AutoModelForSeq2SeqLM, an error is thrown:
Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 530, in load_state_dict
    return torch.load(
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/torch/serialization.py", line 1004, in load
    overall_storage = torch.UntypedStorage.from_file(f, False, size)
RuntimeError: unable to mmap 44541580809 bytes from file </var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin>: Cannot allocate memory (12)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 539, in load_state_dict
    if f.read(7) == "version":
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 279, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
    return model_class.from_pretrained(
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3306, in from_pretrained
    state_dict = load_state_dict(resolved_archive_file)
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 551, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for '/var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin' at '/var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

while loading with T5ForConditionalGeneration, an error is thrown:
Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 530, in load_state_dict
    return torch.load(
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/torch/serialization.py", line 1004, in load
    overall_storage = torch.UntypedStorage.from_file(f, False, size)
RuntimeError: unable to mmap 44541580809 bytes from file </var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin>: Cannot allocate memory (12)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 539, in load_state_dict
    if f.read(7) == "version":
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 279, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3306, in from_pretrained
    state_dict = load_state_dict(resolved_archive_file)
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 551, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for '/var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin' at '/var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
1 statement failed.
Execution time: 5.4s
I see two things in there: "Cannot allocate memory" and "Unable to load weights from pytorch checkpoint file".
This machine has 16 GB of RAM, and the Nvidia GPU has 4 GB of VRAM.
nvidia-smi
Sun Mar 3 22:35:06 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Quadro M1200 On | 00000000:01:00.0 On | N/A |
| N/A 47C P0 N/A / 200W | 473MiB / 4096MiB | 3% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
Wait, 44541580809 bytes = 44.5 GB.
Well, I don't see how that is supposed to fit into 16 GB of RAM...
The model you are using is an 11B-parameter model; at 4 bytes per parameter in fp32 that is roughly 44 GB of weights, which matches the 44541580809 bytes in the traceback. It won't fit in 4 GB of VRAM, or in your 16 GB of RAM. You can try something smaller, like: https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct
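For reference, here is a minimal sketch of what the call could look like with that smaller model. A couple of assumptions: Llama-3.2-1B-Instruct is a decoder-only model, so the task becomes "text-generation" rather than "text2text-generation", and the meta-llama repository is gated on Hugging Face, so the postgres user would need access to the weights:

-- Hypothetical substitute call with a ~1B-parameter model. In fp32 the
-- weights are roughly 4 GB, which fits in 16 GB of system RAM (a 4 GB GPU
-- may still be tight without fp16 or quantization).
SELECT pgml.transform(
    task => '{
        "task": "text-generation",
        "model": "meta-llama/Llama-3.2-1B-Instruct"
    }'::JSONB,
    inputs => ARRAY[
        'Is the word ''table'' used in the same meaning in the two previous sentences? Sentence A: you can leave the books on the table over there. Sentence B: the tables in this book are very hard to read.'
    ]
) AS answer;

Whether it actually runs on the GPU depends on your PostgresML configuration; on CPU it should at least load within your 16 GB of RAM.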