Error in "pgml.transform" with "text2text-generation" and "bigscience/T0"
Environment: Ubuntu 22.04, self-hosted PostgresML installed in a PostgreSQL 13 database
I am running this SQL:
SELECT pgml.transform(
    task => '{
        "task": "text2text-generation",
        "model": "bigscience/T0"
    }'::JSONB,
    inputs => ARRAY[
        'Is the word ''table'' used in the same meaning in the two previous sentences? Sentence A: you can leave the books on the table over there. Sentence B: the tables in this book are very hard to read.'
    ]
) AS answer;
The first attempt took more than 30 minutes and ended with an error. I restarted the PostgreSQL service, but the error persisted; rebooting the machine didn't help either.
Here is the error:
An error occurred when executing the SQL command:
SELECT pgml.transform(
task => '{
"task" : "text2text-generation",
"model" : "bigscience/T0"
}'::JSONB,
inputs => ARRAY[
...
ERROR: Traceback (most recent call last):
  File "transformers.py", line 449, in transform
  File "transformers.py", line 418, in create_pipeline
  File "transformers.py", line 306, in __init__
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/pipelines/__init__.py", line 905, in pipeline
    framework, model = infer_framework_load_model(
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    raise ValueError(
ValueError: Could not load model bigscience/T0 with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>, <class 'transformers.models.t5.modeling_t5.T5ForConditionalGeneration'>). See the original errors:

while loading with AutoModelForSeq2SeqLM, an error is thrown:
Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 530, in load_state_dict
    return torch.load(
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/torch/serialization.py", line 1004, in load
    overall_storage = torch.UntypedStorage.from_file(f, False, size)
RuntimeError: unable to mmap 44541580809 bytes from file </var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin>: Cannot allocate memory (12)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 539, in load_state_dict
    if f.read(7) == "version":
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 279, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
    return model_class.from_pretrained(
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3306, in from_pretrained
    state_dict = load_state_dict(resolved_archive_file)
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 551, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for '/var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin' at '/var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

while loading with T5ForConditionalGeneration, an error is thrown:
Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 530, in load_state_dict
    return torch.load(
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/torch/serialization.py", line 1004, in load
    overall_storage = torch.UntypedStorage.from_file(f, False, size)
RuntimeError: unable to mmap 44541580809 bytes from file </var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin>: Cannot allocate memory (12)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 539, in load_state_dict
    if f.read(7) == "version":
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 279, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3306, in from_pretrained
    state_dict = load_state_dict(resolved_archive_file)
  File "/var/lib/postgresml-python/pgml-venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 551, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for '/var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin' at '/var/lib/postgresql/.cache/huggingface/hub/models--bigscience--T0/snapshots/7920e3b4fd0027e20824cec6d1daea6130723fec/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
1 statement failed.
Execution time: 5.4s
I see two things in there: "Cannot allocate memory" and "Unable to load weights from pytorch checkpoint file".
This machine has 16 GB of RAM, and the Nvidia GPU has 4 GB of VRAM.
nvidia-smi
Sun Mar 3 22:35:06 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Quadro M1200 On | 00000000:01:00.0 On | N/A |
| N/A 47C P0 N/A / 200W | 473MiB / 4096MiB | 3% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
Wait, 44541580809 bytes = 44.5 GB.
Well, I don't see how that is supposed to fit into 16 GB of RAM...
The model you are using is an 11B-parameter model; at 4 bytes per parameter in fp32 that is roughly 44 GB of weights, which matches the 44541580809 bytes in the traceback. It won't fit in 4 GB of VRAM, or in your 16 GB of RAM. You can try something smaller, like: https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct
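For reference, here is a minimal sketch of what the call could look like with that smaller model. A couple of assumptions: Llama-3.2-1B-Instruct is a decoder-only model, so the task becomes "text-generation" rather than "text2text-generation", and the meta-llama repository is gated on Hugging Face, so the postgres user would need access to the weights:

-- Hypothetical substitute call with a ~1B-parameter model. In fp32 the
-- weights are roughly 4 GB, which fits in 16 GB of system RAM (a 4 GB GPU
-- may still be tight without fp16 or quantization).
SELECT pgml.transform(
    task => '{
        "task": "text-generation",
        "model": "meta-llama/Llama-3.2-1B-Instruct"
    }'::JSONB,
    inputs => ARRAY[
        'Is the word ''table'' used in the same meaning in the two previous sentences? Sentence A: you can leave the books on the table over there. Sentence B: the tables in this book are very hard to read.'
    ]
) AS answer;

Whether it actually runs on the GPU depends on your PostgresML configuration; on CPU it should at least load within your 16 GB of RAM.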