core-bioimage-io-python

Difficulty accessing torch module from a zipped model

Open pattonw opened this issue 1 year ago • 2 comments

I want direct access to the torch.nn.Module of a model from the model zoo (if available), but I am having difficulty getting at it.

Approach 1

  1. pip install "bioimageio.core>=0.7"
  2. Download the zip file containing a model
  3. Run this code snippet:
from bioimageio.core import load_description_and_test
from bioimageio.core.model_adapters._pytorch_model_adapter import PytorchModelAdapter
from bioimageio.spec import InvalidDescr

model_id = "nucleisegmentationboundarymodel_pytorch_state_dict.zip"
model_description = load_description_and_test(model_id)
if isinstance(model_description, InvalidDescr):
    raise Exception("Invalid model description")

adapter = PytorchModelAdapter(
    outputs=model_description.outputs,
    weights=model_description.weights.pytorch_state_dict,
    devices=None,
)

print(adapter._network)
  4. Get this traceback:
Traceback (most recent call last):
  File "/Users/pattonw/Work/Packages/dacapo/.venv/lib/python3.11/site-packages/torch/serialization.py", line 757, in _check_seekable
    f.seek(f.tell())
    ^^^^^^
AttributeError: 'Path' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/pattonw/Work/Packages/dacapo/scratch/scratch3.py", line 14, in <module>
    adapter = PytorchModelAdapter(
              ^^^^^^^^^^^^^^^^^^^^
  File "/Users/pattonw/Work/Packages/dacapo/.venv/lib/python3.11/site-packages/bioimageio/core/model_adapters/_pytorch_model_adapter.py", line 44, in __init__
    state: Any = torch.load(
                 ^^^^^^^^^^^
  File "/Users/pattonw/Work/Packages/dacapo/.venv/lib/python3.11/site-packages/torch/serialization.py", line 1319, in load
    with _open_file_like(f, "rb") as opened_file:
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pattonw/Work/Packages/dacapo/.venv/lib/python3.11/site-packages/torch/serialization.py", line 664, in _open_file_like
    return _open_buffer_reader(name_or_buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pattonw/Work/Packages/dacapo/.venv/lib/python3.11/site-packages/torch/serialization.py", line 649, in __init__
    _check_seekable(buffer)
  File "/Users/pattonw/Work/Packages/dacapo/.venv/lib/python3.11/site-packages/torch/serialization.py", line 760, in _check_seekable
    raise_err_msg(["seek", "tell"], e)
  File "/Users/pattonw/Work/Packages/dacapo/.venv/lib/python3.11/site-packages/torch/serialization.py", line 753, in raise_err_msg
    raise type(e)(msg)
AttributeError: 'Path' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
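
The error message suggests pre-loading the data into a buffer such as io.BytesIO. A minimal sketch of that idea, using only the standard library: the 'Path' in the traceback is presumably a zipfile.Path-like object (an assumption, the issue does not say so explicitly), which has no seek method, whereas an in-memory buffer does. The helper name open_zip_member_seekable and the member name "weights.pt" are hypothetical and chosen for illustration only; this is not how bioimageio.core itself reads weights.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def open_zip_member_seekable(zip_path, member):
    """Read one zip member fully into memory and return a seekable buffer.

    Hypothetical helper for illustration; not part of bioimageio.core.
    """
    with zipfile.ZipFile(zip_path) as zf:
        return io.BytesIO(zf.read(member))


# Demo on a throwaway archive; "weights.pt" is a made-up member name.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "model.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("weights.pt", b"\x00\x01\x02\x03")

    # A zipfile.Path points at a member but offers no seek()/tell().
    with zipfile.ZipFile(archive) as zf:
        member_path = zipfile.Path(zf, at="weights.pt")
        path_has_seek = hasattr(member_path, "seek")

    # The in-memory copy supports seek()/tell(), which is what the
    # error message asks for before calling torch.load.
    buffer = open_zip_member_seekable(archive, "weights.pt")
    buffer.seek(2)
    tail = buffer.read()
```

A buffer obtained this way could then be handed to torch.load in place of the path, at the cost of holding the whole weights file in memory.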

Approaches 2 and 3

The same code works fine if I use the model name "affable-shark" instead of the path to the downloaded zip of the same model. It also works if I run the script twice and, the second time, use "nucleisegmentationboundarymodel_pytorch_state_dict.zip.unzip" as the model_id. In both cases I get a model architecture that looks something like this:

UNet2d(
  (encoder): Encoder(
    (blocks): ModuleList(
      ...
    )
    (samplers): ModuleList(
      ...
    )
  )
  (out_conv): Conv2d(64, 2, kernel_size=(1, 1), stride=(1, 1))
  (final_activation): Sigmoid()
)

pattonw avatar Jan 21 '25 16:01 pattonw

Hi @pattonw, unfortunately the latest bioimageio.core release has a regression that makes loading from zip files fail. I'm working on getting a patched release out as soon as possible.

Loading "affable-shark" works because it never downloads or loads a zip file; instead, each file is downloaded on demand (and cached separately).

Loading "nucleisegmentationboundarymodel_pytorch_state_dict.zip.unzip" works for the same reason: it is not a zip anymore. (By the way, this would be interpreted as a local path, not as a model id.)

FynnBe avatar Jan 22 '25 09:01 FynnBe

Oh ok, makes sense. For now my workaround is to unzip any zipped model first and then pass the unzipped path to load_description_and_test. I'll keep an eye out for the next release so I can remove my hacky solution.
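
The unzip-first workaround could look roughly like the sketch below. It uses only the standard library; the helper name unzip_model is hypothetical, the ".unzip" suffix simply mirrors the directory name mentioned earlier in this thread, and "rdf.yaml" is a placeholder file name for the demo archive.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_model(zip_path, target_dir=None):
    """Extract a zipped model next to the archive and return the directory.

    Hypothetical helper for illustration; not part of bioimageio.core.
    """
    zip_path = Path(zip_path)
    if target_dir is None:
        # e.g. "model.zip" -> "model.zip.unzip"
        target_dir = zip_path.with_name(zip_path.name + ".unzip")
    target_dir = Path(target_dir)
    if not target_dir.exists():
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(target_dir)
    return target_dir


# Demo on a throwaway archive; "rdf.yaml" is a placeholder member name.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "model.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("rdf.yaml", "name: demo\n")

    unzipped = unzip_model(archive)
    unzipped_name = unzipped.name
    extracted_files = sorted(p.name for p in unzipped.iterdir())
    # The returned directory would then be passed on, e.g.
    # load_description_and_test(unzipped)
```

Skipping extraction when the target directory already exists keeps repeated runs cheap, but it also means a stale directory is never refreshed.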

pattonw avatar Jan 22 '25 16:01 pattonw

Fixed by switching to the new genericache caching backend and replacing download with bioimageio.spec.utils.get_reader(), whose returned BytesReader is seekable.

FynnBe avatar Oct 13 '25 10:10 FynnBe