Problem reading ZIP archive
Hi, the features command doesn't work for me with the /examples/objects.zip file that comes with the repo.
(morphocluster) root@f6908fd5e809:/data/example# morphocluster features model_state.pth objects.zip
/opt/conda/envs/morphocluster/lib/python3.7/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
Using pretrained model.
Reading archive... Traceback (most recent call last):
  File "/opt/conda/envs/morphocluster/bin/morphocluster", line 33, in <module>
    sys.exit(load_entry_point('morphocluster', 'console_scripts', 'morphocluster')())
  File "/opt/conda/envs/morphocluster/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/envs/morphocluster/lib/python3.7/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/conda/envs/morphocluster/lib/python3.7/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/envs/morphocluster/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/envs/morphocluster/lib/python3.7/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/code/morphocluster/scripts.py", line 148, in features
    extract_features(archive_fn, output_fn, parameters_fn, normalize, batch_size, input_mean, input_std)
  File "/code/morphocluster/processing/extract_features.py", line 436, in extract_features
    dataset = ArchiveDataset(archive_fn, transform)
  File "/code/morphocluster/processing/extract_features.py", line 266, in __init__
    self.archive = zipfile.ZipFile(archive_fn)
  File "/opt/conda/envs/morphocluster/lib/python3.7/zipfile.py", line 1258, in __init__
    self._RealGetContents()
  File "/opt/conda/envs/morphocluster/lib/python3.7/zipfile.py", line 1325, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
I had to make the following changes to get to this point:
* update scikit-learn and torchvision==0.12.0
* in api.py, change from sklearn.manifold.isomap import Isomap to from sklearn.manifold import Isomap
* replace('torch.ao.quantization', 'torch.quantization') in torchvision/models/quantization/mobilenetv2.py and mobilenetv3.py
Host is macOS arm64
Cheers Veit
Hi Veit, Thanks for reporting!
Which version are you running? (What is the output of git describe --tags?) Can you open the zip with another program?
Version is 0.2.0-26-g6fc6386
I have successfully opened the file with zipfile.ZipFile from within the Docker container.
Anyway, it's strange that I had to clear so many dependencies in the Docker container. It was my understanding that's the reason why we use Docker ;). However, it's the first time I'm using Docker, so maybe I made some mistakes.
I guess the error lies somewhere in the click package. I have tried a different version (8.1.3), but no success.
zipfile.ZipFile is used by MorphoCluster so it seems strange that it works in one place and not in the other...
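For anyone who wants to reproduce this check, here is a minimal sketch using only the standard library. The helper name check_archive is hypothetical and not part of MorphoCluster; it simply reports whether a path opens as a ZIP archive and whether its members pass the CRC test.

```python
import zipfile


def check_archive(path):
    """Return a short report on whether `path` opens as a ZIP archive.

    Hypothetical diagnostic helper; not part of MorphoCluster.
    """
    # is_zipfile looks for the end-of-central-directory record,
    # the same structure whose absence triggers BadZipFile.
    if not zipfile.is_zipfile(path):
        return "not a zip file"

    with zipfile.ZipFile(path) as zf:
        # testzip reads every member and returns the first one with a
        # bad CRC or header, or None if all members are fine.
        bad_member = zf.testzip()
        if bad_member is not None:
            return f"corrupt member: {bad_member}"
        return f"ok ({len(zf.namelist())} members)"
```

Running check_archive("objects.zip") in a plain interpreter session inside the container, and again from the code path that features takes, should show whether the same file is seen differently in the two contexts.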
> Anyway, it's strange that I had to clear so many dependencies in the Docker container. It was my understanding that's the reason why we use Docker ;).
Yes, that is one use case for Docker. Currently, we merely use Docker to provide the services we need in a controlled way. But you are right, I should publish a Docker image so that it does not have to be built by each user individually.
I have this issue on my list and will investigate as soon as possible.
Hi Veit,
In the meantime, I got around to creating a repository on hub.docker.com and built a docker image with the default settings:
https://hub.docker.com/repository/docker/morphocluster/morphocluster
If you replace the build section in your docker compose file with the following, you can skip the build step and you should immediately receive a working container:
 services:
   morphocluster:
-    build:
-      context: .
-      dockerfile: docker/morphocluster/Dockerfile
+    image: morphocluster/morphocluster:latest
In the future, I might provide pre-built images regularly so that users can skip the whole setup step altogether.
This won't help with your problem, though. I get the same zipfile.BadZipFile: File is not a zip file error. unzip -t objects.zip, however, reports no problems and the error persists even after zip -F.
In a separate interpreter session in the Docker container, I can open the file and flask load-objects example/objects.zip works as well.
Maybe some loaded modules in morphocluster features somehow interfere...
This could also be related to multiprocessing / multithreading issues: https://github.com/python/cpython/issues/83544
In the meantime, you can replace morphocluster features objects.zip features.h5 with python -c "from morphocluster.scripts import main; main()" features objects.zip features.h5. This works in my setup.
I'd be thankful for an explanation as both commands should be virtually identical.
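One way to narrow down what differs between the two invocations is to print a snapshot of the interpreter state from each of them and compare the results. The sketch below is only a diagnostic aid under that assumption; describe_environment is a hypothetical helper, not part of MorphoCluster, and it is not confirmed to reveal the cause.

```python
import os
import sys
import zipfile


def describe_environment():
    """Collect interpreter details that can differ between a console-script
    launch and `python -c ...`. Hypothetical diagnostic helper."""
    return {
        # Which interpreter binary is actually running.
        "executable": sys.executable,
        # How the process was started (script path vs. "-c").
        "argv0": sys.argv[0],
        # Relative archive paths are resolved against this directory.
        "cwd": os.getcwd(),
        # First import search path entry; this is typically set
        # differently for scripts and for `python -c`.
        "sys_path_0": sys.path[0] if sys.path else None,
        # Confirms which zipfile module has been imported.
        "zipfile_module": zipfile.__file__,
        # Rough indicator of how much has been imported so far.
        "loaded_modules": len(sys.modules),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Calling describe_environment() temporarily just before the archive is opened, once per invocation, gives two dictionaries that can be compared side by side.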