Zip FS does not implement cat() right
The AbstractFileSystem states for the cat() operation:
def cat(self, path, recursive=False, on_error="raise", **kwargs):
"""Fetch (potentially multiple) paths' contents
Parameters
----------
recursive: bool
If True, assume the path(s) are directories, and get all the contained files
...
But ZipFileSystem.cat() just delegates its argument zip.ZipFile.read(), although path may be a sequence (which makes perfect sense for cat = concatenate):
See https://github.com/intake/filesystem_spec/blob/master/fsspec/implementations/zip.py#L84-L85
I came across the issue using a compound URL for opening a Zarr from a Zip archive in S3:
import fsspec
import xarray as xr
store = fsspec.get_mapper('zip::s3://agriculture-vlab-data-test/S1_L2_COH_VH_31UDR.zarr.zip', s3=dict(anon=True))
ds = xr.open_zarr(store)
Which yields to
...
D:\Miniconda3\envs\xcube\lib\site-packages\zarr\core.py in _chunk_getitems(self, lchunk_coords, lchunk_selection, out, lout_selection, drop_axes, fields)
1834 else:
1835 partial_read_decode = False
-> 1836 cdatas = self.chunk_store.getitems(ckeys, on_error="omit")
1837 for ckey, chunk_select, out_select in zip(ckeys, lchunk_selection, lout_selection):
1838 if ckey in cdatas:
D:\Miniconda3\envs\xcube\lib\site-packages\fsspec\mapping.py in getitems(self, keys, on_error)
88 oe = on_error if on_error == "raise" else "return"
89 try:
---> 90 out = self.fs.cat(keys2, on_error=oe)
91 if isinstance(out, bytes):
92 out = {keys2[0]: out}
D:\Miniconda3\envs\xcube\lib\site-packages\fsspec\implementations\zip.py in cat(self, path, callback, **kwargs)
83
84 def cat(self, path, callback=None, **kwargs):
---> 85 return self.zip.read(path)
86
87 def _open(
D:\Miniconda3\envs\xcube\lib\zipfile.py in read(self, name, pwd)
1461 def read(self, name, pwd=None):
1462 """Return file bytes for name."""
-> 1463 with self.open(name, "r", pwd) as fp:
1464 return fp.read()
1465
D:\Miniconda3\envs\xcube\lib\zipfile.py in open(self, name, mode, pwd, force_zip64)
1500 else:
1501 # Get info object for name
-> 1502 zinfo = self.getinfo(name)
1503
1504 if mode == 'w':
D:\Miniconda3\envs\xcube\lib\zipfile.py in getinfo(self, name)
1425 def getinfo(self, name):
1426 """Return the instance of ZipInfo given 'name'."""
-> 1427 info = self.NameToInfo.get(name)
1428 if info is None:
1429 raise KeyError(
TypeError: unhashable type: 'list'
It looks like renaming cat->cat_file should solve this
(perhaps you wouldn't mind trying that and submitting a PR!)
Sure :)
Ok, I just did a few tests. There are many other places where ZipFileSystem does not handle correctly the situtation where a path is a list of paths. However, I try making this work, because it is important for our project. Hopefully I have a PR by tomorrow. Thanks @martindurant
Looking again, I wouldn't be surprised if simply removing cat would lead to the right behaviour (the superclass cat calls open). That's how it's done int he libarchive implementation.
@martindurant you were right. That fixed it, the PR is #789.
My original use case now works:

Closing this as it was fixed by #789.