password support to ZipFS
Type of changes
- [x] New feature
- [ ] Documentation / docstrings
- [ ] Bug fix
- [ ] Tests
- [ ] Other
Checklist
- [x] I've run the latest black with default args on new code.
- [x] I've updated CHANGELOG.md and CONTRIBUTORS.md where appropriate.
- [x] I've added tests for new code.
- [x] I accept that @willmcgugan may be pedantic in the code review.
Description
Related issue: #360
This PR add passwd argument for ZipFS and ReadZipFS constructor to set the default password for extracting the file. And for per-file password, there is also passwd arg on openbin and readbytes to override the default settings.
Extra, I added setpassword on ReadZipFS since users could get the fs from open_fs and there would be no place the set password.
One thing to be mention, it changes readbytes's signature:
- def readbytes(self, path):
- # type: (Text) -> bytes
+ def readbytes(self, path, passwd=None):
+ # type: (Text, Optional[AnyStr]) -> bytes
and thus it does not follow the form on the base class. Please let me know if you have any concern.
I'm afraid the changed signature is a deal breaker. There is an options keyword args on the open methods which I was thinking could be used for an optional password argument. That should probably be added to all shortcut methods that read or write data.
So the base method could be:
def readbytes(self, path, **options):
# type: (Text) -> bytes
with closing(self.open(path, mode="rb", **options)) as read_file:
contents = read_file.read()
return contents
That would allow ZipFS to use a per-file password, while keeping the same signature. There would need to be a similar change to readtext, writebytes, and writetext. And it would have to be changed everywhere, not just the base.
Sorry this may be a large change that it appeared.
Extra, I added setpassword on ReadZipFS since users could get the fs from open_fs and there would be no place the set password.
FS urls support parameters which you could use to pass in the password.
I'm a little bit confusing.
Just to be checked - there's two things I could do:
- remove
_bytesandsetpasswordwould only allow bytes, since python's zipfile lib only accept bytes - add URL parameter support for opening password protected zip, and there would have a auto conversion from str to bytes since URL parameters is always str.
But what about readbytes? If you mean it is needed to change the base class's signature, it seems shouldn't be done in this PR since it is for zip file? or I should restore the signature temporary?
Thanks for your kindly reply.
remove _bytes and setpassword would only allow bytes, since python's zipfile lib only accept bytes
:+1:
add URL parameter support for opening password protected zip, and there would have a auto conversion from str to bytes since URL parameters is always str.
:+1:
But what about readbytes? If you mean it is needed to change the base class's signature, it seems shouldn't be done in this PR since it is for zip file? or I should restore the signature temporary?
If a signature changes, it needs to be changed everywhere. i.e. the base, ZipFS and the other built in filesystems. One of the goals of PyFilesystem is that you can swap one filesystem for another, and your code will still work.
The base and other filesystems can ignore the password, so it would just be a null-op for them.
On reflection, the argument should probably be called zip_password to avoid any future name collision.
But what about readbytes? If you mean it is needed to change the base class's signature, it seems shouldn't be done in this PR since it is for zip file?
If a signature changes, it needs to be changed everywhere. i.e. the base, ZipFS and the other built in filesystems. One of the goals of PyFilesystem is that you can swap one filesystem for another, and your code will still work.
It's possibly worth doing that in a separate PR, and once that's merged, then come back to this password-protected-zip PR? :man_shrugging:
On reflection, the argument should probably be called zip_password to avoid any future name collision.
Did you mean "dictionary key" rather than "argument" ?
It's possibly worth doing that in a separate PR, and once that's merged, then come back to this password-protected-zip PR? 🤷♂
Yeah, nice to have one feature per PR.
Did you mean "dictionary key" rather than "argument" ?
The options are collected kwargs, so I guess that would make it an argument in the call and a dictionary key in the method.
The options are collected kwargs, so I guess that would make it an argument in the call and a dictionary key in the method.
Ahh, I'd missed that you were using options to collect the kwargs - I naively (without looking at the code) assumed you were passing in an options dictionary.
Still waiting for #362 merge before I continue on this branch...