pyfilesystem2 icon indicating copy to clipboard operation
pyfilesystem2 copied to clipboard

password support to ZipFS

Open tzing opened this issue 6 years ago • 8 comments

Type of changes

  • [x] New feature
  • [ ] Documentation / docstrings
  • [ ] Bug fix
  • [ ] Tests
  • [ ] Other

Checklist

  • [x] I've run the latest black with default args on new code.
  • [x] I've updated CHANGELOG.md and CONTRIBUTORS.md where appropriate.
  • [x] I've added tests for new code.
  • [x] I accept that @willmcgugan may be pedantic in the code review.

Description

Related issue: #360

This PR add passwd argument for ZipFS and ReadZipFS constructor to set the default password for extracting the file. And for per-file password, there is also passwd arg on openbin and readbytes to override the default settings.

Extra, I added setpassword on ReadZipFS since users could get the fs from open_fs and there would be no place the set password.

One thing to be mention, it changes readbytes's signature:

-    def readbytes(self, path):
-        # type: (Text) -> bytes
+    def readbytes(self, path, passwd=None):
+        # type: (Text, Optional[AnyStr]) -> bytes

and thus it does not follow the form on the base class. Please let me know if you have any concern.

tzing avatar Dec 06 '19 14:12 tzing

I'm afraid the changed signature is a deal breaker. There is an options keyword args on the open methods which I was thinking could be used for an optional password argument. That should probably be added to all shortcut methods that read or write data.

So the base method could be:

    def readbytes(self, path, **options):
        # type: (Text) -> bytes
        with closing(self.open(path, mode="rb", **options)) as read_file:
            contents = read_file.read()
        return contents

That would allow ZipFS to use a per-file password, while keeping the same signature. There would need to be a similar change to readtext, writebytes, and writetext. And it would have to be changed everywhere, not just the base.

Sorry this may be a large change that it appeared.

Extra, I added setpassword on ReadZipFS since users could get the fs from open_fs and there would be no place the set password.

FS urls support parameters which you could use to pass in the password.

willmcgugan avatar Dec 06 '19 15:12 willmcgugan

I'm a little bit confusing.

Just to be checked - there's two things I could do:

  1. remove _bytes and setpassword would only allow bytes, since python's zipfile lib only accept bytes
  2. add URL parameter support for opening password protected zip, and there would have a auto conversion from str to bytes since URL parameters is always str.

But what about readbytes? If you mean it is needed to change the base class's signature, it seems shouldn't be done in this PR since it is for zip file? or I should restore the signature temporary?

Thanks for your kindly reply.

tzing avatar Dec 06 '19 16:12 tzing

remove _bytes and setpassword would only allow bytes, since python's zipfile lib only accept bytes

:+1:

add URL parameter support for opening password protected zip, and there would have a auto conversion from str to bytes since URL parameters is always str.

:+1:

But what about readbytes? If you mean it is needed to change the base class's signature, it seems shouldn't be done in this PR since it is for zip file? or I should restore the signature temporary?

If a signature changes, it needs to be changed everywhere. i.e. the base, ZipFS and the other built in filesystems. One of the goals of PyFilesystem is that you can swap one filesystem for another, and your code will still work.

The base and other filesystems can ignore the password, so it would just be a null-op for them.

On reflection, the argument should probably be called zip_password to avoid any future name collision.

willmcgugan avatar Dec 06 '19 16:12 willmcgugan

But what about readbytes? If you mean it is needed to change the base class's signature, it seems shouldn't be done in this PR since it is for zip file?

If a signature changes, it needs to be changed everywhere. i.e. the base, ZipFS and the other built in filesystems. One of the goals of PyFilesystem is that you can swap one filesystem for another, and your code will still work.

It's possibly worth doing that in a separate PR, and once that's merged, then come back to this password-protected-zip PR? :man_shrugging:

lurch avatar Dec 06 '19 16:12 lurch

On reflection, the argument should probably be called zip_password to avoid any future name collision.

Did you mean "dictionary key" rather than "argument" ?

lurch avatar Dec 06 '19 16:12 lurch

It's possibly worth doing that in a separate PR, and once that's merged, then come back to this password-protected-zip PR? 🤷‍♂

Yeah, nice to have one feature per PR.

Did you mean "dictionary key" rather than "argument" ?

The options are collected kwargs, so I guess that would make it an argument in the call and a dictionary key in the method.

willmcgugan avatar Dec 06 '19 16:12 willmcgugan

The options are collected kwargs, so I guess that would make it an argument in the call and a dictionary key in the method.

Ahh, I'd missed that you were using options to collect the kwargs - I naively (without looking at the code) assumed you were passing in an options dictionary.

lurch avatar Dec 06 '19 17:12 lurch

Still waiting for #362 merge before I continue on this branch...

tzing avatar Mar 06 '20 02:03 tzing