pyfilesystem2

Can open_fs work with http protocol?

Open mezhaka opened this issue 4 years ago • 3 comments

Is there such a thing as fs.open_fs("https://something.com/whatever")? I have been a PyFilesystem user for some time (I use it extensively for "gs://", "tar://", "temp://"), so I assumed it would work out of the box for "http://", and now I discover it does not.

My use case is my own open_file implementation that is essentially:

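import os
import typing
from contextlib import contextmanager
from typing import Iterator, Optional

from fs import open_fs


@contextmanager
def open_file(
    url: str,
    mode: str = "r",
    create: bool = False,
    buffering: int = -1,
    encoding: Optional[str] = None,
    errors: Optional[str] = None,
    newline: str = "",
    **options,
) -> Iterator[typing.IO]:
    # Treat "w", "a", "x" and "+" modes as writing, not only "w".
    writeable = any(flag in mode for flag in "wax+")
    # Split the URL into the parent filesystem URL and the file name.
    dir_url, file_name = os.path.split(url)
    with open_fs(dir_url, writeable=writeable, create=create) as fs_:
        with fs_.open(file_name, mode, buffering, encoding, errors, newline, **options) as file_:
            yield file_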

which now gives me fs.opener.errors.UnsupportedProtocol: protocol 'https' is not supported if I try it with an https url.

I guess there's a good reason PyFilesystem does not do this, but I thought I'd check with you here first.

P. S. I suppose there's no way to list things which are under some http path.

mezhaka avatar Dec 09 '21 16:12 mezhaka

P. S. I suppose there's no way to list things which are under some http path.

No, which is probably why this isn't implemented in PyFilesystem. Most operations, such as listing and stat-ing would be unsupported.

You may find smart-open helpful.
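
For reading a single file over HTTP(S) without any third-party dependency, something along these lines might be enough. This is only a rough sketch using the standard library: is_http_url and open_remote are hypothetical names, not part of PyFilesystem, and the helper is read-only with no listing or stat support.

```python
import io
from contextlib import contextmanager
from urllib.parse import urlsplit
from urllib.request import urlopen

HTTP_SCHEMES = {"http", "https"}


def is_http_url(url: str) -> bool:
    """Return True if the URL scheme is http or https."""
    return urlsplit(url).scheme.lower() in HTTP_SCHEMES


@contextmanager
def open_remote(url: str, mode: str = "r", encoding: str = "utf-8"):
    """Hypothetical helper: read-only access to one HTTP(S) resource."""
    if not is_http_url(url):
        raise ValueError(f"not an HTTP(S) URL: {url!r}")
    # Plain HTTP GET cannot write, so reject every writing mode up front.
    if any(flag in mode for flag in "wax+"):
        raise ValueError("HTTP resources are read-only in this sketch")
    with urlopen(url) as response:
        if "b" in mode:
            # Binary mode: hand back the raw response object.
            yield response
        else:
            # Text mode: decode the response bytes on the fly.
            yield io.TextIOWrapper(response, encoding=encoding)
```

A wrapper like open_file could check is_http_url(url) first and only fall through to open_fs for the protocols PyFilesystem actually supports.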

dargueta avatar Dec 10 '21 17:12 dargueta

Indeed, unlike FTP, HTTP does not describe a file system, so you cannot simply add an HTTPFS that explores all links as files; that's not how the protocol works. What may work, however, is a dedicated class that handles particular file listing formats, like the directory indexes that nginx or Apache can be configured to generate when serving static content.
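
As a rough illustration of the listing idea, the sketch below pulls entry names out of an autoindex-style HTML page using only the standard library. IndexLinkParser, list_index and the SAMPLE page are made up for this example; real nginx and Apache index pages differ in markup, so the filtering rules here are assumptions and not a complete parser.

```python
from html.parser import HTMLParser
from typing import List


class IndexLinkParser(HTMLParser):
    """Collect relative hrefs from the <a> tags of a directory index page."""

    def __init__(self) -> None:
        super().__init__()
        self.entries: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip empty hrefs, sort links ("?C=N;O=D"), absolute paths,
        # fragments, external URLs and the parent-directory link.
        if not href or href.startswith(("?", "/", "#")) or "://" in href or href == "../":
            return
        self.entries.append(href)


def list_index(html: str) -> List[str]:
    """Return the entry names found in an index page (directories end in "/")."""
    parser = IndexLinkParser()
    parser.feed(html)
    return parser.entries


# Made-up page in the general style of a server-generated index.
SAMPLE = """<html><body><h1>Index of /files/</h1><pre>
<a href="../">../</a>
<a href="docs/">docs/</a>  09-Dec-2021 16:12   -
<a href="readme.txt">readme.txt</a>  09-Dec-2021 16:12  120
</pre></body></html>"""

print(list_index(SAMPLE))  # ['docs/', 'readme.txt']
```

A dedicated filesystem class could build its listdir on top of something like this, while operations the server does not expose would still have to raise an unsupported-operation error.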

althonos avatar Dec 13 '21 17:12 althonos

The thing that may work, however, would be to have a dedicated class that can handle particular file listing formats, like nginx or Apache let you configure to serve static content.

Huh, I never even realised that PyFilesystem didn't support that yet! :rofl: Adding a pyfs-interface on top of e.g. http://downloads.raspberrypi.org/ would be a fun project, which I unfortunately don't have time for myself at the moment. (might even be able to re-use some of the FTP-listing parsing code?)

lurch avatar Dec 13 '21 19:12 lurch