remove filecheck to enable symlinks

Open fschlatt opened this issue 1 year ago • 5 comments

Enables streaming from local symlinks #7083

@lhoestq

fschlatt avatar Aug 30 '24 07:08 fschlatt

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

The CI is failing; it looks like this change breaks imagefolder loading.

I just checked the fsspec internals. Maybe we can instead detect a symlink by checking islink, and check size to make sure it's a file:

if info["type"] == "file" or (info.get("islink") and info["size"]):

lhoestq avatar Sep 02 '24 16:09 lhoestq

Hmm, actually size doesn't seem to filter out symlinked directories, so we need another way.

lhoestq avatar Sep 02 '24 16:09 lhoestq

Does fsspec perhaps allow resolving symlinks? Something like https://docs.python.org/3/library/pathlib.html#pathlib.Path.resolve

fschlatt avatar Sep 03 '24 07:09 fschlatt

There is info["destination"] in the case of a symlink, so maybe:

if info["type"] == "file" or (info.get("islink") and info.get("destination") and os.path.isfile(info["destination"])):

lhoestq avatar Sep 04 '24 12:09 lhoestq
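
One caveat with the `destination` check above: if `destination` holds the raw link target exactly as `os.readlink` returns it (an assumption about fsspec's local filesystem, not something confirmed in this thread), a relative target would be resolved against the current working directory rather than the link's own directory. A minimal sketch of a check that joins the target onto the link's parent directory first; `link_points_to_file` is a hypothetical helper name, not part of datasets or fsspec:

```python
import os


def link_points_to_file(link_path: str) -> bool:
    """Return True if link_path is a symlink whose target is a regular file.

    Hypothetical helper for illustration; not part of datasets or fsspec.
    """
    if not os.path.islink(link_path):
        return False
    # os.readlink returns the target exactly as stored, which may be relative.
    destination = os.readlink(link_path)
    if not os.path.isabs(destination):
        # A relative target is relative to the directory containing the link.
        destination = os.path.join(os.path.dirname(link_path), destination)
    # os.path.isfile follows any further symlinks in the chain.
    return os.path.isfile(destination)
```

Calling `os.path.realpath` on the link path itself avoids this bookkeeping, since it resolves relative targets and chained links in one step.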

I've added a fix that works locally with some temporary test files:

(info["type"] == "file" or (info.get("islink") and os.path.isfile(os.path.realpath(filepath))))

fschlatt avatar Dec 21 '24 12:12 fschlatt
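
For reference, the condition in the last comment can be tried out in isolation. This is only a sketch under stated assumptions: `is_file_or_link_to_file` is a hypothetical wrapper name, and `info` is assumed to be a dict shaped like an fsspec info entry, with a `"type"` key and an optional `"islink"` key. It is not the code merged into datasets.

```python
import os


def is_file_or_link_to_file(info: dict, filepath: str) -> bool:
    """Mirror the condition from the thread for a local path.

    `info` is assumed to look like an fsspec info dict; the helper name
    is hypothetical and used only for illustration.
    """
    if info["type"] == "file":
        return True
    # realpath resolves the whole symlink chain, so links to directories
    # and dangling links are rejected by isfile.
    return bool(info.get("islink")) and os.path.isfile(os.path.realpath(filepath))
```

Because `os.path.realpath` follows the link all the way to its final target, a symlinked directory fails the `isfile` test, which is the case the earlier `size` check did not filter out.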