DataSets.jl

API to get the cached path for a blob

Open mbauman opened this issue 3 years ago • 1 comments

All the ways to work with a Blob (er, I suppose after 0.2.7 it's just File) are IO-based. But it's backed by a file that's cached in tmp. Not all packages work with IO-based methods and instead want a direct path to a file. How can I get that file? It reports to Julia that it isfile and will happily return its abspath... but it's not really either!

mbauman, Nov 28 '22 22:11

> But it's backed by a file that's cached in tmp.

Generally speaking, I don't think that is necessarily always true. E.g. for Blobs/Files that are stored within TOML files; or for large remote files that could implement some paged caching (i.e. only downloading some chunks of the file to the local machine).

As such, I think the current official recommended way should be to just copy the data to your own temporary file. E.g. write("my-temp-file", open(IO, dataset)) should work.
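A minimal sketch of that copy step, using only Base functions. `copy_to_tempfile` is a hypothetical helper name, not part of DataSets.jl, and an `IOBuffer` stands in for the stream you would get from `open(IO, dataset)`. Closing the output in a `finally` block means the handle is not leaked if the copy fails partway.

```julia
# Hypothetical helper (not a DataSets.jl API): copy everything readable from
# `io` into a fresh temporary file and return that file's path.
function copy_to_tempfile(io::IO)
    path, out = mktemp()   # temp file; removed at process exit by default (Julia 1.3+)
    try
        write(out, io)     # streams `io` into `out` until EOF
    finally
        close(out)         # always release the output handle
    end
    return path
end

# Stand-in for the stream returned by `open(IO, dataset)`.
path = copy_to_tempfile(IOBuffer("a,b\n1,2\n"))
```

The returned `path` can then be handed to any package that insists on a filename. The caller owns the copy, so later changes to the dataset will not show up in it.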

That said, maybe we should nevertheless have an API that gives you a path to a local cached file if it is available, and copies the full contents into a new temporary file if it isn't.
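Roughly, such an API could behave like the sketch below. Everything here is an assumption for illustration: `local_path` is a made-up name, `cached_path` stands for whatever a storage backend might report (or `nothing` if it has no local file), and `open_io` is a zero-argument function returning a readable stream. None of these exist in DataSets.jl today.

```julia
# Hypothetical sketch of a "path if cached, else copy" API.
# `cached_path`: path to an existing local cache file, or `nothing`.
# `open_io`:     zero-argument function returning a readable IO (only called
#                when no cached file is available).
function local_path(cached_path::Union{AbstractString,Nothing}, open_io)
    if cached_path !== nothing && isfile(cached_path)
        return String(cached_path)   # reuse the cached file, no copy
    end
    path, out = mktemp()             # fall back to a full copy in a temp file
    try
        io = open_io()
        try
            write(out, io)           # copy the whole stream until EOF
        finally
            close(io)
        end
    finally
        close(out)
    end
    return path
end

# No cache available: the contents are copied to a new temporary file.
p1 = local_path(nothing, () -> IOBuffer("hello"))
# Cache available: the existing path is returned and `open_io` is never called.
p2 = local_path(p1, () -> error("should not be called"))
```

One design question this leaves open is ownership: when the cached path is returned directly, the caller would have to treat it as read-only, whereas a copied file is theirs to modify or delete.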

mortenpi, Nov 29 '22 01:11