Remote file templated url
See #2462. Allow remote_file rules to specify a templated cache URL, which can be prioritised over the URL passed to the remote_file rule itself.
In a similar way to python_wheel rules specifying a custom wheel naming scheme, custom cache name scheme(s) can be defined here. This should allow escrow backups of remote files in e.g. a cloud storage bucket, or as local files with e.g. cacheurl = file:///var/tmp/escrow/local_storage.
You would configure a name scheme like this in .plzconfig:
[remote_file]
prioritisecache = true
cacheurl = https://some-cloud-provider.com/some-bucket-name
cachenamescheme = {url_base}/{cache_path}/{file_name}
Also add a base64url function.
By default, base64url is used to generate the cache_path in the above name scheme from the dirname of a URL (i.e. omitting the file name). Thus two remote files with the same dirname share the same cache_path:
http://google.com/file.a: {cache_url}/aHR0cDovL2dvb2dsZS5jb20/file.a
http://google.com/file.b: {cache_url}/aHR0cDovL2dvb2dsZS5jb20/file.b
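For reference, the mapping above can be reproduced with a short Python sketch. This is only an illustration of the naming scheme, not the code in this PR: the helper names `base64url` and `cache_url_for` are made up here, and stripping the `=` padding is an assumption inferred from the example paths above.

```python
import base64


def base64url(text: str) -> str:
    """URL-safe base64 of `text`, with '=' padding stripped (assumed from the examples)."""
    encoded = base64.urlsafe_b64encode(text.encode("utf-8"))
    return encoded.rstrip(b"=").decode("ascii")


def cache_url_for(url: str, url_base: str,
                  scheme: str = "{url_base}/{cache_path}/{file_name}") -> str:
    """Expand a cache name scheme for a single remote file URL (hypothetical helper)."""
    # Split into the dirname (everything before the last '/') and the file name.
    dirname, _, file_name = url.rpartition("/")
    return scheme.format(
        url_base=url_base,
        cache_path=base64url(dirname),
        file_name=file_name,
    )


if __name__ == "__main__":
    base = "https://some-cloud-provider.com/some-bucket-name"
    for remote in ("http://google.com/file.a", "http://google.com/file.b"):
        print(remote, "->", cache_url_for(remote, base))
```

Both files land under the same encoded directory because only the dirname is encoded.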
I'm not sure whether I've done everything necessary to set up the [remote_file] stuff in .plzconfig.
remote_file supports multiple URLs passed to it, which are tried in sequence. That's the underlying mechanism used to back python_wheel. To me that seems sufficient; you simply put your cache URL in as the first entry and it'll be tried before the others, but it won't fail if it doesn't have the thing in question.
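To make the "tried in sequence" behaviour concrete, here is a minimal Python sketch of that kind of fallback. It is not Please's implementation; `fetch_first` and the `fetch` callable are hypothetical names used only to show that a miss on the first URL does not fail the whole download.

```python
from typing import Callable, Iterable


def fetch_first(urls: Iterable[str], fetch: Callable[[str], bytes]) -> bytes:
    """Return the content of the first URL that `fetch` can retrieve."""
    errors = []
    for url in urls:
        try:
            return fetch(url)
        except Exception as err:  # a miss on one URL is not fatal
            errors.append(f"{url}: {err}")
    # Only fail once every URL has been tried.
    raise RuntimeError("all URLs failed: " + "; ".join(errors))
```

With the cache URL listed first, a cache miss simply falls through to the original URL.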
To me that seems sufficient; you simply put your cache URL in as the first entry and it'll be tried before the others
We want to be able to download from our bucket for every remote file; for instance, if the external resource gets moved or deleted, we'll fall back to the cache. It would be arduous to define the cache URL for every remote_file target.
... Understand the concern though; I'll have a think about how to make this a bit more elegant ...
remote_file supports multiple URLs passed to it, which are tried in sequence. That's the underlying mechanism used to back python_wheel. To me that seems sufficient; you simply put your cache URL in as the first entry and it'll be tried before the others, but it won't fail if it doesn't have the thing in question.
As Nick said, this is mostly about reducing the effort needed to effectively add an extra src for every remote_file - is there a better way to achieve that than what Nick is proposing here?
You could define your own build_def that wraps remote file and adds the cache url first to the underlying remote file?
You could define your own build_def that wraps remote file and adds the cache url first to the underlying remote file?
Yeah I think I prefer this idea. This seems a little too magical for a low level rule like remote file.
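As a rough sketch of the wrapper idea, the URL-list part could look like the plain Python below. The helper name `cached_urls` and the cache layout (file name directly under the cache base) are assumptions for illustration; a real build_def would pass the resulting list to remote_file in place of the single URL.

```python
def cached_urls(url, cache_base):
    """Return the cache URL(s) followed by the original URL(s) (hypothetical helper)."""
    # Accept either a single URL or a list of URLs.
    urls = [url] if isinstance(url, str) else list(url)
    base = cache_base.rstrip("/")
    # Assumed layout: the file name sits directly under the cache base.
    cached = [f"{base}/{u.rpartition('/')[2]}" for u in urls]
    # Cache entries come first so they are tried before the originals.
    return cached + urls


if __name__ == "__main__":
    print(cached_urls("http://google.com/file.a",
                      "https://some-cloud-provider.com/some-bucket-name"))
```

This keeps the prioritisation logic in one place without changing remote_file itself.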