Enhancement: Add `WGET_EXTRA_ARGS`, `CURL_EXTRA_ARGS`, `SINGLEFILE_EXTRA_ARGS` to extend default args without overriding defaults

Open ntevenhere opened this issue 3 years ago • 1 comments

These WGET_ARGS, CURL_ARGS, etc. options let the user shoot themselves in the foot, silently. I think the documentation or the variables themselves should be changed to be more ergonomic.

Did you need to add a header to wget?

WGET_ARGS=['--header=Accept-Language: en-US,en']

Look at this configuration: it looks inoffensive. Checking the documentation, nothing tips you off that you're using it wrong. But no. After this, your 🆆 button won't take you to the main HTML you archived, and wget archives things slightly differently, silently.

Why? By setting WGET_ARGS you overwrote vital settings, because the defaults are also stored in WGET_ARGS. The documentation doesn't tell you about this. This happened to me: I just happily overwrote the variable, when I should've written something like this:

WGET_ARGS=['--header=Accept-Language: en-US,en;q=0.5', '--no-verbose', '--adjust-extension', '--convert-links', '--force-directories', '--backup-converted', '--span-hosts', '--no-parent', '-e', 'robots=off']
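To make the difference between the two configurations above concrete, here is a minimal Python sketch. It assumes the default list is exactly the set of flags quoted in the longer configuration (minus the header); it is not read from ArchiveBox itself. `DEFAULT_WGET_ARGS` and `missing_defaults` are hypothetical names used only for illustration.

```python
# Assumption: these are the defaults, taken from the longer
# configuration quoted in this issue, not from ArchiveBox's source.
DEFAULT_WGET_ARGS = [
    '--no-verbose',
    '--adjust-extension',
    '--convert-links',
    '--force-directories',
    '--backup-converted',
    '--span-hosts',
    '--no-parent',
    '-e', 'robots=off',
]


def missing_defaults(user_args, defaults=DEFAULT_WGET_ARGS):
    """Return the default flags that a user-supplied list no longer contains."""
    return [arg for arg in defaults if arg not in user_args]


# The innocent-looking configuration: every default is silently dropped.
naive = ['--header=Accept-Language: en-US,en']
print(missing_defaults(naive))

# The configuration that keeps the defaults: nothing is dropped.
careful = ['--header=Accept-Language: en-US,en;q=0.5'] + DEFAULT_WGET_ARGS
print(missing_defaults(careful))
```

A check like this could also back a startup warning when a user-set `WGET_ARGS` omits defaults, though that goes beyond what this issue proposes.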

Proposal:

  1. The documentation for *_ARGS should have a good warning or display the default value, so that users can suspect they're overwriting something.
  2. Or, create EXTRA_WGET_ARGS, EXTRA_CURL_ARGS, and so on. These won't overwrite the now-considered low-level *_ARGS. EXTRA_*_ARGS shall be the more user-facing option and more promoted in the documentation.
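The second option could work roughly as in the sketch below. This is only an illustration of the proposed semantics, not ArchiveBox's implementation: `build_args` and its parameters are hypothetical, and the default list shown is a shortened placeholder.

```python
def build_args(defaults, override=None, extra=None):
    """Combine low-level *_ARGS with user-facing EXTRA_*_ARGS.

    - override (the *_ARGS setting) replaces the defaults entirely when set.
    - extra (the EXTRA_*_ARGS setting) is appended and never removes anything.
    """
    base = list(defaults) if override is None else list(override)
    return base + list(extra or [])


# Placeholder defaults for illustration only.
defaults = ['--no-verbose', '--convert-links']

# Typical user: only adds a header, defaults stay intact.
args = build_args(defaults, extra=['--header=Accept-Language: en-US,en'])
print(args)

# Advanced user: deliberately replaces the defaults.
args_low_level = build_args(defaults, override=['--quiet'])
print(args_low_level)
```

Keeping both settings means users who really want to replace the defaults still can, while the common case of adding one flag no longer requires knowing what the defaults are.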

ntevenhere avatar Sep 12 '22 21:09 ntevenhere

I'm ok with the EXTRA_ options as I want users to be able to override the defaults.

pirate avatar Sep 28 '22 01:09 pirate