
New feature: compress input before encryption

Open jomiq opened this issue 8 months ago • 1 comments

Use gzip to compress input before encryption

Quick and dirty implementation just to see if #60 is worthwhile.

  • Compression is disabled by default
  • Pass the -c flag to enable it (also added to --help)
  • Compression uses node/zlib
  • Decompression uses the first random gist that worked
  • The decoder looks for the data-c attribute on the <pre> tag. This can be extended to other compression algorithms.
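
As a rough illustration of the list above, here is a minimal Node-side sketch of the opt-in compression step. The helper names `prepareInput` and `restoreInput` and the attribute value `gzip` are assumptions made for the example; the PR only says that a data-c attribute is present on the <pre> tag. Only `gzipSync` and `gunzipSync` are real node:zlib calls.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Hypothetical helper: optionally gzip the page before it is encrypted.
// `attr` is the marker a decoder could look for on the <pre> tag;
// the value "gzip" is an assumption, not taken from the PR.
function prepareInput(html: string, compress: boolean): { data: Buffer; attr: string } {
  const raw = Buffer.from(html, "utf8");
  if (!compress) {
    return { data: raw, attr: "" }; // compression is off by default
  }
  return { data: gzipSync(raw), attr: ' data-c="gzip"' };
}

// Hypothetical inverse: undo the compression only when the marker was present.
function restoreInput(data: Buffer, compressed: boolean): string {
  return (compressed ? gunzipSync(data) : data).toString("utf8");
}
```

In the browser, decompression needs a different implementation (the PR uses a gist for this), so `restoreInput` only demonstrates the round trip in Node.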

Results

I first tried compressing test/test-big.html, but because that file is already high entropy it only compresses down to 99.82% of the input size. So I looked at the list of longass Wikipedia articles and downloaded this gem to use as more plausible data.

I included the page in this PR as test/test-compress.html for future reference.

🥇 It compresses down to 15% 💯

Work needed

The compressed output needs to be converted to a string before encryption, and vice versa. We should really just feed the buffer to the crypto functions directly to avoid base64 encoding/decoding twice, but I'm not smart enough to figure out how to do that.
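
One possible way to skip the intermediate string is sketched below, assuming the crypto functions are Web Crypto calls that accept binary input. The PBKDF2 and AES-GCM parameters, the iteration count, and the helper names `deriveKey`, `encryptCompressed` and `decryptCompressed` are illustrative assumptions, not pagecrypt's actual code.

```typescript
import { webcrypto } from "node:crypto";
import { gzipSync, gunzipSync } from "node:zlib";

const { subtle } = webcrypto;

// Assumed value for illustration only; not taken from pagecrypt.
const ITERATIONS = 100_000;

type Box = { salt: Uint8Array; iv: Uint8Array; ciphertext: Uint8Array };

// Hypothetical helper: derive an AES-GCM key from a password with PBKDF2.
async function deriveKey(password: string, salt: Uint8Array) {
  const material = await subtle.importKey(
    "raw", new TextEncoder().encode(password), "PBKDF2", false, ["deriveKey"]);
  return subtle.deriveKey(
    { name: "PBKDF2", salt, iterations: ITERATIONS, hash: "SHA-256" },
    material,
    { name: "AES-GCM", length: 256 },
    false,
    ["encrypt", "decrypt"]);
}

async function encryptCompressed(html: string, password: string): Promise<Box> {
  const salt = webcrypto.getRandomValues(new Uint8Array(16));
  const iv = webcrypto.getRandomValues(new Uint8Array(12));
  const key = await deriveKey(password, salt);
  // gzipSync returns a Buffer, which encrypt() accepts directly as bytes,
  // so no string or base64 conversion is needed in between.
  const ciphertext = await subtle.encrypt({ name: "AES-GCM", iv }, key, gzipSync(html));
  return { salt, iv, ciphertext: new Uint8Array(ciphertext) };
}

async function decryptCompressed(box: Box, password: string): Promise<string> {
  const key = await deriveKey(password, box.salt);
  // decrypt() resolves to an ArrayBuffer that can go straight into gunzipSync.
  const plain = await subtle.decrypt({ name: "AES-GCM", iv: box.iv }, key, box.ciphertext);
  return gunzipSync(new Uint8Array(plain)).toString("utf8");
}
```

With this shape, base64 would only be applied once, when the final ciphertext is embedded in the output page.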

Also, proper benchmarking

jomiq, Aug 10 '25 21:08

I made a version without the silly encode/decode duplication here.

I also did some primitive benchmarking, and it does add some overhead (decompression is about 50 times slower than decryption). It's still peanuts compared to rendering and such: the example page takes 3500 ms to load, and pagecrypt uses about 60 of those.
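
For anyone who wants to repeat this kind of measurement, a minimal timing sketch follows. It runs in Node rather than in a browser, uses synthetic repetitive input instead of the test page from the PR, and the helper `timeIt` is a hypothetical name, so its numbers are not comparable to the figures above.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";
import { performance } from "node:perf_hooks";

// Hypothetical helper: average wall-clock time of fn over `runs` calls, in ms.
function timeIt(fn: () => void, runs = 50): number {
  fn(); // one warm-up call so one-time setup cost is not counted
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn();
  return (performance.now() - start) / runs;
}

// Synthetic, highly repetitive input; a real page will compress less well.
const page = Buffer.from("<p>Lorem ipsum dolor sit amet</p>\n".repeat(50_000));
const packed = gzipSync(page);

const ratio = packed.length / page.length;
const gunzipMs = timeIt(() => { gunzipSync(packed); });

console.log(`compressed to ${(ratio * 100).toFixed(2)}% of input size`);
console.log(`gunzip: ${gunzipMs.toFixed(3)} ms per run`);
```

Timing the decryption step the same way, on the same machine, would give the ratio between the two.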

jomiq, Aug 13 '25 21:08