
exclude canonical links from cache busting

Open davidt-de opened this issue 5 years ago • 2 comments

I stumbled across this request under an already closed issue while googling a solution for my problem. Any thoughts on this? It would be very helpful, at least for me.

[...] I think it should be nice to exclude other links as for example canonical links:

<link rel="canonical" href="https://www.mysite.com/canonical?ckcachebust=575978862">

Thanks!

Originally posted by @nye in https://github.com/bdkjones/CodeKit/issues/409#issuecomment-479428020

davidt-de avatar Oct 22 '20 19:10 davidt-de

I can expose an option to provide exclusions for the cache-buster. My absolute #1 goal with the cache-buster was to be insanely fast. As such, the algorithm does not actually parse the HTML; it simply rips through the text of the file and finds links. The whole thing is written in straight C.

While that's definitely the fastest possible way to do this, it does eliminate options that fully parsing the page would yield, such as inspecting other attributes of a link (e.g. rel="canonical").

To keep the speed but still provide control, the best approach is to expose a list of possible exclusions.

bdkjones avatar Oct 23 '20 01:10 bdkjones
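
For readers following along, here is a minimal sketch of what an exclusion list could look like in a text-scanning cache-buster of the kind described above. It is not CodeKit's actual code. The function names (`is_excluded`, `bust_links`), the substring-matching rule, and the choice to handle only double-quoted `href` attributes are all assumptions made for illustration; only the `ckcachebust` parameter name comes from the example in this thread.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the URL contains any exclusion substring. */
static int is_excluded(const char *url, size_t len,
                       const char *const *excl, size_t n_excl)
{
    for (size_t i = 0; i < n_excl; i++) {
        size_t elen = strlen(excl[i]);
        if (elen == 0 || elen > len)
            continue;
        for (size_t j = 0; j + elen <= len; j++) {
            if (memcmp(url + j, excl[i], elen) == 0)
                return 1;
        }
    }
    return 0;
}

/*
 * Hypothetical entry point: copies `in` to `out` without parsing the HTML,
 * appending a ckcachebust parameter to every double-quoted href value that
 * is not on the exclusion list.
 * Returns 0 on success, -1 if `out` is too small.
 */
int bust_links(const char *in, const char *token,
               const char *const *excl, size_t n_excl,
               char *out, size_t out_cap)
{
    static const char attr[] = "href=\"";
    const size_t attr_len = sizeof attr - 1;
    size_t o = 0;

    if (out_cap == 0)
        return -1;

    while (*in) {
        if (strncmp(in, attr, attr_len) == 0) {
            const char *url = in + attr_len;
            const char *end = strchr(url, '"');
            if (end == NULL)
                break; /* unterminated attribute: copy the rest untouched */

            size_t url_len = (size_t)(end - url);
            int skip = is_excluded(url, url_len, excl, n_excl);
            /* Use '&' if the URL already carries a query string. */
            const char *sep = memchr(url, '?', url_len) ? "&ckcachebust="
                                                        : "?ckcachebust=";
            int n = snprintf(out + o, out_cap - o, "%s%.*s%s%s\"",
                             attr, (int)url_len, url,
                             skip ? "" : sep, skip ? "" : token);
            if (n < 0 || (size_t)n >= out_cap - o)
                return -1;
            o += (size_t)n;
            in = end + 1;
        } else {
            if (o + 1 >= out_cap)
                return -1;
            out[o++] = *in++;
        }
    }

    /* Copy whatever is left (nothing, or an unterminated attribute). */
    size_t rest = strlen(in);
    if (rest >= out_cap - o)
        return -1;
    memcpy(out + o, in, rest + 1);
    return 0;
}
```

Calling `bust_links` with an exclusion list containing `https://www.mysite.com/canonical` would leave that URL alone while still tagging stylesheet and script links. The cost is one extra substring scan per link per exclusion entry, which is the speed trade-off mentioned above.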

Hi Bryan,

the list would be a great option!

Thanks George

davidt-de avatar Oct 23 '20 03:10 davidt-de

Just wanting to add my +1 to this issue, despite it being old. I'm stuck between adding hashes to canonicals or not using the cache buster at the moment.

Is the exclusion list for cache-busting a task you haven't gotten round to yet, or one that isn't worth the time?

jamiedumont avatar Jul 11 '23 06:07 jamiedumont

Well, it’s definitely an edge case. RegEx matching is godawful slow and the cache buster is currently written in plain C so that it’s as fast as possible. It doesn’t parse the HTML or create a DOM or even worry about Unicode (because no Unicode code points above ASCII are valid in URIs). I can add exceptions, but that trades away some of that pure speed. Why not just exempt the entire file from cache-busting and then purge the cache at the CDN layer once you deploy? What are we trying to exempt from cache-busting?

bdkjones avatar Jul 11 '23 09:07 bdkjones

Hey @bdkjones!

I was using cache busting on HTML files for URIs to CSS and JS files, but was getting my canonical tag cache busted which made a mess of SEO, etc.

I was previously deploying to a plain nginx server, so needed cache-busting within my build step. I'm now using a CDN, which as you say solves the cache-busting problem.

I understand the desire for outright speed on a feature like this, but surely almost any page that needs to bust URIs to CSS and JS also includes a canonical link that gets clobbered too?

jamiedumont avatar Aug 02 '23 15:08 jamiedumont

Okay. It had been a hot second since I'd written any C, so I decided to knock this out. CodeKit will now skip cache-busting for any link tag with rel="canonical":

I'll release this update soon.

[Screenshot 2023-08-07 at 22 13 08]

bdkjones avatar Aug 08 '23 05:08 bdkjones
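
The screenshot is not reproduced here, so the following is only a rough sketch of how skipping `rel="canonical"` link tags might be done without a full HTML parse. It is not the code that shipped in CodeKit. The names (`OutBuf`, `find_in`, `is_canonical_link`, `bust_skip_canonical`) are hypothetical, and the sketch assumes lowercase, double-quoted attributes and no `>` characters inside attribute values.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded output buffer; `overflow` latches once space runs out. */
typedef struct { char *data; size_t cap; size_t len; int overflow; } OutBuf;

static void append(OutBuf *b, const char *s, size_t n)
{
    if (b->overflow || n >= b->cap - b->len) {
        b->overflow = 1;
        return;
    }
    memcpy(b->data + b->len, s, n);
    b->len += n;
    b->data[b->len] = '\0';
}

/* Bounded substring search (standard C has no memmem). */
static const char *find_in(const char *hay, size_t hay_len, const char *needle)
{
    size_t n = strlen(needle);
    for (size_t i = 0; n > 0 && i + n <= hay_len; i++) {
        if (memcmp(hay + i, needle, n) == 0)
            return hay + i;
    }
    return NULL;
}

/* True if the tag is a <link ...> tag carrying rel="canonical". */
static int is_canonical_link(const char *tag, size_t tag_len)
{
    return tag_len >= 5 && strncmp(tag, "<link", 5) == 0 &&
           find_in(tag, tag_len, "rel=\"canonical\"") != NULL;
}

/*
 * Walks the text tag by tag and appends ?ckcachebust=<token> to the href of
 * every tag except canonical links. Returns 0 on success, -1 if `out` is
 * too small.
 */
int bust_skip_canonical(const char *in, const char *token,
                        char *out, size_t out_cap)
{
    static const char param[] = "?ckcachebust=";
    OutBuf b = { out, out_cap, 0, 0 };

    for (;;) {
        const char *lt = strchr(in, '<');
        const char *gt = lt ? strchr(lt, '>') : NULL;
        if (gt == NULL)
            break; /* no further complete tags */

        size_t tag_len = (size_t)(gt - lt) + 1;
        const char *href = find_in(lt, tag_len, "href=\"");
        const char *url = href ? href + 6 : NULL;
        const char *end = url ? memchr(url, '"', (size_t)(gt - url)) : NULL;

        append(&b, in, (size_t)(lt - in)); /* text before the tag */
        if (end != NULL && !is_canonical_link(lt, tag_len)) {
            append(&b, lt, (size_t)(end - lt));      /* up to the closing quote */
            append(&b, param, sizeof param - 1);
            append(&b, token, strlen(token));
            append(&b, end, (size_t)(gt - end) + 1); /* closing quote to '>' */
        } else {
            append(&b, lt, tag_len);                 /* copy the tag verbatim */
        }
        in = gt + 1;
    }
    append(&b, in, strlen(in)); /* trailing text after the last tag */
    return b.overflow ? -1 : 0;
}
```

Because the check looks at the whole tag rather than just the URL, it works whether `rel` appears before or after `href`, which a plain URL exclusion list cannot guarantee.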

Sorry, I’ve somehow missed this! Thanks so much!

jamiedumont avatar Dec 10 '23 10:12 jamiedumont