varnish-cache beresp 304 header merging makes impossible to distinguish between headers set by varnish and backend

Reported by @martin-uplex in https://varnish-cache.org/lists/pipermail/varnish-dev/2019-October/009469.html

Basically, the way we merge headers for 304 responses makes it impossible to distinguish headers originating from the backend from ones which we set in VCL (for the previous response).

Oct 21 '19 09:10 nigoroll

Just to make sure I understand the scenario: The Cache-Control: nocache is created in VCL ?

Oct 21 '19 11:10 bsdphk

We talked at one point about 304 responses getting their own dedicated VCL method, where both beresp.* (RW) and obj.* (RO) were available.

I wonder how much of the current C-lang 304 logic we could implement in builtin_vcl if we did that ?

Oct 21 '19 11:10 bsdphk

Just to make sure I understand the scenario: The Cache-Control: nocache is created in VCL ?

Yes, correct.

Oct 21 '19 12:10 martin-uplex

@bsdphk yes to the first question: The scenario is to modify the headers of the cache-object with respect to the downstream behavior, but have a differing ttl/cacheable status. And yes, I think that an additional VCL method would help and IIUC this was my preferred option when we initially discussed the 304 implementation.

Oct 21 '19 12:10 nigoroll

But didnt we find out back then that we would still need C-magic to make cond-fetch work ?

Oct 21 '19 16:10 bsdphk

Discussed on IRC: I think this model should work:

before vcl_backend_response {} , for a 304, call vcl_backend_refresh {} with beresp.* (r/w) and obj.* (r/o)
builtin.vcl would just contain sub vcl_backend_refresh { return(deliver); }
the header merge, as exists now, would run after vcl_backend_refresh {}:
- beresp. headers always replace obj. headers
304-aware code would basically do almost all of the work (except deletion, see below) in vcl_backend_refresh {} and skip most of the processing in vcl_backend_response {} based on beresp.was_304
The only special case (which I see) would be header deletion, which requires cooperation between vcl_backend_refresh {} and vcl_backend_response {}: _refresh would set some marker (e.g. in a header or variable) and _response would do the actual deletion.

Unless I miss something, this solution should be 100% compatible with existing VCL and allow for advanced handling of 304s only where needed.

If there was a need, we could facilitate header deletion also by marking headers not to be merged (as a vmod function in tandem with some core code addition).

@martin-uplex can you please review if this suggestion would work for you?

Oct 21 '19 16:10 nigoroll

discussed with @nigoroll, he will follow up on this

suggestion in short: no vcl_backend_refresh, but make obj.* (r/o) available in vcl_backend_response, filled with same values as beresp.* if not 304

Oct 22 '19 10:10 martin-uplex

sub vcl_backend_revalidate {
    if (beresp.backend.name ~ "legacy") {
        return (replace);
    }
}

# built-in

sub vcl_backend_revalidate {
    beresp.merge(obj);
    return (reuse);
}

I have been confronted to misbehaving backends in the past, being able to ignore them would be a plus. You can already ignore them in v_b_response by removing condition headers. Nit-picking on the "v_b_refresh" name, and especially the return (deliver) transition if it's meant to lead to v_b_response.

Oct 22 '19 14:10 dridi

@Dridi how would you handle a misbehaving backend when you already have a 304 response (with no body)? I do not see how return (replace) vs. return (reuse) would work, if that is referring to the body.

regarding the header merge, making it explicit would be another option, which would also facilitate the header deletion, yes.

Oct 22 '19 14:10 nigoroll

Right, forget that part of my suggestion...

Oct 22 '19 15:10 dridi

Qs about beresp.merge(obj):

Can it be called with any other object besides obj? If so, then what other object could make sense as the argument? If not, then wouldn't it be better if there is no argument? Say beresp.merge() or beresp.merge_obj()?
What happens if it doesn't get called? Then varnishd does the default merging? Or does that mean you don't want the merge? In the latter case, the documentation should probably point out that the HTTP standard has some requirements about merging headers of the object to be validated with headers from the validating response, so it's a "use at your own risk" option. I would assume then that the default in builtin.vcl would be to call the merge.

Oct 22 '19 16:10 slimhazard

Can it be called with any other object besides obj?

Yes, it can be called in a different context with different arguments.

sub vcl_init {
    my_replicator.request(...).merge(req);
    my_replicator.send();
    # where the request() function returns an HTTP
}

What happens if it doesn't get called?

My suggestion to move (and expose) this to VCL introduces a risk. Much like I can break a bereq or a resp today by messing with it in pure VCL code. I'm definitely leaning towards the "use at your own risk" side.

Oct 22 '19 18:10 dridi

* What happens if it doesn't get called?

We have to choose if the default behavior is to remain in the core or be moved to the builtin VCL. In my humble opinion, if we introduce a .merge(), then we should not do any merging in the core, but do it in VCL.

On a return (reuse) we have the option of stealing the body and headers, RFC be damned.

... the documentation should probably point out that the HTTP standard has some requirements about merging headers of the object to be validated with headers from the validating response, so it's a "use at your own risk" option.

Yeah, this is true, and for many app writers, not well understood. Having the option to amend that in VCL is a natural request.

Oct 23 '19 08:10 hermunn

One use case we might also consider is not merging surrogate keys for example:

sub vcl_backend_revalidate {
    if (beresp.http.xkey) {
        # don't keep track of stale surrogate keys
        set beresp.http.new_xkey = beresp.http.xkey;
        beresp.merge(obj);
        set beresp.http.xkey = beresp.http.new_xkey;
        unset beresp.http.new_xkey;
        return (TBD); # my reuse vs replace transition didn't make sense
    }
}

Oct 23 '19 08:10 dridi

        set beresp.http.new_xkey = beresp.http.xkey;

Well, a std.collect is in order here, and this illustrates a maybe sore point in all of this - Varnish does not work great with repeated headers.

Oct 23 '19 08:10 hermunn

set beresp.http.new_xkey = beresp.http.xkey.collect();

# or

beresp.http.new_xkey.clone(beresp.http.xkey);

Now that we have type properties and methods, we can make vmod_header redundant by enhancing the VCL_HEADER and VCL_HTTP types.

Oct 23 '19 08:10 dridi

bugwash: @Dridi and myself agree that sub vcl_backend_refresh preferred

Dec 23 '19 14:12 nigoroll

Discussed with @Dridi and @bsdphk

We agree that a new sub vcl_backend_refresh is the best option
- return values: ok/retry/abandon/fail
The 304 handling should move to vcl, by default calling merge_304;

I will prepare some mock up vcl illustrating how the relevant use cases will look like

@martin-uplex es case
Etag mangling for (un)gzip-ing
#3169

Also:

read https://tools.ietf.org/html/draft-ietf-httpbis-semantics-06

Feb 07 '20 15:02 nigoroll

FYI, working on the promised VCL mockup proposal made me realize that not even the current master version of the cache rfc can possibly work:

For each stored response identified for update, the cache MUST use the header fields provided in the 304 (Not Modified) response to replace all instances of the corresponding header fields in the stored response.

As noticed here and in #3169, if origin servers wrongly update some headers (like Content-Length, Content-Encoding or maybe Content-Type), bad things might happen.

The corresponding discussion seems to take place in https://github.com/httpwg/http-core/issues/165 and the latest sensible proposal seems to be https://github.com/httpwg/http-core/pull/337

I will follow this proposal and maybe participate in the discussion if need to.

Mar 22 '20 12:03 nigoroll

TODO also: consider https://cache-tests.fyi/#conditional when writing the VIP

Apr 17 '20 12:04 nigoroll

Bugwash suggestion:

# built-in
sub vcl_backend_refresh {
        return (merge);
}

In this subroutine we have access to beresp (read-write) and obj_stale (read-only) and the available transitions are:

merge (merge obj_stale with beresp)
replace (take beresp as-is)
fail
abandon
error?

Sep 27 '23 12:09 dridi

VDD CONSENSUS:

1.Do not add new functionality unless an implementor cannot complete a real application without it.

vcl_backend_refresh
beresp r/w access
obj_stale r/o access
return (merge, obj_stale, beresp, fail, abandon, error, retry);

merge: existing logic

change status to 200 if beresp.status == 304
header merge logic as is

obj_stale: undo the result of the sub, just copy everything from obj_stale

beresp: use beresp as it is (including status)

beresp.was_304 is going to stay true

Sep 27 '23 13:09 nigoroll