Implementing the Cache-Control "must-revalidate" directive #3364

Open
ThijsFeryn wants to merge 1 commit into master from must-revalidate

Conversation

ThijsFeryn (Contributor)

This PR implements the must-revalidate directive as described in the spec: when must-revalidate is part of the Cache-Control header, no stale data may be served from cache.

In terms of Varnish, this means grace is set to zero.

Example header

This is what such a header could look like:

Cache-Control: public, max-age=3600, must-revalidate
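
As an illustration only, the equivalent effect can be approximated in VCL; the actual change in this PR lives in core code (cache_rfc2616.c), and the backend address below is just a placeholder:

vcl 4.1;

# Placeholder backend, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Approximation of this PR's behavior: when the origin sends
    # must-revalidate, never serve stale content, i.e. drop grace to zero.
    # (A production implementation should tokenize the header rather than
    # rely on a substring match.)
    if (beresp.http.Cache-Control ~ "must-revalidate") {
        set beresp.grace = 0s;
    }
}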

Combining must-revalidate and stale-while-revalidate

Because this if-clause is placed after the stale-while-revalidate check, any occurrence of must-revalidate will supersede stale-while-revalidate.

If, for some reason, the backend sends back both directives, grace is still set to zero, as illustrated in the example below:

Cache-Control: public, max-age=3600, must-revalidate, stale-while-revalidate=30
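
Expressed as the same kind of VCL sketch (assuming vmod_std for duration parsing and the same placeholder backend as above), the ordering means the stale-while-revalidate value is applied first and must-revalidate then overrides it:

vcl 4.1;

import std;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # stale-while-revalidate first: use its value as grace when present.
    if (beresp.http.Cache-Control ~ "stale-while-revalidate=[0-9]+") {
        set beresp.grace = std.duration(
            regsub(beresp.http.Cache-Control,
                ".*stale-while-revalidate=([0-9]+).*", "\1") + "s",
            beresp.grace);
    }
    # must-revalidate second, so it wins whenever both directives are present.
    if (beresp.http.Cache-Control ~ "must-revalidate") {
        set beresp.grace = 0s;
    }
}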

This is probably a breaking change

Because this PR changes how grace is set, it could be considered a breaking change.

If you accept this patch, I'm not sure which future release of Varnish it would end up in.

@dridi (Member) commented Jul 16, 2020

I added this pull request to #3246 (7.0 Compliance improvements).

ThijsFeryn force-pushed the must-revalidate branch 2 times, most recently from 5faaddf to a34ec5c on July 16, 2020 at 10:13
ThijsFeryn force-pushed the must-revalidate branch 2 times, most recently from 62a8808 to 3ea49a8 on July 16, 2020 at 13:27
bin/varnishd/cache/cache_rfc2616.c: review thread resolved (outdated)
@dridi (Member) commented Jul 20, 2020

One thing that should be noted is the risk of serialization introduced by this change, negating the non-zero default_grace:

Cache-Control: public, must-revalidate

Such a change should get a fair share of red tape in the release notes and relevant documentation.

For this use case we would need a non-zero keep, and in the spirit of better compliance we could change the default timings:

  • default_ttl=0
  • default_grace=0
  • default_keep=2m

This way we could cache resources with no TTL, based on the revalidation status (200 or 304). This would however be a major VCL breaking change, since a zero TTL is interpreted today as uncacheable. To avoid serialization it would be necessary to wake up anyone on the waiting list regardless of the outcome (200 or 304), so they can be served a fresh response.

Even a negative TTL shouldn't be interpreted as uncacheable (and more precisely unstorable) if we allow grace and keep.

That doesn't change my review; I just wanted to mention the risks and where we could go if we dial this compliance knob up to eleven. I had mentioned some of that to @ThijsFeryn prior to his submission but didn't have time to elaborate here last week.

@rezan (Member) commented Jul 20, 2020

the risk of serialization introduced by this change

Request serialization can only happen when you set a zero TTL. Having a zero grace value does not affect this, so it's safe.

@dridi (Member) commented Jul 20, 2020

Request serialization can only happen when you set a zero TTL. Having a zero grace value does not affect this, so it's safe.

This is prevented with the current default_ttl and default_grace, but if you touch those defaults (as a Varnish user) and override the initial beresp.grace (directly from the backend) via the must-revalidate directive, you introduce this risk. That's why I'm advocating for significant red tape in the docs, because it can be difficult to diagnose.

This is what I meant by "risk": we are pulling the rug out from under existing VCL.

And I guess I forgot some thoughts:

It can be easy to turn a cacheable response into a hitmiss object because the TTL is negative, especially in a tiered setup where Varnish's backend might be another Varnish instance serving a graced object. Making an object storable based on ttl+grace+keep would make this a non-issue.

The Cache-Control: public, must-revalidate serialization risk also goes away if ttl+grace+keep is enough to store a response since we have a usable cache insertion regardless of the 200 vs 304 outcome, which should be enough to wake clients on the waiting list.

Still, it breaks the assumption that a zero or negative TTL can be used to tell the built-in VCL and the core code that a response is uncacheable:

varnishtest "cache negative ttl"

server s1 {
        rxreq
        txresp -hdr {ETag: "123abc"} -body helloworld

        rxreq
        expect req.http.If-None-Match == {"123abc"}
        txresp -status 304 -hdr {ETag: "123abc"}
} -start

varnish v1 -vcl+backend {
        sub vcl_backend_response {
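                # A negative TTL with zero grace but a two-minute keep: the object
                # is useless for normal delivery but is retained for revalidation.
                # return (deliver) bypasses the built-in hit-for-miss logic.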
                set beresp.ttl = -1s;
                set beresp.grace = 0s;
                set beresp.keep = 2m;
                return (deliver);
        }
} -start

client c1 -repeat 2 {
        txreq
        rxresp
        expect resp.body == helloworld
} -run

Such a scenario would pass with this change in core code:

--- i/bin/varnishd/cache/cache_hash.c
+++ w/bin/varnishd/cache/cache_hash.c
@@ -435,7 +435,7 @@ HSH_Lookup(struct req *req, struct objcore **ocp, struct objcore **bocp)
                        continue;
                }
 
-               if (oc->ttl <= 0.)
+               if (oc->ttl + oc->grace + oc->keep <= 0.)
                        continue;
 
                if (BAN_CheckObject(wrk, oc, req)) {

And we'd need something similar in the built-in VCL.
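
To make the idea concrete, here is a rough sketch of what that built-in VCL change might look like, assuming the hit-for-miss condition is relaxed in the same ttl+grace+keep spirit (a simplified excerpt with the other hit-for-miss triggers such as Set-Cookie elided, not the actual built-in code):

sub vcl_backend_response {
    if (bereq.uncacheable) {
        return (deliver);
    } else if (beresp.ttl + beresp.grace + beresp.keep <= 0s) {
        # Hit-For-Miss for the next 2 minutes, mirroring the stock built-in
        # VCL, but only once the whole ttl+grace+keep lifetime is gone.
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}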

That's it for my compliance detour; I hope I clarified my previous comment. I'm also fully aware of my lack of scrutiny: I haven't researched all the problems that this suggestion could introduce, so for now I'm just putting it on the proverbial table.
