I wrote:
> (1) Semantics: the HTTP protocol needs to support these cases
> in request messages:
>
> (Ajeff) I want a firsthand response from the origin server.
>
> (Bjeff) I want a response that has been validated with the
> origin server after I made this request.
>
> (Cjeff) I want to validate my own cached response with
> the origin server.
>
> (Djeff) I want to validate my own cached response, but I'll
> accept a non-authoritative answer (i.e., from a cache
> that has a still-fresh copy with a matching validator)
>
> (Ejeff) I want a fresh response.
>
> Note that I've arranged these in roughly decreasing order of "paranoia".

Roy wrote: The way you phrased the above sentences would lead you down the wrong track in trying to determine the actual semantics. Try this:

(Aroy) I want a new copy from the origin server.

(Broy) I want a response from the origin server
    based on the parameters of this request.

(Croy) I want a new copy from the next inbound cache or origin.

(Droy) I want a response from the next inbound cache or origin
    based on the parameters of this request.

(Eroy) I want whatever is currently available.

Note that I have relabelled these ("Ajeff" and "Aroy") because in some cases there is a correspondence between my letters and Roy's letters, and in other cases there is not.

Roy has pointed out some things that I was missing, but has managed to omit some things that I think are important. So let's try again (and I'll use a disjoint set of names):

(Ax) I want a new copy from the origin server.
    (Same as Aroy, similar to Ajeff)

(Bx) I want a response that is either from a cache or
from the origin server, based on the parameters of this
request, that is either a new copy from the origin server,
or one that the origin server considers valid.
    (Bjeff, restated to make it clearer, and I
    think what Roy meant by Broy if the client
    does not have a cache entry)

(Cx) I want to validate my own cached response with
the origin server.  I don't care what any intervening
cache has.
    (Cjeff, restated to make it clearer, and I
    think what Roy meant by Broy if the client
    has a cache entry)

(Dx) I want to validate my own cached response, but I'll
accept either a non-authoritative answer (i.e., from a cache
that has a still-fresh copy with a matching validator) or
an authoritative response from the origin server, whichever
comes first.
    (Djeff, restated to make it clearer, and I think
    what Roy meant by Droy if the client has a cache
    entry)

(Ex) I don't have a cache entry locally, so I want a fresh
response (either a non-authoritative but fresh response
from an inbound cache, or an authoritative response from
the origin server), whichever comes first.
    (Ejeff, restated to make it clearer, and I think
    what Roy meant by Croy if the client does not
    have a cache entry)

(Fx) I want whatever is currently available, and I'm
willing to settle for a stale response if that's the best
I can have.  I don't have a local cache entry.
    (Eroy, I think.  I think this depends on our
    fundamental disagreement about semantic transparency)

(Gx) I want whatever is currently available, and I'm
willing to settle for a stale response if that's the best
I can have.  I have a local cache entry, so if what
you have is what I already have, don't send it again.
    (new)
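
As a minimal sketch of how Ax-Gx differ from one another, the cases can be tabulated along four axes. The field names below are illustrative assumptions chosen for this table, not proposed protocol elements, and "None" marks a property that the definition above leaves open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Case:
    name: str
    new_copy_only: bool              # a bare "still valid" answer is not enough
    has_local_entry: Optional[bool]  # None: the definition does not say
    origin_consulted: bool           # the origin server must be contacted
    stale_ok: bool                   # the client will settle for a stale response


CASES = [
    Case("Ax", True,  None,  True,  False),  # new copy from the origin server
    Case("Bx", False, False, True,  False),  # new or origin-validated, no local entry
    Case("Cx", False, True,  True,  False),  # validate my entry with the origin server
    Case("Dx", False, True,  False, False),  # validate my entry, a fresh cache may answer
    Case("Ex", False, False, False, False),  # fresh response, a fresh cache may answer
    Case("Fx", False, False, False, True),   # whatever is available, no local entry
    Case("Gx", False, True,  False, True),   # whatever is available, skip what I have
]

for case in CASES:
    print(case)
```

No two rows are identical, which is the sense in which the seven cases are distinct.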

Roy, does this cover the space? I'm NOT asking you to rearrange things again, I'm asking whether I left anything out.

If it does cover the space, feel free to rearrange things, but then please do not leave anything out when you do that!

On to (2) Terminology:

I originally had

Ajeff = Reload
Bjeff = Revalidate
Cjeff = Specific revalidate
Djeff = Conditional method
Ejeff = Normal method

Roy proposed

Aroy = Reload
Broy = Refresh (whether or not it is valid is immaterial)
Croy = Refresh from origin (Revalidate is okay, but less descriptive)
Droy = Conditional GET
Eroy = GET

Equating Eroy with specific syntax is precisely what I was trying to avoid (describing these cases in terms of how one of us thinks they ought to be implemented in HTTP/1.1) because it forces a particular syntax on us. Which is precisely what Roy is trying to do, I guess, but I'm not going to follow that lead.

As for Roy's comment on Broy ("whether or not it is valid is immaterial"): what is the point of asking the origin server for a response if the result isn't "valid"? This is based on my definition for "valid":

valid
    A cached entity is valid, with respect to a given request
    at a given time, if it is exactly what the origin server
    would return in response to that request at that time.

(Roy's definition, offered on 2 Feb 1996, is 5 times as long but specifically includes my definition as one alternative. So if it isn't valid by Roy's definition, it cannot be valid by mine either.)

Let me try again:

Ax = Reload
Bx = Revalidate
Cx = Specific Revalidate
Dx = Conditional method
Ex = Unconditional method
Fx = Unconditional method, stale responses allowed
Gx = Conditional method, stale responses allowed

For Ax through Ex, I'm asserting that stale responses are not allowed. Even Roy has to agree with that on Ax through Cx, because these are defined as contacting the origin server. And I hope Roy can agree with me that it makes sense to allow the user (the ultimate ruler, in Roy's model) to decide whether or not a stale response should be accepted, so this means that Dx and Ex are just as important as Fx and Gx.

One could insert the word "answerable-from-cache" in front of the names of Dx-Gx, but I won't bloat this terminology like that.

(3) Syntax:

Because Roy started by redefining what I meant by (B), (C), etc., he then had to take issue with what I wrote as his syntax for these. Which makes it pointless for me to quibble with his response, since it's at cross purposes with what I was asking him.

Again, if I understand Roy's spec correctly, this is how he would implement Ax-Gx:

Ax = Reload
     GET /home.html HTTP/1.1
     Cache-control: no-cache

[Roy agrees already]

Bx = Revalidate
     GET /home.html HTTP/1.1
     Cache-control: max-age=0

Bx cannot include a validator, because of the way I defined it: the client does not have a validator to offer. If your syntax cannot express case Bx, then you have a problem to fix.

Cx = Specific Revalidate
        GET /home.html HTTP/1.1
        <some-conditional-header>: <cache-validation condition>
        Cache-control: no-cache

[Roy agrees already]

Now we get into a touchy area. Roy writes:

    I'm not sure why you have it that HTTP will only transfer fresh
    entities by default.

This is precisely our argument on transparency. I've already argued repeatedly that it is wrong (in the technical/moral/ethical sense) for a cache to return a response that it knows is stale to a user that expects transparency. I suppose Roy doesn't expect that users (by default) want transparency.

    If you change that default, then you also
    need a new syntax for obtaining the old default.

... just keep reading, Roy!

Dx = Conditional method
        GET /home.html HTTP/1.1
        <some-conditional-header>: <cache-validation condition>

Ex = Unconditional method
        GET /home.html HTTP/1.1

In my view, these are saying "I want transparency" by default.

Fx = Unconditional method, stale responses allowed
        GET /home.html HTTP/1.1
        Cache-control: stale-max=630000

Gx = Conditional method, stale responses allowed
        GET /home.html HTTP/1.1
        <some-conditional-header>: <cache-validation condition>
        Cache-control: stale-max=630000

These are what you asked for ("a new syntax for obtaining the old default") except that here I have arbitrarily limited the duration of staleness to roughly 7 days. I suppose you could instead send stale-max=1576800000, which is about 50 years (ignoring leap years) and is representable in a signed 32-bit field. I trust this is sufficient for your purposes?
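
The arithmetic behind those two stale-max values can be checked with a short sketch. It assumes, as the examples above do, that stale-max is expressed in seconds; the constant names are made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

seven_days = 7 * SECONDS_PER_DAY            # exactly one week, in seconds
fifty_years = 50 * 365 * SECONDS_PER_DAY    # ignoring leap years
INT32_MAX = 2**31 - 1                       # largest signed 32-bit value

print(seven_days)                           # 604800
print(630000 / SECONDS_PER_DAY)             # a little over 7 days
print(fifty_years, fifty_years <= INT32_MAX)
```

So 630000 is slightly more than a week (a week is 604800 seconds), and the 50-year value fits in a signed 32-bit field with room to spare.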

Next I tried to propose an alternative syntax. I've already had to use it for Fx and Gx, because Roy's syntax cannot distinguish between Dx and Gx or between Ex and Fx. "Cache-control: max-age" simply doesn't make this distinction; it makes a different one.

For Ax, where Roy would send

        GET /home.html HTTP/1.1
        Cache-control: no-cache

I would send

        GET /home.html HTTP/1.1
        Cache-control: reload

but that is just a spelling difference.

For Bx, where Roy would send (I think)

        GET /home.html HTTP/1.1
        Cache-control: max-age=0

I would send

        GET /home.html HTTP/1.1
        Cache-control: revalidate

which is close to being a spelling difference, if one is careful about comparing the age of a response that has been received within the past second.

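For Cx, where Roy would send

        GET /home.html HTTP/1.1
        <some-conditional-header>: <cache-validation condition>
        Cache-control: no-cache

I would send

        GET /home.html HTTP/1.1
        <some-conditional-header>: <cache-validation condition>
        Cache-control: revalidate

which I think is also basically a spelling difference.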
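
To make the comparison concrete, here is a rough sketch that builds the request for each of Ax-Gx under either spelling. It assumes If-Modified-Since stands in for the conditional header, and it uses the stale-max directive for Fx and Gx under both spellings, since max-age alone cannot express them. The function and table names are hypothetical.

```python
# Cache-control directive per case, under each spelling discussed above.
ROY = {"Ax": "no-cache", "Bx": "max-age=0", "Cx": "no-cache"}
JEFF = {"Ax": "reload", "Bx": "revalidate", "Cx": "revalidate"}

ALL_CASES = ("Ax", "Bx", "Cx", "Dx", "Ex", "Fx", "Gx")
CONDITIONAL = {"Cx", "Dx", "Gx"}   # client offers a validator
STALE_OK = {"Fx", "Gx"}            # client accepts stale responses


def build_request(case, spelling=ROY, path="/home.html",
                  since="Thu, 15 Feb 1996 15:05:20 GMT", stale_max=630000):
    """Return the request lines for one of the cases Ax-Gx."""
    if case not in ALL_CASES:
        raise ValueError(f"unknown case: {case}")
    lines = [f"GET {path} HTTP/1.1"]
    if case in CONDITIONAL:
        # Assumption: If-Modified-Since plays the conditional-header role.
        lines.append(f"If-Modified-Since: {since}")
    if case in spelling:
        lines.append(f"Cache-control: {spelling[case]}")
    if case in STALE_OK:
        lines.append(f"Cache-control: stale-max={stale_max}")
    return lines


for name in ALL_CASES:
    print(name, build_request(name, JEFF))
```

With the stale-max line removed, Dx and Gx (and likewise Ex and Fx) would produce identical requests, which is the distinction the extra directive exists to make.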

Roy writes that my syntax for Ax introduces an unnecessary contradiction with

GET /home.html HTTP/1.1
Cache-control: reload
If-Modified-Since: Thu, 15 Feb 1996 15:05:20 GMT

by which I believe he means that (in my scheme) one could also write Cx as

        GET /home.html HTTP/1.1
        <some-conditional-header>: <cache-validation condition>
        Cache-control: reload

which is true, I suppose, but I fail to see the harm. Either that, or we could make this a "protocol error", in the same way that a request of the form

        GET /home.html HTTP/1.1
        Cache-control: private="Date"
        Cache-control: no-cache

is also internally contradictory.

Roy objects to my syntax for Bx

        GET /home.html HTTP/1.1
        Cache-control: revalidate

for two reasons:

    No, it introduces a contradiction if no validation information
    is provided to go along with the revalidate directive.

Not at all. I want the client to be able to say "I don't have this in my local cache, so I don't have a validator to check, but I want you Dr. Cache to check your validator." I would use this if I was stuck with a proxy run by someone who thought it was cool to ignore Expires: dates.

    Furthermore, it places syntax significant to an origin server in a
    place that origin servers should never need to look.

Huh? This isn't significant to the origin server. When the intermediary cache that has a cache entry (if there is one) makes its validation request to the origin server, it adds its own validator to that request. The origin server sees the validator (If-Valid: or I-M-S), and can ignore the "Cache-control: revalidate". Any cache further along the inbound chain, however, is forced to pass the request along to the origin server for validation.
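
A minimal sketch of that forwarding step, assuming headers are held in a simple dict and that the cache's validator is a Last-Modified date sent as If-Modified-Since. The function name and arguments are hypothetical.

```python
def forward_revalidate(headers, cached_last_modified=None):
    """Return the headers a cache sends inbound for a 'revalidate' request.

    headers: the request headers as received from the client.
    cached_last_modified: validator of this cache's own entry, or None
    if this cache holds no entry for the resource.
    """
    if headers.get("Cache-control") != "revalidate":
        raise ValueError("not a revalidate request")
    outgoing = dict(headers)  # copy, so the caller's dict is left untouched
    # A cache that holds an entry adds its own validator, unless one
    # is already present from the client or an earlier cache.
    if cached_last_modified is not None and "If-Modified-Since" not in outgoing:
        outgoing["If-Modified-Since"] = cached_last_modified
    # "Cache-control: revalidate" is kept, so caches further inbound
    # also pass the request through to the origin server.
    return outgoing


request = {"Cache-control": "revalidate"}
print(forward_revalidate(request, "Thu, 15 Feb 1996 15:05:20 GMT"))
```

The origin server only needs to look at If-Modified-Since in the result; the Cache-control line is there for the caches along the way.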

Roy says my syntax for Cx

        GET /home.html HTTP/1.1
        If-Modified-Since: Thu, 15 Feb 1996 15:05:20 GMT
        Cache-control: revalidate

is "Redundant", apparently because he thought this was the same case as Dx ... but it isn't. They both mean "I have this already in my cache, is it OK?", but Cx means "and check with the origin server" while Dx means "tell me what you know yourself, Dr. Cache".
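
The difference shows up in what an intermediary cache is allowed to do with each request. A minimal sketch, with hypothetical names, of the decision as described here:

```python
def cache_may_answer_itself(case, has_fresh_copy, validator_matches):
    """Decide whether an intermediary cache may answer without the origin."""
    if case == "Cx":
        # Cx: "and check with the origin server", so never answer locally.
        return False
    if case == "Dx":
        # Dx: "tell me what you know yourself", which requires a
        # still-fresh copy whose validator matches the client's.
        return has_fresh_copy and validator_matches
    raise ValueError(f"only Cx and Dx are handled here: {case}")


print(cache_may_answer_itself("Cx", True, True))
print(cache_may_answer_itself("Dx", True, True))
```

Even with a fresh, matching copy in hand, the cache must go to the origin server for Cx, while for Dx it may reply on its own.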

300+ lines later, I've worn out my fingers (and probably my welcome), but I wanted to try to get this as clear as possible.

To be honest, with the addition of "Cache-control: stale-max=NNN" and a requirement that all other requests implicitly require non-stale responses, the only difference here between me and Roy is syntax. Which is why I hope that Roy agrees with me up through the point in this message where I introduce "Cache-control: reload", because then we have agreement on all of the important stuff. And if he doesn't agree with me to that point, there's no point in arguing about syntax.

-Jeff