Cache invalidation is the first hard problem in computer science. In this post I discuss some ways to handle it in the context of HTTP.

Often stale data is okay. In the comment section of a blog, it probably does not matter if comments posted within the last hour are not visible to all users. An origin server can specify this with the Cache-Control and Expires HTTP headers. But there are some cases where this does not hold:
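As a rough illustration of the "stale for up to an hour is fine" case, here is a minimal sketch of how an origin server might build those two headers. The function name `freshness_headers` and the one-hour lifetime are assumptions made for this example, not something prescribed by HTTP.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def freshness_headers(max_age_seconds, now=None):
    """Build Cache-Control and Expires headers for a response that may be reused.

    `now` can be injected for testing; it must be an aware UTC datetime.
    """
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # Caches that understand Cache-Control prefer max-age over Expires.
        "Cache-Control": f"max-age={max_age_seconds}",
        # Expires carries the same deadline as an HTTP date.
        "Expires": format_datetime(expires, usegmt=True),
    }


# Hypothetical usage: let the comment page be served from cache for one hour.
headers = freshness_headers(3600)
```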

  • If a user logs in or out, the interface must immediately reflect this.
  • If a user posts a comment, they must see the result.

The origin server has some options within HTTP to force proxies and browsers to invalidate parts of their cache on these events. The following headers may be useful.

Vary: Cookie

Invalidate caches when cookies change. Thus, if the session cookie changes, the interface will be updated. When using this, make sure to set Cache-Control: private for logged-in users to avoid thrashing shared caches with user-specific pages. Also keep in mind that a malicious user can still thrash caches by setting unique, irrelevant cookies. This can be avoided by responding with Cache-Control: private to all requests that carry a Cookie header (regardless of session state).

Unfortunately, this prevents shared caching when an analytics tracking cookie is in use, because every user then has a unique cookie whether or not they are logged in.
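The header choice described above can be sketched as a small decision function. This is only an illustration: `cache_headers`, its `strict` flag, and the one-hour `max-age` are hypothetical names and values chosen for the example.

```python
def cache_headers(request_headers, logged_in, strict=True):
    """Pick caching headers for a page whose content depends on cookies.

    request_headers: mapping of request header names to values.
    logged_in: whether the request belongs to an authenticated session.
    strict: if True, any request carrying a Cookie header gets a private
            response, so unique junk cookies cannot fill shared caches.
    """
    # HTTP header names are case-insensitive, so compare in lower case.
    names = {name.lower() for name in request_headers}
    has_cookie = "cookie" in names

    if logged_in or (strict and has_cookie):
        cache_control = "private, max-age=3600"
    else:
        cache_control = "public, max-age=3600"

    # Vary: Cookie makes caches key the stored response on the Cookie value.
    return {"Vary": "Cookie", "Cache-Control": cache_control}
```

With `strict=True`, an anonymous visitor who carries only a tracking cookie also gets a private response, which is exactly the trade-off mentioned above.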

A Fastly blog post discourages the use of this header, saying that “Cookie is probably one of the most unique request headers, and is therefore very bad [in Vary]”. However, that disregards the importance of browser caches.

Location

When responding to a form POST with a 303 See Other response, the redirect target will be invalidated. For instance, this happens when using a plain form for posting comments and then redirecting back to the page that shows the comments [1]. In these cases we get correct cache behavior “for free”.
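A minimal sketch of such a handler, written as a plain function rather than for any particular framework. The name `post_comment`, the in-memory `store`, and the example path are assumptions for illustration.

```python
from http import HTTPStatus


def post_comment(page_path, text, store):
    """Handle a form POST and redirect back to the page that lists the comments.

    Returns a (status, headers, body) tuple. The Location header names the
    page whose cached copies should be invalidated by this response.
    """
    store.setdefault(page_path, []).append(text)
    return HTTPStatus.SEE_OTHER, {"Location": page_path}, b""


# Hypothetical usage with an in-memory comment store.
store = {}
status, headers, body = post_comment("/posts/42", "Nice article!", store)
```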

Content-Location

When responding to a POST, the URI in the Content-Location header will be invalidated. This is useful if comments are posted through AJAX (thus without a redirect) and the corresponding GET endpoint must be invalidated.
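For the AJAX case, a handler might look roughly like the sketch below. It returns the updated comment list as the response body and names the GET endpoint in Content-Location. The function name, the JSON body, and the URL are assumptions made for this example.

```python
import json


def post_comment_ajax(comments_url, text, store):
    """Handle an AJAX POST without redirecting.

    The body is the current representation of the comments resource, and
    Content-Location points at the GET endpoint that serves that resource.
    """
    store.setdefault(comments_url, []).append(text)
    body = json.dumps(store[comments_url]).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
        "Content-Location": comments_url,
    }
    return 200, headers, body


# Hypothetical usage with an in-memory comment store.
ajax_store = {}
ajax_status, ajax_headers, ajax_body = post_comment_ajax(
    "/posts/42/comments", "First!", ajax_store
)
```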

Further reading

All this information is available in the HTTP specification. I recommend starting with section 13 on caching.

  1. Yet another argument in favor of POST-Redirect-GET