I’ve been doing research for a talk I’ll be giving at Prairie Dev Con in April about caching in Web Services. Turns out a large number of web resources out there in the wild have absolutely no notion of caching, creating massive drains on load times that could be entirely avoided. While looking into things, I found out the Cache-Control HTTP header is capable of doing so much more than just telling your browser what to do.
Normally, web developers use Cache-Control to tell the client what to do. Values like “no-cache” and “max-age” tell a user’s browser or web client how it should store and reuse a cached copy of the response. The intent is usually to avoid unnecessary network traffic for future requests while the cache is still valid (“max-age”), or conversely to make the client check back with the server before reusing anything it has cached (“no-cache”). Despite its name, “no-cache” doesn’t forbid storing the response; that’s the job of “no-store”.
What I didn’t know is that there are other values you can specify that get picked up by components other than the end user’s web client, such as intermediary proxies, edge servers, and the like. For example, an HTTP header of “Cache-Control: public” tells any network component along the request/response chain that the response can be cached. Multiple values are separated by commas, as in “Cache-Control: public, max-age=3600”.
This has a whole ton of implications. If you’re using a hosting provider or cloud service that supports this functionality, they might be caching your responses for you. That means your web application may never even see some future requests, because a proxy or intermediary server along the way has decided to respond with a cached version of the data instead of piping the request all the way to your application.
Of course, you’ll want some level of control over this, and other values provide that. A value of “private” specifies that the response is specific to an individual client, so shared caches such as proxies shouldn’t store it, although the user’s own browser still can. “no-store” prevents any cache from storing anything about the request or response; it’s usually used when sensitive or private information is exchanged. “max-age” defines the number of seconds a cache can reuse a stored response before it’s considered stale, and “must-revalidate” tells caches that once a response is stale, they have to check with your server before using it again.
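To make those rules a bit more concrete, here’s a rough Python sketch of how a cache might read a Cache-Control value and decide whether it’s allowed to keep a copy. The function names (parse_cache_control, may_store) are my own inventions for illustration, not part of any library, and the parsing is deliberately simplified: it splits on commas and ignores quoted arguments, which a real cache would have to handle.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict.

    Simplification: splits on commas, so quoted arguments that contain
    commas are not handled correctly.
    """
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(header_value, shared_cache):
    """Rough check: may a cache keep a copy of a response with this header?

    shared_cache=True models a proxy or edge server; False models the
    end user's own browser cache.
    """
    directives = parse_cache_control(header_value)
    if "no-store" in directives:
        return False  # nobody may store the response
    if shared_cache and "private" in directives:
        return False  # only the end user's own cache may store it
    return True


if __name__ == "__main__":
    for value in ("public, max-age=3600", "private, max-age=60", "no-store"):
        print(value, "| proxy:", may_store(value, True),
              "| browser:", may_store(value, False))
```

Real caches weigh a lot more than this (status codes, request methods, the Authorization header and so on), so treat it as a picture of the idea rather than a complete rule set.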
On the flip side, you can include some of these values in your request as well. For instance, a request value of “max-age” tells the cache that you’ll only accept a cached response that’s at most that many seconds old, while “min-fresh” asks for a response that will stay fresh for at least that many more seconds. “only-if-cached” requests only cached responses, so we don’t bother the server if there’s nothing usable in the cache; the cache returns a 504 (Gateway Timeout) instead.
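Here’s a small sketch of how a cache could honour those request values. It assumes the standard meanings: a request “max-age” caps how old a stored response may be, “min-fresh” demands that it stay fresh for at least that many more seconds, and “only-if-cached” turns a miss into a 504. The function answer_from_cache and the shape of the entry dictionary are hypothetical, made up for this example.

```python
def answer_from_cache(entry, request_directives, now):
    """Decide how a cache answers a request.

    entry: None, or a dict with "stored_at" (seconds), "max_age" (seconds,
           taken from the response) and "body".
    request_directives: dict of request Cache-Control directives, for
           example {"max-age": "30"} or {"only-if-cached": None}.
    Returns (200, body) on a usable hit, (504, None) when only-if-cached
    cannot be satisfied, and (None, None) when the request should be
    forwarded to the origin server.
    """
    usable = False
    if entry is not None:
        age = now - entry["stored_at"]
        remaining = entry["max_age"] - age  # freshness left, in seconds
        usable = remaining > 0
        if "max-age" in request_directives:
            # Client refuses responses older than this many seconds.
            usable = usable and age <= int(request_directives["max-age"])
        if "min-fresh" in request_directives:
            # Client wants at least this much freshness remaining.
            usable = usable and remaining >= int(request_directives["min-fresh"])
    if usable:
        return 200, entry["body"]
    if "only-if-cached" in request_directives:
        return 504, None  # Gateway Timeout: nothing usable in the cache
    return None, None     # forward to the origin server


if __name__ == "__main__":
    entry = {"stored_at": 0, "max_age": 100, "body": b"hello"}
    print(answer_from_cache(entry, {"max-age": "30"}, now=40))
    print(answer_from_cache(None, {"only-if-cached": None}, now=40))
```

With an entry stored at time 0 that’s good for 100 seconds, a request at time 40 sees an age of 40 and 60 seconds of freshness left, so “max-age=30” misses while “min-fresh=50” hits.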
The caveat to all of this is that your web application has to include these headers in its responses, which is relatively easy to do in most frameworks. Also keep in mind that if you’re seeing weird behaviour while debugging, or requests that never hit your application layer, an intermediary cache could be the cause (especially in a test or live environment).
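As a rough idea of what “including these headers” looks like, here’s a minimal sketch using nothing but Python’s built-in WSGI support. The “/account” path, the five-minute lifetime and the JSON body are placeholders I made up; your framework will have its own way of setting response headers.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Tiny WSGI app that picks a Cache-Control value per request path."""
    path = environ.get("PATH_INFO", "/")
    if path.startswith("/account"):
        # Hypothetical sensitive area: no cache anywhere may keep a copy.
        cache_control = "no-store"
    else:
        # Everything else: any cache may reuse the response for 5 minutes.
        cache_control = "public, max-age=300"

    body = b'{"status": "ok"}'
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        ("Cache-Control", cache_control),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(port=8000):
    """Run the app locally; call this yourself to try it out."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

Calling serve() and then requesting the page with a tool that shows response headers is a quick way to confirm what your application is actually sending, before any proxy gets involved.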