Caches are the Mogwais of computing: they are cute, friendly, and our friends, but they come with three rules that must be obeyed:
- It cannot come into contact with water;
- Keep it away from bright light;
- No matter how much it cries, no matter how much it begs, never, never feed it after midnight.

Break one of the rules and you will have a Gremlin on your hands.
As strange as it may seem to some, caching is a good thing and is here to help, but it requires special attention and care. Caching an application has three very important positive points:
- It decreases the application's response time, which usually translates into increased performance;
- It increases data consistency, since the same operation will produce the same response;
- It reduces the load on the server, because the cache takes over a good part of what the server would otherwise do.
The act of caching boils down to storing the result of a process in some way and reusing it every time the same process is performed again, and today virtually everything caches.
The page you access through your browser is cached in so many ways that it is difficult to list them all: it is cached locally on the client's computer, cached on the company's proxy server, cached on a dedicated cache server, cached by the web server, and cached by the application (and perhaps in other ways besides).
The computer caches information in the different memories that make up the system: the L1, L2, and L3 caches, RAM, and the hard disk.
Our brain caches a lot of everyday information: the route between home and work is one example; there comes a point where we take that route without having to decide whether it is the best one.
Apart from the complexity of having numerous types of caches to control, there is the problem of knowing when a cache should stop being used, which is the central point of the question.
To work toward the answer, imagine a hypothetical situation composed of three actors: the client, responsible for making the request; the cache, an intermediary actor that will store the information; and the resource server, responsible for managing the real information and generating the response.
The flow between them works as follows:

1. The client makes the request, which passes through the cache server. Since it is the first time the request is made, the cache does not yet have the stored response;
2. The cache server forwards the request made by the client to the resource server, since it does not have the information cached;
3. The resource server sends the response to the cache server, which stores it in some way for future use;
4. The cache server passes on to the client the response it received from the resource server;
5. The client requests the same resource again, and again this request goes to the cache server;
6. Now, since the response is cached, the cache server can respond to the client without involving the resource server.
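A minimal sketch of this flow in Python follows; the function fetch_from_resource_server and the dictionary-based store are illustrative assumptions, not a real cache server:

```python
cache_store = {}  # key -> cached response

def fetch_from_resource_server(key):
    # Hypothetical stand-in for the real, expensive work done by the resource server.
    return f"response for {key}"

def handle_request(key):
    if key in cache_store:                       # steps 5-6: cache hit,
        return cache_store[key]                  # respond without the resource server
    response = fetch_from_resource_server(key)   # steps 1-2: cache miss, forward
    cache_store[key] = response                  # step 3: store for future use
    return response                              # step 4: pass the response on

print(handle_request("/users/42"))  # first call reaches the resource server
print(handle_request("/users/42"))  # second call is answered from the cache
```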
And what makes the cache really hard to invalidate is:
- There are many caches, and controlling all of them is costly for the development team;
- Telling the cache server that it should invalidate the information it holds is challenging.

There are some common approaches that seek to solve, or at least mitigate, the problem.
The cache server asks the resource server when the last update was made
Before the cache server responds to the client with cached information, it asks the resource server when the last change was made. If the information is still current, that is, if no changes were made after the information was cached, the cache server responds to the client with its cached version.
Curiously, in this approach the resource server is still hit on every new request, which turns out to be a negative point of the solution; but it is still a plausible solution, since the information transmitted in messages 6 and 7 (the revalidation question and its reply) is much smaller than in messages 2 and 3, which carried the original request. Another advantage of this solution is that it lends itself well to distributed systems, where the cache server is physically separated from the resource server.
This approach is quite common in web applications, where the browser sends a request for only the headers of the response and, depending on the data present in those headers, requests the resource or not. It is common precisely because on the web the system is distributed and communication is unilateral: there is no way for the resource server to send a message to the client to notify it of changes.
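A minimal sketch of this revalidation using Python's requests library, assuming the resource server supports the standard Last-Modified / If-Modified-Since headers; the URL and names are illustrative:

```python
import requests

def revalidate(url, cached_body, cached_last_modified):
    # Ask the server whether the resource changed since we cached it
    # (messages 6 and 7 in the flow above).
    response = requests.get(
        url, headers={"If-Modified-Since": cached_last_modified}
    )
    if response.status_code == 304:
        # 304 Not Modified: no body transferred, the cached copy is still valid.
        return cached_body
    # The resource changed: replace the cached copy with the new body.
    return response.text
```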
An expiry date is defined for cached information
Another common way to invalidate the cache is to set an expiration date.
Upon receiving the response from the resource server (message 3), the cache server stores, along with the information, its expiration date. Then, when it receives the client's request again (message 5), it is enough to check whether the current date/time is earlier than the expiration date: if it is, it responds to the client with the cached information; otherwise, it forwards the request to the resource server and restarts the process.
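A minimal sketch of this check, assuming an arbitrary 60-second lifetime and reusing the hypothetical fetch_from_resource_server stand-in from the earlier sketch:

```python
import time

TTL_SECONDS = 60          # arbitrary lifetime chosen for illustration
cache_store = {}          # key -> (expires_at, response)

def fetch_from_resource_server(key):
    return f"response for {key}"  # hypothetical stand-in

def handle_request(key):
    entry = cache_store.get(key)
    if entry and time.time() < entry[0]:          # still before the expiry date:
        return entry[1]                            # answer without the resource server
    response = fetch_from_resource_server(key)     # expired or absent: restart the process
    cache_store[key] = (time.time() + TTL_SECONDS, response)
    return response
```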
The great advantage of this approach is that the resource server is not even aware of the requests answered with cached information, creating greater independence between the actors of the system; the disadvantage is that, if there are changes on the resource server, the rest of the system will not know about them until the expiration date has passed.
Seeking to get around the disadvantages of this method, there are some secondary approaches that can be taken. If the communication between the actors is bilateral, the resource server can send a message to the cache server notifying it of changes, so that the cache server makes a new request to the resource server and updates the cached information, whether or not there is a request from the client.
If the communication is unilateral, there is no way for the resource server to send that message to the cache server; another way to mitigate the problem is then to define a parameter in the request that forces the cache server to update the information, regardless of whether there were changes on the resource server. An example of this is an HTTP request that, when handled by a cache server in the presence of the header X-No-Cache: True, is treated in a "bypass" fashion: the cache server requests the application server and stores the response as cache. Thus, whenever there is a change in the resource, one can trigger a request from a client with the parameter properly set to update the cached information. Both approaches require having control over the cache server, which is not always the case.
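A minimal sketch of a cache server honoring that bypass header; the dict-based request handling and the stand-in fetch function are illustrative assumptions:

```python
import time

TTL_SECONDS = 60
cache_store = {}  # key -> (expires_at, response)

def fetch_from_resource_server(key):
    return f"response for {key}"  # hypothetical stand-in

def handle_request(key, headers):
    bypass = headers.get("X-No-Cache") == "True"
    entry = cache_store.get(key)
    if not bypass and entry and time.time() < entry[0]:
        return entry[1]                            # normal path: serve from cache
    # Bypass requested (or cache miss): go to the resource server
    # and refresh the cached copy.
    response = fetch_from_resource_server(key)
    cache_store[key] = (time.time() + TTL_SECONDS, response)
    return response
```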
Cache invalidation due to lack of use
Combined with the two previous approaches (more often the first than the second), it is possible to make the cache server invalidate stored information due to lack of use. If a request is made now, the response is cached, and the request is not repeated for a long period of time, the cache server can discard the information and query the resource server again when necessary. This approach does not take changes on the resource server into account, so it is more commonly applied as a helper to the other approaches; a sketch of it follows below.
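One common way to implement "discard what is not used" is a small LRU (least recently used) cache; the capacity of 2 below is arbitrary, just to make the eviction visible:

```python
from collections import OrderedDict

CAPACITY = 2              # arbitrary, for illustration
cache_store = OrderedDict()

def fetch_from_resource_server(key):
    return f"response for {key}"  # hypothetical stand-in

def handle_request(key):
    if key in cache_store:
        cache_store.move_to_end(key)      # mark as recently used
        return cache_store[key]
    response = fetch_from_resource_server(key)
    cache_store[key] = response
    if len(cache_store) > CAPACITY:
        cache_store.popitem(last=False)   # discard the least recently used entry
    return response
```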
And so far we have only talked about the resource server's side, which is where the team usually has control and the freedom to manage and design things as the project requires. When we look at things from the client's side, there may be barriers that the application cannot get past.
It is common for companies to have a proxy service that controls the flow of information between the internal network and the outside world, controlling what may leave and what may enter. Since there is a service handling all requests, it is common that, in addition to auditing the information, it also caches it. If the employee Fulano made the request he needed and got the response, then when Beltrana needs the same information there is no need to fetch it directly from the resource server: if the company's own internal service has cached the information, it can deliver to Beltrana the same information that was delivered to Fulano. Being an internal service of the company in question, the development team (of the application) is at the mercy of the behavior the company has defined for its internal service. It is a cache we will have no control over; it is up to the company itself to implement the cache invalidation logic.
In short, the cache was created, in essence, to be validated, not invalidated, so it is expected (and even desired) that the invalidation process be complicated. The two are inversely proportional: the more rigid the caching rules, the harder it will be to invalidate them, while the more flexible the caching rules, the less sense it makes to use the cache, since the resource server would be hit more frequently.
Mathematically, two inversely proportional quantities are characterized by the curve of y = k/x, a hyperbola: as one quantity grows, the other shrinks in the same proportion.
When designing the system, it will be necessary to define at which point of this curve it will operate. Defining this point so as to meet all of the project's requirements, caching when it should cache and invalidating when it should invalidate, is extremely difficult. In general, you will always end up either caching more than you should, failing to see updates in real time, or caching less than you should, not taking advantage of all the benefits of caching.
And, as bfavaretto mentioned in his answer, the problem gets worse because all of this is recursive. The resource server we considered in the answer may itself be another cache server, so one cache server acts as the client of another cache server, caching its responses. Quis custodiet ipsos custodes? The problem then goes from invalidating the application's cache to invalidating the cache of a cache, or the cache of another cache's cache, and so on.
A web application easily reaches a structure like: application → web server → CDN → local proxy → browser → client.
In this case, the application will need to invalidate the web server's cache, the web server will need to invalidate the CDN's cache, the CDN will need to invalidate the local proxy's cache, and the local proxy will need to invalidate the browser's cache; only then will the current version of the application reach the client, if another version has not been cached in the meantime.
Darn, I was going to answer this right now, but I'm still getting my last answer from the cache.
– Woss