Chapter 6
Cache
It's because there's cache! It needs to be cleared.
Cache: the source of many problems. It comes up in conversation as soon as you work on a web product, and it's the developers' go-to excuse to avoid diagnosing a problem.
What is a cache?
A cache is a system, a piece of software, that remembers the result of a process. There's no need to rerun the entire
chain; you already have the answer at hand.
A cache can contain a lot of things: the result of a costly process, images, videos, data that rarely changes, etc.
The goal is to improve performance by reducing:
- Response time: less processing, geographical proximity to the client... Less delay = a happy client and an improvement in search engine ranking.
- Resources used: computing power, load on the machines protected by the cache... Fewer resources = lower costs.
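To make the idea concrete, here is a minimal Python sketch. The function `slow_square` and the counter `calls` are hypothetical stand-ins for a costly process; `functools.lru_cache` from the standard library does the remembering.

```python
from functools import lru_cache

calls = 0  # counts how many times the costly process really runs


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Hypothetical stand-in for a costly process."""
    global calls
    calls += 1
    return n * n


first = slow_square(12)   # computed: the function body runs
second = slow_square(12)  # same input: answered from the cache, the body does not run
```

Both calls return the same value, but the costly part only ran once.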
You use caches every day!
Displaying a single web page involves dozens of caches spread across different systems: your web browser, your
internet service provider, the website's hosting service, the website's internal software, etc.
Before even reaching the website and its content, your browser manages a whole bunch of obscure caches on its own.
Remember the DNS we talked about in the second
chapter? Of course, there's caching involved there.
Each layer of cache has its own purpose, involving different technologies and requiring a dedicated management
strategy.
A few examples:
- The website can indicate how browsers and intermediaries should cache its content. That's why when you revisit a page, it loads much faster than the first time.
- A Content Delivery Network, or CDN, is a network of machines distributed around the world. CDNs aim to bring content geographically closer to clients, thereby reducing network transfer times. For instance, a request from a client in Europe will probably be routed to a European server. CDNs are particularly, but not exclusively, used for static files: images, videos, website design…
- A reverse proxy is a server that sits in front of the application servers and receives requests on their behalf, which protects them. There's no need to execute code if the proxy can already provide a pre-calculated response. Well-known technologies include Varnish and Nginx.
- Application cache. We're going to explore this in more detail right away.
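For the first example in the list, a website usually gives its caching instructions through the HTTP `Cache-Control` response header. The sketch below is only an illustration: the function name, the URL paths and the durations are assumptions, while `public`, `private`, `max-age` and `no-store` are standard directives.

```python
def cache_control_for(path: str) -> str:
    """Pick a Cache-Control header value for a URL path (illustrative rules only)."""
    if path.startswith("/static/"):
        # Images, style sheets...: any cache (browser, CDN, proxy) may keep them for a day.
        return "public, max-age=86400"
    if path.startswith("/account/"):
        # Personal pages: no cache should store them at all.
        return "no-store"
    # Everything else: only the visitor's own browser may keep it, for one minute.
    return "private, max-age=60"


header_for_logo = cache_control_for("/static/logo.png")
```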
The application cache
These are the cache systems used directly by the code that developers write. We mainly use them to reduce the number of communications by storing:
- Results of database processes.
- Responses from calls to other internal components or partners.
Let's take an example:
- You want to display your subscription offers on a page.
- Your offer catalog is hosted by a partner. It can take up to 5 seconds to respond, which is an eternity in a customer's navigation experience.
- You want to offer different deals to a European client and an American client.
The offers don't change every day, so we decide to cache them. This avoids the 5-second call.
When do we fill the cache? Two possible strategies: the first client to arrive triggers the request and endures the
5 seconds of waiting. That's usually what we do 🙈. Or, we're kind and develop a script that pre-fills the cache.
This is called warm-up, or preheating the cache.
The world being cruel, we let the first client take the hit 😈.
The complete sequence for them:
- They load the page with all the offers.
- We detect this client as European. We check if the European offers are already in the cache. No ❌.
- We ask the partner to provide us with the European offers. They respond after 5 seconds.
- The response is stored in the cache.
- The response is returned to the client so they can see the offers on the page.
The second client arrives:
- They load the page with all the offers.
- We detect this client as European. We check if the European offers are already in the cache. Yes ✅.
- The response is returned to the client so they can see the offers on the page.
We've saved 5 seconds per client 🥳! Well, for the Europeans... The first American client will also have to wait 5
seconds.
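The two sequences above can be sketched in a few lines of Python. This is a simplified illustration, not a production implementation: the cache is a plain dictionary, and `fetch_offers_from_partner` is a hypothetical stand-in for the slow partner call.

```python
offers_cache: dict[str, list[str]] = {}
partner_calls: list[str] = []  # records every real call made to the partner


def fetch_offers_from_partner(continent: str) -> list[str]:
    """Hypothetical stand-in for the partner call that can take up to 5 seconds."""
    partner_calls.append(continent)
    return [f"{continent}-basic", f"{continent}-premium"]


def get_offers(continent: str) -> list[str]:
    key = f"offers:{continent}"
    if key in offers_cache:                        # already in the cache? Yes
        return offers_cache[key]
    offers = fetch_offers_from_partner(continent)  # No: ask the partner (slow)
    offers_cache[key] = offers                     # store the response in the cache
    return offers                                  # return it to the client


get_offers("europe")   # first European client: the partner is called
get_offers("europe")   # second European client: served from the cache
get_offers("america")  # first American client: the partner is called again
```

After these three requests, the partner has been called twice: once per continent.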
In our example, the cache is segmented by a single criterion: the continent. It's a finite list whose values we
know. We could have easily preheated the cache. It's not always the case! We could segment offers according to
multiple criteria: the client is already subscribed so we only want to offer them upgrades, the client has canceled
their subscription so we need to try to retain them, this client has already benefited from a promotion so we
shouldn't offer them a new one…
😱 What a nightmare!
And if a cache is difficult to preheat, it will also be difficult to clear.
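Here is a rough sketch of why warm-up gets harder as criteria are added. The lists of values and the `load_offers` function are assumptions made up for the illustration; the point is only that the number of cache entries is the product of the number of values of each criterion.

```python
from itertools import product

CONTINENTS = ["europe", "america", "asia"]     # illustrative values
STATUSES = ["new", "subscribed", "cancelled"]  # hypothetical criterion
PROMO_USED = [False, True]                     # hypothetical criterion

cache: dict[tuple, list[str]] = {}


def load_offers(*criteria) -> list[str]:
    """Hypothetical stand-in for the slow partner call."""
    return ["offer for " + "/".join(str(c) for c in criteria)]


def warm_up(*dimensions) -> int:
    """Pre-fill the cache for every combination of the given criteria."""
    for combo in product(*dimensions):
        cache[combo] = load_offers(*combo)
    return len(cache)


one_criterion = warm_up(CONTINENTS)  # 3 entries to pre-fill
cache.clear()
three_criteria = warm_up(CONTINENTS, STATUSES, PROMO_USED)  # 3 * 3 * 2 = 18 entries
```

With a partner that takes up to 5 seconds per call, pre-filling 18 entries one after the other can take up to 90 seconds, and every new criterion multiplies that number again.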
Clearing the cache
"There are only two hard things in computer science: cache invalidation and naming things." (Phil Karlton)
Case 1: Cache shared by several clients
Let's go back to the offers example. You create a new offer for Americans in your partner's back office. You need to
clear the offers cache so that clients can see it.
First problem: how does your site know something has changed at the partner's end? 😅
Two solutions: either you regularly check with the partner to see if anything has changed, or they can send you the
event "a new offer has been created". In both cases, communication needs to be set up.
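Both solutions can be sketched as follows. Everything here is an assumption made for the illustration: that the partner exposes a catalog version number we can compare, and that its event is a dictionary with a `type` and a `continent`.

```python
offers_cache = {
    "offers:america": ["old american offer"],
    "offers:europe": ["european offer"],
}
last_seen_version = 1  # catalog version seen at the previous check (assumed to exist)


def poll_partner(current_partner_version: int) -> bool:
    """Solution 1: run on a schedule; clear the cache if the catalog version changed."""
    global last_seen_version
    if current_partner_version == last_seen_version:
        return False                 # nothing changed, keep the cache
    last_seen_version = current_partner_version
    offers_cache.clear()             # we don't know what changed: clear everything
    return True


def on_partner_event(event: dict) -> None:
    """Solution 2: called when the partner pushes an event to us."""
    if event.get("type") == "offer_created":
        # The event says what changed, so only that entry is cleared.
        offers_cache.pop(f"offers:{event['continent']}", None)
```

In this sketch, polling only learns that something changed and clears the whole cache, while the event carries enough detail to clear a single entry.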
Second problem: if you clear the cache and it takes 5 seconds to rebuild, the thousands of clients currently on the
site will all trigger a request to the partner simultaneously.
Clearing a cache is like opening the store gate on a sale morning. By removing the protection, you expose the entire
system to a huge instant traffic spike.
Solutions: avoid clearing cache during busy times, preheat as much as possible, clear only what's necessary.
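One possible way to apply the "preheat" advice when an entry changes is to rebuild it first and only then replace the old one, instead of deleting it and letting clients rebuild it. The sketch below is single-threaded and simplified; `fetch_from_partner` is a hypothetical stand-in for the slow call, and a real concurrent system needs more care than this.

```python
cache = {"offers:america": ["old offer"]}
partner_calls = 0


def fetch_from_partner() -> list[str]:
    """Hypothetical stand-in for the 5-second partner call."""
    global partner_calls
    partner_calls += 1
    return ["old offer", "new offer"]


def refresh(key: str) -> None:
    """Rebuild the entry first, then swap it in: the key is never missing."""
    fresh = fetch_from_partner()  # slow part; the old entry is still served meanwhile
    cache[key] = fresh            # one assignment replaces the old entry


def read(key: str) -> list[str]:
    """What a client request does: use the cache, call the partner only on a miss."""
    if key not in cache:
        cache[key] = fetch_from_partner()
    return cache[key]


refresh("offers:america")
results = [read("offers:america") for _ in range(1000)]  # 1000 client requests
```

The thousand requests never find the entry missing, so the partner is called once, by the refresh itself.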
Third problem: what is "only what's necessary"?
Once you add the other criteria mentioned earlier, the new offer might require clearing several caches. It's usually
the incorrect clearing of one of them that gives developers grey hairs. The worst part is when caches need to be
cleared in a certain order because one's calculation depends on another.
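When caches depend on each other, the clearing order can be derived from the dependencies rather than remembered by hand. Here is a small sketch using `graphlib` from the Python standard library (Python 3.9+). The three cache names are hypothetical; the rule assumed is that a cache is cleared before the caches calculated from it, so that nothing is rebuilt from stale data.

```python
from graphlib import TopologicalSorter

# Hypothetical caches: each key is calculated from the caches listed in its value.
built_from = {
    "offers_by_continent": {"partner_catalog"},
    "offers_page_html": {"offers_by_continent"},
}

# static_order() yields a cache only after everything it is built from.
clear_order = list(TopologicalSorter(built_from).static_order())
# → ['partner_catalog', 'offers_by_continent', 'offers_page_html']
```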
Case 2: Cache specific to a client
Suppose that when a client logs in to the mobile application, you fetch a bunch of data about them: payment history,
etc. You cache this data so that it is quickly available during navigation.
Problem: unless something is modified on their profile, this client's data will indefinitely take up space in the
cache.
That's why most caching systems allow setting an expiration delay, often called a TTL (time to live). When creating
the cache entry, you can indicate, for example, "I want this entry to expire in 3 hours". It's not as easy as it
sounds! Too short a delay: your cache is useless. Too long a delay: you risk overloading the memory.
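A minimal sketch of such an expiration delay, assuming a hand-written `TTLCache` class rather than any particular product. The clock is passed in as a parameter so the example can jump 3 hours ahead without waiting; the key name and the stored value are made up.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Illustrative cache whose entries expire after a given delay."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        # Store the value together with the moment it stops being valid.
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: free the memory
            return None
        return value

    def __len__(self) -> int:
        return len(self._entries)


fake_now = [0.0]  # fake clock, in seconds
cache = TTLCache(clock=lambda: fake_now[0])
cache.set("client:42:payments", ["payment history"], ttl_seconds=3 * 3600)

before = cache.get("client:42:payments")  # still valid
fake_now[0] = 3 * 3600 + 1                # a little over 3 hours later
after = cache.get("client:42:payments")   # expired and removed
```

Note that this sketch only removes an expired entry when someone reads it; an entry that is never read again stays in memory, which is one more reason the delay is hard to choose.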