Better Living Through Caching


The fastest query is one you never execute.

The premise is that one of the slowest parts of starting up an application isn’t starting the application itself, it’s loading the initial application state. This can become a problem when you’re loading many copies of your application on many servers, especially you’re in the cloud and paying for CPU cycles. In that article, a commenter proposes reading application start up state from a serialized blob; basically a chunk of memory written to disk. The trick is that the serialized blob is stored in cache rather than on disk or in a database. Sometimes you need to hit disk in order to refresh the cache, but the general idea is that all configuration info is stored in a single binary object that can be quickly read and used to start up an application to a known good state.

Caching for More Than Start Up Times

Once you start caching application start state, it’s natural to look for more places to introduce additional caching. Remember, the fastest query is the one that you never execute.

Most people already know that they can add caching to their application to improve performance and get around slower parts of the system. There are a number of well understood design patterns that focus around caching and its place in software architecture. A lot of people don’t take this one step further and use caching as a trick to avoid down time when they roll out updates.

You might be thinking “Wait a minute, doesn’t my database/SAN/operating system have some kind of cache?” You’re right, it does. Storage cache is your last line of defense before reading from disk. Why not cache things in your application and skip the network hit?

So what happens when you need to update the application? In the past you probably scheduled an outage in the middle of the night. Or maybe you performed rolling outages from server to server and then slowly brought features online across groups of servers. However you did it, it’s complicated, requires down time, and you need to have a rollback plan; rollbacks on large databases can take a lot of time.

What if instead of just caching configuration to avoid slow start up, you start caching all data (or as much as can fit into memory)? You’re doing that already, right? Why mention it again?

If you’re caching data already, it seems logical that your application is written with multiple tiers. Those tiers are probably divided out by application or by service. If so, there’s a lot of logical separation between different features and functionality. You might even be calling a read/write API as if it were a service provided by a third party. This is a perfect example of how you can cache your reads and avoid hitting lower layers of the application; the front end never needs to know that anything exists apart from the services that provide data.

If you can cache data at the service level, you can theoretically take your back end systems offline for maintenance and bring them back online with minimal disruption to your users. Ideally, there would be no disruption. You could queue up modifications during your maintenance window and then commit them to the database once the updated database, services, or features are back online.

The Beauty of Isolation

By isolating features and layers from each other, you can make your applications more responsive. Rather than relying on servers to respond quickly during application start times, you can make it possible to load binary configuration data from cache. Frequently run queries can be served even faster by caching results in memory. Down times can even be avoided by caching reads and writes during the maintenance window. Of course, caching writes can be difficult. You can start by caching reads and keep your application up most of your users; it’s better than shutting everyone out completely.

To learn more about caching on Windows, read up on AppFabric Cache. On the *nix side of things, there’s the tried and true memcache. More novel and exotic solutions exist, but AppFabric Cache and memcache are great places to get started.

Previous Post
I’m on RunAsRadio Talking SQL Azure and SSDs
Next Post
Announcing Our New Weekly Community Recap Email

2 Comments. Leave new

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.