Everyone poops and everyone grows. We all reach that growth stage in our own time, but sooner or later we all encounter it. Once you reach a growth stage you will experience some of these common emotions.
It’s 8:37AM and your developers just barged into your cubicle yelling, “The database can’t handle the load!” What’s your first reaction? If you’re like me, you probably threw a stapler at them. If you’re a bit more level-headed, you most likely said something like, “That’s unpossible!”
Congratulations, you’re in the first stage of database growth: denial. It’s a natural response to hearing that your application has grown beyond the physical confines of your database. The usual reaction is to scramble for historical performance data, your performance tuning scripts, and server-level performance enhancements.
You might be successful; you may be able to accomplish enough at this stage to fight off performance problems for a few more months. If trends continue, though, you’ll have to get over your denial and move on to the next stage of your growth.
You’ve accepted that you can’t keep denying the problem. You’ve even accepted that there is a problem. In a meeting with the architecture review team, you’ve probably even said, “Let’s face it, there’s a problem.” Everyone nodded their heads in agreement and you got a smile on your face until the senior architect said, “After analyzing the network traffic and feature usage, we realize that we need to make changes in our architecture. We’re going to split the application into feature silos; each silo will have a separate database.”
And then your anger sets in. “Do you have any idea how much work that’s going to be? This will destroy normalization,” you shout. “We can’t lose integrity throughout the entire application!”
The architect smiles at you and says, “We won’t be losing critical integrity. Core writes will go to a main service, and from there they’ll be replicated to the feature database servers. You’ll have all the integrity that’s necessary for each feature to function. The developers will change how a few writes are performed in the middle tier, but that should be the biggest change.”
Reluctantly, you agree. You return to your desk, grumbling, and begin plans to rip apart the beautifully normalized schema that you worked hard to design with your database design team. You’re angry about the work you need to put into denormalizing the database and about how much bigger your changes are than the development team’s.
You’ve successfully redesigned your database by splitting out features into separate databases. This let you identify the main performance problems and move them off to their own servers. It wasn’t the solution you liked, but you came to terms with it. Heck, the migration even went smoothly. After your initial misgivings and anger you were able to implement a good solution.
Unfortunately, your design couldn’t withstand the forces of a free market economy. After being mentioned on Oprah, the slow and steady growth curve has become a spike: it’s the dreaded hockey stick of scale!
You’re prepared – you bust out your scripts and monitoring tools and get to work. Within a few hours you’ve identified the slowest queries and the crappiest indexes, and you’ve come to the conclusion that the biggest bottleneck is the three-year-old server and the five-year-old SAN.
Armed with your facts, delightful graphs, and hardware requirements you head up to your manager’s office to request new hardware. And then the bargaining begins. You know you’ll get enough hardware in the end, but at what cost?
You’re running on brand-new servers and a big, fast SAN. What could possibly go wrong? Unprecedented success. Your company continues to grow at a ridiculous rate. It’s good for the stockholders, it’s good for the executives, and it’s good for your bonuses. Unfortunately, it’s bad for your schedule.
With newfound success comes a host of new problems: locking, blocking, and now deadlocks. You could configure read replicas using replication (despite its nasty reputation). If it’s sometime after the first quarter of 2012 you could use Denali’s availability groups to scale out reads and provide some additional safety features. Either way, you know that there’s no way one server per feature is going to keep up with the load, and you know that you can’t split up your features any more than you already have.
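If you're wondering what "scaling out reads" looks like from the application's side, here's a minimal sketch in Python. Everything in it is made up for illustration: the `ReadWriteRouter` class, the server names, and the very naive "starts with SELECT" test are assumptions, not anything SQL Server or replication gives you out of the box.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the primary, reads rotate across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads simply fall back to the primary.
        self._read_targets = itertools.cycle(list(replicas) or [primary])

    def target_for(self, sql):
        # Naive classification: only statements starting with SELECT count as reads.
        words = sql.split()
        is_read = bool(words) and words[0].upper() == "SELECT"
        if is_read:
            return next(self._read_targets)
        return self.primary


router = ReadWriteRouter("primary", ["replica1", "replica2"])
read_server = router.target_for("SELECT * FROM orders")
write_server = router.target_for("UPDATE orders SET shipped = 1")
```

Keep in mind that replicas usually lag behind the primary, so a read that has to see a write that just happened should probably still go to the primary.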
You’ve resigned yourself to long nights of babysitting replication, custom solutions, and waiting for new versions of SQL Server to bring you much-needed features.
Over time you’ve turned what you thought was a Rube Goldberg machine of database scalability into a high-performance, high-availability solution. After configuring replication and, eventually, availability groups, you worked with the architecture team to design more ways to scale out – writes are being routed at the application level using a process known as sharding. Using sharding you’re now able to grow different portions of different features at different rates. You no longer have to scale up your entire infrastructure in response to the needs of a small percentage of the users; they can be broken out and scaled at their own rate.
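Application-level sharding can be sketched in a few lines of Python. This is one possible approach, not the only one: the `ShardRouter` class, the shard names, and the idea of "pinning" heavy users to their own server are all hypothetical names chosen for this example.

```python
import hashlib


class ShardRouter:
    """Hypothetical router: pinned users get their own shard, the rest are hashed."""

    def __init__(self, shards, pinned=None):
        if not shards:
            raise ValueError("at least one shard is required")
        self.shards = list(shards)
        # Explicit overrides for the small group of users who outgrow everyone else.
        self.pinned = dict(pinned or {})

    def shard_for(self, user_id):
        if user_id in self.pinned:
            return self.pinned[user_id]
        # A stable hash keeps each user on the same shard across processes and restarts.
        digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shards)
        return self.shards[index]


router = ShardRouter(
    shards=["shard-a", "shard-b", "shard-c"],
    pinned={"oprah": "shard-vip"},
)
vip_shard = router.shard_for("oprah")
regular_shard = router.shard_for(42)
```

The pinned lookup is what lets a handful of busy users scale at their own rate. The catch with the plain modulo step is that adding a shard changes where most keys land, so plan for a data migration or a smarter mapping before you grow the list.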
Of course, not all stages of growth will happen in this order. You might even skip some of them altogether. No matter how you grow, these mechanisms for dealing with growth are all valid and will serve you well.