When you’re building an application for lots of clients, there’s two common ways to design the database(s):
- Option A: Put all clients in the same database
- Option 2: Build one database per client
I’ve got clients on both sides of this religious war, and last week I had the misfortune of taking two of them on opposite sides to lunch. After hearing them passionately debate their choices, I had to put together a blog post talking about the pros and cons of each solution:
Putting All the Clients in the Same Database
It’s simple: just add a Client table at the top of the schema, add a ClientUsers table to make sure people only see their own data, and away we go.
Easier schema management. When developers deploy a new version of the application, they only have to make schema changes in one database. There’s no worries about different customers being out of sync or on the wrong version.
Easier performance tuning. We can check index usage and statistics in just one place, implement improvements easily, and see the effects immediately across all our clients. With hundreds or thousands of databases, even the smallest change can be difficult to coordinate. We can check our procedure cache contents and know for certain which queries or stored procedures are the most intensive across our entire application, whereas if we’re using separate databases per client, we may have a tougher time aggregating query use across different execution plans.
Easier to build reports and an external API. If we need to grant access to our entire database for outsiders to build products, we can do that easier if all of the data is in a single database. If the API has to deal with grouping data from multiple databases on multiple servers, it adds development and testing time. (On the other hand, that “multiple servers” thing starts to hint at a restriction for the one-database-to-rule-them-all scenario: one database usually means all our load impacts just one database server.)
Easier high availability & disaster recovery. It’s really, really simple to manage database mirroring, log shipping, replication, and clustering if all we have to worry about is just one database. We can build a heck of an infrastructure quickly.
Putting Each Client in its Own Database
Many applications start their lives as internal apps for a single company’s needs. They realize the application is really valuable, so they start selling access to outsiders. It’s simple: when your second client signs on, just add another copy of the database and away we go.
Easier single-client restores. Clients are unreliable meatbags. (Except mine – they’re reliable meatbags.) They have all kinds of “oops” moments where they want to retrieve all of their data back to a point in time, and that’s a huge pain in the rear if their data is intermingled with other client data in the same tables. Restores in a single-client-database scenario are brain-dead easy: just restore the client’s database. No one else is affected.
Easier data exports. Clients love getting their hands on their data. They want the security of knowing they can get their data out anytime they want, avoiding the dreaded vendor lock-in scenario, and they want to do their own reporting. With each client’s data isolated into their own database, we can simply give them a copy of their own database backup. We don’t have to build data export APIs.
Easier multi-server scalability. When our application needs more power than we can get from a single server, we can divide up the databases between multiple servers. We can also spread out the load geographically, putting servers in Asia or Europe to be closer to clients.
Easier per-client performance tuning. If some clients use different features or reports, we can build a specialized set of indexes or indexed views just for those clients without growing everyone’s data size. Granted, there’s some risk here – by allowing schema differences between clients, we’ve just made our code deployments a little riskier and our performance management more difficult.
Easier security management. As long as we’ve properly locked down security with one user per database, we don’t have to worry about Client X accessing Client Y’s data. However, if we just use a single login for everyone, then we haven’t really addressed this concern.
Easier maintenance windows. In a global environment where customers are scattered around the globe, it’s easier to take customers offline for maintenance if we can do it in groups or zones.
What’s My Verdict?
There’s no one right choice: you have to know your own company’s strengths and weaknesses. Let’s take my two clients as examples.
Company A excels at hardware performance tuning. They’re really, really good at wringing the very last bit of performance out of hardware, and they don’t mind replacing their SQL Server hardware on a 12-18 month cycle. (They refresh web servers every 4-6 months!) Their Achilles’ heel is extreme compliance and security requirements. They have incredible auditing needs, and it’s just easier for them to implement bulletproof controls on a single server, single database than it is to manage those requirements across thousands of databases on dozens of servers. They chose one database, one server, many clients.
Company 2 excels at development practices. Managing schema changes and code deployments across thousands of databases just isn’t an issue for them. They have clients around the world, and they’re processing credit card transactions for those clients around the clock. They need the ability to spread load geographically, and they don’t want to replace servers around the world every 12-18 months. They chose one database for each client, and it’s paying off as they start to put SQL Servers in Asia and Europe for their offshore clients.