Blog

When you’re building an application for lots of clients, there’s two common ways to design the database(s):

  • Option A: Put all clients in the same database
  • Option 2: Build one database per client

I’ve got clients on both sides of this religious war, and last week I had the misfortune of taking two of them on opposite sides to lunch.  After hearing them passionately debate their choices, I had to put together a blog post talking about the pros and cons of each solution:

Lots of Clients, One Database

The Traditional Solution

Putting All the Clients in the Same Database

It’s simple: just add a Client table at the top of the schema, add a ClientUsers table to make sure people only see their own data, and away we go.

Easier schema management. When developers deploy a new version of the application, they only have to make schema changes in one database.  There’s no worries about different customers being out of sync or on the wrong version.

Easier performance tuning. We can check index usage and statistics in just one place, implement improvements easily, and see the effects immediately across all our clients.  With hundreds or thousands of databases, even the smallest change can be difficult to coordinate.  We can check our procedure cache contents and know for certain which queries or stored procedures are the most intensive across our entire application, whereas if we’re using separate databases per client, we may have a tougher time aggregating query use across different execution plans.

Easier to build an external API. If we need to grant access to our entire database for outsiders to build products, we can do that easier if all of the data is in a single database.  If the API has to deal with grouping data from multiple databases on multiple servers, it adds development and testing time.  (On the other hand, that “multiple servers” thing starts to hint at a restriction for the one-database-to-rule-them-all scenario: one database usually means all our load impacts just one database server.)

Easier high availability & disaster recovery. It’s really, really simple to manage database mirroring, log shipping, replication, and clustering if all we have to worry about is just one database.  We can build a heck of an infrastructure quickly.

Each Client Gets A Database

The Trendy Solution

Putting Each Client in its Own Database

Many applications start their lives as internal apps for a single company’s needs.  They realize the application is really valuable, so they start selling access to outsiders.  It’s simple: when your second client signs on, just add another copy of the database and away we go.

Easier single-client restores. Clients are unreliable meatbags.  (Except mine – they’re reliable meatbags.)  They have all kinds of “oops” moments where they want to retrieve all of their data back to a point in time, and that’s a huge pain in the rear if their data is intermingled with other client data in the same tables.  Restores in a single-client-database scenario are brain-dead easy: just restore the client’s database.  No one else is affected.

Easier data exports. Clients love getting their hands on their data.  They want the security of knowing they can get their data out anytime they want, avoiding the dreaded vendor lock-in scenario, and they want to do their own reporting.  With each client’s data isolated into their own database, we can simply give them a copy of their own database backup.  We don’t have to build data export APIs.

Easier multi-server scalability. When our application needs more power than we can get from a single server, we can divide up the databases between multiple servers.  We can also spread out the load geographically, putting servers in Asia or Europe to be closer to clients.

Easier per-client performance tuning. If some clients use different features or reports, we can build a specialized set of indexes or indexed views just for those clients without growing everyone’s data size.  Granted, there’s some risk here – by allowing schema differences between clients, we’ve just made our code deployments a little riskier and our performance management more difficult.

Easier security management. As long as we’ve properly locked down security with one user per database, we don’t have to worry about Client X accessing Client Y’s data.  However, if we just use a single login for everyone, then we haven’t really addressed this concern.

Easier maintenance windows.  In a global environment where customers are scattered around the globe, it’s easier to take customers offline for maintenance if we can do it in groups or zones.

What’s My Verdict?

(No, I wouldn't actually eat this.)

And you thought cake couldn't get any better.

There’s no one right choice: you have to know your own company’s strengths and weaknesses.  Let’s take my two clients as examples.

Company A excels at hardware performance tuning.  They’re really, really good at wringing the very last bit of performance out of hardware, and they don’t mind replacing their SQL Server hardware on a 12-18 month cycle.  (They refresh web servers every 4-6 months!)  Their Achilles’ heel is extreme compliance and security requirements.  They have incredible auditing needs, and it’s just easier for them to implement bulletproof controls on a single server, single database than it is to manage those requirements across thousands of databases on dozens of servers.  They chose one database, one server, many clients.

Company 2 excels at development practices.  Managing schema changes and code deployments across thousands of databases just isn’t an issue for them.  They have clients around the world, and they’re processing credit card transactions for those clients around the clock.  They need the ability to spread load geographically, and they don’t want to replace servers around the world every 12-18 months.  They chose one database for each client, and it’s paying off as they start to put SQL Servers in Asia and Europe for their offshore clients.

↑ Back to top
  1. “Easier multi-server scalability” is the clincher for us. It’s the one point that takes me squarely off the fence and into Option B er… I mean Option 2.

  2. Why on Earth all good ideas come to Brent’s head first and then to mine? This is what I think when I am reading his new blog entries. I am glad to hear that there is no such thing as “one size that fits all”!
    There is only one thing Brent, why “Company A” and “Company 2″? Why not “Company A” and “Company B” or “Company 1″ and “Company 2″?

  3. I’ve actually used a hybrid approach where there was a client master database with minimal information stored in it, but the key was the client table that includes the connection string pinpointing where the client information was located. This allowed us to start out having all of our information in a single database (and server), but grow out the number of servers and databases as appropriate (and as we needed to scale). As we added clients to the system, we could pick which database server (and database) to place the client data. This approach does miss out on aggregating data across all databases, but we didn’t have a need to merge the data together.

    We also created a system for automating the building of the database deployment scripts. This system would generate the necessary SQL scripts to be run on each database server (it didn’t automatically execute the scripts, but provided the scripts for us to give to the operation guys to deploy). As we scaled, we did plan on automating that script deployment.

  4. Who says you have to choose? Both of these can work together but it is a lot harder to implement Option A after your app is already rolled out. Design your database from the beginning to handle multiple tenants. As you grow it is relatively easy to add Option 2 and give a client its own database or scale out by splitting up your clients. Just because a database can contain more than one client doesn’t mean it has to.

  5. We had a debate similar to this about 9 years ago when starting our company, and came down on the “one database per customer” side. Our primary deciding factor which isn’t mentioned in your article is the risk that one slip-up by a developer and you risk exposing Customer A’s data to Customer B. I definitely feel we made the right choice for our company.

    Another benefit we enjoy is the ability to stand up N copies of a given customer’s database to create test environments for them. Also, if we really need to dig into a performance problem that one customer is experiencing, we can restore a backup on a different machine and troubleshoot it there, rather than impacting the production environment. It’s much less data to have to move around and restore than if we just had one monolithic database.

  6. We use 1 DB per client, and have a few additional reasons to do it this way.

    1) Licensing cost. We use many instances of SQL Express, which is free! Combine with cheaper commodity servers because you scale out.

    B) Rolling migrations to new version. We don’t upgrade everyone at once, so we can catch problems with the first few clients, rather than having everyone affected.

    iii) Some clients aren’t willing to use a hosted solution, so they can buy a downloaded version of the app, and use the same DB in either place.

    The biggest downside in my view is that high availability is much harder.

  7. Hi Brent,
    I like this statement “There’s no one right choice”.

    Of course, They might need some different solution at some point over time.
    Option A: Might need server scale out.
    Option B: Might need consolidation.

  8. Pingback: Something for the Weekend – SQL Server Links 03/06/11

  9. How about the “Middle Way”

    ‘Everything exists’: That is one extreme.
    ‘Everything doesn’t exist’: That is a second extreme.
    Avoiding these two extremes,
    the Tathagata teaches the Dhamma via the middle.

    What I’m referring to is a hybrid approach, where each client gets their own schema within a single database.

    My application initially has just a [Model] schema- with tables, views, procs, etc. I built a CloneSchema procedure which scripts and redeploys all model objects to a new schema, at customer onboard time. I also wrote a ClearSchema, RenameSchema, and MigrateSchema procedure.
    The Model schema has extended properties to track its current version and build number (checked by upgrade scripts), and the properties are copied to the newly cloned schemas.

    This approach effectively “shards” the customers to separate tables within one or many databases. The DBA can decide when a database is big enough, and spin up additional databases to hold new customer schemas, or as a new location for a large client schema to be migrated to.
    And each client’s data reside in their own private tables, which gives them warm fuzzies.

    Alas, it took some time to develop the schema management procedures. I wish the powerful SMO.Scripter functions were available directly from TSQL. Also would be nice if the application could set a schema context via a ‘Use Schema’ similar to ‘Use Database’.

    But overall, the schema approach works and meets the performance, management, and compliance needs of the business and the clients.

    • Rob – that’s interesting, but it may not work well as the database grows. For example, if you have a client suddenly grow, and you need to migrate them to another database, you’re going to have a huge amount of logged transactions and downtime while you move their schema and data from one database to another. You also can’t quickly move a single client’s data from slower to faster storage without downtime – whereas if it’s in a dedicated database, we can use tools like Storage vMotion or SAN virtualization to move a single database’s files across storage types without downtime. I can’t do that when the data is buried inside the same database files as other client data.

      You also can’t give regular database backups to a client without building your own tools. With dedicated client databases, backup & recovery of individual data is a piece of cake.

      I’m glad it’s working for you, though, and that’s the important part: it has to meet your own needs. Thanks!

  10. What’s up with all the wedding cake pics? Are you finally getting ready to tie the knot? :)

  11. Pingback: Links for June 10th through June 11th — Vinny Carpenter's blog

  12. Brent,

    Interesting discussion. We had a very very long discussion internally and revisit it for each major release. Each time we come to the same conclusion for us…each one separated. We have argued all of the points above but one that we consistently come up against is all the PII laws (especially in the EU) and clients requiring their data to NOT be co-mingled.

  13. Hi All,
    I am hoping that, i am not too late here to post my requirement. I have gone through above discussion and want to post my question here. I want to create a project where two of different client will have same database. Both clients will have different tables for their information details. They will share common tables into database like user login, client table etc. I am creating one table which store table name of each client with client id. So when ever any client logged in it will fetch that table name to retrieve record of particular client. I am using two different tables for each client because they have different columns. For example one client would have website column but other not. At same time, one more table to store column name of each client table for display sequence with client id to fetch this column name result. Each client column title to display is different which I am storing in same table.
    I know all information and structure of data is bit confusing. But I am currently working on it as assignment. Can anyone help me out for this?

    • This is a really interesting question. I wish I had the time to dedicate to give you an answer that’s just as interesting, but there’s a lot of possibilities here. The best way to get help on something like this is to narrow down the question as tightly as possible, then post it on http://StackOverflow.com or http://SQLServerCentral.com. Include as much example code as you can so that others can quickly run it and get an idea of what you’re asking, and they might even post your code back with fixes. If you don’t get good answers within a day or two, let me know and I’ll take a look at the question to see if there’s a way we can get better answers.

  14. Thanks Brent for your quick reply. I have been finding solution for this from last one week but have not found proper solution online. As you mentioned i will post my requirement on stackoverlow for help.

  15. Pingback: Designing a database for multiple clients

  16. Just read your article, very interesting. We’re actually going towards Option 2: Build one database per client, but our concern is scalability. Your article talks about hundreds of databases, but what if you’re expecting 100,000 customers, that would mean 100,000+ databases. Granted in our case, these would be relatively small databases, approx 10 MB each, would that number cause an issue if you want to ensure high availability?

    • Allan – personalized advice on building an SaaS app is kinda beyond the scope of what I can do in blog comments. We do have a couple of clients with that number of databases, but keep in mind the more databases you have, the more work you have to do for high availability and disaster recovery – just as I stated in the article above. If you’d like personalized advice, feel free to email us about a consulting engagement. Thanks!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

css.php