How to Design Multi-Client Databases

When you’re building an application for lots of clients, there’s two common ways to design the database(s):

  • Option A: Put all clients in the same database
  • Option 2: Build one database per client

I’ve got clients on both sides of this religious war, and last week I had the misfortune of taking two of them on opposite sides to lunch.  After hearing them passionately debate their choices, I had to put together a blog post talking about the pros and cons of each solution:

Lots of Clients, One Database
The Traditional Solution

Putting All the Clients in the Same Database

It’s simple: just add a Client table at the top of the schema, add a ClientUsers table to make sure people only see their own data, and away we go.

Easier schema management. When developers deploy a new version of the application, they only have to make schema changes in one database.  There’s no worries about different customers being out of sync or on the wrong version.

Easier performance tuning. We can check index usage and statistics in just one place, implement improvements easily, and see the effects immediately across all our clients.  With hundreds or thousands of databases, even the smallest change can be difficult to coordinate.  We can check our procedure cache contents and know for certain which queries or stored procedures are the most intensive across our entire application, whereas if we’re using separate databases per client, we may have a tougher time aggregating query use across different execution plans.

Easier to build reports and an external API. If we need to grant access to our entire database for outsiders to build products, we can do that easier if all of the data is in a single database.  If the API has to deal with grouping data from multiple databases on multiple servers, it adds development and testing time.  (On the other hand, that “multiple servers” thing starts to hint at a restriction for the one-database-to-rule-them-all scenario: one database usually means all our load impacts just one database server.)

Easier high availability & disaster recovery. It’s really, really simple to manage database mirroring, log shipping, replication, and clustering if all we have to worry about is just one database.  We can build a heck of an infrastructure quickly.

Each Client Gets A Database
The Trendy Solution

Putting Each Client in its Own Database

Many applications start their lives as internal apps for a single company’s needs.  They realize the application is really valuable, so they start selling access to outsiders.  It’s simple: when your second client signs on, just add another copy of the database and away we go.

Easier single-client restores. Clients are unreliable meatbags.  (Except mine – they’re reliable meatbags.)  They have all kinds of “oops” moments where they want to retrieve all of their data back to a point in time, and that’s a huge pain in the rear if their data is intermingled with other client data in the same tables.  Restores in a single-client-database scenario are brain-dead easy: just restore the client’s database.  No one else is affected.

Easier data exports. Clients love getting their hands on their data.  They want the security of knowing they can get their data out anytime they want, avoiding the dreaded vendor lock-in scenario, and they want to do their own reporting.  With each client’s data isolated into their own database, we can simply give them a copy of their own database backup.  We don’t have to build data export APIs.

Easier multi-server scalability. When our application needs more power than we can get from a single server, we can divide up the databases between multiple servers.  We can also spread out the load geographically, putting servers in Asia or Europe to be closer to clients.

Easier per-client performance tuning. If some clients use different features or reports, we can build a specialized set of indexes or indexed views just for those clients without growing everyone’s data size.  Granted, there’s some risk here – by allowing schema differences between clients, we’ve just made our code deployments a little riskier and our performance management more difficult.

Easier security management. As long as we’ve properly locked down security with one user per database, we don’t have to worry about Client X accessing Client Y’s data.  However, if we just use a single login for everyone, then we haven’t really addressed this concern.

Easier maintenance windows.  In a global environment where customers are scattered around the globe, it’s easier to take customers offline for maintenance if we can do it in groups or zones.

What’s My Verdict?

(No, I wouldn't actually eat this.)
And you thought cake couldn’t get any better.

There’s no one right choice: you have to know your own company’s strengths and weaknesses.  Let’s take my two clients as examples.

Company A excels at hardware performance tuning.  They’re really, really good at wringing the very last bit of performance out of hardware, and they don’t mind replacing their SQL Server hardware on a 12-18 month cycle.  (They refresh web servers every 4-6 months!)  Their Achilles’ heel is extreme compliance and security requirements.  They have incredible auditing needs, and it’s just easier for them to implement bulletproof controls on a single server, single database than it is to manage those requirements across thousands of databases on dozens of servers.  They chose one database, one server, many clients.

Company 2 excels at development practices.  Managing schema changes and code deployments across thousands of databases just isn’t an issue for them.  They have clients around the world, and they’re processing credit card transactions for those clients around the clock.  They need the ability to spread load geographically, and they don’t want to replace servers around the world every 12-18 months.  They chose one database for each client, and it’s paying off as they start to put SQL Servers in Asia and Europe for their offshore clients.

Previous Post
SQL Server Locking and You!
Next Post
Fixing SQL Server Management Studio’s Tab Text

54 Comments. Leave new

  • “Easier multi-server scalability” is the clincher for us. It’s the one point that takes me squarely off the fence and into Option B er… I mean Option 2.

    Reply
  • Pavel Nefyodov
    June 2, 2011 8:39 am

    Why on Earth all good ideas come to Brent’s head first and then to mine? This is what I think when I am reading his new blog entries. I am glad to hear that there is no such thing as “one size that fits all”!
    There is only one thing Brent, why “Company A” and “Company 2”? Why not “Company A” and “Company B” or “Company 1” and “Company 2”?

    Reply
  • I’ve actually used a hybrid approach where there was a client master database with minimal information stored in it, but the key was the client table that includes the connection string pinpointing where the client information was located. This allowed us to start out having all of our information in a single database (and server), but grow out the number of servers and databases as appropriate (and as we needed to scale). As we added clients to the system, we could pick which database server (and database) to place the client data. This approach does miss out on aggregating data across all databases, but we didn’t have a need to merge the data together.

    We also created a system for automating the building of the database deployment scripts. This system would generate the necessary SQL scripts to be run on each database server (it didn’t automatically execute the scripts, but provided the scripts for us to give to the operation guys to deploy). As we scaled, we did plan on automating that script deployment.

    Reply
  • Robert Tenney
    June 2, 2011 9:56 am

    Who says you have to choose? Both of these can work together but it is a lot harder to implement Option A after your app is already rolled out. Design your database from the beginning to handle multiple tenants. As you grow it is relatively easy to add Option 2 and give a client its own database or scale out by splitting up your clients. Just because a database can contain more than one client doesn’t mean it has to.

    Reply
  • Larry Silverman
    June 2, 2011 10:35 am

    We had a debate similar to this about 9 years ago when starting our company, and came down on the “one database per customer” side. Our primary deciding factor which isn’t mentioned in your article is the risk that one slip-up by a developer and you risk exposing Customer A’s data to Customer B. I definitely feel we made the right choice for our company.

    Another benefit we enjoy is the ability to stand up N copies of a given customer’s database to create test environments for them. Also, if we really need to dig into a performance problem that one customer is experiencing, we can restore a backup on a different machine and troubleshoot it there, rather than impacting the production environment. It’s much less data to have to move around and restore than if we just had one monolithic database.

    Reply
  • We use 1 DB per client, and have a few additional reasons to do it this way.

    1) Licensing cost. We use many instances of SQL Express, which is free! Combine with cheaper commodity servers because you scale out.

    B) Rolling migrations to new version. We don’t upgrade everyone at once, so we can catch problems with the first few clients, rather than having everyone affected.

    iii) Some clients aren’t willing to use a hosted solution, so they can buy a downloaded version of the app, and use the same DB in either place.

    The biggest downside in my view is that high availability is much harder.

    Reply
  • Hi Brent,
    I like this statement “There’s no one right choice”.

    Of course, They might need some different solution at some point over time.
    Option A: Might need server scale out.
    Option B: Might need consolidation.

    Reply
  • Rob McCauley
    June 3, 2011 11:46 pm

    How about the “Middle Way”

    ‘Everything exists’: That is one extreme.
    ‘Everything doesn’t exist’: That is a second extreme.
    Avoiding these two extremes,
    the Tathagata teaches the Dhamma via the middle.

    What I’m referring to is a hybrid approach, where each client gets their own schema within a single database.

    My application initially has just a [Model] schema- with tables, views, procs, etc. I built a CloneSchema procedure which scripts and redeploys all model objects to a new schema, at customer onboard time. I also wrote a ClearSchema, RenameSchema, and MigrateSchema procedure.
    The Model schema has extended properties to track its current version and build number (checked by upgrade scripts), and the properties are copied to the newly cloned schemas.

    This approach effectively “shards” the customers to separate tables within one or many databases. The DBA can decide when a database is big enough, and spin up additional databases to hold new customer schemas, or as a new location for a large client schema to be migrated to.
    And each client’s data reside in their own private tables, which gives them warm fuzzies.

    Alas, it took some time to develop the schema management procedures. I wish the powerful SMO.Scripter functions were available directly from TSQL. Also would be nice if the application could set a schema context via a ‘Use Schema’ similar to ‘Use Database’.

    But overall, the schema approach works and meets the performance, management, and compliance needs of the business and the clients.

    Reply
    • Rob – that’s interesting, but it may not work well as the database grows. For example, if you have a client suddenly grow, and you need to migrate them to another database, you’re going to have a huge amount of logged transactions and downtime while you move their schema and data from one database to another. You also can’t quickly move a single client’s data from slower to faster storage without downtime – whereas if it’s in a dedicated database, we can use tools like Storage vMotion or SAN virtualization to move a single database’s files across storage types without downtime. I can’t do that when the data is buried inside the same database files as other client data.

      You also can’t give regular database backups to a client without building your own tools. With dedicated client databases, backup & recovery of individual data is a piece of cake.

      I’m glad it’s working for you, though, and that’s the important part: it has to meet your own needs. Thanks!

      Reply
  • What’s up with all the wedding cake pics? Are you finally getting ready to tie the knot? 🙂

    Reply
  • SJ (@gonzodagr8)
    June 27, 2011 10:34 am

    Brent,

    Interesting discussion. We had a very very long discussion internally and revisit it for each major release. Each time we come to the same conclusion for us…each one separated. We have argued all of the points above but one that we consistently come up against is all the PII laws (especially in the EU) and clients requiring their data to NOT be co-mingled.

    Reply
  • Hi All,
    I am hoping that, i am not too late here to post my requirement. I have gone through above discussion and want to post my question here. I want to create a project where two of different client will have same database. Both clients will have different tables for their information details. They will share common tables into database like user login, client table etc. I am creating one table which store table name of each client with client id. So when ever any client logged in it will fetch that table name to retrieve record of particular client. I am using two different tables for each client because they have different columns. For example one client would have website column but other not. At same time, one more table to store column name of each client table for display sequence with client id to fetch this column name result. Each client column title to display is different which I am storing in same table.
    I know all information and structure of data is bit confusing. But I am currently working on it as assignment. Can anyone help me out for this?

    Reply
    • This is a really interesting question. I wish I had the time to dedicate to give you an answer that’s just as interesting, but there’s a lot of possibilities here. The best way to get help on something like this is to narrow down the question as tightly as possible, then post it on http://StackOverflow.com or http://SQLServerCentral.com. Include as much example code as you can so that others can quickly run it and get an idea of what you’re asking, and they might even post your code back with fixes. If you don’t get good answers within a day or two, let me know and I’ll take a look at the question to see if there’s a way we can get better answers.

      Reply
  • Thanks Brent for your quick reply. I have been finding solution for this from last one week but have not found proper solution online. As you mentioned i will post my requirement on stackoverlow for help.

    Reply
  • Just read your article, very interesting. We’re actually going towards Option 2: Build one database per client, but our concern is scalability. Your article talks about hundreds of databases, but what if you’re expecting 100,000 customers, that would mean 100,000+ databases. Granted in our case, these would be relatively small databases, approx 10 MB each, would that number cause an issue if you want to ensure high availability?

    Reply
    • Allan – personalized advice on building an SaaS app is kinda beyond the scope of what I can do in blog comments. We do have a couple of clients with that number of databases, but keep in mind the more databases you have, the more work you have to do for high availability and disaster recovery – just as I stated in the article above. If you’d like personalized advice, feel free to email us about a consulting engagement. Thanks!

      Reply
  • Dale Anderson
    May 22, 2015 1:10 pm

    Times have changed —

    I just read through most of this blog/comments, and the one on ‘Why use Schemas’ — While I do see value on both sides of this interesting debate, I think the real answer comes down to one thing: The right solution for the job!

    Now that means that I do agree that the ‘One DB per Client’ model can work quite well in many circumstances; but let’s be honest, only for applications that don’t have thousands of clients (or more). Also if secure, segregated multi-tenant data is the goal, this can be very problematic for developers facing HIPPA auditors, or data warehouses with billions of rows, some public (shared), some private (must be secure). Of course when you have thousands if database instances, administration and management, can quickly become a DBA nightmare.

    Now that Big Data and Cloud Computing is here, SaaS based applications often require support for thousands of clients (as Allan points out, maybe hundreds of thousands). Since we’re being honest here, really, DBA’s hate dealing with databases — right? It’s their job sure; but they want it to be easy; and good ones what to automate (which I highly encourage). Well if it was easy everyone would be able to do it. Case in point – a client of mine bought an application being customized that uses the ‘One DB per Client’ model. Currently there are 300+ clients and already the DBA’s complain. And guess what, the developers are complaining too — they can’t seem to write API’s that cross over these databases seamlessly and constantly ask why a ‘Client’ umbrella table can’t be created to manage these customers in a single DB. The bigger problem is that once the customization is complete, they plan to migrate an OLD (and I mean OOOOLD) database which over 8000+ clients. They cringe at the thought of the migration let alone managing up to 10K databases once they sell to a few more clients. But, that’s the application; they don’t keep millions of rows in each database so they are all very small, which is another complaint we’ve heard over and over “these DB’s are small and many; doesn’t; seem right! Seems like a lot of work for no value”! This solution has proven to be a huge burden for them. Nobody likes it.

    Now, the ‘Multi Client single DB’ option gets much more interesting when SaaS and Big Data are concerned. But I assure you that I am not going to go into the Hadoop mantra here; let’s keep this to MS SQL Server… While I find some of the arguments in favor of ‘Multi Client single DB’ quite interesting, no one mentioned the best use matching up of a ‘Multi-Schema per FILEGROUP’ design option. What I’ve done is to create a GLOBAL file group containing Metadata about everyone and everything in the DB, and a SEPARATE schema and file group for each client. This enables me to GUARENTEE data separation, user security, scalability, performance, AND manage everything in what turns out to be a ‘virtual’ single DB (which by the way I can still split out into multiple DB if I want to for distribution considerations) where each Schema/FILEGROUP has its own unique dataset based on an identical physical data model. This allows developers to write a common API data access layer AND make DBA automation much easier.

    There, does that make sense?

    BTW: My application is multi-tenant, cloud based, data warehouse, & BI where clients might have millions or billions of rows. Schema’s, FILEGROUPS, and carefully designed data models give me the flexibility for growth without the cost of hiring an army of DBA’s.

    Reply
    • Steve Thompson
      March 13, 2017 4:36 pm

      Very interesting article and comments. it is clear there will never be a one size fits all. Dale – your one client per schema/filegroup with a master global file group is interesting. Was wondering if SQL 2016 Row Level Security could solve some of the multi-tenant single db solution.

      Reply
  • “It’s simple: just add a Client table at the top of the schema, add a ClientUsers table to make sure people only see their own data..” I didn’t get you there. How can we do this? Can you share some link for example?

    Reply
    • Andy – sure, in pseudocode:

      CREATE TABLE dbo.Clients(ClientId INT, ClientName VARCHAR(100));
      CREATE TABLE dbo.Employees(EmployeeId INT, ClientId INT, EmployeeName VARCHAR(100));

      Then whenever one of your customers wants to query their list of employees, you need to filter it by their ClientId.

      Reply
  • Tim Janssen
    July 10, 2018 8:57 am

    In a Database per client setup, there will be differences between clients in the way the database is used, so we experienced that we had to create some client specific indexes or different fill factors. We are still looking on how to manage these differences. We somehow want to keep track of the changes in GIT or some other repository but also want to have an automated way to check if all indexes are in the right way in the right client database. Do you guys have any good idea?

    Reply
  • Rajesh Chilambi
    March 21, 2019 2:22 pm

    In the case of having one db for every client – how would we do data warehousing and analytics. Are there examples of doing this anywhere? Are there any recommendations to how we architect this?

    Reply
    • Rajesh – that kind of architecture design is a little beyond what I can do free in a blog comment, but it’s a great example of the kind of thing you bring in a consultant to help you tackle.

      Reply
  • Chaitanay Deshmukh
    March 24, 2019 10:02 pm

    Hello Brent, first of all thanks for this article. There are very few articles about this topic confidently and this one of the best, and comments too. I have a question, I read comment from Rob McCauley above and this is the approach that we have. We have single MSSQL database instance and inside that we have separate schema for each client. All these schemas share same table structure. Now what we do in our Java code is, based on the client identified, we connect to that database within the same Database Instance. We do this by firing “Use [XYZ]” query and then returning connection through Hibernate Multi – tenancy. But what we observed is, there is intermittent slowness on production and it is difficult to diagnose.
    When I read about Hibernate Multi-tenancy it says that data for each client should reside on physically separate database instance. Which is not the case in our environment.
    My question here is, how connection pooling will be done in this case, as we have single Data-source configured in Wildly? This data-source is shared between all clients and reruns multiple connections based on client identifier.
    My suspicion is in connection pooling where we spend lot of time. Having data for different Clients on single database instance is ok but at least connection pooling should have been client based where each client has a
    separate connection pool.
    Can you please share your thought on this one?

    Reply
  • Hi Brent,
    Currently we are looking at one scheme one client using MySQL.We have close to 2000 clients.
    Could you please outline on the factors we should consider.
    Please note that currently the data is in MS SQL server and eventually we want to move on to MySQL.
    What are the ways to move data from ms sql server to mysql. And what are the best options to promote a code to all the 2000 schemas ? What kind of tools would you recommend to load data to mysql in the future.
    Can you please share your thoughts.

    Reply
    • Kowsalya – this site is about Microsoft SQL Server, not MySQL. I can’t really give you advice on MySQL. Best of luck on your journey.

      Reply
      • Hi Brent,
        We have not yet decided on MySql. The decision is between using MS SQL Server and MySQL. But if the decision is made to use MS SQL Server what would be your options.Could you please provide your thoughts on MS SQL Server.

        Reply
  • What about sp and udf? If i want to move from centerlized db to client-per-db i need to have all sp and udf to be copied to all client’s db and every new development need to be deployed to all

    How something like this can be achieved?

    Reply
  • Sandip Limbachiya
    December 31, 2020 2:00 am

    Hello Brent,

    We have a system with 220 clients and we have created a Master database with common data and 220 Client base separate database. We are thinking to merge it into one database. but all client has above 1 Lacs record. Can you assist me that am I going right? Is it good to merge or continue the same pattern?

    Thanks.

    Reply
  • Himanshu Bhujbal
    March 24, 2021 12:08 pm

    Thanks Brent Ozar, and other Members for sharing insights.

    I want to ask specifically what will be the impact on Query Run time for both 1 db per client and 1 schema per client.

    Which option will give or will be fast in running queries and daily data load task ?

    Regards,
    Himanshu

    Reply
  • Sanjay Thavi
    June 28, 2021 7:52 am

    Why you people are so much bothered
    Design your application for option 1
    it will work for both single client single database
    and Single Database Multiple clients.

    But option 2 will not work for multiple clients in single database.

    Reply
    • Sanjay – yes, you’re referring to a hybrid approach. That can be the best of both worlds – but also the worst of both worlds. I’d recommend re-reading the post a little more carefully, and thinking about how the drawbacks of your design might work. Cheers!

      Reply
  • Hi Brent!

    Would be interested to hear your opinion on the idea of using db per client model, but where you have ‘shared’ tables, i.e. content tables which all clients should have access to? Would you store this on a base/core db which all clients would also have access to as well as their own, or when new records are created they are sync’d out to each db by e.g. use of SSP?

    Reply
  • Jacek Trocinski
    April 6, 2022 8:24 am

    Does having DB per user/client scale well for a web application with hundreds of thousands of users?

    Reply
    • It can, yes – I’ve got a client with over half a million databases, each of which is a paying user with their own separate database. It’s hard work – but consider the opposite.

      If you have a single database, and half a million paying users, that presents problems too. 😉

      Reply
  • Jacek Trocinski
    April 6, 2022 8:48 am

    Where do you store common tables that are updated by an ETL process (e.g., stock prices)?

    Reply
    • That’s a great question, and it’s the kind of thing I tackle in my consulting work all the time. The answer is different in each scenario depending on your application. For help, click the Consulting link at the top of the screen.

      Reply
  • Jacek Troci?ski
    April 6, 2022 10:31 am

    Can you specify on what databases the multi-client architecture would work?

    E.g., I can’t imagine this type of architecture would work on an Oracle database.

    Reply
  • Jonathan Marbutt
    February 7, 2023 1:11 am

    Where I have struggled is finding any great tooling for managing database migrations well for thousands of databases. Are most people using like the Visual Studio DB Projects to manage these for deployments? Or are they using tools like red-gate? Or rolling their own to manage lots of tenants with individual databases.

    Reply
    • I’ll rephrase your question: “Is there a big market for production-quality tools for SaaS vendors?”

      Simply, no. Think about the diminishing number of companies that meet each of these specs:
      * Build SaaS apps
      * Hosted with a SQL Server back end
      * Putting each client in its own database
      * Who are only now getting started, or who can dump their existing homegrown tooling and move to something else

      There just aren’t many.

      Reply
  • Ivan Argentinski
    June 10, 2023 4:02 pm

    We have 1 db per client. The article is great, but feels incomplete for me in one aspect:

    Per-client performance tuning

    We have clients with 5+TB dbs and we do that a lot. The difference is that once we find a solution, we do not simply implement it only for that customer (even if we do, it would be wiped away at the next schema upgrade).

    We do test it against many dbs and finally implement it in a central schema. Than, it is gradually deployed to all customers. Of course, we make sure that it is deployed immediately or very soon to the affected customer.

    So, in essence, we do the tuning per client, but than deploy to all clients. So we do not have different schemas.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.