Tiering kCura Relativity Databases (Or Any SaaS Product)

When you’re the database administrator working with a software product that stores every client’s data in a different database, the sheer number of databases can be intimidating. As you grow from dozens to hundreds to thousands of databases, you can’t treat all of them equally.

Start by making a graph of the database sizes – here’s an example from one of my clients with 58 databases on a server (certainly not a big number, but just easy to digest in a blog post):

Relativity Databases by Size

Relativity Databases by Size

Next, divide the databases into two tiers:

Standard Tier: In this example, it’s the 53 databases on the left, all under 50GB in size. We can put all of them on the same clustered SQL Server instance, on one pool of shared storage. We use native SQL Server backups on those, and we don’t do any index tuning on those databases at all. The standard Relativity database schema works just fine for them.

Premium Tier: The 5 databases on the right are all over 50GB, and we’ll treat those differently. There’s nothing magic about the 50GB number – we just picked the outliers that are much larger and will require more hand-holding for performance and reliability management.

Your top 10% largest databases will always need your full attention. It’s not that kCura Relativity or SQL Server have a hard time with 50GB of data in one database – indeed, we’ve got clients hundreds of times larger than that – but you will have a hard time with it. After all, your 10% largest databases will always scare you because they’ll be the biggest ones you’ve ever worked with.

The Premium Tier gets:

  • Custom performance management – every week, the DBA looks at SQL Server’s index and query metrics on those databases to figure out which indexes to tune.
  • Custom backup management – as these databases grow, they may be a better fit for SAN snapshot backups, which let you back up databases of any size in under ten seconds. For that to work, though, we have to put those databases on their own storage volumes, so we can’t afford the management overhead of doing that for every database. We reserve this technique for the bigger databases.
  • Custom hardware/software – we don’t want these databases lumped in with the vast majority of databases because these larger databases tend to dominate SQL Server’s memory. SQL Server doesn’t have controls to limit the amount of data cached per database, so if one of the large databases is heavily active, he’ll push all of the other databases out of RAM. For that reason, it may make sense to put these databases on hardware with much more memory.
  • Custom filegroup/file design – depending on the storage hardware and SAN pathing, the Premium Tier databases may benefit from having multiple files per filegroup.
  • Custom index maintenance strategies – as databases get larger, it’s not as practical to rebuild their indexes so frequently, and we may switch to just updating statistics frequently instead.
  • Custom index design and query tuning – it’s not just enough to rebuild your indexes, but you also need to design new indexes for your most resource-intensive saved searches. kCura ships a set of starter indexes that work for most workspaces, but your 10% largest databases – especially if they’re pushing performance limits on your hardware – will need hand-crafted indexes.
  • Custom monitoring – when there’s a long-running query in the Premium Tier, we need to know quickly because it has much bigger performance implications than it does in the Standard Tier.
  • Partition the AuditRecord table – this technique gets you faster backups and DBCCs, but comes at the expense of more management overhead. It’s worth the work for Premium Tier workspaces.

Once we’ve implemented tiering, then we can often achieve big hardware & licensing savings. We can use cheaper hardware for the Standard Tier, and we may even be able to get away with SQL Server Standard Edition in some cases.

As Relativity hosting companies grow, they can even decide to add a third tier: Cold Cases, databases that are no longer loading new data or undergoing intensive searches. We lump all of the Cold Case tier onto much cheaper hardware, and possibly pass the savings on to the end users. (Or not, hey, whatever. I’m not here to judge.)

Every week, the DBA updates the sizing chart to show which databases have grown dramatically over the last 30 days. The business and the DBA agree on a size point where cases will be upgraded to the Premium Tier – and sometimes, the business even tells the DBA ahead of time when a case is destined to grow up.

The bottom line is that the DBA spends less time messing around with Standard Tier databases, and more time focusing on the databases that are harder to manage – and often bring in more revenue.

, , ,
Previous Post
The Day the SQL Server Quit [Video]
Next Post
4 Lightweight Ways to Tell if a Database is Used

6 Comments. Leave new

  • I’ve run across this same scenario before where the schemas were all the same, but the data volume and growth rates varied for each database. It wasn’t a big enough difference to change how maintenance was done, but the performance issues were only in the larger databases.

    Because of this, performance tuning was only done using the larger databases as the test cases, but all indexing changes were pushed across all databases. This gave us a separate tier for testing, but consistency for supporting.

    However, in your example it appears that you don’t touch the indexes on the smaller databases at all. I would believe that the optimization made for a larger database would, in an all-or-nothing mindset, cause more good than harm in smaller databases of the same schema. Is this something you agree with, or are there other factors here that I’m missing?

    Thanks,
    Steve

    Reply
    • Mike Kolek
      May 1, 2014 2:32 pm

      Steve, that’s a great question. Each Relativity workspace has its own SQL database, and users with the appropriate workspace permissions can add their own columns at any time.

      The security model is very granular. Relativity allows paralegals, project managers, and even freshly minted lawyers coming out of law school to create complex searches on the fly. The drawback is that queries typically aren’t fully optimized as Brent points out in another post.

      It’s because of this flexibility that each workspace database is likely to be a little unique. kCura does however conduct regular analysis and optimization of the indexes and statistics that are included in the template databases deployed during fresh installations.

      Reply
      • That makes more sense. I didn’t realize there were schema differences, which changes everything.

        Thanks

        Reply
        • Steve – ah, Mike beat me to this one!

          I’ll add to his reply – when I’m dealing with vendor databases, I expect the vendor to be working on overall index improvements. Those types of indexes get deployed across the board. If you’re dealing with an in-house app, then the internal developers need to be focusing on company-wide indexes.

          Reply
  • Is size the best indicator of a database importance?

    Reply
    • Importance, no, but there’s a strong correlation in database size and the number of times performance problems pop up. There will always be exceptions, but any time performance issues get to dev, it’s generally the largest workspace around.

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}