In the e-discovery business, kCura Relativity is the big gorilla, and I’ve been working with kCura and their clients since 2011. I think it’s one of the most interesting businesses I’ve ever worked with – up there with StackOverflow – and like Stack, they’re really open with their customers.
Today marks my second year being at Relativity Fest, the conference for Relativity users, and I’m presenting two sessions on performance tuning Relativity’s SQL Servers.
First, A Quick Introduction to Electronic Discovery
Say Microsoft sees sp_Blitz® and decides to build it into SQL Server 2015 without giving us any credit. Brent Ozar Unlimited®’s top-notch law firm would approach Microsoft, try to negotiate a settlement, but if all else fails, we’d have to sue Microsoft.
Our lawyers would make a list of evidence we want to get from Microsoft – say, all email, source code, file server contents, and laptop contents belonging to Buck Woody, Mark Souza, and a few other key employees that we believe would prove our case. Naturally, because this is a big case, there’s dozens of Microsoft people that we want to investigate. After the judge agrees, now comes the tough part: we have to actually get the data.
Our lawyers, God bless ’em, aren’t very good with computers, and they’re certainly not good enough to run around the country taking images of file servers, email servers, laptops, and backup tapes. They would hire an e-discovery partner that would:
- Get copies of all computers/servers involved
- Reverse engineer every file format
- Convert the documents into an easy-to-view format (like a TIFF image or a PDF)
- Convert the document contents into an easy-to-search format (like text stored in a database)
- Store all of the document metadata in a document (like whose computer it was found on)
After a few more steps, document review starts. Our lawyers would go searching through this data looking to find evidence that supports our case. Unfortunately, our lawyers aren’t a very big company, so they’d probably hire a large outsourced firm with hundreds or thousands of lawyers who can rapidly sift through all of these documents looking for evidence. After all, court cases are a race with deadline dates imposed by judges. We need as many incriminating documents as we can find, as quickly as possible, to help our lawyers build their case strategy.
kCura Relativity is a popular hub for this whole process, and it uses Microsoft SQL Server as a database back end. I first started working with kCura when Andrew Sieja approached me in 2011, and we’ve had a great relationship ever since. I really like this industry because it’s facing and solving so many cool technical, logistical, and legal challenges.
How Relativity Uses SQL Server
SQL Server is the backbone of Relativity. As documents come in from the field, the document metadata (source, file name, document type, etc) are loaded into SQL Server. As processing software figures out the file’s contents, the extracted text is also added.
Databases grow large quickly even though document itself isn’t loaded into the database. Each case (workspace) is its own SQL Server database, and these databases can easily grow into the terabytes. Database management is made more difficult because when the case is first created, we have no idea how big it’ll become – some cases are just a few gigs, and others grow into the terabytes with no warning.
How Relativity Challenges SQL Server
Lawyers want near-zero downtime. Any given SQL Server can be hosting dozens (or hundreds) of live cases, all of which have different law firms involved, often with global staff analyzing documents around the clock. Typically this means multi-node failover clusters with complex disaster recovery strategies.
Users can add databases on any server at any time. New cases are added without warning. Your disaster recovery mechanism had better be able to keep up.
Users can add their own columns at any time. This is extremely unusual for ISV applications, especially ones that have 24/7 OLTP-style loads. When you add columns in SQL Server, you need a table lock, and this can create all kinds of blocking havoc.
Anybody can search for anything, anytime. It’s extremely difficult to design an indexing strategy when every incoming query is different. This element of Relativity is more like a data warehouse where end users are constantly asking new questions of the data, trying to spot trends.
We’re talking about lawyers building SQL. One awesome thing about Relativity is that it empowers anybody to build really complex searches fast without the help of a developer or a DBA. The drawback is that the queries aren’t exactly optimized. I will say this: the queries by lawyers look a lot better than a lawsuit written by DBAs.
And all the while, people are still loading documents. New batches of data come in from the field at unpredictable times. It’s like having a data warehouse that loads data in the middle of the day.
Other apps are hitting the server too. Relativity has lots of third party partners who either query the databases directly or use Relativity’s API. While the kCura developers put a ton of work into tuning their queries, not everybody is as focused on performance.
Security and auditing are vital. We have to make sure no one sees a document they’re not allowed to see, and that every access to privileged information is tracked.
Every case is different. Even though the basic application code is the same, the added fields and the search patterns vary wildly. You can’t just make one set of indexes and call it a day.
These things add up to make performance tuning SQL Server significantly more complex – but still totally doable.
The Easy, Expensive Way to Tune Relativity’s SQL Server
Follow my instructions from my TechEd 2012 session, Building the Fastest SQL Servers:
- Buy as much memory as your server can hold.
- Build your storage infrastructure to deliver SQL Server’s maximum consumption rate (MCR) for your number of CPU cores.
- Before going live, load test to make sure it actually delivers that throughput.
- After going live, monitor your storage throughput and latency, looking for surprise problems.
To learn more about how this works, dig through Microsoft’s Fast Track Data Warehouse Reference Architectures. Those documents teach you how to calculate SQL Server’s MCR, how to design storage networks to deliver data that quickly, and show you how to configure SQL Server correctly for massive
Done right, you’re done here. This gives you an infrastructure that can tolerate any kind of search queries at any time, indexed or not. Well, not exactly: you still have to follow basic good design processes. You can’t lather on dozens of indexes, thereby slowing down inserts/updates during document loads. You also have to stay on top of your case loads because they’re only going to grow over time, and you can still outgrow your hardware’s speed capacity.
The Hard but Cheaper Way to Performance Tune
Manage with automation in mind. Each time you do a performance tuning or availability task, think about how you’re going to accomplish it automatically going forward. (I mention this first just because it’s important as you think about the rest of these tasks.)
Proactively create, modify, and disable indexes. Because anybody can query for anything at any time, kCura’s developers can’t possibly come up with an index strategy that will work for your users. They build in a basic set that should be good enough for most use cases, but if you want killer performance with minimal hardware spends, you’re going to have to roll up your sleeves. Start with our How to Think Like the Engine training, and then move on to our How to Tune Indexes and Speed Up SQL Server videos.
Use an index naming convention. You’ll need to be able to instantly identify which indexes were created by your shop, and which are natively shipped with Relativity. Don’t go dropping Relativity’s built-in indexes – even though some may be duplicates or not used in your environment. Consider disabling them instead, and notify kCura’s support team first.
Monitor the most resource-intensive queries. Use a SQL Server performance monitoring tool or analyze the plan cache to find queries doing the most logical reads. Since every database will have its own execution plans, you may also need to zoom out and look at aggregated execution plan stats.
When something new pops up, attack it. Your options are:
- If it’s coming from outside of Relativity (like a monitoring app or a loading tool), try tuning the query first. If you can’t eliminate common antipatterns like implicit conversions and non-sargable where clauses, you may need to use indexes.
- If it’s a Relativity saved search, work with the client support reps to find the most efficient way to get the data the client needs. Sometimes just a few tweaks to the search can make an unbelievable difference.
- If it’s part of Relativity’s own code, tune it with indexing. This is where tuning an ISV app gets challenging, because your work never really ends. Every new version of Relativity brings changed queries, and you have to make sure you don’t leave indexes lying around that aren’t helping anymore.
If you can’t tune it, start a case with kCura support. Include your Relativity version, plus specific metrics showing how much load the query is causing on your server and where the query is coming from. kCura’s developers love making the Relativity experience better for everyone involved, and they need your real-world feedback on what’s causing you problems. Just make sure to keep it upbeat and positive – don’t just say “Relativity sucks at writing this query” and hit Send.
The Bottom Line on Scaling kCura Relativity
It’s really just like performance tuning any other database: when you blow past a terabyte of data, and you’re doing a combination of OLTP and reporting-style access, you’re going to have to roll up your sleeves.
It doesn’t matter whether the app is homegrown or from an ISV – the DBA needs to:
- Know the database schema well
- Know the queries well
- Know how hardware can offset query and schema challenges
- Know when to turn SQL Server’s knobs (and when not to)
I’m excited that kCura invites me to talk about these topics at Relativity Fest, and if you manage Relativity instances, I’d highly recommend the conference. kCura takes performance seriously – it’s why they send me around to some of their clients for in-person visits, and the changes I suggest actually get implemented into the product. At last year’s Fest, I was proud when Andrew Sieja took the stage and talked about the big CPU performance improvements – some of those were from me working directly with customers, and folding those changes back into the product.
The value of Fest isn’t just about the topics that are being presented – it’s also about meeting your fellow SQL Server professionals who are facing exactly the same challenges. Heck, that’s one of the big values of attending our training classes, too! You can make bonds with people that you’ve met on Twitter or through email, and make new friendships with people who can help you through the rest of your career.
Fascinating! I’ve said it before and I’ll say it again, I really enjoy learning about these “real world” consulting cases and appreciate you sharing them.
Can you share any more details about the document content search implementation? Is Full Text Indexing being used, Lucene, something else….
The default search is a third party product called dtSearch, and hosting partners sometimes use alternate search back ends like Lucene as well.