This morning at the PASS Summit, we have the pleasure of listening to Professor David DeWitt talk about Hekaton internals.
I’m actually not going to liveblog this – I’m going to sit back and take in the presentation as an attendee because it’s going to be so damn good, and I’m not going to be able to do justice to it in a live blog post. I need to explain why.
Yesterday, a few of us bloggers were given an advance question-and-answer session with him to talk about – well, whatever. Here’s what I asked him, and keep in mind that the answers are paraphrased. I took notes as fast as I could, and it’s
possible probable guaranteed that I misheard things, so don’t take this as a word-for-word transcription. I’m trying to maintain the spirit of what he said.
BGO: Out of your accomplishments this year, what are you the most proud of?
Shipping PDWv2 with Polybase.
Seeing Hekaton emerge in SQL Server 2014 CTP2. I didn’t have ownership of that project at the very beginning – I got it after 3 months – and then I owned it for a year and a half. Seeing it come out the door was exciting.
I’ve been working on this keynote since July 1st. I went canoeing in the Arctic on the 1st of August, and I had to have it done by then, so the month of July was spent banging the talk out. In 75 minutes with 77 slides (with complex animations), I’m trying to explain to the PASS Summit audience something that I would normally cover in a couple of lectures to students. It’s going to be complex.
Hekaton is totally different from the relational engine. How Hekaton stores data is as different from the regular engine as the column store engine is. Just as we saw Apollo’s column store indexing folded into the mainstream engine over time, we may see Hekaton improvements folded in over time too.
BGO: What do you enjoy about speaking at Summit?
The high of doing this. It’s a very appreciative audience, unlike an undergraduate audience. <laughter> In a college environment, I don’t really want people to have laptops in class. They’re probably shopping online.
PASS is a great environment where you can tell people are here because they want to learn voluntarily. It’s all volunteers. Volunteers make such a big commitment to the event.
At the same time, it’s not all fun. There are periods like 2 weeks before where I’m incredibly stressed out. Two years ago I got a 5 on my session feedback evaluations, and then last year I got a 3. What am I going to get this year? I’m really stressed about that.
(Note from Brent: I really do get the vibe that DeWitt cares passionately about the session materials and how PASS attendees receive them. He’s not under fire from Microsoft to produce amazing materials – he is just totally self-motivated to exceed the audience’s expectations.)
BGO: What’s the toughest part about your job?
Not being able to ship stuff as fast as I want. I’m not at a startup. I’ve come to appreciate what it means to be part of a company that prides itself on delivering really high quality software. SQL Server has a sterling reputation for really high quality. I’ve learned so much about the testing process.
In the upcoming release of PDW, we’ve got a feature coming, and it’s really important to me. It’s what most people would consider a small thing, but it’s very important to me. Unfortunately, we can’t enable it – we can’t ship it to the public because we don’t have enough time to test it. That’s frustrating, but it’s fair, and the bar for testing PDW isn’t even as high as regular SQL Server testing. The bar for SQL Server engine testing is incredibly high.
I could do with 2-3x the resources I have.
I’m old enough to get Medicare, but I still have lots of good ideas. Mike Stonebraker turned 70 this fall. Mike was my graduate TA for my first graduate class. I’ve known him for 40 years. He’s had a lot of successful startups, and he doesn’t need to work, but he has 4 startups and still goes to work every day.
BGO: Is there anything you regret not doing?
I’m envious of Stonebraker and all the startups he’s done.
I was part of Vertica, so I’ve never worked on the Microsoft column store stuff because of non-competes. But being part of a startup would be really gratifying. It takes guts, has challenges, and I’m not sure I would have been successful, but that’s the one thing I regret.
And I wish I would have been better at mathematics.
BGO: What’s the one thing you want people to take away from the keynote?
Building Hekaton was really a serious long-term endeavor. We’ve been at it a full 5 years. It was a big deal. It could be the basis for a lot of new SQL products down the road.
For relational database storage, columnar stuff was really the first chink in the armor. It’s processed in vectors, the vectors get combined with bit masking, we use a lot of different query processing techniques. More chinks are coming.
Look at what’s happening with computer hardware, specifically memory prices. There’s a chance technology will drive us to a place where we have large amounts of memory and some non-volatile RAM; we may end up with database systems whose databases live entirely in memory or near-memory (NVRAM). In the next 10-20 years, Hekaton’s memory-intensive, core-intensive approach could become the de facto way of storing data, as opposed to the disk-based approach (the way SQL Server stores data & logs today).
My hope is that people will take those slides and study them carefully for the exam. <laughter> I want them to read them closely and understand why we did it this way.
BGO: So it sounds like it’s not a one-and-done feature like so many others we’ve seen. Development is actively ongoing, and there are still more investments being made here?
Apollo (column store indexes) came out in SQL Server 2012, and in SQL Server 2014, it’s v2 with updates and investments. PDW v2 is out with more features – except for that one small feature that’s my favorite that we can’t ship. These storage investments aren’t one-and-done – we’re focusing on these.
Hekaton CTP 1 had hash indexes only, and CTP2 adds B-tree indexes. There’s a white paper coming out on the index types we’ll support.
We have lots of exciting things in the language hopper for Hekaton. We’re broadening the language and data type support in Hekaton V2.
BGO: Can I quote you on that? I want to make sure I can actually blog that Hekaton V2 is going to have expanded language support.
My thoughts on what’s about to go down
For the keynote, DeWitt’s tackling something really challenging. How do you teach database internals – and not just regular internals, but really all-new internals – to a very wide audience? In this room, we’ve got database administrators, BI developers, database developers, and managers. Many of us in here don’t regularly work with latching problems in SQL Server’s current engine, much less a new one.
I admire what he’s trying to achieve, and having read the slide deck, I admire how he’s going to do it. The snark department is going to make fun of his clip art, but pay close attention. In the next 75 minutes, you’re going to learn internals of both the current engine and the Hekaton one. You’re probably not going to deploy Hekaton v1 for existing applications, but if Microsoft continues making payments on this vision, you’re probably going to want v2. Today’s session will explain why.
Charlotte, NC – Your intrepid reporter is onsite at the convention center, settling in at the Blogger’s Table. At 8:15AM Eastern time, PASS President Bill Graziano will take the stage, discuss PASS, and then hand things over to Microsoft’s Quentin Clark for product announcements.
Some of the things I’m looking for include:
- Attendance numbers on the Summit (was a record-setting 3,894 last year)
- Graziano recapping his last year as PASS president and introducing the new president
- SQL Server 2014’s release date, pricing, and edition restrictions (will this be the year Standard Edition is cut down to 4 cores and 32GB of RAM, or will our outcries finally be heard?)
- A couple of customer stories about how SQL 2014 is already helping them out
- The next Community Technology Preview (CTP2) download of SQL 2014
- A strategic vision for Microsoft’s business intelligence products (I wouldn’t expect in-depth BI talk here because that’s where the PASS Business Analytics conference comes in)
- A Metro-looking PowerPoint slide deck with multiple “pillars” of “synergy” that “surprise and delight” you with “big data” on your “smart devices”
I don’t expect this to be a year of surprises for the core SQL Server product. At this point, the feature set of SQL 2014 is pretty well baked and publicized (Hekaton, buffer pool extensions, better column store indexes, better AlwaysOn AG management, etc). It’s too late in 2014’s development cycle to pull a “one more thing” moment with the engine, and it’s too early to start talking about vNext.
For live updates, refresh this page every few minutes. The latest stuff will be at the bottom. To follow the live stream on Twitter, use http://tweetchat.com and follow hash tag #sqlpass or #summit13. I’ll be using #sqlpass just because it’s been used heavily in the past. You can also watch the live stream here.
8:18 – The room is filled up and the lights are still bright. Stay tuned.
8:21 – Showing a video with attendees discussing what they’re interested in. Hilarious energy from Jes Schultz Borland, Stacia Misner, Christina Leo.
8:24 – Bill Graziano took the stage and he’s introducing the PASS Board. “Please reach out to any of us during the Summit with ideas and questions.” They’re serious about that – they take a lot of notes with your ideas.
8:26 – 700K technical training hours, 227 chapters, 22 virtual chapters, 81 SQLSaturday events, five 24 Hours of PASS events, 2 SQL Rally events – all in the most recent fiscal year.
8:33 – Bill has an “ask” for you to sign up as a PASS volunteer.
8:36 – Video intro for Microsoft’s Quentin Clark showing a bunch of statistics about SQL Server. Folksy guitar music in the background. I’m not sure if this is a protest song or what this is.
8:40 – When talking to the community and users, he’s heard us say that we want a complete story end to end for our databases. He’s covering the transformation from brick and mortar to e-commerce. The old fear was that retail was going to die and e-commerce was going to take over. As an Amazon Prime member who rarely steps foot in stores, I’m curious to see how he’s going to disprove that.
8:42 – OH, I see what he did there. He’s saying that retail and e-commerce are merging – you can buy stuff online and return it to the store. Now, IT is undergoing a similar transformation where we need both on-premises and cloud. This is quite a change from Microsoft screaming “WE’RE ALL IN THE CLOUD” in times past.
8:44 – Saying that in the past, Microsoft has done keynotes that were all over the place with software that never actually came out and wasn’t tied together. Today is about stuff that is either generally available now, or will be available really soon, and will all tie together.
8:47 – The next Community Technology Preview (CTP2) download of SQL 2014 is available. This is the final public CTP before release.
8:48 – Quentin says Microsoft is “pushing the boundaries” of in-memory performance, better availability, redefining mission critical in the cloud, bringing BI to everyone, etc. I think that’s true for historical SQL Server implementations, but not necessarily other platforms. He’s focusing on Oracle’s columnar in-memory keynote from a couple of weeks ago. Is Oracle really the big competitor in the in-memory database space?
8:50 – Regarding Hekaton’s in-memory OLTP, he says, “We’re not asking you to rebuild your application.” This just isn’t true – I still haven’t seen a single application that could take the move to Hekaton’s full benefits (compiled SQL and in-memory tables) without schema changes, and schema changes are serious business for existing applications.
8:52 – Demoing an online store with 1 million users and 20,000 simultaneous transactions. I gotta be honest: if you’re querying the database live in an online store, you are doing it completely and utterly wrong. Meet Mr. Cache.
8:55 – Oh, this demo fills me with rage and sadness. Everything about the tables and the queries is worst practices. This has absolutely no business in a production database and an online store. If your site and your code is this bad, and you refuse to use a cache, then sure, Hekaton is for you.
9:00 – Talking about “breakthroughs in availability and recovery” by allowing AlwaysOn secondaries in Windows Azure, backup to Windows Azure, and data files in Windows Azure. I’ve already covered my very skeptical take on those features.
9:03 – Announcing the Backup to Azure Tool for all supported SQL Server versions. You get encrypted, automated backups to Windows Azure file storage.
9:09 – IT’S A FREAKIN’ FILE SYNC TOOL. You have got to be kidding me. This is disgusting. What a joke. You back up locally and the tool uploads your files to Windows Azure file storage. Golf clap. Across the face.
UPDATE 4PM – turns out this is a filter driver. Your backup goes directly to Windows Azure file storage and doesn’t reside on local storage. If your Internet connection blips, your backup breaks – and you have no local copy of your backups. It’s not designed to be your primary backup method, but more of a quarterly archive of offsite backups. Thing is, I think you’re much better off just copying an existing already-done backup to Windows Azure rather than setting up yet another product to take yet another backup and monitor yet another job. Use one tool to manage offsite backups of all kinds of products, whether they’re SQL or Oracle. However, the MS folks I’ve talked to are pushing back hard on that opinion – they really believe small businesses will install this tool to get offsite backups. They talk about how small businesses want something simple – but remember, small businesses can’t even handle maintenance plans, much less add yet another backup tool into their environment and configure it correctly.
9:10 – And the Azure demo failed due to network problems. But I’m sure that’ll never happen to you, buddy. You’re much better at networking than Microsoft.
9:16 – Recapping Windows Azure HDInsight, DW Virtual Machine in Windows Azure, PDW with PolyBase. Older announcements.
9:22 – Customer testimonial video from Barcelona. Has a nice beat to it. Evidently people in Barcelona walk through the streets taking pictures with Microsoft Surfaces. I’m sure there’s no propaganda in that video at all.
9:25 – Power* product listing.
9:28 – Excel demo using Skype data. 35TB of data per day. “Let me show you what a typical Skype analyst does.” And then: “We take your privacy seriously, so we’ll show fake data.” If you care about privacy, why do you have Skype analysts and gather 35TB of data per day? Seriously, this is awful given the NSA paranoia, don’t demo this.
9:32 – Merging Skype data with country sources. This is exactly what NSA does. Ugh, what a tasteless demo. Who approves this stuff?
9:33 – “Now I can bring in the locations of the nearest prisons, call in a black helicopter, and track it live with PowerMap.” Wait, they didn’t actually say that. But they should.
9:36 – Demoing on a Surface. Although…not actually clicking anything.
9:39 – Showing Power BI for Office 365 Preview with an equivalent of English Query, from back in SQL Server 2000. You can put in something like “calls per capita by country” and it figures out the right query to run.
9:42 – They’re asking for data over and over in different ways, and it automatically interprets how to query the data, and more interestingly, how to render it. Uses Bing maps, bar charts, scatter plots.
9:43 – You can sign up for it now at PowerBI.com.
9:45 – Power BI Contest at http://www.facebook.com/microsoftbi where you can submit BI solutions with Power BI and vote on submissions from others. Top 5 finalists get Surface Pros, top 10 get Xbox Ones.
9:47 – And that’s a wrap. Off to the sessions! Total letdown in terms of what we didn’t hear: release date, pricing on SQL 2014.
You’re not at the PASS Summit today.
It’s alright. I know how it goes. Sometimes the boss won’t give us enough of a training budget to travel, or it’s the other guy’s turn, or maybe you work more as a developer than as a DBA so you can’t justify the PASS Summit.
I’ve got good news. Today at the Summit, we’re showing off our newest one-day training class “How to Make SQL Server Apps Go Faster.” It’s hard for me to even comprehend these numbers, but we have over 250 registrants paying $395 each. That’s incredible. It’s the best turnout I’ve ever seen for a pre-con session!
So we’re bringing the session online as our newest online training video: Make SQL Server Apps Go Faster. Here’s the abstract:
You’re a developer or DBA stuck with a database server that’s not going fast enough. You’ve got a hunch that the biggest bottleneck is inside the database server somewhere, but where? In this video series, you’ll learn how to use powerful scripts to identify which queries are killing your server, what parts of the database server are holding you back, how to tackle indexing improvements, and how to identify query anti-patterns.
In 7 hours of training videos with Microsoft MVPs, you’ll learn:
- How wait stats tell you where to focus your tuning
- How the plan cache shows you which queries are the worst
- How to make fast improvements by picking the right indexes
- How to identify and fix the most common query anti-patterns
You get 18 months of access to the recordings streamed to your desktop/laptop/iPhone/iPad, plus 3 more bonus training videos on scalability – all for just $299. Use coupon code FASTER during checkout before Tuesday, Oct 22, and you get 20% off, bringing it to just $239.20.
Buy it now.
Two years ago this week, I took to the stage at the PASS Summit 2011 and unveiled sp_Blitz™, my free health check stored procedure that helps you rapidly take over SQL Servers you’ve never seen before.
Some historical highlights include:
- 117,125 downloads
- 117 checks (technically much more than that, because we look for variations of stuff)
- 30 versions (including today’s updates with even more bug fixes)
- Dozens of public code contributions
- Analyzes the SQL Server plan cache for performance issues
- Can store data to a table for analysis over time
- Can take an input table to skip any checks on any servers or databases
People often stop me at conferences and say two things: they’re so thankful for it making their job easier, and they can’t believe we give it away for free. Between sp_Blitz™, sp_BlitzIndex™, our weekly webcast, our YouTube channel, our ebooks, our 6 month DBA training newsletter, and in-depth blog posts, we just give a ridiculous amount of stuff away for free. It’s going to get even better at the PASS Summit – I can’t wait to share my next free tool with you guys.
This is why, when we announced that we’re looking for Employee #2, Kendra specifically said in the post:
First, we need to know where you’ve spoken recently and what you talked about. This can include all sorts of public speaking – it doesn’t have to be technical. Next, tell us how you give back to a technical community. (It doesn’t have to be the SQL Server community.)
We wanted to find people who were passionate about making a difference in the community. People who give back to make jobs easier for others. People who openly share their knowledge with the public whenever possible.
That’s also why I’m excited to see my friends again at the Summit this week. This is where the giving-back happens, and it’s where the giving-back people get together. Here’s to another year of giving back.
The database comes with a manual, but let’s be honest: nobody’s got time for that. DBAs – help me distill the most important concepts and rules down to 140-character chunks. I’ll work with Dell to compile the best tips into a free ebook for developers.
There are three ways you can contribute:
- Register for the live webcast on Thursday, October 24th from 10AM to 11AM Eastern. Leave a comment in chat during the webcast, and we’ll talk through the best ones to help boil them down to a perfect essence.
- Join me on Twitter on Thursday, October 24th from 10AM to 11AM Eastern using the hash tag #dellsql. If you just want to read along, you can use TweetChat.com at that time.
- Leave a blog post comment below with your nuggets of genius.
Make sure to keep it upbeat and helpful – after all, your name will be next to your suggestion, and you don’t want to look like That DBA. Let’s help developers get started on the right foot!
Problem: your database server’s performance stinks like Limburger cheese.
Solution: find the queries that are making that smell.
Hiding in Plain Sight
Your database server knows exactly which queries are performing the worst. When queries are executed, the execution plans – and associated execution statistics – are stored in memory (with a few exceptions). This information is stored in the plan cache, and you can access it by querying a couple of DMVs.
One way you can get this information is to run this query in SSMS. But what if you wanted it on a regular schedule, like every Monday morning, so you knew what queries hit your server the hardest in the last week? What if you wanted developers to have access to this information, without letting them run wild in SSMS?
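The linked query does a lot more, but as a minimal sketch (the DMV and column names are the real ones; the TOP 10 ordering is just one reasonable choice), pulling the heaviest CPU consumers out of the plan cache boils down to joining two DMVs:

```sql
-- Top 10 queries by total CPU since the last restart,
-- straight from the plan cache DMVs.
SELECT TOP 10
    qs.total_worker_time / 1000  AS total_cpu_ms,
    qs.execution_count,
    qs.total_logical_reads,
    qs.total_elapsed_time / 1000 AS total_duration_ms,
    qs.last_execution_time,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```

The SUBSTRING gymnastics carve the individual statement out of the batch text – without it, you get the whole stored procedure or batch for every row.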
Write a report!
Building a Basic Table Report
The most basic way to display information in SQL Server Reporting Services (SSRS) is by using a table. I’m going to go into Report Designer, add a new data source to connect to my instance, use a slightly modified version of the above query to return the top 10 queries by total CPU, and add a table to the report.
Not sure how to do all of that? My 90-minute SQL Server Reporting Services Basics training will get you up to speed!
Here’s my design:
And here’s a preview of it:
It’s functional, but it could be better. Some of the things I’d like to see in this report:
- The time the service started. The plan cache is only good from the time of the last restart (or the last time the plan cache was cleared – but I hope you’re not doing that on a regular basis).
- Formatting. Some of the numbers are hard to read, and it’s ugly. No one wants to look at ugly data.
- Measurements. Is that CPU in milliseconds? How about the duration?
- Can I get a link to the execution plan?
Improving the Report
I’m going to add a header to the report to display the last restart date. I add a new dataset which queries sys.dm_os_sys_info for the last start time. I add a header to the report, and add a textbox. I build an expression to display the last start time.
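That dataset is a one-liner; sys.dm_os_sys_info exposes the start time directly:

```sql
-- When did this instance last start? This is also effectively the
-- maximum age of anything in the plan cache, assuming nobody cleared it.
SELECT sqlserver_start_time
FROM sys.dm_os_sys_info;
```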
The report looks sharper already!
To display the rest of the information, I’m going to use the list control – I think it’s one of the most under-used features of SSRS. A list lets you add multiple report elements – text boxes, tables, images – to one space and arrange them as you wish. The list starts with two textboxes, one for the number of executions and one for the last execution date. Since I’m combining text and data, I use expressions.
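For example, the executions textbox mixes label text with a dataset field; in SSRS expression syntax that looks something like this (the field name ExecutionCount is my assumption – use whatever your dataset returns):

```vb
="Number of executions: " & Format(Fields!ExecutionCount.Value, "N0")
```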
Then I add a table to display the CPU, reads, and duration. I make sure I add appropriate labels for the measurements. I also format each of the results as a number so the numbers have commas in them.
Then, I add another textbox, display the query text, and format it with a fixed-width font so it’s easier to read.
Here’s the design:
Here is what the report looks like when run:
That’s a lot easier to read!
I’ve knocked everything off my list except the execution plan. I’m going to add another text box and add the plan.
That is not readable. Here’s the issue: the plan is stored as XML, and SSRS can’t natively display XML. I could leave this here for a user to copy and paste into SSMS or SQL Sentry Plan Explorer, but that’s a lot of work. I decided to take it out of the final version of my report.
The last step is to deploy this to Report Manager. After I deploy it, users that have permissions to the folder it’s deployed to can view it – without having to go into SSMS or coming to me for the information. They can also subscribe to it to receive a copy on a regular basis.
The next step to be taken: tune those queries!
SSRS can be used to solve a variety of issues – it isn’t solely the realm of analysts that need to know the sales from last quarter or what products are the most profitable. As a DBA or developer, you can learn to use SSRS and use it for multiple tasks. Any time you need to display data, consider SSRS as a tool!
In the e-discovery business, kCura Relativity is the big gorilla, and I’ve been working with kCura and their clients since 2011. I think it’s one of the most interesting businesses I’ve ever worked with – up there with StackOverflow – and like Stack, they’re really open with their customers.
Today marks my second year being at Relativity Fest, the conference for Relativity users, and I’m presenting two sessions on performance tuning Relativity’s SQL Servers.
First, A Quick Introduction to Electronic Discovery
Say Microsoft sees sp_Blitz™ and decides to build it into SQL Server 2015 without giving us any credit. Brent Ozar Unlimited’s top-notch law firm would approach Microsoft, try to negotiate a settlement, but if all else fails, we’d have to sue Microsoft.
Our lawyers would make a list of evidence we want to get from Microsoft – say, all email, source code, file server contents, and laptop contents belonging to Buck Woody, Mark Souza, and a few other key employees that we believe would prove our case. Naturally, because this is a big case, there are dozens of Microsoft people that we want to investigate. After the judge agrees, now comes the tough part: we have to actually get the data.
Our lawyers, God bless ’em, aren’t very good with computers, and they’re certainly not good enough to run around the country taking images of file servers, email servers, laptops, and backup tapes. They would hire an e-discovery partner that would:
- Get copies of all computers/servers involved
- Reverse engineer every file format
- Convert the documents into an easy-to-view format (like a TIFF image or a PDF)
- Convert the document contents into an easy-to-search format (like text stored in a database)
- Store all of the document metadata (like whose computer it was found on) in the database
After a few more steps, document review starts. Our lawyers would go searching through this data looking to find evidence that supports our case. Unfortunately, our lawyers aren’t a very big company, so they’d probably hire a large outsourced firm with hundreds or thousands of lawyers who can rapidly sift through all of these documents looking for evidence. After all, court cases are a race with deadline dates imposed by judges. We need as many incriminating documents as we can find, as quickly as possible, to help our lawyers build their case strategy.
kCura Relativity is a popular hub for this whole process, and it uses Microsoft SQL Server as a database back end. I first started working with kCura when Andrew Sieja approached me in 2011, and we’ve had a great relationship ever since. I really like this industry because it’s facing and solving so many cool technical, logistical, and legal challenges.
How Relativity Uses SQL Server
SQL Server is the backbone of Relativity. As documents come in from the field, the document metadata (source, file name, document type, etc) are loaded into SQL Server. As processing software figures out the file’s contents, the extracted text is also added.
Databases grow large quickly even though the documents themselves aren’t loaded into the database. Each case (workspace) is its own SQL Server database, and these databases can easily grow into the terabytes. Database management is made more difficult because when the case is first created, we have no idea how big it’ll become – some cases are just a few gigs, and others grow into the terabytes with no warning.
How Relativity Challenges SQL Server
Lawyers want near-zero downtime. Any given SQL Server can be hosting dozens (or hundreds) of live cases, all of which have different law firms involved, often with global staff analyzing documents around the clock. Typically this means multi-node failover clusters with complex disaster recovery strategies.
Users can add databases on any server at any time. New cases are added without warning. Your disaster recovery mechanism had better be able to keep up.
Users can add their own columns at any time. This is extremely unusual for ISV applications, especially ones that have 24/7 OLTP-style loads. When you add columns in SQL Server, you need a schema modification lock on the table, and this can create all kinds of blocking havoc.
Anybody can search for anything, anytime. It’s extremely difficult to design an indexing strategy when every incoming query is different. This element of Relativity is more like a data warehouse where end users are constantly asking new questions of the data, trying to spot trends.
We’re talking about lawyers building SQL. One awesome thing about Relativity is that it empowers anybody to build really complex searches fast without the help of a developer or a DBA. The drawback is that the queries aren’t exactly optimized. I will say this: the queries by lawyers look a lot better than a lawsuit written by DBAs.
And all the while, people are still loading documents. New batches of data come in from the field at unpredictable times. It’s like having a data warehouse that loads data in the middle of the day.
Other apps are hitting the server too. Relativity has lots of third party partners who either query the databases directly or use Relativity’s API. While the kCura developers put a ton of work into tuning their queries, not everybody is as focused on performance.
Security and auditing are vital. We have to make sure no one sees a document they’re not allowed to see, and that every access to privileged information is tracked.
Every case is different. Even though the basic application code is the same, the added fields and the search patterns vary wildly. You can’t just make one set of indexes and call it a day.
These things add up to make performance tuning SQL Server significantly more complex – but still totally doable.
The Easy, Expensive Way to Tune Relativity’s SQL Server
Follow my instructions from my TechEd 2012 session, Building the Fastest SQL Servers:
- Buy as much memory as your server can hold.
- Build your storage infrastructure to deliver SQL Server’s maximum consumption rate (MCR) for your number of CPU cores.
- Before going live, load test to make sure it actually delivers that throughput.
- After going live, monitor your storage throughput and latency, looking for surprise problems.
To learn more about how this works, dig through Microsoft’s Fast Track Data Warehouse Reference Architectures. Those documents teach you how to calculate SQL Server’s MCR, how to design storage networks to deliver data that quickly, and how to configure SQL Server correctly for massive sequential scan throughput.
Done right, you’re done here. This gives you an infrastructure that can tolerate any kind of search queries at any time, indexed or not. Well, not exactly: you still have to follow basic good design processes. You can’t lather on dozens of indexes, thereby slowing down inserts/updates during document loads. You also have to stay on top of your case loads because they’re only going to grow over time, and you can still outgrow your hardware’s speed capacity.
The Hard but Cheaper Way to Performance Tune
Manage with automation in mind. Each time you do a performance tuning or availability task, think about how you’re going to accomplish it automatically going forward. (I mention this first just because it’s important as you think about the rest of these tasks.)
Proactively create, modify, and disable indexes. Because anybody can query for anything at any time, kCura’s developers can’t possibly come up with an index strategy that will work for your users. They build in a basic set that should be good enough for most use cases, but if you want killer performance with minimal hardware spends, you’re going to have to roll up your sleeves. Start with our How to Think Like the Engine training, and then move on to our How to Tune Indexes and Speed Up SQL Server videos.
Use an index naming convention. You’ll need to be able to instantly identify which indexes were created by your shop, and which are natively shipped with Relativity. Don’t go dropping Relativity’s built-in indexes – even though some may be duplicates or not used in your environment. Consider disabling them instead, and notify kCura’s support team first.
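A naming convention only pays off if you can query against it. As a sketch – assuming you prefix your own indexes with something like IX_ACME_ (the prefix and pattern are made up; substitute your shop’s standard) – this separates your indexes from the vendor’s:

```sql
-- Tag each index as "ours" or "vendor" based on a naming prefix.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       CASE WHEN i.name LIKE 'IX[_]ACME[_]%'
            THEN 'ours' ELSE 'vendor' END AS index_owner
FROM sys.indexes AS i
JOIN sys.tables  AS t ON i.object_id = t.object_id
WHERE i.name IS NOT NULL          -- skip heaps
ORDER BY index_owner, table_name, index_name;
```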
Monitor the most resource-intensive queries. Use a SQL Server performance monitoring tool or analyze the plan cache to find queries doing the most logical reads. Since every database will have its own execution plans, you may also need to zoom out and look at aggregated execution plan stats.
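If you don’t have a monitoring tool, the plan cache approach looks something like this – a sketch against sys.dm_exec_query_stats, sorted by logical reads:

```sql
-- Top 20 cached queries by total logical reads.
SELECT TOP 20
       qs.total_logical_reads,
       qs.execution_count,
       qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                     WHEN -1 THEN DATALENGTH(st.text)
                     ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;
```

Remember this only covers what’s still in cache – after a restart or heavy memory pressure, the numbers start over.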
When something new pops up, attack it. Your options are:
- If it’s coming from outside of Relativity (like a monitoring app or a loading tool), try tuning the query first. If you can’t eliminate common antipatterns like implicit conversions and non-sargable where clauses, you may need to use indexes.
- If it’s a Relativity saved search, work with the client support reps to find the most efficient way to get the data the client needs. Sometimes just a few tweaks to the search can make an unbelievable difference.
- If it’s part of Relativity’s own code, tune it with indexing. This is where tuning an ISV app gets challenging, because your work never really ends. Every new version of Relativity brings changed queries, and you have to make sure you don’t leave indexes lying around that aren’t helping anymore.
If you can’t tune it, start a case with kCura support. Include your Relativity version, plus specific metrics showing how much load the query is causing on your server and where the query is coming from. kCura’s developers love making the Relativity experience better for everyone involved, and they need your real-world feedback on what’s causing you problems. Just make sure to keep it upbeat and positive – don’t just say “Relativity sucks at writing this query” and hit Send.
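To make the antipattern point concrete, here’s a minimal sargability sketch – the table and column names are made up for illustration:

```sql
-- Non-sargable: wrapping the column in a function prevents an index seek.
SELECT DocumentID
FROM dbo.Document
WHERE UPPER(DocumentNumber) = 'DOC-000123';

-- Sargable rewrite: compare the column directly (SQL Server's default
-- collations are case-insensitive anyway), so an index seek is possible.
SELECT DocumentID
FROM dbo.Document
WHERE DocumentNumber = 'DOC-000123';
```

The same idea applies to implicit conversions: if the column is VARCHAR and the app passes an NVARCHAR parameter, SQL Server converts the column side and the seek goes away.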
The Bottom Line on Scaling kCura Relativity
It’s really just like performance tuning any other database: when you blow past a terabyte of data, and you’re doing a combination of OLTP and reporting-style access, you’re going to have to roll up your sleeves.
It doesn’t matter whether the app is homegrown or from an ISV – the DBA needs to:
- Know the database schema well
- Know the queries well
- Know how hardware can offset query and schema challenges
- Know when to turn SQL Server’s knobs (and when not to)
I’m excited that kCura invites me to talk about these topics at Relativity Fest, and if you manage Relativity instances, I’d highly recommend the conference. kCura takes performance seriously – it’s why they send me around to some of their clients for in-person visits, and the changes I suggest actually get implemented into the product. At last year’s Fest, I was proud when Andrew Sieja took the stage and talked about the big CPU performance improvements – some of those were from me working directly with customers, and folding those changes back into the product.
The value of Fest isn’t just about the topics that are being presented – it’s also about meeting your fellow SQL Server professionals who are facing exactly the same challenges. Heck, that’s one of the big values of attending our training classes, too! You can make bonds with people that you’ve met on Twitter or through email, and make new friendships with people who can help you through the rest of your career.
You’ve been working with SQL Server for a couple of years, but you’re not really sure what’s going on in the System Databases folder. What gets stored in Master, Model, and MSDB? What processes use them? In the event of a crash, should you restore them? Brent Ozar will give you an introductory tour in this half-hour video:
This Wednesday night, come join me at DevMynd. Here’s the topic I’ll be presenting:
How StackOverflow Uses (And Doesn’t Use) SQL Server
This insanely popular Q&A site serves millions of web pages per day from a Microsoft SQL Server 2012 back end. Believe it or not, the questions and answers are served from just one SQL Server with no full-time database administrator. Success at scaling any database-driven app boils down to knowing when it’s appropriate – and not appropriate – to use the database.
You’ve been nursing your SQL Server through day after day of increasing load. Stored procedures are taking longer and longer to run. There are timeouts on the client side. Both customers and management are getting upset. As you’re bemoaning the terrible code that the developers wrote, it hits you: you don’t have a code problem, you have a scalability problem.
The Types of SQL Server Scalability Problems
It seems obvious to an outsider, but hardware has to be replaced on a regular basis. Between replacements, hardware is like code: it requires attention and maintenance to keep it running smoothly. As a DBA, it’s your job to pay as much attention to the database hardware as you do to the wait stats, slow queries, and missing indexes.
This doesn’t mean you can throw a monitoring package in place and walk away. Understanding how SQL Server and the underlying hardware respond under your application’s workload is important. Once you have a baseline on SQL Server and the hardware, you can easily tell if a problem is bad code or you need to call up your friendly hardware salesperson.
Database design is probably not your bailiwick; you might even say you’re not a database architect, you’re a database administrator. That may be, but you’ve also got the keys to the kingdom of database performance.
Through the DMVs you can tell which indexes have lock escalation problems, which files have slow reads, and even narrow these problems down to specific queries and times of day. Even if you can’t tell Boyce-Codd Normal Form from Backus-Naur Form, you have tools to help you identify problems at a physical level.
You probably want an example – if you add several new queries to the application and suddenly there are a lot more lock escalation attempts on a table, you can safely conclude that at least one of those queries would benefit from a new index.
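Finding those lock escalation attempts doesn’t require anything fancy – sys.dm_db_index_operational_stats tracks them per index. A sketch for the current database (counters reset when SQL Server restarts):

```sql
-- Lock escalation attempts and successes, per index.
SELECT OBJECT_NAME(ios.object_id)              AS table_name,
       i.name                                  AS index_name,
       ios.index_lock_promotion_attempt_count  AS escalation_attempts,
       ios.index_lock_promotion_count          AS escalations_succeeded
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
JOIN sys.indexes AS i
  ON ios.object_id = i.object_id
 AND ios.index_id  = i.index_id
ORDER BY ios.index_lock_promotion_attempt_count DESC;
```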
I lied: sometimes the scalability problem is a code problem.
DBAs love to blame those pesky developers for causing problems. There’s some truth in that statement – developers introduce changes into an otherwise stable system. But when developers are adding new features and functionality, they can’t always know which indexes will be the best; after all, you didn’t have the resources to provide them with a full copy of the database, right?
Thankfully, you can track down top resource consuming queries very easily. Once you’ve identified those queries, you can either tune them or work with the developers to educate them about what they can do better the next time. All too often, time is the only issue that prevents things from being done correctly – developers are pressed for time to get a feature out, so they don’t spend as much time tuning code as they’d like. Help them out – find the bad queries and share the knowledge or just tune them yourself.
Outside of bad T-SQL, there are a number of patterns that can cause problems for database performance. So bad, in fact, that they’re going in a new section.
If you’ve thought that something didn’t smell quite right in the database, you were probably right. There are a number of ways to use SQL Server that work… for a while. I like to call these scalability anti-patterns. These anti-patterns work well as long as you can keep throwing more physical resources at the problem.
An anti-pattern can be difficult to identify as such – is it just bad code, or are you seeing something worse? Over at ACM Queue there’s a guide, 20 Obstacles to Scalability, that lists patterns that will prevent you from moving forward. The most common ones you’ll find are a lack of caching, serial processing, using the database as a queue, and full-text search.
Fixing anti-patterns will take a long time. These are architectural decisions that are baked into the entire application, not just the database. In order to fix these, you’re going to need to work with the developers, research new tools, and figure out how to implement the features you need.
Solving the Problems
Easy Mode: Hardware
Let’s be honest: using money to solve your scalability problems isn’t a long-term solution, but it can often be a quick fix. If storage is slow, you can embark on a query tuning project or buy additional memory (or even SSDs). Although you’ll eventually need to tune those queries, you can buy yourself some breathing room by adding new hardware.
Don’t let new hardware lull you into a false sense of security. If you buy hardware to temporarily solve a problem, make sure you also record the performance problems and get time scheduled in the future to put a fix in place.
Medium Difficulty: Tuning
When you finally decide to tune your application, identify your biggest bottlenecks (if you don’t know, we can teach you how) to make sure that you’re tuning in the right place. You can’t tune T-SQL to make the network faster. Once you know what you’re trying to fix, identify the queries that cause the most pain. Is memory a problem? Look for queries with large memory grants and tune them. Think about adding indexes.
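Spotting those big memory grants is another DMV one-liner. A sketch using sys.dm_exec_query_memory_grants, which shows queries currently holding or waiting on grants:

```sql
-- Active memory grants, biggest askers first.
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       grant_time,        -- NULL means the query is still waiting for its grant
       query_cost
FROM sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;
```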
Database performance tuning is as much about improving code as it is about making the database do less. Better indexes mean you have to sort less, but so does sorting in the application layer. Maybe you can pre-compute complex queries – even if the aggregations are slightly out of date, there may not be a real-time requirement.
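One way to pre-compute aggregations in SQL Server is an indexed view – the engine keeps it in sync, and reports read the small view instead of scanning the base table. A sketch with a made-up Orders table (note the SCHEMABINDING and COUNT_BIG(*) requirements for indexed views):

```sql
-- Pre-computed daily sales totals, maintained automatically by the engine.
CREATE VIEW dbo.DailySales
WITH SCHEMABINDING
AS
SELECT OrderDate,
       COUNT_BIG(*) AS OrderCount,
       SUM(Amount)  AS TotalAmount   -- Amount must be a non-nullable column
FROM dbo.Orders
GROUP BY OrderDate;
GO
-- The unique clustered index is what materializes the view.
CREATE UNIQUE CLUSTERED INDEX IX_DailySales ON dbo.DailySales (OrderDate);
```

If strict real-time numbers aren’t required, a scheduled job that rebuilds a summary table works too, with less restriction on the query shape.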
The Hard Stuff: Redesign
When hardware and tuning aren’t enough, it’s time to roll up your sleeves and redesign parts of the application. You’ll still need to identify bottlenecks, but you aren’t going to be tuning code.
Got a queue in the database? Figure out how you can move that queue-based processing outside of SQL Server, or process the queue in large batches rather than line by line.
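The batch approach can be sketched like this – a hypothetical WorkQueue table, grabbing up to 500 rows at a time instead of one:

```sql
-- Dequeue up to 500 rows in a single statement; READPAST skips rows
-- other workers have locked, so multiple workers don't block each other.
DECLARE @Batch TABLE (QueueID INT, Payload NVARCHAR(MAX));

DELETE TOP (500) q
OUTPUT deleted.QueueID, deleted.Payload INTO @Batch
FROM dbo.WorkQueue AS q WITH (ROWLOCK, READPAST);

-- ...now process everything in @Batch as a set, not row by row...
```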
Reporting against live data? Find ways to report against readable secondary servers or even report against a time delayed copy of the data in a data warehouse.
There are ways to redesign any feature for scalability. Tools and techniques have been documented in books, articles, blog posts, and conference talks. The question is no longer “How could you possibly do that?” The question is “Which technique are we going to use?”
What Should You Do?
For most of you, the answer is simple: find your bottleneck and decide if you can make a simple fix or if you need to devote more time to the problem. If you need more time, buy a bit of hardware to help you through. Don’t jump on the re-write bandwagon too early, but always keep ideas in the back of your head. And, above all else, make sure you’re solving the right problem with the right solution. Adding more processors won’t help if every query is wrapped in a