We’ve added two new one-day pre-conference classes to our 2015 calendar:
SQLSaturday Boston Pre-Con: Developer’s Guide to SQL Server Performance – You’re stuck with a database server that’s not going fast enough. You’ve got a hunch that the biggest bottleneck is inside the database server somewhere, but where? In this one-day class, you’ll learn how to use powerful scripts to identify which queries are killing your server, what parts of the database server are holding you back, how to tackle indexing improvements, and how to identify query anti-patterns. The class is just $99 – learn more now.
SQLRally Nordic Copenhagen: Performance Tuning When You Can’t Fix Queries – Brent is flying over to Rally again! Here’s the abstract: Your users are frustrated because the app is too slow, but you can’t change the queries. Maybe it’s a third party app, or maybe you’re using generated code, or maybe you’re just not allowed to change it. Take heart – there’s still hope. Brent Ozar does this every week, and he’ll share his proven methodologies to performance tune with indexes, SQL Server configuration switches, and hardware. Learn more now.
Your users want Relativity to be up at all times. What’s the first step? How much work is involved? What will it all cost? I’ll give you a simple worksheet to get management and the IT team on the same page, and then show you how to turn those specs into a rough project plan and budget.
You’ll learn how to choose between different high-availability methods, and understand why clustering is such a no-brainer choice in this 22-minute video.
Got questions? Join us for our Tuesday Q&A webcast where I’ll answer your Relativity questions.
The latest version of our free SQL Server health check adds some nifty new stuff:
- Checks for non-default database configurations like enabling forced parameterization or delayed durability
- Looks in the default trace for long file growths or serious errors like memory dumps
- Checks Hekaton memory use and transaction errors
- Warns about database files on network shares or Azure storage
- Added the server name in the output if you enable @CheckServerInfo = 1
- Discontinued the Windows app version (was prohibitively expensive to get it into the Windows app store)
- And miscellaneous bug fixes and improvements
Today at Relativity Fest in Chicago, kCura Relativity 9 introduces the option to move some text storage out of Microsoft SQL Server and into kCura’s new Data Grid, a tool built atop the open source Elasticsearch.
Is kCura abandoning SQL Server? No, but understanding what’s going on will help you be a better database administrator and developer.
kCura’s Challenges with Microsoft SQL Server
To recap some of my past posts on Relativity, it creates a new SQL Server database when one of the end users creates a new workspace and starts loading data. Over the coming weeks, data pours into SQL Server at completely unpredictable rates. We have no idea how many documents are going to be acquired from subpoenaed hard drives, file servers, backup tapes, Facebook messages, you name it. I’ve seen workspaces grow from zero to ten terabytes in a single week, all without the systems administration teams even knowing it’s happening. They ran their weekly backup size report, and surprise, surprise, surprise.
Data streams into Relativity during business hours at the very same time hundreds or thousands of document reviewers are running queries against those very same tables. The entire team is under tight time deadlines, and there’s no way to take databases (let alone servers) offline for loads.
And oh yeah, we’re often contractually bound not to lose any attorney work product whatsoever out of the database.
This is all doable with traditional relational databases, but it ain’t easy. It’s made even tougher by the fact that many Relativity hosting partners are understaffed, many without even a full time database administrator.
How kCura and Jeremiah Planned for Change
For the last couple of years, Andrew Sieja has repeatedly asked me a tough question: “If you were going to redesign Relativity from the ground up, and anything was on the table, what would it look like?” He faced a classic case of Innovator’s Dilemma – his team had built a wildly successful product, but there’s innovation coming from everywhere, and sooner or later somebody was going to beat him to the next level. While SQL Server was getting the job done, alternative data storage platforms beckoned with some really cool advantages, and he needed to take advantage of them before his competitors did.
kCura brought in Jeremiah to work through the options out on the market. There are a gazillion new data storage & search options out there, and some of them claim to do a phenomenal job on absolutely everything. (Hint: they’re usually lying.) Jeremiah helped them prioritize the features they needed most, and then recommended the right fit for them.
The end result is the newly announced kCura Data Grid, an extremely scalable and performant search platform built atop the open source Elasticsearch. You might recognize Elasticsearch from my demos of Opserver, Stack Exchange‘s open source monitoring tool, because Stack also migrated their SQL Server full text search out into Elasticsearch. They’re not alone – Elasticsearch has plenty of high profile case studies.
The Benefits and Risks of Elasticsearch
We typically see 70-90% of a Relativity workspace’s space consumed by extracted text and audit logging, both of which are great fits for Elasticsearch. Pushing that data out of SQL Server potentially means:
- Reduced storage costs – while SQL Server relies on expensive shared storage (typically $5k-$10k per terabyte), ES achieves redundancy with multiple commodity boxes (typically $1k-$2k per terabyte). This adds up fast for big workspaces.
- Faster search – ES is mind-numbingly fast. Seriously.
- Easier scale-out – it’s really, really hard (and expensive) to scale out a single multi-terabyte database across multiple Microsoft SQL Servers when people can create new databases at any time. (It’s even hard enough just to scale a single known database across multiple servers!) It’s easy to add ES replicas for higher performance and availability.
It’s not a silver bullet, and as with any technology change, there are risks and limitations:
- Security – ES doesn’t have any built in, so kCura had to build their own.
- Backups – when everything is in one data platform, it’s easy to back up everything at the same moment in time. Split the data, and you run into challenges – but these aren’t really new for Relativity. The databases and the native files couldn’t be backed up to the same point in time either.
- Management – Relativity hosting partners don’t have ES expertise on staff, and they’ll need training as ES becomes a mission critical part of their infrastructure.
What This Means for SQL Server Developers and DBAs
Microsoft SQL Server is an amazing relational database, a Swiss Army knife of a persistence layer. Sure, it can handle tables and joins, but more than that, it can do things like full text search, spatial data, CLR code execution, and scale-out via multiple methods. You can build a product backed by SQL Server and go a long, long, long way.
I don’t think SQL Server ran out of capabilities here, but kCura needed to plan for orders-of-magnitude growth in storage and search capabilities over the coming years. They grew one hell of a big, powerful business solution with a single database back end, and they’ve got the luxury of a large development staff and a bunch of new data storage options.
Premature optimization is the root of all evil. When you’re building the product you need today, the right database is the one you already know well. As your product grows, keep learning your own database, plus learn the other options out there. The storage and search markets are changing so dramatically every year – don’t make a bet on one today unless you have to, because tomorrow might bring an even better solution for your needs.
For more details about Relativity 9, check out kCura’s Relativity 9 page.
If you lead a local SQL Server user group, and you’re coming to the PASS Summit in Seattle next month, we’d like to say thanks.
We’re giving away 5 seats to our Make SQL Server Apps Go Faster class that runs on Monday/Tuesday before Summit. We know you probably have duties you need to attend to on Tuesday, but don’t worry – even if you have to leave early, you’re going to learn a lot.
Email us at
Help@BrentOzar.com with the name of the user group you run. Next Saturday, October 18th, we’ll draw 5 emails and notify them about their free tickets, plus a little extra something to say thanks.
See you at Summit, and thanks for everything you do for the community.
The fine print: Only one entrant per user group, please. (If your group has multiple leaders, just nominate someone to enter.)
Update – the contest is closed!
Brent Loves the Beer Pong
My favorite part is almost a throwaway gag – the database emergency is over, and our hero is moving on to other things:
It’s only on the screen for a second or two, just enough to make the viewer say, “Wait – is that guy doing what I think he’s doing?”
It captures the idea that nobody really wants to work with us. They want the SQL Server to be fast and reliable, and they want it to be invisible. They don’t really want to spend their time working on the database. They want to get back to adding features to their application, managing other servers, or … playing beer pong.
Doug Loves to Keep Servers Out of the E.R.
I get a lot of satisfaction from knowing that our clients not only learn to solve the problems they have, but also how to avoid them in the future. We don’t want our clients to be in the same pickle a few months later and need to call someone again. We value success stories over repeat business.
Jes Loves Teaching Rocket Surgery
This is exactly what we do:
We don’t just find and fix problems, we teach and train people so they are empowered to solve other problems going forward. At heart, I’m a teacher. I don’t hoard my knowledge, or try to be the one person who knows how to fix a problem.
Kendra Loves That Creepy Insect Moment
My favorite moment is the view of the SQL Server, right after the witch doctor fixes things up. It’s smiling but…. then an insect runs right over its teeth. Things are better, but you’ve got this weird feeling of dread. What’s really going on in there?
Epipheo captured this brilliantly. We never want to be the witch doctor. We built our process so that it doesn’t just make the symptoms of your problem go away, but empowers you to understand the cause and the cure.
Jeremiah Loves Saving Money
It’s true, I really do love helping our clients save money. I don’t want to be the person recommending a new infrastructure, built from the ground up, with fine Corinthian leather server racks. We joke about solving computing problems with money, but the truth is that we try to solve problems through ingenuity before we try to solve them with an AmEx.
Along the way to solving problems with our smarts and yours, we focus on just the important stuff. There’s no 4,300 page report. No incomprehensible advice. Just solutions to your problems.
So what was your favorite part?
Our web site explains who we are. It’s our storefront. The vast majority of our clients find us in Google, get to know us by reading our blogs, start to trust us by using our scripts, and then finally contact us for personalized help and training.
For that to work well, our web site needs to say, “Brent Ozar Unlimited is a boutique database consulting firm who helps you make SQL Server faster and more reliable. They’re brilliant, fun, and relatable.”
That sounds really simple now, but a few years ago, we had no idea what we wanted the web site to say. To figure it out, we hired Pixelspoke to build our brand identity, logo, and web site. They interviewed us, talked to our clients, and checked at our competitors. They painted a really clear picture of who we were, what we do, and what makes us different. The end result is the web site you see today. They did a magical job, and I think it says a lot that we love their work just as much today as we did two years ago. I still grin when I see our logo.
Last year, we upped our game by having Eric Larsen draw more funny portraits of the team, and rotating those on the site’s home page. We loved the results – the guy is crazy talented. He drew us teaching various SQL Server concepts, and his illustrations managed to capture exactly what we wanted the site to say.
Leveling Up: Explaining Our Process in 106 Seconds
Some customers don’t have time to do the long courtship of gradually getting to know us. Their SQL Server is desperately sick right now, and they don’t have the luxury of getting to know all the possible health providers out there. They’re Googling and judging people by what they see in the first minute.
So this year, we wanted to take our web game up another notch by hiring a partner to build us an amazing 2-minute video. Eric pointed us to his friends at Epipheo. Like us, they’re a company that focuses on exactly one thing – but theirs is making short videos. After examining their portfolio (my personal favorite is Projetech’s) and talking to them, we were hooked.
Epipheo’s mission was to distill who we are, what we do, and what makes us different down into a single video. I’m absolutely convinced they did it.
The video tells the viewer:
- Your SQL Server is sick, and you don’t know how to get relief.
- You could throw money at it, but that doesn’t usually work. (Many of our clients even tell us that they’ve already tried this by the time they contact us.)
- You could hire The Specialist, but he won’t explain what he’s doing, and you have no idea if it’s good or bad
- You could bring in The Large IT Firm, but it costs a fortune, and they’ll overwhelm you with irrelevant data
Or you could hire us:
- We focus on the pain that’s bothering you, not a gazillion unrelated “best practices”
- We work side by side with you, explaining your SQL Server’s symptoms as we go
- We teach you our techniques so you can repeat the process on your other SQL Servers
- And then we leave you alone, because we’re not here to drain you dry.
Yep. That’s exactly what our SQL Critical Care® is, and why it’s unique. Next up, we’ll talk about our favorite parts of the video where Epipheo really captured our personalities.
We’ll sum it up in a 106 second video about our SQL Critical Care®:
In tomorrow’s post, we’ll talk about why we made a video. In the meantime, enjoy. I know I certainly do – I’ve re-watched this dozens of times, and I still giggle. It’s on our home page, and we can’t wait for everyone to see it and giggle too.
I love going on the RunAs Radio podcast because Richard Campbell is so much fun to talk to. In this show, we talk about the amazing new hardware that has come out lately for SQL Server. Dell and HP have brought out some amazing gear – support for 1.8″ SSDs, 64GB DIMMs, and more. The 2-socket server market is such an amazing space today.
Our first two concerns in this series were virtualization licensing and recoverability. It’s not enough to have answers for these questions today, though – we have to think about how long the server will be in place.
The server’s expiration date isn’t dictated by IT. If it was up to us, we’d never run an old, unsupported version of anything – but sometimes the business is quite content doing just that. So how do we plan for this from the start?
We need to get written answers from the business using a simple worksheet. In our Virtualization, Storage, and Hardware video class, students get an Excel spreadsheet to use when interviewing the business users about the project requirements. Here’s one of the tabs:
Start with the first question: how long does this SQL Server need to last? This can be measured in time (like 3 years) or in business growth (like until we quadruple the number of active users).
There are columns for several answers because database servers are rarely one-person projects – people have lots of different opinions, so everybody fills in their answers, and then we come to a group consensus in the last column.
The next several questions ask about how the business, application, and data will change during that time. We’re not looking for exact percentages, but just vague ideas. Are we going to double the number of end users per year? Are we going to start saving files in the database in the next version of our app?
This isn’t just about query performance – remember, we have to be able to do our backups, DBCCs, and restores fast enough to meet the business goals for RPO and RTO. After all, the business isn’t likely to say, “Since we’re growing fast, it’s okay if you lose more data.”
Who Really Answers These Questions?
Every time I teach this class in person, someone raises their hand and asks, “Do business people ever actually have answers for these questions?”
But it opens their eyes to your challenge as a database administrator. How are you supposed to build a perfect server to handle load that the business can’t define?
The less specific the business goals are, the more agile your server design needs to be.
That’s Where Virtualization Shines.
When you’re handed a wildly vague set of business requirements, it’s easy to build a virtual machine sized for today, and then add additional capacity to handle growth of data, users, and app complexity.
But here’s the kicker: you have to actually monitor the growth.
And just buying an off-the-shelf monitoring product ain’t gonna cut it.
Off-the-shelf monitoring products don’t tell you when your number of active users have doubled, when your incoming order rate has quadrupled, or when you’re keeping ten years of sales history instead of two. You have to work with the business to define a list of business metrics that you can monitor via database queries, and then add those metrics to your monitoring software.
For example, when I ran a sales system, I added monitoring queries for the number of orders per day, the number of active salespeople, and the number of orders we kept in the active database. I used my monitoring tool to create line graphs of those numbers every day, and then I sent that report to the stakeholders every month. I could point to the graph and say, “Sales volume has grown by forty percent, and we haven’t archived any data. Without additional investments in hardware, tuning, or staff training, you can expect that performance will degrade accordingly.”
Lines like that open wallets.
But You Gotta Understand Databases, Virtualization, and Business.
This is where I got so frustrated with guidance that says things like, “Set up one data file per virtual core.” If you’ve got a database that’s doubled in size over the past year, and you need to add four additional vCPUs, that guidance means you would have to:
- Add four LUNs on every SQL Server (primary, disaster recovery, reporting, QA, development – because you’re going to refresh those databases from production)
- Add four more identically sized data files (which instantly gives you hot spots on those empty files)
- Rebuild your clustered indexes to even out the load on the old and new data files
- Watch your transaction logs and backup sizes during this time because you’re going to have a huge change rate
But the one-file-per-core never includes guidance like that. They just expect you to “add files” and call it a day, and they ignore the technical and business side effects. Does the business really want to take a huge slowdown impact while all this happens? Especially if they’re in the midst of a growth spurt? Of course not.
Instead, you have to look at the bottleneck that the SQL Server is really facing, and identify the best way to fix that bottleneck with the least impact to the business.
That brings us to the last set of questions on that page of the planning spreadsheet – how will we handle growth when it happens?
If we suddenly acquire our biggest competitor and need to double the data and user load, or if we run a Super Bowl ad, how will we proactively keep the database fast? Are we going to be able to tune code or hire staff – or are we only able to throw hardware at the problem?
Another way to visualize this for management is my Manager’s Guide to Tuning Code – it shows management what your options are, and lets them choose your avenues.
Don’t get me wrong – I love throwing hardware at a problem when it’s the easiest, cheapest, fastest way to solve the pain. But remember that while virtualization makes it easier to throw hardware at a problem quickly, that may not be the easiest/cheapest/fastest solution. And even if you do it, you need to understand the real side effects of tricks like “one data file per volume per core” – because it’s nowhere near easy/cheap/fast in the real world.