Tune in here to watch our webcast video for this week! To join our weekly webcast for live Q&A, make sure to watch the video by 12:00 PM EST on Tuesday, September 23! Not only do we answer your questions, we also give away a prize at 12:25 PM EST – don’t miss it!
It’s hard to keep up with what your Report Server is doing, especially if your only tool is Report Manager. Now there’s an easier way, using the newest member of the sp_Blitz family: sp_BlitzRS! In this webcast, you’ll learn how sp_BlitzRS can help you stay informed of how well your SSRS installation is running, who gets what subscriptions, which reports would benefit from preprocessing, and more!
Have questions? Feel free to leave a comment so we can discuss it on Tuesday!
Tune in here to watch our webcast video for this week! To join our weekly webcast for live Q&A, make sure to watch the video by 12:00 PM EST on Tuesday, September 16! Not only do we answer your questions, we also give away a prize at 12:25 PM EST – don’t miss it!
Curious how you can give a compelling technical presentation? Join Kendra to learn five important tips on how to select the right topic for your talk, write an effective abstract, construct a coherent presentation, and make it to the podium to give your first presentation.
Have questions? Feel free to leave a comment so we can discuss it on Tuesday!
You store data in SQL Server, but you don’t have enough work to keep a full-time DBA busy.
In just half an hour, you’ll learn the basics of performance troubleshooting and index tuning from me – a recovering developer myself. I’ll show you the basic care and feeding of a Microsoft SQL Server instance and give you scripts to keep you out of trouble:
In this session that I gave at the 24 Hours of PASS 2014, I cover these steps:
1. Start every new table with a clustered index.
A clustered index tells SQL Server, “Here’s how I want you to sort the data.” Without that definition, SQL Server just sloppily throws your data anywhere that it can find space. While that might be awesome for insert speeds, it’s not so good as a universal data structure when we need to make performance tuning as easy as possible.
Generally, your clustered index needs to be on fields that are:
As you learn more about indexes, it’s tempting to micro-optimize them by saying, “Well, I think my end users are going to query this data on a combination of date and warehouse number.” Don’t over-think that before the users start using the app. Index designs are easy to change later.
There are situations where heaps – tables without clustered indexes – can perform faster. Thing is, they have some serious gotchas as Kendra Little explains in this 30-minute video. Heaps also miss out on the magic of SQL Server’s automatically suggested indexes, and that’s especially important for us in the next step.
2. Check your indexes weekly with sp_BlitzIndex®.
As it runs your beautifully crafted queries, SQL Server is constantly gathering index diagnostics. It keeps track of what indexes would have been the most helpful, how many times they would have been used, and how to create those indexes. You can query those diagnostic tables yourself, but frankly, it’s painful.
That’s where the free sp_BlitzIndex comes in. Download it, install it on your production server, and then run it in your database. It gives a funny sanity check on which indexes you could probably drop, and which ones you should add.
Run it, and then cover your eyes. It’s going to produce all kinds of heinous warnings about the awful mess that the last guy stuck you with. I only want you to focus on two warnings:
Warning: Index Hoarding – these are nonclustered indexes that are in place now, but nobody’s using them. SQL Server has decided these indexes just aren’t the most efficient way to get your data. Disable these rather than dropping them, because that way if they’re needed again, you can just enable them again as explained in that link.
(Now would be a good time to mention that if you’ve been bossing SQL Server around by specifying index hints in your queries, you’re gonna have a bad time. Your queries will fail.)
Warning: Missing Indexes – SQL Server wants a copy of your table stored in a different order to make queries go faster. Careful adding these willy-nilly – aim for 5 or less nonclustered indexes per table, and each index should have 5 or less fields on it. Avoid indexing XML, VARCHAR(MAX), NVARCHAR(MAX), or other big data types because the index will be huuuuge.
If you’re on SQL Server Standard Edition, creating indexes can lock your table while you work, so do the actual adding in development first. Time how long it takes, and depending on your dev hardware’s speed and database size, you’ll have a rough idea of what it will look like in production. Then do the doin’ during a maintenance window or after hours. If you’re lucky enough to use SQL Server Enterprise Edition, you can use the WITH (ONLINE = ON) parameter while creating indexes to have a lower impact on your end users.
And remember that first step where we said to start with a clustered index? If you’re thinking SQL Server will recommend the right clustered index for you, think again.
Repeat this process every Friday. See, SQL Server empties out the contents of this diagnostic data whenever SQL Server restarts, and then it constantly keeps piling the data back up. On Friday, even if your server was restarted over the weekend, you’ve got some good diagnostic data to check because users have been hitting the app all week. (This is why you can’t run it in your dev/staging environment and get the same good recommendations.) Within a month or so, you won’t believe the difference in performance.
3. Build the Suckerboard weekly with sp_BlitzCache®.
As it’s executing queries, SQL Server tracks which queries get run most often, and which ones use the most server resources. It’s like capturing a trace of your server, but even better because SQL Server is already doing this for you. You can query the dynamic management views to pull the data out, but ain’t nobody got time for that.
That’s where the free sp_BlitzCache comes in. Download it, install it on your production SQL Server, and then run it with these parameters:
EXEC sp_BlitzCache @top = 10, @sort_order = ‘duration’
This will give you the top 10 queries that have run the longest in total. For example, if a query runs for 10 seconds, and it’s been called 1,000 times, then it’ll have 10,000 total seconds of duration, so it will rank higher than a query that was only ran once and ran for 500 seconds.
This is the opposite of the leaderboard – this is the Suckerboard. It’s the Hall of Shame, the queries that we’re not too proud of. The good news is that it’s usually really easy to fix these once you know what they are. On the far right of sp_BlitzCache’s output, you can click on the query plan to see the query’s execution plan, and you’ll often find things like missing index warnings or zombies.
Repeat the process with a few other parameters to see the highest CPU users, the ones that read the most data, and the ones that ran most frequently:
EXEC sp_BlitzCache @top = 10, @sort_order = ‘CPU’
EXEC sp_BlitzCache @top = 10, @sort_order = ‘reads’
EXEC sp_BlitzCache @top = 10, @sort_order = ‘execution’
Repeat this every Friday and email the list of queries to your team. This has an amazing effect – people see their queries on the Suckerboard, and they immediately wanna work on those queries to get them off the Hall of Shame. It becomes a competition to keep off the Suckerboard.
After a month or so, you’ll be very comfortable with the list of queries. You’ll know them right away at a glance, and you’ll know why they show up – like what kind of work they’re doing. This makes you extremely well-equipped to deal with the next step.
4. When slowness strikes, use sp_WhoIsActive and sp_AskBrent®.
When someone comes running in screaming that the SQL Server is slow, run these two stored procedures:
sp_WhoIsActive lists all of the queries that are running right now, ordered by longest-running to newest.
sp_AskBrent checks a bunch of common potential issues on your server and warns you if a data file is growing, a transaction is stuck rolling back, a backup is running, and so on.
5. When you need to learn more, here’s what to do.
- Explaining Clustered vs Nonclustered Indexes – Jes Schultz Borland simplifies these in a blog post with examples.
- Filtered Index Limitations – these are indexes with a WHERE clause, thereby taking up less space, but…yeah.
- How to Decide if You Should Use Table Partitioning – this advanced indexing technique has a tempting call as you start to scale. Kendra Little explains it.
- More index resources
Plan cache and execution plan resources:
- How to Use sp_BlitzCache® – Jeremiah explains how to query your plan cache and find the most resource-intensive queries.
- How to Get Your Query’s Execution Plan – a 15-minute video that explains a few different methods.
- Parameter Sniffing – if your query suddenly runs slow out of nowhere without any changes, this might be your problem.
- More execution plan resources
And if you’re coming to PASS, we’re holding a performance tuning class on Monday & Tuesday called Make SQL Server Apps Go Faster. If you learned something in this article & video, then you’ll learn a LOT in the class. Read more about it now.
DBAs reading this: stop freaking out.
I know, I’m simplifying a lot of stuff in here. When you read the sentence, “Index designs are easy to change later,” you instantly pop a vein in your forehead because you remember that time you had to change a clustered index on a 1TB data warehouse.
Take a step back and put yourself in the frame of mind of a developer who’s alone in a dark room with an application, a SQL Server, no DBA, no budget, and no time to get everything perfect. I’m trying to give them the tools to get good-enough performance while spending less than 2 hours a week worrying about their database.
Sure, there’s all kinds of interesting scenarios and tricks you’d like me to explain on this page – but remember, the time they spend reading this page and learning techniques counts against their 2 hours of spare time per week. Focus on the things that will truly make a difference in performance, and ease off the academic “well actually” stuff.
Thanks for tuning in to our webcast again this week! To join our weekly webcast for live Q&A, make sure to watch the video by 12:00 PM EST on Tuesday, September 9! Not only do we answer your questions, we also give away a prize at 12:25 PM – don’t miss it!
The SQL Server community is unlike any other technical community. We have so many helpful people and our pick of resources. When you want to contribute, you have many options – from speaking to writing, attending conferences to running user groups, answering forum questions to writing articles. Which should you invest your time in to help your career and personal growth? Let me share my experiences with all of these so you can make the best choice.
Have questions? Feel free to leave a comment so we can discuss it on Tuesday!
Let’s say you’re a DBA managing a 2TB database. You use SQL Server transaction log shipping to keep a standby copy of the database nice and warm in case of emergency. Lots of data can change in your log shipping primary database: sometimes it’s index maintenance, sometimes it’s a code release, sometimes it’s just natural data processing.
And when a lot of data changes, your warm standby sometimes is a lot less warm than you’d like. It can take a long time to restore all those log files!
Here’s a trick that you can use to help “catch up” your secondary faster. A quick shout-out to my old friend Gina Jen, the SQL Server DBA and log shipper extra-ordinaire who taught me this cool trick years ago in a land far far away.
Log shipping Secret Weapon: Differential Backups
Through lots of testing and wily engineering, you’ve managed to configure nightly compressed full backups for your 2TB database that are pretty darn fast. (No, not everyone backs up this much data every night, but stick with me for the sake of the example.)
- Log shipping primary had a full backup last night at 2 am
- Log shipping secondary has transaction logs restored through 7 am
- It’s 3 pm, and you’d really like to have everything caught up before you leave the office
Here’s an option: run a compressed differential backup against your log shipping primary. Leave all the log shipping backup and copy jobs running, though — you don’t need to expose yourself to the potential of data loss.
After the differential backup finishes, copy it over to a nice fast place to restore to your secondary server. Disable the log shipping restore job for that database, and restore the differential backup with NORECOVERY. This will effectively catch you up, and then you can re-enable the log shipping restore and you’re off to the races!
But Wait A Second. Aren’t Those Full Backups A Problem?
Running a full backup without the COPY_ONLY keyword will reset the “differential base”. That means that each differential backup contains changes since the last full backup.
But here’s the cool thing about log shipping: restoring a transaction log brings the new differential base over to the secondary.
So as long as you’ve restored transaction logs past the point of the prior full backup, you can restore a differential to your log shipping secondary.
This Sounds Too Good To Be True. What Can Go Wrong?
This isn’t foolproof. If you haven’t run a full backup in a long time, your differential backup may be really big, and taking that backup and restoring it may take much longer than restoring the logs. (Even if you’re using log shipping, you should be doing regular full backups, by the way.)
And like I mentioned above, if your log restores are so far behind that they haven’t “caught up” with the last full backup taken on the primary, you’re not going to be able to restore that differential backup to the secondary.
What If a Transaction Log Backup File Is Missing?
A technique like this could work for you, as long as a full backup hasn’t run since the transaction log backup file went missing. (If it has, you need to re-setup log shipping using a full).
But a word of warning: if you have missing transaction log backup files, you have a “broken” log chain. You should take a full backup of your log shipping primary database to get you to a point where you have functioning log backups after it, even if you’re going to use a differential to bridge the gap on your log shipping secondary. (And keep that differential around, too!) Keep in mind that you won’t have point-in-time recovery for a period around where the log file is missing, too.
Log shipping is Great
I just love log shipping. It’s quick to set up, it’s relatively easy to manage, it’s included in Standard Edition, and it’s got these surprising little tricks that make it easy to keep going. You can learn more about log shipping from Jes, and join us in person in 2015 in our Senior DBA training, which includes an advanced module on log shipping.
You love Q&A sites like StackOverflow.com and DBA.StackExchange.com, but sometimes it’s hard to find interesting questions that need to be answered. So many people just sit around hitting refresh, knocking out the new incoming questions as soon as they come in. What’s a database person to do?
Use the power of the SQL.
Data.StackExchange.com lets you run real T-SQL queries against a recently restored copy of the StackExchange databases. Here’s my super-secret 3-step process to find questions that I have a shot at answering.
Step 1. Find out how old the restored database is.
Run this query to get a list of site databases and the newest post date in each one:
I take a glance at StackExchange.Dba and StackOverflow to make sure I’m dealing with a relatively recent database restore. These database restores are done weekly, but you know how it goes – sometimes scheduled jobs fail.
Step 2. Find questions everybody’s talking about.
Run this query to find questions with many comments, but no accepted answer. Note that for this one, you do have to pick the site you want to focus on – this doesn’t run across all databases.
If you click on the Post Link, you’re taken to the live question. If the database restore was a few days ago, keep in mind that the live question may have changed or been answered by now.
The other columns help me see at a glance if this is a question I’m interested in. In the above screenshot, the third question, “New database server hardware,” has 2 answers, but none of them have been upvoted, and there hasn’t been any comments since 2014-05-13. It’s old, and the questioner has probably moved on, but you can revive the question by posting a really good answer. You can also take your time, knowing nobody else is really active on it right now. These are great opportunities to post a really in-depth answer for other folks to find as they search the web later.
Which brings me to my next favorite query:
Step 3. Find questions that people keep looking at.
Run this query to find questions with many views, but no accepted answer. This one is filtered by a specific tag because the SQL Server ones don’t usually show up in the top views. Use tags from the DBA.se tag list or the StackOverflow tag list.
Obscure error numbers are usually going to pop up here because there’s not much public information about ‘em. Now’s your chance to write an answer explaining how to troubleshoot that error number, and presto, you’re the next Jon Skeet.
Transparent Data Encryption is a way to encrypt your data in SQL Server. It affects the data and log files of the database. I recently gave a webcast on this topic, and got some great questions about TDE.
What versions of SQL Server is it available in?
It is in 2008+. (Let me ask again…why are you still using SQL Server 2005?)
Is TDE only available in Enterprise Edition?
Yes, this is an expensive-edition-only feature.
Can TDE encrypt at the table or column level?
No, it’s all or nothing.
Does TDE provide encryption of backups?
When you enable TDE on a database, the backups will, by nature of the feature, be encrypted. However, it’s not possible to encrypt only the backup files and not the data. Native backup-only encryption is available as of SQL Server 2014 (Enterprise, BI, and Standard editions).
When I turn TDE on, will all the data pages need to be read? Will this take some time?
Yes, and yes. Test it in your development environment first to determine how long it will take to perform this in production.
Does TDE work with…
Failover clustering? Yes – there is only one copy of shared data.
AlwaysOn Availability Groups? Yes, but you have to set it all up with T-SQL.
Database mirroring? Yes – the data will be encrypted on both instances.
Log shipping? Yes – the data is encrypted on both instances.
Replication? Technically…but it’s a pain. The certificate must be installed on all subscribers, and the data is not encrypted as it is distributed.
What other questions do you have?
You’ve got a query that’s running too slow, and you need to figure out why. The first step is admitting that you’ve got a problem, but the second step is getting the execution plan. In this 16-minute video, I will show you how to get an estimated execution plan, why you probably shouldn’t, and several different ways to get the actual execution plan. I’ll finish with my favorite resources for learning how to read and improve execution plans:
To ask questions, join us on our Tech Triage Tuesday webcast where we’ll do live Q&A and follow up with demos. We’ll also be drawing for a surprise prize from Brent. See you there!
tl;dr – I do not recommend this book.
I was so incredibly excited when it was originally announced. A book published by VMware Press, written by prominent VMware, SQL, and storage consultants? GREAT! So much has changed in those topics over the last few years, and it’s high time we got official word on how to do a great job with this combination of technology. Everybody’s doin’ it and doin’ it and doin’ it well, so let’s get the best practices on paper.
When it arrived on my doorstep, I did the same thing I do with any new tech book: I sit down with a pad of post-it notes, I hit the table of contents, and I look for a section that covers something I know really well. I jump directly to that and I fact-check. If the authors do a great job on the things I know well, then I’ve got confidence they’re telling the truth about the things I don’t know well.
I’ll jump around through pages in the same order I picked ‘em while reading:
Page 309: High Availability Options
Here’s the original. Take your time looking at it first, then click on it to see the technical problems:
OK, maybe it was bad luck on the first page. Let’s keep going.
Page 111: Database File Design
The “Microsoft Recommended Settings” are based on a 2006 article about Microsoft SQL Server 2005. I pointed this out to the book’s authors, who responded that Microsoft’s page is “published guidance” that they still consider to be the best advice today about SQL Server performance. Interesting.
Even so, the #3 tip in that ancient Microsoft list is:
3. Try not to “over” optimize the design of the storage; simpler designs generally offer good performance and more flexibility.
The book is recommending the exact opposite – a minimum of one data file per core for every single database you virtualize. That’s incredibly dangerous: it means on a server with, say, 50 databases and 8 virtual CPUs, you’ll now have 400 data files to deal with, all of which will have their own empty space sitting around.
I asked the authors how this would work in servers with multiple databases, and they responded that I was “completely wrong.” They say in a virtual world, each mission critical database should have its own SQL Server instance.
That doesn’t match up with what I see in the field, but it may be completely true. (I’d be curious if any of our readers have similar experiences, getting management to spin up a new VM for each important database.)
So how are you supposed to configure all those files? Let’s turn to…
Page 124: Data File Layout on Storage
Imagine this setup for a server with dozens of databases. And imagine the work you’d have to do if you decide to add another 4 or 8 virtual processors – you’d have to add more LUNs, add files, rebalance all of the data by rebuilding your clustered indexes (possibly taking an outage in the process if you’re on SQL Server Standard Edition).
What’s the point of all this work? Let’s turn to…
Page 114: You Need Data Files for Parallelism
No, you don’t need more data files for parallelism. Paul Randal debunked that in 2007, and if anybody still believes it, make sure to read the full post including the comments. It’s simply not true.
I asked the authors about this, and they disagree with Paul Randal, Bob Dorr, Cindy Gross, and the other Microsoft employees who went on the record about what’s happening in the source code. The authors wrote:
You can’t say Microsoft debunked something when they still have published guidance about it…. If in fact if your assertions were accurate and as severe then we would have not had the success we’ve had in customer environments or the positive feedback we’ve had from Microsoft. I would suggest you research virtualization environments and how they are different before publishing your review.
(Ah, he’s got a point – I should probably start learning about SQL on VMware. I’ll start with this this guy’s 2009 blog posts – you go ahead and keep reading while I get my learn on. This could take me a while to read all these, plus get through his 6-hour video course on it.)
So why are the authors so focused on micromanaging IO throughput with dozens of files per database? Why do they see so many problems with storage reads? I mean, sure, I hear a lot of complaints about slow storage, but there’s an easy way to fix that. Let’s turn to page 19 for the answer:
Page 19: How to Size Your Virtual Machines
Ah, I think I see the problem.
To make matters worse, they don’t even mention how licensing affects this. If you’re licensing SQL Server Standard Edition at the VM guest level, the smallest VM you can pay for is 4 vCPUs. Oops. You’ll be paying for vCPUs you’re not even using. (And if you’re licensing Enterprise at the host level, you pay for all cores, which means you’re stacking dozens of these tiny database servers on each host, and managing your storage throughput will be a nightmare.)
In fact, licensing doesn’t even merit a mention in the Table of Contents or the book’s index – ironic, given that it’s the very first thing you should consider during a virtual SQL Server implementation.
In Conclusion: Wait for the Second Edition
I’m going to stop here because you get the point. I gave up on the book after about fifty pages of chartjunk, outdated suggestions, and questionable metrics (proc cache hit ratio should be >95% for “busy” databases, and >70% for “slow” databases).
This is disappointing because the book is packed with information, and I bet a lot of it is really good.
But the parts I know well are not accurate, so I can’t trust the rest.
We always like to innovate — not just with the solutions we design for our consulting customers and in how we teach, but in our free videos, too.
Our YouTube channel has become super popular. Lots of folks watch the recordings of our live webcasts. We stopped recently and asked, “How can we make this even better for the folks who attend our live event?” And we realized: we can give you more time to ask questions about that week’s training topic!
Here’s your mission:
- Watch the video below today. We won’t be presenting this live this week or re-covering the material from the video, we’re doing more advanced QA for the folks who’ve already watched it.
- Note down questions or comments you have on this post. (This is totally optional, but it means you won’t forget your question and it’s more likely we have time to talk about it with you.)
- Attend the live webcast on Tuesday at the normal time (11:30 am Central). Register here.
- During the first 10 minutes of the webcast, we’ll give away a prize– but you must be present to win!
The live discussion of the video and Q&A won’t be recorded and published, and you also need to be present to win the prize. See you on Tuesday!