SQL Server DBAs are slow to adopt Windows Server 2012 and Server 2012 R2. I frequently find teams planning new implementations of SQL Server 2012 on Windows Server 2008 R2.
There seem to be two main reasons for this:
- DBAs want to change a minimal number of factors, and are slow to trust a new version of the OS. Windows 2008R2 feels “safe.”
- Windows Server 2012 was hard to love when it first came out. With no Start menu, Windows 2012 was more baffling than exciting to most production DBAs.
But these reasons don’t hold up– and staying on an old OS isn’t doing you any favors. If you’re planning a SQL Server 2012 (or upcoming 2014) installation, you should also be using Windows Server 2012 or Server 2012 R2.
Windows Server 2008R2 isn’t “Safer” or “Simpler” than Server 2012 R2
Let’s break down that first concern above. The essence of it is that staying on Windows Server 2008R2 seems less risky than using a new OS.
But let’s look at that closely. The most recent service pack for Windows 2008R2 was released in March 2011– that was Service Pack 1. (I bet I’m not the only person who has searched in vain for SP2 repeatedly, thinking I must have just missed it.)
Lots of fixes have been released since Service Pack 1. Here are a few highlights:
- KB 2566205 – Scalability of the I/O performance decreases in Windows Server 2008 R2 if more than 256 GB of physical memory is used
- KB 2468345 – Computer intermittently performs poorly or stops responding when the Storport driver is used in Windows Server 2008 R2
- KB 2545635 – You experience disk I/O performance issues on a computer that has an Intel Westmere-EX processor with at least 8 sockets and that is running Windows Server 2008 R2
- KB 2553549 – All the TCP/IP ports that are in a TIME_WAIT status are not closed after 497 days from system startup in Windows Vista, in Windows 7, in Windows Server 2008 and in Windows Server 2008 R2
Which patches you need varies by your configuration, and this is just the tip of the iceberg.
When you’re using high availability solutions, the list of patches you need to consider just gets bigger– and the risks get higher. In this post Brent describes a hotfix for a cluster-specific bug that was released after Stack Exchange experienced Availability Group outages due to a clustering issue.
Although Windows 2008R2 has been out for a while, managing an Availability Group or a Windows Failover Cluster on it isn’t simpler– it’s actually more complicated! You’ve got more patches you need to be aware of and manage, and you still need to keep an eye out for new fixes as they’re released.
Failover Clustering has huge new quorum features
Don’t get me wrong– Windows 2008 R2 has some pretty nice features when it comes to setting up a cluster, particularly compared to older versions of Windows. I love the Cluster Validation Wizard. I find the Failover Cluster Manager snap-in to be pretty intuitive and easy to manage.
But Failover Clusters have grown up significantly– particularly when it comes to your options with making sure that your applications stay online. And having more ways to keep applications online is probably the exact reason that you’re using features involving clustering!
Dynamic Quorum and Vote Assignment – Windows Server 2012
Windows Server 2012 introduced “Dynamic Quorum”. Dynamic Quorum lets the cluster add and remove votes for nodes if they shut down or fail. This makes it more likely for the cluster to maintain a majority if you have multiple sequential failures.
Windows Server 2012 also lets you easily customize quorum configuration and remove votes from some nodes if you’d like to. To remove node votes with Windows Server 2008R2, you’d have to know to install KB 2494036 and then reconfigure quorum.
Bonus: as of Windows Server 2012, the Windows Failover Clustering feature is part of Windows Standard Edition.
Dynamic Witness, Tiebreakers – Windows Server 2012 R2
Windows Server 2012 R2 introduces even more quorum features. The Dynamic Witness feature lets the cluster decide when it will or won’t give a vote to the witness you have configured.
This means that if you have multiple sequential failures, the cluster has more options and is smarter about keeping the right number of votes available. It also means that the guidance about creating a witness is simpler: since the vote can be managed dynamically, you should always create a witness.
Windows 2012 R2 has more features, too– the cluster is smarter about what happens when things start to come back online after you force quorum, and it has options for specifying a tiebreaker if you have multiple failures at the same time.
And it’s a simple thing, but I love it the most: in 2012 R2 you can see what’s happening with dynamic votes for nodes by using the Failover Cluster Manager (no PowerShell commands required). This makes testing different failure scenarios much more straightforward, as you can quickly and easily see how many votes there are and who’s got them.
Oh, and they fixed that Start menu thing, too.
You can’t easily migrate your Windows 2008R2 Cluster to Windows 2012+
The final reason I wouldn’t plan a new highly available environment on Windows 2008R2 is simply that I don’t want to be stuck there. You might think, “We’ll start out on a known OS and then upgrade soon” — but that’s easier said than done.
If you’re on a Windows 2008R2 cluster and you want to upgrade, you will need to create a new, separate failover cluster using Windows Server 2012, and then migrate to it. That requires separate storage, separate servers, and a careful migration plan. It’s absolutely doable, but it’s not simple, cheap, or easy. (Note that the built-in “Migrate a Cluster” wizard doesn’t support SQL Server installs for Windows 2008R2 -> Windows 2012.)
What about SQL Server 2008? Or 2005?
If you’re doing a new server buildout, you may decide to install SQL Server 2008 R2 on a Windows 2012 / 2012 R2 failover cluster. That’s supported, but to get installation to work you may need to use a slipstream installation and take a few other steps. (I think slipstreaming is great, but budget extra time to get everything tested and working.)
If you’re thinking about running SQL Server 2005 on Windows 2012 or Windows 2012 R2, that’s just not supported by Microsoft.
It’s Time to Embrace Our New OS Overlord
I really like Windows 2008 R2. I’ve used it for years, it’s very familiar, and I feel like I know its quirks. But I also recognize that it’s got a lot of baggage. It’s not really simpler or easier to manage.
The improvements to Windows Failover Clustering features in later versions of the operating system should be very compelling to anyone considering clustering SQL Server or using Availability Groups. It’s not always perfect and it’s certainly complicated, but it’s much better than what you’ve got in Windows 2008R2.
It’s time to start planning your new SQL Server implementations with Windows 2012 and higher.
Ever wanted to prove that your code or index changes had a positive impact? In this 30 minute video, Kendra shows you how to use Extended Events to measure a production workload– and how you can aggregate the results to prove your changes made a difference.
Want to skip straight to the demos? Here’s a quick timeline:
- Demo 1: How I generate a fake workload – 2:45
- Demo 2: Taking an initial benchmark with Extended Events – 4:45
- Demo 3: Aggregating the results of the Extended Events Trace – 10:52
- Demo 4: Taking a second benchmark with Extended Events and aggregating the results – 18:05
Want a sample of the type of Extended Events trace run in the demo? Here’s a sample script. This isn’t any rocket science on my part: I generated the script with the Extended Events wizard and then told it to script it out. (Try it yourself!)
--******************
--These scripts were generated from SQL Server 2012
--Management Studio Extended Events Wizard
--Customization is primarily just formatting for readability
--******************
CREATE EVENT SESSION [Production Perf Sample-Before] ON SERVER
ADD EVENT sqlserver.sp_statement_completed(SET collect_object_name=(1)
    ACTION(sqlserver.client_app_name,sqlserver.database_id,
        sqlserver.query_hash,sqlserver.query_plan_hash,sqlserver.session_id)
    WHERE ((([package0].[divides_by_uint64]([sqlserver].[session_id],(5)))
        AND ([package0].[greater_than_uint64]([sqlserver].[database_id],(4))))
        AND ([package0].[equal_boolean]([sqlserver].[is_system],(0)))
        AND ([sqlserver].[query_hash]<>(0)) )),
ADD EVENT sqlserver.sql_statement_completed(
    ACTION(sqlserver.client_app_name,sqlserver.database_id,
        sqlserver.query_hash,sqlserver.query_plan_hash,sqlserver.session_id)
    WHERE ((([package0].[divides_by_uint64]([sqlserver].[session_id],(5)))
        AND ([package0].[greater_than_uint64]([sqlserver].[database_id],(4))))
        AND ([package0].[equal_boolean]([sqlserver].[is_system],(0)))
        AND ([sqlserver].[query_hash]<>(0)) ))
ADD TARGET package0.event_file
    (SET filename=N'c:\Xevents\Traces\Production Perf Sample-Before')
WITH (
    EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY=30 SECONDS,
    TRACK_CAUSALITY=ON,
    STARTUP_STATE=OFF
)
GO

--ALTER EVENT SESSION [Production Perf Sample-Before] ON SERVER STATE=START;
--GO
--ALTER EVENT SESSION [Production Perf Sample-Before] ON SERVER STATE=STOP;
--GO
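Once the session has run and you’ve stopped it, you’ll want to pull the events back out of the .xel file so you can aggregate them. Here’s a minimal sketch of reading the file target with sys.fn_xe_file_target_read_file– the path matches the session definition above, and the XML shredding you layer on top of this is up to you:

--Read the raw events back out of the event_file target
SELECT CAST(event_data AS xml) AS event_data_xml
FROM sys.fn_xe_file_target_read_file
    ('c:\Xevents\Traces\Production Perf Sample-Before*.xel', NULL, NULL, NULL);
GO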
One of my favorite things about SQL Server is the instrumentation. It’s incredibly good at giving you information about what’s going on in the database engine and query optimizer.
Want to know why your server is slow? Need to know what made it go offline? You have a lot of tools available.
This hasn’t always been the case. One of the toughest things about the old SQL Server 2000 was the lack of instrumentation. The tools that help you improve SQL Server have all been added in the last 10 years.
The cool thing is that the instrumentation gets better and better with each version of SQL Server– new features get new Dynamic Management Views and functions. And sometimes we get a whole new feature to help us get insight into SQL Server.
If you’ve still got SQL Server 2000 instances where performance and availability is important, explain to your management how limited the tools are to diagnose and repair issues. SQL Server’s come an awful long way!
The other day I was running a script I’d written to expand out some tables in the AdventureWorks sample database for some testing. The script was taking a long time, and at first I thought I was just spoiled because I usually run my tests on my laptop, whose storage is backed by SSDs. In this case I was running my test against a SQL Server instance up in AWS.
After a delicious fifteen minute coffee break, I realized something was wrong– it STILL wasn’t done. I grabbed sp_AskBrent™ to do a quick check and see if my storage was really THAT slow.
I ran sp_AskBrent™ with a 60 second sample, like this:
exec sp_AskBrent @seconds=60, @ExpertMode=1;
GO
Here’s what I saw (click for a larger version):
PREEMPTIVE_OS_WRITEFILEGATHER means slow file growths
In a 60 second sample, I’d been waiting for PREEMPTIVE_OS_WRITEFILEGATHER for 36 seconds. Whoops!
The script I was running was making tables larger. The database was out of room, and was growing the data file in small autogrowth increments. Each of these automatic growths was stopping to “zero out” the files, because I hadn’t enabled Instant File Initialization in SQL Server.
How I enabled Instant File Initialization
My test instance is set up a bit uniquely– it uses UNC paths for the SQL Server databases. This is supported in SQL Server 2012, and I was testing it out in a Windows Failover Cluster. (If you use shares for files, you need to be very careful about redundancy and availability of course, but that’s a topic for another post.)
I’d forgotten to enable Instant File Initialization for the SQL Server service on the server that hosted the file shares! Wherever Windows will actually grow the files, the SQL Server service account needs the “Perform Volume Maintenance Tasks” right.
Using the Local Security Policy to Grant “Perform Volume Maintenance Tasks”
Here’s how to grant the rights:
First, open the Local Security Policy. I usually do this by running the command secpol.msc, but you can also find it in the “Administrative Tools” menu.
Then navigate to Local Policies -> User Rights Assignment -> Perform Volume Maintenance Tasks.
Add the SQL Server Service account. The change will take effect the next time you restart SQL Server.
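If you want to confirm the setting actually took effect after the restart, one common trick is to temporarily enable trace flags 3004 and 3605, grow a file, and check the error log: with Instant File Initialization working, data file growths no longer log “Zeroing” messages (log files are always zeroed). A rough sketch– the database, logical file name, and size here are just examples:

--Log file-initialization details to the error log (turn these back off when you're done)
DBCC TRACEON (3004, 3605, -1);
GO
--Grow a data file; no "Zeroing" entry for the data file means IFI is working
ALTER DATABASE AdventureWorks2012
    MODIFY FILE (NAME = 'AdventureWorks2012_Data', SIZE = 500MB);
GO
EXEC sys.xp_readerrorlog 0, 1, N'Zeroing';
GO
DBCC TRACEOFF (3004, 3605, -1);
GO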
So, did it improve my performance?
Heck yeah, it did! On the first run I got tired of waiting and killed my expansion script after 20 minutes and just dealt with the rollback. After enabling Instant File Initialization on the file server that holds the shares where my database files live, the expansion script finished in four minutes and 48 seconds. Shazam!
Temp tables are like real tables, just a little trickier.
When you’re starting out writing TSQL, it’s easy to want to do all your work in a single query. You learn about derived sub-queries, CROSS APPLY statements, and common table expressions. Suddenly, each of your queries is so complex that you hardly know what you’re doing.
But complex TSQL can be quite fragile. Small mis-estimations in parts of the plan can cause things to go very much awry in other regions of the plan.
Temp tables can help performance!
One treatment for slow, complex TSQL is to break it into multiple steps. You create a temp table, populate it, and reference it in future queries. This can lead to improved estimates and reliability, because temporary tables can have column and index related statistics. Those statistics help the optimizer estimate how many rows will be returned at different points along the way.
But this tuning technique isn’t foolproof. It’s easy to forget that statistics can be tricky– just like with real tables. Let’s take a look and prove that statistics issues from “normal” tables also matter on temp tables.
Out of date statistics on temp tables can hurt performance
The first step is to create a temp table and load it up with some rows:
--Create the temp table first (column types assumed for this demo)
CREATE TABLE #temp (
    TransactionID int,
    TempValue int
);
GO
INSERT #temp (TransactionID, TempValue)
SELECT TOP 100000 TransactionID, 1
FROM Production.TransactionHistory;
GO
INSERT #temp (TransactionID, TempValue)
SELECT TOP 10000 TransactionID, 2
FROM Production.TransactionHistory;
GO
We now have:
- 100K rows where TempValue=1
- 10K rows where TempValue=2
Now let’s say I query the temp table in some way. My where clause is for a specific TempValue. The act of running this query causes a column level statistic to be created on the TempValue column:
SELECT max(th.Quantity)
FROM #temp as t
JOIN Production.TransactionHistory th on t.TransactionID=th.TransactionID
WHERE TempValue=2;
GO
We can see the statistic, too. This code is modified from Books Online and uses a DMV available in SQL Server 2012 SP1 and higher:
use tempdb;
GO
SELECT sp.stats_id, name, last_updated, rows, rows_sampled, steps,
    unfiltered_rows, modification_counter
FROM sys.stats AS stat
CROSS APPLY sys.dm_db_stats_properties(stat.object_id, stat.stats_id) AS sp
WHERE stat.object_id = object_id('#temp');
GO
I continue on, and insert 11K rows for TempValue=3:
INSERT #temp (TransactionID, TempValue)
SELECT TOP 11000 TransactionID, 3
FROM Production.TransactionHistory;
GO
At this point I have:
- 100K rows where TempValue=1
- 10K rows where TempValue=2
- 11K rows where TempValue=3
Now I run another query which wants to look ONLY at the 11K rows where TempValue=3. SQL Server completely mis-guesses how many rows will be returned– instead of 11K rows, it guesses that I’ll just get a SINGLE row:
SELECT max(th.Quantity)
FROM #temp as t
JOIN Production.TransactionHistory th on t.TransactionID=th.TransactionID
WHERE TempValue=3;
GO
Looking at the statistics, we can see why– the statistic that was created when I ran my first SELECT query wasn’t triggered to update:
Let’s look at the histogram
When we look at the histogram for the statistic, it confirms that it doesn’t know about any rows with a TempValue higher than two:
Bad Statistics can lead to bad optimization
If I were joining from this temp table to other large tables, that mis-estimate could skew the rest of my plan– and it could cause big problems. The optimizer might select inefficient joins, not allocate enough memory, or choose not to go parallel if estimates are skewed too low.
Temp tables are a great tool to use, but remember a few key rules:
- Column level statistics will be created for you when you run queries using the temp table
- If you modify the temp table in multiple steps, those statistics may get out of date
In complex procedures, manually creating and updating statistics on columns or indexes can save you time and improve execution plan quality.
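In the demo above, for example, refreshing the temp table’s statistics right after the final insert fixes the one-row estimate. A minimal sketch– the named statistic is just for illustration:

--Refresh every statistic on the temp table after loading it in multiple steps
UPDATE STATISTICS #temp;
GO
--Or create a named statistic up front so you control when it's built and updated
CREATE STATISTICS st_TempValue ON #temp (TempValue);
GO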
Could Trace Flag 2371 Help?
If you find that you have a lot of existing code that has this problem, Trace Flag 2371 may help. This trace flag is available as of SQL Server 2008R2 SP1 and higher, and it changes the default threshold for when statistics automatically update.
The net effect of the trace flag is to cause statistics updates to happen more frequently on large tables. You must turn it on server wide for it to take effect — it doesn’t work for individual sessions.
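If you do decide to test it, turning it on is a one-liner (a minimal sketch; you can also add -T2371 as a startup parameter so it survives restarts):

--Turn on trace flag 2371 globally (lowers the auto-update threshold for large tables)
DBCC TRACEON (2371, -1);
GO
--List the trace flags currently enabled globally
DBCC TRACESTATUS (-1);
GO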
I’m not a huge fan of enabling any trace flags unless you’re solving a specific problem, and documenting it. This trace flag is not very widely used in my experience, so flipping it on puts you in a small subset of people. If you start hitting wacky issues and need to resolve them, suddenly the trace flag gives you a whole new dimension of complexity. So please don’t think of this trace flag as “preventive medicine”– only use it if you don’t have other good ways to solve the problem.
I first knew something was up when I looked at the job history for a simple maintenance plan. It had two steps:
- Rebuild all the indexes in the database – this took 10 minutes each night.
- Update statistics – this took 2-3 hours each night.
Maintenance Plans light the fuse
I love the concept of maintenance plans, but I don’t love the way all the tasks are set up.
In the case I was looking at, the Update Statistics task was being used with two values that are set by default:
- Run against all statistics
- Update them with fullscan
“All” statistics means that both “column” and “index” statistics will be updated. There may be quite a lot of statistics — most people leave the “auto create statistics” option enabled on their databases, which means that queries will dynamically cause the creation of more and more statistics over time.
Combined with “fullscan”, updating all statistics can become a significant amount of work. “Fullscan” means that to update a statistic, SQL Server will scan 100% of the values in the index or column. That adds up to a lot of IO.
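Curious how much of this your own databases have signed up for? A couple of quick checks (a minimal sketch– run the second query in the database you care about):

--Is auto create / auto update statistics enabled?
SELECT name, is_auto_create_stats_on, is_auto_update_stats_on
FROM sys.databases;
GO
--How many auto-created column statistics already exist in the current database?
SELECT COUNT(*) AS auto_created_column_stats
FROM sys.stats
WHERE auto_created = 1;
GO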
Why ‘SELECT StatMan’ repeatedly scans tables
If SQL Server needs to update column level statistics for the same table, it could potentially use a single scan and update multiple stats, right?
Because of the runtimes I was seeing, I was pretty sure that wasn’t happening. But we can take a closer look and see for ourselves.
In our maintenance plan task, if we hit “View TSQL”, a window pops up showing us the commands that the plan will run. (I love this feature, by the way!) We will use one of these commands to test things out in a bit.
First, let’s make sure we have some column level statistics on our database. It already has indexes and their associated stats. To create some column level stats, I run these queries:
--create two column stats using 'auto create statistics'
select * from Person.Person where MiddleName like 'M%';
select * from Person.Person where Title is not null;
GO

--Create two filtered stats on Title
create statistics kl_statstest1 on Person.Person (Title) where Title = 'Mr.'
GO
create statistics kl_statstest2 on Person.Person (Title) where Title = 'Ms.'
GO
That will create two “auto” stats that start with “_WA_Sys”, and two stats that I named myself. To check ‘em out and see ALL the index and column stats on the table, we run:
exec sp_helpstats 'Person.Person', 'All';
GO
Sure enough, this shows us that we have seven stats total– three are related to indexes.
Alright, time to run that sample command excerpted from our maintenance plan. I start up an Extended Events trace to capture IO from sp_statement_completed, then run the command the maintenance plan was going to use to update every statistic on this table with fullscan:
UPDATE STATISTICS [Person].[Person] WITH FULLSCAN
GO
Here’s the trace output – click to view it in a larger image:
Looking at the Extended Events trace output, I can see the commands that were run as well as their logical reads. The commands look like this:
SELECT StatMan([SC0])
FROM (SELECT TOP 100 PERCENT [Title] AS [SC0]
    FROM [Person].[Person] WITH (READUNCOMMITTED)
    WHERE ([title]='Mr.')
    ORDER BY [SC0] ) AS _MS_UPDSTATS_TBL
OPTION (MAXDOP 16)
The “logical_reads” column lets me know that updating four of these statistics had to do four separate scans of my table– and three of them are all on the Title column! (Doing a SELECT * FROM Person.Person shows 5,664 logical reads by comparison.)
IO was lower for statistics related to nonclustered indexes because those NC indexes have fewer pages than the clustered index.
A better way to update statistics: Let SQL Server pick the TABLESAMPLE
If you just run the TSQL command ‘UPDATE STATISTICS Person.Person’ (without telling it to scan all the rows), it has the option to do something like this:
SELECT StatMan([SC0], [SB0000])
FROM (SELECT TOP 100 PERCENT [SC0], step_direction([SC0]) over (order by NULL) AS [SB0000]
    FROM (SELECT [Title] AS [SC0]
        FROM [Person].[Person] TABLESAMPLE SYSTEM (3.547531e+001 PERCENT) WITH (READUNCOMMITTED)
        ) AS _MS_UPDSTATS_TBL_HELPER
    ORDER BY [SC0], [SB0000] ) AS _MS_UPDSTATS_TBL
OPTION (MAXDOP 1)
It dynamically figures out a sample size to use when building the statistic! (It can pick a variety of options– including scanning the whole thing.)
How to configure faster, better statistics maintenance
Avoid falling for the pre-populated settings in the “Update Statistics” task in the maintenance plan. It’s rare to truly need to use FULLSCAN to update stats in SQL Server, and even in cases where it’s justified, you want to implement it with statements targeting the individual statistics to update. The basic “UPDATE STATISTICS Schema.TableName” command is pretty clever– the issue is simply that Maintenance Plans don’t make it easy for you to run that!
Unfortunately, if you use maintenance plans there’s no super simple solution– it forces you to specify either fullscan or a specific sample. There’s no way to just use the basic “You compute the minimum sample” with that task.
You’ve still got good options– they just take a few more steps:
- You could use a T-SQL task or a custom SQL Agent job to run sp_updatestats
- You could use a free index and statistics maintenance script. The example I’ve linked to is super clever, and avoids updating statistics where it has just rebuilt an index!
- You could also let auto update stats take care of the issue– that’s often just fine on small databases or where there aren’t major data fluctuations
And each of those options should chew up less IO than updating all index and column statistics with FULLSCAN.
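To make that concrete, here’s what the lighter-weight alternatives look like in plain T-SQL (a minimal sketch; note that sp_updatestats only touches statistics that have had modifications):

--Update every statistic on one table, letting SQL Server choose the sample size
UPDATE STATISTICS Person.Person;
GO
--Or refresh statistics across the whole database with default sampling
EXEC sp_updatestats;
GO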
When is a “Full Scan” not a Full Scan?
The performance counter “Full scans/sec” sounds pretty scary. If you see spikes in this counter, you’re likely to think that your queries are madly scanning all the data in your tables– there are no stylish, well performant seeks, just drunken queries spewing IO everywhere.
If you look up the counter in SQL Server Books Online, it seems to confirm that. Full Scans/sec is defined as the “Number of unrestricted full scans per second. These can be either base-table or full-index scans.” It’s not just a FULL scan, it’s “unrestricted.” That sounds really bad, right?
Let’s take a look. To play along with the sample queries below, warm up a test SQL Server instance, restore the AdventureWorks2012 sample database, and fire up perfmon.
Test 1: A full scan
Here’s our first test query. This query technically does an index scan, but it’s a very efficient partial scan– it uses just 3 logical reads to hop to the end of the index and grab the last value.
--This will run 500 times
declare @val1 int;
select @val1=max(BusinessEntityID) from Person.Person;
GO 500
500 of these finish in less than one second. Here’s what the plan looks like. (If you look deep into the properties of the scan operator, you’ll find that it says that it’s a special “Backwards” scan for speed.)
In perfmon, I can see that this query did the following:
- 500 Full Scans/sec (SQL Server:Access Methods)
- 1,908 Page lookups/sec (SQL Server: Buffer Manager – this is logical reads)
- 0 Page reads/sec (SQL Server: Buffer Manager — this is physical reads and the data was already in memory)
Hmm, so what we’ve seen here is that we can make the full scans/sec counter spike with a scan that isn’t really “unrestricted” or “full” in the way that we might think.
Full scan does NOT really mean that it read the entire index! (I’m not the first to discover this, of course. Michael J Swart gives a nice example of a TOP query doing something similar in this post on different types of scans.)
Test 2: a “worse” full scan
Let’s make our test query a bit nastier. This new version of the query is not such a great citizen — it reads 3,820 pages on each run (as opposed to 3 reads from the query before):
--The variable here is just to keep SSMS from choking on the output
declare @val1 xml;
select @val1 = AdditionalContactInfo from Person.Person;
GO 500
This is also a scan but it does more IO. 500 of these take five seconds or so– much longer. Here’s the plan:
When we run this, we see that:
- Full scans/sec only spikes up to a max value of 97 this time– it’s much LOWER, even though we hit the SQL Server with a nastier query. We just ran this from a single SQL Server session, so it didn’t spike up as much and took longer– but it certainly isn’t better than the first query.
- Page lookups/sec spikes up to a max value of 362,161. Each run of the query is doing more logical IO– and the fact that this query does more work shows up much more clearly here.
- Page reads/sec just has a slight bump and at one point 8 reads/sec are registered. The data was already in memory, so this wasn’t really impacted at all.
The “worse” query didn’t spike the full scans/sec counter, but it did more IO. If we were designing monitoring around the full scans/sec counter then we’d probably just throw an alert on the incident that actually had less impact.
Pulling it all together
Monitoring is a tricky business. Performance counters tell you a lot about SQL Server– but if you monitor on individual thresholds on many performance counters, you’ll end up with false alarms and more questions than answers. The counters shown in this post can be useful for baselining, trending, and interpreting in a larger context of information, but by themselves they don’t tell you a full story.
If you’re getting frustrated with perf counters and want to learn what’s really dragging down your SQL Server, check out our free webcasts, our free scripts, or our performance tuning training videos.
Here’s a SQL Server Management Studio trick that comes in super handy when you’ve got long pieces of code. I almost never see anyone use this, and I think the reason is that few people know about it.
When I’m working on a stored procedure, sometimes I want to add in a new variable or a temp table. I declare ‘em all at the top of the procedure, but I’m working much farther down in the proc than that.
I don’t have to lose my place. I can split the window using a cool, but hard to see feature known as the “splitter bar”.
Drag the splitter bar down. I now have two synchronized views into my procedure.
Voila, I can add my temp table in the right area without losing my place.
This feature isn’t specific to SQL Server Management Studio– it exists in lots of other products. (Including Word!) But the feature is hard to spot and most folks just never seem to learn about it. (I had no idea it was there myself until Jeremiah showed it to me a while back, and now I’m addicted to it.)
SQL Server has amazing instrumentation. Dynamic management views and functions give you great insight into what’s happening in your SQL Server and what’s slowing it down. Interpreting these views and functions takes a lot of time, but with practice and skill you can use them to become great at performance tuning.
But nothing’s perfect. Some of these DMVs and DMFs have bugs or column names that can be misleading. If you take them at face value, you can end up with egg all over your face.
sys.dm_os_sys_info – hyperthread_ratio column
This DMV is great for quickly checking the last startup time of your SQL instance and finding out if it’s virtualized. But don’t trust sys.dm_os_sys_info to tell you whether or not hyperthreading is enabled on your processors.
The “hyperthread_ratio” column is simply an indication that you have multiple cores per processor and does NOT tell you whether or not they are using hyperthreading.
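Here’s a quick look at what the view does give you– the socket math at the end is the usual interpretation, not something the DMV spells out, and none of it tells you whether hyperthreading is on:

SELECT cpu_count,                                 --logical CPUs visible to SQL Server
       hyperthread_ratio,                         --logical CPUs per physical processor package
       cpu_count / hyperthread_ratio AS sockets,  --physical processor packages
       sqlserver_start_time,
       virtual_machine_type_desc
FROM sys.dm_os_sys_info;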
sys.dm_index_usage_stats – the whole concept of “usage”
This DMV is trying to tell you the truth, but almost nobody understands what it means. This DMV tells you how many times a query used an index in its execution plan– but it doesn’t tell you exactly how many times the index was accessed. Only the number of operators referencing it when the plan was run.
A simple example: I run a query that does a nested loop lookup to retrieve a value from the index PK_Address_AddressID. The query runs once, and in that one run it executes the lookup 30 times. This is counted as one “use” of PK_Address_AddressID, even though 30 lookups were executed.
If a single index appears more than once in an execution plan (due to a subquery, union, etc), then it’ll be counted once for each operator in the plan.
TL;DR: index_usage_stats does NOT show you the number of “reads” or “writes” an index is really serving up.
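If you query it anyway, just read the numbers with that definition of “use” in mind. A minimal sketch for the current database:

SELECT OBJECT_NAME(ius.object_id) AS table_name,
       i.name AS index_name,
       ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates
FROM sys.dm_index_usage_stats AS ius
JOIN sys.indexes AS i
    ON i.object_id = ius.object_id
    AND i.index_id = ius.index_id
WHERE ius.database_id = DB_ID()
ORDER BY ius.user_seeks + ius.user_scans + ius.user_lookups DESC;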
sys.dm_exec_trigger_stats – execution_count and last_execution_time columns
Which triggers on your server are running the most and using the most resources? The sys.dm_exec_trigger_stats DMV seems like it’s perfect to answer that question, but beware.
There’s a bug where execution_count increments due to completely unrelated actions; it wasn’t fixed until SQL Server 2012. (I’ve seen the info from this DMV be wonky on 2008R2, and I’ve validated I can’t reproduce this on SQL Server 2014, but I haven’t personally tested on 2012.)
Side note: isn’t it cool that Microsoft makes the Connect bug data public? I’m not sure that I ever would have figured out what contributes to inflating the execution counts on my own!
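On versions where the counts behave, the DMV itself is a quick query. A minimal sketch:

SELECT DB_NAME(ts.database_id) AS database_name,
       OBJECT_NAME(ts.object_id, ts.database_id) AS trigger_name,
       ts.execution_count,
       ts.last_execution_time,
       ts.total_worker_time
FROM sys.dm_exec_trigger_stats AS ts
ORDER BY ts.execution_count DESC;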
sys.sysindexes – Rowmodctr
It can sometimes be useful to estimate how many changes have occurred since statistics were last updated on a column or index. This gives you a quick way to guesstimate if those stats are maybe not-so-fresh.
Good news on this one– the column isn’t perfectly true, but Books Online has a great rundown of its issues. It lets you know that SQL Server doesn’t really use this counter for its own purposes, and that it’s only roughly accurate at best.
And then it lets you know that SQL Server doesn’t expose column modification counters, so this maybe-pretty-close-guesstimate counter is still perhaps better than nothing.
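With those caveats, a rough look at rowmodctr is still just a quick query away (a minimal sketch):

--Approximate modifications since the last statistics update, per index or statistic
SELECT OBJECT_NAME(si.id) AS table_name,
       si.name AS stat_or_index_name,
       si.rowmodctr
FROM sys.sysindexes AS si
WHERE OBJECTPROPERTY(si.id, 'IsUserTable') = 1
    AND si.rowmodctr > 0
ORDER BY si.rowmodctr DESC;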
I’m a huge fan of SQL Server’s performance counters, whether you’re querying them via sys.dm_os_performance_counters or looking at them with perfmon.exe.
But there’s a whole host of misleading and just plain outdated counters that will lead you astray. Learn which perf counters to beware in Jeremiah’s post on perfmon.
SQL Server’s Instrumentation is Great
And honestly, so is SQL Server’s documentation. The challenge is that there’s a massive amount to document– and Books Online can’t cover all the nuances of everything. Keep using those DMVs– just keep an eye out for the gotchas.
Sometimes you see someone with toilet paper sticking to their shoe, their shirt tucked into their underwear, or badly in need of a tissue to get rid of whatever that is hanging out of their nose. It’s only right to find a discreet way to let them know they should do some quick cleanup before they get completely embarrassed.
The same embarrassing “oops” exist for SQL Server. There are some practices and configurations which just make it look like the DBA may not have learned about this “internet” thing yet.
But, truthfully, it’s very simple to accidentally embarrass yourself. Here are the top 10 (well, 11) signs that your skills or your SQL Server are badly in need of an update:
- You have databases that haven’t been backed up recently
- Your last DBCC CheckDB is over 2 weeks old
- You have jobs that shrink files– or autoshrink enabled
- You truncate your transaction log
- You’re running the RTM version of SQL Server
- You use SQL 2000 style commands like DBCC SHOWCONTIG and DBCC DBREINDEX on later versions of SQL Server
- Priority Boost is enabled
- Power savings is enabled
- App accounts have SA or db_owner permissions
- It’s a 32-bit SQL Server
- Your tempdb has only one data file
Even if you know better, have you checked your servers to make sure that you don’t have toilet paper on your shoe without realizing it? Run our free sp_blitz™ script to check for most of these issues, and more.
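And if you want to spot-check the first item on that list by hand before running the script, msdb will tell you when each database last had a full backup (a minimal sketch; sp_blitz™ checks this and much more):

--When did each database last get a full backup?
SELECT d.name AS database_name,
       MAX(b.backup_finish_date) AS last_full_backup
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
    ON b.database_name = d.name
    AND b.type = 'D'                 --'D' = full database backup
WHERE d.name <> 'tempdb'
GROUP BY d.name
ORDER BY last_full_backup;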