Some people hate repetition. They want to avoid manual tasks at all costs.
I used to be part of the cult of automation. I thought that everything should be made efficient. Why should I do something twice, if I can figure out how to do it once and then just run a script or push a button?
I don’t feel that way anymore about a lot of things.
One thing I do often is write scripts to reproduce problems or demonstrate how something works. Very often, there’s pretty similar setup steps in these scripts. I’ve written the following code hundreds of times:
/* Recreate the test database with prejudice */ USE master; GO IF DB_ID ('IAMNotAProductionDatabase') IS NOT NULL BEGIN ALTER DATABASE IAMNotAProductionDatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE IAMNotAProductionDatabase; END GO CREATE DATABASE IAMNotAProductionDatabase; GO
It’s often useful to back up a demo database and restore it to avoid a long setup process. So I also frequently retype commands like this:
BACKUP DATABASE IAMNotAProductionDatabase TO DISK=N'S:\Backup\IAMNotAProductionDatabase.bak' WITH COMPRESSION, INIT; GO RESTORE DATABASE IAMNotAProductionDatabase FROM DISK=N'S:\Backup\IAMNotAProductionDatabase.bak' WITH NORECOVERY, REPLACE; GO /* I like to do this in a separate step because I am neurotic like that */ RESTORE DATABASE IAMNotAProductionDatabase WITH RECOVERY; GO
I could use snippets in SQL Server Management Studio to reduce the typing. I could use a third party tool which auto-completes statements.
But I don’t. I re-type it every time. I was slow at first, but I’m much faster at it now.
These are commands I care about. Sure, they’re simple, they’re basic, they’re nothing fancy. But they’re really core to my profession and my job. I like being able to type these commands quickly and not wonder for too long about the syntax. I like not having to worry if I remembered to set up snippets in whatever test environment I’m using, or if it has third party tools installed.
I’m terrible at memorizing things, and this helps.
I’m not against automation. For production monitoring and responding to incidents, setting up tools and processes that help you work faster and avoid having one person as a single point of failure is awesome. Just don’t apply that principal to everything in your work.
There are some tasks that are fine to repeat, if you want to get to know them by heart.
Brent says: to learn more, check out our list of tasks you probably shouldn’t automate, and read the comments too – lively discussion in there.
Jeremiah says: I’m a huge fan of automation when you understand what you’re automating, why you’re automating it, and what can go wrong when you’re not watching. Otherwise, you need to be really careful.
Erik says: Just about everything I’ve automated has stemmed from a .sql file full of trial, error, and documentation. Know your processes, and make sure someone is made aware if they start failing.
There are a few little gems built into SQL Server Management Studio that are easy to forget, but they sure can come in handy.
I sometimes generate a little formatted report of recent growths for a database, and how long they took. It’s really easy, and it’s built right in. Here’s how:
- Right click on a database name in Object Explorer
- Select Reports
- Then Standard Reports
- Then Disk Usage. Voila, the Disk Usage report appears!
- To see recent autogrowth or autoshrink events, click the little plus sign under the pie charts.
Here’s a 30 second video of that in action:
Where Does SQL Server Track Autogrowth Events?
This report reads from the default trace files. Those files can roll over pretty frequently in an active environment, so this only gives you recent events. You can get the same information with a query, or look for slow auto-growths using our free sp_Blitz® script.
When sp_Blitz® gives me a warning that something’s up, I like to get screenshots from this report to go the extra mile. It’s very nicely formatted and shows any troublesome latency pretty clearly.
Developers have struggled with a problem for a long time: how do I load up a new table, then quickly switch it in and replace it, to make it visible to users?
There’s a few different approaches to reloading data and switching it in, and unfortunately most of them have big problems involving locking. One method is this:
- Create a schema for staging
- When loading tables, create them in the staging schema
- When load is complete, drop the table from your production schema
- Then use ALTER SCHEMA TRANSFER to move the staging table into the production schema
This method sounds good, because items being worked on for staging are easy to identify in their own schema. There’s clearly going to be issues with concurrency between dropping the table and transferring in the new table, but the gamble is that will be fast and can be minimized.
ALTER SCHEMA TRANSFER locks the whole schema
The concurrency problem is worse than you might think. When you run ALTER SCHEMA TRANSFER, it locks up everything in that schema, so nobody can use any table– the lock is not limited to the table you’re transferring.
Not sure? Let’s take a look. First, we create database:
CREATE DATABASE LockTest; GO
The create a schema named LockMe in the database. Create a table named InnocentTable in the LockMe schema, and insert some rows.
USE LockTest; GO CREATE SCHEMA LockMe AUTHORIZATION dbo GO CREATE TABLE LockMe.InnocentTable ( i int identity not null ); GO INSERT LockMe.InnocentTable DEFAULT VALUES GO 10
Now create a new table in the dbo schema and add a few rows. We’re going to transfer this into the LockMe schema.
CREATE TABLE dbo.NewFriend ( i int identity not null ); GO INSERT dbo.NewFriend DEFAULT VALUES GO 20
Alright, now it’s time to do the transfer! Here we go. To make this easy to see, we’re going to leave a transaction open:
BEGIN TRAN ALTER SCHEMA LockMe TRANSFER NewFriend
In another session, we just try to query LockMe.InnocentTable. We didn’t run any commands against it, but this query gets blocked and can’t make any progress:
SELECT TOP 1 * FROM LockMe.InnocentTable
Looking at this in a third session with sp_WhoIsActive, we can see the problem:
Some Bad News: There Is No Awesome Way to Swap In a Table
The locking you see above is a very high level lock. The query can’t even get a schema stability lock on LockMe.InnocentTable. There’s no magic hint that will get you around it: adding a NOLOCK hint won’t work (it still needs a schema stability lock).
Optimistic locking won’t get you around it either — queries in SNAPSHOT and READ COMMITTED SNAPSHOT need to get a schema stability lock, too.
If You Must Swap In a Table, ALTER SCHEMA is the Greater Evil
Unless you really want to block absolutely everyone who might use the schema (and live with the consequences), this command will cause you pain.
Linked Servers are easy to set up in SQL Server, but they can have major problems with security and performance. Join Kendra to find out what may be wrong with your Linked Servers, and your options for fixing them.
Got limited time? We got your back. Check out the table of contents for the video below, if you want to skip around.
Linked Servers – Table of Contents
01:55 – Problem #1: sloppy security
03:28 – Problem #2: Questionable queries
04:24 – Hey, why are you filtering AFTER?
05:05 – The “Collation Compatible” Setting
06:34 – What can go wrong?
07:26 – Problem #3: Terrible Statistics
08:03 – Thankfully, Microsoft finally documented this
08:53 – But… we have a conflict!
09:53 – SQL Server 2012 SP1 offers relief
10:47 – Security option: this isn’t perfect
12:07 – Impersonation/Kerberos: this is harder than it looks
14:35 – Reduce security risks
15:56 – Learn more: BrentOzar.com/GO/Link
Want to brush up your skills on SQL Server over the next few months? We’ve got free webcasts coming your way from May to August! Don’t miss out, register today and get these free events on your calendar right away.
How to Configure Quorum for SQL Server Clusters (Sponsored by Idera Software)
Friday, May 22, 2015 11:00 AM – 12:00 PM Central
“Quorum” is incredibly important to keep your SQL Server online when you use Windows Failover Clustering or AlwaysOn Availability Groups. Join Microsoft Certified Master Kendra Little to learn what quorum is, how to see the current quorum configuration, how to change it, and guidelines for how to configure quorum in three real-world scenarios. Register for this free webcast here.
Triaging Failures in Availability Groups (Sponsored by Idera Software)
Friday, Jun 19, 2015 11:00 AM – 12:00 PM Central
You’ve heard a lot about Availability Groups, but have you ever had to troubleshoot a problem? Microsoft Certified Master Kendra Little will walk you through an example of a failure in an Availability Group and show you how she solved it. You’ll also learn the ingredients you need to set up a test environment and start solving problems yourself. Register for this free webcast now.
Field Medic’s Guide to Database Mirroring: Idera Geek Sync
Wednesday, August 19, 2015, 10:00 AM – 11 AM Central
SQL Server’s Database Mirroring feature is a powerful tool and much simpler to manage than Availability Groups. Join Microsoft Certified Master Kendra Little to learn when Database Mirroring is the right choice, common rescue scenarios, and special setup required for mirrors.
You’ll leave the webcast with tips and tricks to manage Database Mirroring. Click here to register for this free Webcast before spots run out.
Watch Now: Consolidating Indexes in SQL Server
Duplicate indexes can be a big problem for your performance and maintenance. In this free video, you’ll learn why they can be such a pain, and you’ll also get to step through three consolidation scenarios. Get out your pen and scratch paper to test your knowledge!
Head on over to the Idera site and watch the free video of the webcast today.
SQL Server DBAs often need to work with virtualization– and frequently need to work with multiple virtualization platforms. Many platforms have similar features, but they often have different names, and it’s hard to remember what everything’s called.
I’ve got a little cheat sheet for you for two of the platforms! (No offense, Xen fans, please don’t come after me.)
|What is it?||VMware name||Hyper-V name|
|Management Tools, Free||vSphere / web client||Hyper-V Manager (snap-in)|
|Management Tools, Paid||vCenter Server||System Center Virtual Machine Manager|
|Automatic migration to another host if one fails||High Availability (HA)||High Availability (configured via a role in a Windows Failover Cluster)|
|Moving a VM from host to host||vMotion||Live Migration|
|Moving a VM from one storage subsystem to another||Storage vMotion||Storage Migration|
|Automatic load balancing||DRS (Distributed Resource Scheduler)||Dynamic Optimization in VMM (Virtual Machine Manager, a part of System Center)|
|Rules to keep VMs from being too close||Affinity Rules for DRS||Availability Sets, part of Intelligent Placement in VMM|
Want to learn more about virtualizing SQL Server?
Have you ever wondered if you should be a SQL Server Database Administrator? If you think this might be the right career path for you, watch this video and take our quiz!
Got limited time or need to restart the quiz? No problem, we’ve got a table of contents with time indicators below.
“Should You Be a SQL Server DBA?” Table of Contents:
00:21 – Getting the wrong job stinks
00:44 – Getting the right job is awesome
01:44 – Get paper and pen, it’s quiz time!
02:11 – Question 1
02:38 – Question 2
03:09 – Question 3
03:22 – Question 4
03:39 – Question 5
03:59 – Question 6
05:39 – Question 7
06:41 – Question 8
07:26 – Time to score this thing!
07:33 – Scoring Question 1
08:45 – Scoring Question 2
10:25 – Scoring Question 3
11:38 – Scoring Question 4
12:49 – Scoring Question 5
13:43 – Scoring Question 6
15:37 – Scoring Question 7
17:25 – Scoring Question 8
19:28 – What does this all mean?
20:29 – Traits of a happy DBA
21:33 – Learn more at BrentOzar.com/go/Career
Brent says: if you think this quiz is just for people who aren’t in the DBA business already, think again. Kendra’s title is “SHOULD you be a DBA?” And, uh, some of us probably shouldn’t. (I love this stuff though.)
When you create an index in SQL Server with multiple columns, behind the scenes it creates a related multi-column statistic for the index. This statistic gives SQL Server some information about the relationship between the columns that it can use for row estimates when running queries.
But what if you use ‘included’ columns in the index? Do they get information recorded in the statistics?
Here’s my Index
To test, we’ll create an index with multiple key columns and included columns. I’m using a restored copy of the StackOverflow database.
/* FavoriteCount and LastEditDate aren't in the index key */ /* They'll just hang out in the leaf of the index */ CREATE INDEX ix_Posts_OwnerUserId_PostTypeId_INCLUDES ON dbo.Posts (OwnerUserId, PostTypeId) INCLUDE (FavoriteCount, LastEditDate); GO
First Stop: sys.Stats and Friends
The sys.stats DMV lets me query metadata out about my statistics, and I can join up to find out how many columns it has, what order they are in, when they were updated, and all sorts of info:
SELECT so.name, st.name, st.stats_id, sc.stats_column_id, c.name as column_name, st.auto_created, st.filter_definition, sp.last_updated, sp.rows, sp.rows_sampled, sp.steps, sp.modification_counter FROM sys.stats AS st JOIN sys.stats_columns AS sc on st.object_id=sc.object_id and st.stats_id=sc.stats_id JOIN sys.columns as c on sc.object_id=c.object_id and sc.column_id=c.column_id JOIN sys.objects as so on st.object_id=so.object_id CROSS APPLY sys.dm_db_stats_properties(st.object_id, st.stats_id) sp WHERE so.name='Posts' ORDER by so.name, st.stats_id, sc.stats_column_id; GO
Here’s the first few columns of the results:
This doesn’t show FavoriteCount or LastEdit date. My index key columns are in the statistic, in the same order they appear in the index.
What About DBCC SHOW_STATISTICS?
We can see a representation of the statistic with more information. We just plug in the name of the table and the statistic to DBCC SHOW_STATISTICS, like this:
DBCC SHOW_STATISTICS ('Posts', 'ix_Posts_OwnerUserId_PostTypeId_INCLUDES'); GO
This returns a few result sets. We get information about the header of the statistics, some density information about the relationship between the columns, and a histogram that shows the distribution of rows for the leading column in the statistic.
That Density Vector Has Three Rows!
Check that out, the density vector has a third row to help SQL Server estimate data distribution for OwnerUserId, PostTypeId, and the Id column. We didn’t specify the Id column anywhere in our index!
The Id column is the key of the clustered index on this table. SQL Server decided that it would also be interesting to know the density information of that as a third column.
What if the Index Was Unique? Would it Still Sneak In the Clustering Key?
The index above doesn’t have unique data, but let’s test a different index:
CREATE UNIQUE INDEX ix_VoteTypes_Name on dbo.VoteTypes (Name); GO DBCC SHOW_STATISTICS ('VoteTypes', 'ix_VoteTypes_Name'); GO
Here’s the density vector:
SQL Server decided that tracking the density information for the combination of Name and Id was still useful, even though this is a unique index and SQL Server didn’t have to sneak that column into the key of the index to make it unique behind the scenes.
Included Columns Weren’t in the Density Vector, but My Clustering Key Was!
In other words, included columns won’t be in the density vector unless they happen to also be in the key of your clustered index.
Want to write better queries?
- Learn How to Think Like SQL Server for $29
- Learn How to Tune Indexes and Speed Up SQL Server for $299
- Join us in person for a week of Advanced Query and Indexing
When you set up a Windows Server, you might be tempted to add a bunch of roles, just in case you want to use them in the future. Don’t do it: things can get weird.
If you add the Hyper-V role to Windows Server 2012 and higher, it changes some things about how Windows behaves. Your server is now a host. And the host starts to behave in some ways like a virtual machine — it’s now the “Parent Partition”, which is a “quasi-virtual machine”.
There are limits to the number of logical processors that any virtual machine partition might see, so if you have more than 64 logical processors, this can limit what’s available in the Parent Partition / host operating system, and really confuse you.
When SQL Server starts up, it checks to see if you’re virtualized. If the Hyper-V role is enabled, it will think it’s running in a VM. This could end up in a sticky situation if you have a licensing audit and hyper-threading is enabled — you’re supposed to license all the logical processors in a virtual machine regardless of hyper threading.
But good news, this is an easy problem to solve:
- Don’t install the Hyper-V role “just in case” on a SQL Server
- If someone else sets up your SQL Servers, check the installed roles before going live
- Don’t ever design an architecture where a Hyper-V host does double-duty and is also a SQL Server
- If you log onto a Hyper-V host, look at ‘Logical processors’ to see the total count (‘Host logical processors’ may be lower)
It sure can.
The lock risks of shrinking data files in SQL Server aren’t very well documented. Many people have written about shrinking files being a bad regular practice— and that’s totally true. But sometimes you may need to run a one-time operation if you’ve been able to clear out or archive a lot of data. And you might wonder what kind of pains shrinking could cause you.
One pain it could cause you is blocking and lock waits. It’s easy to reproduce, here’s how.
A Recipe for LCK_M_X Lock Waits and Blocking
I was able to easily reproduce blocking in my restored copy of the StackOverflow database on SQL Server 2014 by doing a few things.
I rebuilt a non-clustered index on the Posts table with this command:
ALTER INDEX [kl_Posts_OwnerUserId_CreationDate] on Posts REBUILD WITH (ONLINE=ON); GO
This ate up some space, and gave shrink something to do!
I then started a data modification and left an open transaction running:
BEGIN TRAN DELETE FROM dbo.Posts where Id=16;
And then, in another session, I started to shrink the data file:
DBCC SHRINKFILE ('StackOverflow'); GO
If I run that with @get_locks=1, I can see details on exactly what lock SHRINKFILE is after:
<Database name="StackOverflow"> <Locks> <Lock request_mode="S" request_status="GRANT" request_count="1" /> </Locks> <Objects> <Object name="(null)"> <Locks> <Lock resource_type="FILE.FILE_SHRINK" request_mode="X" request_status="GRANT" request_count="1" /> </Locks> </Object> <Object name="Posts" schema_name="dbo"> <Locks> <Lock resource_type="OBJECT" request_mode="IX" request_status="GRANT" request_count="1" /> <Lock resource_type="PAGE" page_type="*" index_name="kl_Posts_OwnerUserId_CreationDate" request_mode="X" request_status="GRANT" request_count="3" /> <Lock resource_type="PAGE" page_type="*" index_name="kl_Posts_OwnerUserId_CreationDate" request_mode="X" request_status="WAIT" request_count="1" /> </Locks> </Object> </Objects> </Database>
Yep, it’s waiting on a page lock in kl_Posts_OwnerUserId_CreationDate.
But How Many Locks did SHRINKFILE Really Need?
Maybe it needed just a few locks… right?
Well, I ran a trace against my session that did that shrinkfile command, and here’s the number of locks by database and type that it caught…
And this was a FAST run, except for that blocking!
Shrinking files is painful
It can actually be worse than this — on a few runs, I was able to reproduce blocking with SCH_M locks that could cause even nastier blocking chains.
Shrinking is slow, tedious work. It eats CPU, steals your IO, and always takes longer than you want. And if it has work to do and you’ve got users or processors modifying that data, lock contention ensues.
Worst of all, at the end of shrinking a database, nobody will ever thank you. Be careful out there, and if you must run it make sure you don’t leave it unattended.
Brent says: whoa, I never even thought about that. Maybe we should design indexes to improve DBCC SHRINKDB speed.
Erik says: if you do this more than once, I will come find you.