
Four years ago this month, we turned this blog into a consulting company.

We’ve had so much fun over the last few years, and we’re really proud of what we’ve built:

  • Over 500 SQL Critical Care® patients
  • Thousands of training video customers
  • Thousands of in-person training attendees at dozens of classes and pre-cons
  • Hundreds of free YouTube videos watched by over a million people
  • Almost 50,000 email subscribers
  • Several million web site viewers
  • A handful of awesome full-time employees

Let’s celebrate. Between now and Tuesday 4/28, discount code Our4thBirthday gets you 40% off online sales of our training videos. Enjoy!

Internals Cat populates the density vector

When you create an index in SQL Server with multiple columns, behind the scenes it creates a related multi-column statistic for the index. This statistic gives SQL Server some information about the relationship between the columns that it can use for row estimates when running queries.

But what if you use ‘included’ columns in the index? Do they get information recorded in the statistics?

Here’s my Index

To test, we’ll create an index with multiple key columns and included columns. I’m using a restored copy of the StackOverflow database.

/* FavoriteCount and LastEditDate aren't in the index key */
/* They'll just hang out in the leaf of the index */
CREATE INDEX ix_Posts_OwnerUserId_PostTypeId_INCLUDES 
    ON dbo.Posts (OwnerUserId, PostTypeId) 
    INCLUDE (FavoriteCount, LastEditDate);
GO

First Stop: sys.stats and Friends

The sys.stats catalog view lets me query metadata about my statistics, and I can join to related views to find out how many columns each statistic has, what order they’re in, when the stat was last updated, and all sorts of info:

SELECT so.name AS table_name,
    st.name AS stats_name,
    st.stats_id,
    sc.stats_column_id,
    c.name AS column_name,
    st.auto_created,
    st.filter_definition,
    sp.last_updated,
    sp.rows,
    sp.rows_sampled,
    sp.steps,
    sp.modification_counter
FROM sys.stats AS st
JOIN sys.stats_columns AS sc ON st.object_id = sc.object_id AND st.stats_id = sc.stats_id
JOIN sys.columns AS c ON sc.object_id = c.object_id AND sc.column_id = c.column_id
JOIN sys.objects AS so ON st.object_id = so.object_id
CROSS APPLY sys.dm_db_stats_properties(st.object_id, st.stats_id) AS sp
WHERE so.name = 'Posts'
ORDER BY so.name, st.stats_id, sc.stats_column_id;
GO

Here are the first few columns of the results:

[Screenshot: the statistic’s columns, in index key order]

This doesn’t show FavoriteCount or LastEditDate. My index key columns are in the statistic, in the same order they appear in the index.

What About DBCC SHOW_STATISTICS?

We can see a more detailed representation of the statistic. We just plug the names of the table and the statistic into DBCC SHOW_STATISTICS, like this:

DBCC SHOW_STATISTICS ('Posts', 'ix_Posts_OwnerUserId_PostTypeId_INCLUDES');
GO

This returns a few result sets. We get information about the header of the statistics, some density information about the relationship between the columns, and a histogram that shows the distribution of rows for the leading column in the statistic.
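Handy aside: if you only want one of those pieces, DBCC SHOW_STATISTICS can return a single result set. For example, to grab just the density vector:

DBCC SHOW_STATISTICS ('Posts', 'ix_Posts_OwnerUserId_PostTypeId_INCLUDES')
    WITH DENSITY_VECTOR;
GO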

[Screenshot: DBCC SHOW_STATISTICS output with stat header, density vector, and histogram]

That Density Vector Has Three Rows!

Check that out, the density vector has a third row to help SQL Server estimate data distribution for OwnerUserId, PostTypeId, and the Id column. We didn’t specify the Id column anywhere in our index!

The Id column is the key of the clustered index on this table. SQL Server decided that it would also be interesting to know the density information of that as a third column.
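Each “All density” value in the vector is roughly 1 divided by the number of distinct values in that column combination. You can sanity-check the second row yourself against the restored StackOverflow copy (a sampled statistic will only be approximately equal):

/* Approximates the density vector row for (OwnerUserId, PostTypeId) */
SELECT 1.0 / COUNT(*) AS approx_all_density
FROM (
    SELECT DISTINCT OwnerUserId, PostTypeId
    FROM dbo.Posts
) AS distinct_combos;
GO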

What if the Index Was Unique? Would it Still Sneak In the Clustering Key?

The index above doesn’t have unique data, but let’s test a different index:

CREATE UNIQUE INDEX ix_VoteTypes_Name ON dbo.VoteTypes (Name);
GO

DBCC SHOW_STATISTICS ('VoteTypes', 'ix_VoteTypes_Name');
GO

Here’s the density vector:

[Screenshot: the density vector for the unique index]

SQL Server decided that tracking the density information for the combination of Name and Id was still useful, even though this is a unique index and SQL Server didn’t have to sneak that column into the key of the index to make it unique behind the scenes.

Included Columns Weren’t in the Density Vector, but My Clustering Key Was!

In other words, included columns won’t be in the density vector unless they happen to also be in the key of your clustered index.



This is a cool SSMS trick I picked up a while back

Though not nearly as far back as I wish I had. It’s so cool I made a .gif of it in action. When you’re done putting your socks back on, I’ll tell you how it works.

[Animated .gif: the ALT+SHIFT trick in action]

This .gif was brought to you by the Cool SSMS Tricks Foundation, in association with Worldwide .gifs

Pure ALT+SHIFT magic.

Hold down both keys at the same time, and use your up and down arrow keys to navigate vertically. A thin grey line shows you exactly which rows you’ve grabbed. Then just type normally. I, uh, simulated a typing error to illustrate that you can also delete text this way. Yeah.

It really makes doing simple multi-line edits a breeze, especially if you don’t feel like setting up Excel formulas to do similar tasks. These are random Massachusetts zip codes, which is why they get a leading zero, and quotes.

Can you feel the efficiency?!

Kendra says: What in the…. holy cow, that actually works!

Brent says: I knew about that trick, but ZOMG PEOPLE THERE IS A PRODUCTIVITY GIF IN OUR BLOG


Every now and then, one of our clients considers adopting an alternative database platform – sometimes NoSQL, sometimes a brand new relational database. They’ll ask for our help in evaluating the vendor’s solution.

One of the best ways to do it is ask the database vendor to set us up on a WebEx or GoToMeeting with one of their happy customers. I want to hear from the technical folks, not management.

I run the call with six PowerPoint slides – here’s how:

Former Sysadmins

When you set up a Windows Server, you might be tempted to add a bunch of roles, just in case you want to use them in the future. Don’t do it: things can get weird.

If you add the Hyper-V role to Windows Server 2012 and higher, it changes some things about how Windows behaves. Your server is now a host. And the host starts to behave in some ways like a virtual machine — it’s now the “Parent Partition”, which is a “quasi-virtual machine”.

There are limits to the number of logical processors that any virtual machine partition might see, so if you have more than 64 logical processors, this can limit what’s available in the Parent Partition / host operating system, and really confuse you.

When SQL Server starts up, it checks to see if you’re virtualized. If the Hyper-V role is enabled, it will think it’s running in a VM. That could put you in a sticky situation during a licensing audit if hyper-threading is enabled — you’re supposed to license all the logical processors in a virtual machine, regardless of hyper-threading.
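Want to see what SQL Server thinks? On SQL Server 2008 R2 SP1 and later, you can ask it directly:

/* If virtual_machine_type_desc says HYPERVISOR on a box you
   believe is physical, go check the installed roles. */
SELECT virtual_machine_type_desc,
    cpu_count
FROM sys.dm_os_sys_info;
GO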

But good news, this is an easy problem to solve:

  • Don’t install the Hyper-V role “just in case” on a SQL Server
  • If someone else sets up your SQL Servers, check the installed roles before going live
  • Don’t ever design an architecture where a Hyper-V host does double-duty and is also a SQL Server
  • If you log onto a Hyper-V host, look at ‘Logical processors’ to see the total count (‘Host logical processors’ may be lower)

Yesterday, Microsoft announced the availability of SQL Server 2014 Service Pack 1, saying:

As part of our continued commitment to software excellence for our customers, this upgrade is available to all customers with existing SQL Server 2014 deployments via the download links below.

Yeah, about that commitment to software excellence.

This morning, the download is gone:

Notice: The SQL SSIS team has found an issue with SP1 installation if SSIS catalog is present in the SQL Server instance. They are currently investigating this issue including possible workarounds and fixes.

Oof – the term “possible workarounds and fixes” doesn’t sound good for those who jumped in and applied the patch. A commenter on the MS Data Platform Insider blog reported that it hosed the master database, breaking the instance in a way suspiciously similar to a bug in SQL Server 2012 SP2.

(And jeez, what is it with service packs lately? Remember the SQL 2012 SP1 100% CPU issue? I’m starting to think you’re safer with cumulative updates than with service packs.)

Remember, kids, don’t rush into patching. If your servers are mission critical, test in your staging environment first – staging is the DBA’s development. (No, your development environment isn’t staging – because your developers make their living in the dev environment, and if you broke that with SP1 yesterday, you’ll be slaving away today to get your dev instance back up and running.)
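Whichever environment you patch first, capture the build you’re on before and after, so you know exactly what’s running where:

/* Quick patch-level check to run before and after applying a service pack */
SELECT SERVERPROPERTY('ProductVersion') AS build_number,
    SERVERPROPERTY('ProductLevel') AS product_level,
    SERVERPROPERTY('Edition') AS edition;
GO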

Update: the fix is in. If you applied SP1, follow the instructions in this StackExchange post.


We’ve all had to move a database. For anything beyond a small database, it’s a painful process. While there are a number of ways you can move a database without downtime, they all involve a lot of extra work. What if there were something easier?

Every database is unique and precious.

What’s the Real Problem?

For most people, moving a database is a one-time thing – you’re moving to a new server. That’s pretty much it. I spent 5 minutes staring at my ceiling trying to figure out other reasons most people might move a database from one SQL Server to another.

For some people, though, databases could move on a regular basis. Think about it like this – you host a software as a service (SaaS) application. You’ve got one database per relative unit of work (customer, forum, cat picture collection). These databases can sit idle for months before, suddenly, there’s tremendous activity.

Normally, you wouldn’t make each SQL Server capable of handling peak load for any section of your SaaS. Instead, you would probably have a few servers ready to handle the worst load and move your busiest database over to those servers.

Typically, this is going to involve a small outage and setting up something like database mirroring or log shipping in order to move the database with minimal down time.

SMB to the Rescue!

Microsoft has supported housing SQL Server databases on a file share for a while now. If you’re not familiar with the idea, I recommend checking out the MSDN article Install SQL Server with SMB Fileshare as a Storage Option.

Moving a database with SMB is straightforward – detach the database from the old server, attach it to the new server, and bring it online. For most of us, that quick change is much preferred to other methods of moving a database.
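Here’s a sketch of that move. The share path and database name are made up, so substitute your own (and note that both servers’ service accounts need rights to the share):

/* On the old server: detach. The files stay put on the SMB share. */
EXEC sp_detach_db @dbname = N'SalesApp';
GO

/* On the new server: attach the same files straight from the share. */
CREATE DATABASE SalesApp
ON (FILENAME = N'\\fileserver\sqldata\SalesApp.mdf'),
   (FILENAME = N'\\fileserver\sqldata\SalesApp_log.ldf')
FOR ATTACH;
GO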

SMB File Shares and You

Although housing databases on SMB file shares is a relatively new option, it’s worth considering for new SQL Servers. SMB file shares make it easy to move databases between SQL Servers. If you want to learn more, check out the MSDN article on using SMB and SQL Server.

Brent says: If your storage area network (SAN) is using older connectivity methods like 2Gb or 4Gb FC, and your Ethernet is 10Gb, you might even see a performance boost.


SELECT…INTO is one of my favorite SQL Server features.

It’s great for creating table skeletons with false WHERE clauses (1=2), moving a real table to a staged/temp location for testing, etc.
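If you haven’t seen the skeleton trick, here’s the idea – the always-false WHERE clause copies the columns and data types, but zero rows (and, as we’ll see, no indexes or constraints):

/* Clone the structure of dbo.Posts into a temp table, no rows */
SELECT *
INTO #PostsSkeleton
FROM dbo.Posts
WHERE 1 = 2;
GO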

In SQL Server 2014

It acquired the ability to go parallel, which is pretty neat, but that’s not what we’re talking about here.

It has some limitations

Chief among them is this:

Indexes, constraints, and triggers defined in the source table are not transferred to the new table, nor can they be specified in the SELECT…INTO statement. If these objects are required, you can create them after executing the SELECT…INTO statement.

Which is… sort of true. There’s a trick, and I’ll show you a quick example here with another of my favorite things: a Numbers table.

;WITH E1(N) AS (
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL  ),                          
E2(N) AS (SELECT NULL FROM E1 a, E1 b, E1 c, E1 d, E1 e, E1 f, E1 g, E1 h, E1 i, E1 j),
Numbers AS (SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM E2)
SELECT [N].[N]
INTO #NumbersTest
FROM [Numbers] N

ALTER TABLE #NumbersTest ADD CONSTRAINT [PK_Numbers] 
PRIMARY KEY CLUSTERED (N) WITH (FILLFACTOR = 100)

Trying to add the PK constraint here fails, because the column is NULLable

Msg 8111, Level 16, State 1, Line 37
Cannot define PRIMARY KEY constraint on nullable column in table '#NumbersTest'.
Msg 1750, Level 16, State 0, Line 37
Could not create constraint or index. See previous errors.

We can verify this by looking at the table metadata:

SELECT [columns].[name], [columns].[is_nullable]
FROM tempdb.sys.columns 
WHERE [object_id] = OBJECT_ID(N'tempdb..#NumbersTest');


name     is_nullable
N           1

So how do we fix this? We could alter the table, but that won’t leave us with the lasting satisfaction of proving BOL wrong on a technicality.
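(For the record, the boring in-place fix would look like this – it works because no NULLs actually made it into the column, and ROW_NUMBER() gave us bigints:)

/* The boring alternative: change the column's nullability in place */
ALTER TABLE #NumbersTest ALTER COLUMN N BIGINT NOT NULL;

Instead, we’ll adjust our code a bit, and try again.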

;WITH E1(N) AS (
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL  ),                          
E2(N) AS (SELECT NULL FROM E1 a, E1 b, E1 c, E1 d, E1 e, E1 f, E1 g, E1 h, E1 i, E1 j),
Numbers AS (SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM E2)
SELECT 
ISNULL([N].[N], 0) AS N  /* <--- The magic is here! */
INTO #NumbersTest_IN
FROM [Numbers] N

ALTER TABLE #NumbersTest_IN ADD CONSTRAINT [PK_Numbers] 
PRIMARY KEY CLUSTERED (N) WITH (FILLFACTOR = 100)

This time, with the addition of an ISNULL check on the column, it ‘inherits’ NOT NULLability, and the PK constraint adds successfully. We can verify that by checking the table metadata, if you don’t believe me:

name     is_nullable
N           0

Note that this same behavior does not occur if you replace ISNULL() with COALESCE().
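You can see that for yourself – COALESCE expands to a CASE expression under the hood, and the result is still considered nullable even with a non-null constant in the mix. A quick sketch (the temp table name here is made up):

/* Same pattern with COALESCE - the resulting column stays nullable */
SELECT COALESCE([N].[N], 0) AS N
INTO #NumbersTest_CO
FROM #NumbersTest AS N;

SELECT [columns].[name], [columns].[is_nullable]
FROM tempdb.sys.columns
WHERE [object_id] = OBJECT_ID(N'tempdb..#NumbersTest_CO');
/* is_nullable comes back as 1 */

DROP TABLE [#NumbersTest_CO];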

And with that, you can clean up the test tables. Or keep them around. Do some math. Whatever blows your hair back.

Got a favorite use for a numbers table outside of string splitting or doing date math? Let me know in the comments! I may write a follow up.

DROP TABLE [#NumbersTest]
DROP TABLE [#NumbersTest_IN]

Brent says: Wow. That is a really slick trick.

Kendra says: I know some people have bias against SELECT INTO because it seems sloppy and you have to go the extra mile to get the right types, but it can be a great tool. Don’t rule it out.

Jeremiah says: I really like this method – it’s a quick way to copy tables without using other cumbersome techniques.


At SQLbits last month, I presented a new session: Cool Story, Bro – The DBAreactions Guide to SQL Server 2014. I wanted to have some fun while educating folks about the surprise gotchas of the newest features.

Here’s the In-Memory OLTP (Hekaton) section of the session:

Our sp_Blitz® has long warned you if Hekaton is in use, and its Hekaton detail page shows some of the limitations.


It sure can.

The lock risks of shrinking data files in SQL Server aren’t very well documented. Many people have written about shrinking files being a bad regular practice – and that’s totally true. But sometimes you may need to run a one-time shrink if you’ve been able to clear out or archive a lot of data. And you might wonder what kind of pain shrinking could cause you.

One pain it could cause you is blocking and lock waits. It’s easy to reproduce, here’s how.

A Recipe for LCK_M_X Lock Waits and Blocking

I was able to easily reproduce blocking in my restored copy of the StackOverflow database on SQL Server 2014 by doing a few things.

I rebuilt a non-clustered index on the Posts table with this command:

ALTER INDEX [kl_Posts_OwnerUserId_CreationDate] on Posts REBUILD WITH (ONLINE=ON);
GO

This ate up some space, and gave shrink something to do!

I then started a data modification and left an open transaction running:

BEGIN TRAN
DELETE FROM dbo.Posts where Id=16;

And then, in another session, I started to shrink the data file:

DBCC SHRINKFILE ('StackOverflow');
GO

Behold, Blocking!

Shrink starts up and slogs its way through things, and soon enough, lock waits appear. This view is from Adam Machanic’s sp_WhoIsActive:

[Screenshot: sp_WhoIsActive showing LCK_M_X lock waits]

I’ll take LCK_M_X Waits for 15,000, Alex!

If I run that with @get_locks=1, I can see details on exactly what lock SHRINKFILE is after:

<Database name="StackOverflow">
  <Locks>
    <Lock request_mode="S" request_status="GRANT" request_count="1" />
  </Locks>
  <Objects>
    <Object name="(null)">
      <Locks>
        <Lock resource_type="FILE.FILE_SHRINK" request_mode="X" request_status="GRANT" request_count="1" />
      </Locks>
    </Object>
    <Object name="Posts" schema_name="dbo">
      <Locks>
        <Lock resource_type="OBJECT" request_mode="IX" request_status="GRANT" request_count="1" />
        <Lock resource_type="PAGE" page_type="*" index_name="kl_Posts_OwnerUserId_CreationDate" request_mode="X" request_status="GRANT" request_count="3" />
        <Lock resource_type="PAGE" page_type="*" index_name="kl_Posts_OwnerUserId_CreationDate" request_mode="X" request_status="WAIT" request_count="1" />
      </Locks>
    </Object>
  </Objects>
</Database>

Yep, it’s waiting on a page lock in kl_Posts_OwnerUserId_CreationDate.

But How Many Locks did SHRINKFILE Really Need?

Maybe it needed just a few locks… right?

Well, I ran a trace against my session that did that shrinkfile command, and here’s the number of locks by database and type that it caught…

[Screenshot: trace results showing thousands of locks taken during the shrink]

Lock mode: XTreme

And this was a FAST run, except for that blocking!

Shrinking files is painful

It can actually be worse than this — on a few runs, I was able to reproduce blocking with SCH-M locks that could cause even nastier blocking chains.

Shrinking is slow, tedious work. It eats CPU, steals your IO, and always takes longer than you want. And if it has work to do while you’ve got users or processes modifying that data, lock contention ensues.

Worst of all, at the end of shrinking a database, nobody will ever thank you. Be careful out there, and if you must run it make sure you don’t leave it unattended.
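If you absolutely must shrink, consider working toward your target size in small steps, so each call finishes (and releases its locks) quickly. A sketch, with a made-up target:

/* Shrink a little at a time instead of all at once.
   50000 (MB) is a hypothetical target - step down gradually. */
DBCC SHRINKFILE ('StackOverflow', 50000);
GO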

Brent says: whoa, I never even thought about that. Maybe we should design indexes to improve DBCC SHRINKDATABASE speed.

Erik says: if you do this more than once, I will come find you.
