sp_BlitzIndex® just can’t keep a secret. That’s a feature, not a bug. I wrote sp_BlitzIndex® specifically so that it would rat out some things that SQL Server is doing behind the scenes to secretly help you out– because it’s really better if you know about it.
What are secret columns?
SQL Server needs a unique way to navigate to every row in an index. It also needs to maintain a relationship between every nonclustered index and its base table. In order to accomplish these two tasks, SQL Server always makes sure to add one or more columns to a nonclustered index. If you don’t add the columns yourself, it goes ahead and does it secretly. (“Secret columns” is a term I made up, because a term didn’t exist yet.)
In most cases, people create nonclustered indexes on tables that have clustered indexes. When this happens, SQL Server will make sure that all of the columns from the clustering key of the table are ALSO in the nonclustered index. Exactly how it happens is a little complicated. Here’s a quick summary of the rules involved:
- If the nonclustered index is NOT unique, SQL Server makes sure that the columns from the clustering key make it unique behind the scenes
- This is done by promoting the clustered key columns to the key of the nonclustered index (otherwise they can be “included” columns)
- If the clustered index columns are NOT unique, an additional secret column called a “uniquifier” will have already been added to it– and that sneaks into the nonclustered index too
This made my head spin around the first time I tried to sort it all out. Please don’t try to memorize that!
sp_BlitzIndex® helps make this easier
I never expect anyone to remember this stuff. I built sp_BlitzIndex® so that it would give you the information you need about what’s going on in a specific index without having to memorize a whole lot of rules.
Let’s take a look at a specific example. The AdventureWorks sample database has a table named Production.WorkOrderRouting. We can take a look at the indexes, missing indexes, columns, and foreign keys on this table in sp_BlitzIndex® by running the command:
This shows us that the table has two indexes. One of these is the clustered index:
The definition indicates this with the code “[CX]”. It’s also a Primary Key, indicated by the “[PK]”. Primary keys are always unique in SQL Server, so this is a Clustered PK (which is by definition unique). Because this is the clustered index we know that it will contain EVERY column in the index all of the data. The data is sorted by three key columns: WorkOrderID, ProductID, OperationSequence.
Where are the secret columns?
In this case, the secret columns are all in the nonclustered index. The nonclustered index looks very narrow and simple at first. It has only one key column, ProductID:
Scrolling over, sp_BlitzIndex® claims that things aren’t so simple. There are three secret columns in this nonclustered index, and they have been put in its key:
Those are the exact three columns from our clustered index! (WorkOrderID, ProductID, OperationSequence)
Exactly how many columns are in this nonclustered index?
You may have noticed that the ProductID column appears twice in sp_BlitzIndex® for this nonclustered index. It was defined as a key column, but now we’re saying that it’s a secret column as well.
Question: Does that mean the column is stored twice? No, the column is NOT stored twice. The effective definition of this nonclustered index is that it has THREE columns in it (not just one, as the creator might have thought). The columns are in the following order: ProductID, WorkOrderID, OperationSequence
SQL Server will only ever add columns from the clustering key to an index if you have not specified them in the index. It will never have to store a column twice. I repeat, there is no penalty if you list a column from the clustered index in your nonclustered index. (Yay!)
To determine the columns that will be in an index, look first at the index definition. Then add in the “secret” columns which haven’t already been included.
Why doesn’t sp_BlitzIndex® remove “duplicate” secret columns?
Well, it’s complicated. I made this design choice for a few reasons:
- I wanted to preserve the definition as the index was created
- The definition may state a column as an include that is made into a key column by SQL Server secretly. The easiest way to show this is to show both
- I wanted to show the secret columns on their own so it would be more clear that they will stay in the nonclustered index even if you remove them from the stated definition
A more complex example
Maybe you’re not sure if I’m telling the truth about these secret columns. The good news is, I can prove it. First, let’s look at a more complex example.
We’re sticking with the Production.WorkOrderRouting table. What happens if we add a more complex nonclustered index? This new nonclustered index has two different columns from the clustering key — one as a key column, and one as an included column. I create the new nonclustered index with this command:
CREATE INDEX ixtest
( PlannedCost, ProductID, ActualCost)
INCLUDE (ActualEndDate, OperationSequence)
sp_BlitzIndex® shows that the new index is assigned IndexID=4 for the table.
Definition: [3 KEYS] PlannedCost, ProductID, ActualCost
[2 INCLUDES] ActualEndDate, OperationSequence
Secret Columns: [3 KEYS] WorkOrderID, ProductID, OperationSequence
There’s some overlap here. To interpret this, we know:
- The key columns in the user definition will be present in the order specified
- Secret columns are being added to the key of the nonclustered index in this case. So any secret column that was listed as an “included” column of the user definition will be promoted by SQL Server.
Based on the rules, the functional definition of this index is:
5 Keys = PlannedCost, ProductID, ActualCost, WorkOrderID, OperationSequence
1 Include = ActualEndDate
We effectively had WorkOrderID added to the key (it wasn’t listed), and OperationSequence promoted to the key from the include.
Let’s prove it!
To prove it we’re going to use a handy new DMV in SQL Server 2012 and the built in DBCC PAGE command.
These are both undocumented tools. Please do not run this in production. Seriously, I’m not kidding. There are bugs with this stuff sometimes, OK? Why risk it? This is suitable for AdventureWorks on a test instance far from production, your boss, and your resume.
First, I’m going to find a single page from the leaf level of this new nonclustered index. I love the new sys.dm_db_database_page_allocations DMV because it makes this super easy, as long as I know that my new index has ID=4 (I got that from sp_BlitzIndex® earlier):
SELECT top 1 allocated_page_page_id,
OBJECT_ID('Production.WorkOrderRouting'), /*object_id */
4, /*index id */
NULL, /*partition id */
page_type=2 /* index page*/
and page_level=0 /*leaf of the index */
ORDER BY page_level DESC, next_page_page_id DESC;
Next, I have SQL Server dump the contents of that index page to my screen. I do that by using this (not officially documented) commands:
--This trace flag prints DBCC PAGE data to our screen.
--Otherwise it'll just go to the error log.
DBCC TRACEON (3604);
--Plug in the page
--I'm looking in file #1
--Dump style = 3
And VOILA! I get back a dump of all sorts of information about what’s on that single page of data– including exactly what’s being used as key and includes for that index!
Were we right?
Sure enough we can validate our theory. The “secret columns” put a new column in our key and promoted one column from the include.
The good news is the order of the keys we specified was preserved, and no column had to be stored twice. That’s awesome!
The bad news is that SQL Server didn’t tell us about this on our own. We had to use a special tool to figure it out. And while this is somewhat of a party trick, it’s probably not something you can trot out to fascinate the crowd at your next family reunion.
Frequently asked questions
I’m overwhelmed! What if I don’t get this? Breathe deep, I promise it’s OK. If you’re just getting started, that’s understandable. Just remember that the columns in “secret columns” will be added to the nonclustered index if they aren’t already there. That’s all you really need to get going.
Is it good to have all those columns in the clustering key? Not necessarily. You want the clustering key to be narrow– having a lot of columns or even a few large columns in it can clearly bloat all your indexes on the table and cause lots of problems. Before you get too comfy, remember it’s not always bad to have more than one narrow column in a clustering key, particularly if they create a unique, static key that is useful for range queries and has a healthy write pattern.
Isn’t it actually more complicated than this?
Isn’t this complicated enough? Sure, “Row Identifiers” (“RIDs”) come into play for heaps and there’s those pesky “uniquifiers”. That’s a bit much to explain in one post. (If they impact you, sp_BlitzIndex® will show you those, too.)
How do I get sp_BlitzIndex®? Download sp_BlitzIndex® here.