i’ve seen this happen
Especially if you’ve just inherited a database, or started using a vendor application. This can also be the result of inexperienced developers having free reign over index design.
Unless you’re running regular health checks on your indexes with something like our sp_BlitzIndex® tool, you might not catch immediately that you have a heap of HEAPs in your database.
You may be even further flummoxed upon finding that someone thoughtfully created Primary Keys with Nonclustered Indexes on them, yet no Clustered Indexes. Unless the original developer is still around, the intent may not be clear.
Using this code snippet, you can quickly identify tables that were created with Nonclustered Indexes on the Primary Key, and no Clustered Index. Another way to spot this potential issue might be looking for RID lookups, or Table Scans in your Plan Cache. Wide Nonclustered Indexes may also be present to compensate for the lack of a good Clustered Index.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; SELECT QUOTENAME(SCHEMA_NAME([t].[schema_id])) + '.' + QUOTENAME([t].[name]) AS [Table] , QUOTENAME(OBJECT_NAME([kc].[object_id])) AS [IndexName] , CAST((SUM([a].[total_pages]) * 8 / 1024.0 ) AS DECIMAL(18,2))AS [IndexSizeMB] FROM [sys].[tables] [t] INNER JOIN [sys].[indexes] [i] ON [t].[object_id] = [i].[object_id] INNER JOIN [sys].[partitions] [p] ON [i].[object_id] = [p].[object_id] AND [i].[index_id] = [p].[index_id] INNER JOIN [sys].[allocation_units] [a] ON [a].[container_id] = CASE WHEN [a].[type] IN ( 1, 3 ) THEN [p].[hobt_id] WHEN [a].[type] = 2 THEN [p].[partition_id] END INNER JOIN [sys].[key_constraints] AS [kc] ON [t].[object_id] = [kc].[parent_object_id] WHERE ( [i].[name] IS NOT NULL AND OBJECTPROPERTY([kc].[object_id], 'CnstIsNonclustKey') = 1 --Unique Constraint or Primary Key can qualify AND OBJECTPROPERTY([t].[object_id], 'TableHasClustIndex') = 0 --Make sure there's no Clustered Index, this is a valid design choice AND OBJECTPROPERTY([t].[object_id], 'TableHasPrimaryKey') = 1 --Make sure it has a Primary Key and it's not just a Unique Constraint AND OBJECTPROPERTY([t].[object_id], 'IsUserTable') = 1 --Make sure it's a user table because whatever, why not? We've come this far ) GROUP BY [t].[schema_id] , [t].[name] , OBJECT_NAME([kc].[object_id]) ORDER BY SUM([a].[total_pages]) * 8 / 1024.0 DESC; |
There are times when heaps are a valid choice
ETL or staging tables are the most common examples of when raw insert throughput is necessary, and a Clustered Index may not be.
But when tables are designed to be queried against and lack a Clustered Index, it’s usually a problem to be fixed.
Brent says: and remember, kids, SQL Server won’t suggest a clustered index.
Kendra says: if you think this probably didn’t happen to you, that’s a good sign you should doublecheck.
31 Comments. Leave new
My third party production database returned 5,733 rows, 4,199 rows that have 0.000 index size! The largest index of the 1,534 remaining rows was 10,243.109375GB…..
Hi Eric,
thx for the script.
Column “[IndexSizeGB]” => [IndexSizeMB] (maybe a typo?) 😉
Good eye! I started out looking at the size in GB. Fixing now.
Thanks for the article,
But why so complicated? you also are aggregating the size for all indexes in the table.
Unless that’s intended (get index name, but size for all NC’s), to identify HEAPs with NC PK, I’d go for simpler
SELECT QUOTENAME(OBJECT_SCHEMA_NAME(ix.object_id)) + ‘.’ +
QUOTENAME(OBJECT_NAME(ix.object_id)) AS object_name
, QUOTENAME(ix.name) AS index_name
FROM sys.indexes AS ix
LEFT JOIN (SELECT object_id FROM sys.indexes AS cix WHERE cix.type_desc = ‘CLUSTERED’) AS cix
ON cix.object_id = ix.object_id
WHERE ix.type_desc = ‘NONCLUSTERED’
AND ix.is_primary_key = 1
AND cix.object_id IS NULL
Then you can add partitions and allocation_units to get real size of 1 NC index which is also PK
If there’re are columns called [type_desc] and [is_primary_key] why not used them?
Nice article though, pointing something might be overlooked
Wouldn’t the row count be at least as useful as the IndexSizeMB?
IndexRowCount = ISNULL( ( SELECT SUM(p.rows)
FROM sys.partitions p2
WHERE p.index_id = i.index_id
AND kc.[object_id] = p.object_id
), 0
)
Yeah, my third-party app has about 300 tables. There are approximately five clustered indexes, and most of those are on GUID values.
Of course, there are lots of non-clustered indexes and FK relationships.
How lucky of you that your third-party app has FKs. 🙂
One of the ones we use has 138 tables and no FKs. It is an award-winning and market-leading application in its category.
that’s nothing man!
Just seen one of mines and has 3410 HEAPs with no PKs, hence no FKs whatsoever!
Sorry, the point I didn’t make clear… The vendor is okay with us adding clustered indexes, which is a project I’m working on.
Unfortunately, when one table has, for example, 46 FKs that need to be dropped and recreated as part of the process of clustering the table … 😐
Aaron Bertrand actually has a great script for dropping and recreating FKs:
https://www.mssqltips.com/sqlservertip/3347/drop-and-recreate-all-foreign-key-constraints-in-sql-server/
Okay, that’s awesome. Thanks!
My primary application that I work with has very large databases with 100s of tables. Some of them in the past were created without primary keys.
How can I get a list of tables that LACK a primary key? I really don’t want to troll through my entire database to check each table to see if it has a primary key.
Stewart – I use this script to hunt down missing PK’s
SELECT OBJECT_SCHEMA_NAME( object_id ) as SchemaName, name AS TableName
FROM sys.tables
WHERE OBJECTPROPERTY(object_id,’tablehasprimaryKey’) = 0
ORDER BY SchemaName, TableName ;
you can try
SELECT * FROM sys.tables
WHERE object_id NOT IN (SELECT object_id FROM sys.indexes WHERE is_primary_key = 1)
Cheers
I recent inherited a mission-critical database full of heaps, and GUIDs for keys. They said ‘we have performance problems…” – the blitz tool was a great help – runs much better now
Eric,
In the system I checked, this line seems unnecessary (I get the same results with or without it):
AND OBJECTPROPERTY(kc.[object_id], ‘CnstIsNonclustKey’) = 1
What situation will this line filter out that isn’t taken care of by the other filters?
You may get different results if you have a column with a constraint that is not indexed.
Eric, using single quotes after — comment dashes?
It’s OK for code snippets, but careful in Production code.
I’ve had SQL 2008 views throw horrible exceptions after I added a header description comment with single quotes, then tried scripting out an ALTER VIEW from SSMS…
Do you normally use snippets of code from the internet in production? Our number is in the side bar 🙂
Hah! Certainly not, I only use code by email spam bots and IRC strangers 🙂
My comment was just a heads up to one of many SSMS bugs that might catch one out.
The plus/minus side is that it’s got me typing propely expandid Inglish in my TC-cool coments 😉
The join between sys.partitions and sys.allocation_units is dependent upon the allocation unit type (heap, clustered index or nonclustered index) and should be made like so
INNER JOIN sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN sys.allocation_units au ON
CASE
WHEN au.[type] IN (1,3) THEN p.hobt_id
WHEN au.[type] = 2 THEN p.partition_id
END
Regards Perry
Huh. I’ve never seen it done like that before. That’s neat though, I’ll have to mess with it. Thanks, Perry.
Lol well at least no one spotted my mistake.
The join is based on the allocation unit type where 1 and 3 are row data and row overflow, type 2 is LOB based. Check books online for more detail
Regards Perry
I usually open this thing up and zoom in a couple million percent:
http://www.microsoft.com/en-us/download/details.aspx?id=39083
So I have this table where the unique column is the container bar code. It’s 48 characters wide and not ascending. I don’t want to use this as the clustered index as that then gives all of the other indexes a 48 character pointer in place of the 8 byte RID. Yes, there are many thousands of rows.
I could put an auto-number column in there but it would mean nothing and it would not help in avoiding index bloat. Sometimes all you can do is all you can do.
Is the issue tables with non-clustered primary keys, but no clustered index?
Seems like there is some context missing here.
Robin – correct, that’s exactly what the title said. 😀
Stumbled upon 2 tables in MSDB.
[dbo].[sysssispackagefolders]
[dbo].[sysssispackages]
Do you think this is design?
Dennis – are those system tables? If so, I wouldn’t worry about them since you can’t change them.
Sure are.
Sometimes it’s hard to be convinced of “Best practices” when Microsoft go against them in their own structure. Just gotta try to love them for 15 mins then hate them for 15 mins. 🙂
Hi Brent,
Maybe better to filter Memory Optimized tables in your query since clsutered index is not supported there and they are shown in the result