Finding Tables with Nonclustered Primary Keys and no Clustered Index

Last Updated 7 years ago

i’ve seen this happen

Especially if you’ve just inherited a database, or started using a vendor application. This can also be the result of inexperienced developers having free reign over index design.

Unless you’re running regular health checks on your indexes with something like our sp_BlitzIndex® tool, you might not catch immediately that you have a heap of HEAPs in your database.

You may be even further flummoxed upon finding that someone thoughtfully created Primary Keys with Nonclustered Indexes on them, yet no Clustered Indexes. Unless the original developer is still around, the intent may not be clear.

Using this code snippet, you can quickly identify tables that were created with Nonclustered Indexes on the Primary Key, and no Clustered Index. Another way to spot this potential issue might be looking for RID lookups, or Table Scans in your Plan Cache. Wide Nonclustered Indexes may also be present to compensate for the lack of a good Clustered Index.

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT QUOTENAME(SCHEMA_NAME([t].[schema_id])) + '.' + QUOTENAME([t].[name]) AS [Table] ,
 QUOTENAME(OBJECT_NAME([kc].[object_id])) AS [IndexName] ,
 CAST((SUM([a].[total_pages]) * 8 / 1024.0 ) AS DECIMAL(18,2))AS [IndexSizeMB]
FROM [sys].[tables] [t]
INNER JOIN [sys].[indexes] [i]
ON [t].[object_id] = [i].[object_id]
INNER JOIN [sys].[partitions] [p]
ON [i].[object_id] = [p].[object_id]
 AND [i].[index_id] = [p].[index_id]
INNER JOIN [sys].[allocation_units] [a]
ON [a].[container_id] = CASE WHEN [a].[type] IN ( 1, 3 ) THEN [p].[hobt_id]
 WHEN [a].[type] = 2 THEN [p].[partition_id]
 END
INNER JOIN [sys].[key_constraints] AS [kc]
ON [t].[object_id] = [kc].[parent_object_id]
WHERE (
 [i].[name] IS NOT NULL
 AND OBJECTPROPERTY([kc].[object_id], 'CnstIsNonclustKey') = 1 --Unique Constraint or Primary Key can qualify
 AND OBJECTPROPERTY([t].[object_id], 'TableHasClustIndex') = 0 --Make sure there's no Clustered Index, this is a valid design choice
 AND OBJECTPROPERTY([t].[object_id], 'TableHasPrimaryKey') = 1 --Make sure it has a Primary Key and it's not just a Unique Constraint
 AND OBJECTPROPERTY([t].[object_id], 'IsUserTable') = 1 --Make sure it's a user table because whatever, why not? We've come this far
 )
GROUP BY [t].[schema_id] ,
 [t].[name] ,
 OBJECT_NAME([kc].[object_id])
ORDER BY SUM([a].[total_pages]) * 8 / 1024.0 DESC;

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT QUOTENAME(SCHEMA_NAME([t].[schema_id])) + '.' + QUOTENAME([t].[name]) AS [Table] ,

QUOTENAME(OBJECT_NAME([kc].[object_id])) AS [IndexName] ,

CAST((SUM([a].[total_pages]) * 8 / 1024.0 ) AS DECIMAL(18,2))AS [IndexSizeMB]

FROM [sys].[tables] [t]

INNER JOIN [sys].[indexes] [i]

ON [t].[object_id] = [i].[object_id]

INNER JOIN [sys].[partitions] [p]

ON [i].[object_id] = [p].[object_id]

AND [i].[index_id] = [p].[index_id]

INNER JOIN [sys].[allocation_units] [a]

ON [a].[container_id] = CASE WHEN [a].[type] IN ( 1, 3 ) THEN [p].[hobt_id]

WHEN [a].[type] = 2 THEN [p].[partition_id]

END

INNER JOIN [sys].[key_constraints] AS [kc]

ON [t].[object_id] = [kc].[parent_object_id]

WHERE (

[i].[name] IS NOT NULL

AND OBJECTPROPERTY([kc].[object_id], 'CnstIsNonclustKey') = 1 --Unique Constraint or Primary Key can qualify

AND OBJECTPROPERTY([t].[object_id], 'TableHasClustIndex') = 0 --Make sure there's no Clustered Index, this is a valid design choice

AND OBJECTPROPERTY([t].[object_id], 'TableHasPrimaryKey') = 1 --Make sure it has a Primary Key and it's not just a Unique Constraint

AND OBJECTPROPERTY([t].[object_id], 'IsUserTable') = 1 --Make sure it's a user table because whatever, why not? We've come this far

)

GROUP BY [t].[schema_id] ,

[t].[name] ,

OBJECT_NAME([kc].[object_id])

ORDER BY SUM([a].[total_pages]) * 8 / 1024.0 DESC;

There are times when heaps are a valid choice

ETL or staging tables are the most common examples of when raw insert throughput is necessary, and a Clustered Index may not be.

But when tables are designed to be queried against and lack a Clustered Index, it’s usually a problem to be fixed.

Brent says: and remember, kids, SQL Server won’t suggest a clustered index.

Kendra says: if you think this probably didn’t happen to you, that’s a good sign you should doublecheck.

New Cardinality Estimator, New Missing Index Requests

Out of Office: Time For the Brent Ozar Unlimited Retreat

31 Comments. Leave new

Greg Smith
July 22, 2015 8:27 am

My third party production database returned 5,733 rows, 4,199 rows that have 0.000 index size! The largest index of the 1,534 remaining rows was 10,243.109375GB…..

Reply
Mike
July 22, 2015 9:01 am

Hi Eric,

thx for the script.
Column “[IndexSizeGB]” => [IndexSizeMB] (maybe a typo?) 😉

Reply
- Erik Darling
  July 22, 2015 10:35 am
  
  Good eye! I started out looking at the size in GB. Fixing now.
  
  Reply
  - raulggonzalez
    July 22, 2015 3:11 pm
    
    Thanks for the article,
    
    But why so complicated? you also are aggregating the size for all indexes in the table.
    
    Unless that’s intended (get index name, but size for all NC’s), to identify HEAPs with NC PK, I’d go for simpler
    
    SELECT QUOTENAME(OBJECT_SCHEMA_NAME(ix.object_id)) + ‘.’ +
    QUOTENAME(OBJECT_NAME(ix.object_id)) AS object_name
    , QUOTENAME(ix.name) AS index_name
    FROM sys.indexes AS ix
    LEFT JOIN (SELECT object_id FROM sys.indexes AS cix WHERE cix.type_desc = ‘CLUSTERED’) AS cix
    ON cix.object_id = ix.object_id
    WHERE ix.type_desc = ‘NONCLUSTERED’
    AND ix.is_primary_key = 1
    AND cix.object_id IS NULL
    
    Then you can add partitions and allocation_units to get real size of 1 NC index which is also PK
    
    If there’re are columns called [type_desc] and [is_primary_key] why not used them?
    
    Nice article though, pointing something might be overlooked
    
    Reply
  - Mark Freeman
    July 24, 2015 4:27 pm
    
    Wouldn’t the row count be at least as useful as the IndexSizeMB?
    
    IndexRowCount = ISNULL( ( SELECT SUM(p.rows)
    FROM sys.partitions p2
    WHERE p.index_id = i.index_id
    AND kc.[object_id] = p.object_id
    ), 0
    )
    
    Reply
Adrien
July 22, 2015 11:23 am

Yeah, my third-party app has about 300 tables. There are approximately five clustered indexes, and most of those are on GUID values.

Of course, there are lots of non-clustered indexes and FK relationships.

Reply
- Mark Freeman
  July 23, 2015 9:15 am
  
  How lucky of you that your third-party app has FKs. 🙂
  
  One of the ones we use has 138 tables and no FKs. It is an award-winning and market-leading application in its category.
  
  Reply
  - raulggonzalez
    July 23, 2015 9:46 am
    
    that’s nothing man!
    
    Just seen one of mines and has 3410 HEAPs with no PKs, hence no FKs whatsoever!
    
    Reply
  - Adrien
    July 23, 2015 9:55 am
    
    Sorry, the point I didn’t make clear… The vendor is okay with us adding clustered indexes, which is a project I’m working on.
    
    Unfortunately, when one table has, for example, 46 FKs that need to be dropped and recreated as part of the process of clustering the table … 😐
    
    Reply
    - Erik Darling
      July 23, 2015 9:58 am
      
      Aaron Bertrand actually has a great script for dropping and recreating FKs:
      
      https://www.mssqltips.com/sqlservertip/3347/drop-and-recreate-all-foreign-key-constraints-in-sql-server/
      
      Reply
      - Adrien
        July 23, 2015 10:01 am
        
        Okay, that’s awesome. Thanks!
Stewart McGuire
July 22, 2015 12:36 pm

My primary application that I work with has very large databases with 100s of tables. Some of them in the past were created without primary keys.

How can I get a list of tables that LACK a primary key? I really don’t want to troll through my entire database to check each table to see if it has a primary key.

Reply
- Tom H
  July 22, 2015 1:01 pm
  
  Stewart – I use this script to hunt down missing PK’s
  
  SELECT OBJECT_SCHEMA_NAME( object_id ) as SchemaName, name AS TableName
  FROM sys.tables
  WHERE OBJECTPROPERTY(object_id,’tablehasprimaryKey’) = 0
  ORDER BY SchemaName, TableName ;
  
  Reply
- raulggonzalez
  July 22, 2015 3:37 pm
  
  you can try
  
  SELECT * FROM sys.tables
  WHERE object_id NOT IN (SELECT object_id FROM sys.indexes WHERE is_primary_key = 1)
  
  Cheers
  
  Reply
Tom H
July 22, 2015 12:58 pm

I recent inherited a mission-critical database full of heaps, and GUIDs for keys. They said ‘we have performance problems…” – the blitz tool was a great help – runs much better now

Reply
Mark Freeman
July 22, 2015 2:00 pm

Eric,

In the system I checked, this line seems unnecessary (I get the same results with or without it):

AND OBJECTPROPERTY(kc.[object_id], ‘CnstIsNonclustKey’) = 1

What situation will this line filter out that isn’t taken care of by the other filters?

Reply
- Erik Darling
  July 22, 2015 3:30 pm
  
  You may get different results if you have a column with a constraint that is not indexed.
  
  Reply
Orestes
July 23, 2015 3:46 am

Eric, using single quotes after — comment dashes?
It’s OK for code snippets, but careful in Production code.
I’ve had SQL 2008 views throw horrible exceptions after I added a header description comment with single quotes, then tried scripting out an ALTER VIEW from SSMS…

Reply
- Erik Darling
  July 23, 2015 9:03 am
  
  Do you normally use snippets of code from the internet in production? Our number is in the side bar 🙂
  
  Reply
- Orestes
  July 23, 2015 9:13 am
  
  Hah! Certainly not, I only use code by email spam bots and IRC strangers 🙂
  
  My comment was just a heads up to one of many SSMS bugs that might catch one out.
  The plus/minus side is that it’s got me typing propely expandid Inglish in my TC-cool coments 😉
  
  Reply
Perry Whittle
July 24, 2015 12:15 am

The join between sys.partitions and sys.allocation_units is dependent upon the allocation unit type (heap, clustered index or nonclustered index) and should be made like so

INNER JOIN sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN sys.allocation_units au ON
CASE
WHEN au.[type] IN (1,3) THEN p.hobt_id
WHEN au.[type] = 2 THEN p.partition_id
END

Regards Perry

Reply
- Erik Darling
  July 24, 2015 11:01 am
  
  Huh. I’ve never seen it done like that before. That’s neat though, I’ll have to mess with it. Thanks, Perry.
  
  Reply
Perry whittle
July 24, 2015 2:45 pm

Lol well at least no one spotted my mistake.
The join is based on the allocation unit type where 1 and 3 are row data and row overflow, type 2 is LOB based. Check books online for more detail
Regards Perry

Reply
- Erik Darling
  July 24, 2015 3:16 pm
  
  I usually open this thing up and zoom in a couple million percent:
  
  http://www.microsoft.com/en-us/download/details.aspx?id=39083
  
  Reply
Charles Kincaid
July 26, 2015 9:39 pm

So I have this table where the unique column is the container bar code. It’s 48 characters wide and not ascending. I don’t want to use this as the clustered index as that then gives all of the other indexes a 48 character pointer in place of the 8 byte RID. Yes, there are many thousands of rows.

I could put an auto-number column in there but it would mean nothing and it would not help in avoiding index bloat. Sometimes all you can do is all you can do.

Reply
Robin
July 31, 2015 3:42 pm

Is the issue tables with non-clustered primary keys, but no clustered index?

Seems like there is some context missing here.

Reply
- Brent Ozar
  August 1, 2015 9:52 am
  
  Robin – correct, that’s exactly what the title said. 😀
  
  Reply
Dennis
August 11, 2015 5:28 am

Stumbled upon 2 tables in MSDB.
[dbo].[sysssispackagefolders]
[dbo].[sysssispackages]

Do you think this is design?

Reply
- Brent Ozar
  August 11, 2015 7:52 am
  
  Dennis – are those system tables? If so, I wouldn’t worry about them since you can’t change them.
  
  Reply
  - Dennis
    August 13, 2015 8:35 am
    
    Sure are.
    Sometimes it’s hard to be convinced of “Best practices” when Microsoft go against them in their own structure. Just gotta try to love them for 15 mins then hate them for 15 mins. 🙂
    
    Reply
Elnaz
February 22, 2024 10:45 pm

Hi Brent,

Maybe better to filter Memory Optimized tables in your query since clsutered index is not supported there and they are shown in the result

Reply