Unindexed Foreign Keys Can Make Performance Worse.

Last Updated December 29, 2021

I know. You, dear reader, saw that title and you came in here because you’re furious. You want foreign key relationships configured in all of your tables to prevent bad data from getting in.

But you gotta make sure to index them, too.

Let’s take the Stack Overflow database – I’m using the 50GB Stack Overflow 2013 version for this post. I’m going to start with no nonclustered indexes whatsoever, and then I’m going to add in foreign keys between the Users table and several tables where users add content, like Badges, Comments, Posts, and Votes:

/* Implement foreign keys: */
ALTER TABLE dbo.Badges WITH NOCHECK
    ADD CONSTRAINT fk_badges_users_id FOREIGN KEY (UserId) 
        REFERENCES dbo.Users(Id);
GO

ALTER TABLE dbo.Comments WITH NOCHECK
    ADD CONSTRAINT fk_comments_users_id FOREIGN KEY (UserId) 
        REFERENCES dbo.Users(Id);
GO

ALTER TABLE dbo.Posts WITH NOCHECK
    ADD CONSTRAINT fk_posts_users_id FOREIGN KEY (OwnerUserId) 
        REFERENCES dbo.Users(Id);
GO

ALTER TABLE dbo.Votes WITH NOCHECK
    ADD CONSTRAINT fk_votes_users_id FOREIGN KEY (UserId) 
        REFERENCES dbo.Users(Id);
GO

/* Implement foreign keys: */

ALTER TABLE dbo.Badges WITH NOCHECK

ADD CONSTRAINT fk_badges_users_id FOREIGN KEY (UserId)

REFERENCES dbo.Users(Id);

ALTER TABLE dbo.Comments WITH NOCHECK

ADD CONSTRAINT fk_comments_users_id FOREIGN KEY (UserId)

REFERENCES dbo.Users(Id);

ALTER TABLE dbo.Posts WITH NOCHECK

ADD CONSTRAINT fk_posts_users_id FOREIGN KEY (OwnerUserId)

REFERENCES dbo.Users(Id);

ALTER TABLE dbo.Votes WITH NOCHECK

ADD CONSTRAINT fk_votes_users_id FOREIGN KEY (UserId)

REFERENCES dbo.Users(Id);

Now, let’s say my application wants to delete a specific user, id # 26837. First, the app starts by deleting all of the rows in the content tables for this user:

/* Do our prep work to delete rows in all the child tables: */
DELETE dbo.Badges WHERE UserId = 26837;
DELETE dbo.Comments WHERE UserId = 26837;
DELETE dbo.Posts WHERE OwnerUserId = 26837;
DELETE dbo.Votes WHERE UserId = 26837;
GO

/* Do our prep work to delete rows in all the child tables: */

DELETE dbo.Badges WHERE UserId = 26837;

DELETE dbo.Comments WHERE UserId = 26837;

DELETE dbo.Posts WHERE OwnerUserId = 26837;

DELETE dbo.Votes WHERE UserId = 26837;

That does table scans, and let’s say we’re okay with that for some reason because our application rarely does deletes. Now that all the content tables are gone, let’s go back and delete the User row:

DELETE dbo.Users WHERE Id = 26837;
GO

1 2	DELETE dbo.Users WHERE Id = 26837; GO

It’s just deleting one row, right? How bad can it be?

Terrible, as it turns out. SQL Server wants to double-check all of the content tables to make sure User 26837 doesn’t own any Badges, Comments, Posts, or Views. We do a ton of logical reads, too:

The first solution you probably think of: index all of the foreign key columns:

CREATE INDEX UserId ON dbo.Badges(UserId);
CREATE INDEX UserId ON dbo.Comments(UserId);
CREATE INDEX OwnerUserId ON dbo.Posts(OwnerUserId);
CREATE INDEX UserId ON dbo.Votes(UserId);

CREATE INDEX UserId ON dbo.Badges(UserId);

CREATE INDEX UserId ON dbo.Comments(UserId);

CREATE INDEX OwnerUserId ON dbo.Posts(OwnerUserId);

CREATE INDEX UserId ON dbo.Votes(UserId);

That makes the deletes go way faster, but it comes at a few costs: slower inserts & updates to maintain those indexes, a 3% larger database size in this case, 3% longer maintenance jobs, etc.

At one recent engagement, we simply couldn’t use that solution. They already had serious performance problems during inserts: their hardware couldn’t handle any additional slowdowns during inserts & updates, and they already had extensive blocking & writing problems. Other solution options included:

Queueing the deletes for later – in this Stack Overflow example, say the application doesn’t directly delete the Users rows, and instead, adds rows to a queue table to be handled later. During a maintenance window, we could do the deletes in batches, dropping groups of users across the tables with only one pass.
Drop the foreign keys – at this particular client, the keys were configured with NOCHECK anyway due to past bad data, and there wasn’t a way we could fix that quickly enough. In that case, dropping the foreign keys made the deletes instantly faster, which was part of the overall solution in getting enough pain relief to go through their busy season.

Do I wish every database had perfect data, perfect foreign keys, perfect indexes, and perfect hardware to sustain all of the load? Absolutely. But in real life, constraints aren’t black and white: they’re fifty shades of … oh, you know.

How to Create a Table with a Partitioned Clustered Columnstore Index

SQL Server Suddenly Frozen? You Might Be Snapshotting Too Many Databases.

18 Comments. Leave new

g.sena
September 17, 2020 8:31 am

Yep. Fifty shades of “It depends”.

Reply
Brian M
September 17, 2020 9:51 am

Absolutely it depends.
In our App, users are NEVER deleted, their obsolete flat is set to true. Additionally we never search on a user_id so why index, the users table has the index on user_id so validating FKs from other tables is fast.

Proof the indexes were useless, checked the stats, all updates not seeks or scans.

If I needed to delete a user, then I would create a script to go out index all of the other tables and then delete the row and then drop those indexes.

In SQL 2000 I have seen 7 missing indexes bring down the system for deleting one or more row (even in a begin transaction/rollback transaction) on a much smaller database. So adding the indexes, deleting the bad rows, and then dropping the indexes was the way to go.

Reply
AleD
September 17, 2020 11:38 pm

”
They already had serious performance problems during inserts: their hardware couldn’t handle any additional slowdowns during inserts & updates
”
but what hw are we talking about?

Reply
- Brent Ozar
  September 18, 2020 2:59 am
  
  Ale – I love that my clients let me share stuff in order to help get y’all free real world training, but we do need to draw a line somewhere. Fair enough?
  
  Reply
  - AleD
    September 18, 2020 3:20 am
    
    ok, but it was to get an idea if it was “average” poor hw , or is hw good but the application is mission critical. just curiosity
    
    Reply
    - Brent Ozar
      September 18, 2020 3:23 am
      
      The hardware was good. My 2-day SQL Critical Care is $6,995, so that tends to filter out most poor hardware.
      
      Reply
      - J,V,
        September 18, 2020 4:12 am
        
        Too bad, otherwise it would be easiest money…
        -“Hello, why is my DB so slow?”
        -“Your hardware sucks. Replace it! Have a nice day!”
      - Brent Ozar
        September 18, 2020 4:18 am
        
        I actually handle that during the sales process for free – that’s one of the things I love about the Consultant Toolkit. I get all kinds of diagnostic data about their SQL Server, and in about half of the sales call requests I deal with, I’m able to solve the problem for free without even having a call.
        
        It might sound odd to solve their problem for free, but if I can, then when they hit an actual real problem that can’t be solved with a quick change, then they’ll come back directly to me. They’ll remember that I got them across the finish line quickly, and they won’t consider hiring anybody else.
      - AleD
        September 18, 2020 8:35 am
        
        It is an interesting scenario where there are tables without foreign keys which would otherwise make the process too slow. And the hw is good.
        I hope I never have to deal with it! 🙂
Frank Peters
September 18, 2020 12:24 am

Nice post! Thanks Brent

Reply
Wyatt
September 18, 2020 6:00 am

I have run into this exact situation with one of my applications. Deletes aren’t even possible in the UI, but occasionally, bad things happen, and I have to do manual deletes out of said table. They can take minutes with a naive delete (say 1000 rows), blocking the whole application. Due to the rarity of deletes, my solution (which wouldn’t work in a lot of situations with fewer permissions) is/was to create the index, delete the records I needed to delete, then drop the index.

You would think SQL Server could recognize it was going to delete a lot of records and build some sort of temporary indexes in tempdb to do all the foreign key checks, but that’s probably way harder than it sounds.

Reply
Suchiman
September 20, 2020 5:49 pm

> Drop the foreign keys – at this particular client, the keys were configured with NOCHECK anyway due to past bad data

You mean the FK was untrusted? If they’re NOCHECK, they should not appear in the query plan at all.

Reply
- Brent Ozar
  September 20, 2020 5:53 pm
  
  Bad news – that’s only for the enforcement during the FK creation, but not for subsequent delete/update/insert operations.
  
  Reply
  - Suchiman
    September 21, 2020 2:37 am
    
    That is correct, however they’ll have checking enabled from that point on forward (they’re not configured NOCHECK, only were added with NOCHECK). You can see the difference with this small sample (excuse me not having the stackoverflow database at hand).
    
    CREATE TABLE Principal (Id int CONSTRAINT PK_Principal PRIMARY KEY)
    CREATE TABLE Dependent (Id int CONSTRAINT PK_Dependent PRIMARY KEY, PrincipalId int)
    ALTER TABLE Dependent WITH NOCHECK ADD CONSTRAINT FK_Principal FOREIGN KEY (PrincipalId) REFERENCES Principal(Id)
    SELECT is_disabled is_NOCHECK, is_not_trusted FROM sys.foreign_keys — is_nocheck = 0, is_not_trusted = 1
    INSERT Dependent VALUES (1, 42) — The INSERT statement conflicted with the FOREIGN KEY constraint “FK_Principal”. The conflict occurred in database “Test”, table “dbo.Principal”, column ‘Id’.
    
    ALTER TABLE Dependent NOCHECK CONSTRAINT FK_Principal
    SELECT is_disabled is_NOCHECK, is_not_trusted FROM sys.foreign_keys — is_nocheck = 1, is_not_trusted = 1
    INSERT Dependent VALUES (1, 42) — no problem
    
    Reply
    - Brent Ozar
      September 21, 2020 3:43 am
      
      That code sample doesn’t relate to what I’m doing in the post. I understand that you think you know how this works, but before you comment further, please follow the code in the post. Further unrelated code won’t be productive for either of us. Thanks for understanding.
      
      Reply
- Brent Ozar
  September 20, 2020 5:54 pm
  
  Just to be really clear – you can prove it by downloading the database used in the blog post above, and following the demo scripts in the post. That’s why I use open source data & publish my scripts. It’s really helpful for testing your assumptions.
  
  Reply
Iana
July 11, 2023 6:39 am

I have a similar situation with deletion taking hours.
I have disabled the FKs, did the deletion, enabled them back WITH TRUSTED ON. It worked almost instantly,
The question is why did it work so well? Enabling back the FKs and making them trusted, does check the existing data. I would have expected to take the same amount of time as with the initial delete operation.

Reply
Nem toda chave estrangeira deve ser indexada – Porto SQL
February 11, 2024 3:33 pm

[…] em que não foram criados índices para todas as chaves estrangeiras é relatado no artigo “Unindexed Foreign Keys Can Make Performance Worse”, em que o autor comenta que “Em um compromisso recente, simplesmente não pudemos usar essa […]

Reply