I know. You, dear reader, saw that title and you came in here because you’re furious. You want foreign key relationships configured in all of your tables to prevent bad data from getting in.
But you gotta make sure to index them, too.
Let’s take the Stack Overflow database – I’m using the 50GB Stack Overflow 2013 version for this post. I’m going to start with no nonclustered indexes whatsoever, and then I’m going to add in foreign keys between the Users table and several tables where users add content, like Badges, Comments, Posts, and Votes:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
/* Implement foreign keys: */ ALTER TABLE dbo.Badges WITH NOCHECK ADD CONSTRAINT fk_badges_users_id FOREIGN KEY (UserId) REFERENCES dbo.Users(Id); GO ALTER TABLE dbo.Comments WITH NOCHECK ADD CONSTRAINT fk_comments_users_id FOREIGN KEY (UserId) REFERENCES dbo.Users(Id); GO ALTER TABLE dbo.Posts WITH NOCHECK ADD CONSTRAINT fk_posts_users_id FOREIGN KEY (OwnerUserId) REFERENCES dbo.Users(Id); GO ALTER TABLE dbo.Votes WITH NOCHECK ADD CONSTRAINT fk_votes_users_id FOREIGN KEY (UserId) REFERENCES dbo.Users(Id); GO |
Now, let’s say my application wants to delete a specific user, id # 26837. First, the app starts by deleting all of the rows in the content tables for this user:
1 2 3 4 5 6 |
/* Do our prep work to delete rows in all the child tables: */ DELETE dbo.Badges WHERE UserId = 26837; DELETE dbo.Comments WHERE UserId = 26837; DELETE dbo.Posts WHERE OwnerUserId = 26837; DELETE dbo.Votes WHERE UserId = 26837; GO |
That does table scans, and let’s say we’re okay with that for some reason because our application rarely does deletes. Now that all the content tables are gone, let’s go back and delete the User row:
1 2 |
DELETE dbo.Users WHERE Id = 26837; GO |
It’s just deleting one row, right? How bad can it be?
Terrible, as it turns out. SQL Server wants to double-check all of the content tables to make sure User 26837 doesn’t own any Badges, Comments, Posts, or Views. We do a ton of logical reads, too:
The first solution you probably think of: index all of the foreign key columns:
1 2 3 4 |
CREATE INDEX UserId ON dbo.Badges(UserId); CREATE INDEX UserId ON dbo.Comments(UserId); CREATE INDEX OwnerUserId ON dbo.Posts(OwnerUserId); CREATE INDEX UserId ON dbo.Votes(UserId); |
That makes the deletes go way faster, but it comes at a few costs: slower inserts & updates to maintain those indexes, a 3% larger database size in this case, 3% longer maintenance jobs, etc.
At one recent engagement, we simply couldn’t use that solution. They already had serious performance problems during inserts: their hardware couldn’t handle any additional slowdowns during inserts & updates, and they already had extensive blocking & writing problems. Other solution options included:
- Queueing the deletes for later – in this Stack Overflow example, say the application doesn’t directly delete the Users rows, and instead, adds rows to a queue table to be handled later. During a maintenance window, we could do the deletes in batches, dropping groups of users across the tables with only one pass.
- Drop the foreign keys – at this particular client, the keys were configured with NOCHECK anyway due to past bad data, and there wasn’t a way we could fix that quickly enough. In that case, dropping the foreign keys made the deletes instantly faster, which was part of the overall solution in getting enough pain relief to go through their busy season.
Do I wish every database had perfect data, perfect foreign keys, perfect indexes, and perfect hardware to sustain all of the load? Absolutely. But in real life, constraints aren’t black and white: they’re fifty shades of … oh, you know.
17 Comments. Leave new
Yep. Fifty shades of “It depends”.
Absolutely it depends.
In our App, users are NEVER deleted, their obsolete flat is set to true. Additionally we never search on a user_id so why index, the users table has the index on user_id so validating FKs from other tables is fast.
Proof the indexes were useless, checked the stats, all updates not seeks or scans.
If I needed to delete a user, then I would create a script to go out index all of the other tables and then delete the row and then drop those indexes.
In SQL 2000 I have seen 7 missing indexes bring down the system for deleting one or more row (even in a begin transaction/rollback transaction) on a much smaller database. So adding the indexes, deleting the bad rows, and then dropping the indexes was the way to go.
”
They already had serious performance problems during inserts: their hardware couldn’t handle any additional slowdowns during inserts & updates
”
but what hw are we talking about?
Ale – I love that my clients let me share stuff in order to help get y’all free real world training, but we do need to draw a line somewhere. Fair enough?
ok, but it was to get an idea if it was “average” poor hw , or is hw good but the application is mission critical. just curiosity
The hardware was good. My 2-day SQL Critical Care is $6,995, so that tends to filter out most poor hardware.
Too bad, otherwise it would be easiest money…
-“Hello, why is my DB so slow?”
-“Your hardware sucks. Replace it! Have a nice day!”
I actually handle that during the sales process for free – that’s one of the things I love about the Consultant Toolkit. I get all kinds of diagnostic data about their SQL Server, and in about half of the sales call requests I deal with, I’m able to solve the problem for free without even having a call.
It might sound odd to solve their problem for free, but if I can, then when they hit an actual real problem that can’t be solved with a quick change, then they’ll come back directly to me. They’ll remember that I got them across the finish line quickly, and they won’t consider hiring anybody else.
It is an interesting scenario where there are tables without foreign keys which would otherwise make the process too slow. And the hw is good.
I hope I never have to deal with it! 🙂
Nice post! Thanks Brent
I have run into this exact situation with one of my applications. Deletes aren’t even possible in the UI, but occasionally, bad things happen, and I have to do manual deletes out of said table. They can take minutes with a naive delete (say 1000 rows), blocking the whole application. Due to the rarity of deletes, my solution (which wouldn’t work in a lot of situations with fewer permissions) is/was to create the index, delete the records I needed to delete, then drop the index.
You would think SQL Server could recognize it was going to delete a lot of records and build some sort of temporary indexes in tempdb to do all the foreign key checks, but that’s probably way harder than it sounds.
> Drop the foreign keys – at this particular client, the keys were configured with NOCHECK anyway due to past bad data
You mean the FK was untrusted? If they’re NOCHECK, they should not appear in the query plan at all.
Bad news – that’s only for the enforcement during the FK creation, but not for subsequent delete/update/insert operations.
That is correct, however they’ll have checking enabled from that point on forward (they’re not configured NOCHECK, only were added with NOCHECK). You can see the difference with this small sample (excuse me not having the stackoverflow database at hand).
CREATE TABLE Principal (Id int CONSTRAINT PK_Principal PRIMARY KEY)
CREATE TABLE Dependent (Id int CONSTRAINT PK_Dependent PRIMARY KEY, PrincipalId int)
ALTER TABLE Dependent WITH NOCHECK ADD CONSTRAINT FK_Principal FOREIGN KEY (PrincipalId) REFERENCES Principal(Id)
SELECT is_disabled is_NOCHECK, is_not_trusted FROM sys.foreign_keys — is_nocheck = 0, is_not_trusted = 1
INSERT Dependent VALUES (1, 42) — The INSERT statement conflicted with the FOREIGN KEY constraint “FK_Principal”. The conflict occurred in database “Test”, table “dbo.Principal”, column ‘Id’.
ALTER TABLE Dependent NOCHECK CONSTRAINT FK_Principal
SELECT is_disabled is_NOCHECK, is_not_trusted FROM sys.foreign_keys — is_nocheck = 1, is_not_trusted = 1
INSERT Dependent VALUES (1, 42) — no problem
That code sample doesn’t relate to what I’m doing in the post. I understand that you think you know how this works, but before you comment further, please follow the code in the post. Further unrelated code won’t be productive for either of us. Thanks for understanding.
Just to be really clear – you can prove it by downloading the database used in the blog post above, and following the demo scripts in the post. That’s why I use open source data & publish my scripts. It’s really helpful for testing your assumptions.
I have a similar situation with deletion taking hours.
I have disabled the FKs, did the deletion, enabled them back WITH TRUSTED ON. It worked almost instantly,
The question is why did it work so well? Enabling back the FKs and making them trusted, does check the existing data. I would have expected to take the same amount of time as with the initial delete operation.