If you update a row without actually changing its contents, does it still hurt?
Paul White wrote in detail about the impact of non-updating updates, proving that SQL Server works hard to avoid doing extra work where it can. That’s a great post, and you should read it.
But foreign keys add another level of complexity. If you have foreign keys, and you tell SQL Server that you’re updating the contents of those related columns, SQL Server will check ’em even when the data isn’t changing.
To illustrate it, I’ll take the Stack Overflow database and add a foreign key between Posts and Users:
ALTER TABLE dbo.Posts WITH NOCHECK
ADD CONSTRAINT FK_Posts_OwnerUserId_Users_Id
FOREIGN KEY (OwnerUserId)
I’m saying that every Post’s OwnerUserId column must map up to a valid Users.Id. I’m using NOCHECK to get the implementation done faster because I’m not worried about the quality of the existing data in this case – I’m not trying to write a demo about foreign key join elimination.
Let’s grab a Posts row and see who the current OwnerUserId is:
Then set the Posts.OwnerUserId to the same value, and look at the actual execution plan:
The plan shows a clustered index seek against the Users table even though the data isn’t changing. Why? Because SQL Server has to go check that the Users row is still there.
The same thing happens even if I don’t specify a value for OwnerUserId, like if I just set it to itself. Here’s the actual execution plan for that, and it also hits the Users table just to check:
But if I update an unrelated column, like Score, then SQL Server doesn’t have to check the Users foreign key, and the actual plan avoids touching the Users table:
What this means for query authors
In a perfect world:
- Tables all have foreign keys that lay out their relationships
- You only update columns that are actually changing
- Therefore, you only incur the overhead of foreign keys when the related columns change
In an imperfect world:
- Your queries update every column, every time, regardless of whether they’re changing or not – maybe due to an ORM, or maybe due to a quick/easy/reusable stored procedure design
- SQL Server thinks you’re changing foreign key values every time
- You incur the overhead of foreign keys on every single update statement
This is especially problematic for transactions that update several tables in a row, each in their own update statement. As your concurrency needs grow, and you do thousands (or tens or hundreds of thousands) of updates per second, this can lead to blocking problems. At that point, you’re faced with a tough choice: either fix the T-SQL to only update columns that are changing, or…remove the foreign keys. In that case, removing the foreign keys is probably the easier quick fix.
DBAs: I’ll write your comment
to save you some time.
“Brent – this is completely unacceptable. We need foreign keys in the database to keep the bad data out. Those idiotic developers need to fix their T-SQL! Why would you update columns that aren’t changing?!? It’s their problem. Stupid ORM. I told them not to use it, and to craft every query by hand instead. I’m not changing our perfect database to lower its standards. Signed, Determined DBA”
And here’s my response.
Might as well get this over with now.
“Dear DBA: If we drop the foreign keys, our web site can sell more products right now. If we fix the code, it’ll take months, and in the meantime, our customers’ money will go to our competitors. Are you sure this is the hill you want to die on? Signed, The Consultant”