If you’re only indexing to make reads faster, you need to think again. Indexes can make every operation in the database faster, even deletes.
The Problem: Deletes are Very Slow
I was lazily researching new development techniques one day when I received an email from a client asking why deletes could be slow. I rattled off a few quick possibilities and promised that I’d look into it as soon as I was able to. Due to the vagaries of travel, it took me more time than I’d expected to dig into the problem, but I found something interesting that should have come to mind right from the start.
This database contains some hierarchical data. My initial thought was that there was a cascading delete taking place in the hierarchy. After some initial checks into cascading deletes, fragmentation, statistics, and triggers, I ran an actual delete on a test system and found something very interesting – almost all of the time spent deleting the row was spent performing a clustered index scan on a different table.
What Happens During a Delete?
When you try to delete a row, a few things happen. SQL Server says “OK, let’s make sure that we can actually delete this row, what else depends on it?” SQL Server will check for dependent rows by examining all foreign keys. It will then check any related tables for data. If there is an index, SQL Server will use that index to check for related data. If there isn’t an index, though, SQL Server will have to scan the table for data.
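To see which foreign keys SQL Server would have to check before deleting from a given table, you can ask the catalog views. This is a minimal sketch, not part of the original investigation. It assumes the parent table is Sales.SalesOrderHeader (the table used in the demo later in this post), so swap in your own table name.

```sql
-- List every foreign key that references the parent table.
-- Assumption: the parent table is Sales.SalesOrderHeader; change as needed.
SELECT
    fk.name                                   AS foreign_key_name,
    OBJECT_SCHEMA_NAME(fk.parent_object_id)   AS referencing_schema,
    OBJECT_NAME(fk.parent_object_id)          AS referencing_table,
    COL_NAME(fkc.parent_object_id,
             fkc.parent_column_id)            AS referencing_column,
    fk.delete_referential_action_desc         AS on_delete_action
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
    ON fkc.constraint_object_id = fk.object_id
WHERE fk.referenced_object_id = OBJECT_ID(N'Sales.SalesOrderHeader');
```

Each row returned is a table that SQL Server has to look at before it can remove a row from the parent.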
Deletes and Table Scans
Don’t believe me? Try this out yourself.
Make a new database. Copy data in from the Sales.SalesOrderHeader and Sales.SalesOrderDetail tables in AdventureWorks. I use the Import Data wizard to quickly copy data from one database to another.
ALTER TABLE Sales.SalesOrderHeader
    ADD CONSTRAINT PK_SalesOrderHeader PRIMARY KEY (SalesOrderID);
ALTER TABLE Sales.SalesOrderDetail
    ADD CONSTRAINT PK_SalesOrderDetail PRIMARY KEY (SalesOrderDetailID);
ALTER TABLE Sales.SalesOrderDetail
    ADD CONSTRAINT FK_SalesOrderDetail_SalesOrderHeader
    FOREIGN KEY (SalesOrderID) REFERENCES Sales.SalesOrderHeader(SalesOrderID)
    ON DELETE CASCADE;
With these three statements in place, we’re able to create a situation where SQL Server has to perform a full table scan just to delete a single row. Make sure you’ve told SQL Server to include the actual execution plan and run this:
DELETE FROM Sales.SalesOrderHeader WHERE SalesOrderID = 51721;
Once that query runs, the execution plan is going to look a bit like the execution plan below. If you add it up, 99% of the work comes from finding the rows to delete in the SalesOrderDetail table and then actually deleting them.
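If you'd rather see read counts than plan percentages, one option is to turn on I/O statistics and run the delete inside a transaction that gets rolled back, so the row is still there for the next test. This is a sketch of how that could look, not a step from the original demo. It assumes the same tables and SalesOrderID used above.

```sql
-- Report logical reads per table in the Messages tab.
SET STATISTICS IO ON;

BEGIN TRANSACTION;

    -- Same delete as above; the cascade also touches Sales.SalesOrderDetail.
    DELETE FROM Sales.SalesOrderHeader
    WHERE SalesOrderID = 51721;

-- Undo the delete so the test can be repeated after adding an index.
ROLLBACK TRANSACTION;

SET STATISTICS IO OFF;
```

Run it once before adding an index and once after, then compare the logical reads reported for SalesOrderDetail.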
Making Deletes Faster
How would we go about making deletes like this faster? By adding an index, of course. Astute readers will have noticed the missing index information in that execution plan I took a screenshot of. In this case, the missing index looks something like this:
CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_SalesOrderID ON Sales.SalesOrderDetail(SalesOrderID);
Before adding the index, the query had an estimated cost of 2.35678. After adding the index, the delete has an estimated cost of 0.0373635. To put it another way: by the optimizer's estimate, adding one index made the delete operation roughly 63 times cheaper. When you have a busy environment, even tiny changes like this one can make it faster to find and delete records in the database.
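The same problem can hide on other tables, so it may be worth looking for foreign keys that have no index to support them. The query below is a rough sketch under a simplifying assumption: it only checks whether the first column of each foreign key is the leading key column of some enabled index. Multi-column foreign keys deserve a closer look by hand.

```sql
-- Foreign keys whose first column is not the leading key column of any index.
-- Simplification (assumption): only the first foreign key column is checked.
SELECT
    OBJECT_SCHEMA_NAME(fkc.parent_object_id)  AS schema_name,
    OBJECT_NAME(fkc.parent_object_id)         AS table_name,
    fk.name                                   AS foreign_key_name,
    COL_NAME(fkc.parent_object_id,
             fkc.parent_column_id)            AS column_name
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
    ON fkc.constraint_object_id = fk.object_id
WHERE fkc.constraint_column_id = 1            -- first column of the foreign key
  AND NOT EXISTS (
        SELECT 1
        FROM sys.index_columns AS ic
        JOIN sys.indexes AS i
            ON i.object_id = ic.object_id
           AND i.index_id  = ic.index_id
        WHERE ic.object_id   = fkc.parent_object_id
          AND ic.column_id   = fkc.parent_column_id
          AND ic.key_ordinal = 1              -- leading key column of the index
          AND i.is_disabled  = 0
          AND i.is_hypothetical = 0
      )
ORDER BY schema_name, table_name, foreign_key_name;
```

Not every row in the result needs an index. Treat the output as a list of candidates to test, especially for tables whose parents see frequent deletes.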
What’s All of This Mean?
When you’re looking into database performance problems, remember that you aren’t always reading just to return data to the user; sometimes you need to find data in order to delete it. Even when we’re trying to get rid of data, it can be helpful to have an index to make deletes go faster.
Update: Also Use Fast Ordered Deletes – the Microsoft SQL Customer Advisory Team wrote about the Fast Ordered Delete technique of using a view or a CTE.
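For a rough idea of what deleting through a CTE looks like, here is a minimal sketch. It is not copied from the SQLCAT post. The table dbo.OrderArchive, its ArchivedDate column, and the batch size of 1000 are all hypothetical and only there for illustration.

```sql
-- Delete the 1000 oldest rows by going through a CTE.
-- Assumptions: dbo.OrderArchive and ArchivedDate are hypothetical names,
-- and an index ordered on ArchivedDate exists so the TOP ... ORDER BY is cheap.
WITH oldest_rows AS (
    SELECT TOP (1000) *
    FROM dbo.OrderArchive
    ORDER BY ArchivedDate
)
DELETE FROM oldest_rows;
```

Deleting from the CTE removes the underlying rows it selected, which lets you combine TOP with ORDER BY in a way that a plain DELETE TOP does not allow.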