Using Triggers to Automatically Add Indexes

Last Updated 7 years ago

Let’s say you’ve got a vendor application that creates tables, loads them, queries them, and drops them all the time. You can’t change the code.

The tables involved desperately need an index in order to go faster, and they’re your biggest performance bottleneck.

Enter DDL triggers. Regular DML triggers fire when you insert/update/delete rows in a table, but DDL triggers fire when you create/alter/drop tables or databases. You may remember them from the last time Jeremiah and I played with them – way back in 2008!

The below code fires whenever a table is created, checks to see if the table name is like CustomerExport%, looks for the existence of a CustomerName field (which I probably should have done with XQuery, but I’m lazy), and if it exists, creates an index on the CustomerName field:

CREATE TRIGGER trgTableAdded ON DATABASE AFTER CREATE_TABLE AS
BEGIN
    SET NOCOUNT ON
    DECLARE @TableName SYSNAME, @SchemaName SYSNAME, @StringToExecute NVARCHAR(4000)
    SELECT @TableName = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]','SYSNAME')
    SELECT @SchemaName = EVENTDATA().value('(/EVENT_INSTANCE/SchemaName)[1]','SYSNAME')
    IF @TableName LIKE 'CustomerExport%' 
        AND EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS
            WHERE COLUMN_NAME = 'CustomerName' AND TABLE_SCHEMA = @SchemaName AND TABLE_NAME = @TableName)
    BEGIN
        SET @StringToExecute = 'CREATE INDEX IX_CustomerName ON ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName) + ' (CustomerName);'
        EXEC(@StringToExecute);
    END
END
GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

CREATE TRIGGER trgTableAdded ON DATABASE AFTER CREATE_TABLE AS

BEGIN

SET NOCOUNT ON

DECLARE @TableName SYSNAME, @SchemaName SYSNAME, @StringToExecute NVARCHAR(4000)

SELECT @TableName = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]','SYSNAME')

SELECT @SchemaName = EVENTDATA().value('(/EVENT_INSTANCE/SchemaName)[1]','SYSNAME')

IF @TableName LIKE 'CustomerExport%'

AND EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS

WHERE COLUMN_NAME = 'CustomerName' AND TABLE_SCHEMA = @SchemaName AND TABLE_NAME = @TableName)

BEGIN

SET @StringToExecute = 'CREATE INDEX IX_CustomerName ON ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName) + ' (CustomerName);'

EXEC(@StringToExecute);

END

GO

And when Kendra found out what I'd done, she broke my arm. — And when Kendra found out what I’d done, she broke my arm.

Voila – automatic indexing, and a lifetime of regret and careful debugging. See, there’s a boatload of ways this can break:

No datatype checks. In the above example, I’m not checking for datatype. If the vendor changes CustomerName to an NVARCHAR(MAX), my table creation will bomb with an error that CustomerName “is of a type that is invalid for use as a key column in an index.”

Queries can fail. We had some queries fail (particularly SELECT INTOs) because the table schema changed before the rows started going in. Multi-statement transactions may also fail because the whole transaction will get rolled back if the trigger fails.

No duplicate index checks. If the vendor later decides to add an index as part of their code, I could be creating a duplicate index, and now slowing my processes down. Or they might just change the way they do exports, and no longer need the index at all – and I won’t be able to catch that since the tables and indexes appear and disappear so fast. I could actually set up another DDL trigger looking for table drops of these same names, and log index usage information to a table, then periodically analyze that information to see if I needed different indexing.

Lots of risk and work here. Let’s not even go into how incredibly hard it is to catch these issues when the tables and the query plans keep disappearing from memory – it’s not even enough to just grab the slow queries, but you also have to grab the tables and their contents so you can reproduce the queries to figure out the right index to build. Obviously this isn’t a solution everyone should go implement on their vendor apps, particularly mission-critical ones, but it’s an example of how far you can go to tune vendor code when other options have been taken off the table.

7 Comments. Leave new

Tracy McKibben
July 30, 2014 1:12 pm

Terrifying, but sometimes ya gotta do what ya gotta do….

Reply
Robert L Davis
July 30, 2014 1:22 pm

Did something sort of similar once when I worked on a system where they stored the columns as name value pairs for the customers custom tables. Customers could add or delete custom columns at any time, and their system would do repeated self-joins to return the data as columns. Performance was horrendous and at some point would start to fail to generate a query plan when the self-join count got too high.

This was right after SQL 2005 came out. I upgraded them to SQL 2005 and rewrote their system to use pivot queries instead. That was a lot faster and no longer failed to generate a plan for large numbers of columns. however, we wanted it even faster, so i wrote triggers (DML) so that if they added, changed or deleted a column, it would dynamically recreate the pivot views of their data

Reply
Rob Volk
July 31, 2014 4:20 pm

Awesome! Can’t wait to use this on Sharepoint! 😉

Reply
- Mauricio
  August 5, 2014 1:08 pm
  
  ^ I literally laughed upon reading this ^
  
  Reply
Jeremie Grund
August 4, 2014 4:24 pm

I literally laughed out loud upon reading the caption of the photo!

Reply
- Brent Ozar
  August 4, 2014 5:08 pm
  
  Hahaha, thanks sir.
  
  Reply