Whew - the Microsoft folks really came through on this one and tracked down the problem pretty quickly. I want to personally thank Denis Churin, Nikita Shamgunov and Sundaram Narayanan for their detailed investigation into the issue and helping us get it fixed. I’m going to explain the issue to help anybody else get through it in case they run into similar problems.
How SQL 2008’s Full Text Search is Different than 2005
Let’s start with the basics: in SQL Server 2005, the full text data lived outside of SQL Server and wasn’t subject to transactional locking. If you inserted a gazillion records into a table that had a full text index on it, the indexes were rebuilt without worrying about simultaneous inserts and locking. It wasn’t like a traditional SQL Server table.
In SQL Server 2008, the full text indexes were moved inside the database server and became subject to transactional locking. By default, the indexes are updated automatically whenever SQL Server determines they need to be updated. When that merge process happens, SQL Server needs to obtain some locks on the indexed table. Ideally, it grabs the locks when there’s a brief period of no load, does its merge work, and lets go of the locks. The merge process can be quite brief (well under a second) as long as the amount of data hasn’t changed dramatically.
In a heavily transactional environment when there’s a whole lot of inserts/updates/deletes on the indexed table, the DBA may need to restrict those merge activities to only a certain time window of the day. SQL Server 2008 gives you that ability to do merges manually, but I wanted to avoid that on StackOverflow.com. Every time I put in a manual job into a solution, it requires manual maintenance, alerting and corrective actions, and I don’t take that lightly.
Where The Problem Comes In: Convoys of Queries
Imagine this scenario:
- A full-text select query is issued that looks like it’ll finish extremely quickly
- The SQL Server doesn’t see much activity in the full text table (only selects, not inserts/updates/deletes) so it kicks off a merge
- The select query doesn’t finish as quickly as the engine expected, and the merge can’t start until it obtains the locks it needs
- More full-text queries come in (could be selects, inserts, updates, deletes) that need to obtain locks
Those newly issued queries in step 4 are suddenly delayed while waiting for query #1 to finish. The impact on full text performance varies depending on how long it takes query #1 to finish - might be milliseconds, might be seconds.
Denis Churin, Nikita Shamgunov and Sundaram Narayanan (the Microsoft heroes) suspected this might be our performance problem at StackOverflow, so they had us grab a memory dump and a database backup at the exact moment we were having performance problems. They looked at the memory dump and isolated a single particular full-text query that was confusing the engine. The engine was building an execution plan for it that didn’t work well, and instead of taking milliseconds, it took seconds (as many as 50 seconds). During that time, performance went into the toilet.
They rewrote the query in a different way the engine would analyze better, and when that query was changed, presto, the performance problems disappeared.
There’s a QFE coming in a few weeks that will let the merge thread run without blocking other queries, but for now, we’re in good shape.
Diagnosing The Problem in Your Environment
Sundaram gave us this query to help troubleshoot when a long-running query is blocking other queries. This helps identify the issue when a full-text select query suddenly blocks the SQL Server 2008 full text merge thread, and you can look at that query to make it run faster. I haven’t tested this in depth, but these guys have proven to be much more qualified than me, so I have a hunch it’ll work, heh:
declare @temp int
declare @parent int
declare @final int
set @parent = 0
while (@parent = 0)
BEGIN
select @parent=blocked from sys.sysprocesses where lastwaittype=’LCK_M_SCH_M’ and waittime > 30000
WAITFOR DELAY ‘00:00:01′;
END
WHILE (@parent <> 0)
BEGIN
set @final = @parent
select @temp=blocked from sys.sysprocesses where spid = @parent
set @parent = @temp
END
select * from sys.sysprocesses where spid = @final
select * from sys.dm_exec_requests ER
CROSS APPLY sys.dm_exec_sql_text(ER.plan_handle) ST
where ER.session_id = @final
select * from sys.dm_exec_requests ER
CROSS APPLY sys.dm_exec_query_plan(ER.plan_handle) ST
where ER.session_id = @final
And now, finally, I can start performance tuning that server!






