Today, Jonathan Kehayias wrote about how fragmentation impacts execution plans. I’m really into performance tuning, so I read the post carefully, really interested to see how it impacted query runtime. You should read it too – it’s a great post with a solid demo.
But oddly, he didn’t mention the query speeds.
I thought that was kinda odd since that’s what users usually complain about – their queries running slow – so I fired open SSMS to check query runtimes, and…
The fragmented table ran faster. 3-4x faster!
- Fragmented table: ~500ms
- Rebuilt table: 1.5-2 seconds
Jonathan’s post explains why – SQL Server lowballs the cost estimates on the rebuilt table, causing the query plan to go single-threaded. With the fragmented table, it correctly estimates that a lot of work is going to be involved to scan those millions of rows, so it breaks the work across multiple cores – finishing it way faster.
What’s the Moral of the Story?
Rebuilding your indexes will slow queries down.
Just kidding – although the evidence does point to that. The real moral of the story is that it’s really hard to build demo code that effectively conveys the point you want to make. Jonathan’s right in that fragmentation changes the query plan. In this one case, it happens to make the query go faster.
“But what happens if you run tons of this query at exactly the same time?” you ask? As a consultant, my answer wouldn’t be to rebuild the indexes – because now, not only is your individual query slower, but the server’s still in bad shape due to the work involved with scanning 7M rows. This query maxes out an individual CPU core every time it runs – so concurrency is a hot mess.
In that case, the answer’s simple: create a nonclustered columnstore index. The query finishes in ~100ms with near-no CPU time and >10x less logical reads. That kind of magic doesn’t work in every situation – but as long as we’re using specific demos to prove a point, that one does the job quite nicely.
And that’s why in my defragmenting your indexes isn’t helping session, I explain that defragging won’t get your users to carry you through the hallways on their shoulders. The right indexes will.
In our OLTP environment, we wouldn’t be able to tolerate any frequent queries that scan 7M rows. (columnstore or not). In fact we’ve never encountered queries where an index rebuild takes performance from unacceptable to acceptable.
But that’s just our case. It’s a very nuanced topic.
Michael – yep, that’s what I usually see in high-performance environments as well. It’s one thing to scan a data warehouse table for a report, but if you’re scanning 7M rows in OLTP, something’s probably wrong, and it ain’t fragmentation.
The “dueling bloggers” for this specific issue has been really interesting to read and think about.
If nothing else, I think it shows different ways to solve the problem. Often times, index maintenance (even statistic updates) is one of those “it depends” kind of things.
In this specific example, it seems like the solution to the slowness of the query was to force it to run in parallel.
If you have the idle cores available then maybe that makes sense. But should index fragmentation be the way to get to parallelism? Doesn’t SQL Server offer us better choices?
Mike – heh heh heh, thanks, glad you liked it. You’re absolutely right in that SQL Server offers us better choices – but that’s where building the right indexes comes in at the end of the post. You don’t really wanna repeatedly scan 7M rows in OLTP – it’ll kill CPU whether your data is fragmented or not, just a matter of whether it kills multiple cores or one.
The discussion between Brent and Jonathan was very interesting – it is great having two experts discuss something like this. My take away from the discussion is that if many of your indexes are heavily fragmented you may get higher plan costs and therefore have more queries going parallel and putting pressure on CPU resources – however those queries could potentially run faster because they are now parallel. Also your expected plans may start changing as your data gets fragmented – ideally you want your expected plan running. But in terms of overall impact, index fragmentation is less likely to be the cause of poor performance and a better index strategy is where your focus should be. Thanks again, great discussion!
Alex – yep, you nailed it.