Logical Reads Aren’t Repeatable on Columnstore Indexes. (sigh)
Sometimes I really hate my job.
Forever now, FOREVER, it’s been a standard thing where I can say, “When you’re measuring storage performance during index and query tuning, you should always use logical reads, not physical reads, because logical reads are repeatable, and physical reads aren’t. Physical reads can change based on what’s in cache, what other queries are running at the time, your SQL Server edition, and whether you’re getting read-ahead reads. Logical reads just reflect exactly the number of pages read, no matter where the data came from (storage or cache), so as long as that number goes down, you’re doing a good job.”
To illustrate it, we’ll start with the large version of the Stack Overflow database, and count the number of rows in the Users table.
Transact-SQL

SET STATISTICS IO ON;
GO
DBCC DROPCLEANBUFFERS;
GO
SELECT COUNT(*) FROM dbo.Users;
GO 3
Statistics io output shows that the first execution has to read pages up from disk because they’re not in cache yet:
Transact-SQL

Beginning execution loop
(1 row affected)
Table 'Users'. Scan count 1, logical reads 317822, physical reads 4, page server reads 0, read-ahead reads 329114, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
(1 row affected)
Table 'Users'. Scan count 1, logical reads 317822, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
(1 row affected)
Table 'Users'. Scan count 1, logical reads 317822, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Batch execution completed 3 times.
The first execution has 4 physical reads and 329,114 read-ahead reads. Those were all read up off disk, into memory. But the whole time, logical reads stays consistent, so it’s useful for measuring performance tuning efforts regardless of what’s in cache.
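If you want to verify where those pages ended up, you can peek at the buffer pool yourself. This is just a sketch, run from inside the Stack Overflow database — sys.dm_os_buffer_descriptors is a real DMV, but it can be slow to query on servers with a lot of memory:

Transact-SQL

-- Count how many of dbo.Users' pages are currently in the buffer pool.
-- Run this in the Stack Overflow database. Sketch only: the join from
-- allocation units to partitions covers in-row and row-overflow data.
SELECT COUNT(*) AS pages_in_cache
FROM sys.dm_os_buffer_descriptors bd
INNER JOIN sys.allocation_units au
    ON bd.allocation_unit_id = au.allocation_unit_id
INNER JOIN sys.partitions p
    ON au.container_id = p.hobt_id
   AND au.type IN (1, 3)  -- IN_ROW_DATA, ROW_OVERFLOW_DATA
WHERE bd.database_id = DB_ID()
  AND p.object_id = OBJECT_ID('dbo.Users');

Run it right after DBCC DROPCLEANBUFFERS and the count should be near zero; run it again after the SELECT COUNT(*) and it should roughly match the read-ahead numbers above.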
The same thing is true if we create a nonclustered rowstore index too:
Transact-SQL

CREATE INDEX Age ON dbo.Users(Age);
GO
DBCC DROPCLEANBUFFERS;
GO
SELECT COUNT(*) FROM dbo.Users;
GO 3
Statistics io output shows physical reads & read-ahead reads on the first execution, but logical reads stays consistent throughout:
Transact-SQL

Beginning execution loop
(1 row affected)
Table 'Users'. Scan count 1, logical reads 39000, physical reads 1, page server reads 0, read-ahead reads 38997, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
(1 row affected)
Table 'Users'. Scan count 1, logical reads 39000, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
(1 row affected)
Table 'Users'. Scan count 1, logical reads 39000, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Batch execution completed 3 times.
But with columnstore indexes on SQL Server 2017 & newer…
On SQL Server 2017 or newer (not 2016), create a nonclustered columnstore index:
Transact-SQL

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Age ON dbo.Users(Age);
GO
DBCC DROPCLEANBUFFERS;
GO
SELECT COUNT(*) FROM dbo.Users;
GO 3
And watch lob logical reads while we run it 3 times:
Transact-SQL

Beginning execution loop
(1 row affected)
Table 'Users'. Scan count 2, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 22432, lob physical reads 0, lob page server reads 0, lob read-ahead reads 89443, lob page server read-ahead reads 0.
Table 'Users'. Segment reads 22, segment skipped 0.
(1 row affected)
Table 'Users'. Scan count 2, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 10947, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Users'. Segment reads 22, segment skipped 0.
(1 row affected)
Table 'Users'. Scan count 2, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 10947, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Users'. Segment reads 22, segment skipped 0.
Batch execution completed 3 times.
Lob logical reads shows 22,432 for the first execution, then 10,947 for the next two passes.
This isn’t true on SQL Server 2016, which produces the same logical read numbers every time the columnstore query runs, clean buffer pool or not. Just 2017 and newer.
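If you want to see the rowgroups behind those "Segment reads 22" lines, there's a DMV for that on SQL Server 2016 and newer. A quick sketch:

Transact-SQL

-- Show each rowgroup in the columnstore index: its compression state,
-- row count, and on-disk size. Run in the Stack Overflow database.
SELECT row_group_id, state_desc, total_rows, size_in_bytes
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.Users')
ORDER BY row_group_id;

With the demo table above, you'd expect to see 22 COMPRESSED rowgroups, matching the segment reads in the statistics io output.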

<sigh> This is why we can’t have nice things.
This is also one of those reasons why it’s so hard to teach training classes. Stuff changes inside the product, and then years later, a demo you wrote no longer produces exactly the same results. You have to re-run the demo from scratch, thinking you just made a mistake; then narrow down the root cause; then, to do it right, check each prior version to understand when the thing changed; then Google around trying to find out if anybody else shared this and you just didn’t read that particular post; and then update your own training and write a blog post so that nobody else gets screwed by the same undocumented change. Which of course they will, because not everybody reads your blog posts.
You won’t, though, dear reader. At least I helped you out, hopefully. And that makes it all worthwhile. (Not really. I’m going to go have a shot of my office tequila, and it’s not even 10AM as I’m writing this.)

10 Comments
Interesting. Do you happen to have a sense of why the number changes? I could speculate that there’s some sort of compression (maybe even async) going on when it’s storing them in the buffer pool but that seems possibly wasteful.
It’s not the compression. I checked it on my server: all rowgroups were compressed before the first run.
And one time I got the reduced read amount not on the second but the third execution – who knows why.
Loved the frowny face!
And this is why we read your blog.
“Logical reads are always the same, doh.”
Brent: “It depends…”
On the other hand, I don’t really care about the lob logical reads on a columnstore, since they always depend on the data distribution and compression strategy (when you don’t create the index with a fixed order, SQL Server tries to find the most compressible order based on the existing / inserted data).
I know that when a table
– has more than 100k rows
– is rarely updated
– is often queried by aggregates (SUM(), MIN(), MAX(), AVG(), COUNT() …)
– is rarely queried with SELECT * (written as * or just a long list of columns)
– is usually not queried for specific keys (customer_id = 123) but more often filtered on random columns, since the customer can filter for anything in the GUI
a columnstore index is usually the best index. In reality it’s most often a compromise, where not all the points above fit the real workload, but hey – this is one of the reasons for my job 🙂
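The pattern that comment describes might look something like this — note this is a hypothetical sketch: the column names come from the Stack Overflow Users table, and the demo index in the post only covers Age, so you’d need a wider columnstore index for these columns to benefit:

Transact-SQL

-- A hypothetical wider columnstore index covering the aggregated
-- and filtered columns (the post's NCCI_Age only covers Age):
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Users_Wide
    ON dbo.Users(Age, Location, Reputation);
GO
-- The kind of query the checklist above describes: an aggregate over
-- an arbitrary filter column, where batch mode and segment
-- elimination do the heavy lifting.
SELECT Location, COUNT(*) AS UserCount, AVG(Reputation) AS AvgReputation
FROM dbo.Users
WHERE Age >= 21
GROUP BY Location;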
Uhhhm… let me be that guy.
If logical reads aren’t repeatable anymore when columnstore indexes are involved, what metric are we supposed to use to measure tuning success?
Logical reads used to be a stable number we could rely on.
So what’s the new yardstick?
Query duration? CPU time? Or just fewer strange noises from the CPU fan?
Just asking.
The answer is kinda beyond the scope of a blog post comment – I’ll cover that in an updated video in my Fundamentals of Columnstore class.
I know… we should always try this ourselves.
For us lazy DBAs who don’t have any copy of the SO DB, or whose SO DB is on a home computer while we’re at work… are the CPU and run times in ms any different on the NC vs. CS index count query? LOBs have always burned me (expensive) in the past, especially with Oracle. (IO, TIME on;)
I agree: your first sentence is precisely correct! Cheers.
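For anyone who wants to try the CPU-and-duration comparison that last commenter suggests, the "(IO, TIME on;)" shorthand expands to something like this:

Transact-SQL

-- Capture both page reads and CPU/elapsed time for each execution.
SET STATISTICS IO, TIME ON;
GO
SELECT COUNT(*) FROM dbo.Users;
GO 3
SET STATISTICS IO, TIME OFF;

Keep in mind that CPU time and elapsed time vary from run to run with cache state and concurrent load — which is exactly why logical reads being repeatable used to be so handy.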