When you run a SELECT COUNT(*), the speed of the results depends a lot on the structure & settings of the database. Let’s explore the Votes table in the Stack Overflow database, specifically the 2018-06 ~300GB version, where the Votes table has 150,784,380 rows taking up ~5.3GB of space.
I’m going to measure each method 3 ways:
- How many pages it reads (gauged with SET STATISTICS IO ON)
- How much CPU time it uses (gauged with SET STATISTICS TIME ON)
- How fast it runs
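Concretely, each test below was measured with a harness along these lines (a minimal sketch using the documented SET STATISTICS session options):

```sql
SET STATISTICS IO ON;   -- reports logical reads, in 8KB pages, per table
SET STATISTICS TIME ON; -- reports CPU time and elapsed time in milliseconds
GO
SELECT COUNT(*) FROM dbo.Votes;
GO
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```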
Don’t obsess over small differences between the operations – I’m writing this blog post fast & furious to show you the big-picture differences, and to show you how my thought process works when comparing the different operations. In your own environment, for the tables you’re trying to count and the hardware you’re using and the version you’re on and the phase of the moon, you’re going to get different results, and that’s fine. There are also other ways to measure these methods depending on your own performance requirements: memory grants, ability to run without blocking, and even the accuracy of the results under concurrency. For the sake of these tests, I’m not going to talk about isolation levels or blocking.
I’m running these tests on SQL Server 2019 (15.0.2070.41) on an 8-core VM with 64GB RAM.
1: Plain ol’ COUNT(*) with only a clustered rowstore index, compatibility level 2017 & prior
```sql
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 140;
GO
SELECT COUNT(*) FROM dbo.Votes;
GO
```
- Pages read: 694,389
- CPU time: 14.7 seconds
- Duration: 2 seconds
2: Compatibility level 2019 (batch mode on rowstore indexes)
```sql
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150;
GO
SELECT COUNT(*) FROM dbo.Votes;
GO
```
- Pages read: 694,379
- CPU time: 5.2 seconds
- Duration: 0.7 seconds
Presto change-o – CPU time instantly drops thanks to batch mode. This isn’t obvious in the execution plans until you start hovering your mouse over individual operators.
Batch mode is a great fit for a lot of reporting-style queries doing aggregates over a lot of data.
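If you want to compare row mode and batch mode side by side without flipping the database’s compatibility level back and forth, SQL Server 2019 also has a documented query-level hint – a sketch:

```sql
/* Stay in compat level 150, but disable batch mode for just this query
   (USE HINT('DISALLOW_BATCH_MODE') is available in SQL Server 2019+): */
SELECT COUNT(*) FROM dbo.Votes
OPTION (USE HINT('DISALLOW_BATCH_MODE'));
```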
3: Add nonclustered rowstore indexes, but use 2017 & prior’s row mode
I’m going to create an index on each of the 5 columns of the Votes table, and then compare their sizes with sp_BlitzIndex:
```sql
CREATE INDEX IX_PostId ON dbo.Votes(PostId);
GO
CREATE INDEX IX_UserId ON dbo.Votes(UserId);
GO
CREATE INDEX IX_BountyAmount ON dbo.Votes(BountyAmount);
GO
CREATE INDEX IX_VoteTypeId ON dbo.Votes(VoteTypeId);
GO
CREATE INDEX IX_CreationDate ON dbo.Votes(CreationDate);
GO
/* What are the sizes of each index? Turn OFF actual plans to run this: */
sp_BlitzIndex @TableName = 'Votes';
GO
```
Check out the number of rows in each index versus its size. When SQL Server needs to count the number of rows in the table, it’s smart enough to look at which object is the smallest, and then use that one for the count. Indexes can have varying sizes depending on the datatypes of the contents they’re indexing, the size of each row’s contents, the number of nulls, etc.
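If you don’t have sp_BlitzIndex handy, a rough rows-versus-size comparison is possible with plain catalog views – a sketch, not a substitute for sp_BlitzIndex’s full analysis:

```sql
/* Rows and used space per index on dbo.Votes: */
SELECT i.name AS index_name,
       SUM(ps.row_count) AS row_count,
       SUM(ps.used_page_count) * 8 / 1024 AS used_mb
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
    ON ps.object_id = i.object_id
   AND ps.index_id = i.index_id
WHERE i.object_id = OBJECT_ID('dbo.Votes')
GROUP BY i.name
ORDER BY used_mb;
```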
I’ll go back to 2017 compat level (removing batch mode operations) and then run the count:
```sql
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 140;
GO
SELECT COUNT(*) FROM dbo.Votes;
GO
```
SQL Server chooses the BountyAmount index, one of the smaller ~2GB ones. That pays off in reading fewer pages, but we’re still counting the same 150M rows, so the CPU time & duration don’t really change:
- Pages read: 263,322
- CPU time: 14.8 seconds
- Duration: 2 seconds
If you want lower CPU time & duration, you really need to approach the count differently – and that’s where batch mode operation helps.
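If you want to see how much the narrower index is saving, you can force the count back onto the clustered index and watch the reads jump – a sketch using a table hint (index id 1 is always the clustered index):

```sql
/* Force a clustered index scan to compare reads against the
   optimizer's pick of the narrower IX_BountyAmount: */
SELECT COUNT(*) FROM dbo.Votes WITH (INDEX(1));
```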
4: 2019’s batch mode with nonclustered rowstore indexes
So now let’s try batch mode operation with the indexes in place:
```sql
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150;
GO
SELECT COUNT(*) FROM dbo.Votes;
GO
```
It still uses the BountyAmount index and does the same number of reads as #3, but we get the lower CPU time & duration from step #2:
- Pages read: 263,322
- CPU time: 4.3 seconds
- Duration: 0.6 seconds
So far, that’s the winner. But remember that batch mode originally went live with columnstore indexes, which are awesome tools for reporting-style queries…
5: Nonclustered columnstore index with batch mode
I’m purposely running in 2017 compat mode here because I want to make it clear where the awesomeness is:
```sql
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_BountyAmount
    ON dbo.Votes(BountyAmount);
GO
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 140;
GO
SELECT COUNT(*) FROM dbo.Votes;
GO
```
The execution plan has our fancypants new columnstore index scan operator, and all of the operators in the plan are in batch mode.
I have to change my units of measure here:
- Pages read: 73,922
- CPU time: 15 milliseconds
- Duration: 21 milliseconds
Hubba hubba. Let’s put that in perspective: I know some developers who try to hit system tables in order to count rows quickly, and they can’t even generate speedy results like this.
So to make SELECT COUNT(*) queries fast, here’s what to do:
In descending order of preference & speed, with the best results first:
- Get on any version that supports batch mode on columnstore indexes, and put a columnstore index on the table – although your experiences are going to vary dramatically depending on the kind of query you have. To learn more about the specifics, read Niko’s series on columnstore indexes, specifically the posts with the word “batch” in the title.
- Get on SQL Server 2019 or newer, and put your database in compat level 150 (2019) – even with rowstore indexes, you can still cut your CPU usage dramatically thanks to batch mode on rowstore. This one’s really easy – it probably requires the fewest changes to your application and database schema – but you won’t get the amazing millisecond-level response times that a columnstore index can deliver.
Here are the demo scripts if you wanna play around with ’em, like adapting ’em to work on your own tables.
Comments
Great article!
I checked a few weeks ago for some ways to do counts quickly on SQL 2014 (to monitor a large table’s status) and found this article: https://blogs.msdn.microsoft.com/martijnh/2010/07/15/sql-serverhow-to-quickly-retrieve-accurate-row-count-for-table/ . Its limitation is that I cannot do filtering, but it was good enough to know the general count.
Now that I know this, I am adding a NONCLUSTERED COLUMNSTORE INDEX to my tool arsenal. Thanks!
My pleasure, and glad you liked it.
Does this mean you no longer advocate staying away from SS2019?
Rafi – read through the “In descending order of preference & speed, with the best results first:” section of the post really slowly, and think about where SQL Server 2019 comes in that list.
How wrong is sysindexes.rows ?
It should be kinda fast.
I’m not a fan of having users directly query system tables where accuracy, parallelism, and locking aren’t clear.
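For reference, the legacy query under discussion looks something like this – sys.sysindexes is a deprecated SQL Server 2000-era compatibility view, and its row counts aren’t guaranteed to be transactionally exact:

```sql
SELECT rows
FROM sys.sysindexes
WHERE id = OBJECT_ID('dbo.Votes')
  AND indid < 2; -- 0 = heap, 1 = clustered index
```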
Do you see a lot of users doing row count checks? I’m tired atm, so maybe I’m missing an extrapolation to real-world stuff. In my life, the only people who care much about row counts are DBAs who are okay with “roughly how many rows this week compared to last.” The topic looks interesting, and I remember mentally bookmarking columnstore indexing a few years back, but it never really cropped up as a useful solution, and now I’m wondering if I missed something important…
Simon – yep, thus the post. This stemmed from a client use case. It was a really neat one, but unfortunately due to those pesky NDAs, I can’t elaborate on what their app was doing. I try to share as much as I can here, though.
How does this compare to: select count(1) from Votes?
Makes no difference: https://stackoverflow.com/a/54192184/26837
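For the curious, it’s easy to check for yourself – run both with actual plans on and compare; the plans come out identical:

```sql
SELECT COUNT(*) FROM dbo.Votes;
SELECT COUNT(1) FROM dbo.Votes;
```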
The speed of columnstore is unparalleled, but it feels like index creation time has to be included for a fair comparison.
Yes, because every time you run a query, you also need to create a new index from scratch.
Wait, what? 😉
You’re assuming that there’s a recurring need to get a fast rowcount if you don’t include the index creation time – though to be fair, that’s probably the primary reason to care about count performance. Even then, I’d probably want to solve this differently through a recurring background job than take on the write overhead of a columnstore index just to get sub-100ms rowcounts.
Given the tradeoff, compatibility level 150 is the best option if SQL 2019 or Azure SQL DB is available.
Yes, the point of this entire post – as one might guess from the title – was to make the relevant queries fast.
If one did not have that need, then one might not embark on this journey.
Not everyone might gather that, though. The folks who might not gather that from the title would have my condolences, though, because they’re the kind of people who would eventually electrocute themselves when sticking their naked fingers in an electric socket because they didn’t have a warning label on it.
I sure am happy that I’m not friends with those kinds of morons.
Ummm… wait… you mean I should stop doing that to stay awake? 😀 😀 😀
An indexed view on COUNT_BIG(*) should be even faster. I’m not sure about the consequences for concurrency, though.
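A sketch of that idea: indexed views require SCHEMABINDING, and an aggregated indexed view requires a GROUP BY with COUNT_BIG(*), so grouping on a low-cardinality column like VoteTypeId keeps the materialized view tiny (the view and index names here are made up):

```sql
CREATE VIEW dbo.VotesCountByType
WITH SCHEMABINDING
AS
SELECT VoteTypeId, COUNT_BIG(*) AS vote_count
FROM dbo.Votes
GROUP BY VoteTypeId;
GO
CREATE UNIQUE CLUSTERED INDEX CIX_VotesCountByType
    ON dbo.VotesCountByType (VoteTypeId);
GO
/* The total count is then a sum over a handful of pre-aggregated rows
   (NOEXPAND makes non-Enterprise editions use the view's index): */
SELECT SUM(vote_count) AS total_rows
FROM dbo.VotesCountByType WITH (NOEXPAND);
```

The concurrency consequence the commenter wonders about is real: every insert, update, and delete against dbo.Votes also has to maintain the view’s index, which can add contention on hot groups.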
Thanks Brent…learned something new.
At first I thought “Oh, Brent is surely going after APPROX_COUNT_DISTINCT”…
Could you do another blog on this new way of counting as well?
I would be interested in your opinion on use cases for it.
Martin – ah no, counting distincts (even the new approx function) is way slower than just counting the number of rows. I don’t need a whole post to answer that one: that new function makes sense where you need distincts, but not exact accuracy, and you’re willing to sacrifice a little accuracy for speed.
I haven’t personally run into that use case, but I trust that there are folks like analysts who have.
Since I don’t know just how “approximate” that function is, I’ll probably never use it. I hate stuff like that.
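For reference, Microsoft documents the accuracy guarantee for APPROX_COUNT_DISTINCT as a 2% error rate within a 97% probability. A quick sketch comparing it against the exact version:

```sql
/* APPROX_COUNT_DISTINCT (SQL Server 2019+) trades a small, documented
   error margin for a much smaller memory grant on big distinct counts: */
SELECT APPROX_COUNT_DISTINCT(UserId) AS approx_distinct_voters,
       COUNT(DISTINCT UserId) AS exact_distinct_voters
FROM dbo.Votes;
```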
Hi Brent, I don’t have the Stack Overflow db on my laptop to check the scripts, so I have a question.
In step 4 you said, “It still uses the BountyAmount index and does the same number of reads as #3,”
but in the results you wrote “Pages read: 694,379” – that’s the same as #2, isn’t it?
You can download the database to find out.
I checked it: “Pages read” in step 4 is wrong. It should be 263,322.
Maybe it’s a typo.
For the sake of completeness: the fastest way to do a SELECT COUNT(*) without any limiting WHERE / JOIN would be to use the metadata:
```sql
SELECT SUM(dps.row_count) AS Row_Count
FROM sys.dm_db_partition_stats AS dps
WHERE dps.object_id = OBJECT_ID('dbo.Votes')
  AND dps.index_id <= 1 -- only heap or clustered index
```
This will do only 6 reads / 1 ms CPU / 0 sec duration, (almost) regardless of how big the table is (a few more reads if you are using partitioning).
Because it is a bit harder to write / remember, you could create a function / procedure / snippet / auto-text or whatever you prefer to make your daily work easier 🙂
Thomas – I’m not a big fan of that because the DMVs aren’t guaranteed to honor isolation level requests. Over the years, I’ve hit several DMVs that just flat out didn’t honor READ UNCOMMITTED or NOLOCK, and if you tried to get data out of them, you’d cause blocking that then in turn blocked other user queries. For example, there’s this gem: https://github.com/BrentOzarULTD/SQL-Server-First-Responder-Kit/issues/2176
BWAAA-HAAAA!!!! Sounds like a great way to find other, ummmmm… “troublesome code” (OMG! It actually caused physical pain to be so politically correct there!).
It’s those sneaky functions embedded underneath some of these DMVs that appear to run under a different isolation context no matter what you use – at least, the ones that have given me trouble blocking/getting blocked. I don’t see any functions used with sys.identity_columns, at least as far as what’s shown in the plan, so that one is a head-scratcher.
I think it’s funny… most of the time, the count is out of date as soon as it completes.