Lock Escalation Sucks on Columnstore Indexes.

If you’ve got a regular rowstore table and you need to modify thousands of rows, you can use the fast ordered delete technique to delete rows in batches without hitting the lock escalation threshold. That’s great for rowstore indexes, but…columnstore indexes are different.

To demo this technique, I’m going to use the setup from my Fundamentals of Columnstore class:

  • An 4-core, 30GB RAM VM with the data & log files on ephemeral (fast) SSD
  • SQL Server 2019 Cumulative Update 11
  • The Stack Overflow public database as of 2018-06 (about 180GB)
  • The dbo.Votes table – which has 151M rows

I’ll create a new copy of the Votes table with a clustered columnstore index:

Let’s say we’ve decided to no longer use VoteTypeId 12, Spam, and we need to delete those 29,852 rows.

The bad way: plain ol’ delete

I’m doing this inside a transaction only so we can slow it down to see the locks that are held open:

The locks output from sp_WhoIsActive @get_locks = 1 looks long, but only because there are so many rowgroups involved. I’ve trimmed it here to just show a few rowgroups:

The important lock to be aware of is the HOBT X: that’s an exclusive lock on the table. That’s the part that I need to avoid: in high concurrency situations, it may not be easy to get that kind of a lock on the table.

The good way: Fast Ordered Deletes

After rolling back that transaction, let’s try again with the fast ordered deletes technique. This time around, I’m using a CTE rather than the view technique I used in that last blog post:

This time, I don’t even have to trim sp_WhoIsActive’s output. It’s way shorter because there are less rowgroups involved:

Note these lines in particular:

Awww yeah: we only have intent-exclusive locks on the object, and the exclusive locks only involve 1,000 of the keys. Yay! We’ve avoided table-level locking, so now we get better concurrency on the object.

But…it doesn’t really work.

The real test of the fast ordered delete technique is the ability to run queries on the other 150,783,380 rows that we’re not deleting. (Remember, we’re only deleting 1,000 rows.) And here’s the bummer: all of these queries are blocked:

I can get single rows if I specify the Id, and the row isn’t in a locked rowgroup:

But if I specify a row in a locked filegroup, that row is locked – even if the VoteTypeId isn’t 12. This is blocked:

I can prove that it isn’t VoteTypeId by running it with NOLOCK:

Columnstore lock escalation gets ugly.

As far as our users are concerned, even with just 1,000 rows involved in a transaction, the entire table is practically locked for most kinds of queries. Queries can only succeed if SQL Server can do rowgroup elimination to bypass the locked rowgroups altogether – but even that is practically impossible because even the simplest queries, like top 10 by Id ascending, won’t eliminate rowgroups here.

This is one of the many reasons columnstore isn’t really a good fit for most transactional workloads. Don’t get me wrong – I looove columnstore indexes – but if your workloads involve concurrency, and even more so if they involve modifying thousands of rows at a time, you’re going to need to do special table design like partitioning and sliding window loads in order to get the performance you’re already used to with rowstore indexes.

Fundamentals of Columnstore IndexesFor more learning, check out:

Previous Post
Learn Fundamentals of Columnstore for $1.
Next Post
Store Files in a File System, Not a Relational Database.

1 Comment. Leave new

  • George Walkey
    June 29, 2021 1:51 pm

    Yep Brent,
    I tried out NonCLCStore when it came out back in 2012 on a few “highly contentious” tables with a bunch of regular rowstore NCCIX already on the table “just to speed things up”…….mmmmm…..bang

    Star Schemas
    Power BI

    not good ole OLTP


Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.