How to Add Nonclustered Indexes to Clustered Columnstore Indexes

Last Updated 7 years ago

Bad Idea Jeans, Columnstore Indexes, Indexing, SQL Server

SQL Server 2012 introduced nonclustered columnstore indexes, but I never saw them used in the wild simply because once created, they made the underlying table read-only. Not a lot of folks like read-only tables. (Bad news, by the way – that limitation hasn’t disappeared in 2014.)

SQL Server 2014 brings clustered columnstore indexes, and they’re totally writeable – you can insert, update, or delete into them. They’re best suited for wide data warehouse fact tables that have lots of columns, but your queries might only want a few of those columns, and they might pick any of the columns for filtering. These types of tables are notoriously difficult to index, and columnstore indexes can give you dramatic performance improvements here.

Books Online says you’d better be sure your access patterns all benefit from columnstore indexes, because you can’t add any nonclustered indexes to your columnstore tables. The CREATE CLUSTERED COLUMNSTORE INDEX syntax page explains that the clustered columnstore “is the only allowable index on the table,” meaning you can’t add non-clustered indexes.

Or can you?

Allow me to demonstrate what I call the Locke Technique:

CREATE TABLE dbo.SimpleTable(
    ProductKey [int] NOT NULL,
    OrderDateKey [int] NOT NULL,
    DueDateKey [int] NOT NULL,
    ShipDateKey [int] NOT NULL);
GO
CREATE CLUSTERED COLUMNSTORE INDEX cci_Simple ON dbo.SimpleTable;
GO

CREATE VIEW dbo.vw_SimpleTable WITH SCHEMABINDING
AS SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey
FROM dbo.SimpleTable;
GO

CREATE UNIQUE CLUSTERED INDEX CLIX_ProductKey ON dbo.vw_SimpleTable (ProductKey);
GO
CREATE NONCLUSTERED INDEX IX_OrderDateKey ON dbo.vw_SimpleTable(OrderDateKey);
GO

CREATE TABLE dbo.SimpleTable(

ProductKey [int] NOT NULL,

OrderDateKey [int] NOT NULL,

DueDateKey [int] NOT NULL,

ShipDateKey [int] NOT NULL);

CREATE CLUSTERED COLUMNSTORE INDEX cci_Simple ON dbo.SimpleTable;

CREATE VIEW dbo.vw_SimpleTable WITH SCHEMABINDING

AS SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey

FROM dbo.SimpleTable;

CREATE UNIQUE CLUSTERED INDEX CLIX_ProductKey ON dbo.vw_SimpleTable (ProductKey);

CREATE NONCLUSTERED INDEX IX_OrderDateKey ON dbo.vw_SimpleTable(OrderDateKey);

Presto – you can have as many nonclustered indexes as you need – and technically, in a strange twist, the clustered index on the view is a nonclustered index on the table as well.

These have the same drawbacks as any other nonclustered index: slower inserts/updates/deletes on the underlying table, more complex query tuning, more space requirements, and so on. They also have even more drawbacks because the schema-binding view means you can’t alter the columnstore table without first dropping the nonclustered indexes and the view.

After You’ve Been a DBA, What Do You Do?

Refactoring T-SQL with Windowing Functions

15 Comments. Leave new

hobs
March 26, 2014 8:34 am

That’s a good trick, even if it is one I should have thought of. (The best kind of trick)

Reply
- Brent Ozar
  March 26, 2014 8:35 am
  
  Thanks!
  
  Reply
tobi
March 26, 2014 9:19 am

Very good!

Makes me wonder why the query processor cannot maintain a regular NCI while it can maintain indexes on a view. Whatever implementation difficulties they had, they internally could have treated NCIs like a CI on a view. Well, it’s always easy to say these things as an outsider.

Reply
Andre Ranieri
March 26, 2014 5:24 pm

Whoa, Brent. That’s brilliant.

Reply
- Brent Ozar
  March 27, 2014 6:40 am
  
  Thanks sir!
  
  Reply
Adam
June 26, 2014 6:12 am

Unfortunately this method has a variety of bugs. In the most simplistic form, it works well enough, but once you add in table partitions and bulk loading with varchars, bugs start appearing. See my thread:

http://social.technet.microsoft.com/Forums/en-US/0a967ae8-a1fe-48e6-b251-88348ecdc281/sql-server-2014-columnstore-with-partitioningindexed-views-bug?forum=sqldatabaseengine

Reply
Nic Neufeld
March 18, 2015 9:37 am

I hate to ask such a silly question, but in your experience the optimizer does recognize the index and use them, right? I did some testing on a very similar scenario, with a query designed to be the perfect seek use of the index (in your scenario, SELECT OrderDateKey FROM [table or view] WHERE ProductKey = [a known constant value]), and it still goes back to expensively scanning the clustered columnstore, regardless of whether I name the original table, or the view, to select from. Even if I put in an index hint, SQL Server basically responds with a middle finger: Warning: Index hints supplied for view [view name] will be ignored.

We’re on Enterprise, and I had no problem creating the indexes, just haven’t been able to convince SQL Server to actually do anything with them yet! 🙂

Reply
- Brent Ozar
  March 18, 2015 9:38 am
  
  I actually don’t have any clients using columnstore. I haven’t found a good fit for it yet – I’m totally convinced the right shops are out there, but I just haven’t seen it be the right solution for the problems I’ve been facing. Absolutely nothing against the technology though.
  
  Reply
  - Nic Neufeld
    March 18, 2015 9:51 am
    
    It’s impressive, but odd, technology to be sure. The compression is fantastic. It definitely is a specialty tool that is very good for a few things and awful at a lot of things (ie SELECT *)…our use is for fact tables in a warehouse. We also tried doing date based partitioning as well against it to do partition elimination to make date range queries go even faster, but we ripped that out after we determined it was just too much mess and hassle with inconsistent gains in performance. Then you have to (or don’t have to? 😉 ) worry about maintenance/rebuilds, etc.
    
    Reply
Felipe Maurer
January 13, 2016 12:18 pm

I can’t think of a scenario where this makes sense. To my understanding, a columnstore index is recommended when accessind a large set of contiguous rows (for example, summarizing a column by the year of a date coulmn) and when the query retrieves a relatively small number of columns. Why would it make sense to index a column other than the ones included in the columnstore (clustered or nonclustered) index?

I’m really confused.

Reply
- Brent Ozar
  January 13, 2016 3:25 pm
  
  Felipe – Books Online does a pretty good job of explaining when a nonclustered B-tree index makes more sense for your queries than columnstore indexes:
  
  https://msdn.microsoft.com/en-us/library/dn935005.aspx
  
  Reply
  - Felipe Maurer
    January 14, 2016 7:58 am
    
    Thanks!!!
    
    Reply
Mark Terry
February 19, 2018 6:42 pm

The Unique Clustered Index makes you lose all the storage savings from using the Clustered Column Store Index. Just tried this on a table which had compressed to 6GB with Clustered Columnstore Index, when I added the Unique Clustered Index on the Shema Bound view it created statics of 146GB!!! And I only indexed snapshot date and source key.

Reply
- Mark Terry
  February 19, 2018 6:43 pm
  
  Also, I would add that the logical reads were (slightly) greater against the view with Clustered index than just querying the columnstore table.
  
  Reply
SQL Sherpa
September 27, 2018 2:19 pm

For anyone reading this in 2018 or later. SQL Server 2016 and greater does allow row based non-clustered indexes in addition to the columnstore clustered index on the same table. I happened to accidentally stumble upon this while using my recently upgraded server!
“A clustered columnstore index can have one or more nonclustered rowstore indexes. Previously, the columnstore index did not support nonclustered indexes. SQL Server automatically maintains the nonclustered indexes for DML operations.” https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-what-s-new?view=sql-server-2017

Reply