SQL Server 2012 introduced nonclustered columnstore indexes, but I never saw them used in the wild simply because once created, they made the underlying table read-only. Not a lot of folks like read-only tables. (Bad news, by the way – that limitation hasn’t disappeared in 2014.)
SQL Server 2014 brings clustered columnstore indexes, and they’re totally writeable – you can insert, update, or delete into them. They’re best suited for wide data warehouse fact tables that have lots of columns, but your queries might only want a few of those columns, and they might pick any of the columns for filtering. These types of tables are notoriously difficult to index, and columnstore indexes can give you dramatic performance improvements here.Books Online says you’d better be sure your access patterns all benefit from columnstore indexes, because you can’t add any nonclustered indexes to your columnstore tables. The CREATE CLUSTERED COLUMNSTORE INDEX syntax page explains that the clustered columnstore “is the only allowable index on the table,” meaning you can’t add non-clustered indexes.
Or can you?
Allow me to demonstrate what I call the Locke Technique:
CREATE TABLE dbo.SimpleTable(
ProductKey [int] NOT NULL,
OrderDateKey [int] NOT NULL,
DueDateKey [int] NOT NULL,
ShipDateKey [int] NOT NULL);
CREATE CLUSTERED COLUMNSTORE INDEX cci_Simple ON dbo.SimpleTable;
CREATE VIEW dbo.vw_SimpleTable WITH SCHEMABINDING
AS SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey
CREATE UNIQUE CLUSTERED INDEX CLIX_ProductKey ON dbo.vw_SimpleTable (ProductKey);
CREATE NONCLUSTERED INDEX IX_OrderDateKey ON dbo.vw_SimpleTable(OrderDateKey);
Presto – you can have as many nonclustered indexes as you need – and technically, in a strange twist, the clustered index on the view is a nonclustered index on the table as well.
These have the same drawbacks as any other nonclustered index: slower inserts/updates/deletes on the underlying table, more complex query tuning, more space requirements, and so on. They also have even more drawbacks because the schema-binding view means you can’t alter the columnstore table without first dropping the nonclustered indexes and the view.
That’s a good trick, even if it is one I should have thought of. (The best kind of trick)
Makes me wonder why the query processor cannot maintain a regular NCI while it can maintain indexes on a view. Whatever implementation difficulties they had, they internally could have treated NCIs like a CI on a view. Well, it’s always easy to say these things as an outsider.
Whoa, Brent. That’s brilliant.
Unfortunately this method has a variety of bugs. In the most simplistic form, it works well enough, but once you add in table partitions and bulk loading with varchars, bugs start appearing. See my thread:
I hate to ask such a silly question, but in your experience the optimizer does recognize the index and use them, right? I did some testing on a very similar scenario, with a query designed to be the perfect seek use of the index (in your scenario, SELECT OrderDateKey FROM [table or view] WHERE ProductKey = [a known constant value]), and it still goes back to expensively scanning the clustered columnstore, regardless of whether I name the original table, or the view, to select from. Even if I put in an index hint, SQL Server basically responds with a middle finger: Warning: Index hints supplied for view [view name] will be ignored.
We’re on Enterprise, and I had no problem creating the indexes, just haven’t been able to convince SQL Server to actually do anything with them yet! 🙂
I actually don’t have any clients using columnstore. I haven’t found a good fit for it yet – I’m totally convinced the right shops are out there, but I just haven’t seen it be the right solution for the problems I’ve been facing. Absolutely nothing against the technology though.
It’s impressive, but odd, technology to be sure. The compression is fantastic. It definitely is a specialty tool that is very good for a few things and awful at a lot of things (ie SELECT *)…our use is for fact tables in a warehouse. We also tried doing date based partitioning as well against it to do partition elimination to make date range queries go even faster, but we ripped that out after we determined it was just too much mess and hassle with inconsistent gains in performance. Then you have to (or don’t have to? 😉 ) worry about maintenance/rebuilds, etc.
I can’t think of a scenario where this makes sense. To my understanding, a columnstore index is recommended when accessind a large set of contiguous rows (for example, summarizing a column by the year of a date coulmn) and when the query retrieves a relatively small number of columns. Why would it make sense to index a column other than the ones included in the columnstore (clustered or nonclustered) index?
I’m really confused.
Felipe – Books Online does a pretty good job of explaining when a nonclustered B-tree index makes more sense for your queries than columnstore indexes:
The Unique Clustered Index makes you lose all the storage savings from using the Clustered Column Store Index. Just tried this on a table which had compressed to 6GB with Clustered Columnstore Index, when I added the Unique Clustered Index on the Shema Bound view it created statics of 146GB!!! And I only indexed snapshot date and source key.
Also, I would add that the logical reads were (slightly) greater against the view with Clustered index than just querying the columnstore table.
For anyone reading this in 2018 or later. SQL Server 2016 and greater does allow row based non-clustered indexes in addition to the columnstore clustered index on the same table. I happened to accidentally stumble upon this while using my recently upgraded server!
“A clustered columnstore index can have one or more nonclustered rowstore indexes. Previously, the columnstore index did not support nonclustered indexes. SQL Server automatically maintains the nonclustered indexes for DML operations.” https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-what-s-new?view=sql-server-2017