Blog

SQL Server 2012 introduced nonclustered columnstore indexes, but I never saw them used in the wild simply because once created, they made the underlying table read-only. Not a lot of folks like read-only tables. (Bad news, by the way – that limitation hasn’t disappeared in 2014.)

SQL Server 2014 brings clustered columnstore indexes, and they’re totally writeable – you can insert, update, or delete into them. They’re best suited for wide data warehouse fact tables that have lots of columns, but your queries might only want a few of those columns, and they might pick any of the columns for filtering. These types of tables are notoriously difficult to index, and columnstore indexes can give you dramatic performance improvements here.

Inventor of The Locke Technique

Inventor of The Locke Technique

Books Online says you’d better be sure your access patterns all benefit from columnstore indexes, because you can’t add any nonclustered indexes to your columnstore tables. The CREATE CLUSTERED COLUMNSTORE INDEX syntax page explains that the clustered columnstore “is the only allowable index on the table,” meaning you can’t add non-clustered indexes.

Or can you?

Allow me to demonstrate what I call the Locke Technique:

CREATE TABLE dbo.SimpleTable(
    ProductKey [int] NOT NULL,
    OrderDateKey [int] NOT NULL,
    DueDateKey [int] NOT NULL,
    ShipDateKey [int] NOT NULL);
GO
CREATE CLUSTERED COLUMNSTORE INDEX cci_Simple ON dbo.SimpleTable;
GO

CREATE VIEW dbo.vw_SimpleTable WITH SCHEMABINDING
AS SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey
FROM dbo.SimpleTable;
GO

CREATE UNIQUE CLUSTERED INDEX CLIX_ProductKey ON dbo.vw_SimpleTable (ProductKey);
GO
CREATE NONCLUSTERED INDEX IX_OrderDateKey ON dbo.vw_SimpleTable(OrderDateKey);
GO

Presto – you can have as many nonclustered indexes as you need – and technically, in a strange twist, the clustered index on the view is a nonclustered index on the table as well.

These have the same drawbacks as any other nonclustered index: slower inserts/updates/deletes on the underlying table, more complex query tuning, more space requirements, and so on. They also have even more drawbacks because the schema-binding view means you can’t alter the columnstore table without first dropping the nonclustered indexes and the view.

↑ Back to top
  1. That’s a good trick, even if it is one I should have thought of. (The best kind of trick)

  2. Very good!

    Makes me wonder why the query processor cannot maintain a regular NCI while it can maintain indexes on a view. Whatever implementation difficulties they had, they internally could have treated NCIs like a CI on a view. Well, it’s always easy to say these things as an outsider.

  3. Whoa, Brent. That’s brilliant.

  4. Pingback: (SFTW) SQL Server Links 28/03/14 • John Sansom

  5. Unfortunately this method has a variety of bugs. In the most simplistic form, it works well enough, but once you add in table partitions and bulk loading with varchars, bugs start appearing. See my thread:

    http://social.technet.microsoft.com/Forums/en-US/0a967ae8-a1fe-48e6-b251-88348ecdc281/sql-server-2014-columnstore-with-partitioningindexed-views-bug?forum=sqldatabaseengine

  6. I hate to ask such a silly question, but in your experience the optimizer does recognize the index and use them, right? I did some testing on a very similar scenario, with a query designed to be the perfect seek use of the index (in your scenario, SELECT OrderDateKey FROM [table or view] WHERE ProductKey = [a known constant value]), and it still goes back to expensively scanning the clustered columnstore, regardless of whether I name the original table, or the view, to select from. Even if I put in an index hint, SQL Server basically responds with a middle finger: Warning: Index hints supplied for view [view name] will be ignored.

    We’re on Enterprise, and I had no problem creating the indexes, just haven’t been able to convince SQL Server to actually do anything with them yet! :)

    • I actually don’t have any clients using columnstore. I haven’t found a good fit for it yet – I’m totally convinced the right shops are out there, but I just haven’t seen it be the right solution for the problems I’ve been facing. Absolutely nothing against the technology though.

      • It’s impressive, but odd, technology to be sure. The compression is fantastic. It definitely is a specialty tool that is very good for a few things and awful at a lot of things (ie SELECT *)…our use is for fact tables in a warehouse. We also tried doing date based partitioning as well against it to do partition elimination to make date range queries go even faster, but we ripped that out after we determined it was just too much mess and hassle with inconsistent gains in performance. Then you have to (or don’t have to? 😉 ) worry about maintenance/rebuilds, etc.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

css.php