If you’ve been following the blogosphere, or heading to SQL Saturdays, you’ll know that many folks are excited about the brand new SQL Server cardinality estimator (CE for short). As Kendra showed in her article The SQL 2014 Cardinality Estimator Eats Bad TSQL for Breakfast, the new CE provides dramatically improved query plans.
What if you want to know more about the new CE? What’s your option? Well, a lot of bloggers, presenters, and crazy people (I’m looking at you Paul White), will observe how the CE behaves by building carefully contrived scenarios and verifying the outcome. We can observe the CE in the same way that we observe the optimizer – by observing the side effects.
What if you want to know why something happened? To get an answer to that, you’d have to go directly to the source. Thankfully, the source has published their work in the paper Testing cardinality estimation models in SQL Server. Warning – this paper does cost money if you aren’t a member of the ACM.
Why would you want to go and read an academic paper like this one? Well, it contains a lot of juicy information about what the SQL Server product team considered in terms of possible optimization paths for the CE. One item of interest is the concept of a simple join optimization. To summarize – the simple join ignores histograms and assumes that all distinct values from the child are contained in the other table. This could cause some query estimation issues, but there will be faster optimization. The authors note that for some workloads, this will be much more desirable.
If you’re not up for an academic paper or two, Joe Sack has put together an in depth look in Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator. It’s well researched, peer reviewed, and based on observable verifiable information that’s in the product right now.