Is this really happening?

Sometimes I run into a query plan that just shocks me. It’s like looking up and suddenly seeing an adult dressed as a squirrel riding a unicycle down a busy city street. You have to stop and ask yourself, “Did that really just happen?” (I live in Portland, Oregon, so yeah, I’ve seen that.)

A while back I blogged about how to write a query to demonstrate a big memory grant against the AdventureWorks2012 database. The trick was to stuff the query’s joins full of functions so that the query optimizer became baffled about how many rows might come out of the join. The query is terrible, but it’s useful for demonstrating some problematic situations with memory on a test server.

Recently I dug out the query to set up a demo on SQL Server 2014, and something strange happened: SQL Server saw right through the awfulness of my query. It made my horrible T-SQL fast.

Let’s feed my terrible query to SQL Server 2014

First, make sure the new cardinality estimator will be used for your query. You do this by setting the database compatibility level to 120:

ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL=120;
GO
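
To double-check that the change stuck, you can query sys.databases (a quick sanity check, assuming you're working against the AdventureWorks2012 sample database):

-- Confirm the compatibility level is now 120
SELECT name, compatibility_level
FROM sys.databases
WHERE name = 'AdventureWorks2012';
GO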

Now we’ll run this terrible, awful, no good query:

SELECT e.[BusinessEntityID],
       p.[Title],
       p.[FirstName],
       p.[MiddleName],
       p.[LastName],
       p.[Suffix],
       e.[JobTitle],
       pp.[PhoneNumber],
       pnt.[Name] AS [PhoneNumberType],
       ea.[EmailAddress],
       p.[EmailPromotion],
       a.[AddressLine1],
       a.[AddressLine2],
       a.[City],
       sp.[Name] AS [StateProvinceName],
       a.[PostalCode],
       cr.[Name] AS [CountryRegionName],
       p.[AdditionalContactInfo]
FROM   [HumanResources].[Employee] AS e
       INNER JOIN [Person].[Person] AS p
       ON RTRIM(LTRIM(p.[BusinessEntityID])) = RTRIM(LTRIM(e.[BusinessEntityID]))
       INNER JOIN [Person].[BusinessEntityAddress] AS bea
       ON RTRIM(LTRIM(bea.[BusinessEntityID])) = RTRIM(LTRIM(e.[BusinessEntityID]))
       INNER JOIN [Person].[Address] AS a
       ON RTRIM(LTRIM(a.[AddressID])) = RTRIM(LTRIM(bea.[AddressID]))
       INNER JOIN [Person].[StateProvince] AS sp
       ON RTRIM(LTRIM(sp.[StateProvinceID])) = RTRIM(LTRIM(a.[StateProvinceID]))
       INNER JOIN [Person].[CountryRegion] AS cr
       ON RTRIM(LTRIM(cr.[CountryRegionCode])) = RTRIM(LTRIM(sp.[CountryRegionCode]))
       LEFT OUTER JOIN [Person].[PersonPhone] AS pp
       ON RTRIM(LTRIM(pp.BusinessEntityID)) = RTRIM(LTRIM(p.[BusinessEntityID]))
       LEFT OUTER JOIN [Person].[PhoneNumberType] AS pnt
       ON RTRIM(LTRIM(pp.[PhoneNumberTypeID])) = RTRIM(LTRIM(pnt.[PhoneNumberTypeID]))
       LEFT OUTER JOIN [Person].[EmailAddress] AS ea
       ON RTRIM(LTRIM(p.[BusinessEntityID])) = RTRIM(LTRIM(ea.[BusinessEntityID]));
GO

Shazam, it finishes using only 63 ms of CPU time. It asks for a 27MB memory grant and estimates 290 rows (quite accurately).

[Screenshot: top of the execution plan under the new cardinality estimator]

The execution plan contains warnings that “Type conversion in expression … may affect CardinalityEstimate in query plan choice,” but wow, it really did a remarkably good job with this!
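
If you want to reproduce the CPU timing yourself, a minimal sketch is to turn on time statistics before running the query; the CPU and elapsed milliseconds show up on the Messages tab:

SET STATISTICS TIME ON;
GO
-- run the terrible query from above here
SET STATISTICS TIME OFF;
GO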

Using OPTION(QUERYTRACEON 9481) to test Compatibility Level 110

There are two ways to go back to the old cardinality estimator on SQL Server 2014. The first is to change the whole database's compatibility level back to 110, like this:

ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL=110;
GO

But that changes it for everything in the whole database. We might just want to see how THIS query would behave using the old cardinality estimator, but still on SQL Server 2014. You can do that by adding OPTION(QUERYTRACEON 9481) to the very end of our gruesome query:

SELECT e.[BusinessEntityID],
       p.[Title],
       p.[FirstName],
       p.[MiddleName],
       p.[LastName],
       p.[Suffix],
       e.[JobTitle],
       pp.[PhoneNumber],
       pnt.[Name] AS [PhoneNumberType],
       ea.[EmailAddress],
       p.[EmailPromotion],
       a.[AddressLine1],
       a.[AddressLine2],
       a.[City],
       sp.[Name] AS [StateProvinceName],
       a.[PostalCode],
       cr.[Name] AS [CountryRegionName],
       p.[AdditionalContactInfo]
FROM   [HumanResources].[Employee] AS e
       INNER JOIN [Person].[Person] AS p
       ON RTRIM(LTRIM(p.[BusinessEntityID])) = RTRIM(LTRIM(e.[BusinessEntityID]))
       INNER JOIN [Person].[BusinessEntityAddress] AS bea
       ON RTRIM(LTRIM(bea.[BusinessEntityID])) = RTRIM(LTRIM(e.[BusinessEntityID]))
       INNER JOIN [Person].[Address] AS a
       ON RTRIM(LTRIM(a.[AddressID])) = RTRIM(LTRIM(bea.[AddressID]))
       INNER JOIN [Person].[StateProvince] AS sp
       ON RTRIM(LTRIM(sp.[StateProvinceID])) = RTRIM(LTRIM(a.[StateProvinceID]))
       INNER JOIN [Person].[CountryRegion] AS cr
       ON RTRIM(LTRIM(cr.[CountryRegionCode])) = RTRIM(LTRIM(sp.[CountryRegionCode]))
       LEFT OUTER JOIN [Person].[PersonPhone] AS pp
       ON RTRIM(LTRIM(pp.BusinessEntityID)) = RTRIM(LTRIM(p.[BusinessEntityID]))
       LEFT OUTER JOIN [Person].[PhoneNumberType] AS pnt
       ON RTRIM(LTRIM(pp.[PhoneNumberTypeID])) = RTRIM(LTRIM(pnt.[PhoneNumberTypeID]))
       LEFT OUTER JOIN [Person].[EmailAddress] AS ea
       ON RTRIM(LTRIM(p.[BusinessEntityID])) = RTRIM(LTRIM(ea.[BusinessEntityID]))
OPTION(QUERYTRACEON 9481);

Retesting the query with the old cardinality estimator… Ouch! The query uses 84,109 ms of CPU time and the execution plan is back to a world of confusion, thinking it’s going to have to handle a kazillion rows:

[Screenshot: execution plan with wildly high row estimates under the old cardinality estimator]
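
If you'd like to watch that oversized memory grant while the slow version runs, one option (just a sketch, not the only way) is to peek at sys.dm_exec_query_memory_grants from a second session:

-- Run from another session while the QUERYTRACEON 9481 query is executing
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       used_memory_kb,
       query_cost
FROM sys.dm_exec_query_memory_grants;
GO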

Hey there new cardinality estimator, I’d like to get to know you better.

I’m not saying the new cardinality estimator will be better at every query, that it won’t have any regressions, or that you should start putting functions around all your joins.

But it’s pretty remarkable when the optimizer takes code that you wrote to be as terrible as possible, and suddenly makes it fast. Sign me up for more of that.

Comments
  1. Joe Sack covered the new CE at #SQLSaturday in Madison – quite interesting to say the least. I can’t believe it’s the first major update to the CE engine since the 7.0 days! The one takeaway I got from his presentation was the last sentence on the last slide (as you too alluded to): “If you don’t have time to test, don’t enable the new CE in production until you have.” While the vast majority of performance should improve, there undoubtedly will be some portions of your workload that will benefit from the old CE.

    • I’m always a fan of testing, but just as a counterpoint: doing a full code regression isn’t a simple task, and there are some applications with a higher tolerance of risk than others.

      The fact that you can use trace flags with QUERYTRACEON to control the behavior of certain queries to use the lower compatibility level (or higher) is a great thing, and I think it’ll make the transition less difficult for many people than, say, the transition from compat 80 to higher levels. Man, that wasn’t fun.
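
      For reference, the hint in the other direction exists too: if a database is still at compatibility level 110, trace flag 2312 asks for the new cardinality estimator for just that one query. A quick sketch (QUERYTRACEON requires elevated permissions, so test accordingly):

      -- Database at compatibility level 110, but this one query tries the new CE
      SELECT p.FirstName, p.LastName
      FROM Person.Person AS p
      OPTION (QUERYTRACEON 2312);
      GO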

      • I agree – the biggest issue (one of many) was with the use of NOLOCK w/out WITH. Many developers used the old syntax and all heck broke loose when I changed the compatibility level one fine day. At least it was in Dev ;-)

  2. Hey – really interesting post!

    Has anyone come across any information on how the CE actually manages to pull this off?

    How can it know how many matches will be found between *nested* function calls on *both* sides of a join? I guess using basic information on typical correlation between tables?

    I’m sure it’s quite elegant when you know the trick, but at the moment it seems indistinguishable from magic! ;-)

    Best Regards

    S

    • I haven’t really looked. Honestly, I assume it actually IS magic.

      If that’s not the case, it may be revealed in an upcoming whitepaper by the SQL Server product team (referenced here: http://blogs.msdn.com/b/psssql/archive/2014/04/01/sql-server-2014-s-new-cardinality-estimator-part-1.aspx)

    • In previous versions of the cardinality estimator, SQL Server punted on this estimation and it used some wildly bad guesses about the number of rows being returned out of the function.

      In terms of estimating cardinality, a nested scalar function call is no different than a single function call. Mathematically speaking, f(g(x)) is the same as z(x) as long as z(x) does the same thing – nesting function calls in an RDBMS is no different as long as the result of one or more built-ins does the same thing.

      The advantage SQL Server 2014 has is that it immediately punts to using hash joins instead of merge and nested loop joins. I’m not sure why, with the old estimator’s terrible guesses, SQL Server still picks merge or nested loop joins. (Note that UDFs don’t behave in the same way because they are pure evil.)

      There really are two things at play:

      1) The Cardinality Estimator is making much more accurate estimates about the rows being returned.
      2) The Query Optimizer is making much more interesting guesses about how to deal with these potentially poorly estimated data sets.

      With the choice of a hash join, SQL Server can take advantage of a number of optimizations.

      As far as figuring out the trick, I suspect you’d want to start your search by looking for papers coming out of the University of Wisconsin’s computer science department and/or Microsoft Research.

      Chengkai Li’s Set Predicates in SQL: Enabling Set-Level Comparisons for Dynamically Formed Groups looks like it might be a great start, but some of the other research is in a for-pay article, Testing cardinality estimation models in SQL Server.

      That second paper makes reference to a faster, improved join estimation model where the CE assumes that all distinct values from the child table are contained in the parent set. It seems to me that SQL Server may have this simple join model optimization built in, but that’s just speculation.
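
      One way to untangle those two effects (just a sketch, nothing I’ve benchmarked): run the function-wrapped join under the old estimator but force hash joins, and see how much of the improvement comes back from the join choice alone.

      -- Compact version of the demo join: old CE via trace flag 9481,
      -- but with hash joins forced
      SELECT e.BusinessEntityID, p.LastName
      FROM HumanResources.Employee AS e
           INNER JOIN Person.Person AS p
           ON RTRIM(LTRIM(p.BusinessEntityID)) = RTRIM(LTRIM(e.BusinessEntityID))
      OPTION (HASH JOIN, QUERYTRACEON 9481);
      GO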

      • Are UDFs in 2014 still as bad as they used to be? Especially table-valued UDFs (and table variables as well), for which cardinality estimation is awful and performance degrades as soon as more than just a few rows are returned?
        Thanks

        • The problem with any UDF is that it’s not a physical object – it’s a function. Therefore there is no data to create histograms or statistics on – they are a black box from a cardinality estimation perspective. If you are returning a lot of rows from a UDF and joining to it, etc. – you can safely assume it won’t perform very well.

          I’m sure others may be able to add more.

          • I know… And it’s a big mess when you realize too late, in a heavily loaded production database :)

            Just wondering if there were some other kind of magic for UDFs in SQL 2014.

            Thank you!

          • Adam Machanic demonstrated a really cool way to trick the query optimizer at SQL Saturday 291 in Chicago a couple weekends ago. You can find the scripts in his blog: http://sqlblog.com/blogs/adam_machanic/. I think his method could be used to make the query optimizer consider a more realistic number of rows returned from a UDF.
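
            Not Adam’s trick, but another common workaround for multi-statement TVFs: materialize the function’s output into a temp table first, so the optimizer has real statistics to join against. A hypothetical sketch (dbo.MyTvf and dbo.SomeBigTable are made-up names standing in for your own objects):

            -- Hypothetical: dbo.MyTvf stands in for your multi-statement TVF
            DECLARE @SomeParameter int = 1;

            SELECT *
            INTO #TvfResults
            FROM dbo.MyTvf(@SomeParameter);

            -- The temp table has real statistics, so joins to it estimate far better
            SELECT t.*
            FROM #TvfResults AS t
                 INNER JOIN dbo.SomeBigTable AS b ON b.SomeKey = t.SomeKey;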

    • Simon – if you really want to read up on CE, as I mentioned in my other post, Joe Sack has quite a few detailed blog posts on the topic.

      • Thanks matey – just reading through them now. The slide deck that Joe uploaded looks like information gold dust at the moment – at least until the MS whitepaper comes out.

        Cheers

        S

  3. Pingback: SQL Server Radio: Show 14 – Simplification

  4. Pingback: Enabling the New Cardinality Estimator in SQL Server 2014 | Michael J. Swart

  5. Pingback: Enabling the New Cardinality Estimator in SQL Server 2014 - SQL Server - SQL Server - Toad World

  6. This is an example where 2014’s CE makes things better, but we’ve got an example where it makes it worse.

    We have a statement which works just fine on SQL 2005, 2008, 2008 R2, and 2012 (and on 2014 with OPTION (QUERYTRACEON 9481)), but which grinds to a halt on 2014 – going from a second or two to nearly 2 minutes.

    Has anybody else experienced a worsening of performance in 2014? We’re considering what to advise customers when they upgrade their servers to 2014.

    • Oh, absolutely people have had results where it hasn’t been better. For any application, you want to test it carefully before turning it on, and you might want to only enable or disable it for specific queries.
