
Imagine you work for a zoo, and you need to ship animals to another zoo. If you're told one shipment will be a mouse and the other an elephant, you'll handle them very differently: the Detroit Zoo used a moving van to move two elephants, while a mouse can be put in a box and driven or flown.

Something Smells Funny

And it’s not the elephant. SQL Server uses a process called parameter sniffing when it executes stored procedures that have – you guessed it – parameters. When the procedure is compiled or recompiled, the value passed into the parameter is evaluated and used to create an execution plan. That value is then stored with the execution plan in the plan cache. On subsequent executions, that same value – and same plan – is used.

This is a normal, expected behavior in SQL Server. Because compiling queries is expensive, you want plans stored in the cache. You want SQL Server to re-use them as much as possible.

But what happens when the values in a table you're querying aren't evenly distributed? What if one value would return 10 rows and another would return 10,000 rows, or 10 million? I call this the elephant and the mouse problem. You would handle one animal differently than the other, and SQL Server might create different plans for the queries. But it doesn't, because you're using parameters.

What will happen is that the first time the procedure is run and the plan is compiled, whatever value is passed in is stored with the plan. Every time it’s executed, until it’s recompiled, the same value and plan will be used – regardless of whether it is the fastest or best plan for that value. If this is happening to you, and causing performance problems, there are ways to deal with it.
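You can watch this plan reuse happening on your own instance, because the plan cache is queryable. The following is a minimal sketch, assuming you have VIEW SERVER STATE permission; it uses the standard sys.dm_exec_cached_plans and sys.dm_exec_sql_text DMVs.

```sql
-- Show cached stored procedure plans and how often each has been reused.
-- A usecounts value that keeps climbing on one plan is the reuse
-- behavior described above.
SELECT TOP (20)
    st.text       AS procedure_text,
    cp.usecounts  AS times_reused,
    cp.size_in_bytes
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Proc'
ORDER BY cp.usecounts DESC;
```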

Parameter Sniffing in Action

I’m going to run a few queries to show you this behavior. I’ll be using AdventureWorks2012 on a SQL Server 2012 instance.

I’m going to query the Sales.SalesOrderDetail table. I want to know the order numbers and quantity ordered for specific products. My first set of queries will use literal values in the WHERE clause. When using literal values, SQL Server will compile each separately, and store a separate execution plan for each.

SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = 897;

SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = 945; 

SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = 870;

The same query shape was run three times, with different literal values. The queries returned 2, 257, and 4,688 rows, respectively. Two different execution plans were created, even though the queries were nearly identical!

[Figure: execution plans for the three queries – a nonclustered index seek plus key lookup for the two smaller result sets, a clustered index scan for the largest]

At some point, the optimizer decided it was faster to do a clustered index scan, instead of a nonclustered index seek and a key lookup.
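You can see the skew that drives this decision by counting rows per product. This is just a sketch against the same AdventureWorks2012 table:

```sql
-- Row counts per ProductID: the "mice" return a handful of rows,
-- the "elephants" return thousands.
SELECT ProductID, COUNT(*) AS RowsPerProduct
FROM Sales.SalesOrderDetail
GROUP BY ProductID
ORDER BY RowsPerProduct DESC;
```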

How does this behave when parameterized? I'll create a stored procedure to test this.

CREATE PROCEDURE Get_OrderID_OrderQty 
 @ProductID INT
AS 
 
SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = @ProductID; 

I’m going to execute this stored procedure for the first time with the “elephant” – product ID 870, which returns 4,688 rows.

EXEC Get_OrderID_OrderQty @ProductID=870

The data is retrieved through a clustered index scan once again.

[Figure: execution plan showing a clustered index scan]

Now, I’ll execute this stored procedure with the “mouse” – product ID 897, which returns 2 rows.

EXEC Get_OrderID_OrderQty @ProductID=897

This time, instead of using a nonclustered index seek and a key lookup, the values are retrieved with a clustered index scan. Also note that the estimated number of rows and actual number of rows are very different!

[Figure: execution plan showing a clustered index scan, with estimated and actual row counts far apart]

What happened? This is parameter sniffing in action. The value sniffed at compile time is stored with the execution plan, and that plan is reused regardless of what value is passed in later. I can verify this by right-clicking the execution plan and selecting “Show Execution Plan XML”. In the XML, I search for “ParameterCompiledValue”. I find the following line.

[Figure: plan XML line showing ParameterCompiledValue="(870)"]

The compiled value is 870. Until the stored procedure is recompiled, this is the value that will be used.
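You don't have to open the graphical plan to check this. As a sketch (again assuming VIEW SERVER STATE permission), you can pull the cached plan XML for the procedure straight from the DMVs and search it for ParameterCompiledValue:

```sql
-- Retrieve the cached plan XML for the stored procedure. Clicking the
-- query_plan column in SSMS opens it; the XML contains the
-- ParameterCompiledValue attribute for @ProductID.
SELECT st.objectid, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.objectid = OBJECT_ID('Get_OrderID_OrderQty');
```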

Is This Bad?

The real question is, “Is this bad?” It isn’t – until it is. Depending on the query and the data, each execution of this query may return the results in an acceptable amount of time. It is only when query performance reaches an unacceptable threshold that you may need to do something to prevent it from happening.

When Parameter Sniffing Stinks

When parameter sniffing is negatively affecting a stored procedure, what can you do to fix it?

The first option is to do nothing. You can explain the behavior of SQL Server to your end users. They might even listen! But they won’t be happy. They need their data, and they need it faster.

Other options you have are to recompile the stored procedure each time, use a query hint to produce a good enough plan, or do some creative coding. None of these are perfect options – they all have drawbacks. Understand the benefits and drawbacks of each option. Test the variations carefully in a development environment. The last thing you want to do is make the problem worse!

Recompiling

You can force SQL Server to recompile the stored procedure each time it is run. The benefit here is that the best query plan will be created each time it is run. However, recompiling is a CPU-intensive operation. This may not be an ideal solution for stored procedures that are run frequently, or on a server that is constrained by CPU resources already. Another thing to remember is that the plans won’t be stored in the cache, which makes them harder to find if they are problematic.

To show this, I'm going to alter my stored procedure to include the WITH RECOMPILE option.

ALTER PROCEDURE Get_OrderID_OrderQty 
 @ProductID INT 
 WITH RECOMPILE 
AS 
 
SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = @ProductID; 

What plan will executing this with the “elephant” – product ID 870 – produce?

EXEC Get_OrderID_OrderQty @ProductID=870

[Figure: execution plan showing a clustered index scan]

A clustered index scan has been performed.

What plan will executing this with the “mouse” – product ID 897 – produce?

EXEC Get_OrderID_OrderQty @ProductID=897

[Figure: execution plan showing a nonclustered index seek and key lookup]

The nonclustered index seek and key lookup are being performed, as expected. This is happening because WITH RECOMPILE tells SQL Server, “Don’t store a plan in cache for me. I’m smart enough to figure this out on my own each time.” Remember, the cost of this is increased CPU usage each time the stored procedure is run.
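A narrower variant of this approach is worth knowing: OPTION (RECOMPILE) on the individual statement, rather than WITH RECOMPILE on the whole procedure. Sketched against the same procedure:

```sql
ALTER PROCEDURE Get_OrderID_OrderQty 
 @ProductID INT
AS 
 
-- Only this statement is recompiled on each execution; any other
-- statements in the procedure would keep their cached plans.
SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = @ProductID
OPTION (RECOMPILE);
```

In a procedure with several statements, this confines the recompile cost to the one query that actually suffers from sniffing.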

Query Hinting

Another option is to use the OPTIMIZE FOR query hint. This tells SQL Server to use a specified value when compiling the plan. If, through testing, you can find a value that produces a “good enough” plan each time, and the performance is acceptable for both mice and elephants, this is a good option for you.

However, understand that you are bossing the query optimizer around. You are telling it what you think is best. The biggest drawback with OPTIMIZE FOR is on tables where the distribution of data changes. The faster it changes, the more out of date this hint could become. What if the value you provide is not optimal in a month, or a year? You need to have a method in place to regularly review and revise this.

I know that using product ID 945 produces a “good enough” plan for this query, so I alter the procedure to include the OPTIMIZE FOR query hint.

ALTER PROCEDURE Get_OrderID_OrderQty 
 @ProductID INT
AS 
 
SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = @ProductID 
OPTION (OPTIMIZE FOR (@ProductID=945)); 

What plan will the “elephant” – product ID 870 – produce?

EXEC Get_OrderID_OrderQty @ProductID=870

[Figure: execution plan showing a nonclustered index seek and key lookup]

A nonclustered index seek and key lookup have been performed. Viewing the XML shows us that the compiled value is 945.

The “mouse” – product ID 897 – has the same execution plan.

EXEC Get_OrderID_OrderQty @ProductID=897

[Figure: execution plan showing a nonclustered index seek and key lookup]
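If no single literal value produces a plan you're happy with, a related hint is OPTIMIZE FOR UNKNOWN, which tells the optimizer to build the plan from average density statistics instead of any sniffed value. A sketch:

```sql
ALTER PROCEDURE Get_OrderID_OrderQty 
 @ProductID INT
AS 
 
SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = @ProductID
OPTION (OPTIMIZE FOR (@ProductID UNKNOWN));
```

The resulting plan behaves much like the local-variable trick some readers mention below: good on average, but still a single plan for mice and elephants alike.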

Be Creative

Another option may be to create separate stored procedures for each value, or at least for those values you know will produce a specific execution plan. I’m not going to show this. Why? Because it’s ugly, it’s cumbersome, and I don’t recommend it. Who wants to maintain multiple stored procedures, when you can maintain one? What do you do when the distribution of data changes? How do you make the logic clear and easily understood?

When Parameter Sniffing Stinks

Understand that parameter sniffing is not a bad thing – it is an integral part of SQL Server. The data in your tables and your queries can lead to parameter sniffing not making sense, however. Understanding how to identify if it is a problem and knowing various ways to fix it when it is a problem are valuable skills to have. When parameter sniffing stinks, test methods to correct it and determine the best way to freshen the air in the server.

  1. What is also useful at times is to add OPTION (RECOMPILE) at the statement level.

    The added benefit, I believe, is that the plan is cached up to the point it encounters such a statement, and all _current_ values of the arguments _and_ local variables are taken into account.

    Example:

    SELECT …
    FROM Table1 JOIN Table2 ON …
    WHERE @Midnight <= col1
    OPTION (RECOMPILE)

    • You’re correct – you can recompile at the procedure or statement level. When the WITH RECOMPILE statement is added at the stored procedure level, none of the statements in the stored procedure have plans cached. When OPTION (RECOMPILE) is added to an individual statement in the stored procedure, that particular plan will not be cached – but others in the procedure are not affected.

  2. I ran into an issue like this a few days ago. One stored procedure was looping through a bunch of values and calling a second SP with those values as a parameter. This second stored procedure was taking about a minute to run. Examining the execution plan I could see that it was using the compiled parameter value to estimate the plan, rather than the current value. My solution was to assign the parameter value to a local variable, and change the queries in the SP to use the local variable rather than the parameter. Execution time came down to about 3 seconds.

  3. I usually avoid it by loading a local variable with the parameter value at the start of the sproc. Is that a bad idea?

    • Nope – local variables are another way to manage it!

      • Thanks for the wonderful article. I am somewhat confused by the local variable discussion and would greatly appreciate your insights on this.
        If you set local variables and pass them as input to the query, SQL Server can only generate a ‘generic’ plan from the statistics.
        So, say there is a ‘Names’ table where ‘id’ is the PK and ‘last name’ has a nonclustered index. Assume there are 100,000 rows in total, of which 90,000 are ‘Smith’ and the rest are unique (let ‘Santa’ be one of them). Now, if we run the procedure using the local-variable method, it will use only the generic plan, i.e. a clustered index scan, even while looking for ‘Santa’ (ideally this should be a nonclustered index seek). So I am wondering: isn’t this still, in a way, a parameter sniffing problem (using the same plan for everything)? Please clarify if I have misunderstood anything. Thank you.

        • I think you got it. The generic plan is better but still suboptimal. Look at my previous comment about OPTION (RECOMPILE) at the statement level, which does take into account the current value of a local variable. It was introduced with SQL Server 2005.

      • This has been very useful, thanks for writing this. I think we have a case of parameter sniffing but with a slight twist due to some “interesting” coding. In this case we pass in three parameters, but one can be reassigned. To explain more clearly:

        MyProc @VarA INT, @VarB INT, @SearchKey INT
        ….
        IF @VarA = 0 SET @VarA = @VarB

        SELECT * FROM MyTable
        WHERE Key1 = @VarA
        AND Key2 = @SearchKey

        In our database we don’t have any data in MyTable where Key1 = 0, so I think what happens is that if the procedure is first called and compiled with @VarA = 0, it generates a plan using 0 for Key1, whereas in fact the value to be used is really in @VarB. I then think that, due to parameter sniffing, it keeps that plan for subsequent calls.

        I think this is what could be happening, but would welcome any thoughts on this. The assumption is that the query is compiled using the parameter values and doesn’t take into account the reassignment.

        My fix for this is going to be to define a local variable and assign the relevant parameter to that and then use the local variable in the selection.

        Thanks

        • Ian – well, once you get to the point where there’s code samples involved, you’re best off posting that at a dedicated question and answer site like http://DBA.stackexchange.com where lots of people can jump in to help refine your question and get you great answers fast.

          • Thanks for the quick reply, I’ll try there. I’ve run a basic test and the fix does appear to work, but I’ll see if anyone can shed some light on what the optimiser is doing.

  4. Thanks for writing this, Jes. I think it’s the best explanation this DBA has seen yet on the topic and gave me better clarity as to how to find the “source of the smell”. :) Thanks again!

  5. Jes,

    Great article! Thanks!! Never saw such a good explanation about plan parameterization.

    If you excuse me, maybe I can add something here. When you are covering the OPTIMIZE FOR hint, you could explain OPTIMIZE FOR UNKNOWN too.

    I have never had a case to use “OPTIMIZE FOR @somevariable” and I think it’s a little risky if one can’t keep an eye on the plan. But I have already used OPTIMIZE FOR UNKNOWN more than once, with absolute success.

    Regards

  6. Hi, this is very interesting. What do you think of local variables like this?

    ALTER PROCEDURE [dbo].[Get_OrderID_OrderQty]
    @ProductID INT
    AS
    DECLARE @ProductIDLocal INT = @ProductID
    SELECT SalesOrderDetailID, OrderQty
    FROM Sales.SalesOrderDetail
    WHERE ProductID = @ProductIDLocal;

    Is this useful? Why didn’t you describe this option? Is it a bad option?

    Thanks a lot !

    • You can use local variables in stored procedures to “avoid” parameter sniffing. Understand, though, that this can lead to many plans stored in the cache. That can have its own performance implications. There isn’t a one-size-fits-all solution to the problem!

      • You mentioned that if you use local variables you will get a plan for each unique parameter. Could you do a mix of parameters and local variables? Thus getting a plan for the different local variables while using the other parameters for the initial plan? For example, I have @p_topic_ID and @p_participant_ID for parameters. I’d like to set the @p_topic_ID to a local variable to get the different plans for each topic, but I don’t want a plan for each @p_participant_ID, so I wouldn’t set that as a local variable.

        • Marc, I’m not entirely sure what you’re asking. Why don’t you post a sample over at http://DBA.stackexchange.com – include code, and there are a lot of other people there to answer the question and offer opinions, too.

          • Jes, I’m going to do that. And here is a sample of what I was asking. Once I hear back from the other site, I’ll post the response. And thanks, this article has been very useful.

            ALTER PROCEDURE [dbo].[stp_get_topic]
            @p_topic_ID INT,
            @p_participant_ID INT
            AS

            DECLARE @topic_ID INT = @p_topic_ID

            SELECT t.topic_ID, t.topic_name, pp.current_step
            FROM topic t
            INNER JOIN participant_progress pp
            ON t.topic_ID = pp.topic_ID
            WHERE t.topic_ID = @topic_ID
            AND pp.participant_ID = @p_participant_ID

          • What you’re describing will work similarly to the OPTIMIZE FOR UNKNOWN option. I didn’t cover that here, but it’s worth looking into!

  7. Thanks again for such a clear article in this subject. The information about OPTIMIZE FOR… helped me solve a problem in our production database yesterday. My question for you is, what conditions in the data would suddenly make SQL Server start using parameter sniffing in a stored proc to decide to apparently optimize on that parameter as numeric when the query ran fine for months and months before that? Situation is that the values of the parameter in question are always stored in the database as varchar(8), but 99.99% of the time, the values themselves are numeric. The problem I ran into was that all of a sudden the query just started hanging and would never complete. It would gain CPU time perpetually but never finish. Flushing caches/recompiling/etc. had no effect. It appeared that SQL was in an internal loop when trying to build a query plan–but I can’t verify that. The query in question would run fine (1 to 2 seconds) when run as a script outside the stored proc. Your article gave me the idea to try “OPTION (OPTIMIZE FOR (@param=’ABC’))”, and the stored proc then ran in a second or so.

    So the question is, do you think it was some kind of growth of the data over some threshold that triggered the server’s behavior? Have you had any luck with some kind of query that can predict when this sort of thing will come into play?

    • As data in your tables changes over time, the statistics on the table will be updated. As statistics are updated, the query plans are recompiled. The value passed in at the time the plan is recompiled is what will be saved in the execution plan (unless you use a hint like OPTIMIZE FOR).

    • This is the best explanation on Parameter sniffing I have read so far. Comments are equally helpful here as I got to know more ways to handle parameter sniffing by going through the comments.

  8. Pingback: SQL Server: Variables, Parameters or Literals? Or... Constants? - SQL Server Appendix Z - Site Home - MSDN Blogs

  9. Very nice article. In my project I am using output parameters everywhere. I have a doubt: will an output parameter in a stored procedure cause parameter sniffing? If yes, how can we avoid it?

  10. I have a serious performance problem in some procedures where the parameters are not always used. These parameters are tested in the WHERE clause like this:
    WHERE (COL1 = @PAR1 OR @PAR1 IS NULL)
    AND (COL2 = @PAR2 OR @PAR2 IS NULL)

    What do you guys suggest to solve this problem? A dynamic query?

    • This is a common pattern for specifying optional parameters, and it’s also common to run into performance problems with it because of plan caching. You can rewrite the query (I can think of some strategies, but they take time to explain). My advice is to disable plan caching (this is a sproc, right?), with the downside that there will be some additional time spent on plan construction, but this is probably “amortized”.

      • Thanks for the reply.
        Would rewriting as a dynamic query be one of your strategies?
        What would you do?

        • Marcelo,

          As you have code inside your stored proc, it’s better to use conditions and split the OR.
          Example:

          IF (@PAR1 IS NULL AND @PAR2 IS NULL)
          BEGIN
          SELECT …
          END
          ELSE IF (@PAR1 IS NULL AND @PAR2 IS NOT NULL)
          BEGIN
          SELECT …
          WHERE (COL2 = @PAR2)
          END
          ELSE IF…

          And go on splitting whatever conditions you have. As SQL Server breaks down query plans inside procedures, you will have a query plan for each condition.

        • Splitting as Marcelo said is one of the strategies, but it’s clear that this quickly becomes impracticable as the number of optional parameters grows.

        • Dynamic SQL is something to try, but I don’t like working with strings. You can also write dynamic SQL in many ways, such as using a parameter placeholder or substituting in a value. I’m pretty sure that a placeholder also uses plan caching.

  11. It is a very good article about parameter sniffing. Only now do I understand why my SP executed at different speeds for different parameter values.

  12. Nice job on the topic. SQL can count really quickly when the query is fully sargable. Count the number of results for the given parameter and use that to determine which plan hint to use. That way you could limit this query to either of two plans, based on a setting you control. You would not have to recompile, which loses the benefit of reused plans and incurs CPU load.

    All code is nothing more than strings. Source-control code merging, compilers, and sp_ExecuteSQL/EXEC all take in strings and operate on them. If you split the code with CASE, you are typing a whole lot of extra text and introducing mistakes. SQL will still create/re-use separate plans for each case, so there really is no problem with dynamic SQL inside the sproc.

    • Mike, not sure what you’re referring to. Can you post a code sample?

      • Hi Jes

        It seems there are only two ways to optimize your original query, so using a manually set threshold, we can tell SQL Server which plan to use. I’m hopeful counting to see if we exceed the threshold is fast enough to compete with OPTION (RECOMPILE), OPTIMIZE FOR UNKNOWN, variables, etc. My understanding is the count should always be an index seek, since SQL stores the number of matches for each distinct key value in the index itself.

        I’m concatenating @Hint into the SQL command, but that is not a risk for SQL injection, as @Hint was not passed into the sproc. I’ve parameterized @ProductID in the call to sp_executesql, so that is not a SQL injection risk either. The query will be exactly the same each time, so plans will be reused. The values for mouse, elephant, and threshold put optimization of this in the hands of a person.

        ALTER PROCEDURE Get_OrderID_OrderQty
        @ProductID INT
        AS

        DECLARE @Threshold Int,
        @Counter Int,
        @Hint Int,
        @Params NVARCHAR(max),
        @SQLQuery NVARCHAR(max)

        SET @Threshold = 100

        /*SQL will count really fast – but, is it faster than other methods?*/
        SELECT @Counter = COUNT(*) from Sales.SalesOrderDetail WHERE ProductID = @ProductID

        IF @Counter < @Threshold
        /*Mouse*/
        SELECT @Hint = 897
        ELSE
        /*Elephant*/
        SELECT @Hint = 870

        SELECT @Params = ' @ProductID INT '

        SELECT @SQLQuery =
        ' SELECT ' +
        ' SalesOrderDetailID, OrderQty ' +
        ' FROM Sales.SalesOrderDetail ' +
        ' WHERE ProductID = @ProductID ' +
        ' OPTION (OPTIMIZE FOR (@ProductID=' + CAST(@Hint AS NVARCHAR(10)) + '))'

        EXEC sp_executesql @SQLQuery, @Params, @ProductID

        • I can see how this would work, and nicely – but I don’t ever want to leave decisions like “threshold for optimization” up to a person. I want to leave that up to what does it best – the query optimizer.

          What happens when the data for both the “elephant” and the “mouse” increases? What if the “mouse” value exceeds the threshold? Is it thoroughly documented so that the threshold gets updated? What if the developer who created that left, and the person taking over for her had no idea that was there?

          Leaving things like that to be managed by a person can be risky!

            • Hi Jes

            I agree, except the optimizer didn’t do it best this time. Microsoft will have to add a new feature for this.

  13. If the application server and the SQL Server are not on the same machine, where is the cache stored? On the application server, or on the SQL Server?

    • The execution plans SQL Server uses are always stored on the SQL Server. The application may have its own cache elsewhere for query results that it can pick up again without talking to the SQL Server at all (memcached, app fabric cache are two common varieties of this).
