Another Hidden Parallelism Killer: Scalar UDFs In Check Constraints

Every single time

Really. Every single time. It started off kind of funny. Scalar functions in queries: no parallelism. Scalar functions in computed columns: no parallelism, even if you’re not selecting the computed column. Every time I think of a place where someone could stick a scalar function into some SQL, it ends up killing parallelism. Now it’s just sad.

This is (hopefully. HOPEFULLY.) a less common scenario, since uh… I know most of you aren’t actually using any constraints. So there’s that! Developer laziness might be a saving grace here. But if you read the title, you know what’s coming. Here’s a quick example.

Parallelism appears to be rejected for maintenance operations as well as queries, just like with computed columns.

Interestingly, if we look in the plan XML (the execution plan itself just confirms that the query didn’t go parallel) we can see SQL tried to get a parallel plan, but couldn’t.

Garbagio

Garbagio

There’s a short list of possible reasons for plans not going parallel here from a while back. A quick search didn’t turn up a newer or more complete list.

Check yourself, etc. and so forth

How do you know if this is happening to you? Here’s a simple query to look at constraint definitions and search them for function names. This query is dumb and ugly, but my wife is staring at me because it’s 5:30 on a Saturday and I’m supposed to be getting ready. If you have a better idea, feel free to share in the comments.

Thanks for reading!

Previous Post
I’m on the RunAsRadio Podcast
Next Post
Breaking News, Literally: SQL CLR Support Removed from Azure SQL DB

9 Comments. Leave new

  • This will get a list of all objects (and their types) that reference a function:

    SELECT [ObjectName] = [obj_constraint].[name],
    [ObjectType] = [obj_constraint].[type_desc],
    [ReferencedFunction] = [obj_function].[name],
    [ReferencedType] = [obj_function].[type_desc]
    FROM [sys].[objects] [obj_constraint]
    INNER JOIN [sys].[sql_expression_dependencies] [sed] ON [sed].[referencing_id] = [obj_constraint].[object_id]
    INNER JOIN [sys].[objects] [obj_function] ON [obj_function].[object_id] = [sed].[referenced_id]
    WHERE [obj_function].[type] IN (‘FN’, ‘TF’);

    This will also catch computed columns, views, and stored procedures that reference functions. It baffles me that a default constraint screws with the query plan on a select. That smells like a bug to me.

    Reply
    • I was also wondering about the constraint on SELECT. Can anyone share some information on this? Thanks!

      Reply
  • Dylan Bacon
    April 8, 2016 3:29 pm

    “We couldn’t make the plan parallel because we couldn’t make the plan parallel.”

    Brilliant.

    Reply
  • This article does a good job of demonstrating something to look for in a single query plan. It would be very beneficial it it also provided a means to identify if the condition exists in the plan cache. The query below will provide the query plans from the plan cache where this exist. Unfortunately, it also shows query plans that will should not run parallel, such as simple inserts or updates. It would be grate to be able to tweak the query below to remove such query plans.

    WITH XMLNAMESPACES
    (DEFAULT ‘http://schemas.microsoft.com/sqlserver/2004/07/showplan’)
    SELECT
    query_plan AS CompleteQueryPlan
    , n.value(‘(@StatementText)[1]’, ‘varchar(4000)’) AS StatementText
    , n.value(‘(@StatementOptmLevel)[1]’, ‘varchar(25)’) AS StatementOptimizationLevel
    , n.value(‘(QueryPlan/@NonParallelPlanReason)[1]’, ‘nvarchar(128)’) AS NonParallelPlanReason
    , ecp.usecounts
    , ecp.size_in_bytes
    FROM sys.dm_exec_cached_plans AS ecp
    CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS eqp
    CROSS APPLY query_plan.nodes(‘/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple’) AS qn(n)
    WHERE n.query(‘.’).exist(‘//QueryPlan[@NonParallelPlanReason]’) = 1

    Reply
    • Robert – yeah, just be careful because this query can be pretty slow to run. For bonus points, you might want to filter above a certain query cost, or logical reads, or duration, or execution count. You may not want to bother with smaller queries.

      Reply
      • I missed the obvious. I need to filter out all the queries where the cost is below the cost threshold for parallelism. Plan cache queries are generally slow. I’ve found that the better you filter the XML, the more likely you will be able to execute the query within an acceptable execution time.

        Below is a modified query that returns the plans where the cost is greater that the cost threshold for parallelism.

        DECLARE @Cost float

        SELECT @Cost = CAST([value_in_use] AS float)
        FROM [sys].[configurations]
        WHERE [name] = ‘cost threshold for parallelism’;

        WITH XMLNAMESPACES
        (DEFAULT ‘http://schemas.microsoft.com/sqlserver/2004/07/showplan’)
        SELECT
        query_plan AS CompleteQueryPlan
        , n.value(‘(@StatementText)[1]’, ‘varchar(4000)’) AS StatementText
        , n.value(‘(@StatementOptmLevel)[1]’, ‘varchar(25)’) AS StatementOptimizationLevel
        , n.value(‘(QueryPlan/@NonParallelPlanReason)[1]’, ‘nvarchar(128)’) AS NonParallelPlanReason
        , n.value(‘@StatementSubTreeCost’,’float’) AS StatementSubTreeCost
        , ecp.usecounts
        , ecp.size_in_bytes
        FROM sys.dm_exec_cached_plans AS ecp
        CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS eqp
        CROSS APPLY query_plan.nodes(‘/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple’) AS qn(n)
        WHERE n.query(‘.’).exist(‘//QueryPlan[@NonParallelPlanReason]’) = 1
        AND n.value(‘@StatementSubTreeCost’,’float’) > @Cost

        Reply
  • Bryan Rebok
    June 20, 2017 8:24 am

    Eric,

    Some more info on this – it appears that having a column with an untrusted check constraint that calls a scalar UDF does NOT force a plan for a query that references that column to run serially.

    Reply
  • Hi Brent,

    Its a nice article. I understand that when using a udf in a check constraint it affects the query performance.
    But why check constraint comes in when SELECT?
    Also, we have scenario like this, In the master table, two columns , code char(4), location int, both form the primary key whereas in the child table, we have only code column used. In this scenario, i am forced to use UDF with check constraint.. is there any other means to enforce integrity?

    Thanks for your time.
    KRS

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}