9 Signs Your T-SQL Might Fail a Code Review

T-SQL
28 Comments

It’s hard to set absolute rules about, “Feature X should absolutely never be used.”

However, there are some features that set off alarm bells when I see them. Usually, when I start asking more questions about when we’re using those particular features, I get answers of, “Oh, I didn’t know that was a problem.” As we have a bigger discussion, it leads to the piece of code failing the code review, and going back to the drawing board for improvements.

  1. Joining to table-valued user-defined functions: if it’s a multi-statement function, its underlying work doesn’t show up in execution plans, and it’s single-threaded, running one row at a time. I’ll pop open the plan to just double-check if it’s multi-statement or inline, and heaven forbid it calls another function.
  2. Joining to table variables: even with 2019’s improved row estimations, table variables still don’t get statistics on their contents, and inserting data into them is still single-threaded. There are a few edge case uses that we discuss in Fundamentals of TempDB, especially around reducing recompilations, but generally when I see table variables in code, it’s because someone didn’t know about their weaknesses. (The weaknesses of table variables, I mean, not their own weaknesses. Although now that you say it that way…)
  3. CROSS JOIN: sure, there are legitimate reasons why you might want every row from a table with no filtering whatsoever, but it’s fairly unusual. These two words set off an alarm bell that make me look more closely.
  4. Multiple joins to the same CTE: CTEs can be fine, although a temp table is often a better fit. The thing that raises flags during a code review is seeing the same CTE referred to multiple times, like FROM cte JOIN cte JOIN cte, each with a different alias. That CTE will show up multiple times in the execution plan, hitting the underlying tables repeatedly.
  5. The kitchen sink design pattern: WHERE (CustomerId = @CustomerId OR @CustomerId IS NULL) is tough for SQL Server to optimize, and we typically change that over to dynamic SQL or judicious use of RECOMPILE hints.
  6. SELECT (things) INTO #TempTable: I love temp tables, and I think they get a bad rap, and even SELECT INTO can have advantages over explicitly creating a table. However, when I see SELECT INTO, it’s usually because someone was working quickly, and they didn’t want to take the time to write CREATE TABLE and figure out all the datatypes. For example, during a recent review, the query author was copying an entire table into TempDB, and then doing filtering rows there – thinking that they were minimizing the amount of work they were doing in the user database.
  7. Creating indexes on a temp table: this sounds like a good thing – after all, who doesn’t like indexes? – but most of the time when I see it and I start asking questions, I get answers like, “I just guessed that the index might help.” That triggers us to spend a little more time experimenting, and also leads to teachable moments about how that affects temp table caching.
  8. WITH (NOLOCK): I keep demoing that it gets random results, and I know you might find this hard to believe, but not everyone reads my blog. (I know, right?) There are indeed places where random results are just fine, but most of the time when I see this red flag and I ask questions, the query author just didn’t know how dangerous this hint was. Bonus points for trying to use it in an UPDATE statement.
  9. BEGIN TRAN & COMMIT with no error handling – when I see this, I ask, “So you’re doing an explicit transaction because it’s important to you that all of this stuff gets committed, or not at all, right? And you’re expecting to have the transaction fail every now and then and you want to roll back, right? So, uh, what happens to your app when that happens?” If you care enough to specify a transaction in the database, then you need to keep going and specify how to handle the errors, too. Erland Sommarskog’s Error and Transaction Handling post is a great place to start.

I have absolutely given the thumbs-up to code that had more than one of these anti-patterns – heck, this stuff is in the First Responder Kit – but they’re all signs that I’m gonna look a little more closely at your code and ask questions, and you should probably ask those questions, too.

Previous Post
Contest: My Favorite Caption Wins a Live Class Season Pass
Next Post
You Captioned This Pic and Won Free Stuff.

28 Comments. Leave new

  • Johannes Vink
    May 19, 2021 1:44 pm

    Code without any comment or documentation does not fail a review? We often have fairly complex business logic in our ETL layer of the data warehouse and I encounter old SSIS packages with complex SQL on a weekly basis. No documentation whatsoever, no comment at all what the purposes is of some steps.

    ERP application tables with naming such as xyz123 joined with xyz1234 and then cross join 1-7 to likely flatten some data per day. Maybe. I guess. That sort of beauties.

    Reply
  • Question for #5, I get that OR’s are hard to optimize especially if the predicates aren’t trying to filter on the same column, but wouldn’t breaking the or’s into separate query’s for union be a good approach? I guess that depends is the answer, because there isn’t info in the hypothetical to say for sure.

    Reply
  • Adam C Jacobson
    May 19, 2021 2:07 pm

    thanks for this. One more piece of ammunition as I try to convince
    I have a client where select into is the favorite way of storing data into “permanent tables”. Of course, there not really permanent. Because when the sp blows up, the tables disappear and downstream tasks fail.
    And, beyond all that, they start each procedure dropping all the temp tables they are about to use:
    IF(OBJECT_ID(‘tempdb..#TEMP_table1of20) IS NOT NULL)
    BEGIN
    DROP TABLE #TEMP_table1of20
    END
    (I asked whey they did this. Only response was “Matt told us to”.

    Reply
    • I’ve done that quite a bit but more for style and debugging. It’s a nice way for me to section off my code and acknowledge I am working on a temp table. Granted I put it all on one line and make it look nicer. Like Brent, I am a huge fan of temp tables for processing before I load into a permanent table or present the results.

      IF(OBJECT_ID(‘tempdb..#TEMP_table1of20) IS NOT NULL) DROP TABLE #TEMP_table1of20

      Reply
  • Timothy Brown
    May 19, 2021 2:38 pm

    Agree with all of those, except maybe #7. In our house, I almost always insist on having the right indexes for temp tables. Mostly for performance, but also because as I understand it we can only have auto created stats is an index has been created.
    But usually we’re talking about several thousand rows.

    Reply
    • Rafael Colon
      May 22, 2021 10:15 pm

      I agree , in very rare and specific situations an index on a temp table could be helpful. I remember just one time on my career that was a life saver on a month end process but it has to be used and evaluated with a lot of caution.

      Reply
    • I also agree that #7 is important. A #temp table is just like any other table and good indexes improve performance. The rest is definitely spot on and typical patterns leading to performance issues.

      Reply
  • Seeing “WITH READPAST” all over a vendor source always left me wondering why it was needed.[1]

    I came to the conclusion it was more likely to pass QA controls than “WITH NOLOCK”.

    [1] coding it with tables used as work queues I get but not everywhere, dammit.

    Reply
  • Thom Holderman
    May 19, 2021 4:27 pm

    my position is a new one for my company, and they’ve done 1-5 regularly (including nesting tvf 5 deep) and now I get to fix it all

    Reply
  • How about Nested Views. Having to fix lots of nested views currently.

    Reply
  • Thanks for the post…. Good one! As previously stated lack of comments is a red flag also, to me!

    Reply
  • I think I could find a stored procedure that scores nine out of nine, do I get a prize for that?

    Bonus: DELETE WITH (NOLOCK) (but I think that dissapeared a couple of years ago)

    Reply
  • Daniel Johnson
    May 19, 2021 5:12 pm

    Mines would also be cursors. There are just so few use cases where they are needed. Granted, I do enjoy using them but there are a million ways to poorly implement. I tend to point developers to loops unless they want to take the time to invest in learning the ins and outs.

    Reply
  • How about assuming certain ANSI database settings such as NULL=NULL

    Reply
  • Im proud to say ive never failed a code review.

    Im ashamed to say why, though.

    Reply
  • Error handling and logging is a must for every db developer.

    Reply
  • Joachim Pense
    May 20, 2021 4:56 am

    QUOTE:
    >>>
    BEGIN TRAN & COMMIT with no error handling – when I see this, I ask, “So you’re doing an explicit transaction because it’s important to you that all of this stuff gets committed, or not at all, right? And you’re expecting to have the transaction fail every now and then and you want to roll back, right? So, uh, what happens to your app when that happens?” If you care enough to specify a transaction in the database, then you need to keep going and specify how to handle the errors, too.
    <<<
    COMMENT:
    Duh, the app will just fail, but it will leave the database in a consistent state. A transaction is a necessity if you have a sequence of updates with an inconsistent state between them, regardless of what will be done after the failure.

    Reply
    • If “the app will just fail” is alright with you, then you may not want to submit your code to me for review. 😉

      Reply
      • Joachim Pense
        May 21, 2021 12:23 pm

        Of course you care that the app can resume. But even if you don’t it’s worse to leave the database in an inconsistent state, that’s a no-no. Inconsistent states belong into a transaction, disregarding what else the code does.

        Reply
  • Refering to point 6: When do I decide for TempTable an when for Creating a Table? What is the disadvantage of using TempTables?

    Reply
  • #temp table after #temp table after #temp table after #temp table after #temp table… and on and on and on. Trying to troubleshoot or reverse engineer one of those beasties is challenging for my mental health.

    Reply
  • Thank you Brent!

    Reply
  • My absolute favorite update script With(NoLock) example:
    IF EXISTS (SELECT * FROM sys.objects WITH(nolock) WHERE object_id = OBJECT_ID(N'[dbo].[Adm_GetStuff]’) AND type in (N’P’, N’PC’))
    DROP PROCEDURE [dbo].[Adm_GetStuff]
    GO

    Reply
  • Kevin Martin
    May 8, 2022 7:35 pm

    For the #5. kitchen sink design pattern, I recommend using something like sp_CRUDGen to generate a dynamic TSQL stored procedure with best practices for kitchen sink (optional parameters) queries. It is really hard to do dynamic SQL safely and performant.

    sp_CRUDGen Links:

    YouTube Playlist: https://www.youtube.com/playlist?list=PL4bNfhq8cqH2HAvMYQHXTGF5nedzzJe24

    GitHub Repository Project: https://github.com/kevinmartintech/sp_CRUDGen

    Blog Articles: http://spcrudgen.org

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.