The Annals of Hilariously Bad Code, Part 1: Critique the Code


Hey Beavis

That’s the first time I’ve used “annals” correctly. I will be 40 sooner than later.

Sometimes when we blog, we get inspiration from clients, students, other bloggers, questions on Stack Exchange, nightmares, or just drinking heavily.

I’m going to anonymize some code, and I want you to guess where it came from.

I’d also like you to to critique the code, to see if it lines up with my feelings on it.

In the next post, we’ll talk about where it came from, and how we’d fix it.

I don’t want any Big Mouths Striking Again! If you happen to recognize the code, don’t tattle.

Begin Transmission

Thanks for reading!

Previous Post
Things I Have Not Heard Recently
Next Post
Using LIKE on Integers Gets You Implicit Conversion

35 Comments. Leave new

  • No idea where it comes from, but wouldn’t be surprised if it’s from the MDS database.
    * table variables
    * go to (try catch would be better)
    * correlated subqueries can be replaced with WHERE NOT EXISTS (and the IN clauses with WHERE EXISTS, but that’s not so important)

  • Daniel Hutmacher
    February 6, 2018 9:01 am

    I would guess that b, r and m represent backups, restores and backup media. This smells like the cleanup step of a backup suite of some kind.

    Things I would want to see to pass this in a code review:
    * Code commenting? Seriously, how many times do you need to read the last DELETE statement to understand what it actually does?
    * Reference the original tables instead of table variables, or at least, add clustered primary keys to them.
    * Horrible @@ERROR/GOTO pattern, should be replaced by TRY-CATCH.

    • Which version of SQL Server did TRY/CATCH come in at? This code may have been written before then and not touched since, despite regular updates by the company who wrote it 😉

    • Wendy Benedict
      February 9, 2018 2:38 pm

      Glad I”m not the only one that had to read the DELETE statement a few times

      • Wendy — it’s an especially good thing to read if you’re ever feeling down about your own code.

        • For example, like when you deploy an update with no where clause, and take down your firm’s accounting system for 3 days? You mean like that?

  • Is there some kind of tax on long table names?

    b_id, r_id and m_id? More like b_ad, amirite?

    • James — anyone who tells you they can anonymize SQL and have it be understandable is a filthy liar.

  • Come on, this is an easy one…
    Where it came from: [REDACTED] Someone who should know better, it’s [REDACTED] a cleanup stored procedure for AHEM in [REDACTED]… (you left “WHERE AHEM < @date" in line 30, that's the give-away)

    As for critique, I suspect the mindset behind it being this way is so if it fails at any point, minimal deletions have been done to be rolled back, keeping any blocking down.

    It undoubtedly could be written much "cleaner," while still adhering to the requirements.

    (PS, I know *exactly* what this is and where it's from, was just looking at it the other day)

  • Eric Prévost
    February 6, 2018 9:46 am

    Looks exactly what I would use as a “never do that” example.
    And to have found out where that code come from, it’s even more disgusting

  • Michel Zehnder
    February 6, 2018 10:10 am

    Yeah, it is from the almighty creators themselves. It looks ugly though 🙂

  • I always wondered why it was written this way. It was horribly deadlock-prone until some indexes were added to support it several years ago.

  • not sure, but my gut is telling me this is something from msdb cleaning up something.

  • Table variables don’t support transactions. Anyway I would not even bother trying to understand that nonsense. Throw it away, give me the requirements, and I will rewrite it. Because in my experience, when I see code that bad, then there is a great likelihood the writer did not understand the requirements anyway.

  • What I think is that this thing is being called for a specific date on a schedule basis, would have been an opportunity to use parameter default values,
    @date a cosmetic preference, I personally would declare more narrative variables and avoid system keyword.
    Modern exception handling
    interpretable error message/number/etc. always help in troubleshooting
    Although this is simple code to understand but no comments at all seriously ?
    Tempting opportunity to rewrite Code from line 77 to 83 and 87 to 93.

  • Here’s a new one to make it clearer and faster:
    Why not change the referential integrity to be ON DELETE CASCADE?
    Here are my assumptions that would support that design:
    1. The referential integrity is properly defined, indexed and one level deep.
    2. The idea is that if you want to “clean up” the top most item, you are going to want to remove all the children without exception.

    Why would I do this?
    I just tested a system where we deleted a parent set of rows and had to “CASCADE: delete 3 tables of children.
    Guess what in 2008R2 (I presume later as well) to use 4 deletes takes longer, involves more overhead and appears to be more prone to blocking/locking/deadlocking by other processes.
    Why? Because each of the 3 child deletes are simple deletes, the parent table delete has to scan and check referential integrity before deleting.
    Why is ON DELETE CASCADE better?
    The Execution plan throws in a Sequencer to handle all 4 deletes in the appropriate order.
    End result, in my tests the ON DELETE CASCADE consistently took 20 milliseconds less time 67 ms versus 87, and on some of the time with the 4 deletes it took 100 ms or more. I was using 5000 rows in each table, out of a few million on the delete.

    Presuming this is MSDB the old code (2000, 2005) to remove job history used to be a cursor with a missing FK index. I have always added that index as I would do 96 backups a day and needed to prune history.

  • Tony Fountain
    February 6, 2018 2:25 pm

    I agree with some of the previous responses:
    1) Wrap the deletes in a TRY CATCH FINALLY block
    2) Get rid of the table variables
    3) Just delete the records starting with deepest child up to parent
    4) I don’t agree with using ON DELETE CASCADE, I prefer to manage the data through explicit deletes rather than automatically. What if I forgot my WHERE clause on the delete? Oops!
    5) When an error happens, the end user or calling application will not know there was any type of error! The catch block should throw the same error up the stack after rolling back the transaction.

    This code simply smells of someone with a procedural mindset to coding.

  • Can someone please explain the purpose behind the AND part of the WHERE in the last DELETE statement?

  • Here are some points, and I’m agree with previous points:
    – use of TRY CATCH
    – I like to have XACT_ABORT is ON
    – @table = replace with #temp table
    – IN = replace with a JOIN
    – use THROW , no use of raisError
    – no GOTO
    – I like to have [schema] name in the name of the table

    • the IN (or better EXISTS) will be ok, since a JOIN could lead to multiple rows when it is a 1:n relationship

      But nobody should ever use as COUNT(*) for an existence check, since it has to count every matching row, while an EXISTS or IN will stop after the first match (to be sure, you could use a TOP (1), but this will make (usually) no difference)

  • Agree with most of the above comments.
    * Disagree with use of Cascade delete – you should know what you want to delete without the db going deleting records you haven’t thought about.
    * Use Exists and not Exists.
    * restructure without the use of goto – Structured or try catch
    * indentation to easily see start and end of transaction
    * Add comments
    * more friendly variable names.
    * @Date id datetime – I would check each of the tables to ensure that the various comparisons are checking datetime to datetime – i.e. is b_date actually a date or is it datetime
    * I would use the actual parent table to compare – I’ve found that often select distinct is used where there is no key and therefore is inefficient. by using the actual table and using exists would be better.
    * If there is no parent record to delete then we are still checking all the children – I would identify that there are records to delete before checking there are children to delete

  • Alexander Gay
    February 7, 2018 4:36 am

    I agree with most of the above:
    * Use #temp instead of table variables, but I would use one temp table, with a type column and and an index to speed it up, I am making the assumption that this works on thousands of rows, if it is < 100 ignore the index.
    * Comments would be nice, but may have been removed to anonymize the source.
    *Use try/Catch to avoid the repetition of the @@Error calls.
    * [Personal, dogmatic, stylistic bugbear alert] Put the parentheses around the parameter at the top so that it is
    @date DATETIME
    AS …"
    I don't consider them optional.

  • well, i’m not mssql expert but it seems that all data saved in temp tables is lost.
    Since it looks like archiving process, the insert from temp tables is missing and all the transaction handling has no purpose.

  • Yup, definitely written by the Almighty Creators ___, was looking at this SProc 2-3 days ago, also wondering “WHY LORDI, WHY?!?” (yes. Lordi, the awesome metal band.)

    I think the code may have been written eons ago and not looked at since, with the assumption that “it might have to deal with massive amounts of ___ rows”, so it was trying overly hard to not break stuff in this critical ___ database if it had to run for aaages over lots of rows.

  • David Simpson
    February 8, 2018 8:58 am

    High priority critique
    * Refactor the stored proc to do its work in batches
    Low priority critiques:
    * Return a result code
    * As many others have said, update the error handling
    * Use EXISTS instead of COUNT and IN. I happen to know that this will not improve performance on any of my servers, so this would be purely for stylistic consistency.
    * Try temp table instead of table variable. Again, I know that this will not improve performance on any of my servers with sufficiently small batches, so this would depend on your environment.
    * Explicitly designate table variable columns as NULL or NOT NULL
    Things I don’t care about:
    * Lack of comments. If a table or view is named poorly, use aliases to make your own query readable. If a stored procedure becomes unwieldy, encapsulate sections into their own procedures.

  • It’s vendor-supplied code and has a History (see what I did there?).
    Now ancient, it’s simply been left alone by the vendor, which is a shame as there are other issues related to the code, e.g. missing indexes.
    Oracle, too, has ancient code all over the place, with hard-coded hints and other naughtiness, but, again, it’s probably down to If It Ain’t Broke …

  • We could cheat and run it through a static Sql code checker. That should provide a few laughs.


Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.