T-SQL & Development

The Annals of Hilariously Bad Code, Part 1: Critique the Code

By Erik Darling · February 6, 2018 · 35 comments

Hey Beavis

That’s the first time I’ve used “annals” correctly. I will be 40 sooner than later.

Sometimes when we blog, we get inspiration from clients, students, other bloggers, questions on Stack Exchange, nightmares, or just drinking heavily.

I’m going to anonymize some code, and I want you to guess where it came from.

I’d also like you to to critique the code, to see if it lines up with my feelings on it.

In the next post, we’ll talk about where it came from, and how we’d fix it.

I don’t want any Big Mouths Striking Again! If you happen to recognize the code, don’t tattle.

Begin Transmission

CREATE OR ALTER PROCEDURE dbo.delete_some_stuff
    @date DATETIME
AS
    BEGIN
        SET NOCOUNT ON;

        DECLARE @b_id TABLE
              (
                  b_id INT
              );
        DECLARE @m_id TABLE
              (
                  m_id INT
              );
        DECLARE @r_id TABLE
              (
                  r_id INT
              );

        INSERT INTO @b_id ( b_id )
        SELECT  DISTINCT
                b_id
        FROM    b1
        WHERE   b_date < @date;

        INSERT INTO @m_id ( m_id )
        SELECT  DISTINCT
                m_id
        FROM    b2
        WHERE   b_date < @date;

        INSERT INTO @r_id ( r_id )
        SELECT  DISTINCT
                r_id
        FROM    r1
        WHERE   b_id IN (   SELECT  b_id
                            FROM    @b_id );

        BEGIN TRANSACTION;

        DELETE  FROM b1
        WHERE   b_id IN (   SELECT  b_id
                            FROM    @b_id );
        IF ( @@error > 0 )
            GOTO Quit;

        DELETE  FROM b2
        WHERE   b_id IN (   SELECT  b_id
                            FROM    @b_id );
        IF ( @@error > 0 )
            GOTO Quit;

        DELETE  FROM r1
        WHERE   r_id IN (   SELECT  r_id
                            FROM    @r_id );
        IF ( @@error > 0 )
            GOTO Quit;

        DELETE  FROM r2
        WHERE   r_id IN (   SELECT  r_id
                            FROM    @r_id );
        IF ( @@error > 0 )
            GOTO Quit;

        DELETE  FROM r3
        WHERE   r_id IN (   SELECT  r_id
                            FROM    @r_id );
        IF ( @@error > 0 )
            GOTO Quit;

        DELETE  FROM b3
        WHERE   b_id IN (   SELECT  b_id
                            FROM    @b_id );
        IF ( @@error > 0 )
            GOTO Quit;

        DELETE  b4
        FROM    b4 AS b4
        WHERE   b4.m_id IN (   SELECT  m_id
                                FROM    @m_id )
                AND ((   SELECT COUNT(*)
                         FROM   b1
                         WHERE  m_id = b4.m_id ) = 0 );
        IF ( @@error > 0 )
            GOTO Quit;

        DELETE  b5
        FROM    b5 AS b5
        WHERE   b5.m_id IN (   SELECT  m_id
                                FROM    @m_id )
                AND ((   SELECT COUNT(*)
                         FROM   b1
                         WHERE  m_id = b5.m_id ) = 0 );
        IF ( @@error > 0 )
            GOTO Quit;

        COMMIT TRANSACTION;
        RETURN;

        Quit:
        ROLLBACK TRANSACTION;

    END;

100

101

102

103

CREATE OR ALTER PROCEDURE dbo.delete_some_stuff

@date DATETIME

BEGIN

SET NOCOUNT ON;

DECLARE @b_id TABLE

(

b_id INT

);

DECLARE @m_id TABLE

(

m_id INT

);

DECLARE @r_id TABLE

(

r_id INT

);

INSERT INTO @b_id ( b_id )

SELECT DISTINCT

b_id

FROM b1

WHERE b_date < @date;

INSERT INTO @m_id ( m_id )

SELECT DISTINCT

m_id

FROM b2

WHERE b_date < @date;

INSERT INTO @r_id ( r_id )

SELECT DISTINCT

r_id

FROM r1

WHERE b_id IN ( SELECT b_id

FROM @b_id );

BEGIN TRANSACTION;

DELETE FROM b1

WHERE b_id IN ( SELECT b_id

FROM @b_id );

IF ( @@error > 0 )

GOTO Quit;

DELETE FROM b2

WHERE b_id IN ( SELECT b_id

FROM @b_id );

IF ( @@error > 0 )

GOTO Quit;

DELETE FROM r1

WHERE r_id IN ( SELECT r_id

FROM @r_id );

IF ( @@error > 0 )

GOTO Quit;

DELETE FROM r2

WHERE r_id IN ( SELECT r_id

FROM @r_id );

IF ( @@error > 0 )

GOTO Quit;

DELETE FROM r3

WHERE r_id IN ( SELECT r_id

FROM @r_id );

IF ( @@error > 0 )

GOTO Quit;

DELETE FROM b3

WHERE b_id IN ( SELECT b_id

FROM @b_id );

IF ( @@error > 0 )

GOTO Quit;

DELETE b4

FROM b4 AS b4

WHERE b4.m_id IN ( SELECT m_id

FROM @m_id )

AND (( SELECT COUNT(*)

FROM b1

WHERE m_id = b4.m_id ) = 0 );

IF ( @@error > 0 )

GOTO Quit;

DELETE b5

FROM b5 AS b5

WHERE b5.m_id IN ( SELECT m_id

FROM @m_id )

AND (( SELECT COUNT(*)

FROM b1

WHERE m_id = b5.m_id ) = 0 );

IF ( @@error > 0 )

GOTO Quit;

COMMIT TRANSACTION;

RETURN;

Quit:

ROLLBACK TRANSACTION;

END;

Thanks for reading!

Free, 3× a week

Get my new posts by email

Three posts a week, plus a Monday roundup of the best database news from around the web.

35 comments

Koen Verbeeck

February 6, 2018 at 8:49 am

No idea where it comes from, but wouldn’t be surprised if it’s from the MDS database.
Critique:
* table variables
* go to (try catch would be better)
* correlated subqueries can be replaced with WHERE NOT EXISTS (and the IN clauses with WHERE EXISTS, but that’s not so important)

Reply
Daniel Hutmacher VIP Student since 2017

February 6, 2018 at 9:01 am

I would guess that b, r and m represent backups, restores and backup media. This smells like the cleanup step of a backup suite of some kind.

Things I would want to see to pass this in a code review:
* Code commenting? Seriously, how many times do you need to read the last DELETE statement to understand what it actually does?
* Reference the original tables instead of table variables, or at least, add clustered primary keys to them.
* Horrible @@ERROR/GOTO pattern, should be replaced by TRY-CATCH.

Reply
1. OZ Student since 2021
  
  February 8, 2018 at 1:21 pm
  
  Which version of SQL Server did TRY/CATCH come in at? This code may have been written before then and not touched since, despite regular updates by the company who wrote it 😉
  
  Reply
2. Wendy Benedict Student since 2018
  
  February 9, 2018 at 2:38 pm
  
  Glad I”m not the only one that had to read the DELETE statement a few times
  
  Reply
  1. Erik Darling
    
    February 9, 2018 at 4:02 pm
    
    Wendy — it’s an especially good thing to read if you’re ever feeling down about your own code.
    
    Reply
    1. ken
      
      February 9, 2018 at 4:30 pm
      
      For example, like when you deploy an update with no where clause, and take down your firm’s accounting system for 3 days? You mean like that?
      
      Reply
      1. Erik Darling
        
        February 9, 2018 at 4:33 pm
        
        Er… Well I had a demo that didn’t work today, so…
James May

February 6, 2018 at 9:15 am

Is there some kind of tax on long table names?

b_id, r_id and m_id? More like b_ad, amirite?

Reply
1. Erik Darling
  
  February 6, 2018 at 9:18 am
  
  James — anyone who tells you they can anonymize SQL and have it be understandable is a filthy liar.
  
  Reply
Jason A

February 6, 2018 at 9:25 am

Come on, this is an easy one…
Where it came from: [REDACTED] Someone who should know better, it’s [REDACTED] a cleanup stored procedure for AHEM in [REDACTED]… (you left “WHERE AHEM < @date" in line 30, that's the give-away)

As for critique, I suspect the mindset behind it being this way is so if it fails at any point, minimal deletions have been done to be rolled back, keeping any blocking down.

It undoubtedly could be written much "cleaner," while still adhering to the requirements.

(PS, I know *exactly* what this is and where it's from, was just looking at it the other day)

Reply
1. Erik Darling
  
  February 6, 2018 at 9:27 am
  
  HUSH YOU
  
  Reply
  1. Jason A
    
    February 6, 2018 at 9:29 am
    
    Hey! I did [REDACT] my answer to *try* to keep from giving the game away!
    
    Reply
2. Brent Ozar
  
  February 6, 2018 at 9:51 am
  
  You deserve a gold star though, heh.
  
  Reply
  1. Jason A
    
    February 6, 2018 at 11:51 am
    
    Ooo! Bring it with you when you do the SPID presentation this Thursday and I’ll collect it!
    
    Reply
    1. Brent Ozar
      
      February 6, 2018 at 12:59 pm
      
      HAHAHA, deal.
      
      Reply
3. M McDonald
  
  February 7, 2018 at 2:27 pm
  
  +1 — My vote is you are spot on regarding [REDACTED].
  
  Reply
Eric Prévost

February 6, 2018 at 9:46 am

Looks exactly what I would use as a “never do that” example.
And to have found out where that code come from, it’s even more disgusting

Reply
Michel Zehnder

February 6, 2018 at 10:10 am

Yeah, it is from the almighty creators themselves. It looks ugly though 🙂

Reply
James L Student since 2018

February 6, 2018 at 11:56 am

I always wondered why it was written this way. It was horribly deadlock-prone until some indexes were added to support it several years ago.

Reply
trav Student since 2017

February 6, 2018 at 12:30 pm

not sure, but my gut is telling me this is something from msdb cleaning up something.

Reply
ken ambrose

February 6, 2018 at 12:57 pm

Table variables don’t support transactions. Anyway I would not even bother trying to understand that nonsense. Throw it away, give me the requirements, and I will rewrite it. Because in my experience, when I see code that bad, then there is a great likelihood the writer did not understand the requirements anyway.

Reply
Ahmad DRSHEN

February 6, 2018 at 1:45 pm

What I think is that this thing is being called for a specific date on a schedule basis, would have been an opportunity to use parameter default values,
@date a cosmetic preference, I personally would declare more narrative variables and avoid system keyword.
Modern exception handling
interpretable error message/number/etc. always help in troubleshooting
Although this is simple code to understand but no comments at all seriously ?
Tempting opportunity to rewrite Code from line 77 to 83 and 87 to 93.

Reply
Brian M

February 6, 2018 at 2:02 pm

Here’s a new one to make it clearer and faster:
Why not change the referential integrity to be ON DELETE CASCADE?
Here are my assumptions that would support that design:
1. The referential integrity is properly defined, indexed and one level deep.
2. The idea is that if you want to “clean up” the top most item, you are going to want to remove all the children without exception.

Why would I do this?
I just tested a system where we deleted a parent set of rows and had to “CASCADE: delete 3 tables of children.
Guess what in 2008R2 (I presume later as well) to use 4 deletes takes longer, involves more overhead and appears to be more prone to blocking/locking/deadlocking by other processes.
Why? Because each of the 3 child deletes are simple deletes, the parent table delete has to scan and check referential integrity before deleting.
Why is ON DELETE CASCADE better?
The Execution plan throws in a Sequencer to handle all 4 deletes in the appropriate order.
End result, in my tests the ON DELETE CASCADE consistently took 20 milliseconds less time 67 ms versus 87, and on some of the time with the 4 deletes it took 100 ms or more. I was using 5000 rows in each table, out of a few million on the delete.

Presuming this is MSDB the old code (2000, 2005) to remove job history used to be a cursor with a missing FK index. I have always added that index as I would do 96 backups a day and needed to prune history.

Reply
1. Brian M
  
  February 6, 2018 at 2:31 pm
  
  One last thing, the delete is very easy, delete the parent and the children go away.
  
  Reply
Tony Fountain VIP Student since 2019

February 6, 2018 at 2:25 pm

I agree with some of the previous responses:
1) Wrap the deletes in a TRY CATCH FINALLY block
2) Get rid of the table variables
3) Just delete the records starting with deepest child up to parent
4) I don’t agree with using ON DELETE CASCADE, I prefer to manage the data through explicit deletes rather than automatically. What if I forgot my WHERE clause on the delete? Oops!
5) When an error happens, the end user or calling application will not know there was any type of error! The catch block should throw the same error up the stack after rolling back the transaction.

This code simply smells of someone with a procedural mindset to coding.

Reply
Mazhar

February 6, 2018 at 2:44 pm

Can someone please explain the purpose behind the AND part of the WHERE in the last DELETE statement?

Reply
Sabibi

February 7, 2018 at 1:28 am

Here are some points, and I’m agree with previous points:
– use of TRY CATCH
– I like to have XACT_ABORT is ON
– @table = replace with #temp table
– IN = replace with a JOIN
– use THROW , no use of raisError
– no GOTO
– I like to have [schema] name in the name of the table

Reply
1. Thomas Franz
  
  February 7, 2018 at 1:39 am
  
  the IN (or better EXISTS) will be ok, since a JOIN could lead to multiple rows when it is a 1:n relationship
  
  But nobody should ever use as COUNT(*) for an existence check, since it has to count every matching row, while an EXISTS or IN will stop after the first match (to be sure, you could use a TOP (1), but this will make (usually) no difference)
  
  Reply
john

February 7, 2018 at 3:37 am

Agree with most of the above comments.
* Disagree with use of Cascade delete – you should know what you want to delete without the db going deleting records you haven’t thought about.
* Use Exists and not Exists.
* restructure without the use of goto – Structured or try catch
* indentation to easily see start and end of transaction
* Add comments
* more friendly variable names.
* @Date id datetime – I would check each of the tables to ensure that the various comparisons are checking datetime to datetime – i.e. is b_date actually a date or is it datetime
* I would use the actual parent table to compare – I’ve found that often select distinct is used where there is no key and therefore is inefficient. by using the actual table and using exists would be better.
* If there is no parent record to delete then we are still checking all the children – I would identify that there are records to delete before checking there are children to delete

Reply
Alexander Gay

February 7, 2018 at 4:36 am

I agree with most of the above:
* Use #temp instead of table variables, but I would use one temp table, with a type column and and an index to speed it up, I am making the assumption that this works on thousands of rows, if it is < 100 ignore the index.
* Comments would be nice, but may have been removed to anonymize the source.
*Use try/Catch to avoid the repetition of the @@Error calls.
* [Personal, dogmatic, stylistic bugbear alert] Put the parentheses around the parameter at the top so that it is
"dbo.delete_some_stuff(
@date DATETIME
)
AS …"
I don't consider them optional.

Reply
Zion

February 7, 2018 at 9:40 am

well, i’m not mssql expert but it seems that all data saved in temp tables is lost.
Since it looks like archiving process, the insert from temp tables is missing and all the transaction handling has no purpose.

Reply
OZ Student since 2021

February 8, 2018 at 8:13 am

Yup, definitely written by the Almighty Creators ___, was looking at this SProc 2-3 days ago, also wondering “WHY LORDI, WHY?!?” (yes. Lordi, the awesome metal band.)

I think the code may have been written eons ago and not looked at since, with the assumption that “it might have to deal with massive amounts of ___ rows”, so it was trying overly hard to not break stuff in this critical ___ database if it had to run for aaages over lots of rows.

Reply
David Simpson VIP Student since 2018

February 8, 2018 at 8:58 am

High priority critique
* Refactor the stored proc to do its work in batches
Low priority critiques:
* Return a result code
* As many others have said, update the error handling
* Use EXISTS instead of COUNT and IN. I happen to know that this will not improve performance on any of my servers, so this would be purely for stylistic consistency.
* Try temp table instead of table variable. Again, I know that this will not improve performance on any of my servers with sufficiently small batches, so this would depend on your environment.
* Explicitly designate table variable columns as NULL or NOT NULL
Things I don’t care about:
* Lack of comments. If a table or view is named poorly, use aliases to make your own query readable. If a stored procedure becomes unwieldy, encapsulate sections into their own procedures.

Reply
Richard Armstrong-Finnerty

February 9, 2018 at 4:17 am

It’s vendor-supplied code and has a History (see what I did there?).
Now ancient, it’s simply been left alone by the vendor, which is a shame as there are other issues related to the code, e.g. missing indexes.
Oracle, too, has ancient code all over the place, with hard-coded hints and other naughtiness, but, again, it’s probably down to If It Ain’t Broke …

Reply
ken

February 9, 2018 at 4:59 pm

We could cheat and run it through a static Sql code checker. That should provide a few laughs.

Reply

The Annals of Hilariously Bad Code, Part 1: Critique the Code

Hey Beavis

Begin Transmission

Get my new posts by email

Keep digging

35 comments

Leave a comment Cancel reply