If You Could Only Say One Thing About Writing Good Queries…

SQL Server
33 Comments

If you could give just one piece of advice about writing good queries, what would it be?

I asked Twitter yesterday because I wanted to make sure I didn’t miss anything in a new training course I’m working on, and the replies were fantastic. Here were some of my favorites:

What about y’all? Leave your most important tip in the comments, and read the Twitter thread for tons of gold nuggets.

Previous Post
Execution Plans Don’t Have the Yellow Bang They Really Need.
Next Post
What’s the Buggiest Feature in SQL Server?

33 Comments. Leave new

  • Koen Verbeeck
    May 4, 2021 9:22 am

    Format the damn thing.

    Reply
  • Try to use new built-in functions or stored procedures which each new release of the SQL Server or any other DB for that matter seems to be coming up with.

    Reply
  • scott.evans
    May 4, 2021 4:18 pm

    Keep it simple, the simpler the query, the better it will run and the easier it is to modify.

    Reply
  • Blaine J Trimmell
    May 4, 2021 4:24 pm

    Result sets are not ordered so if you care about the order then you must use an Order By.

    Reply
  • Jürgen Leis
    May 4, 2021 4:24 pm

    Think relational, not algorithmic.

    Reply
  • Document Why it’s written the way it is. Any business rules that it implements, any “clever” code, anything that another person would need to answer the question, “What was he thinking when he wrote this?” Most good SQL devs/engineers will be able to figure out What the code does, or at least what the steps in it do, but Why is critical. Most likely, the person asking “What was he thinking?” will actually be future-you asking, “What was I thinking when I wrote this?” Make life easier on future-you and you’ll end up with less to regret and more to be happy about.

    Reply
  • Butch Weber
    May 4, 2021 4:34 pm

    If you’re joining tables, put the damn table name (or alias if using those) before each column name. Don’t assume everyone knows which column goes in which table.

    Reply
  • Chad Estes
    May 4, 2021 4:40 pm

    Learn and appreciate niche features; use them when appropriate.
    Things you once thought of niche features will become everyday tools. For me personally, a whole new world was opened up each time when I learned about windowing functions, grouping sets, and (un)pivot functions.

    Reply
  • The fastest path to the records you want is to eliminate the ones you don’t as quickly in the process as possible.

    Reply
    • Seb Thirlway
      May 5, 2021 11:50 am

      Yes!

      The query optimiser will _probably_ make sensible decions. But for clarity I much prefer

      SELECT things
      FROM
      (SELECT OneRow FROM SmallLookupTable WHERE Name=’TheOneImLookingFor’) tinyset
      INNER JOIN
      (SELECT LoadsOfStuff
      FROM
      [Loads of joined enormous tables])

      to

      SELECT Things
      FROM
      [Loads of massive tables]
      […continues for several pages]
      WHERE
      [complicated logic, continues for lines and line] AND
      [Actually I only wanted one row out of the lookup table]

      Reply
  • Seb Thirlway
    May 4, 2021 4:43 pm

    If you’re constantly using DISTINCT (or indiscriminate GROUP BY), UNION or CTEs, you’re…. doing it wrong. And, as Cody puts it so well above: it gets the hose again.

    These are all incredibly useful in their place. Too often they’re used out of place.

    DISTINCT because for “some reason” the resultset (or subset) is returning duplicates. Why are there duplicates? Do you understand the cardinality of the tables involved?

    GROUP BY… again, because duplicates are being returned.

    UNION to filter the same set using an OR in the WHERE. Use ORRRRRR, that’s what it’s FORRRRR . UNION hits the set (e.g. table) twice. Or even more times….

    CTEs because the logic is “complicated”. You can only use a CTE in a single SELECT statement. No, this is not a reason to stick in a UNION to make a single SELECT statement possible. Every time someone else scrolls up 4 pages to find your ****verd****te CTE and figure out how it interacts with the logic 4 pages lower, a puppy dies. And fantasies of vengeance grow another inch towards realisation. If you use multiple CTEs all referring to each other, spanning 4 pages, then 2000 puppies and 1500 fairies die, the Virgin Mary cries, and the hitman is already turning the corner of your street.

    Put your logic all in the same place so someone else can see what you’re doing.

    Reply
    • Bad Oedipus
      May 4, 2021 5:51 pm

      Iced the hit man, told Mary to get over it, clapped real hard to bring the fairies back to life, ¯\_(?)_/¯ not sure what to do about the puppies, someone should have spade or neutered their pet like Bob Barker recommends; 4 pages of CTE’s referencing themselves for the win. ;-P

      Good recommendations though.

      Reply
    • Regarding the union: It depends. If you want to filter on column A or B and both have an index, it may be better to do the union as SQL Server sometimes prefers to ignore both indexes with an or. Then the or becomes much more expensive than the union.

      Reply
  • Richard Armstrong-Finnerty
    May 4, 2021 5:05 pm

    Check for new features (or older ones) that can do what you want with ease.

    Reply
  • jeff moden
    May 4, 2021 6:03 pm

    Heh…

    I have 4 basic rules for code writing… “Make it work, make it fast, make it pretty… and it ain’t done ’til it’s pretty”.

    I have one rule for scheduling… “If you want it real bad, you’ll probably get it that way”.

    When it comes to choosing two of the three items of “good, fast, or cheap”, there’s only one choice… “good”… because if you always practice “good”, fast and cheap will naturally follow.

    Reply
  • Shout out to my homie, MB. Friends don’t let friends ‘SELECT *’.

    Reply
    • Notable moments over the past few weeks:

      this is what it is doing… from the start of the select * to the load of the truncated table
      -MB

      the query is
      SELECT *
      FROM [Table]
      -MB

      i don’t know why it takes so long now, it only uses 4 columns. it’s a select * but that shouldn’t matter
      -MB

      Reply
  • Kim Crosser
    May 4, 2021 8:47 pm

    Avoid scalar functions whenever possible – inline table-valued functions work much better and faster.
    Prefix all your fields with table names or aliases, and if using aliases, make sure the alias name conveys the intent of the table name. (i.e., don’t ever use anonymous aliases – aa, bb, cc, x, t, etc.).
    When returning wide result sets, try to get the record keys in an efficient inner query, then add any fields that aren’t part of the query criteria at the end – don’t make the system shuffle gigabytes of data when your inner query could return a simple set of record keys.
    If you are doing something more than once, that something should be a function.
    If you coded something well, it should feel elegant, and vice-versa.

    Reply
    • Seb Thirlway
      May 5, 2021 12:00 pm

      “When returning wide result sets, try to get the record keys in an efficient inner query, then add any fields that aren’t part of the query criteria at the end”

      Oh, this. This, a million times. Work with keys, work with key, work with keys… Get a resultset with just the key(s) you need. Check that the population is correct and unduplicated. Then decorate it.

      This avoids monstrosities like

      SELECT
      PersonID,
      MIN(FullName),
      MIN(AddressLine1),
      MAX(TaxPayerReference), — Ticket number BLARG666: MIN didn’t work for [reasons], shove MAX at it and hope
      AVG(AgeInYears)
      FROM
      [I can’t be bothered to work out why the monstrous procession of INNER, OUTER and CROSS JOINS, UNIONs etc, which sits here, returns duplicates]
      GROUP BY PersonID

      But you have know what the unique keys are. This is just one of the many reasons why heaps are evil. A table with a PK is self-documenting. Whatever other nonsense is going on in there, the PK column is guaranteed unique.

      Reply
  • Only one thing? Collect most of above suggestions, put them in the Coding Standards of your company, and tell everyone to:

    “Follow the Coding Standards.”

    It will not work, but at least you tried…

    Reply
    • HAHAHA, I was so excited until that last line, and then I remembered that yes, you’re right. Dang.

      Reply
      • Jeff Moden
        May 5, 2021 1:49 pm

        Heh… you should still be excited… it helps consultants, trainers, a book writers busy forever. 😀 It’s like a volcano… it’s always there just waiting to happen and, when it does, you don’t know which direction it’s going to go but it’s always a spec-hackular opportunity. 😀

        Reply
  • Jeff Moden
    May 5, 2021 1:51 pm

    And pardon the typo’s… I’m only on my second cup o’ joe! 😀 I think I’m legally dead until my third one. 😀

    Reply
  • I’ve compiled the list below from all suggestions until now! Great advices from and for all! Congrats!

    1. Write neat code.
    2. Pay attention to the indentation.
    3. Leave a comment where necessary – think about other team members who’s gonna read and understand your code.
    4. Splitting in two separate SPs (not too many) can make it easier to read/understand the main logic/control flow
    5. Understand the “what” and the “why”, aka the business requirements, before you start coding.
    6. Before running UPDATE or DELETE, rewrite the query to a SELECT query and make sure that the outcome is what you would expect.
    7. Know your data
    8. Know your indexes
    9. Know your query plan – always start with your FROM and JOINs, then WHERE, then any aggregate clauses, and only at the end worry about the SELECT clause fields (COUNT(*) or COUNT(1)).
    10. If you need to deconstruct for simplicity, do it by chaining CTEs not using temp tables (unless reusing data and not just simplifying)
    11. Do not abuse DISTINCT, only use it when you really need it and only after you’ve tested your query with COUNT(*) to make sure you are pulling only the necessary records – in other words, only use DISTINCT when it becomes obvious you need it to eliminate duplicates, which should be rare in most cases.
    12. Keep it short and simple as possible – a future maintainer (maybe yourself) will thank you.
    13. What isolation level can you get away with. Are dirty reads OK?
    14. Do not use more joins than necessary nor more columns than required.
    15. Avoid “Select * FROM”.
    16. Make sure you are on the test server before you press F5.
    17. If you have some joins, just don’t start writing like a dev maniac: think about the best possible way to filter data on each join so each one will need less IO than the previous one.
    18. Think about its reusability.
    19. Anything that makes your life easier makes the optimizer’s job harder

    Reply
  • AlessandroD
    May 6, 2021 8:29 am

    more often you can, review code and comments the next day

    Reply
  • marcscirri
    May 6, 2021 4:50 pm

    Understand how data is stored and how the engine does its thing. There are many ways to get the data that you want, and in most cases, there is a “best” way.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.