SQL Interview Question: “Talk me through this query.”

Last month’s post “For Technical Interviews, Don’t Ask Questions, Show Screenshots” was a surprise hit, and lots of folks asked for more details about the types of screenshots I’d show. Over the next few weeks, I’ll share a few more.

Normally I’d show this query as a screenshot, but for easier copy/pasting into comments, I’m showing it as code here.

I’d say to the job candidate, “You’ve been asked to take a quick look at this code as part of a deployment. Explain what the business purpose of the code is, and tell me if there’s anything that concerns you.”

After a few days, I’ll follow up with my own thoughts in the comments.

Previous Post
SQL Server 2016 Release Date: June 1, 2016
Next Post
Where Clauses and Empty Tables

48 Comments. Leave new

  • Stored procedure that returns all items for a given category with their current prices.

    Things that concern me:
    1. GetDate(). Someone’s storing datetimes in local time instead of UTC time. Enter all the assorted timezone bugs.
    2. If price is null on an item and there is no override, the item is free.
    3. Price is the price at the time you retrieve it, if it’s kept too long, it’s no longer valid.

    Reply
  • remi bourgarel
    May 2, 2016 9:26 am

    1 – @Category = NULL won’t work, you have to use IS NULL
    2 – many item can be returned if there is manyovverrides
    3 – no code convention respected : sometime the column is prefixed with table name, sometime not, sometimes camel case sometimes not
    4 – use TVF instead of SP
    5 – category should not be a text field but a table referenced by foreign key
    6 – the coalesce with the price is odd : null doesn’t mean 0, null means either the business doesn’t know the price or at least the systme doesn’t know it, so you’d better take these item out of your results (0 means free !!)

    Reply
  • Chris Tucker
    May 2, 2016 9:37 am

    1. The parameter @category does not have a default value, therefore will always be required.
    2. The SalesStartDate and SalesEndDate logic seems too inclusive, Is that a true requirement?
    3. Procedure name does not specify the object it’s working against, what are the companies naming standards?
    4. There are no comments.
    5. The magic name of ‘Default’ is a problem, consider abstracting to a lookup table.

    Reply
  • Brandon M.
    May 2, 2016 9:53 am

    Speaking of start/end range check; doesn’t this query look at sales that start “after now” and end “before now,” thus, a sale that starts after it ends?

    Reply
  • Brendan Morgan
    May 2, 2016 10:12 am

    = NULL will not behave correctly. That must be changed to IS NULL. Need to verify the primary key of the joined tables to make sure we only get one price per item.

    No handling of time zones in the date matching.

    Could say items are free.

    Reply
  • And of course comparison of SaleStartDate and SaleEndDate with current date is incorrect and you should swap the place of the comparison operators.

    Reply
  • In addition to the ones already mentioned, were missing a nocount on setting, and the tables aren’t schema/owner qualified. There’s also no ordering, which may be a problem depending on how this is used.

    Reply
  • Unicode parameter declared, but ‘Default’ not Unicode (no N prefix), so implicit conversion occurs to the default code page of the database, which may result in garbled text in the parameter.

    IF @Category = NULL, instead of IS NULL

    The COALESCE will lead to NULLs being zeroes. Could be an accident waiting to happen.

    Mixing case for alias “i”/”I”.

    Selecting for the SaleEndDate being less than or equal to current time looks like a non-no, if, as implied, SaleStartDate is supposed to be greater than or equal to current time.

    Using a WHERE clause for Category, instead of including it in the ON clause, means that the filter is applied after the JOIN, rather than during, so it is, behind the scenes, converting the LEFT OUTER JOIN to an INNER JOIN, as no NULL records from table PO will ever be returned. May not be what the query was designed to do.

    Reply
  • I’m curious as to the length of “Items.Category”, because the parameter length is NVARCHAR(20). I wonder if Category is also an NVARCHAR(20). I would want to make sure the parameter matches the column so there are no implicit conversions or truncation (if a parameter of over 20 character is passed).

    Reply
  • Andy (@SQLBek)
    May 2, 2016 12:25 pm

    *GASP*

    I see inconsistent camel casing on column names!!!

    SELECT i.itemID, i.itemName,
    ON i.itemID = po.itemID

    vs

    COALESCE(po.Price, i.Price, 0) AS Price
    LEFT OUTER JOIN PriceOverrides po
    ON i.itemID = po.itemID
    AND po.SaleStartDate >= GETDATE()
    AND po.SaleEndDate <= GETDATE()

    REJECT!

    Reply
  • where i.category = @Category => data skew -> possible bad execution plan due to “bad/ easy to trick ” statistics.

    consider option recompile if the proc is not called too often(tm).

    ^ This and null comparation.

    Reply
  • Not to spoil all those who have already commented on =NULL not working, but the =null check isn’t automatically wrong if ANSI_NULLS is set to ON. While I agree that’s not the default, nothing prevents someone from setting it. There are some subtle gotchas in the BOL page for that setting if you haven’t read the fine print, including things that require it to be ON, and “The SQL Server Native Client ODBC driver and SQL Server Native Client OLE DB Provider for SQL Server automatically set ANSI_NULLS to ON when connecting.” https://msdn.microsoft.com/en-us/library/ms188048.aspx

    Of course the parameter does not accept null in the first place, as several have also pointed out, but that’s not the point I was trying to make about =NULL.

    Reply
    • RD Francis
      May 5, 2016 5:12 pm

      Unless it’s changed since 2008, you could do this:
      EXECUTE usp_ByCategory NULL;

      In which case @Category would be NULL.

      Reply
  • Also, no error handling.

    Reply
  • High priority – wrong results
    – If goal of IF line is to replace NULL @Category with ‘Default’ then this line will work only when ANSI_NULLS is OFF. Because this setting is deprecated, a future version, will enforce ANSI_NULLS ON this line is dangerous. It should be replaced with SET @Category = ISNULL(@Category, /*N*/’Default’). Regarding data type of @Category parameter see first note from Medium priority section.
    – If for the same itemID there are many rows within PriceOverrides table, then for the same itemId are allowed overlapping ranges ? If not then there are constraints which prevents overlapping time ranges ?
    – What is PriceOverrides.SalesEndDate’s data type ? Is it DATE ? How is filled this column ? If this column contains only the DATE part this BETWEEN filter could return less rows. Note: GETDATE() returns a datetime value (DATE + TIME). Example: if for one itemId the PriceOverrides.SalesEndDate /*DATETIME*/ = {ts ‘2015-05-03 00:00:00.000’} and GETDATE() = {ts ‘2015-05-03 14:28:42.997’} then current row will be excluded. If only the DATE part is filled then GETDATE() could be replaced with CONVERT(DATE, GETDATE()).

    Medium priority – performance is affected
    – We should check @Category’s data type and max. length by comparing with definition of Items.Category column: is it NVARCHAR or VARCHAR ? Is correct the maximum length ? If data type of Items.Category is VARCHAR and this column is indexed and column’s colllation is SQL% then instead of Index Seek the execution plan will include an Scan data acces operator.
    – There is an index on PriceOverrides (itemID, SalesStartDate, SalesEndDate) # Price ?
    – If there is above index and SalesStartDate & SalesEndDate’s data type is DATE then an implicit conversion will be added to XP because GETDATE() returns a datetime value (according BOL). Solution: GETDATE() could be replaced with CONVERT(DATE, GETDATE()).

    Low priority – formatting
    – SP should have a better name – usp_ItemsByCategory
    – Formating – parameters should be on separate line(s)
    – itemId should be ItemID or ItemId.

    Reply
  • SQL Server Performance Tune
    May 3, 2016 9:19 am

    Is SaleEndDate NULLable?

    Reply
  • [Code]
    CREATE PROC dbo.usp_ByCategory
    @Category NVARCHAR(20) = ‘Default’
    AS
    SELECT i.itemID,
    i.itemName,
    COALESCE(po.Price, i.Price, 0) AS Price
    FROM Items I
    LEFT OUTER JOIN PriceOverrides po
    ON i.itemID = po.itemID AND
    po.SaleStartDate = GETDATE()
    WHERE i.Category = @Category;

    [/code]

    Reply
    • CREATE PROC dbo.usp_ByCategory
      @Category NVARCHAR(20) = ‘Default’
      AS
      SELECT i.itemID,
      i.itemName,
      COALESCE(po.Price, i.Price, 0) AS Price
      FROM Items I
      LEFT OUTER JOIN PriceOverrides po
      ON i.itemID = po.itemID AND
      po.SaleStartDate = GETDATE()
      WHERE i.Category = @Category;

      Reply
  • Rusty Shackleford
    May 3, 2016 11:29 am

    There is a lot of yuck in this query as already mentioned, but circumstances allowing the price to be zero seems like a really poor plan and not good for business. I’m also concerned if someone happens to have more than one price override for an item as the query would then return multiple rows. Which override price is the right override price in that case? Something better make sure that price overrides can have only one active override…

    Reply
  • Maybe I am just always looking for traps, I would have addressed the first part of your questions, that you disguised as one question, first and would have responded something like “I can give you my guess at the BUSINESS purpose of the code, but honestly, the real purpose would be unknown to me since I have no business context to go on. If you tell me who is\are the owner(s) of the code, I can go ask them what they are trying to accomplish. While I am at it I would give them the following list of questions and concerns I have with the procedure as it is written.”

    Well, that is what I would like to think I would have said, who knows what would actually come out under the pressure of a real interview.

    Reply
    • Jason – part of being a data professional is looking at queries and guessing what the author was trying to do.

      I’m gonna give you some tough love here: if I’m your manager, and every time I ask you to look at a query, you’re going to ask to talk to the original author (who may have long since left the company), it’s not going to work out well.

      Reply
      • Jeff Stebbens
        May 4, 2016 2:02 pm

        Question then, as I am just starting out as a Jr. DBA. The topic is “You’ve been asked to take a quick look at this code as part of a deployment.” In a perfect world, you’ve hopefully been involved in a design review and/or a code review. If this is coming to your door as a deployment, would you not have scheduled a code review to ask questions and get the dev team in a room with you?

        Am I looking at this wrong? Or should I be looking at if from a new hire point of view where your predecessor started this and you are expected to finish it?

        Just a question so I can learn. Thanks!

        Reply
      • There is a difference between being an existing employee and a new employee in any tribe. AT BEST, interviews simulate the first more than the second in my opinion. And I get it, you are trying to ascertain knowledge and behavior in a setting that is difficult for everyone.

        If you read my reply a bit closer, you will notice I was willing to give you what I thought the business case was, but if you wanted a real answer and not one that is made up, outside of possessing super powers, the only way anyone will really know the business case of anything is to talk to people or read documents. In fact, the manager asking this question is doing just that I believe, they just didn’t do a good job picking the person they asked and from a tough love standpoint, I would say bad on that manager.

        Really not being defensive here, I love debates. 🙂

        Another issue I have with you claiming tough love, and I speak to the business reason portion of the question only, is it seems you have projected some self believe on the situation. It appears you may have made the assumption that how a person answers one question establishes a pattern, but then you take that and use it in a whole different context with no evidence to say they would react the same.

        That all said, in my current gig, and has a manger in past jobs, I have had to work with code bits, product areas and entire systems that the original person(s) who were involved had long since flown the coop. I will say in almost every one of those cases, there where bad decisions made because no one knew what the real business case of something was before I/they changed it which translated in a regression for a customer or having a lot of explaining to do lucy. Both cost time and money and almost always more than what it would have taken to ask a simple question or do a simple search. Do I just not move forward if I can’t, no and that is probably why I still have a job to be honest.

        So, let us assume this proc worked, there is more than a single business case that the procedure addresses. Depending on the case the proc could not work at all correct, even if you address all the technical issues. Example from other comments, is a zero price a valid answer for the real business case? No? You sure? Perhaps this is only used for some internal (to us or our customer) and they want products that don’t have costing to return 0 because the person who reads the report is an accountant and reads numbers faster. Or it sorts by price and all the zero values are then shown at the top or bottom of the report. Whoops, probably should not have “fixed” that.

        Reply
  • Stephen Merkel
    May 3, 2016 11:59 am

    Why didn’t anybody answer the first question — which was “Explain the business purpose of this code”? Ignoring this question may be more important than most of the other syntax details identified so far. Probably doesn’t matter how or if it performs if it doesen’t meet the business purpose.

    Reply
  • Carlos Torres
    May 3, 2016 3:24 pm

    The proc is trying to get all the items and their current price, if any. If no price, then it’s free.

    Concerns:

    1. The proc doesn’t have a BEGIN and END. Only the first statement actually belongs to the proc.
    2. @Category = NULL wouldn’t work.
    3. The assignment to @Category should be N’Default’ to avoid an implicit conversion.
    4. Identation
    5. Missing schema in the tables (dbo?)
    6. GETDATE. Hmm. “Does the business have multiple sites across time zones?”
    7. If there is not a price, the item is free.
    8. The date range can be changed for a BETWEEN instead.
    9. Might return more than one current price for the same item.
    10. Table Design: Items should have a CategoryID and not a category description.
    11. Alias i vs I

    Reply
    • @Carlos Torres:
      “”The proc doesn’t have a BEGIN and END. Only the first statement actually belongs to the proc.”
      Unfortunately, BEGIN and END are optional for a SP (if I were Microsoft, I would make them mandatory). This may lead to misunderstandings to the person that reads the code.

      Reply
  • The object of the proc seems to be to pull prices out for items of the specific category, overriding with the sale price if the dates match, although I think the comparison operators might be the wrong way round and there would have to be a business process elsewhere to ensure that there were no overlapping periods for the item/sale date combinations.

    Also there’d need to be a ‘default’ category for a null input to report anything at all.

    Plus all the stuff everyone else said…

    As for asking the original developer what it’s for, it would be more fun to describe what it actually does, then ask if that’s what he was after. You make way more friends that way…

    Reply
  • The business purpose appears to be retrieving items for any given category (or the default category if none are specified) and displaying it’s current price, starting with an override price should it fall within a sale then falling back to its default or worst case 0.

    As to things that could potentially be wrong.
    – @Category = NULL instead of @Category IS NULL
    – The sale price will only show if it happens to be on sale at the exact millisecond the query is ran on the server
    – The category text should probably not be stored on the items table, but in a separate table referenced by a FK.

    Reply
  • There’s also another issue i believe which nobody I dont think anyone has commented on. The join on PriceOverrides wont return any data because the getdate checks are the wrong way round for the start and end dates. The proc as it stands is checking for overrides which start after “now” and end before “now”.

    Reply
  • When presented this question, people are asked to put on an interviewee hat (to demonstrate technical knowledge). It’s slightly different from the DBA hat (support and keep production safe).

    If I’m wearing my DBA hat, I get to expand the scope of the original request. One question I’d ask is “Hang on, is a version of this in production today? Or is this a new sproc?”

    In a real interview, I’d have to make a judgement call based on context whether it’s more valuable to demonstrate attention to detail or to demonstrate out-of-the-box thinking.

    Reply
  • This seems to be intended to find items with errors in either pricing or sales period date range in a given category, or to get a price list of items in a given category including items that are on sale today only.

    Concerns:

    Where clause returns the items that are on sale only on GETDATE date. so all other items, even though they might be on sale now, will return regular price/incorrect price. In other words only non-sale and today-only items will have the correct price returned.

    COALESCE seems to have been intended to find both the CURRENT price and pricing errors. If I.price is NULL then it’s easy to see something is wrong and fix the pricing. I dont think there is anything wrong with this technically but if the sale price was mistakenly inflated, this will not catch the error. Yet, the error in WHERE clause will cause unexpected result returned by COALESCE.

    Even if the result set was correct, the lack of information is a concern. While it might be helpful to know the price, without knowing items are on sale seems to make the data set incomplete and only half useful.

    Parameter default value should be set in declaration, with N

    Reply
  • The business purpose?

    Get the Item ID, Item Name, and Item Price for items in a given Item Category. If the Item Category is not passed to the stored procedure, then the ‘Default’ Item Category is used.

    If there is a Price in the Price Override table with the following criteria, return that Price instead of the Price in the Item table:
    Price Override Item ID is the same as the Item ID in the Item table AND
    Price Override Sale Start Date is greater than or equal to the current date time AND
    Price Override Sale End Date is less than or equal to the current date time*

    If there is no Price value in either the Item or Price Overrides table that matches the above criteria, then use 0 as the default price for an item.

    * As Darren W noted, the date comparison operators appear to be incorrect for what I assume the business would expect out of this sp.

    Reply
  • WillemWout
    May 4, 2016 3:19 pm

    I’m a (production)DBA, not a coder, so my concern is that this code seems written without proper lay-out, no comments of what it does, probably not documented etc. Seems quick and dirty to me, let alone that fancy stuff like errortrapping or debug instrumentation is there.

    This layout seems sloppy, and I am unwilling to compensate their lack of technical leadership.I never answer questions about the business purpose, or about coding syntax. Giving comments on code it the deployment phase is a career limiting move. At best, you will become co-responsible in a too-little-too-late phase, when the project is way overdue ” the DBA is blocking the deployment because he thinks there’s something wrong with the code”. I have commented many times on the things I had to deploy but I was never interrupted by applause.
    Sorry if this answer is not going into your intended direction. But I feel better now. 🙂

    Reply
  • The purpose of the procedure appears to be to return the prices of items for a given category, returning a the item’s sale price, if it is on sale.

    I’d have to ask what is the “default” category is and if one exists – or want to see the list of products in that category to determine its use because the author of the SP may be expecting zero products in the “default” category and is using “default” to guarantee 0 rows. My other concern is the author doesn’t make sure the product category exists before getting the products and prices. In either case, I would want the author just require a valid product category to be supplied before using SQL resources to run a select against a table that may contain thousands/millions of records and if this procedure was called very often then it would only compound the problem.

    With that issue outstanding, the procedure should look something like this….

    CREATE PROC dbo.usp_ByCategory
    @Category NVARCHAR(20) = ‘Default’
    as
    SELECT i.itemID, i.itemName,
    COALESCE(po.Price, i.Price, 0) AS Price
    FROM Items I
    LEFT OUTER JOIN PriceOverrides po
    ON i.itemID = po.itemID
    AND GETDATE() between po.SaleStartDate and po.SaleEndDate
    WHERE i.Category = @Category

    Reply
  • It appears the intent is to retrieve the (current) price of an item, including the item name, based on category, and returns 0 price if a price isn’t available

    Before asking questions about the stored proc itself, you probably should start asking questions about the table designs and what it’s trying to achieve. Questions may include:

    – Why are you storing price on the item table?
    – Price on the items table appears to be nullable based on the query. Is that true? If so, why?
    – What happens if the price changes over time and the price changes are not related to override/sales events?
    – Does the items table behave like a SCD Type2?
    – Is there a Category table?
    – Is there CategoryID in the Items table?
    – If there’s CategoryID in the items table, why is there a Category column that appears to be string based?
    – Why are prices (by time period) not stored in its own price table?
    – What led to the decision to store price overrides instead of price adjustments, or some other method?
    – What is the datatype of the startdate and end date fields in the price override table (e.g. date vs. datetime/datetime2/etc)?
    – What happens if there are price overrides for the same item within the same time period? Is it possible?

    And probably some other questions that may be missing

    (If you happen to be talking to the database architect who designed this database, you may have offended this person enough by questioning his data architecture that you’re immediately shown the door)

    Reply
  • Great job everybody! You did a great job on sniffing out things I’d be interested in hearing about.

    Now, here’s what I’d be looking for as an interviewer:

    First, I want to see that the candidate can read the query and explain what it’s doing in plain English. As a data professional, a lot of your job involves reverse engineering someone’s code to determine whether or not it has bugs, and if it’s going to produce the intended results. I want to see someone take 10-20 seconds reading the query and then say, “It looks like they’re getting a list of prices for items in a specific category, with overrides based on their sales dates.”

    Next, I want to see an attitude of helpfulness. There are a bunch of pitfalls in the query, and I want someone to find as many as possible, but I want them to be cheery and helpful when they’re pointing out these issues. Developers often joke that DBA stands for Don’t Bother Asking, and I need a DBA that doesn’t personify those initials. I want someone who can be upbeat when doing code reviews.

    I want to see that the candidate understands the difference between a code review and a design review. If we’re just going live with a query, we can’t refactor the database design. (Meaning, I don’t want to see a candidate say, “Well, you shouldn’t be storing the data this way, you shouldn’t allow nulls here, etc.” – that kind of thing can’t usually be fixed in a simple query code review.)

    In terms of issues in the code, I don’t expect a candidate to raise all of these questions in the span of a couple of minutes, but the more they raised, the better:

    Really important ones:
    * No BEGIN/END in the stored procedure
    * The difference between = NULL and IS NULL
    * Possible issues with implicit conversions on an NVARCHAR parameter
    * The PriceOverrides join could return multiple rows for the same item

    Bonus points:
    * Parameter sniffing concerns when using a local variable inside a proc
    * No comments – if this is a code review, asking for comments is fair game
    * Tables need to be qualified with schema name
    * Time zone issues with GETDATE() if the server fails over between time zones
    * Case sensitive errors – the i/I aliasing won’t work if the server is case-sensitive

    That was fun! We’ll take a week off and do this again next week (May 16th).

    Reply
    • I’m surprised that, given the open ended ness of the original question “You’ve been asked to take a quick look at this code as part of a deployment. Explain what the business purpose of the code is, and tell me if there’s anything that concerns you.”, you don’t want the candidate to look into possible design issues based on this code and limit the scope to be strictly code review of what is possibly wrong with the stored proc

      P.s. The name of the proc is also pretty unintuitive, what am I exactly doing by category

      Reply
      • Brian – the key is in the phrase “quick look at this code as part of a deployment.”

        Reply
        • Read the first sentence, maybe I didn’t interpret the first sentence as code review only but moreso as an opportunity to review, and interpreted “tell me if there’s anything that concerns you” as a wide open question that begged the interviewee to look beyond the code that’s in front of you and see the bigger picture. I guess that’s what I would like to see when I’m the interviewer.

          Reply
          • Brian – OK, great! This post happens to be about what *I* expect when *I’m* the interviewer, but I totally understand that different folks will look for different things.

    • Mark Freeman
      May 12, 2016 10:56 am

      >No BEGIN/END in the stored procedure

      How would you feel about no BEGIN/END but a GO at the end instead?

      Reply
  • James Parker
    May 9, 2016 2:16 pm

    I know I am very late to this party so going to leave out the details of my answer.
    After doing this work for over twenty years now the things that are listed and not listed in the posts amaze but do not surprise me.

    My “Real World” “Tough Love” answer that I give as a professional DBA and deployment engineer:
    Re-writ the entire procedure, update the code in the Version Control System before/during deployment, send an email to management letting them know that this code failed code review and has been updated to something that worked, compiled correctly, and returned a distinct record set.

    In the “Real World” where things need to work and deployment code is submitted the day of deployment we would all still be using SQL 2000 SP2 and IE 6 if bad code was not corrected on the day of deployment.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}