Getting Started With Oracle Week: NULLs and NULL handling

We’re not so different, you and I

In any database platform, you’ll have to deal with NULLs. They’re basically inescapable, even if you own an island. So let’s compare some of the ways they’re handled between Oracle and SQL Server.

Twofer

If you take a look at the two queries below, there are a couple things going on. First is the NVL function. It’s basically the equivalent of SQL Server’s ISNULL function, where it will return the second argument if the first is, well, NULL.

The second thing you may notice is the ORDER BY. In here we can do something really cool, and specify whether to put NULLs at the beginning, or end, of our results. SQL Server will just put them first, for better or worse. If you want to put them last, you need to do some dancing with the devil. Or just use a CASE expression in your ORDER by.

I love stuff like this, because it gives you easy syntactic access to presentation goodies.

There’s another function, NVL2, which I haven’t quite figured out a lot of uses for, but whatever. It takes three arguments. If the first argument is NULL, it returns the third argument. If the first argument isn’t NULL, it returns the second argument.

The results end up something like this below.

I just learned how to do this, too.

I just learned how to do this, too.

There’s also NULLIF! Which does what you’d expect it to do: return a NULL if the two arguments match. Otherwise, it returns the first argument. Dodge those divide by zero errors like a pro.

At long last, not a Rump

At long last, not a Rump

Last, but certainly not least, is the lovely and talented COALESCE. It’s a dead ringer for SQL Server’s implementation, as well.

Is it me you're looking for?

Is it me you’re looking for?

Intentionally left blank

NULLs happen to the best of us. Three-valued logic can be sneaky. I prefer to use canary values when possible. Those are values that could never naturally occur in data (think -999999999 or something). Again, this isn’t meant to be an exhaustive piece on NULLs and NULL handling, just a toe in the water for any SQL Server people who need to start working with Oracle.

Thanks for reading!

Previous Post
Getting Started With Oracle Week: Aggregating
Next Post
Getting Started With Oracle Week: Creating Indexes and Statistics

33 Comments. Leave new

  • Possible gotcha: in Oracle, unlike in MSSQL, an empty string is a NULL.

    • Erik Darling
      June 9, 2016 9:17 am

      I actually like that better!

    • If the empty string is considered null then I wonder if an empty string equals an empty string. If it realy is null than they don’t right? Which I find counterintuitive.

      • In Oracle, as an empty string is a NULL it can’t equal anything, including another empty string.

        • Damn 🙂

          • That’s trading in the exotic nulls (trying to avoid weird or even counter intuitive which they are but you expand your intuition after you learned right) for something else exotic but hiding the exoticness instead of being explicit about it: you can’t compare something you don’t know with something else. Null is being explicit about it. Canary values produce more intuitive results given inequalities I do admit. But than again it’s part of the package if you realy want to consider unknown as a possibility in the asked queries.

  • Hey Brent, thanks for another great article in your “why Oracle is so damn expensive” series.

    In 10,000 words or less, why do you prefer canary values over NULL? The absolutist in me prefers NULL over canary values, but I barely know my NULL from a hole in the ground.

    • Erik Darling
      June 9, 2016 10:39 am

      Hey Jimmy,

      Thanks for stopping by. It’s a pleasure writing these here blog posts.

      I prefer canary values because inattentive developers will handle NULLs incorrectly in many scenarios.

      Have a great day Tommy!

      • Hey Ben,

        With regards to canary values, wouldn’t an inattentive developer (read: me) need to address/work around canary values much the same as they would NULLs? So instead of filtering out rows where ThingID is NULL, you filter out rows where ThingID=-999999999, right? I guess my high-level question is; what advantage and/or safety net do canary values provide?

        I’m currently working in a shop that uses canary values, and I’ve never understood why.

        I tried using the Googles, but I only found information about the overflowing of the buffers and whatnot.

        Your friend,
        Timmy

        • Erik Darling
          June 9, 2016 11:45 am

          Mary Katherine,

          I see a lot of people trying to code around NULLs rather than try to code with NULLs. I’ll see JOINs and WHERE clauses full of ISNULL and COALESCE, which can really throw things for a loop. Internally, you may also have to deal with a histogram step for NULL values, and SQL will have to maintain a NULL bitmap per row that contains information about whether every column that allows NULLs is actually NULL. Additionally, inequalities act kinda funny when you allow NULLs.

          Canary values are a totally valid choice. Are they always? Nah.

          Yours truly,
          Debbie Brenda Cosmetics Supply Company

          • There is nothing wrong with NOT EXISTS, it’s the solution right. If a language allows for 3VL it should deprecate NOT IN. Just keeping it in for completeness/symmetry because you have IN causes massive amounts of buggy sql. As you say the trouble is with inequalities and not so much equalities. I’m pretty sure that less than 10 percent knows about that caveat and a not in is so easy to write (that’s a very safe bet) Nested IN’s /NOT INS x levels deep are loved by beginners to even intermediate sql developers that cause solutions to fail apparently intermittent as a sub query one is debugging with does not have a null in it. It’s too much of an edge case to leave it responsably in the language. This blog post describes it better than I can here I’m comments. http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/

          • Erik Darling
            June 9, 2016 4:38 pm

            So… You want to deprecate decades of NOT IN code that contains literal values, that will never touch a NULL? And replace it with NOT EXISTS that will either hit tables or force people to add clunky VALUES constructs?

            Dude. You’re losing me.

          • Yes deprecate and I want a public apology from the ANSI sql committee who allowed it to slip into the standard and who allowed the pile of bugs to happen over the decades. Believe me, ask this question in a full conference room and I hope a few hands will go up and say they hit something weird with it which disappeared when they reformulate the query using not exists and hormone with their lives. Having it in the language for so long is not an excuse to keep supporting it.
            And there is a nice way to get rid of it by disallowing an altered or create statement containing N@T $N and just executing the buggy code that exists as is. I’m ordering my T-shirt right now.

          • No, AND binds more tightly than OR so that part is fine. In my example no parentheses were actually necessary, but I’ve been bitten so many times by stuff like:

            where x=1
            and y=0 or y is null

            that I just automatically put parentheses around ORs to make sure they are interpreted properly in the (current or future) context of ANDs

        • Canary values as a means to cover for developers/dba’s that don’t know 3 value logic seems a rather dumb thing too do. Sure I have been bitten by it but once you allow nulls in a column you should expand true/false logic. Outer joins introduce them too tough which is an implicit gotcha.
          These two things are always on my rather when writing tsql. Where col 123 will … return nulls or not? And if you use Where NOT IN col than it will return nothing once there is a single null in col. That’s a real danger that even experienced developers miss (use not exists). Other than these two what is also weird?

          • Comments are not being HTML encoded so greater than and smaller then don’t appear in the previous comments. The first example is about inequality so: col does not equal 123 which will not return col with a “value” of null (which the theory considers a placeholder I think). Indeed, something you don’t know can be in reality the same thing you test for so it is unknown which has the same effect as false.

          • Also, a single null in the list of col has this effect. If there is no null the not in behaves as all of us expect. The standard should deprecate not in for this reason. Code that is correct with data that contains no null can suddenly start to fail when nulls enter. Don’t use not in is my advice.

          • Erik Darling
            June 9, 2016 12:20 pm

            I like NOT EXISTS as much as the next guy, but I’m not sure I’d advise people not to use a totally valid and standard part of SQL rather than understand how to handle NULLs. If you use canary values, you don’t have to place such oddball restrictions on people.

        • Imagine a stored proc that you can pass a match value into:
          — With ‘canary’ values:
          …where MaritalStatus = @MaritalStatus

          –With nulls
          …where (MaritalStatus = @MaritalStatus or MaritalStatus is null and @MaritalStatus is null)

          • I think the last @MaritalStatus is null is redundant and can be dropped. You are describing the pattern where you can make a parameter optional right? I like that pattern too although its considered an anti pattern as parameter sniffing can get you in big trouble. I’ve seen it happen many times. The alternative is dynamic sql which I found inelegant and not maintainable compared to the beatiful solution you describe. Just add a with option recompile add the end of the statement. You will take a hit cause the exec plan will be remade every time. Only if it’s a sprocket that’s executed many times this is an issue.
            Back on the topic of nulls and the pattern. Here you can also use a canary instead of a null and there are no pros and cons to consider right? Instead of passing null one can agree (that’s a slight disadvantage nonetheless) on passing a certain canary. That’s the disadvantage you identify or is it something else?

          • Erik Darling
            June 9, 2016 3:39 pm

            Tom, good luck with just adding Recompile.

            Hope you’re patched up.

            Hope there aren’t any other bugs.

            https://support.microsoft.com/en-us/kb/968693

            https://support.microsoft.com/en-us/kb/2965069

          • No, not so much an optional value as just something that might be unknown: Pass in M to get the married people, S to get the single people, or SOMETHING to get the unknown people. But what’s the ‘SOMETHING’? If it’s null (which I agree is logically cleaner) then the query gets ugly and it’s easy for people to forget. If it’s a canary value ( ‘U’ or ‘?’ or whatever), then the query is always easy to write. And probably faster, too.

          • Mark, I get your point thanks. You use ugly and that’s a qualification expressing your taste. For me personally null is the natural way to go as it expresses “I don’t care about that argument”. Which can get us in the different meanings of null right? Which the purest is unknown IMO. Not applicable is so much used that most consider it and just get it right. For not applicable I would consider a canary btw like n/a which is unique enough given most domains.
            Also, being married is a binary proposition I think. It can be unknown but not something else. Then again there are countries which allow polygamy so it could say 2 times married? 🙂

          • Erik, I remember reading such a kb and thinking about the poor souls who can’t patch such a serious sql server bug. Of course there are but this bug is so serious that It warrants the update. It’s not a performance bug or lack of functionality. It’s a bug against correctness which sits high for me together with security. So yes, if you can’t patch than write the cumbersome dynamic sql, fight the powers that block it or maybe start to plan for an alternative job. The sql 2014 case is new for me but I see it is patched and I hope they did it timely (?)

          • True, ‘ugly’ is a subjective call, so maybe I shouldn’t use terms like that. But I hope we can agree that the canary version is easier to read / understand than the null version. And I’m sure we can agree that it’s shorter, which means less typing!

            Plus I suspect it may run faster, though I don’t have hard data to back that up.

            As for the whether or not marital status is a binary proposition, consider that you might also have ‘D’ for divorced, ‘W’ for widowed, etc. But the cardinality of the concept of marriage isn’t really the point; it was just an example of something you might use for filtering, which you also might not know for some people.

          • Right, those marital statuses also apply here in Belgium. Its not binary I agree. As for the matter of taste, no, I like pistache better sorry 🙂

          • Parentheses in wrong place.
            –With nulls
            …WHERE MaritalStatus = @MaritalStatus OR (MaritalStatus IS NULL AND @MaritalStatus IS NULL)

          • @Tom Pester: I just realized we may have different interpretations of what passing null into the hypothetical stored proc that does the marital status query might mean:

            I meant it as “find the people whose marital status is unknown”, but I think you were thinking “find people without regard to marital status”. This highlights another problem with nulls: there are actually a whole bunch of different meanings that ‘null’ might represent!

  • Hers is another one that serves as an excellent interview question asked by Mr Burns : select * from table where x is smaller than 5 OR x is 5 OR x is greater than 5.
    Surely this returns the whole table no? No

  • Renato Pedroso
    June 10, 2016 3:36 am

    Hi Brent. If you’re starting with Oracle, i’d suggest you check Flashback Technology if you haven’t already…..

    ps: I use nvl2 when i want to check if some column is null, but i want to return some status value of another type. It’s basically a simpler case.

    eg: select nvl2(commission,’Has commission’,’Doesnt Have commission’) from table

  • I’ll never resign myself to this ‘FROM DUAL’ thing 🙂

Menu
{"cart_token":"","hash":"","cart_data":""}