Getting Started With Oracle: Working With Strings

Oracle
9 Comments

The almighty string

It’s so good for holding all sorts of things. Chicken roulade, beef roulade, salmon roulade. It’s also the way you should store phone numbers. If I could go back in time to when I first started working with SQL, that’s what I’d tell myself. Stop. Just, please, for the love of Codd, STOP TRYING TO STORE PHONE NUMBERS AS INTEGERS. It will only end in heartbreak and weird exponential notation.

But we’re here to talk about Oracle. And strings. And stuff you can do with them. I spent one Christmas writing a text parser that my wife still brings up when we go shopping. So this is an important area to me.

First things first, man

Oracle ships with a function to put your strings in proper case. It sounds trivial to most people, until you look at the amount of time, energy, and forum posts that have gone into getting the same behavior out of SQL Server. Why this isn’t built in is beyond me. I guess we really needed CHOOSE and IIF instead. Those are super helpful. Game changers, the both.

But check this out!

I LOVE YOU ALL CAPS
I LOVE YOU ALL CAPS

But it doesn’t end there! I’m going to skip over the CONCAT function, because it only takes two arguments. In a situation where you need to do something like ‘Last Name, First Name’ you have to concatenate the comma + space to the last name. I think you actually end up typing more than just writing the whole thing out. So how do you do that?

You use the double pipes (||) to tell Oracle to mush everything together. A slightly more complicated example is if you needed to generate a ‘password’ based on some different bits of data.

Oracle is nice enough to let you put it all together without whinging about data types:

Hint: don't actually generate passwords like this.
Hint: don’t actually generate passwords like this.

Oracle also has easy ways to pad strings, using LPAD and RPAD. This beats out most methods I’ve seen in SQL Server, using RIGHT/LEFT and some concatenation inside. It’s another situation where if you mix data types, the conversion happens automatically.

What you get is about as expected: strings padded to 6 digits with the character of your choosing.

Rubber Room
Rubber Room

SQL offers RTRIM and LTRIM to remove leading and trailing spaces from strings. If you need to remove other stuff, you have an additional step of replacing them, or calculating substrings. Oracle’s TRIM function gives you several different ways to have it operate. Oracle also has REPLACE, but whatever. It does what you think it does.

TRIM by itself will remove trailing and leading spaces. You can also hint it to only do trailing, leading, or both, and additionally specify which character you want to remove. In the example above. I found a phone number that started with a 6 and ended with four 6s. Below is how each run of the TRIM function worked:

Several Sixes
Several Sixes

It’s much more flexible and useful than SQL Server’s trimming functions, because it can be used to trim off things other than spaces.

Second String

Oracle has a really cool function for searching in strings, cleverly titled INSTR. It takes a few arguments, and sort of like CHARINDEX, you can tell it where in the string to start searching. The real magic for me is that you can also specify which OCCURRENCE of the string you want to find. The only downside is that it appears to be case sensitive.

That runs, but it only gets us a hit for people whose names start with A:

A is for Ack
A is for Ack

All those B names have an a for the second character. But it’s easy to solve though, just use the UPPER function (or LOWER, whatever).

Just like CHARINDEX or PATINDEX, it returns the position of the string you’re searching for. The last column is where it’s super interesting to me. I love that kind of flexibility. Here are the results:

I want all positions!
I want all positions!

And just like with SQL Server, you can use substrings and character searching to parse out text between two characters. This is made a bit easier in Oracle by being able to specify the nth occurrence of a character, rather than having to a separate call to CHARINDEX.

And you get that there string parsing magic to just the middle three digits.

Middling.
Middling.

That’s a lot harder when the strings aren’t uniform. We could just just as easily specified the constant positions in SUBSTR. But hey, declaring variables in Oracle is hard work. Seriously. There’s a lot of typing involved. And a colon. It’s weird.

Strung Out

This is as far as I’ve gotten with string manipulation in Oracle. I tried to figure out the stuff that I used to have to do in SQL Server a lot first. That usually makes the rest make more sense down the line. This is one area where I think Oracle clearly wins out. Though there’s a lot of conversation, and rightly so, about if the database is the proper place to do string manipulations like this.

Many people don’t have a choice. SQL either acts as the presentation layer, or the DBAs/developers who have to make these changes only have access to the data at the database level. It’s not rare that someone only knows some form of SQL, either.

Whatever your philosophy on the matter, it’s likely going to keep happening. Oracle makes it much easier to remedy some pretty common issues. One item I didn’t cover is using the LISTAGG function, which is another thing that SQL’s workarounds for are quite hacky, error prone, and involve XML. I can see why it would be a pretty big turn off for more people to have to implement it over a simple function call.

Thanks for reading!

Brent says: Oracle’s string functionality is a good example of why it’s so hard to port apps from one platform to another. It’s not just a matter of mapping functionality exactly, but also finding simpler ways to write the same queries.

Previous Post
Getting Started With Oracle: Date Math
Next Post
Getting Started With Oracle: Working With Numbers

9 Comments. Leave new

  • Surprisingly CONCAT in SQL Server is more powerful than in Oracle. Apparently in Oracle CONCAT has two input string arguments, in SQL Server CONCAT takes a variable number of string arguments. Thus CONCAT cannot be used as a substitute to pipes ‘||’.

    Reply
  • Søren Kongstad
    February 17, 2016 12:30 pm

    Also check out the translate function in Oracle. It takes three strings as input. The characters in string 1 present in string 2,are replaced with the characters in string 3.want to replace all square brackets with parentheses, and all slashes to backslashes in one go, you can do that with translate http://www.techonthenet.com/oracle/functions/translate.php

    Reply
  • I also love you can use a negative for your occurrence number to start from the end of the string without resorting to the REVERSE nonsense.

    TRIM ftw

    Reply
  • For me, using CHAR to represent what looks like numbers is best illustrated with social security numbers. They can have leading zeroes, which makes for problems if they’re stored in a numeric data type.

    I’ve always used the rule of ‘If it’s going to be used in a calculation, store it in a numeric.’

    I’m looking forward to seeing how joins are done, I expect that’ll be soon.

    Reply
  • Don’t forget about the regular expression support. Man, I missed that in MSSQL.

    Reply
    • I haven’t forgotten. There’s a post on that coming up. Turns out I’m still horrible at RegEx 🙂

      Reply
      • Keep practicing. They are just _so_ useful. And they’re built-in, you get to use them with set operations in Oracle.

        Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.