Stupid Things I’ve Done With SQL

I Always Enjoy Reminiscing

And that header is a lie. I do enjoy thinking about stuff I used to have to do at work, because I don’t have to do it anymore. While we mostly gripe about the on-call stuff, the late-night maintenance windows, etc., we don’t usually talk about the goofy stuff we’ve done. What follows is a non-comprehensive (and embarrassing-code-free) list of things I’ve done with SQL that I really should have used something else to do.

The problem is usually this: when you have to do something on the fly (and it’s expected sooner rather than later), you don’t have time to learn a whole new method. There were plenty of times when I looked at something and said “this would be perfect for PowerShell!” but then, three hours later, while standing atop a pile of smashed keyboards and monitors and disemboweled mice, I usually just gave up and renamed the damn files myself.

Without further ado, here’s some dumb stuff I’ve done!

Validate Emails

I know, I know. The answer is to just send the email. But, project managers. That’s all. So there I went, writing a smorgasbord of ridiculous checks to be reasonably sure that every email we sent out was going to a valid address. The real fun here was trying to screen out people who put in obviously fake email addresses, like no@email.com, or 12345@12345.com. This was one of the first times I got angry at the SQL Server development team for not having anything close to a decent RegEx implementation, but far from the last.
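
To give a flavor of it, here’s a sketch of the usual approach: a pile of LIKE predicates plus a blocklist. The table and column names here are invented.

    -- A sketch only: pattern sanity checks plus a blocklist of obviously
    -- fake addresses. dbo.Contacts and Email are hypothetical names.
    SELECT  c.Email
    FROM    dbo.Contacts AS c
    WHERE   c.Email LIKE '%_@_%._%'            -- something@something.something
        AND c.Email NOT LIKE '% %'             -- no spaces
        AND c.Email NOT LIKE '%..%'            -- no doubled dots
        AND c.Email NOT IN ('no@email.com', '12345@12345.com');  -- the obvious fakes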

Format Names

I don’t just mean put them in proper case. No, that would be fairly easy. This also involved guessing at first, middle, and last names, titles, and everything else, and putting them into the correct column. When you have no control over the source data, this is a nightmare. Oh, and some of them might be business names, so can you check that, too?
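
For a taste of why this goes sideways, here’s a sketch of just the simplest piece, splitting on the first space and casing the first name, with invented names and none of the title, middle-name, or business-name guessing that made the real thing miserable:

    -- Sketch: proper-case the first name and split on the first space.
    -- Everything else (titles, middle names, business names) is where
    -- it falls apart. dbo.RawPeople and FullName are invented.
    SELECT  x.CleanName,
            UPPER(LEFT(x.CleanName, 1))
              + LOWER(SUBSTRING(x.CleanName, 2, CHARINDEX(' ', x.CleanName + ' ') - 2)) AS FirstName,
            LTRIM(SUBSTRING(x.CleanName, CHARINDEX(' ', x.CleanName + ' ') + 1, 4000))  AS LastName
    FROM    dbo.RawPeople AS rp
    CROSS APPLY (SELECT LTRIM(RTRIM(rp.FullName)) AS CleanName) AS x
    WHERE   x.CleanName LIKE '_% %';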

Get Text Between Dynamic Delimiters

Imagine you’ve got a book, and someone wanted to get all the text that occurred between N patterns. Yeah. Keep thinking about that while you head to the liquor cabinet. Of course the field to search was NVARCHAR(MAX). Of course. Thanks.
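
The core move for a single pair of patterns is just CHARINDEX and SUBSTRING; the misery comes from repeating it over N patterns in NVARCHAR(MAX). A sketch with invented values:

    -- Sketch: pull the text between a single pair of delimiters.
    -- The real thing had to do this over N patterns in an NVARCHAR(MAX) column.
    DECLARE @doc       NVARCHAR(MAX) = N'...BEGIN the good part END...',
            @startMark NVARCHAR(100) = N'BEGIN',
            @endMark   NVARCHAR(100) = N'END';

    DECLARE @startPos BIGINT = CHARINDEX(@startMark, @doc) + LEN(@startMark);
    DECLARE @endPos   BIGINT = CHARINDEX(@endMark, @doc, @startPos);

    SELECT CASE WHEN @endPos > @startPos
                THEN SUBSTRING(@doc, @startPos, @endPos - @startPos)
                ELSE N''
           END AS ExtractedText;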

Check An FTP Site For A File

“SSIS”, you scream. And rightly so. That would have been smart, if I had time to figure out how to open SSIS. Not only did I have to check for the file, but I had to move it, and then load and process it. It was simple in T-SQL, but it was also very stupid. I believe I used the FileZilla command line, but can’t recall exactly.
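
Roughly the shape of it. The Windows built-in ftp.exe stands in here for whatever client actually got used (the post says FileZilla, but this is just an illustration); every server, path, and file name is invented, and xp_cmdshell has to be enabled:

    -- Sketch: list the remote directory, look for the file, then fetch
    -- and move it. ftp.exe is a stand-in for the actual client; the
    -- script files hold the login plus dir/get/bye commands.
    CREATE TABLE #ftp_output (line NVARCHAR(4000));

    INSERT #ftp_output (line)
    EXEC master..xp_cmdshell 'ftp -i -s:C:\jobs\list_remote.txt ftp.example.com';

    IF EXISTS (SELECT 1 FROM #ftp_output WHERE line LIKE '%weekly_extract.csv%')
    BEGIN
        EXEC master..xp_cmdshell 'ftp -i -s:C:\jobs\get_file.txt ftp.example.com';
        EXEC master..xp_cmdshell 'move C:\jobs\inbound\weekly_extract.csv C:\jobs\staging\';
        -- ...then load and process the file from C:\jobs\staging\
    END;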

Unzip Files

I was moderately thankful for 7-Zip’s CLI here. I still can’t talk about this without getting a twitch. We used to get these hundreds of zipped up text files, and I’d unzip them, copy them all to one big file, and then do some stuff with it. Since it was a weekly process, it got automated.
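
The happy path looked more or less like this sketch, where the 7-Zip path, the folders, and the staging table are all assumptions:

    -- Sketch: unzip everything, glue the text files together, load the result.
    -- 7-Zip path, folders, and dbo.WeeklyStaging are invented.
    EXEC master..xp_cmdshell '"C:\Program Files\7-Zip\7z.exe" e C:\inbound\*.zip -oC:\inbound\unzipped -y';
    EXEC master..xp_cmdshell 'copy /b C:\inbound\unzipped\*.txt C:\inbound\combined.txt';

    BULK INSERT dbo.WeeklyStaging
    FROM 'C:\inbound\combined.txt'
    WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n');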

Download A File

This was a two-step process: first, I’d generate a cookie based on user information, and then I’d call cURL with that cookie to download the file. At least it wasn’t a zip file, but it still had to get moved and loaded into SQL.
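
Something in this spirit, as a sketch: a hypothetical login endpoint stands in for the cookie-generation step, and the URLs, paths, and staging table are all invented.

    -- Sketch: step one writes a session cookie, step two reuses it to
    -- download the file, then SQL loads it. Everything named is made up,
    -- and curl has to be on the box.
    EXEC master..xp_cmdshell 'curl -s -c C:\jobs\session.cookie -d "user=svc_report&pass=********" https://example.com/login';
    EXEC master..xp_cmdshell 'curl -s -b C:\jobs\session.cookie -o C:\jobs\extract.csv https://example.com/export';

    BULK INSERT dbo.ExtractStaging
    FROM 'C:\jobs\extract.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');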

Once Upon A Dream

There’s a reason I didn’t supply code examples here. I mean, aside from the fact that it’s owned by my past employers, and that would be theft or something. You should know it’s possible, but that it’s a bad idea. Learn from my mistakes. Many of these used xp_cmdshell to interact with the file system. A lot of people will scream about using it, but I’m not one of them.

Thanks for reading!

Brent says: <cough>I generated tables in HTML with T-SQL</cough>
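
For anyone who hasn’t seen that particular trick, it usually means FOR XML PATH, sketched here with invented names:

    -- Sketch of building an HTML table in T-SQL, typically for email alerts.
    -- dbo.ServerStatus and its columns are made up.
    DECLARE @html NVARCHAR(MAX);

    SELECT @html =
          N'<table><tr><th>Server</th><th>Status</th></tr>'
        + CAST((SELECT td = s.ServerName, N'',
                       td = s.[Status],   N''
                FROM   dbo.ServerStatus AS s
                FOR XML PATH('tr'), TYPE) AS NVARCHAR(MAX))
        + N'</table>';

    SELECT @html;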


20 Comments

  • In the primordial soup of my career, before I knew my SQL, I built the most absurd SSIS data flow of conditional splits and derived columns to do what a couple of SQL CASE statements could have done in a tiny fraction of the time.

    • That sounds awesome! Did you blog about it?

      • ROFL! No, though it was awesome! It was a data-cleaning operation — the source had fact information at multiple grains, and I needed to do some incidental cleaning and generate a new set of rows to represent the appropriate grain of data. I got to watch my proto-query data flow down the branches of this gnarly tree, turning from yellow to green as things completed. Each branching (conditional split) would probably have been a WHEN and each derived column a THEN in this imaginary CASE expression. Since I also had to separate some values into multiple rows, I probably had multiple data sources that fed into these branches (so the FROM clause would likely UNION a couple of queries on the same table before our CASE expression). It was a head trip for someone just starting a career working with data – before, there was chaos, then there was (frankly more chaos), and finally, a data source I could use to appease end users.
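
        Roughly the shape of that rewrite, as a sketch with invented names:

          -- Sketch: UNION the sources, then let one CASE expression do the
          -- conditional-split / derived-column work. All names are invented.
          SELECT  src.RecordID,
                  CleanValue = CASE
                                   WHEN src.Grain = 'MONTHLY' THEN src.Amount / 30.0
                                   WHEN src.Amount < 0        THEN 0
                                   ELSE src.Amount
                               END
          FROM   (SELECT RecordID, Grain, Amount FROM dbo.RawFacts
                  UNION ALL
                  SELECT RecordID, Grain, Amount FROM dbo.RawFactsOverflow) AS src;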

        • Yeah. I gotta say, the toughest part about starting with a product like SQL Server (with no training, and no senior staff to guide you) is knowing if you’re using the right tool from the outset, or knowing what your other options are.

          It’s always months or years later that you find something out and scramble to look at all your old code.

  • I once wrote a password cracker in (mostly) set-based T-SQL, to put SQL Server users off using bad passwords by showing them how quickly they could be cracked…

    • I’m No Security Expert® but that sounds really cool. What kind of stuff did you look for?

      • It would compile a tiny custom dictionary with horrible passwords (password, password1, 123456, etc.), UNION that to the list of user names (you’d be surprised how many people use their username as their password), and it would then do a:

        WHERE pwdcompare(#customPwList.Password, l.password_hash) = 1

        If it didn’t find anything, you could then optionally start a brute-force attack by just compiling another list with password lengths of 1 up to 8, using numbers/lowers/capitals/symbols (all combos configurable).

        Wasn’t as fast as hashcat, but it was a fun T-SQL writing afternoon 🙂
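
        A sketch of that dictionary pass, with everything beyond the #customPwList name filled in from the description above:

          -- Build the candidate list (dictionary plus login names), then
          -- let PWDCOMPARE test each candidate against each stored hash.
          SELECT Password
          INTO   #customPwList
          FROM  (VALUES (N'password'), (N'password1'), (N'123456')) AS v(Password)
          UNION
          SELECT name FROM sys.sql_logins;      -- the username-as-password check

          SELECT  l.name, p.Password AS CrackedPassword
          FROM    sys.sql_logins  AS l
          JOIN    #customPwList   AS p
            ON    PWDCOMPARE(p.Password, l.password_hash) = 1;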

  • I raise Brent’s T-SQL HTML tables to, yes… HTML pages generated by PL/SQL, and still in production… bwahahaha…

    • I generated a fair amount of HTML for Oracle Applications customizations w/ PLSQL back in the day.
      I also built a nifty little barcode generator!

  • Apart from the fact that you are way smarter, we sound very much alike:
    xp_cmdshell automation and way too many HTML tables.

  • “But, project managers.” Bahahaha!

  • Yeah, Brent. That tables-in-HTML-using-T-SQL is what we used to call Stupid ASP Tricks. Talk about Separation of Concerns.

  • I’ve also created a CLR assembly to call Google Maps so I can cleanse my addresses. I’ve even recently used and abused SSRS so it’s not just a reporting tool but a way to create/update records and an interface for uploading and downloading files. What’s worse, I get a perverse sense of pleasure when I do it.

  • I created a process that used PowerShell, stored procedures, and SQL Agent to call a third-party API: get the login token, submit the records that needed to be processed, grab the results, and load them back into SQL. All through API calls. I was tired of waiting for the .NET developer to finish the service she was supposed to create to do this.

    **note: I am truly sorry to the poor soul that has to support this mess.

  • Stupid things to do? Trying to parse stuff like addresses. Now I’m all like “shit in, shit out bro”.

  • I like this blog post. It reminds me of pushing SQL to the edge of its usability trying to conform addresses parsed from XML. I did a lot of stuff using window functions and string aggregation (argh… not simple anyway… either use some crappy CLR or have fun with lots of badly performing XML subqueries), trying to match up different ways of writing the same address, like “Main Street” and “Main St”, and matching the same people who in one case had an address containing the house number and in another case not. In the end this got pretty complex and I was still far from a perfect solution. Something like deduping 250,000 rows into 90,000, but there would still be duplicates left (to be spotted by the human eye). Maybe I should have been looking into Data Quality Services back then. Anyway, this never went into production, as it turned out that this data was not needed at all (shame on me for not being able to clarify that earlier :-0).
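
    For anyone who hasn’t hit it, the XML-subquery workaround being groaned about is the old STUFF + FOR XML PATH pattern, roughly (invented names):

      -- Pre-STRING_AGG string aggregation: one correlated subquery per row,
      -- which is exactly why it performs badly at scale. Names are invented.
      SELECT  p.PersonID,
              STUFF((SELECT N'; ' + a.AddressLine
                     FROM   dbo.Addresses AS a
                     WHERE  a.PersonID = p.PersonID
                     ORDER BY a.AddressLine
                     FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),
                    1, 2, N'') AS AllAddressLines
      FROM    dbo.People AS p;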

  • We used physical tables to store temporary data from many, many processes. We thought we were very smart, using a “login” column to identify each process. Of course the table had no indexes 🙂

  • Before the days of PowerShell, I wrote a T-SQL script to move a set of databases (matching a particular naming convention) from one server to another: ALTER DATABASE commands to set the databases offline, calls to xp_cmdshell to copy the data and log files, more SQL to attach the databases.
    Can’t remember anything in the way of error handling, no checksums on the copies, no logging or alerting.
    Used in multiple environments, including remotely-hosted production.
    *shudder*
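
    For one database, the shape of it is roughly this sketch, with invented names and paths, and still none of the missing error handling:

      -- Sketch: offline the database, copy the files to the new box via
      -- xp_cmdshell, then attach on the other side. No error handling,
      -- checksums, or logging, exactly as confessed above.
      ALTER DATABASE SalesArchive SET OFFLINE WITH ROLLBACK IMMEDIATE;

      EXEC master..xp_cmdshell 'copy D:\Data\SalesArchive.mdf \\NewServer\D$\Data\';
      EXEC master..xp_cmdshell 'copy E:\Logs\SalesArchive_log.ldf \\NewServer\E$\Logs\';

      -- run on the new server:
      CREATE DATABASE SalesArchive
      ON (FILENAME = 'D:\Data\SalesArchive.mdf'),
         (FILENAME = 'E:\Logs\SalesArchive_log.ldf')
      FOR ATTACH;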

  • Right, validating. Luckily I was quick enough to think about a few things like that when working with names:
    https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

    But I was never smart enough to stop thinking about validating e-mails, even while having one in a domain like this one:
    http://abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijk.com/
