Most DBAs Say They’re Encrypting Personally Identifiable Data. Most.

Despite the continuous stream of data breaches, ransomware, and GDPR violations, some companies still aren’t encrypting personally identifiable information (PII) inside our databases or backups.

I see this all the time when working with clients, but I can’t exactly talk about that or give numbers, so I decided to take a highly unscientific poll on Twitter. I figure if DBAs are sitting around on social media, they’re probably done with all the important parts of their job. The 147 votes came in:

Hoping for more upbeat news, I also asked on LinkedIn, a more professional social media network, where maybe people do their jobs better? I’m just guessing, who knows:

The story was better, but it’s still sketchy. Depending on where you ask, one in five to one in three shops isn’t encrypting their personally identifiable information at all.

I actually have a hunch the number is much worse because in a social media poll, we’re just taking folks at their word. In reality, when I’ve lifted the covers to dig a little deeper with the business, we’ve found things like production data being restored to development servers, and stored unencrypted.

When I’ve discussed this with clients, some folks even say, “Yes, we’re encrypting data – we use encryption at rest at the storage level. Everything that hits disk is encrypted automatically.” Okay, great – now what happens when someone takes a backup? They can copy that backup anywhere they please, like writing directly to someone else’s Azure Blob Storage account or to Amazon S3, and you won’t get any errors or warnings. Storage-level encryption doesn’t stop data exfiltration.

It’s okay if you don’t have all of the answers, but as a database administrator, you should start by asking a few questions:

  • Where are we storing personally identifiable information?
  • Is that data encrypted, and how?
  • If it’s not encrypted, what’s the plan to get it encrypted? Is the business waiting on me to do something?
  • Has anyone given my manager and the company’s security & compliance departments those above answers in writing?

Asking those questions is the first step to getting your company to a safer place.

Previous Post
[Video] Office Hours: Ask Me Anything About SQL Server and Azure
Next Post
Office Hours: Gif Reactions Edition

13 Comments. Leave new

  • Good post. It’s a complex topic because there are so many methods of “encrypting the database” that act as controls for different attack surfaces…there is no one single “database is encrypted”. For example, storage SAN encryption protects against, say, somebody pulling a hard disk from the datacenter (basically, this is bare minimum). TDE prevents the sysadmin with OS access from absconding with unencrypted MDF files. Is the data encrypted in memory though, if your system is subject to memory type attacks? Is it encrypted on the wire? Can your netadmins with their sniffer tools see it in plaintext? And then you have the option of data encrypted at the application layer and sent/stored in the db in encrypted form (like hashed passwords stored in the db). And the (last I checked…which was a while ago, so maybe they’ve improved it?) fairly easily cracked AlwaysEncrypted feature.

    None of which is to say, and this isn’t why you shouldn’t consider and implement the appropriate encryption to meet your security requirements, on the contrary…you just don’t want people saying “well, the db is encrypted” (in this one specific way that may be vulnerable to other methods of attack) “so we now don’t have to think any deeper about it”. Backup encryption is particularly important IMO because they are some of the most mobile aspects of your data…going onto various backup tapes, getting transferred across systems for archival, being stored offsite, all that fun.

    • In the organization “my friend” works at, they entirely avoid solutions that intentionally store PII in our managed databases. They just lack the confidence to do it correctly without a CISSP (or anything close to it) on staff. It seems there are just too many “gotchas” (even down to where things like private keys are stored and the integrity of layers protecting them). I have a real strong suspicion that most organizations (especially SMB’s with limited security resources) that do encrypt PII in their databases have gaps that effectively render their security efforts moot. Consider just the PII security breaches that make headlines alone.

  • If you encrypt at rest using TDE then the data encryption is tied to a cert. The restore would fail If the backup was usurped and then attempted to be restored without that cert. This is also a peril if you don’t keep those certs available after a restore due to a hardware failure. Did I miss something?

    • You’re stating a collection of facts related to the post, yes.

      Can you try rephrasing the question?

      • Brent, I think Carl is countering you on this statement in your post, “When I’ve discussed this with clients, some folks even say, “Yes, we’re encrypting data – we use encryption at rest at the storage level. Everything that hits disk is encrypted automatically.” Okay, great – now what happens when someone takes a backup?”. After I re-read your statement, I think this is in reference to encrypting at the storage level, not TDE. Which is true, if not TDE and encryption is at storage level, a backup can be restored anywhere.

        • TDE is not encryption at the storage level. It’s at the database level.

        • Storage encryption is usually done by the controller of your disks / SAN and encrypts everything you write to it (e.g. a simple text file). The primary reason to do this are hardware failures, where you have to replace the disk and have to do whatever with the old one. Theoretical someone could grab one of those chips on the defect disk, solder it to something other similar disks and use some low-level tools to read the data out (this is what some restore specialists do, when the notebook with your dissertation (and no backups) was run over with a truck)

          To be able to use self encrypting disks in another Server / SAN you would have need to taken a backup of the internal encryption keys (hint: do NOT store it on the disks in the same SAN), which can be usual be done via the management web surface (or sometimes some console tools).

          Storage encryption does not protects you in any way from someone taking a backup or copying the MDF/NDF/LDF-Files, since it is 100% transparent to the OS.

          TDE is a SQL Server feature, which encrypts the database files (but no other files) when writing it to disk (exception: data in filestream filegroups) and protects you from stolen backups / copied database files and – as the hardware storage encryption – from data theft on replaced disks. To restore or attach TDE encrypted databases, you need the certificate installed on the target server.

          And of course you always need a copy of your certificates (as for the disk encryption keys for storage encryption) at a VERY safe place (nowhere in / next to your data center, because it could caught fire or be flooded or whatever)

  • Sadly, while not an excuse, encryption caused havoc with our backup/de-dupe software Avamar. The backups are huge due to every block being changed during the encryption process. As well, (I don’t admin Avamar), de-duplication is not possible or difficult from what the admin told me. Cost is another factor. Before SQL 2019, encryption required Enterprise.

    We had to implement encryption over time and not all at once. We have over 10k databases, not all with PII, but it was a huge undertaking and one that is now done. We are in cruise mode with new DBs that get created. However, recently, we’ve started using Azure Key Vault for TDE and are slowly migrating all of our on prem TDE to AKV. Many clients now require us to have a key management solution and AKV has filled that void. Any comments on AKV would be helpful. So far, so good on our end.

    • 40oz Coffee Breakfast
      September 15, 2022 8:52 pm

      I’ve seen similar issues with EMC storage as well from the Virtual Admin side. On some level I think that the entire SAN/NAS industry relies on Dedupe to justify cost, even if it isn’t the best practice they need to squeeze every byte per dollar they can.

  • I’m one of the “Databases no, Backups yes” people. We use a third party application from a vendor that’s notoriously slow to the point of being unresponsive. Our backups use Quest Litespeed for encryption. I’d be curious to know what others in a similar situation are doing.

  • We have a check for unencrypted data. It is not checking for all potential PII data columns but some others like password, credit card, SSN, et cetera. The check is looking for potential column names and using a simple check for lengths not larger or equal to a normal value. Meaning, if a SSN is encrypted, the length will be longer than 11 characters.

    We use it when we are working with a new client so we can report to them and there will be a record of us raising the concern if there is a breach or would like us to help remediate the issue.

    SQL command to run:
    EXECUTE dbo.sp_Develop @RunCheckIds = ’27’;

    This wiki link is a quick start to using the stored procedure.

  • The reality is that the vast majority of customers I work with are running Standard Edition, pre 2019, so TDE is not an option.

    However, even in 2016 standard edition, the ‘encryption’ option is available through the SSMS GUI, and it appears possible create master keys and certificates. So perhaps some version/equivalent of TDE is available, or maybe it just doesn’t do anything!

  • I think it would be fun to see the percentage of people “using PII (ex: SSN) as a primary key”.


Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.