How to Run DBCC CHECKDB for In-Memory OLTP (Hekaton) Tables

tl;dr – run a copy-only full backup of the Hekaton filegroup to nul. If the backup fails, you have corruption, and you need to immediately make plans to either export all your data, or do a restore from your last good full backup, plus all your transaction log backups since.

Yeah, that one’s gonna need a little more explanation, I can tell. Here’s the background.

DBCC CHECKDB skips In-Memory OLTP tables.

Books Online explains that even in SQL Server 2016, DBCC CHECKDB simply skips Hekaton tables outright, and you can’t force it to look at them:

Hey, that's the same way I handle merge replication - I just look away
Hey, that’s the same way I handle merge replication – I just look away.

However, that’s not to say these tables can’t get corrupted – they have checksums, and SQL Server checks those whenever the pages are read off disk. That happens in two scenarios:

Scenario 1: when SQL Server is started up, it reads the pages from disk into memory. If it finds corruption at this point, the entire database won’t start up. Even your corruption-free, non-Hekaton tables will just not be available. Your options at this point are to restore the database, or to fail over to a different server, or start altering the database to remove Hekaton. Your application is down.

Scenario 2: when we run a full (not log) backup, SQL Server reads Hekaton’s data from disk and writes to the backup file. If corruption is found, the backup fails. Period. You can still run log backups, but not full backups. When your full backup fails due to corrupt in-memory OLTP pages, that’s your sign to build a Plan B server or database immediately.

Here’s the details from Books Online:

Beep beep beep
Beep beep beep

The easy fix: run full native backups every day, and freak out when they fail.

Backup failures aren’t normally a big deal, but if you use in-memory OLTP on a standalone server or a failover clustered instance, backup failures are all-out emergencies. You need to immediately find out if the backup just ran out of drive space or lost its network connection, or if you have game-over Hekaton corruption.

Note that you can’t use SAN snapshot backups here. SQL Server won’t read the In-Memory OLTP pages during a snapshot backup, which means they can still be totally corrupt.

This works fine for shops with relatively small databases, say under 500GB.

The harder fix: back up just the In-Memory OLTP data daily.

With SQL Server 2016, the Hekaton limits have been raised to 2TB – and you don’t really want to be backing up a 2TB database the old-school way, every day. You could also have a scenario where a >1TB database has a relatively small amount of Hekaton data – you want to use SAN snapshot backups, but you still have to do conventional backups for the Hekaton data in order to get corruption checks.

Thankfully, Hekaton objects are confined to their own filegroup, so Microsoft PM Jos de Bruijn pointed out to me that we can just run a backup of just that one filegroup, and we can run it to NUL: to avoid writing any data to disk:

Miami Vice Versa
Miami Vice Versa

Oops, did I say we could just back up that filegroup? Not exactly – you also have to back up the primary filegroup at the same time.

If you’re doing great (not just good) database design for very large databases, you’ve:

  1. Created a separate filegroup for your tables
  2. Set it as the default
  3. Moved all the clustered & nonclustered indexes over to it
  4. Kept the primary filegroup empty so you can do piecemeal restores

If not, hey, you’re about to. An empty primary filegroup will then let you do this faster:

Checking for corruption by backing up to NUL:
Checking for corruption by backing up to NUL:

Tah-dah! Now we know we don’t have corruption.

This comes in handy if you’ve got a large database and you’re only doing weekly (or heaven forbid, monthly) full backups, and doing differential and log backups the rest of the time. Now you can back up just your in-memory OLTP objects for corruption.

Note that in these examples, I’m doing a copy_only backup – this lets me continue to do differential backups if that sort of thing is your bag.

For bonus points, if your Hekaton data is copied to other servers using Always On Availability Groups, you’ll want to do this trick on every replica where you might fail over to or run full backups on. (Automatic page repair doesn’t appear to be available for In-Memory OLTP objects.)

If you’d like CHECKDB to actually, uh, CHECK the DB, give the request an upvote here.

Previous Post
Breaking News, Literally: SQL CLR Support Removed from Azure SQL DB
Next Post
Old and Busted: DBCC Commands in 2016

5 Comments. Leave new

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.