How to Run DBCC CHECKDB for In-Memory OLTP (Hekaton) Tables

Last Updated 5 years ago

tl;dr – run a copy-only full backup of the Hekaton filegroup to nul. If the backup fails, you have corruption, and you need to immediately make plans to either export all your data, or do a restore from your last good full backup, plus all your transaction log backups since.

Yeah, that one’s gonna need a little more explanation, I can tell. Here’s the background.

DBCC CHECKDB skips In-Memory OLTP tables.

Books Online explains that even in SQL Server 2016, DBCC CHECKDB simply skips Hekaton tables outright, and you can’t force it to look at them:

Hey, that’s the same way I handle merge replication – I just look away.

However, that’s not to say these tables can’t get corrupted – they have checksums, and SQL Server checks those whenever the pages are read off disk. That happens in two scenarios:

Scenario 1: when SQL Server is started up, it reads the pages from disk into memory. If it finds corruption at this point, the entire database won’t start up. Even your corruption-free, non-Hekaton tables will just not be available. Your options at this point are to restore the database, or to fail over to a different server, or start altering the database to remove Hekaton. Your application is down.

Scenario 2: when we run a full (not log) backup, SQL Server reads Hekaton’s data from disk and writes to the backup file. If corruption is found, the backup fails. Period. You can still run log backups, but not full backups. When your full backup fails due to corrupt in-memory OLTP pages, that’s your sign to build a Plan B server or database immediately.

Here’s the details from Books Online:

The easy fix: run full native backups every day, and freak out when they fail.

Backup failures aren’t normally a big deal, but if you use in-memory OLTP on a standalone server or a failover clustered instance, backup failures are all-out emergencies. You need to immediately find out if the backup just ran out of drive space or lost its network connection, or if you have game-over Hekaton corruption.
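One way to notice a silently failing backup job is to ask msdb when each database last finished a full backup. This is only a rough sketch: it assumes your backup history lives in the default msdb tables and hasn't been purged. A failed backup never gets a row in msdb.dbo.backupset, so the symptom is a date that stops moving. The actual failure reason (drive space, network, or corruption) will be in the SQL Server error log and your job history, not here.

```sql
-- Sketch: last completed full backup per database.
-- Failed backups leave no row in msdb.dbo.backupset, so look for stale or NULL dates.
SELECT  d.name                    AS database_name,
        MAX(b.backup_finish_date) AS last_full_backup_finish
FROM    sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
        ON  b.database_name = d.name
        AND b.type = 'D'          -- 'D' = full database backup
WHERE   d.name <> N'tempdb'       -- tempdb can't be backed up
GROUP BY d.name
ORDER BY last_full_backup_finish; -- never-backed-up databases (NULL) sort first
```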

Note that you can’t use SAN snapshot backups here. SQL Server won’t read the In-Memory OLTP pages during a snapshot backup, which means they can still be totally corrupt.

This works fine for shops with relatively small databases, say under 500GB.

The harder fix: back up just the In-Memory OLTP data daily.

With SQL Server 2016, the supported Hekaton data size has been raised to 2TB – and you don’t really want to be backing up a 2TB database the old-school way, every day. You could also have a >1TB database with a relatively small amount of Hekaton data: you want to use SAN snapshot backups, but you still have to do conventional backups of the Hekaton data in order to get corruption checks.

Thankfully, Hekaton objects are confined to their own filegroup. Microsoft PM Jos de Bruijn pointed out to me that we can back up just that one filegroup, and send the backup to NUL to avoid writing any data to disk.

Oops, did I say we could just back up that filegroup? Not exactly – you also have to back up the primary filegroup at the same time.
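If you don't remember what the memory-optimized filegroup is called, something like this sketch will list the filegroups in the current database. The memory-optimized one shows up with type 'FX' in sys.filegroups. Run it in the database you want to check.

```sql
-- Sketch: list this database's filegroups and flag the memory-optimized one.
-- Type 'FX' = MEMORY_OPTIMIZED_DATA_FILEGROUP, 'FG' = ordinary rows filegroup.
SELECT  name,
        type,
        type_desc,
        is_default
FROM    sys.filegroups
ORDER BY data_space_id;
```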

If you’re doing great (not just good) database design for very large databases, you’ve:

Created a separate filegroup for your tables
Set it as the default
Moved all the clustered & nonclustered indexes over to it
Kept the primary filegroup empty so you can do piecemeal restores
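Those steps might look roughly like the sketch below. Every name here (MyDatabase, UserData, the file path, dbo.Orders, CX_Orders, OrderId) is made up for illustration, so swap in your own. It also assumes the table already has a clustered index with that name and key. Rebuilding an index onto a new filegroup moves the data, so test it somewhere quiet first.

```sql
-- Hypothetical names and path throughout; adjust for your environment.

-- 1. Create a separate filegroup for user tables and give it a data file.
ALTER DATABASE [MyDatabase] ADD FILEGROUP [UserData];

ALTER DATABASE [MyDatabase]
ADD FILE (
    NAME = N'MyDatabase_UserData1',
    FILENAME = N'D:\Data\MyDatabase_UserData1.ndf',
    SIZE = 1GB
) TO FILEGROUP [UserData];

-- 2. Make it the default so new tables and indexes stop landing in PRIMARY.
ALTER DATABASE [MyDatabase] MODIFY FILEGROUP [UserData] DEFAULT;

-- 3. Move an existing table by rebuilding its clustered index on the new filegroup.
--    DROP_EXISTING requires the same index name as the existing clustered index.
CREATE CLUSTERED INDEX [CX_Orders]
    ON dbo.Orders (OrderId)
    WITH (DROP_EXISTING = ON)
    ON [UserData];
```

Repeat step 3 for each clustered and nonclustered index until nothing of yours is left in the primary filegroup.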

If not, hey, you’re about to. An empty primary filegroup will then let you do this faster:

Checking for corruption by backing up to NUL:
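The original screenshot isn't reproduced here, so here's a minimal sketch of what that backup could look like. MyDatabase and MyDatabase_InMemory are hypothetical names; use your own database and memory-optimized filegroup names. Nothing gets written to disk, but SQL Server still has to read every page it backs up.

```sql
-- Sketch: back up PRIMARY plus the memory-optimized filegroup to the NUL device.
-- Hypothetical names: MyDatabase, MyDatabase_InMemory.
BACKUP DATABASE [MyDatabase]
    FILEGROUP = N'PRIMARY',
    FILEGROUP = N'MyDatabase_InMemory'
TO DISK = N'NUL'      -- discard the output; we only care whether the read succeeds
WITH COPY_ONLY;       -- leave the differential base alone
```

If the command completes, the checksums on the data it read were good. If it fails, read the error message before panicking, because not every failed backup means corruption.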

Tah-dah! Now we know we don’t have corruption.

This comes in handy if you’ve got a large database and you’re only doing weekly (or heaven forbid, monthly) full backups, and doing differential and log backups the rest of the time. Now you can back up just your in-memory OLTP objects to check them for corruption.

Note that in these examples, I’m doing a copy_only backup – this lets me continue to do differential backups if that sort of thing is your bag.

For bonus points, if your Hekaton data is copied to other servers using Always On Availability Groups, you’ll want to do this trick on every replica you might fail over to or run full backups on. (Automatic page repair doesn’t appear to be available for In-Memory OLTP objects.)

If you’d like CHECKDB to actually, uh, CHECK the DB, give the request an upvote here.

5 Comments

Jon Morisi
April 12, 2016 12:11 pm

Very clever, nice workaround!

Linda Wenglikowski
April 12, 2016 3:10 pm

OLTP Tables are just not ready for prime time serious production applications. Corrupted OLTP Tables would surely make a mess of any critical database and make it impossible to meet any aggressive RPO/RTO objectives.

Tom Pullen
April 13, 2016 5:41 am

Never touching Hekaton with your own (or anyone else’s) bargepole is also a robust way to avoid this issue.

SQL Server In-Memory OLTP data/delta file corruption | Ned Otter Blog
April 4, 2017 12:35 am

[…] restore, backup, and any other operation that reads data/delta files. As Brent Ozar blogged in this post, you can backup the memory-optimized filegroup to DISK = ‘nul’ to force recalculation of all […]

jadarnel27
April 6, 2018 9:14 am

I came across this today while investigating some oddities around CHECKDB and a database with a memory optimized file group – thanks for the info and workaround! I realize this post is two years old, but thought I’d pass along an updated link for what I think is the connect item (since Microsoft broke all of these links recently):

https://feedback.azure.com/forums/908035-sql-server/suggestions/32902300-dbcc-checkdb-needs-to-check-the-checksums-on-in-me
