In this multi-part series, we’ll start with easy corruption, work our way up to nonclustered indexes, then show how checksums and backups aren’t enough for protection, either. To get all the parts as they’re published, subscribe.
Hold my beer.
CREATE DATABASE [50Ways]; GO ALTER DATABASE [50Ways] SET PAGE_VERIFY NONE; /* Normally a bad idea */ GO USE [50Ways]; GO CREATE TABLE [dbo].[ToLeaveYourLover]([Way] VARCHAR(50)); GO INSERT INTO [dbo].[ToLeaveYourLover]([Way]) VALUES ('Slip out the back, Jack'), ('Make a new plan, Stan'), ('Hop on the bus, Gus'), ('Drop off the key, Lee') GO SELECT * FROM [50Ways]..[ToLeaveYourLover]; /* Yep, we have data */ GO ALTER DATABASE [50Ways] SET OFFLINE WITH ROLLBACK IMMEDIATE; GO
Now, let’s corrupt it. Open it with a hex editor – my personal favorite is the free xvi32 because it doesn’t require installation. Just download it, fire it up (you’ll want to run it as administrator), and open the database’s MDF file:
Next thing you know, you’re looking at the contents of the MDF file. The scientific way to approach this would be to identify the exact 8K page you’re looking for, and jump to that point of the file. However, you have my beer, so I’m trying to finish this quickly so I can get back to that. We’ll just click Search, Find, and look for text:
Presto – now you can see the raw contents of your database. What, you thought SQL Server encrypted it or something? Heck no, Stan’s social security number, credit card number, and disease history are all in there, free for the reading.
And for the writing, it turns out. Close the Find box, click on the S in Stan your right hand window, and start typing. I’ll change his name to Flan, and click Save. Close XVI32, and bring the database back online:
ALTER DATABASE [50Ways] SET ONLINE; GO SELECT * FROM [50Ways]..[ToLeaveYourLover];
And see what you get:
Or rather, see what you don’t get: no corruption warnings, no dropped connection, no errors. As far as SQL Server is concerned, this is just nice good data. You can even run DBCC CHECKDB, and no errors will be reported.
What about clustered columnstore indexes?
Look, the only reason I’m doing this is because you’re still holding my beer hostage. Same script, but now with a clustered columnstore index:
CREATE DATABASE [50Ways]; GO ALTER DATABASE [50Ways] SET PAGE_VERIFY NONE; /* Normally a bad idea */ GO USE [50Ways]; GO CREATE TABLE [dbo].[ToLeaveYourLover] ([ID] INT IDENTITY(1,1), [Way] VARCHAR(50), [Guy] VARCHAR(50)); GO CREATE CLUSTERED COLUMNSTORE INDEX cci ON [dbo].[ToLeaveYourLover] GO INSERT INTO [dbo].[ToLeaveYourLover]([Way],[Guy]) VALUES ('Slip out the back','Jack'), ('Make a new plan','Stan'), ('Hop on the bus','Gus'), ('Drop off the key','Lee') GO SELECT * FROM [50Ways]..[ToLeaveYourLover]; /* Yep, we have data */ GO ALTER DATABASE [50Ways] SET OFFLINE WITH ROLLBACK IMMEDIATE; GO
After the database is offline, fire up your trusty XVI32, search for Stan, and he’s still visible in clear text:
Edit him, turn him into Flan, save the file, bring it back online, and run the SELECT query again:
A clustered columnstore index works just like a regular clustered rowstore index: if you don’t have page verification turned on, no corruption is detected, even when you run CHECKDB.
PAGE_VERIFY is really important.
When I created the database, I threw in this line:
ALTER DATABASE [50Ways] SET PAGE_VERIFY NONE; /* Normally a bad idea */
That tells SQL Server not to do any page verification when pages are read or written from disk. You have 3 options:
- NONE – just a dumb, suicidal idea.
- TORN_PAGE_DETECTION – also a dumb, suicidal idea. You can retry this same demo with TORN_PAGE_DETECTION instead of NONE, and you’ll get the exact same results.
- CHECKSUM – SQL Server includes a checksum as it writes each page from here on out, and then checks the page when it’s read.
If I repeat either the rowstore or columnstore demo, but this time change the PAGE_VERIFY NONE to PAGE_VERIFY CHECKSUM, I get a totally different result:
SQL Server didn’t detect the corruption simply by setting the database online – bringing it online didn’t read this particular data page. However, when I read the page, whammo, SQL Server detected that the data on the page didn’t match the checksum.
The checksum isn’t used for data recovery, mind you – only for alerting us that the data is wrong. This isn’t like parity in a RAID array where we can rebuild the data from scratch.
What you need to do next
- Run sp_Blitz, which tells you if any databases don’t have checksum enabled
- Set checksum on those databases
- Do something that causes pages to be written (like rebuilding indexes, but keep in mind that it’s going to generate a ton of transaction log traffic)
- Set up alerts to notify you when a corrupt page is read
- Attach a corrupt database (like the one you just created) to your production SQL Server, and run the SELECT statement, which should trigger the alerts you just created
- Make sure you’re doing CHECKDB regularly from here on out
- Talk to stakeholders about what’s being stored unencrypted in the database
- Give me back my beer