At first glance, if you set up log shipping the way most folks do, it looks like you can’t run CHECKDB on a subscriber:

On SQL Server Enterprise Edition, you can try working around it by creating a database snapshot – but that won’t work either:

Here’s the trick: your database needs to be in standby mode instead of norecovery. When you run restores with the standby option, you’re able to query the database in between restores.
You don’t even have to restore a transaction log in order to switch the database over to standby mode – you just have to run the restore with standby command, like this:
1 |
RESTORE DATABASE DBAmetrics WITH STANDBY = 'E:\MSSQL\DATA\Standby.ldf'; |
Now, you can run CHECKDB against that database – or run any ol’ select queries that you want:

Now you can feel confident that your log shipped secondary is ready to go. Remember, it’s only applying log file changes from the primary – just like Always On Availability Groups, just because CHECKDB succeeds on the primary doesn’t mean the secondary is free from corruption.
19 Comments. Leave new
Inverse also true…checkdb on secondary won’t ensure corruption free primary, for the same reason. Figured I’d reemphasize that because lots of folks would love to get out of eating the spinach of checkdb on a primary. Only full restores will get that done, to my knowledge.
Hi Brent. Great advice. For standby files I use *.tuf extension (transaction undo file). To change “status” of secondary database, I use sp_change_log_shipping_secondary_database with @restore_mode = 1.
nice
And how it works with subscriber being on SA licence? As far as I’m concerned SA allows only standby?
Guessing it’s similar with maintenance database (for storing FirstResponder/Ola’s/WhoIsActive results), nada for SA licenced server?
Marcin – if you’re offloading your production checkdb onto another server, then you’re doing production work there.
If, on the other hand, you’re running checkdb on both the primary and the subscriber – in order to insure that the subscriber will be good to go for failover – then you might have an easier time arguing that it’s not really for production use and doesn’t need to be licensed for production, only under software assurance as the failover partner for the primary.
Thanks for response!
Luckily I’m in second situation – primary, while important, isn’t big (yet I guess) and allows for maintenance windows, so checkdb is on schedule on that one.
And also, as mentioned…if you’re doing that first scenario (checkdb only on logshipped secondary) you aren’t really ensuring a corruption free primary, unless you reinitialize it every time with a full backup. If a page gets scrambled by the SAN on the primary after log shipping is set up, unless it gets rebuilt or written over, it will sit happily corrupted there while the secondary checkdb runs without error.
Yeah, the only thing you’re buying is the ability to fail over to the log shipped secondary when you encounter corruption on the primary. (Which is a nice thing to buy, don’t get me wrong.)
This seems to assume that the CHECKDB on the secondary can complete in between executions of the LSRESTORE job. Otherwise you need to temporarily disable the job, which can jeopardize your SLA if the you need to fail over to the secondary during the CHECKDB. You’d have to run the LSRESTORE job before failing over. Am I correct?
Mark – close, but not quite. If you set up your restore jobs so that they don’t kick out users when running restores, then how will things react instead?
In that case, I would think that the LSRESTORE job would wait on the CHECKDB to finish, or it would fail and retry at the next scheduled execution time. Regardless, the logs can’t be restored while the CHECKDB runs. The only difference is whether the restore process is manually disabled and re-enabled or if it is handled automatically, but I would think that the issue of the secondary getting behind on log restores while the CHECKDB runs would still exist, which could impact SLA in the unlikely event of a water landing or needing to fail over during the CHECKDB on the secondary.
Mark – bingo, the LSRESTORE job waits for CHECKDB to finish.
The question about missing your SLA still happens with corruption, too, right? If you fail over to a database that hasn’t had a CHECKDB in months, you’re just rolling a different set of dice.
But wont the restore log fail if there is corruption? It might succeed if the corrupted page is not accessed while restore but when it is, restore should fail.
Quite so!
I was just pointing out that there would be a delay in the restores, and that needs to be accounted for in the fail over process. I didn’t mean to imply that this was a reason not to do the CHECKDB.
But wont the restore log fail if there is corruption? It might succeed if the corrupted page is not accessed while restore but when it is, restore should fail.
Sonali – only if the log restore touches the corrupt page. Archive tables, lookup tables, logging tables, etc can go untouched for long periods of time.
Event with page_verify_option_desc = CHECKSUM?
That only checks for physical corruption, not logical (like the many recent SQL Server bugs.)
Great Thank You, Brent!