DBCC CheckDB FAQ: Check for Corruption in SQL Server

DBCC CheckDB: Frequently Asked Questions

It's eating our logo. Will it stop at nothing???
It’s eating our logo. Will it stop at nothing???

Q: Do I really need to run DBCC CheckDB? Yes. If you’re even partially in charge of a SQL Server and you’d like to keep your job, you should make sure CheckDB is being run regularly. (Seriously!)

Q: But WHY do I need to run CheckDB? CheckDB helps alert you when data becomes corrupt. Data corruption can cause your users to get incorrect data, cause queries to fail, or can even take your entire SQL Server instance offline. Corruption can happen anytime. It could be caused by a problem in SQL Server, an issue with Windows, or an issue with your storage. It could also happen due to other types of software– something like a filter driver (replicating data, defragmenting drives, etc) or a virus scanner.

Q: How often do I need to run CheckDB? Every day you’d like to keep your job, you should run CheckDB. OK, maybe that’s a bit extreme.

Most people run CheckDB against user databases every weekend. That’s “normal”. Is normal good enough for you? Well, maybe not! When availability is important, people choose to run CheckDB more frequently: often once per day.

For system databases, why not check them every night?

Q: How can I make CheckDB faster? First, do you really need to make it faster? If you have a regular maintenance window where decreased performance is OK, that’s often good enough.

The best way to speed up CheckDB is to offload the work to a totally different server. Do this by automating a process where you regularly restore a full backup, then run CheckDB. This approach has multiple benefits:

  • Tests your restores
  • Doesn’t impact the production database (especially if they’re using independent storage)
  • Secondary instance can be configured to use as a “warm standby” in case the primary server fails

And sure, you can potentially offload CheckDB work to a virtualized SQL Server instance. Just make sure that it has enough resources to run CheckDB fast enough.

Q: Are there any shortcuts with LogShipping, Mirroring, or AlwaysOn Availability Groups?
Bad news: None of those technologies mean you don’t have to run CheckDB against your main production database. The only valid way to offload CheckDB is to run it against a restored full backup.

In fact, AlwaysOn Availability groups may mean you probably have to run more CheckDB, not less.

Database-Corruption-Sleeping-Under-Desk
This is what a corrupt database looks like

Q: Do some features make CheckDB slower? Yep, some indexes check slower than others. Having nonclustered indexes on computed columns can make CheckDB crawl, as can having indexed sparse columns. If you’re using those features and CheckDB is painfully slow, you might consider disabling the nonclustered indexes before running the job– but that means writing custom code and making sure that you handle any error situations when the job fails. (Kind of gross, right? That could lead to a bad day. Head on back up and see “How can I make CheckDB faster” for a better answer.)

Q: What about “PHYSICAL_ONLY” or “NOINDEX”? Can I use those? Well, you can, but then you might not detect corruption as fast. That could be really bad, right? I’m not a fan of either of these options because they turn off valid features in CheckDB that could really save your bacon. I’d much rather offload CheckDB to another server by restoring a backup as describe above– it has much greater benefits and is even better for performance of your production database.

Q: Should I run CheckDB against TempDB? Yes– you should. CheckDB can’t do every single check against TempDB that it can against other databases (the output will let you know that it can’t create a snapshot), but it can still run a variety of checks against TempDB and it’s supported to run it. Don’t skip it! You should be checking master, model, msdb and tempdb, plus all your user databases.

Q: How can I automate running DBCC CheckDB? You can do this by using the Check Database Integrity task in a Maintenance Plan. You can also use a SQL Server Agent job and a free script to help with database maintenance.

Q: Is CheckDB all I need to protect me from corruption? Running CheckDB can help you find out about corruption, but you also need:

To allow alerts and jobs to notify you, you need to configure Database Mail, enable the database mail profile on the SQL Server Agent (and then restart the SQL Server Agent service), configure an operator, and then set the operator on the alerts and jobs.

Q: What if I find corruption? Corruption happens– that’s why this is so important. The first thing you should do is sign up for our online DBA Interview Training to find that next job. (Awkward laugh.)

All kidding aside (and that was a joke! stop updating your resume!), you must respond to the issue immediately. The first thing to do is to alert others on your team to the issue and make a plan. You need to make sure that a full CheckDB has been run against the database, that you’ve read the output of CheckDB carefully in the SQL Server Error Log.

Gail Shaw has posted an article with a great list of do’s an don’ts when you hit corruption here (login required, sorry). If the database is important to your business, I urge you to spin up a support call to Microsoft Support right away– even while you read Gail’s article. At around $500 USD Microsoft Support is your absolute best value when it comes to getting help right away for corruption. (Yep, I’m an independent consultant and I’m telling you that!)

Once you have the specific incident resolved, make sure you follow up. How long has the issue existed? Can root cause of the original corruption be identified? Could you have the issue again, or somewhere else in your environment? It’s rare for CheckDB to strike once and not repeat itself, so it’s important to follow up.

Previous Post
How to Tell if You Need More Tempdb Files
Next Post
What Amazon RDS for SQL Server Users Need to Know about Multi-AZ Mirroring

77 Comments. Leave new

  • Derek B. Bell
    May 27, 2014 9:42 am

    Oops, there’s no site reference on the anchor tag for “jobs notify you when they fail”.

    Reply
    • Kendra Little
      May 27, 2014 6:00 pm

      Hey Derek, The link for that one is the same link as the Alerts link above. I felt weird linking to the same page on two lines, so I just left it Link-Less. 🙂

      Reply
  • What if you’re not allowed to run DBCC CHECKDB because many of your databases are between 1 and 4 terabytes and even minimizing the impact ( http://sqlperformance.com/2012/11/io-subsystem/minimize-impact-of-checkdb ) doesn’t help?

    Reply
    • Kendra Little
      May 28, 2014 5:23 pm

      What do you mean by “doesn’t help”? Do you mean you can’t restore the databases anywhere?

      Reply
      • I mean that it doesn’t help enough to successfully run on these in a reasonable maintenance window without degrading the thoroughness of the checks themselves (PHYSICAL_ONLY for instance). I’m juggling thousands of databases across about half a dozen production servers and three different software environments that rely on other nightly maintenance tasks. I’d love a solution, I’m just not sure what a good one would be that would get to everything in a timely enough manner to do something about potential corruption.

        Reply
        • Kendra Little
          May 28, 2014 5:43 pm

          Sorry if this is an obvious question, but if you’re restoring the database to a completely different server, why would you need the checkdb to complete within a “maintenance window”? Do you only have shared components available (such as storage) that would impact production?

          You probably see what I’m getting at here: your challenge is to use automation tools and request the resources you need to make this happen. Sometimes it isn’t easy, but that’s a huge part of the responsibility of a DBA.

          Reply
          • The short answer on resources is yes, the long answer is probably one you’ve heard a million times. I’m aware of the tools and the necessity and the importance.

        • Erik – if you’re feeling overwhelmed with too many databases and not enough protection resources, check out our high availability and disaster recovery planning worksheet:

          https://www.brentozar.com/archive/2014/05/new-high-availability-planning-worksheet/

          Take one of your big databases/servers, fill out the “current” column, and then hand it to management. Ask if that situation is okay. If it is – and sometimes it is, like with data warehouses that can be easily rebuilt from source data – then there’s good news: you can stop losing sleep over the inability to run DBCC. You’re meeting the business objectives.

          If the situation isn’t okay for management, then it’s up to them to get you the resources you need, and it’s up to you to explain those resources. (That’s where the back of the worksheet comes in.) Sometimes management just hasn’t heard the right message about how dangerous the situation is, and seeing it in writing helps escalate the issue to get the resources you need.

          Hope that helps!

          Reply
  • We restore to our staging server every week and run dbcc checkdb every time. It gives a good view to the devs and makes sure we’re aware of any issues.

    Reply
    • Paris Hilton
      May 27, 2016 9:08 am

      Bill,
      have you found a way to restore master to your staging server (as a user DB) and run dbb checkdb. It seems that there’s somethign special about master that will ALWAYS report corruption.

      P

      Reply
  • Bruno Martinez
    June 1, 2014 3:27 pm

    Why check master, model, msdb and tempdb when checking a backup is enough?

    Reply
    • Kendra Little
      June 2, 2014 10:25 am

      Hi Bruno,

      Couple of questions– how are you “checking” the backup? And how are you backing up tempdb?

      Kendra

      Reply
      • Bruno Martinez
        June 2, 2014 2:16 pm

        Hi Kendra,

        I meant running DBCC CheckDB over a backup of the user database. You advise running DBCC CheckDB over TempDB but also offloading the work to a totally different server. I don’t see how to do both.

        Bruno

        Reply
  • One of the suggestions in this article is to run DBCC CHECKDB on another instance of a database server in order to offload the impact on a production system. What if some potentially corrupted data is found in the ‘tempdb’ ? Since backup and restore operations are not allowed on tempdb, how would one approach this? Will a restart of SQL Server fix the tempdb corruption, since this database gets re-created?

    The following was found in ‘suspect_pages’:

    SELECT * FROM msdb.dbo.suspect_pages sp

    database_id file_id page_id event_type error_count last_update_date
    2 8 1264855 1 3 2014-09-15 20:03:55.590
    2 8 1264854 1 2 2014-09-15 20:03:43.817
    2 8 1264853 1 2 2014-09-15 20:03:43.830
    2 8 1264852 1 2 2014-09-15 20:03:43.840
    2 8 1264851 1 2 2014-09-15 20:03:43.850
    2 8 1264850 1 2 2014-09-15 20:03:43.857
    2 8 1264849 1 2 2014-09-15 20:03:43.867
    2 8 1264848 1 2 2014-09-15 20:03:43.877
    2 8 1264831 1 2 2014-09-15 20:03:43.887
    2 8 1264830 1 2 2014-09-15 20:03:43.893
    2 9 1305143 1 2 2014-09-15 20:03:55.440
    2 9 1305142 1 2 2014-09-15 20:03:55.450
    2 9 1305141 1 2 2014-09-15 20:03:55.460
    2 9 1305140 1 2 2014-09-15 20:03:55.467
    2 9 1305139 1 2 2014-09-15 20:03:55.477
    2 9 1305138 1 2 2014-09-15 20:03:55.483
    2 9 1305137 1 2 2014-09-15 20:03:55.493
    2 9 1305136 1 2 2014-09-15 20:03:55.503
    2 9 1305127 1 2 2014-09-15 20:03:55.513
    2 9 1305126 1 2 2014-09-15 20:03:55.523
    2 9 1305124 1 2 2014-09-15 20:03:55.533
    2 9 1305123 1 2 2014-09-15 20:03:55.547
    2 9 1305122 1 2 2014-09-15 20:03:55.557
    2 9 1305121 1 2 2014-09-15 20:03:55.567
    2 9 1305120 1 2 2014-09-15 20:03:55.577

    The tempdb database has 214132.44 MB, with the following allocation

    name fileid filename filegroup size maxsize growth usage
    tempdev 1 E:\TempDB\tempdb.mdf PRIMARY 49304832 KB Unlimited 10% data only
    templog 2 E:\TempDB\templog.ldf NULL 742656 KB Unlimited 25% log only
    tempdev2 3 E:\TempDB\tempdb2.mdf PRIMARY 54277888 KB Unlimited 10% data only
    tempdev3 4 E:\TempDB\tempdb3.mdf PRIMARY 54236160 KB Unlimited 10% data only
    tempdev4 5 E:\TempDB\tempdb4.mdf PRIMARY 37463680 KB Unlimited 10% data only
    tempdev5 6 E:\TempDB\tempdb5.mdf PRIMARY 6421952 KB Unlimited 10% data only
    tempdev6 7 E:\TempDB\tempdb6.mdf PRIMARY 9314496 KB Unlimited 10% data only
    tempdev7 8 E:\TempDB\tempdb7.mdf PRIMARY 3762816 KB Unlimited 10% data only
    tempdev8 9 E:\TempDB\tempdb8.mdf PRIMARY 3747136 KB Unlimited 10% data only

    DatabaseName FileSizeMB LogicalFileName PhysicalFileName Status Updateability RecoveryMode FreeSpaceMB FreeSpacePct PollDate
    tempdb 48149 tempdev E:\TempDB\tempdb.mdf ONLINE READ_WRITE SIMPLE 48073 100.00% 2014-10-03 11:14:21.217
    tempdb 725 templog E:\TempDB\templog.ldf ONLINE READ_WRITE SIMPLE 353 49.00% 2014-10-03 11:14:21.217
    tempdb 53005 tempdev2 E:\TempDB\tempdb2.mdf ONLINE READ_WRITE SIMPLE 52929 100.00% 2014-10-03 11:14:21.217
    tempdb 52965 tempdev3 E:\TempDB\tempdb3.mdf ONLINE READ_WRITE SIMPLE 52889 100.00% 2014-10-03 11:14:21.217
    tempdb 36585 tempdev4 E:\TempDB\tempdb4.mdf ONLINE READ_WRITE SIMPLE 36509 100.00% 2014-10-03 11:14:21.217
    tempdb 6271 tempdev5 E:\TempDB\tempdb5.mdf ONLINE READ_WRITE SIMPLE 6254 100.00% 2014-10-03 11:14:21.217
    tempdb 9096 tempdev6 E:\TempDB\tempdb6.mdf ONLINE READ_WRITE SIMPLE 9077 100.00% 2014-10-03 11:14:21.217
    tempdb 3674 tempdev7 E:\TempDB\tempdb7.mdf ONLINE READ_WRITE SIMPLE 3659 100.00% 2014-10-03 11:14:21.217
    tempdb 3659 tempdev8 E:\TempDB\tempdb8.mdf ONLINE READ_WRITE SIMPLE 3645 100.00% 2014-10-03 11:14:21.217

    Reply
    • Hi Vic. If you’re getting any corruption in TempDB, the first thing I’d do is check the user databases. If TempDB’s storage is unreliable, odds are the user databases are in bad shape too. If you’re using different storage for TempDB, it’s time to reconsider the integrity of that storage.

      Don’t just restart SQL Server – whatever caused that corruption will strike again.

      Reply
      • Hi Brent,
        Thanks. There is a “Database Integrity Check” maintenance plan with a “Check Database Integrity Task”, that produces a set of T-SQL as below, but the statement for DBCC on TempDB is not in this generated script

        USE [msdb]
        GO
        DBCC CHECKDB(N’msdb’) WITH NO_INFOMSGS
        GO

        The other system databases are also included: distribution, master, model, but NOT tempdb.

        This job happens every Friday at 11:00PM, and there is a notification going out when the Job Fails.

        As such, I don’t think there are any issues with the integrity of the user databases, since I don’t see any rows in the suspect_pages table pointing to database_id other than TempDB .

        We have EMC Symmetrix VMAX Full Automated Storage Tiering.

        Reply
        • Kendra Little
          October 7, 2014 1:02 pm

          Just a note — you mention that all the system databases (except tempdb) are being checked, but you don’t mention that any of the user databases are being checked. If you don’t check them, you don’t know if they have corruption.

          Reply
    • Kendra Little
      October 3, 2014 10:25 am

      I would start by running CheckDB against tempdb, if you haven’t, to get more information and document the situation. Checkdb can be run against tempdb even though it can’t do everything it can do against a user database. I would definitely take it seriously and assess the impact its having on users and decide what to do from there.

      Restarting tempdb doesn’t actually recreate the files, and won’t necessarily fix the issue. But good news, Robert Davis wrote a post to help explain that and get you on the right path: http://www.sqlsoldier.com/wp/sqlserver/day14of31daysofdisasterrecoveryfixingacorrupttempdb

      Like Robert says, you should make sure to follow up on root cause.

      Reply
  • Great article Kendra 🙂 I’d like to know your thoughts if this process is the grand finale to look/detect any DB corruption:

    I have a 250GB Prod database that I moved to a Test Environment , It restored without errors then I ran DBCC Checkdb (MyProdDB) which took about 4-5 hours to crunch and it finished with the following message:

    CHECKDB found 0 allocation errors and 0 consistency errors in database ‘MyProdDB’.
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    Is there anything else you would think of to run or test against for any further results or is it safe to say the DB appears to be fine? Thanks

    Reply
    • Kendra Little
      October 27, 2014 6:06 pm

      Yep, that means the job succeeded, and ask of the date you ran the backup, it didn’t find corruption.

      Bonus points/ the advanced details: If you created the database in SQL Server 2000, there is one type of check that isn’t run on later versions unless you manually run it once. So if this is an older database that you’ve upgraded, you may want to run it once on production with the DATA_PURITY option to enable column value checks. If the database was created on SQL Server 2005 or later, or if this has previously been run against the production copy with that parameter, it happens by default.

      There’s details in the CHECKDB books online page about the option: http://msdn.microsoft.com/en-us/library/ms176064.aspx

      You’re probably fine, but you asked such a specific question, I felt I had to give you an interesting answer 😉

      Reply
      • This is very interesting indeed …. as far as I know this current SQL 2012 DB was upgraded from SQL 2005 but most likely it did exist prior to that (Hello SQL 2000) but I’m not sure if anyone had run DBCC checkdb with DATA_Purity at the SQL 2005 level years ago.

        Now you’ve done it hahaa , for the peace of mind (my mind of course) I’ll run it manually once to get it out of the system and my head!

        Since I’ve already ran CheckDB successfully is there a way to run CheckDB skipping the majority of checks and sneak in Data_Purity to minimize run-time (avoid the 4-5 hours ) … something like DBCC checkdb (MyDB) with NoIndex , Data_Purity, noinfomsgs etc.? Thanks 🙂

        Reply
  • We turned on check database integrity check on our maintenance plan before doing backups We have 128 GN memory and out max memory set to 110000 and awe enabled. When dbcc checkdb runs our memory spike to 95% and it stays there until we shut and restart the SQL service. During regular hours it’s at 55-60 range.
    How can I prevent integrity check using the memory

    Reply
  • Hello Kendra,

    What would cause dbcc checkdb to timeout? I have this issue where the checkdb on the user database is timing out. We are using powershell to run checkdb. This is the error being generated:

    An exception occurred while executing a Transact-SQL statement or batch.
    Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

    I could not find anything that might explain this behavior. The other checks on system databases run fine. Does dbcc checkdb need locks on objects? What could be causing it to timeout? I’m kind of afraid to run an adhoc check because I am not sure what to expect and do not want to break the production database.

    Also, what do you think of this thought from my co-DBA: Checkdb must be ideally run on a single user mode and that it could fail whether there is a corruption or not. It is not a problem if we don’t run checkdb or if checkdb fails. If there is an actual corruption problem, we will know with or without checkdb.

    I do not agree with the thinking but I’m not sure if there might be something he/she has seen before that led to that thought…. Do you have any comments on that?

    Reply
    • Kendra Little
      March 27, 2015 2:10 pm

      Hi there,

      Thanks for giving lots of detail on your question. My first guess is that this is related to the way you’re calling CHECKDB with Powershell, and that powershell is implementing a timeout on the command. CHECKDB is taking longer than the timeout, and PowerShell is just giving up. One way to test this would be to run CHECKDB without PowerShell, directly from a SQL Agent Job Step using TSQL. Set the step to log to a text file and see if you get the same result, or if it fails whether or not you get more information in that log text file.

      You do not need to run CHECKDB in single user mode, and it does not need to use locks unless you’re running on SQL 2000 or you’ve specified the TABLOCK option, which you should not typically need to do.

      I would be careful with your co-DBA. It is NOT true that you will definitely know about corruption without running CHECKDB. If you don’t run CHECKDB regularly it is quite easy to end up with a corrupt database and corrupt backup files, so when it is finally discovered you have very few options to fix the issue.

      Good luck and please let me know if running directly in TSQL in a sql agent job fixes the issue!

      Reply
  • Hi Kendra,

    Thank you very much for your response. I really appreciate it as I could not find any helpful resources regarding my issue elsewhere.

    I ran dbcc checkdb against the database directly in tsql last Friday and the results returned in about 40 seconds. There were no errors.

    I tried to run it again just now, and it ran for about 2 mins and was still going on, so I just killed it.

    The timestamp on our log file is usually written to about 25 mins after the checkdb job starts to run. If that gives any idea on how long the timeout needs to be…

    On a different instance, I tried to run dbcc checkdb directly in tsql but it took 1.5 hours and it was still going. I had to kill it. The DB is less than 300 GB. What should I look for in this case? I also noticed disk response times jumping up to 500 ms while running it.

    I agree 100% with you that corruption is not something to take risks on. It is always better to be safe and be aware of potential issues early on before it is too late to save the DB. I will have to work on this to make sure that we do our due diligence on the matter.

    Thank you very much!!

    Reply
    • Kendra Little
      March 30, 2015 10:43 am

      Check out the “Q: How can I make CheckDB faster?” part of this post.

      Reply
      • Thanks for that. Unfortunately, because of the nature of the data, it cannot be moved. =(

        Is it normal for checkdb to run for more than a couple of hours?
        I will try to run it manually over a holiday or a weekend to minimize impact on performance.

        Thank you so much.

        Reply
        • Kendra Little
          March 30, 2015 11:04 am

          CHECKDB has to read all the data off of disk and do a lot of checks for corruption– if disks or slow, CPU is limited, or if there’s limited memory that can make it slow when you’ve got hundreds of GB or terabytes of data. The level of activity running also influences performance as well.

          Reply
          • Thank you for your prompt responses. Our storage is a bit slow so that may be contributing to it. I will try to see if I can set an unlimited timeout to the powershell script so that it can successfully run checkdb. If not, I will create a separate agent job to run this check.

            Thank you!

  • Hi Kendra,

    I’m thinking of implementing the DBCC on our warm standby, but a bit hesitant to do so because I can’t sort out the steps of recovery in my head.

    Currently, we backup a 500Gig database to 3 separate files that are overwritten daily. When I currently do the DBCC as the first step of the maintenance plan, I would not overwrite my good backup if there is corruption. However, if I implement the DBCC on the Warm standby and then detect corruption, my good backup has been lost.

    Does it make sense to keep multiple days of backups when you do the warm standby DBCC model so you could revert a backup prior to corruption and then apply transaction logs? I have not dealt with a corrupt database yet and want to minimize surprises if/when I do go down that path.

    Reply
    • Kendra Little
      July 10, 2015 8:27 am

      I absolutely recommend not overwriting your existing backup files and keeping multiple days, even if it means copying them off to slower storage. If the backup ever fails mid-run, that can leave you in a bad spot.

      Reply
  • I have an application database that seems to be running running checkdb command at random times during that day. How can stop that?

    Reply
  • Is there any way I can find out when last DBCC INFO failed? I specifically want the last time any DBCC check failed..

    Reply
  • Shahul Hameed
    October 7, 2015 2:37 pm

    Hi Brent , Could you please help me on How to solve this error ” could not continue scan with nolock ” in SQL Server 2008 R2.

    Reply
    • Kendra Little
      October 7, 2015 2:41 pm

      Hi Shahul, This post isn’t on that topic, and Brent didn’t write it. Guessing you may have gotten your webpages confused!

      Kendar

      Reply
  • I have been reading this site for years and I am a subscriber. This site has made me a better DBA and the tools, links and advice have helped me often. I sincerely thank all of you here for that.
    Until today I have never had a reason to ask a direct question, and unfortunately it is one about consistency.
    So here is my situation, we have an SharePoint farm with one MSSQL Server with around 100 databases in it, nothing huge most are in the 100gb range. I am not a SharePoint expert and our 1 SharePoint person is more design than admin, but he is the one in charge of the farm; however the farm has been checked over by an outside consultant who has given the setup and deployment a clean bill of health so to speak.
    The SharePoint sites have to be available from 4am to 12am due to the locations of the people accessing it, that only gives me 4 hours a night to do all the backups and maintenance.
    The SharePoint_Config database keeps getting consistency errors. (and only that database, all others are fine)
    I have restored the config db four times in the last two months with known good backups and everything is fine for a week or so and then more errors. (random time intervals)
    We have the following:
    • 2 SharePoint app servers
    • 2 SharePoint web servers
    • 1 OWA server
    • 1 Forms server
    • 1 SQL server
    Recovery and Backup:
    • Hourly log backups
    • Daily backup
    • Separate weekly backup
    • Twice a day DPM backup
    • SAN level replication of all servers and data to our rollover location (one hour lag time)
    • All backups are kept 30days, except weekly which are kept six months or longer depending on storage space
    Maintenance: (using Maintenance plans, Blitz, BlitzIndex, Ola’s scripts and a couple of self-made scripts)
    • Monthly re-indexing
    • Weekly index optimization
    • Weekly DBCC checks (currently daily on SharePoint_Config)
    • Backups
    • Purges
    • Others (etc.)
    Troubleshooting and Tracing:
    • I/O logging
    • IOPS logging
    • Latency logging
    • Normal server monitoring (cpu, memory, disk, read/writes, cache, etc.)
    • SPID tracing and logging
    • SP logging

    I have found no smoking gun, no bad drives, no heavy I/O spikes, nothing in the event logs of any server, basically nothing at all to point me in the right location.
    So my question is this, any idea where I can look to try and track down what is causing the consistency errors, were they are coming from? I am pretty much exhausted everything I can think of. Any help would be greatly appreciated. If you would like more or specific information please ask me.

    Thank you.

    Reply
    • Ron – if you’re encountering CHECKDB errors, open a support ticket with Microsoft. It’s $500 and they work the problem with you until it’s done. You’ll get much better help than we can give you in a blog post comment. I’d move quick – you don’t want permanent data loss.

      Reply
      • Sir,
        Thanks for that. I actually suggested this to my VP awhile ago, but as of yet he has not given me permission to do it. Of course once the whole thing comes crashing down he will, while at the same time reading me the riot act for allowing it, never mind all the e-mails of me telling him we should contact Microsoft support. Oh the joys of corporate IT.

        Reply
  • Hi Brent/Kendra,

    In case of corruption errors, First approach is to recover database from last known good backup. Now how to decide the backup is good? I mean how we can be sure that errors are not in DB backup. Backups with checksum will not help 100%? right? … So does it mean that we should know the backup which has passed in DBCC checkDB(after restore)? Could you please advise me on this?

    I did tried (successfully) recovering indexes from Backup by page restore. But I think I was lucky that the backup was good :).

    Thanks,
    Shailesh

    Reply
  • Why dbcc checkdb used more space in tempdb in sql server 2012 than sql server 2008 r2?

    Reply
    • Jose – we haven’t seen that. Can you tell us more about your measurements?

      Reply
      • Hi Brent.

        Well we had a database in SQL Server 2008r2 where execute dbcc checkdb normally the tempdb database doesn’t grow then we migrated the database to SQL Server 2012 and execute dbcc checkdb here tempdb database grow like 200 GB

        Reply
        • I understand, but we need specific, reproducible measurements.

          I knew a guy who bought a lotto ticket once and then got into a car accident the next day. That doesn’t mean everyone who buys a lotto ticket will get into a car accident. You’ve gotta be able to tie things together with a more scientific method.

          Reply
  • Had DBCC suddenly start issuing these messages over the Easter weekend (and a change freeze was on so nobody changed anything!) – this is a test DB created to replicate the problem:

    Msg 1823, Level 16, State 2, Line 3
    A database snapshot cannot be created because it failed to start.
    Msg 1823, Level 16, State 8, Line 3
    A database snapshot cannot be created because it failed to start.
    Msg 5119, Level 16, State 2, Line 3
    Cannot make the file “C:\ClusterStorage\Volume1\DATA\test1.mdf_MSSQL_DBCC28” a sparse file. Make sure the file system supports sparse files.
    Msg 7928, Level 16, State 1, Line 3
    The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams.

    This is SQL2014 SP1 cluster running on Windows 2012 R2 – all has been OK for months, and I thought this sparse file error was a Windows 2008 problem that had been hotfixed?

    Also there are two clustered volumes and if you create a DB on the second one this doesn’t happen – even weirder!

    Anybody come across this?

    Reply
    • John – for random questions, your best bet is to either post a question at http://dba.stackexchange.com or call Microsoft support, especially if it’s a CHECKDB problem in production.

      Reply
      • Thanks, it turned out that it was because the SQL Data disk was owned by the second node of the cluster, and as soon as we moved the ownership of that volume to the primary node all the problems stopped.

        Can’t find anything anywhere as to why, when the disk is owned by another node, it suddenly doesn’t allow sparse files, except it being a cluster volume may mean it doesn’t present as NTFS and fails a check remotely.

        Anyway, now we know how to fix it at least.

        Reply
  • Vicky Sukhwani
    June 27, 2016 6:39 am

    Recently we had Prod Db failure on our environment i can see *.mdmp files were created in the Error Log directory when i was running CheckDB and finding corruption. I want to set up a SQL Alert which will notify me whenever a new mdmp file or SQLDUMP*.txt is created, can you please help me that how should i set it up. I looked everywhere but couldnt find anything.

    While setting up SQL Alert i need to give combination of Error Number/Severity or Message text if i now but looking SQL Server Error logs i dont see any Error Number or Severity which i can use so please suggest how can i setup automated Alerts to check if new Dump file was created on my environment

    Thanks in Advance

    Reply
  • Hello. A question on the topic of restoring to another server to run checkDB. I understand that it has to be a full backup and not an AG replica, log-shipped copy, etc.. My question is if, in those cases, whether we still need to periodically run checkDB against the live, online database. Even if you use a full restore, it’s restored to another sector of the SAN, running through different switches and fibers, to another server in another chassis, etc.. My concern is there would be something in the underlying hardware supporting the actual prod instance that would be missed if you run checkDB against a restored copy. Is there any validity to that concern or am I making up a problem that doesn’t exist? Thanks!

    Reply
  • Hi Kendra,

    I have a big challenge, and I need your advice. I have to implement “DBCC CheckDB” on 50+ Servers = 50+ instances = 200+ user databases and 200+ system databases. The main problem and most important part is maintenance, and because of that I have to implement a simple solution.

    The best solution is probably centralized solution (for logging output, for adding and removing databases, for changes like database names, exceptions …). Is it the best solution creating stored procedure for “CheckDB” and implementing it on all instances in master db, and running it all from my central instance? Next question: Is it better implementation through Maintenance Plan on my central instance or maybe through SQL Server Data Tools or there are other solution?

    In any case, I have a lot of other complicate problems like polices (for the begin, I will implement “weekly check” for all databases, timing (I have to find the best date and time for each database 🙁 ), exceptions, very big databases, monitoring performances during process…

    I don’t see simple solution without a complicate maintenance 🙁

    Reply
    • Bojan – the easiest, most reliable method is to run it on each SQL Server. If you try coordinating it through a single SQL Server, you can face problems when that server goes down.

      Reply
  • Thanks. Finally, I will develop small application with GUI for managing (adding/removing CheckDB, creating and deploying Jobs and all necessary object, reports,…, collecting outputs, generating reports) ,but all those database object and Jobs regarding CheckDB process will be on each SQL Server (if central server goes down, this will not affect CheckDB process).

    Reply
  • can we read the data during dbcc check db operation running on the database

    Reply
  • Reply
  • What has your experience been with the accuracy of “with estimateonly”? I ran a checkdb on a 42GB database and it came out fine — no errors. Then I found the estimateonly option and decided to test it. The answer was “Estimated TEMPDB space (in KB) needed for CHECKDB on database whatever = 71415657.” If I am reading this correctly, that’s 68gb. My tempdb is only a little over 1gb after running the checkdb, so I don’t have much confidence in this estimate.

    I must be missing something. Any thoughts on estimating how much tempdb space is going to be needed before starting checkdb?

    Reply
  • Neal Erdmann
    June 26, 2017 12:10 pm

    For years I’ve used a very simple maintenance job that run “exec master.dbo.sp_msforeachdb ‘DBCC CHECKDB([?]) WITH NO_INFOMSGS'”. If the job succeeds it sends out a notification to that effect and if it fails it sends an email saying something was wrong and I go back and run DBCC CHECKDB against the database that had the issue.. I’ve always assumed that when I rec’d the “success” email that things were fine. Then I came across Paul S. Randal’s blog and the use of DBCC Page to find the last known good DBCC CHECKDB run (dbi_dbccLastKnownGood). When I ran this against one of my instances I found that several databases’ last known good run was a couple of months ago. I’m now thinking that along with DBCC CHECKDB I need to run regular checks using DBCC PAGE in order to truly ensure that my databases have been recently checked.
    My question is this: Can you tell me why my CHECKDB job was reporting success when it appears that several of the databases either had integrity issues or, and probably worse, not checked during the run of the job?

    Reply
  • Neal Erdmann
    June 26, 2017 12:12 pm

    Sorry Brent… I saw your blurb re: dbcc dbinfo as well.

    Reply
  • Shira Bezalel
    August 15, 2017 2:42 pm

    Testing 123. Is this thing on?

    Reply
  • Shira Bezalel
    August 15, 2017 2:54 pm

    Do you have a recommended approach for being notified by corruption detected by a DBCC CHECKDB apart from manually reading the log file? It doesn’t seem like SQL Server Agent Alerts would work to achieve this goal since corruption results detected by DBCC CHECKDB aren’t actual error messages. Is a customized stored procedure the way to go?

    Reply
  • If we are running DBCC CHECKDB, do we need to run DBCC CHECKCATALOG separately or is it included with DBCC CHECKDB.

    Reply
  • Thanks for sharing !

    Reply
  • […] on whether or not you should run a CHECKDB against the tempdb. You should check out this post by Kendra […]

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.