Building SQL ConstantCare®: Which Recommendations Do People Mute?

Our SQL ConstantCare® service sends you daily recommendations for specific tasks that will improve your SQL Server’s health and performance.

Next to each recommendation, there’s a Mute link that you can click on to either mute the recommendation entirely, or mute it for specific databases, indexes, or queries:

When we brought out that feature last year, I was really curious to see what recommendations people would mute. Before the Mute links came out, users had always been able to mute things by hitting reply and telling me what to mute, but I guessed that they’d get more ambitious with muting stuff once they were allowed to mute it themselves without talking to me. You can read what they were muting back then.

Here’s what you’re muting:

  1. Consider Enabling Query Store – now muted by 151 users. I’ve blogged about why people aren’t using Query Store, and this is still a bit of a bummer. I think we could turn this number around, but users need to see value out of the feature. It’s just not easy enough yet for most folks to go from turning it on, to configuring it properly, to getting value out of it.
  2. Check for Corruption ASAP – 110 users. Keep in mind that this isn’t the number of people who have been alerted about the lack of corruption checking, and then actually fixed it. These are just the people who said, “Yeah, I know I’m not checking for corruption, and you can mute that recommendation.” That’s why I’m asking for SQL Server to just start automatically check data integrity in the background. It’s too important to leave to end users to implement.
  3. Transaction Log Larger than Data File – 104 users. I need to go back and tweak the code on this rule because I’ve seen some commonalities: for example, people seem to run into this problem a lot on the SSIS DB and want to ignore it.
  4. Set Fill Factor Higher – 65 users. This rule currently fires for fill factors 79% and lower. I may tweak the threshold on this one too.
  5. Check Unusual Database States – 60 users. Dang, y’all really like keeping offline databases around. I think I wanna tweak this one to just alert when a formerly-online database drops offline.
  6. Take a Full Backup – 59 users. COME ON NOW.
  7. Move User Databases Off the C Drive – 56 users. That’s fair: once we’ve found this problem, you probably don’t want to hear about it repeatedly. I’m not sure there’s value in keeping this rule around because it doesn’t look like a lot of people are willing to fix it.
  8. Take a Log Backup – 44 users. We’re only alerting you for databases in full recovery model that haven’t had a log backup in several days. Right now, we don’t have logic to detect if you’re offloading log backups to a readable secondary, but based on this high number of mutes, that’s something we need to build.
  9. Move TempDB Off the C Drive – 42 users. See #7.
  10. Enable Checksums for Corruption Detection – 39 users. This one kinda surprises me because it’s so safe, easy, and fast. I think I need to work on my communication about the rule to explain why it’s so important.
Previous Post
Three Reasons to (Temporarily) Change Max Worker Threads
Next Post
Can SELECT * Make a Query Go…Faster?!?

19 Comments. Leave new

  • Some of those involve actual work! Like learning AND doing! 🙂

    Reply
  • Did you change your mind about user databases and tempdb on c: at some point? I thought you had previously advocated one big volume, reasoning that splitting up storage into multiple volumes was more likely to cause you to run out of space than to prevent it.

    Reply
    • I never said the databases should be on the C drive. The OS should always be a separate volume to make sure you can still boot up even if an out of control transaction fills up your he database drive.

      Reply
  • Hey Brent. We’re guilty of muting some of these. I wanted to explain why.

    Corruption checks – our dbas do this as part of our restore testing. They have one server that they restore the dbs on and then run checkdb.

    Dbs off the c: drive – a few of our servers were set up with mount points that were mapped to folders on c. I’m not well versed but I’m guessing it’s hard to distinguish from your end.

    Reply
    • Mike – sure, those make sense, but I worry about the restore-and-check pattern. If you *can* do the checking on the primary, I’m such a huge fan of doing that. I can’t tell you the number of times I’ve gone into a shop where they’ve said their scripts automatically do a restore-and-check, and then when we look at what the scripts are doing, they have obvious bugs where they’ve been failing for years and no one knew. The restores weren’t working and they were checking the same (old) database over and over, or the checkdb was failing but the emails weren’t working, etc, etc.

      Reply
      • That makes sense. At the risk of offending them, I’m going to ask them how we are sure that all the restores are succeeding.

        Reply
        • As the owner of the data, database, or system it’s your responsibility to make them prove the systems work and can restore from a certain time period. I also audit retention periods to business requirements.

          Asking is it backup is not good enough. Short retention periods, corruption, can just be as harmful as broken backups.

          Reply
  • Some of the problems that these recommendations fix are (no exaggeration) job-killers. Why wouldn’t a DBA implement them? WHY????!!!!!

    Reply
  • I keep offline databases around for weeks or months sometimes. Developers often seem to say they don’t need a database anymore–then immediately need it the moment it has been deleted. So I just got in the habit of setting the db offline so I can do an “instant restore” when inevitably the devs do need the database.

    Reply
    • Why not detach it?

      Reply
      • Now I need to set a reminder for myself to go back and clean up files off the server. Which is OK, but I didn’t really consider having an offline db a problem, to be honest, so detach would be an extra step. What is the benefit of Detach vs Offline? Maybe it’s a blog post idea for you? 🙂

        Reply
        • You would have to set a similar reminder to remove the offline databases as well. There’s no change there.

          The benefit of detach is that the database simply disappears from the system and from your users’ views. I’ve seen plenty of scripts that fail when they hit offline databases, for example. I’ll turn it around on you: what’s the benefit of offlining a database?

          Reply
          • Well the benefit for *me* is, since I’m solo managing about four dozen servers, I am often trying to reduce processes to the most simple they can be. I also have no team to bounce ideas off of.

            Of course, you are correct, the DB does disappear from SSMS when it’s detached. I had not run into any scripting issues. If I had, I would probably have reconsidered this approach sooner.

      • Once detached, it’s no longer anything to do with an instance; it’s just a file and more susceptible to accidental deletion.

        Reply
  • danielle.paquette-harvey
    January 14, 2020 8:03 am

    Yeah, I’m guilty of muting the “Check for corruption ASAP”. The reason is I’m dealing with VLDB and I can’t afford to do a CHECKDB on it. So I do CHECKTABLE spread on 7 days, with CHECKALLOC AND CHECKCATALOG once per week. I keep 14 days of backups too to be able to get back to the day before corruption is founded. I also have a check on the msdb suspect_pages every 3 minutes to be able to detect corruption as soon as possible + I have page checksum on my databases.

    I know it’s not the best but it’s the best I could do. Next step will be to ask for my boss for a separate server in order to do the CHECKDB or a cluster or something like that, but as for now, there is no budget for those options yet. But budget comes real fast in case of catastrophe so even if they are warned about it, I guess that when catastrophe strikes, budget is going to appear to have a better setup than that.

    Reply
  • Hugo Kornelis
    January 15, 2020 5:56 am

    I’ve got a few databases where the TLog grows to larger than the data file during index maintenance. Not much I can do to prevent it (apart from stopping index maint at all).
    And I, too, set databases offline if I suspect they’re unused but want to give people a grace period in which they can start to yell when they’re suddenly unable to work. I know, detach works too. Restore from backup works too. But putting offline is easy, and keeps the DB visible (which works as a simple reminder to myself to go back and delete it later).

    Reply
  • David Williams
    March 31, 2020 4:09 pm

    I also like offline for visibility – bit hard to write a report saying these database were detached but the files are still there! With a historical repository (or cache!) you could put the list of previous files from the repository and if the db is gone since last time but the files are still there then warn. Perhaps also warn on files which are in the same directory as files from the instance but are not part of the instance (more involved!)

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.