Triage Quiz: Is Your SQL Server Safe?

Contrary to popular belief, we spend a lot of time with clients when we’re not blogging, answering questions in Office Hours, or working on new features for the download pack. Something we hear a lot is, “How do we compare to other clients?” or “Is this the worst/best setup you’ve seen?”. This got me thinking, so I’ve created this totally non-scientific “Triage Test” for anyone who wants to know how they’re doing or who has nothing better to do than take quizzes on the internet.

You are just answering questions; no changes to your systems. Here how it works:

  1. Pick ONE production SQL Server for your score
  2. Pick the answer that is closest to your setup
  3. If the answer is the 1st answer, you get 1 point. If it’s the 3rd, you get 3 points. (This would be worth 3 points, right? Right.) Get it?

Despite how honorable everyone who reads our blog is, since we can’t prevent cheating, you’ll have to settle for the glorious prize of having a comment on this post, and hopefully either knowing your server is in a pretty good place or knowing where to start to fix it.

 

DO YOU HAVE RPO/RTO ESTABLISHED FOR THIS SERVER IF IT GOES OFFLINE (We’ll stick to HA scenario only)?

  1. What’s RPO/RTO?
  2. No, but we have informal goals in the IT department
  3. Yes, we set this within (only) IT
  4. Yes, we have it in writing from the business

Bonus Point: Yes, we set it with business and tested (at least once) that we can meet it

 

ARE YOU BACKING UP ALL DATABASES ON YOUR SERVER?

  1. What’s a backup?
  2. No, only the ones we use the most
  3. Yes, system and user databases
  4. Yes, full backups for system and user databases, plus transaction log backups on user databases

 

ARE YOU RUNNING DBCC CHECKDB FOR ALL DATABASES?

  1. What’s DBCC CHECKDB?
  2. No, only the ones we use the most
  3. Yes, system and user databases
  4. Yes, and we log the entire output

 

DO YOU HAVE DATABASE MAIL ENABLED AND ALERTING ON THIS SQL SERVER?

  1. What’s Database Mail?  What Alerts?
  2. No, Database Mail is enabled but no alerts are configured
  3. Yes, Database Mail is configured and we receive job failure/completion and/or error alert emails
  4. Yes, we have 3rd party SQL Server-specific monitoring software

Bonus Point: What’s your software, and do you like it?

 

ARE YOU RUNNING SP_BLITZ ON YOUR SERVER?

  1. What’s sp_Blitz®?
  2. No, nothing is wrong with my server
  3. Yes, I ran it once
  4. Yes, I run it on a regular basis

Bonus Point: What shocked you the most in your results?

 

HOW WELL DID YOU DO?

There are 23 possible points.

Did you do as well as you thought?

Are you surprised by other results?

While there are several other factors that go into keeping your server safe, these are some of the things I use when I triage a client’s server. Hopefully you had a chuckle, and maybe even learned something new along the way.

CHEERS!

Tara says: I first heard about sp_Blitz at PASS 2011 when I attended Brent’s session on it. I was eager to get back to work and run it on my servers. Well that’s until I actually did run it on my servers and saw so many issues: UNSUBSCRIBE. There were things in there that I had never heard of or thought about. Do your servers a favor by running it on your servers periodically.

Previous Post
SQL Server 2016: Availability Groups, Direct Seeding, and You.
Next Post
Questions You Should Ask About the Databases You Manage

34 Comments. Leave new

  • Richie Rump
    June 15, 2016 8:51 am

    FIRST!

    Reply
  • SECOND – with 21 point

    Reply
  • 15 points here.

    Couldn’t get the bonus points, and because of where I work, I have to be careful bringing in 3rd party scripts like BLITZ…

    Reply
    • Angie Rudduck
      June 15, 2016 11:52 am

      I totally understand, Jason! We had Blitz installed at one company, but I was still afraid to run it because I just didn’t know. Now that I run Blitz on client servers all day, I realize it’s not impacting, but I stand by my concern about running scripts that dropped off the back of the internet truck in production before understanding what I was doing. 🙂

      Reply
    • 15 for me too…

      Reply
  • 16

    Reply
  • 18 (we’re using Spotlight for SQL Server and I overall like it)

    Reply
  • Stephen Falken
    June 15, 2016 11:04 pm

    21 (4+1, 4, 4, 3, 4+1)

    We’ve tested RPO/RTO three times in the last year.

    The shocking thing from first running sp_Blitz was that the DBA [still] here when I started had never used it, and that no DBCC CHECKDBs had been run on any of the 70+ servers spread across Prod, Test and Dev environments.

    Reply
    • Angie Rudduck
      June 16, 2016 10:14 am

      Hey, at least you knew that no CHECKDB was a bad thing! 🙂

      Reply
      • Iwer Mørck
        June 17, 2016 6:47 am

        Oooh have I learned a lot since I came across Brent Ozar, earlier I didn’t want to have anything to do with DBA-stuff, now I feel like an apprentice 🙂

        sp_Blitz opened my eyes to many things, including how sloppy vendor provided DBs where set up.
        Cheers

        Reply
  • First blog post as Angie Rudduck – Congratulations 🙂

    18, lost points on the business agreeing the RTO/RPO and SP_BLITZ.

    What would be the worst and roughly average scores you see on a client sites?

    Reply
    • Angie Rudduck
      June 16, 2016 10:17 am

      Thanks David!

      And what a GREAT question…. You know, I didn’t actually do the quiz for a client server, so I just looked at it again and I’d say I’ve seen servers that would score 6-9 points more than once.

      Reply
  • Andrew Gothard
    June 16, 2016 8:15 am

    Sort of 19 1/2

    We have RTO and RPO agreed with the business on a system by system basis and currently in the process of migrating onto a new infrastructure with servers tiered to reflect the RTO/RPO, but regular structured testing’s not going to start properly until the new infrastructure’s fully commissioned.

    No 3rd party monitoring tool (:-(

    Reply
    • Angie Rudduck
      June 16, 2016 10:19 pm

      I think that’s awesome how far you’ve gotten with RPO/RTO so far though! Monitoring tools can be great, but there’s a lot you can do with native alerting and DMVs to do a decent job of monitoring at least.

      Reply
  • 17 points

    We tested RPO/RTO every six months.

    We don’t have 3rd party software.

    When I begun to work with a customer, I found backups on same drive, no RTO/RPO, no logs backups, users as sysadmin, no alerts, no DBCC CHECKDB. I think before me they had 4-6 points 🙂 A scared story heard by DBA, isn’t it?

    Reply
    • Angie Rudduck
      June 16, 2016 10:22 pm

      Wow, Daniel! Your customer is lucky to have you, seriously. I knew our followers would generally have a lot of these items in place to some degree, but I think it’s fun to hear about crazy situations out there too, hehehe.

      Reply
  • Hi there,
    my score depends on what you mean with “system dbs”. We “only” backup master, msdb and user dbs. Model db is not worth doing backups as we don’t store anything in this db (except our standard db grow setting).
    SCOM isn’t really a third party tool but doesn’t fit to the answers 🙁
    Regular DBCC can be problematic on prod environment with 24/7.
    And yes, I know that we should restore such dbs on another sql server to run DBCC. This is one of the next points on my roadmap 😉
    All the best
    Gerald

    Reply
    • Angie Rudduck
      June 16, 2016 10:31 pm

      Gerald, I’ll give you points for SCOM since with enough tweaking, you can set it to monitor SQL Server, although I will say that’s a hard job. Hopefully whoever manages knows and loves SCOM.
      As for backing up model, I’d ask you how you take backups and by ignoring model, could you accidentally ignore a new and valid user database? Maintenance plans make you select each database unless you choose all databases, and Ola Hallengren’s scripts have a job specifically for system databases as well.
      CHECKDB on 24/7 prod can be tricky, you’re right. How busy are you overnight? How long does it run? At least get it running on a different server. Imagine if you don’t find corruption so late and it’s in all of your available backups; what would you tell the business that you have to accept an unknown amount of data loss? :-/

      Reply
      • Dear Angie,
        we use Tivoli TSM with TDP for SQL for DB Backups – it automaticly selects all user DBs. On dedicated customer environments without TSM we use a custom stored procedure in with dynamic database select (also with excludes on secondary databases in cause of differential backups). In all scenarios we exclude the model db.
        You are right – it took a long time to organize our sql server in groups and to configure overrides to meet our needs without generating too much alerts.
        Our’ CheckDB strategy is indeed something we have to work on because on our biggest instances (shared environment) we have abt. 230 databases. I think of something like a procedure with a control table and only do abt. 20% of the databases selected with a pattern plus abt. 10% random selected databases each night. Also I could implement an option for critical databases with daily CheckDB.
        Overall we get a score of 18

        Best regards
        Gerald

        Reply
        • Angie Rudduck
          June 17, 2016 1:01 pm

          Sounds like you know what you need to do, and now you just have to do it!

          The random percentage for CHECKDB sounds scary to me, but then I’m a pessimist so I’d be worried corruption would occur in the databases I don’t check.

          I think you could probably figure something out to either get them all done on a weekly basis (different databases on different days maybe?) or just entirely offload it to a different server.

          Good luck!

          Reply
  • Name Withheld This Time
    June 16, 2016 11:48 am

    13 – and several points are only because I joined several weeks ago and quickly installed sp_Blitz, sp_WhoIsActive, etc. and gasped and ran CHECKDB. But I haven’t done all servers…just the BI servers I’m involved with.

    So, Angie, these would be the servers you’ve seen in the 6-9 range.

    I don’t fault them too much though, they don’t have anyone in the company that has the role of DBA. I won’t tell you how big the company is nor who it is, publicly.

    Let’s just say, out of generic concern, I will be taking on the role of DBA as I can fit it in.

    Reply
    • Angie Rudduck
      June 16, 2016 10:35 pm

      Name Withheld – it’s never too late to get started! You should save the output from Blitz, BlitzIndex and anything else you run, and use it to track how awesome you are. Then show that compared to your better, safer state to your manager right before review time! 🙂

      Reply
  • 21! Woot woot! (reminds me of a birthday party once… ok enough about that…)

    I use Spotlight on SQL Server and I like it (mostly). There are a couple quirks that I can’t figure out but other than that, it’s a great help.

    Reply
    • Angie Rudduck
      June 16, 2016 10:37 pm

      I can only hope my 21st is as cool as the story you didn’t share. 🙂

      Quirks are good, people always tell me, so that’s good, right…?

      Reply
  • Timothy Sceurman
    June 16, 2016 6:23 pm

    19.
    We have an SLA for DR, and we conduct formal exercises, as well as informal tests of our scripts, but no written RPO/RTO.

    We use SQL Sentry, and are quite happy with it. It has paid for itself by allowing us to focus tuning efforts, as well as helping determine when to escalate issues.

    The most shocking thing from sp_Blitz wasn’t the output, but how it can be used to raise or shut down conversations about issues.

    Reply
    • Angie Rudduck
      June 16, 2016 10:43 pm

      Timothy, I love hearing how you’re using facts to make things clearer for people. People can usually make better decisions when they are properly informed.
      RPO/RTO can be hard, but it’s invaluable when you have it in writing. There’s nothing specifically saying that you can’t explain when an actual scenario might put RPO or RTO at risk and allow the business to make the final decisions. After all, I never want to be responsible for losing money or customers because I didn’t tell the right people that I can’t complete a task in the time I previously told them I could.

      Reply
  • 19 points on each of 40 servers, let down by lack of any formal RPO/RTO – something I’ve being banging on about for the last 5 years since I started here! However, I do at least have a policy of my own in place to do Full Backup and integrity check every night and regular log backups. I had spBlitz in place on every server within a couple of weeks, plus the various other Ozar tools too of course! I also insisted of some kind of monitoring so we’ve had Red Gate SQL Monitor on our prod servers for years. And it’s been well worth the money too in alerting us to potential disasters.

    Reply
  • Chris Harshman
    June 21, 2016 7:58 am

    18 points, which is definitely better than when I started here 5 years ago, our score was probably about 9 or 10 back then. We use HP SiteScope for general high level monitoring because it covers all our local datacenter and cloud servers, but I do most of my detailed monitoring using a home grown tool I wrote that I put several of my most frequently used queries into so I have quick drill down troubleshooting when I notice a problem on a particular server.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.