In SQL ConstantCare®, you send in your SQL Server’s diagnostic data, our systems analyze it, and we email you a list of tasks you should perform to make your SQL Server faster and more reliable.
Some recommendations are really straightforward: for example, hey, you should turn off Priority Boost. It’s really easy to determine if Priority Boost is on or not, and it’s easy to give someone the right steps to turn it off. We coded the no-brainer stuff first. Now that we’re analyzing data for over 1,000 SQL Servers, though, we’re starting to think bigger.
For example, Erik recently coded a recommendation rule:
- If your server has at least 32GB memory, and
- Max memory is set to less than half of the server’s memory, then
- You should revisit your max memory setting
Sounds straightforward, right? But hilariously, when I was picking a recommendation example for this blog post, I opened up the code for this rule, and we proceeded to have a 20-minute discussion in Slack about how the WHERE clause worked.
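To give a feel for why it sparked debate, the rule above might boil down to something like this (the table and column names here are hypothetical, not our actual schema):

```sql
-- Hypothetical sketch: flag servers with at least 32GB of physical memory
-- whose max server memory setting is less than half of that memory.
SELECT d.Server_Id
FROM Server_Diagnostics AS d
WHERE d.Physical_Memory_MB >= 32 * 1024
  AND d.Max_Server_Memory_MB < d.Physical_Memory_MB / 2.0;
```

Even a two-condition WHERE clause like this hides gotchas: SQL Server's default max server memory is 2,147,483,647 MB, so "less than half" quietly excludes the servers that were never configured at all, and integer division versus decimal division changes which servers land on the boundary.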
Building rules to generate accurate advice is hard.
So we need to roll out changes gradually.
To roll them out gradually, here are some of the tables involved:
- Registered_User – one row per customer
- Registered_User_Level – each customer can be in one or more levels (level 1 = free users, level 2 = paid users, level 3 = early access to new rules)
- Rule – one row for each thing we check about their data, like Erik’s new rule
- Rule_User_Level – which rules belong to which user levels (so Erik’s new rule is only hooked up to the rule testers)
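A simplified sketch of that schema (types and keys are my shorthand here, not the production definitions — and note that RULE is a reserved word in T-SQL, so it's bracketed):

```sql
CREATE TABLE Registered_User (
    User_Id       INT PRIMARY KEY,
    Email_Address NVARCHAR(255) NOT NULL
);

CREATE TABLE Registered_User_Level (
    User_Id    INT NOT NULL REFERENCES Registered_User (User_Id),
    User_Level INT NOT NULL,  -- 1 = free, 2 = paid, 3 = early access
    PRIMARY KEY (User_Id, User_Level)
);

CREATE TABLE [Rule] (
    Rule_Id   INT PRIMARY KEY,
    Rule_Name NVARCHAR(200) NOT NULL
);

CREATE TABLE Rule_User_Level (
    Rule_Id    INT NOT NULL REFERENCES [Rule] (Rule_Id),
    User_Level INT NOT NULL,
    PRIMARY KEY (Rule_Id, User_Level)
);

-- Which rules does a given user see? Join their levels to the rules' levels:
SELECT r.Rule_Name
FROM Registered_User_Level AS rul
JOIN Rule_User_Level AS rl ON rl.User_Level = rul.User_Level
JOIN [Rule] AS r ON r.Rule_Id = rl.Rule_Id
WHERE rul.User_Id = 1001;  -- hypothetical user id
```

The nice part of this design is that the join does all the gating: a rule is visible to a user only if they share at least one level, so rolling a rule out is just row inserts, never code changes.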
When we add a new rule, we:
- Insert a row for it in Rule
- Insert a row into Rule_User_Level for the new rule ID, for User_Level = 3 (early access to new rules)
- Figure out which users should see it first, and put them in Registered_User_Level 3
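In SQL terms, those three steps are just three inserts — something along these lines, with made-up IDs:

```sql
-- Hypothetical sketch of wiring up a new rule (IDs and names invented):
INSERT INTO [Rule] (Rule_Id, Rule_Name)
VALUES (42, 'Max memory set too low');

-- Hook the rule up to early access only
INSERT INTO Rule_User_Level (Rule_Id, User_Level)
VALUES (42, 3);  -- 3 = early access to new rules

-- Put the hand-picked testers into the early-access group
INSERT INTO Registered_User_Level (User_Id, User_Level)
VALUES (1001, 3),
       (1002, 3);
```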
When picking out which users should first see a brand new rule, we start by finding some users that should definitely get the advice, and some users that should definitely NOT get it. By this point, we’ve already tested it by manually running the queries against their data, but by letting it run on an automated basis, we’re just double-checking our assumptions.
It’s not just about testing the rule – it’s also about testing the end user reactions. People can reply and say, “Wait, that recommendation doesn’t make sense for me, and here’s why.” Their replies can influence our guidance – they might come up with something we’d never thought of before.
Then, as we gain confidence in the rule, we gradually add more users to the early-access group. Eventually, once we’re happy with it, we remove the users from the early-access group and move the rule into User_Level 2 (all paid users).
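That promotion step might look like this (again, a hypothetical sketch against the simplified schema, not our production code):

```sql
-- Promote rule 42 from early access (level 3) to all paid users (level 2),
-- and empty out the early-access group, in one transaction.
BEGIN TRANSACTION;

UPDATE Rule_User_Level
SET User_Level = 2
WHERE Rule_Id = 42
  AND User_Level = 3;

DELETE FROM Registered_User_Level
WHERE User_Level = 3;

COMMIT;
```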
How this could be expanded
This design keeps things simple enough for our needs, but I know what you’re thinking, dear reader: “How do they test individual rules across different users at the same time?” The answer is that we don’t – at any given time, we’re beta testing a few different rules across the same small group of people. (This is why we have to be pretty confident in our rules already just by running them manually.)
In the future, it’d be cool if we also:
- Identified new/beta rules with visual indicators in customer emails
- Ran experiments across the entire population – like running a rule across everyone who submitted data yesterday, and returning a list of everyone who would have passed/failed that rule
- Let users opt into riskier levels of rules – right now, we don’t want you seeing things until we’re extremely confident in the rules, but several users have told me, “we’re willing to be guinea pigs for whatever new experimental tests you want to run against our data”
To learn more about feature flags, check out:
- Feature Toggles (aka Feature Flags) by Pete Hodgson
- FeatureFlags.io – resources on how to get started, code patterns, and use cases
- Feature Flags by Bill de hÓra