We recently added some columns to sp_BlitzCache to help you remove undesirable plans from the cache. Doing this will force SQL to come up with a new plan, or you know, just re-create the old plan. Because that just happens sometimes. I answered it, but I wasn’t happy with my answer. So here’s a better one!
Handles and Hashes
SQL Server stores a whole mess of different things used to identify queries. Query Hash, Query Plan Hash, SQL Handle, Plan Handle, so on and so forth until eventually you get a collision. They identify different things differently, but let’s answer the question rather than get bogged down in pedantics.
SQL Handle is a hash of the SQL text. This includes comments, white space, different casing, etc. It will be unique for each batch.
Plan Handle is a hash of the execution plan. Both of them are stored per batch, but this one isn’t guaranteed to be unique. You can have one SQL Handle and many Plan Handles.
Query Hash and Query Plan Hash are a little bit different. They’re explained really well over here.
Thanks for reading!
Be honest, here
You don’t care about most indexing problems. Duplicates, Borderline Duplicates, Unused, Wide, Non-aligned Partitions… All you’re here for are the Missing Indexes. Wham, bam, your query finishes in less than a second, ma’am. Take this quarter, go downtown and have a rat gnaw that thing off your face.
And I get it. You’re busy! You have a million other things to do. Take a number, Non-Unique Clustered Indexes. Maybe you’re not all that Unique anyway. Maybe you’re just another dude (or dudette) with a beard (stranger things have happened) and tattoos (maybe a beard tattoo), wearing a plaid shirt (probably also a tattoo), wandering around a major metropolitan area.
You’re in a Mode 3 state of mind
When you run sp_BlitzIndex, you have some Modal options. Mode 3 gives you missing index details. Just missing index details. No Scrolling through warnings about the other horrors you’ve visited upon your poor data. But where do they come from? I’m glad you asked! No, really. I am. Brent says I’m not allowed to sleep until I hit 1000 blog posts. It’s been over a year. Shadows have voices.
SQL is lazy. Horribly lazy. It takes that ANSI-spaghetti query you halfheartedly wrote while waiting for lunch to show up and turns it into results magic. Before it does that, it goes through a process called Simplification, where it rewrites your query a bit by removing contradictions (where col = ‘thhbbbpppttt’ and col = ‘GILBERRRRRRRRT’). Some other stuff goes on during this process, but let’s stick to index matching. Once it has the query you meant to write ready, El Optimizer will think to itself “I’d really like to find indexes like these…” and goes hunting for them. When it doesn’t find its index soulmate, it logs that in the missing index DMV.
But there are problems
Not, like, world-ending, life-deaffirming problems. Just stuff you should be aware of. It’s very query specific, this DMV. It’s not looking at your whole workload, the way you should be doing, like a responsible DBA and/or Developer. Query runs. Optimizer wants index. Request is logged. This index may almost match an index you already have, match another request from a similar query, and sometimes, they’re just a cry for help.
Why just a cry for help? Sometimes the request will only be estimated to improve the cost of the query by something like 50%. Or less. Way less. And much like swallowing a handful of Good & Plentys, these requests are just a sign that something is really horrible. So horrible that The Optimizer can’t figure out a very helpful index. If you throw functions on columns in joins or where clauses, SQL won’t even bother asking for a index, because what would it do with it? Whole lotta nothin, that’s what.
It will also sometimes ask for some insane indexes. Ever see those requests that are like one key column, and then include every column? That’s because someone is running SELECT *, or the equivalent in EF, because they’re lazy and didn’t just pick the columns they actually need. It will also ask for MAX columns as includes, because it doesn’t check those things first. It’s fine, it just becomes your job to look at these things first.
No missing indexes, I’m a genius
Thanks for reading!
In celebration of AdventureWorks being out of business for a decade
Microsoft has released a couple new databases. Wide World Importers is a fresh-faced start up, probably a drug smuggling front, run out of a basement office next to a temp agency.
There’s some JSON involved. No one ordered bikes. It’s a hoot.
The OLTP database is about 120mb, and the data warehouse is around 50mb. Hopefully no inflation will be necessary.
Head on over and check them out.
Thanks for reading!
Brent says: as a presenter & author, I gave up on the Microsoft sample databases and switched over to StackOverflow’s public database. It’s elegantly simple (just a handful of tables), has real-world data distribution, there’s plenty of interesting sample queries available, and is big enough that you can demonstrate real performance tuning issues. (120MB? Really?)
This post covers two scenarios
You either created a database, and the sync failed for some reason, or a database stopped syncing. Our setup focuses on one where sync breaks immediately, because whatever it’s my blog post. In order to do that, I set up a script to create a bunch of databases, hoping that one of them would fail. Lucky me, two did! So let’s fix them.
You have to be especially vigilant during initial seeding
Automatic failover can’t happen while databases sync up. The AG dashboard reports an unhealthy state, so failover is manual. The good news is that in the limited test scenarios I checked out, Direct Seeding to Replicas will pick back up when the Primary is back online, but if anything really bad happens to your Primary, that may not be the warmest or fuzziest news.
Here’s our database stuck in a restoring state.
Now let’s look in the error log. Maybe we’ll have something good there. On the Primary…
Unknown,The mirror database "Crap903" has insufficient transaction log data to preserve the log backup chain of the principal database. This may happen if a log backup from the principal database has not been taken or has not been restored on the mirror database.
Okie dokie. Good to know. On the Replica, you’ll probably see something like this…
Automatic seeding of availability database 'Crap903' in availability group 'SQLAG01' failed with an unrecoverable error. Correct the problem then issue an ALTER AVAILABILITY GROUP command to set SEEDING_MODE = AUTOMATIC on the replica to restart seeding.
Oh, correct the problem. You hear that, guys? Correct the problem.
IF ONLY I’D THOUGHT OF CORRECTING THE PROBLEM.
So what do we do? We can check out the AG dashboard, see a bunch of errors, and then focus in on them.
Alright, let’s see what we can do! We can run a couple magical DBA commands and see what happens.
ALTER DATABASE [Crap903] SET HADR RESUME
ALTER DATABASE [Crap903] SET HADR AVAILABILITY GROUP = SQLAG01;
The two errors were:
Msg 35242, Level 16, State 16, Line 1
Cannot complete this ALTER DATABASE SET HADR operation on database ‘Crap903’.
The database is not joined to an availability group. After the database has joined the availability group, retry the command.
Msg 1412, Level 16, State 211, Line 1
The remote copy of database “Crap903” has not been rolled forward to a point in time that is encompassed in the local copy of the database log.
Interesting! What the heck does that mean? If Brent would give me his number, I’d call and ask. I don’t understand why he won’t give me his number. Well, let’s just kick this back off. We kind of expected that not to work because of the errors we saw in the log before, but it’s worth a shot to avoid taking additional steps.
ALTER AVAILABILITY GROUP [SQLAG01] REMOVE DATABASE [Crap903] GO ALTER AVAILABILITY GROUP [SQLAG01] ADD DATABASE [Crap903] GO
Right? Wrong. Digging into our DMVs and Extended Events, they’re telling us that a database with that name already exists. What’s really lousy here is that this error doesn’t appear ANYWHERE ELSE. It’s not in the dashboard, it’s not in regular, documented DMVs, nor in the XE health session. It’s only in the undocumented stuff. If you’re going to use this feature, be prepared to do a lot of detective work. Be prepared to cry.
What we have to do is go back, remove the database from the Availability Group again, then drop it from our other Replicas. We can’t just restore over what’s already there. That would break all sorts of laws of physics and whatever else makes front squats harder to do than back squats.
Since our database is in a restoring state, it’s a few steps to recover it, set it to single user so no one does anything dumber than our AG has done, and then drop it.
When we re-add the database to our Availability Group, it should start syncing properly. Lucky for us, it did!
There’s no Tinder for databases.
New features are hard
With direct seeding, you have to be extra careful about named instances and default database creation paths. If you used named instances with default database paths to Program Files, or different drive letters and folder names, this isn’t going to work. You don’t have an option to change those things. SQL expects everything to be there in the same place across all of your Replicas. I learned that the annoying way. Several times. Troubleshooting this was weird because I still can’t track down a root cause as to why anything failed in the first place. For the record, I created 50 databases, and two of them didn’t work for some reason.
Correct the problem. Just correct the problem.
Thanks for reading!
It takes as little as one word. One word from a co-worker, blog commenter, or Stack Overflow user can make you feel like a champ. (Or a chump, depending on the word.)
(Let’s focus on feeling like a champ.)
What’s the greatest compliment you ever received about your professional work? Did someone compliment your tenacity, your calm under pressure, your ability to make chaos orderly, or maybe how you rescued a doomed project? Do you remember you felt when you heard it? Does it still influence you today? Share it in the comments!
Here’s mine, from 2010: “You’re a great researcher.”
When I think about all the different ways being a good researcher makes me better at my work, I can’t help but put that compliment at the top of my list. I’ll never forget it. What compliment will you never forget?
Erik says: The best compliment I ever got? “Your a idiot.”
Brent says: “You’re taller than you look on the webcasts.”
These things used to confuse me so much
Despite having worked at a Market Research company for a while, I know nothing about statistics, other than that project managers have all sorts of disagreeably subjective phrases for describing them. Vast majority, convincing plurality, dwindling minority, et al. Less talky, more picture.
When I started getting into SQL Server, and learning about statistics, I heard the same phrases over and over again, but wasn’t exactly sure what they meant.
Here are a few of them:
This tells you how special your snowflakes are. When a column is called “highly selective” that usually means values aren’t repeating all that often, if at all. Think about order numbers, identity or sequence values, GUIDs, etc.
This is sort of the anti-matter to selectivity. Highly dense columns aren’t very unique. They’ll return a lot of rows for a given value. Think about Zip Codes, Gender, Marital Status, etc. If you were to select all the people in 10002, a densely (there’s that word again) populated zip code in Chinatown, you’d probably wait a while, kill the query, and add another filter.
If you mash selectivity and density together, you end up with cardinality. This is the number of rows that satisfy a given predicate. This is very important, because poor cardinality estimation can arise from a number of places, and every time it can really ruin query performance.
Here’s a quick example of each for a 10,000 row table with three columns.
USE [tempdb]; WITH x AS ( SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS [rn] FROM sys.[messages] AS [m] ) SELECT [x].[rn], CASE WHEN [x].[rn] % 2 = 0 THEN 'M' ELSE 'F' END AS [Gender], CASE WHEN [x].[rn] % 2 = 0 THEN 'Married' WHEN [x].[rn] % 3 = 0 THEN 'Divorced' WHEN [x].[rn] % 5 = 0 THEN 'Single' ELSE 'Dead' END AS [MaritalStatus] INTO #xgen FROM [x] /*Selectivity*/ SELECT COUNT_BIG(DISTINCT [x].[rn]) FROM [#xgen] AS [x] SELECT COUNT_BIG(DISTINCT [x].[Gender]) FROM [#xgen] AS [x] SELECT COUNT_BIG(DISTINCT [x].[MaritalStatus]) FROM [#xgen] AS [x] /*Density*/ SELECT (1. / COUNT_BIG(DISTINCT [x].[rn])) FROM [#xgen] AS [x] SELECT (1. / COUNT_BIG(DISTINCT [x].[Gender])) FROM [#xgen] AS [x] SELECT (1. / COUNT_BIG(DISTINCT [x].[MaritalStatus])) FROM [#xgen] AS [x] /*Reverse engineering Density*/ SELECT 1.0 / 0.00010000000000000000 SELECT 1.0 / 0.50000000000000000000 SELECT 1.0 / 0.25000000000000000000 /*Cardinality*/ SELECT COUNT_BIG(*) / COUNT_BIG(DISTINCT [x].[rn]) FROM [#xgen] AS [x] SELECT COUNT_BIG(*) / COUNT_BIG(DISTINCT [x].[Gender]) FROM [#xgen] AS [x] SELECT COUNT_BIG(*) / COUNT_BIG(DISTINCT [x].[MaritalStatus]) FROM [#xgen] AS [x] DROP TABLE [#xgen]
Bigger by the day
A lot has been written about cardinality estimation. SQL Server 2014 saw a total re-write of the cardinality estimation guts that had been around since SQL Server 2000, build-to-build tinkering notwithstanding.
In my examples, it’s all pretty cut and dry. If you’re looking at a normal sales database that follows the 80/20 rule, where 80 percent of your business comes from 20 percent of your clients, the customer ID columns may be highly skewed towards a small group of clients. It’s good for SQL to know this stuff so it can come up with good execution plans for you. It’s good for you to understand how parameter sniffing works so you understand why that execution plan was good for a small client, but not good for any big clients.
That’s why you should go see Brent in person. He’ll tell you all this stuff, feed you, give you prizes, and then you go home and get a raise because you can fix problems. Everyone wins!
Thanks for reading!
Five years ago, we turned BrentOzar.com into a consulting and training company. Now look at us. (Well, maybe not too closely, we’re getting a little gray and wrinkly.)
- Video classes are now 50% off with coupon code HighFive
- That includes bundles, too, which brings the epic Everything Bundle down to just $449!
- Two of our in-person training classes are 50% off too – the Senior DBA Class of 2016 in Philly in August, and the Performance Tuning class in Portland in August – just use that same HighFive coupon code
- Free Everything Bundle with all training seats bought during the sale
Head on over and shop now – the sale ends when April 2016’s over.
Enjoy, and here’s to another five years of cartoons, retreats, webcasts, free tools, horrific SQL Server problems, and laughter about the whole thing.
Sometimes you need garbage
Not because DBAs are the IT equivalent of Oscar the Grouch, but maybe you want to post a question on a forum, and you don’t want to use your own data. At least, you probably shouldn’t just post your own data without masking it. But masking data is annoying, and by the time you get everything in order, someone’s breathing down your neck for an answer.
Your other option is to write T-SQL to generate random data of your choosing, but that can be daunting too. Generating random numbers and dates between ranges, creating relationships, etc. isn’t a ton of fun. Adding more tables makes the whole process exponentially caustic.
Enter the website
A website I really like to help generate data is over here. It’s really easy to set up data the way you want, and you get a lot of neat customization options for a wide variety of data types.
You can have it generate the data to just about any format you can think of: CSV, Excel, SQL, JSON, HTML, XML, and you have all sorts of options for each type. It’s really awesome.
Next time you need to generate some test data, consider trying this site out. It’s been really helpful for me on a bunch of occasions. If you like it, donate. There’s no such thing as a free table.
Thanks for reading!
During this month’s Senior DBA Class in Chicago, I started joking with the attendees about what songs they should play during common DBA job tasks, like:
- Best Song to Play During a Server Outage
- Best Song to Illustrate Your Predecessor’s Skills
- Best Song to Accompany an Index Rebuild
- Best Song to Play When Examining a Slow Query
- Best Song to Play When Corruption is Found
- Best Song for a SQL Server with 4GB RAM
- Best Song to Play While a Cursor Runs
- Best Song to Prepare for a SQL Server Service Pack
We wanna hear your ideas.
- To submit your songs, leave a comment here (BrentOzar.com) before Wednesday, March 23rd, 2016.
- Your comment must include the category, the song & singer’s names, and the link to the song’s video. You can enter multiple times.
- The winners will be chosen by us, the Brent Ozar Unlimited team, judged using completely unreliable and unpredictable methods. Winners will be announced within one week of the contest closing.
- If multiple people submit the same song in a category, the winner will be the first one.
- Taxes are the responsibility of the winners.
- The winner in each category will get a training video class of their choice. (No bundles, just individual classes.)
- The overall winner will get an Everything Bundle.
I’ll get you started with my favorite song to play during a server outage.
Update March 23rd – comments have been closed. Let the judging commence!
Angie, Doug, Richie, and Tara get together for this week’s Office Hours to talk about how many cores you should leave for the operating system, how many unions you should do in a single query, and how often you should install patches.