[Video] Can You Tell When a Rollback Will Finish?

I happened to see this DBA.StackExchange.com question about killing a big delete, and I really felt for the admin involved:

Four days ago a user ran the command below on a table with 400,000,000 rows. It’s still running and the log file is increasing in size.

Hoo boy. When you kill a query, it’s hard to know when it’s going to finish. WITH STATUSONLY often shows you 0% complete, 0 seconds remaining – no matter how much longer it’ll take. To make matters worse, it doesn’t necessarily match up with how long the query took to execute, either: rolling work forward can use multiple threads, but rollbacks are single-threaded.

I’ll demo it with the big Stack Overflow database, specifically the monster PostHistory table. I’ve created a few indexes on it to make the delete even more interesting:

It’s got over 100M rows, with a clustered index size over 170GB and the indexes at around 1-2GB each. Say we “accidentally” forget to highlight the WHERE clause in this statement:

If we forget the WHERE clause, and the whole table starts getting deleted, we can get a rough idea of what’s going on by running sp_BlitzWho or sp_WhoIsActive, both of which can show you live query plans since SQL Server 2016 SP1:

Click on the live query plan:

And zoom in to look at specific operators, and you can see how much progress they’ve made so far:

Isn’t that neato? Anyhoo, it doesn’t save our poor hero – it just gives her a better idea of what’s happening with a running query.

But only a running query, not a rolling back query. Because watch what happens when we kill the query:

And then we try to run sp_BlitzWho again to get the live query plan:

Damn. No live query plan available. Live query plans are only for the roll-forward process, not the roll-back. If you want to get status on a rollback, you have to run KILL 68 WITH STATUSONLY, which gives you the exact progress of the rollback, and an atomic-clock-precision-calculated estimate of how much time is left:

Okay, I lied. It’s not accurate. As I show in this video, it’s laughably inaccurate. It basically either shows 0% completed or 99% completed, and either no time remaining, or the rest of man’s time on earth remaining:

It’s a known bug, too. Microsoft hasn’t fixed it because you haven’t upvoted it. You should probably do that.

Related links to keep the learning going:


Why Ordering Isn’t Guaranteed Without an ORDER BY

If your query doesn’t have an ORDER BY clause,
you can’t reliably predict the order of your results over time.

Sure, it’s going to look predictable at first, but down the road, as things change – the indexes, the table, the server’s configuration, the size of your data – you can end up with some ugly surprises.

Let’s start with something simple: we’ll do a SELECT from the Stack Overflow Users table. The clustering key on that table is Id, the identity number that starts at one and goes up to a bajillion. When we do this SELECT, the data comes back in the order of the clustered index:

But if someone creates an index on DisplayName and Location, then suddenly SQL Server will choose to use that index rather than the clustered index:

Here’s the execution plan as proof:

Why did SQL Server choose to use that index even though it didn’t need to have DisplayName and Location sorted? Because that index is the narrowest/smallest copy of the data that SQL Server needs to fetch. Let’s examine the index sizes with sp_BlitzIndex:

The clustered index of the table (CX/PK) has 8.9M rows, and it’s 1.1GB in size.

The nonclustered index on DisplayName/Location also has all 8.9M rows, but it’s only 368MB in size. If you have to scan the entire thing in order to get your query results, why not scan the smallest object, which will get done more quickly? And that’s exactly what SQL Server chose to do.

“Yeah, but my query has a WHERE clause.”

Okay, now that we’ve got an index on DisplayName, Location, try running a query that looks for a particular DisplayName. The results come back in DisplayName order:

And the execution plan uses the index:

But now if you try for a different username, they’re not sorted by name at all:

Because SQL Server decided that there are more Alexes, so it made more sense to do a clustered index scan rather than the seek + key lookup:

These are simple examples, and I could go on.

Even in these really simple cases, you can’t guarantee that SQL Server will always use the copy of the data that you expect. Over the last few weeks, I’ve encountered a lot more complex examples:

  • Someone dropping an index that was used by a query
  • Someone turned on Forced Parameterization, which changed SQL Server’s row estimates for a query plan, causing it to make different index choices
  • Someone changed the Compatibility Level of a database, introducing a newer Cardinality Estimator, which resulted in a Different Plan

If you need the data to be ordered tomorrow, when you’re not looking, just put an ORDER BY in there. Your grandchildren will thank you when they have to troubleshoot your query.


[Video] Altering Datatypes With (Almost) No Downtime

Sometimes, your INT columns run out of space, and you need to alter them to be a larger datatype, like BIGINT. Usually, when you do this, you’re met with screams of agony from your end users because this will totally lock the table and rewrite the whole thing:

In SQL Server 2016 & newer, you can even kinda-sorta do it online with this syntax – but only if there are no indexes referring to the column:

However, Gianluca Sartori (@SpaghettiDBA on Twitter) wrote about a brilliant technique:

  • First, turn on data compression on the table (which can be done ahead of time, and on Enterprise Edition, can even be done with ONLINE = ON, and even works when there are indexes on OwnerUserId)
  • Second, alter the datatype, which will be instantaneous
    (although it does require a schema modification lock)

Step 1: enable compression.

Let’s see it in action on the Stack Overflow database. Let’s say I want to alter the OwnerUserId column on the Posts table, which would normally be a giant pain in the rear. First, we’ll enable compression:

You can use either row or page compression (thanks to Paul Rizza for suggesting that I try page).

This rebuild is a big ugly operation that rewrites the whole table, so it’s going to generate a lot of transaction log usage and change a lot of data pages. The good things, though, are:

  • You can pick when you want to do that rebuild
  • It doesn’t have to be the same time that you alter the table
  • The rebuild can be online if you specify like I did above (assuming that you have Expensive Edition Enterprise Edition)

Step 2: alter the column.

After that rebuild finishes, I can alter the datatype:

(If there are indexes on that column, I still have to drop that index first – but now at least we’re talking about a smaller outage window, because I only need to drop the index at the moment in time when I need to alter the table – not when I do the rebuild and rewrite all the 8KB pages.)

In my case above, I did it in a transaction because I wanted to show you the locks that are required in order to make this change. The change does finish instantly and does no page reads, as evidenced by SET STATISTICS TIME, IO ON:

But it isn’t pain-free: we still do need a schema modification lock in order to do this, as shown by sp_WhoIsActive. If I try to do a SELECT against the same table, it’s going to be blocked until I commit the ALTER TABLE command:

So what this means is that you just need to prepare the ALTER TABLE command for a time window where you can deal with the higher writes due to data & log file work, buuuuut it doesn’t need to be a zero-load window. People can still be deleting/updating/inserting into the table while the REBUILD runs.

Then, your ALTER TABLE still needs locks, but it’s only for a really short amount of time (think milliseconds.) If I’d had an index on OwnerUserId, I could immediately start creating that index again (even with ONLINE = ON to minimize the outage.)

You’re gonna need this technique someday, trust me. You should probably just go say thanks to Gianluca in his blog post comments now. You’re welcome. When you’re done with that, you can read more details about how this works thanks to Paul White.

Livestream of me writing this blog post

I talk through my thought process and demonstrate it in this 40-minute video:

If you wanna watch me live stream as I write blog posts or work on the First Responder Kit, follow me on Twitch.


What’s Your Favorite SQL Server Interview Question? Here’s Mine.

“Out of the stuff on your resume, what are you the proudest of, the most excited about?

When I’m interviewing a candidate for one of my clients, I’ll usually start with this question to help understand the resume. See, recruiters seem to actively encourage folks to keyword stuff their resume: to dump in everything and the kitchen sink, every task they’ve ever performed. Recruiters see the resume as a place to play Keyword Bingo.

Asking the candidate what they’re the proudest of is kinda like saying, “If you could set the font sizes in this resume, which phrases would be in bold, and which would you shrink down to tiny italics?”

It lets them pick their favorite topic, start geeking out, and open up to you.

If you’re the one asking the questions, there’s an important rule: you’re not allowed to correct them. All you can do is ask encouraging followup questions. Even if they say something completely outlandish, like “Cauliflowers are the root cause of forwarded CXPACKET,” you have to nod and say things like, “Do you remember how you first learned about that?” Forgive small missteps – interviewing is so stressful – but just start building a mental picture of their seniority level, what they love to do, and what might just be a keyword on their resume stuffed in by a recruiter.

What’s your favorite interview question?

Share yours in the comments. Don’t share the exact answer – I wouldn’t want people Googling this and getting all the answers – but feel free to type in the kinds of things you look for in a candidate’s answer (and maybe things you DON’T want to hear, too.)

I’ll host a live stream this afternoon and on Friday morning, and I’ll pick my favorites, and we’ll discuss ’em on my live stream.

  • Monday, April 20 at 4PM Eastern
  • Friday, April 24 at 9AM Eastern – download iCal

And then for May’s free training, I’m going to focus on database administration: I’ll do an updated live version of my DBA Job Interview Q&A Kit, and the recordings will be free all May long too:

Here’s the recording of Monday April 20th’s livestream:


Upgrading My Training Game Again: My Home Office Studio Setup: Spring 2020

When I last blogged about my home office setup in December, I’d just invested in a new massive beast of a desktop: a Mac Pro with a 16 cores, 96GB RAM, and a AMD Radeon Pro Vega II 32GB.

Now it’s time to put that horsepower to work.

In the past, I’ve used GoToWebinar for my training classes. Attendees saw one window for my desktop, and one window for my webcam feed:

That works alright, but the thumbnail video of me takes up a ton of space. Attendees could move it, make it float, or hide it altogether, but it just wasn’t a great experience for live attendees. In post-production, I merged them so that my video floated over the recording, but…that only worked in the recorded version, not the live webcast, and it took time for me to edit/produce/upload.

Now, I’m using Streaming Technology™ like the cool kids.

The beefy new Mac Pro lets me process the effects live, stream them simultaneously to YouTube, Twitch, and Facebook, and embed it live at BrentOzar.com:

This has so many fun advantages:

I can pack the training & the webcam into less space . I can float right over the SSMS and slides, interacting directly with execution plans like some kind of SQL Server weatherman. (“I see a blocking storm coming in here…”)

I can meet people where they are. Since I simulcast to multiple services (Facebook, Twitch, and YouTube), my readers at any of those sites get notifications whenever I’m live. If you happen to be perusing YouTube when I’m broadcasting, you’ll get a recommendation to go check it out. (Only the free classes and the live coding will be simulcast everywhere, though: the paid ones will still be private at BrentOzar.com.)

I can skip GoToWebinar’s registration process. Those of you with a Live Class Season Pass can just hop into that streaming page whenever I’m having a paid private class and hang out with me. (And psst, my friends with fancy accents: avoiding moving registration data around means I’m a lot closer to being able to sell classes to the EU/EEA! More news on that over the coming months.)

Video & audio quality is better. It’s so good that I’ve got a new a fancypants DPA headset with a built-in earphone like a talking head on TV because I want you to have the best listening experience possible. (Also, because I wanna listen to good background music while I do live coding, and I can’t have that music going through the live stream without running into copyright problems.)

I’ve been working on my production game, too.

I’ve been relying on my Elgato Stream Deck more and more. It’s a little customizable keyboard with displays on each key:

The first 3 columns are sets of sound effects. Each key triggers a random sound – for example, the “Cheer” button picks a random sound from a group of people cheering. (This way it doesn’t sound exactly the same each time folks cheer.) These are on the far left because they’re the easiest to grab by feel, without looking.

Columns 6 is for streaming production, like fading the music out when I start, changing my desktop resolution, turning on my webcam & background lights, and going live with the stream. I have these in the middle because I don’t mind looking directly at them to pick the exact right button.

It looks like a lot of empty keys, but I’ve got big plans for the rest, hee hee!

The end result is pretty cool. Wanna see it live?

Every Saturday & Sunday morning from 6AM-8AM Pacific (iCal), you can catch me on my TwitchYouTube, or Facebook channels. The recordings are also available there free. I think this is the silver lining in the COVID19 cloud: as crazy as it sounds, we’re all going to be even closer than we were before. More of us are going to come out of this doing live streaming and online collaborations, and you’re going to have more ways to keep your skills sharp for less money (or completely free), all from the comfort of your own home. You can watch me:

Wanna see what they’re like? Here’s a Saturday stream when I worked on pull requests in the First Responder Kit:

And here’s a Sunday stream when I wrote a blog post about date table query plans and told the story of when Microsoft offered me a job:


Want SQL Server Training? Copy/Paste This Email To Your Boss.

Well, that escalated quickly, huh?

Just a few short months ago, we were all making plans about what we wanted to learn in 2020. We sketched out our goals, our conference plans, maybe even how we wanted to do our first user group presentations or SQL Saturday presentations.

These days:

  • The events you wanted to attend are canceled or postponed
  • Your company banned travel (either for health reasons, or budgetary ones, or both)
  • Your family isn’t too keen on you leaving, either
  • It feels like your projects are all in a holding pattern

So, what’s the rational thing to do? Learn online instead.

But here’s the thing: you have to take control of your training. In this economy, nobody’s gonna hand you a golden ticket and take care of you. You’re gonna have to take a risk, go to your manager, and lay out your case. In this post, I’m going to show you why to do it, and how.

“Brent, why would I wanna learn from you?”

About a decade ago, when I started working with SQL Server, I really struggled. The learning material was dry, boring, and unrelated to the actual stuff I worked with on a day-to-day basis. It was theory, mostly taught by people who’d never been an actual database administrator. It was like a blind person teaching you how to take pictures.

So over the years, when it became my turn to teach and give back, I was determined to have fun while discussing real-world topics that your users would notice right away. I designed the courses that I would wanna take myself:

  • I do a lecture, explaining concepts with a mix of slides & scripts
  • I pass it over to you, and you do a real-world-style lab exercise where you have to prove what you just learned (and you get all the scripts & tools necessary to do it on your own)
  • Then you watch me do the same lab exercise, and you can see how my thought process works, and pick up on things you missed out on during the lecture

It’s super-lively, super-interactive, and you can get a feel for it by watching me teach Fundamentals of Query Tuning free today. Of course, that’s just an introductory Fundamentals class – the Mastering classes get much more challenging, complete with hands-on labs.

“What do the Mastering class students say?”

After every class, I encourage students to leave a review, and I don’t edit ’em. Every single review goes out as-is, typos and all, hahaha. That’s because I’m really proud of what the students say:

“This is an excellent course as is the Mastering Query Tuning course. Brent’s teaching style is effective, efficient and entertaining. I learned a lot and have already put things I’ve learned to use in my job. I really like that he answers questions posted in Slack as well. The format is very interactive and has a similar feel to an in person class with other students. I highly recommend this course.” Laura Beranek

If there is one master class you take, it should be all of them. This class is a fantastic study to improving performance solely through indexes. Amaze your clients/bosses with newfound performance. Your biggest challenge will be convincing your clients to let go of old, tired indexes that aren’t being used. Tune your indexes to DEATH.” Sam Carey

This was my first experience with a multi-day on-line class. It could not have been better. Having the Q&A in Slack with the other students is brilliant.” Steve Malcolm

“This class is a must-have for database professional. Brent’s class delivery is impeccable. He is very proficient and engaging. He demystified performance tuning with a very effective methodology that I put in use at my work right away. I especially appreciate the way he talks through his logic of resolving problems. After years of being a DBA, I feel a lot of knowledge he shared was still eye popping/opening to me. His performance tuning training series are the most useful classes I have had.” Angela Couch

“I watched several Brent’s free online training before and finally got my boss to pay for the live season pass. Boy, it is life changing. First, I feel shocked to know how little I knew before. As I went through the three mastering tuning classes, I became more confident and by the time I finished this mastering server tuning class, I felt so empowered to tackle performance problem. I thoroughly enjoy all his classes, especially his humorous consulting stories and practical hand-on labs. (His “clippy” voice is hilarious!) It is also valuable to learn from other students through interactive slack channel. Brent is a very effective instructor. Unlike other training I went to, I am able retain the maximum amount of information from his classes and apply them at my workplace. It is amazing that he even offers every student to ask his tech questions outside the class. His training is no doubt “the best bang for your buck” and opens “a whole new world”! Thank you, Brent!” Angela Couch

“Okay, I’m sold. How do I convince management?”

Copy/paste this into an email with a subject of “SQL Server online training – need to join before April 21”


Hey – I know a conference or training class is totally out of the question this year, but Brent Ozar, a Microsoft Certified Master who runs one of the biggest blogs & maintains sp_Blitz and the First Responder Kit, is running a big sale on his online training classes.

Here’s why we should sign up:

  • I can start learning right now – his next round of Mastering classes starts on April 21.
  • I can learn for a year straight – he’s selling a pass that lets me attend all of his online classes.
  • I can do it from home – he’s been teaching online classes for years.
  • It’s way less expensive than a conference – and no worries about me being unavailable due to travel problems or quarantines.

His Live Class Season Pass is usually $3,995, but he’s running a 65% off sale for just $1395 due to the quarantines.

To get me a ticket, go here: https://www.brentozar.com/product/2018-live-class-season-pass/ You can pay with a card online, or you can even pay via PO or invoice, but if you do those, he has to receive payment before I can start attending classes.

If you have any questions about how it works, his email is Help@BrentOzar.com.


And that’s it. Copy/paste that in, hit send, and see what happens. You’ll never know if you don’t try – good luck, and see you in class!


[Video] How Can I Measure If My Database Does More Reads Than Writes?

When someone asks you, “Is this database read-intensive or write-intensive?” you probably look at sys.dm_io_virtual_file_stats to measure the file activity – but that isn’t really correct. After all, your users run a lot of queries all day that are simply satisfied from the buffer pool. SQL Server caches your data so it doesn’t have to hit the disks, but … those are reads, right? (Write? Get it?)

There’s a better way.

SQL Server is constantly tracking how many reads & writes each index gets. You can query sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats – in this blog post, I’m going to choose the former because it’s a little easier to write about. For the differences between those two and which one you should use when, hit this module of Mastering Index Tuning.

Here are the contents of sys.dm_db_index_usage_stats on my lab server:

That’s a good start because it shows seeks, scans, lookups, and updates – but we’re only seeing database, object, and index IDs. You, dear reader, probably want a little more data about which databases & tables we’re looking at, so here you go:

Here are my results:

Oops: we have our first problem. As I was writing this post, I booted up my lab VM for the first time today – it’s a bright eyed 6:14AM Pacific as I write this, being watched by countless strangers. Because my database only has activity in a couple of tables, these numbers only reflect a couple of tables. However, in terms of measuring reads vs writes, I only really wanna know about tables that have had recent activity, so these numbers will work just fine for me.

Next, I probably want to group this data together by database, so here’s a query for that:

Voila:

Some of you are doubtlessly going to say, “Hey Bert, I need this rounded to two decimal places and totaled across the entire server. Please do my job for me.” To which I will reply, “I am busy making coffee, please do the needful.” Instead, I’m going to give you a video of me writing this very blog post:

Related links to keep the training going:

I’m not going to schedule these ad-hoc live streams ahead of time – it’s just whenever they fit into my schedule – but if you want to be alerted when I start broadcasting, here’s how – I live stream simultaneously in all of these places:


Announcing Two Free Live Fundamentals Courses This Wednesday & Thursday

Fundamentals of Index TuningDuring the quarantines, I’m trying to keep you busy with free online training. Last week, I unveiled the new free Quest Database Training Days, and this week, I’ve got another one.

I’m going to teach my Fundamentals classes live, free!

The class will be streamed at BrentOzar.com/training/live, plus simulcast on my Facebook page, my YouTube channel, and my Twitch channel. No registration is required, and you can join in from anywhere, including the EU/EEA.

If you want to follow along with the lab homework or ask questions during the classes, make sure to read the Live Class Logistics info for the index class and the query tuning class. Just scroll down on each of those pages. Super important – we’re going to have a lot of students in these classes, and I’m not going to be able to take the time to get you up to speed once the broadcast starts.

See you in class!


[Video] Watch Brent Work on sp_Blitz

During the quarantines, I’m looking for new ways to help you level up for free.

I decided to start live-streaming whenever I’m working on things I can share with you publicly. I wanted to add a new check to sp_Blitz, so I figured I’d do it live on camera while you watch, and talk you through my thought process.

In this 52-minute video, you’ll see me working in Azure Data Studio and Github as I explain why it’s hard to add DMV-related checks since SQL Server’s available columns change over time. I end up using dynamic SQL to make it work on both SQL Server 2008 and 2012+.

Related links to keep the training going:

I’m not going to schedule these ad-hoc live streams ahead of time – it’s just whenever they fit into my schedule – but if you want to be alerted when I start broadcasting, here’s how – I live stream simultaneously in all of these places:


Congrats. Y’all Raised $8,919 for Doctors Without Borders in March.

I hate posting things on April 1, but there’s no laughing here, only cheering.

During our March quarantine sale, I donated 10% of all sales to Doctors Without Borders.

I’m proud to announce that y’all raised $8,919 to provide lifesaving medical humanitarian care in over 70 countries.

I’m excited to see you in class. Y’all are good people. Keep calm and carry on, stay safe and healthy, practice physical distancing, and wash your hands. I need every one of y’all around as readers.


Announcing Quest Database Training Days: Free Live Training with Me, Janis, and Pinal

During the quarantines, I’m trying to find as many ways as possible to get y’all free training. Today & tomorrow are the last days that you’ve got free access to the recordings of my Fundamentals of Index Tuning and Fundamentals of Query Tuning classes – get those while you can, because they’ll disappear when March is over.

For your next round of free online training, I’ve partnered with Quest Software to bring you Quest Database Training Days:

  • April 8: Kick Start! SQL Server 2019 Performance Tips and Tricks with Pinal Dave
  • April 15: Top Five SQL Server Query Tuning Tips with Janis Griffin
  • April 28: How to Measure Your SQL Server with me
  • May 5: Why Defragmenting Your Indexes Isn’t Helping with yours truly
  • May 12: The New Robots in SQL Server 2017, 2019, and Azure with me

Register here. See you in class – online only, of course. Keep practicing that physical distancing, wash your hands, and help flatten the curve.


[Video] What Percent Complete Is That Index Build?

SQL Server 2017 & newer have a new DMV, sys.index_resumable_operations, that show you the percent_completion for index creations and rebuilds. It works, but…only if the data isn’t changing. But of course your data is changing – that’s the whole point of doing these operations as resumable. If they weren’t changing, we could just let the operations finish.

Let’s see how it works to understand why the numbers aren’t really reliable:

Well, that’s … disappointingly inaccurate.

Here are the demo scripts if you want to give it a shot on your own systems:


Updated First Responder Kit and Consultant Toolkit for March 2020

In the immortal words of “Orange” Julius Caesar as written by Shake’s peer, “Beware the bugs of March.” It took us 2,064 years, but we’ve finally fixed those bugs, plus added a couple of new features.

To get the new version:

Consultant Toolkit Changes

Updated the First Responder Kit with this month’s script updates.

sp_AllNightLog Changes

  • Fix: skip databases where ignore_database <> 0 in the backup_worker table. (#2308, thanks Maxiwheat.)

sp_Blitz Changes

  • Fix: don’t error out when some databases have numeric roundabort turned on, and a rounding error would normally throw an error. (#2302, thanks DJH.)
  • Fix: bug in full backup check: if msdb.backupset contained backups taken on other servers, we would have included them as successful backups when they really weren’t. (#2313, thanks Solomon Rutzky.)
  • Fix: uninstall script now works on case-sensitive collations. (#2307, thanks Protiguous.)

sp_BlitzCache Changes

  • Improvement: added a new check for the UserStore_TokenPerm cache being >10% of the buffer pool. I’m seeing issues out in the wild where this cache has taken over memory and caused SQL Server to be unable to cache any execution plans. This was an issue over a decade ago with SQL 2005, but I’ve got a SQL Server 2016/2017 client with an open case with Microsoft on it. We’ll be adding this same check for sp_Blitz and sp_BlitzFirst. (#2134, thanks Erik Darling.)

sp_BlitzFirst Changes

  • Fix: @FilterPlansByDatabase was throwing an error if you passed in a list of databases (rather than just “user” for the user databases.) (#2311, thanks Sam Carey.)

sp_BlitzLock Changes

  • Improvement: now catches single-query parallelism deadlocks, too. (#2286, thanks Adrian Buckman.)
  • Fix: now works with Amazon RDS SQL Server Express Edition by skipping the sysjobssteps query like we do with Azure SQL DB. (#2317, thanks Ernesto-Ibanez.)

sp_BlitzWho Changes

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes us!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


The First Time I Had to Restore a Database

Mophead

Not the exact same time frame, but not that far off.

I remember it really clearly.

In the mid 1990s, long, long before I went into IT as a career, I was working at a photo studio in Muskegon, Michigan. They specialized in high school class photos, and they did a LOT of ’em. Every morning, the photographers would come into the office to collect printouts for the pictures they were going to shoot that day – student name, address, time of the photos, that kind of thing.

My job duties included:

  • Removing the prior night’s backup tape and switching it out for a new one
  • Running a few database queries to prep for the day’s sessions
  • Printing out the appointments & labels for the photographers

One morning, I ran a query without the WHERE clause. Because that’s what you do.

I don’t remember how the queries worked, and I don’t even remember what the database was. I just remember that it ran on SCO Xenix because I remember the manuals so clearly, and I remember that I didn’t panic at all. I knew nobody else had accessed the database yet – I was one of the first ones in every morning – so all I had to do was restore the database and go through the steps again.

But I also remember that the boss (not manager – boss) had an epic temper. As in, swear-and-throw-things-and-fire-people kind of temper. And like me, he was an early riser, and I knew it was only a matter of time before he showed up and looked for his printouts to see who he was going to photograph that day. I was sure I was gonna get my rear handed to me, and I was resigned to that fact.

So I put last night’s tape in, started the restore, and waited. Sure enough, the boss came in, and before he could say anything, I said:

I screwed up, my fault, and I’m sorry. I messed up the queries I was supposed to run, so I’m restoring last night’s backup, and then I’m going to run the queries again and do the printouts. You can fire me if you want, and I’ll totally understand it if you do, but you should probably wait to fire me until after the restores finish and I do the printouts. Nobody else here knows how to do this, and the photographers need to work today.

Worked like a charm. He nodded, and I could tell he was pissed, but he didn’t yell or throw things. He just gruffly left the computer room and did other stuff.

The photographers and other people started trickling in, looking for their printouts, and I explained that they weren’t ready yet, and explained why. They all got big eyes and asked if the boss knew about this, and they were all sure I was going to get fired.

I didn’t get fired, and everybody was surprised. The boss used me as an example, saying, “No, this is what you’re supposed to do – own up when you do something stupid, fix it yourself, and be ready to deal with consequences.”

Part of me was a little disappointed that I didn’t get fired, though. I wasn’t a big fan of that job. I’ve only been fired once – from a Hardee’s – but that’s a story for another blog post.

What about you? Do you remember the very first time you had to do a database restore to fix something you messed up?


“Working” From Home? Watch A Bunch of Free SQL Server Videos on YouTube.

You’re new at this working from home thing, and you’re trying to put Game of Thrones up on a second monitor while you work? Listen, that’s not gonna play out well. You’re gonna get distracted and you’re not gonna get any work done. Instead, if you’re gonna have something in the background, learn about SQL Server.

Brent Ozar’s Conference and User Group Sessions – I’ve done a ton for various user groups and conferences that keep ’em in their own YouTube channel, making it a little hard to track down. To make your life easier, I put together a single YouTube playlist with ’em all.

Doug Lane’s 11-Part Series on Statistics – Doug originally filmed these for us as a training class several years ago. They’ve been available free on BrentOzar.com for a while, but I just realized I’d never uploaded them to YouTube, so I fixed that. (That’s what started this whole blog post.)

Doug Lane’s 11-Part T-SQL Level-Up – take your T-SQL from zero to hero – literally – as Doug teaches you heroic skills. The trailer tells you all you need to know about how the class will go down:

 

Happy learnin’!


How to Track Performance of Queries That Use RECOMPILE Hints

Say we have a stored procedure that has two queries in it – the second query uses a recompile hint, and you might recognize it from my parameter sniffing session:

The first query will always get the same plan, but the second query will get different plans and return different numbers of rows depending on which reputation we pass in.

I’m going to do some setup, free the plan cache, and then run the proc:

When I run it for @Reputation = 2, the second query returns 9,149 rows.

How a recompiled query shows up in the plan cache

If I check sys.dm_exec_query_stats right now, there are a few columns with interesting results – I’m going to move them around a little so you can see ’em:

Line 1 is the first query in the stored procedure. It’s a COUNT(*), and it only returns 1 row.

Line 2 is the second query, and it returns 9,149 rows for reputation = 2.

So right now, if I’m inclined to do math, I can add up the total number of rows for both statements and I can see the total number of rows returned by the query. It might seem odd to use rows as a measurement right now, but sys.dm_exec_query_stats’s columns – CPU, reads, writes, duration, etc – all behave the same way as I’m about to show you here, and the rows numbers are more repeatable than some of the others, so let’s use rows.

If I execute the stored procedure twice – for two different parameters – and then check metrics, things look different:

The results:

The first query’s plan stuck around in memory, so it now shows 2 executions, and 2 total rows returned. Its row metrics are correct through the life of the stored procedure’s time in cache.

However, the second query – the one with the recompile hint – has a brand new plan in the cache, but also new metrics. You’re not just recompiling the execution plan, but you’re also not getting query plan metrics here. (That’s fine, and that part I was also kinda aware of.)

But the part that I keep forgetting is that when I’m looking at the stored procedure’s totals in sp_BlitzCache, the total, min, and max values are useless:

In sp_BlitzCache, we add up the totals for each statement in a proc, and we present those as the total numbers for the proc. The problem is that they’re just not true whenever there’s a recompile hint: the totals are underreported, and the avg/min/max just reflect the last execution of any query with a recompile hint.

The plan cache just isn’t a good fit for tracking queries with RECOMPILE hints – but Query Store is.

How a recompiled query shows up in Query Store

I’m going to turn on Query Store in the Stack Overflow database, and then run the two queries again:

And then query Query Store:

I get a much better picture of the number of times that the recompiled queries have run, AND the number of rows they’ve returned each time. (And of course you get other much more useful query tuning metrics too, like CPU, reads, duration, etc.)

When you decide to use RECOMPILE hints,
you probably wanna enable Query Store.

Query Store first shipped in SQL Server 2016, and about half of all production servers are 2016+. Today, in 2020, when your query tuning efforts require you to put in RECOMPILE hints, then you should probably stop for a moment to consider Query Store too. It’ll make your performance troubleshooting easier down the line.

Query Store does require a little more planning than just putting in a RECOMPILE hint, though. Here are the things you wanna think through:

Are these things more work? Yep, absolutely, and the combination of RECOMPILE hints and enabling Query Store logging is going to have a performance impact on your server. You don’t wanna flip these switches unless you’re convinced it’s the best way to solve a parameter sniffing problem. There are other solutions, too – as a quick refresher, if you’re not familiar with any of the other ways, check out my video from SQLDay Poland:


How COVID-19 Affects Conferences

SQLBits just postponed their event, and I’m stepping out of SQL Saturday Iceland as well.

I’ve had the unusual position of being a conference attendee, speaker, and sponsor, all at different times of my life, so I wanted to do a quick brain dump explaining some of the gotchas involved behind the scenes.

Canceling a conference can financially ruin the organizers. The organizers spend a ton of money leading up to the conference – doing marketing, paying staff salaries, paying deposits (event space, food, hotels, printers.) Some of this stuff may be refundable, but … a lot of it isn’t, like staff salaries and supplies that are already purchased. Organizers can buy insurance, but to save money, they often don’t buy insurance to cover this kind of thing, especially diseases. As a result, when they have to cancel an event, they can be financially destroyed instantly. The SxSW festival just canceled their event, laid off 1/3 of their staff, and is considering bankruptcy.

Who cancels the conference can make a big difference. If the government cancels all events, then it can be easier to get money back for hotels, flights, event spaces, and make insurance claims. As a result, organizers can be tempted to play a game of chicken with government, trying to see who cancels first.

People are getting infected at conferences. For example, two people at RSA’s conference got infected, and it just kinda makes sense: people are flying in from all over the world, spending time in close proximity with strangers in airplanes, eating food from buffets, and of course, shaking hands with each other because it’s just hard to break that habit. (I’m still laughing at the Dutch prime minister who, after announcing a no-shaking-hands rule, promptly shook hands to close the announcement.)

Speakers and volunteers have to deal with more people. People come up afterwards and ask questions in close proximity, they want to shake hands, take selfies, hug, you name it. We’re at higher risk for infection, plus we’re especially dangerous if we’re the infected ones, and we spread it rapidly to other people.

(Personally, I have asthma, which means that if somebody infects me, I’ve got a much harder time fighting off the infection.)

Attendees are sensitive to the situation, too. They’re often packed in elbow-to-elbow with complete strangers in varying degrees of health, all breathing on each other for hours on end. Once an attendee starts sneezing and coughing, other attendees will start to feel uncomfortable, leading to awkward situations. For example, plane passengers became disruptive when an attendee had an allergy attack, and another plane ran into even stranger issues.

Sooner or later, conference attendees will ask organizers to remove someone of questionable health. For that to work, we all need to be on the same page about what’s accepted behavior at events, and attendees need to be told long ahead of time that they shouldn’t show up if they have symptoms that even look related to COVID-19. Sure, I get it – you don’t think you’re infected – but that doesn’t mean other people are going to be comfortable with you coughing into your elbow every five minutes, and wiping your mouth on your shirt. Conference organizers likely aren’t going to be sanitizing chairs and tables between sessions.

Organizers are already stretched to their limits. Leading up to an event, the organizers and volunteers do a heroic amount of work just dealing with regular conference issues. The Center for Disease Control has a prep document for event organizers, but just looking at the tasks in that list, I can tell that it’d take multiple full time bodies to check off all those tasks – and events often just don’t have the spare staff available.

Vendor staff don’t want to get infected. Companies make decisions to sponsor or attend a conference, and then they send their sales & marketing teams to the conference as well. Those employees may not have had much of a choice about whether they wanted to attend – they may not have the passionate feelings that you have about attending a conference to learn, because they’re just there to do sales and network. Their families ask tough questions about, “Why exactly is your company sending you to this event? Can’t you sell things from home?”

Everyone’s less likely to attend events right now. Companies are setting no-travel policies to protect their staff, which means the conference has less attendees, speakers, and sponsor staff available to attend.

When you add all this up, it’s a dark time for conferences: they have less attendees & revenue, but they have higher expenses to put on the event (because there are more health concerns to tackle, and all this costs money.) I don’t have easy answers – but as a speaker with asthma, I’m keenly interested in how events protect me and the other attendees. We’ve already got one person in the community being quarantined – we don’t need more. (Send your good vibes & prayers to Tim Radney, @TRadney.)

Footnote: this post might have even more typos & errors than usual because of its hurried nature. I’m scribbling it down quickly before Erika wakes up. We’re on vacation this month, driving around Iceland’s Ring Road, posting stuff on TikTokInstagramFacebook, and Twitter wherever you wanna follow along, and I’m trying not to do “work” work – but I wanted to get this out while I was thinking about it.

I know, it’s an odd time to go on a traveling vacation! We’ve been keeping a close eye on the news, washing our hands a lot, and frankly, avoiding other people, hahaha. The good news is that Iceland’s a great country for this kind of thing – plenty of beautiful wide open spaces for sightseeing – and the tourist places are even emptier than their usual winter season.


Things to Consider When SQL Server Asks for an Index

One of the things I love about SQL Server is that during query plan compilation, it takes a moment to consider whether an index would help the query you’re running. Regular blog readers will know that I make a lot of jokes about the quality of these recommendations – they’re often incredibly bad – but even bad suggestions can be useful if you examine ’em more closely.

In SQL ConstantCare®, we analyze SQL Server’s index recommendations and come up with our own suggestions. Here are some of the things we think about as we’re doing the analysis:

  • Did SQL Server ask for the same index yesterday?
  • Would the index have been used at least 100 times per day?
  • Does the table involved already have a clustered index? If not, we should probably think about adding one of those first.
  • Does the table have 5 or less indexes? If it has more, it might be time to think about deduping those with my D.E.A.T.H. Method first.
  • Does the table have <100M rows and under 10GB in size? If so, the indexes are easier to pop into place. At higher sizes, you want to approach the tables with a little more care and planning since even adding an index can require a maintenance window depending on the speed of your hardware.

SQL ConstantCareOnce we’re past those initial hurdles, we think about the index recommendation itself:

  • Does it have only one key and one included column? If so, it probably makes sense to promote the include into the key, as long as the first key makes sense too.
  • What are the datatypes on the columns? If they’re big (say, over 200 characters), consider removing ’em, especially if they’re included columns.
  • How many columns are in the index? If it’s more than 5 elements, consider removing the included columns.
  • Is the recommendation a narrower subset of an existing index? If so, it’s probably not a good fit.
  • Is the recommendation a wider version of an existing index? If so, you may want to remove the narrower (old) version after you replace it with the wider new version.
  • After your tweaks based on the guidelines above, does the index exactly match an existing index on the table? If so, set it aside – it’s time to review the high-read queries instead rather than looking at the missing index recommendations.

Whew. It’s a lot of stuff to think about – but with SQL ConstantCare®, you don’t have to. We’re gathering the metadata daily, watching what SQL Server does, gradually rolling out index recommendations to help, and then circling back to the index usage DMVs to make sure that the index actually helped.


Making TRY_CAST and TRY_CONVERT Queries Faster with Indexed Computed Columns

I know this is gonna sound crazy, but let’s say you had a table where people stored all kinds of things in one column: dates, integers, file names, sale prices, file names, you name it. And let’s say your application frequently ran a query looking for dates in that column, like this:

Even if you create an index on DisplayName, SQL Server ignores it because it believes so many rows are going to match, and it doesn’t want to do all the back & forth key lookups between the DisplayName index and the clustered index (to get SELECT *):

Yes, believe it or not, there are actually users in the Stack Overflow database whose names can be converted to dates:

So anyhoo, I’m not allowed to change the query, but I need to make it go faster. I have 3 problems:

  1. The estimates are wrong, so
  2. SQL Server refuses to use the DisplayName index (due to an artificially high number of key lookups)
  3. SQL Server can’t seek specifically to the rows that match (it has to do the cast on every row)

I can fix problems #1 and #2 with a computed column, like I explain in the Artisanal Indexes module of my Mastering Index Tuning class:

Presto, the estimates are more accurate, and SQL Server is now doing less logical reads (because it’s scanning the DisplayName index rather than the entire clustered index, which includes all the columns.) Now, to solve problem #3, I would usually add a nonclustered index on our newly created column, but…

You might think, “Wait, how can a cast as a date be non-deterministic? Does it have something to do with the fact that it might sometimes return null?”

No, it’s because your session’s date format can cause the same string to return different dates. Here, I’m using the Books Online example for SET DATEFORMAT to set my default date formats to different styles, and then when I do a TRY_CAST on the same value, I get different dates:

Diabolical. So I can’t use an index on a computed column to make this thing crazy fast, and I’ll just have to settle for an index scan.

But TRY_CONVERT *does* work,
as long as you specify a format.

If I try this same trick with TRY_CONVERT and I specify the exact date format I want – obviously ISO 8601, as any XKCD fan will tell you:

SQL Server is willing to create a nonclustered index on that, AND it’ll automatically do an index seek on it rather than scanning & computing the whole thing. That’s awesome! However, if I try that same trick without a specific format, I’m back to being unable to create the index on it:

Dang. And don’t even get me started on filtered indexes with that stuff – that’s a whole ‘nother ball of wax. For that kind of thing, hit the Artisanal Indexes module.


Menu
{"cart_token":"","hash":"","cart_data":""}