DBA Training Plan 8: What’s Inside Your Own Databases

In the last episode of the DBA Training Plan, we talked about what’s inside the MDF file: a series of pages that store a database’s tables. Now, it’s time to check out the contents of YOUR database to understand what’s using the most space, what’s getting used the most often, and what you might be able to clean out to make your job easier.

Let’s start by querying the DMVs.

SQL Server has tons of dynamic management objects, also known as dynamic management views (DMVs) or dynamic management functions (DMFs) depending on how they’re called. Generally, if you hear someone talking about DMVs, they’re talking about these system objects that LOOK like tables, but are really objects you can query to see what’s going on inside your database.

Try running this inside one of your databases – it doesn’t really matter which one, and it’ll go blazing fast without blocking anyone else:
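Something as simple as this will do:

    SELECT *
    FROM sys.indexes;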

The results of sys.indexes are a bunch of metadata about your indexes: what kinds of indexes you have, what kinds of locks they allow, whether they’re filtered, and much more. But, uh, reading this stuff is painful – for example, you don’t even get the names of the tables involved.

Think of the DMVs as building blocks: you have to join lots of them together before you’ve got anything useful. In the dark ages of the 1990s and 2000s, database administrators spent a lot of time doing just that – building their own houses out of building blocks. Their queries had all kinds of bugs, didn’t account for edge case scenarios, and weren’t kept up to date with the latest additions in each new version of SQL Server.

Today, there’s an easier way: sp_BlitzIndex.

sp_BlitzIndex is a free, robust, open source script originally started by Kendra Little. Since it’s open source and lives in a public GitHub repo (along with a lot of other cool scripts), it’s gotten a ton of updates over the years. Today, it’s a powerful, easy way to understand what’s being stored in your database.

Download it in our free First Responder Kit, and when you open the zip file, you’ll see a list of SQL scripts. For now, just open sp_BlitzIndex.sql in SSMS and execute it in the master database on a development server. The script only creates sp_BlitzIndex as a stored procedure – that’s all. It’ll execute instantly.

There are lots of parameters for it, but let’s start with:
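A call along these lines:

    EXEC sp_BlitzIndex @DatabaseName = 'StackOverflow2013', @Mode = 2;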

Change StackOverflow2013 to the name of a database you care about. The Mode 2 switch gives you an inventory of all the tables & indexes:

sp_BlitzIndex

In the Definition column, CX means Clustered Index, and PK means Primary Key. In my screenshot, all my primary keys are clustered – but that may not be the case in your own database.

To get to know your database, copy/paste these results into Excel and turn them into a table. Sort it by a few different columns and ask yourself a few questions:

  • Sort by Rows descending, and make a note of which tables have the most rows.
  • Sort by Reserved MB descending, and think about how much space is being used up by these objects. Note that some of your tables might have really wide indexes with lots of columns – you might be duplicating the size of the table many times over. (Now’s also a good time to take note of any backup tables you might have made long ago, and forgotten to delete along the way.)
  • Sort by Create Date descending and look at recently created objects.
  • Finally, scroll way over to the right and note the Drop & Create SQL columns. Since the DMVs have so much metadata, we can basically build the index creation scripts from scratch – making it way easier to build undo scripts as we make changes.

Since you’re running this in a development server, a lot of the columns will be empty or zeroes. Total Reads is a great example: most of your indexes probably aren’t getting any action on the development server because you’re not running the app’s full workload over there. Once you’re comfortable running sp_BlitzIndex on a development server, and you understand that it isn’t doing damage or blocking other queries, then you can take it up a level by running it on a production server. In production, you’ll get much more valuable diagnostic data.

You’re going to want to drop indexes. Hold up.

For now, just focus on getting to know what you’ve got inside these databases – after all, they’ve probably been the same way for years, and another few days isn’t going to kill you. Before you make changes to the indexes, I’ve got some more stuff to teach you about the risks and rewards.

Instead, your homework today is to run:
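Something like this:

    EXEC sp_BlitzIndex @GetAllDatabases = 1, @Mode = 2;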

That pulls the data for all databases at once. It’ll be slow, since it’s basically looping through each database one at a time. (We don’t have a way to quickly gather data across all databases at once.) If you have over 50 databases, we also make you pass in @BringThePain = 1 just so you understand that it’s going to take a while, hahaha.

Copy/paste the results into Excel, and don’t make changes to indexes – but DO make changes to TABLES. Before we start doing index tuning, I want you to clear out the garbage. Look at your largest tables – say, the biggest 25% by rows and by size – and look at their Reads columns. Are these tables backups that were made a long time ago in a galaxy far far away? If so:

  • Ask the team if they’re still in use
  • If not, rename the tables – add _ToBeDeleted_20190830 at the end of the table name (or whatever date you want), as sketched after this list – that way, if somebody comes running in screaming that their table is gone or their query stopped working, you can just rename it back to bring it back to life
  • Set yourself a reminder for that YYYY/MM/DD date to go back in and drop those tables (making sure you’ve got a good database backup first, and keep that backup around)
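The rename in that second step is just sp_rename – here’s a sketch with a made-up table name:

    EXEC sp_rename 'dbo.SalesBackup_Old', 'SalesBackup_Old_ToBeDeleted_20190830';
    -- The new name is passed without the schema prefix.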

Why clean out the garbage first? Because these tables are slowing down your backups, restores, corruption checks, statistics updates, and much more. I’ve seen cases where just taking out the garbage cut our nightly maintenance job times in half!

In our next episode, we’ll start digging into the health & design of those indexes.

“But NOLOCK Is Okay When The Data Isn’t Changing, Right?”

I’ve already covered how NOLOCK gives you random results when you’re querying data that’s changing, and that’s a really powerful demo to show folks who think NOLOCK is safe to use in production. However, I’ve gotten a question from several users:

But I’m querying data that isn’t changing – sure, OTHER rows in the table are changing, but not the rows I’m querying. Isn’t NOLOCK okay if I’m querying stable rows?

Nope, not even close. Get any Stack Overflow database, and start with the query from the last post – except this time we’ll modify it so that we’re updating some of the users, and querying others.

One of the more common DisplayNames is Alex. In one window, I’ll get a count of the number of Alexes:
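A loop along these lines keeps the count coming:

    WHILE 1 = 1
    BEGIN
        SELECT COUNT(*) AS AlexCount
        FROM dbo.Users WITH (NOLOCK)
        WHERE DisplayName = N'Alex';
    END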

And in the other window, I’m going to set everyone’s location and website – EXCEPT the Alexes, who I’m not going to touch:
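Something like this – the Location and WebsiteUrl values are just placeholders:

    UPDATE dbo.Users
       SET Location   = N'Iceland',
           WebsiteUrl = N'https://example.com'
     WHERE DisplayName <> N'Alex';  -- leave the Alexes alone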

Watch disaster unfurl:

The number of users named Alex appears to keep changing – EVEN THOUGH I’M NOT UPDATING THOSE ROWS! The reason: the location of Alex’s rows may be moved around by things that are happening to other rows.

I can’t emphasize this enough: with NOLOCK, you can:

  • See rows twice
  • Skip rows altogether
  • See data that was never committed
  • And have your query fail with an error

If you’re cool with that, great – NOLOCK is for you. If not, it’s time to consider other ways to get the performance you want while still getting the data accuracy that your users require.

What The Consultant Toolkit Does

It’s one thing for me to tell you how much easier The Consultant Toolkit makes my job and how clients love how much I can tell them about their server during the initial sales call – but it’s another thing to hear it from a happy customer:

Thought I would share a story, with my thanks for the Consultant Toolkit. A timeline:

  • Friday – random guy tweets he needs some DBA help. I get tagged, along with a couple of other consultants
  • Saturday – dude emails me. I send back an offer to “quick review” a server
  • Sunday – get results from the Consultant Toolkit
  • Monday afternoon call – impress the crap out of the prospect
  • Tuesday – negotiations
  • Thursday – contract signed to do dozens of HealthChecks and provide 7 hours of remediation effort per server
  • Today – I’m actually on their jump box they made just for me working.

One week from a tweet to a $35K contract. He had talked to several other firms before the tweet, but he sold me to management with “[Redacted] has already given us actionable items on the main server at a great price.”

And that’s what The Consultant Toolkit does: help you close deals faster and provide more value than your competitors. Over 100 companies are using it today – so you’re probably already competing with someone else using it. If you’re not closing as many deals as you used to…hmm…

DBA Training Plan 7: How SQL Server Stores Stuff in MDF Files

So far in the DBA Training Plan, we’ve been working hard to make sure the contents of our MDF files are backed up, corruption-free, and secured. Now, uh…what’s actually inside them?

Inside each of your database data files (MDFs, although technically you can use any extension you want, even PDF or MP3), SQL Server stores your data in 8KB pages. That’s kilobytes – not megabytes, not gigabytes, but just mere kilobytes.

Say that we create a table:
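Something like this – note there’s no primary key or index on purpose (hang onto that thought):

    CREATE TABLE dbo.Readers
    (
        ReaderID INT,
        ReaderName VARCHAR(200)
    );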

First off, yes, I understand, I shouldn’t do ReaderName in one field – I should have FirstName, LastName, MiddleName, Suffix, Prefix, yadda yadda yadda, but I’m trying to keep this short. Now see what you did? It’s long again. Doggone it, it’s hard teaching this stuff quickly.

Anyhoo, in this table, each record takes up just a little bit of space. ReaderID is an INTeger, which takes 4 bytes. It’s the same 4 bytes whether that number is 1 or 1,000,000. ReaderName is a VARCHAR(200), which means we can store up to 200 characters in here, and each character takes a byte. If we insert ‘BRENT OZAR’, that’s 10 characters (and boy, am I a character), so we need 10 bytes to store it.
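You can see that for yourself with DATALENGTH:

    SELECT DATALENGTH(CAST('BRENT OZAR' AS VARCHAR(200))) AS BytesUsed;  -- returns 10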

If all of our readers average about 10 characters in their name, that means we could fit about 500-600 records per 8KB database page. (In reality, there’s some overhead because SQL Server also needs to use some parts of the page to store metadata, and we’ll talk about that in another post.)

This blog isn’t very popular, so we can keep all of our readers on a single 8KB page. As we insert, update, and delete readers, SQL Server fetches that 8KB page off disk, brings it into memory, makes the necessary changes, and then writes that data page back to disk. The 8KB page itself is the smallest unit that SQL Server will cache – it doesn’t cache individual rows/records – and each page belongs exclusively to just one object.

A Word About Objects

You’ll notice that I avoid using the word “table”. Tables are cool, but as we start to dig into what SQL Server’s doing under the hood, we want to start thinking about these three object types:

Heap – a table with no clustered index. In my dbo.Readers table, I didn’t specify in what order SQL Server should store my data, so it’s just going to slap the data down on my 8KB page in any old order.

Clustered Index – what we normally think of as a table. If I’d created my table like this:
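A sketch of that – the key part is the clustered primary key on ReaderID:

    CREATE TABLE dbo.Readers
    (
        ReaderID INT PRIMARY KEY CLUSTERED,
        ReaderName VARCHAR(200)
    );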

Then SQL Server would store my data in order of ReaderID. That way, when I search for ReaderID #42, it can look at a little directory (B-tree) that tells it where ReaderID #42 is, and it can jump there without scanning through all three of my readers. The clustered index is sorted in the order of the ReaderID field, but it’s actually the full copy of our table, including all of our fields – in this case, just ReaderName.

Nonclustered index – If I tell SQL Server to:
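Create something like this (the index name is just a convention I made up):

    CREATE NONCLUSTERED INDEX IX_ReaderName
        ON dbo.Readers (ReaderName);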

Then SQL Server will create a second copy of my table sorted by ReaderName. This copy of my table will only include the fields specified in my index (ReaderName), plus whatever fields it needs to get back to the clustered index or heap.

All three of these objects – heaps, clustered indexes, and nonclustered indexes – will be stored on separate sets of pages. We won’t have the clustered index and nonclustered index for the same table on the same page – they’re split. That’s why when we’re doing space analysis, we have to think in terms of indexes, not tables.

Pages & Rows on Disk

The sys.dm_db_index_physical_stats Dynamic Management Function (DMF) returns the number of rows and pages stored in each database object. It takes parameters for database ID and object ID, or you can pass in NULLs to get information across all of your database objects. Scroll down to the examples link in that Books Online page, and you’ll get queries to examine a single table – I’d strongly recommend starting with a small table, because some of the parameters for this DMF will cause SQL Server to actually look at every page in the object. That means if all of the pages for that object aren’t cached in memory, SQL Server will go pull those pages from disk, and that can slow down your running SQL Server.
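Here’s one way to call it against a single small table – dbo.Readers is standing in for whichever small table you pick:

    SELECT OBJECT_NAME(ips.object_id) AS object_name,
           ips.index_id,
           ips.index_type_desc,
           ips.page_count,
           ips.record_count,
           ips.avg_record_size_in_bytes,
           ips.max_record_size_in_bytes
    FROM sys.dm_db_index_physical_stats
         (DB_ID(), OBJECT_ID('dbo.Readers'), NULL, NULL, 'DETAILED') AS ips;
    -- 'DETAILED' reads every page in the object, which is why you start small.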

This DMF also includes average record size and max record size. This makes for fun spelunking: how big is each record, really? Just because we make everything a VARCHAR(8000) doesn’t mean we’re actually storing 8,000 characters in each field. Now, don’t go changing your database structure just yet – you can easily break applications when datatypes change. Let’s leave that for later.

You can get similar metadata much faster by using sp_spaceused, but it doesn’t get the cool info about average record size, and I wanted to encourage you to go spelunking here.
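For comparison:

    EXEC sp_spaceused 'dbo.Readers';  -- rows, reserved, data, index_size, unused – but no record sizes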

Your Homework: Watch a Free One-Hour Class

In my free class How to Think Like the Engine, I explain pages, indexes, joins, SARGability, and more. I use real database pages from the StackOverflow.com database for demos, and you actually get PDFs to print out and follow along as we go.

Print out this 5-page PDF to follow along – trust me, you’ll need these pages – and then here’s the first module of the class:

When you’re ready, move on to the next module.

As we start to move up the Hierarchy of Needs from capacity into performance, you’ll start to see how these foundational items are so important.

So far, we’ve been talking about the general structure of databases. In the next episode, we’ll get specific about what’s in your database.

DBA Training Plan 6: How to Monitor Drive Space

Don’t.

It’s not your job.

No, seriously, hang on a second. I’m not saying that you should ever say things like “It’s not my job,” but I do want you to understand when you should avoid reinventing the wheel. Your Windows administrators should be monitoring every Windows server you have, making sure it doesn’t run out of drive space. This isn’t some wild, unique need that only SQL Server has – it’s everywhere in your data center.

(You might actually even BE a Windows admin, just here because you need to learn about working with SQL Server. It might actually BE your job to monitor this stuff. That’s cool – that’s how I got started too.)

In our journey from the base of my Hierarchy of Database Needs to the top, we do indeed need to talk about capacity – but I don’t want to monitor drive space from inside SQL Server, and I don’t want to buy SQL-specific tools in order to pull it off. Yes, you can technically use commands like xp_fixeddrives to list all of the SQL Server’s drive letters and how much free space they have, but that doesn’t work everywhere. For example, you might have mount points or databases on UNC paths, neither of which will show up in xp_fixeddrives. So leave the drive space monitoring to the pros.

Why You Shouldn’t Build a Monitoring Tool

If you want to be a professional developer, you should build things. You should learn what exceptions could happen, how to trap them, and how to fail elegantly. It’s hard work, but if you get good at it – really good – you can build amazing things and make a killer living for yourself.

But if you want to be a professional DBA, you should leave the developing to the developers.

I’m not saying you shouldn’t learn the dynamic management views (DMVs), how to dig through them to find the evidence you need, and how to turn that data into actions. Hopefully, I’m going to teach you how to do a lot of that over the course of the next six months. Take another look at the Hierarchy of Database Needs again, and think for a second about all the things we’re going to be learning over the coming posts. Just in the last five, I’ve had you build an inventory of your servers, automate your database restores, start locking down security, and enumerate your database risks. The next few are chock full of things that will help you become a hero in the eyes of your users and your developers.

Building a crappy monitoring tool in your spare time will not give you that same level of respect. (And yes, if you’ve only got your spare time at work, your monitoring tool is going to be crappy. If you’ve got so much time that you can build a great monitoring tool, you need to focus on providing results to the business fast, because you’re about to be laid off as pure overhead.)

How to Pick a Monitoring Tool

There are basically two kinds of real-time monitoring tools out there for SQL Server.

Up/down Alerting – these tools make sure the SQL Server service is running, and that it has all the basic needs taken care of. If the server runs out of drive space, if the service stops, if the event log starts filling up with errors, or zombies attack, these tools will let you know. The most popular tools in this category are Microsoft System Center and Red Gate SQL Monitor.

Performance Diagnostics – these tools try to derive insight from SQL Server’s dynamic management views (DMVs) to help performance tuners wring the most throughput out of their hardware. The most popular tools here are Idera SQL Diagnostic Manager, Quest Spotlight, SentryOne SQL Sentry, and to some extent, the tools in the above category too.

If I were you, I’d start by asking the Windows team if they’ve got any software that handles the up/down alerting for services, drive capacity monitoring, etc. If so, get them to start monitoring your servers. I’m being selfish here – my real reason is that I want to save my monitoring budget for tools in the Performance Diagnostics category. Surprisingly, all of these tools cost about the same – around $1,000-$2,000 USD per monitored instance.

Then, no matter which category of tool you’re buying, make a list of the last 4-5 things that have caused your phone to ring after hours. Here’s some of my personal favorites:

  • Deadlocks
  • Out-of-control query running for hours
  • Long-running job, like a backup going too slow
  • Queries that desperately need an index

Figure out how to recreate those same symptoms in your development environment, and then get a free trial of a couple of the tools I mentioned. (All of them provide free 10-14 day trials.) Reproduce the problem, and watch how the monitoring software reacts. Does it lead you to the root cause quickly, or does it just have a bunch of flashing red lights on the console? The best ones will save you time by getting you right to the solution.

After you’ve done your evaluation and picked a favorite, get a quote from them – and get a quote from the other vendors as well. Share the quotes with the competing salespeople. You’d be surprised how willing they are to negotiate, and you might be able to use the cheapest quote to get the best tool at a price that works for you.

Your Homework: Start Tracking Your Pain Points

I don’t expect you to magically get budget approval for a tool this week, but I want you to start making a list of day to day problems as they strike. If you waste four hours troubleshooting a deadlock problem, make a note of the date, the time required, and a few words about the root cause. In a month or two, you’ll be able to present this list to your boss as proof of why the right monitoring tool could save you money.

If you’re still convinced you should build your own tool, go check out the open source project Opserver. It’s a free performance monitoring tool from the people at Stack Overflow. They use SQL Server (and a lot of other cool tools), and theirs is the best open source performance dashboard I’ve ever seen.

Whew. So far, we’ve been focused a lot on protecting these valuable MDF files. Next up, what’s in ’em?

Podcast: Data Driven Interview w/Me

I had the pleasure yesterday of spending an hour chatting with Andy Leonard and Frank La Vigne for the Data Driven podcast. Some of the questions they asked me include:

Data Driven Podcast

  • How did you find your way into data?
  • What’s your favorite part of your current gig?
  • Complete this sentence: when I’m not working, I enjoy ___
  • Complete this sentence: I think the coolest thing in technology today is ___
  • Complete this sentence: I look forward to the day when I can use technology to ___
  • Share something different about yourself.
  • What book would you recommend today? (And we covered a few of my favorites)

You can listen to the podcast free here. Enjoy!

sp_WhoIsActive Is Now Open Source.

Adam Machanic, Japanese Snack Connoisseur

If you’re still using sp_who, sp_who2, and Activity Monitor, listen up: there’s a great free replacement, and it’s now open source, too!

sp_WhoIsActive is a much more powerful way to analyze what’s happening right now on your server. Here’s what you need to know:
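The simplest call uses all the defaults:

    EXEC sp_WhoIsActive;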

With this, we’ll be deprecating sp_BlitzWho. The only reason we initially wrote it was that we needed something purely open source that we could distribute without any copyright concerns. Now, with Adam licensing sp_WhoIsActive under the GPL license, there’s absolutely no reason that any of you shouldn’t be using sp_WhoIsActive as part of your database troubleshooting routine. You should high-five Adam the first chance you get. Or buy him liquor. Or both.

DBA Training Plan 5: Knowing Who Has Access

Who can get you fired? Run our free sp_Blitz stored procedure on one of your production servers and pay particular attention to the Security section of the results. It lists the logins who have been granted the sysadmin or securityadmin roles.
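Running it is a one-liner:

    EXEC sp_Blitz;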

Don’t think of them as logins.

Think of them as people who can get you fired.

These people can drop databases, drop tables, change stored procedures, edit data, or even change SQL Server configuration settings like max server memory or maxdop. You’re probably not getting alerted when any of these things change – we just can’t afford to monitor every single thing in SQL Server and send alerts on it. At some point, we have to be able to trust certain logins, and that’s where the sysadmin and securityadmin roles come in.

Except when we first get started learning database administration, it’s usually because we’re the only DBA in the shop, and the server is a mess. The front door is unlocked, the key is under the floor mat, and everybody knows we’ve got a big screen TV in the living room.

How to Get Started Locking Things Down

Before you start removing people’s SA rights, be aware that there can be political backlash. In one shop, the head developer’s SA rights were removed, and he stormed into the DBA’s office the next morning screaming. Turns out one of his apps automatically created a processing database every night, did a bunch of work in it, and then dropped the database. Nobody knew because it was only around for 30-45 minutes. The problem could have been avoided by communicating the security changes ahead of time, and that’s where we need to start.

Take the sp_Blitz output to your manager – just the security portions – and say something like this:

“Here’s the list of people who can do anything on the production server – delete data, drop databases, or change performance settings. If they do, here’s the list of applications that will be affected, including accounting and payroll. I don’t want to take away all of their permissions – I just want to start by giving them full permissions over their database, but not in any other databases, and not at the server level. Can I talk to them about doing that?”

Note that we’re only going to TALK to them, not actually do it, because we need to communicate with them first. Then, go to the end users or developers involved and say:

“We’re starting to lock down the production server, but I want to make sure you have all the permissions you need. I’m going to make you a complete database owner inside your database so you can do anything you want in there, but I’m going to take away your rights to the other databases (like accounting and payroll), and I’m going to remove your permissions to change server-level settings like how much memory the server can use. I’m planning on doing it next weekend, and I want you to have my email and phone number so that if anything breaks on that date, you can call me and I can audit what’s happening on the server to see if it’s related to the permissions change.”

When You Get Resistance

If you get any pushback from developers or users, go to the project managers or business people who have a large stake in the database. For example, if the accounting database is on the server, go to the CFO and say:

“Here’s the list of people who can take down the accounting system. They have the permissions to drop the database at any given time, and there’s nothing I can do to stop it. I’d like to get that changed – can I schedule a short meeting with you and the development manager to get everyone on the same page?”

You want to turn it into a business problem, not a technology problem, and the CFO will very much be on your side. She can’t afford to have her entire department go down just because some developer didn’t put a WHERE clause on a T-SQL statement.

If you want to have a little fun with it, read my post, How to Talk People Out of the SA Account, and the part 2 followup with a vicious sense of humor.

I Know, This Week Isn’t Fun

Database administration isn’t all candy and roses. Sometimes it’s boring politics and paperwork, and this is one of those weeks.

In the first week, we built a spreadsheet inventory of our servers, and now it’s time to fill in a few more details. Since we’re analyzing security, we need to know which applications live on each server, and who’s in charge of each of those applications. You don’t have to fill in the specifics of who has read or write permissions in each database, but we want to at least know the business purpose and the business contact.

The business contact is the one who really helps us get the database locked down because their job is on the line if this data is lost or unavailable. (Come to think of it, you’re someone who can get THEM fired!) In the coming weeks, you’ll be working more with them on reliability and performance, too, so now is a good time to start fleshing out that part of the spreadsheet.

Next week, we’ll move on to something you DON’T have to do: checking for free drive space. No, seriously!

DBA Training Plan 4: Checking for Corruption

You would think that when SQL Server writes your data to disk, it’s totally safe.

Forever.

You would be wrong. It’s almost like there are gremlins out to get you, purposely trying to trash your data. In the really old days, it was problems with magnetic hard drives. Later, it was buggy shared storage arrays. Today, it’s the cloud, where your storage is 99.971H!7 LOST CARRIER

That was an old joke. You probably won’t get that.

First, get alerted when SQL Server finds corruption.

By default, if SQL Server reads a corrupt page from disk, it logs a message, and then keeps right on going. It doesn’t email you, doesn’t show up with a red icon in SSMS, or tell you anything about the impending doom. Let’s get that fixed.

Here’s what you need to do:

  1. Configure Database Mail so it can send you alerts when all hell breaks loose
  2. Set up an operator so SQL Server knows who to email – ideally, this is a distribution list
  3. Set up Agent alerts so SQL Server emails you whenever it detects corruption (see the sketch after this list)
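Here’s a rough sketch of that third step – the operator name and email address are placeholders, and you’d repeat the alert for errors 823 and 825 plus severities 16 through 25:

    EXEC msdb.dbo.sp_add_operator
         @name = N'DBA Team',                            -- placeholder operator name
         @enabled = 1,
         @email_address = N'dba-team@yourcompany.com';   -- placeholder address

    EXEC msdb.dbo.sp_add_alert
         @name = N'Error 824 - Corruption',
         @message_id = 824,
         @include_event_description_in = 1;              -- put the details in the email

    EXEC msdb.dbo.sp_add_notification
         @alert_name = N'Error 824 - Corruption',
         @operator_name = N'DBA Team',
         @notification_method = 1;                       -- 1 = email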

Second, get SQL Server to proactively check for corruption.

By default, when SQL Server backs up your databases, it happily backs up corrupt data without throwing an error message, either. We need it to start checking the checksums on each page to make sure they’re actually valid. Wherever you configured your backups – whether it’s maintenance plans, Ola’s scripts, or a third party tool – look for the option to check the checksums on backup. In most cases, it won’t add a noticeable amount of time, and besides, you need the peace of mind anyway.
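In plain T-SQL, that option looks like this (database name and path are placeholders):

    BACKUP DATABASE YourDatabase
    TO DISK = N'D:\Backups\YourDatabase.bak'
    WITH CHECKSUM;  -- verify page checksums as the backup reads them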

Then, schedule regular jobs to run DBCC CHECKDB to proactively read the contents of your databases and look for corruption. We’ve got videos on how to do CHECKDB when you don’t have a DBA, when you do, and when you have a Very Large Database (VLDB).
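The bare-bones version of that job step:

    DBCC CHECKDB (N'YourDatabase')           -- placeholder database name
        WITH NO_INFOMSGS, ALL_ERRORMSGS;     -- only show the bad news, and all of it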

Some folks are hesitant to check for corruption in production because they’ve run into timeouts during CHECKDB. That’s typically caused by bad storage or storage networking performance, and it’s a good sign that it’s time to dig deeper into the storage.

Third, check yo’self before you wreck yo’self.

Download a corrupt database from Steve Stedman’s Corruption Challenge. It’s a ten-week series where he gives you a different corrupt database each week, and it’s up to you to figure out how to fix the corruption with as little data loss as possible. Then, you can compare your answers to the winners and see how they solved it.

I’m not saying you have to actually fix the corruption – just download the corrupt database, and then attach it to your production server. Yes, your production server. I want you to find out if your corruption jobs really report errors, if your alerts actually work, and if your backup jobs are checking with checksum like they’re supposed to.

I would much rather have you learn with a known broken database than learn the hard way when corruption strikes your servers.

Wanna learn more about corruption?

Work along with me in this interactive blog post series:

  1. Let’s Corrupt a Database Together, Part 1: Clustered Indexes
  2. Part 2: Corrupting Nonclustered Indexes
  3. Part 3: Detecting Corruption

And sooner or later, you’re going to hit corruption. Here’s how to react.

In our next episode, we’ll find out who can get you fired.

DBA Training Plan 3: Treat Your Servers Like Cattle, Not Like Pets

In our last post in this series, I talked about why you should be using Ola Hallengren’s maintenance script for backups and sp_DatabaseRestore for faster restores. These two tools have implications for how you build, back up, recover, and fail over your SQL Server.

There’s a saying in the cloud: treat your servers like cattle, not like pets.

What this means for you, in graduating levels of difficulty through your upcoming career:

1. Server setup isn’t done until you’ve tested and timed a point-in-time rebuild & restore. When you’re done setting up the full and log backup jobs on the new server, but before you’ve gone live, restore your real production databases. Run the full and log backup jobs, producing a few sets of backup files in their soon-to-be-normal location.

Now here comes the fun part: shut the server down, and start the clock. Your mission is to bring all of the databases back online, restored to a specific point in time – say, 5 minutes ago. Do it, and time how long it takes you. No, you’re not allowed to turn that original server back on – he’s broken. (Say for example that he’s having problems with a Windows patch, and the business is tired of waiting for the sysadmins to figure it out – they need you to start Plan B now.)

You’ll learn a lot as you step through this. It feels overwhelming at first, because you’ve probably been building your servers manually, stepping through the SQL Server install GUI manually, setting up things like Cost Threshold for Parallelism, your favorite trace flags, and SQL Agent jobs. That was fine when you weren’t a Database Administrator – but now that you’re moving towards becoming a full time DBA, you need to think about a better long term plan. That means…

2. Servers should be built from well-documented checklists. As you go through phase 1, you’ll realize that you really need to be writing this stuff down as you go. Documentation is a gift to your future self – not just in the sense that it makes your job easier, but it lets you pass work off to others when you need help. During an emergency, when you’ve lost 19 SQL Servers at once (hoo boy, August 2007), you’re able to hand different checklists off to different coworkers and say, “You, follow these checklists and tell me when you’re done.”

This affects your daily database management, too: I’ve seen DBAs apply a trace flag or sp_configure setting during an emergency in order to fix something, but then not document what they just did. When that SQL Server goes down hard, and they build a new one, they simply don’t remember every setting they made to the old server along the way. When the new server comes up online, and has the same performance issues they had a long time ago that required the unique setting, they have to go through the same old troubleshooting process. Bottom line: don’t make undocumented changes to servers – leave a bread crumb trail behind in your server’s documentation or build checklist instead.

I’ve heard folks say, “It’s no big deal: I can just copy the settings from the old server to the new one.” Not when the main server goes down hard, you can’t, nor when you’re dealing with a disaster recovery situation because your primary data center or availability zone is down.

3. Long term, servers should be built and user databases restored with automation. As you start to manage dozens of database servers, checklists don’t scale. You need a way to turn manual steps into automated actions so that you’re not sitting around waiting for the next progress bar to move to the end. However, this is where my advice starts to get a little fuzzy: the SQL Server product has sprawled out to a huge coverage area over the last decade (SSAS, SSIS, SSRS, clustering, Availability Groups, Linux, and now Kubernetes) and there isn’t a single desired-state installation & configuration tool that covers the whole surface area.

If you want to get to automated builds & restores, the closest thing right now is dbatools & PowerShell, but just be aware that while we call this “automation,” it’s more about automation tooling than just something you download and click Go. Think of it as a box of tools, and you still have to build out an automation solution that works well for you. Thankfully, the dbatools community is welcoming & hard-working, and they’re constantly making the tools better.

4. Even longer term, if it’s automated, it’s testable. For decades, serious database administrators have automated their own fire drill testing. They’ve built standalone SQL Servers that restore their production backups, run CHECKDB against them, and then send reports of success or failure to the DBA. This gives you a canary-in-the-coal-mine approach to detecting backup problems early.

Your Homework

Database Reliability Engineering – good buy

This post is a little shock-and-awe.

But it’s also a fork in the road: do you want to specialize in being a production database administrator (who makes sure the SQL Server is online and the database is accessible), or do you want to specialize in database development & performance tuning, making sure the queries run quickly?

These are really two different careers: people who tell you they do both are lying, because they suck at one of them (or maybe even both of ’em.)

If you want to specialize in production database administration, your future learning path focuses on things like PowerShell, scripting builds, monitoring for server configuration changes, applying patches in a controlled and recoverable manner, and engineering reliability into your infrastructure. If you think this is the route you want to take, pick up the book Database Reliability Engineering to get a glimpse of what your next ten years will look like. I don’t think there’s ever been a more exciting time to take that leap.

If you want to specialize in performance tuning, we’ll be talking more about that over the coming weeks as well – stay tuned.

If you’re not sure yet, that’s okay. You can always come back to this fork again later. For several years, I chose to focus on production database administration because I really liked building servers and troubleshooting ’em. Then back around 2015, I came to the fork again and decided to give up on that line of work and focus on performance tuning instead. I love the feeling of making a user’s query faster, or even better, making an entire workload go faster. (Plus I was sick and tired of getting called after hours for emergencies!)

In the next post, we’ll talk about the next job duty for production DBAs: checking for corruption.

DBA Training Plan 2: Backups (And More Importantly, Restores)

When I first started out as a SQL Server DBA, I thought things were going well as long as the backup jobs ran successfully. I’d go into SQL Server Agent every now and then, make sure they were still running, and … that was the end of it. I figured if disaster ever struck, I’d just do a restore. How hard could it be? I was missing the entire point.

Nines don't matter if users aren't happy

I bought two of these as soon as I saw ’em. Seriously. Shout out to @mipsytipsy.

In theory, we design our backup strategy ahead of time with 5 Simple Questions About Your Backups, and we’ve memorized the 9 Letters that Get DBAs Fired: RPO, RTO, and CYA.

In practice, small disasters strike all the time when we’re not ready. The most common reason to do a restore isn’t to revive an entire server – it’s to get back a few small tables or an individual database. Somebody ran the wrong DELETE statement or dropped a database in production instead of development, and next thing you know, we’re all scrambling. Let’s think through a few things ahead of time to make the crisis easier.

Since the only reason we do backups is so we can do restores, I’m going to talk about the restore process first, before I talk about backups.

Where to Do Restores

When you’re restoring code (stored procedures, views, triggers, etc) or individual tables, don’t restore onto the production server. I don’t like touching production servers more than I have to, and let’s face it – you’re already having a bad enough day as it is. That’s why you’re doing a restore, remember? So when you only need to pull a few objects or tables out, let’s do our work on a different server (like dev or QA) and leave production as it is. I’ve also written about restores in my ideal dev, test, and production environments.

After we’ve safely restored the right data onto another server, it’s easy to copy that data across to other servers. For simplicity and security, you can set up a linked server on the production box with read-only access over to the restore server. Then, from production, you can run INSERT statements using a SELECT sourced from the linked server tables.
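The cross-server copy ends up looking something like this – the server, database, table, and column names are all placeholders:

    INSERT INTO dbo.Customers (CustomerID, CustomerName)
    SELECT CustomerID, CustomerName
    FROM RESTORESERVER.RestoredDB.dbo.Customers   -- four-part name through the linked server
    WHERE CustomerID BETWEEN 1000 AND 2000;       -- just the rows you need back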

However, if you’re restoring tables (not databases) over 10GB, you’ll probably want to do the restores directly on the production server to make the data copies faster. Just make sure you’re extremely careful with the scripting and the database names – we don’t want to restore over the top of your working production database.

This may require adding extra space to the production server. In one emergency, I freed up the necessary space by shrinking all of TempDB’s data and log files down to just 1MB. TempDB was on fast drives, perfect for a one-time emergency restore, and that particular server didn’t have any other activity happening due to the outage. We’re not always so lucky, but it helps to think out of the box like that.

A word of warning: if referential integrity is involved, like if you’re trying to restore tables that have relationships to other tables that you’re NOT restoring, then you can be in for a world of hurt here. We’re not going to cover that scenario – it really is different in each case.

Doing the Restore

Big picture, your restore process goes like this (there’s a T-SQL sketch after the list):

  1. Restore the most recent good full backup
  2. Restore the most recent good differential backup
    (you don’t have to restore all of the differentials)
  3. Restore the transaction log backups since that differential, in order
  4. Run RESTORE WITH RECOVERY to bring the database online
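In T-SQL, the whole sequence looks roughly like this – database name, file names, and paths are placeholders:

    RESTORE DATABASE SalesDB FROM DISK = N'D:\Backups\SalesDB_Full.bak' WITH NORECOVERY;
    RESTORE DATABASE SalesDB FROM DISK = N'D:\Backups\SalesDB_Diff.bak' WITH NORECOVERY;
    RESTORE LOG SalesDB FROM DISK = N'D:\Backups\SalesDB_Log_1.trn' WITH NORECOVERY;
    RESTORE LOG SalesDB FROM DISK = N'D:\Backups\SalesDB_Log_2.trn' WITH NORECOVERY;
    RESTORE DATABASE SalesDB WITH RECOVERY;   -- step 4: bring it online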

As you go through steps 1-3, make sure you use the WITH NORECOVERY option as you do restores. This leaves the database in a restoring state so that you can continue to apply additional backups to it. If you forget those two key words or accidentally use WITH RECOVERY, your restore is done at that point – you can’t apply additional backups. If you get that wrong, your restore has to start over again from scratch, so please, for the love of all that’s holy, double-check that option before you start the restore.

Doing all of this with the GUI sucks. The more backups you have, the longer this takes, and the more likely you are to run into errors. Instead, what you need is a script that looks at all of the backups in a folder, plucks out the most recent relevant files, and restores them for you automatically, in order, and that’s where sp_DatabaseRestore comes in.
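A typical call looks something like this – the paths are placeholders, and the parameter list can change over time, so check the documentation in the script’s header:

    EXEC sp_DatabaseRestore
         @Database = N'SalesDB',
         @BackupPathFull = N'\\FILESERVER\Backups\SERVER1\SalesDB\FULL\',
         @BackupPathLog  = N'\\FILESERVER\Backups\SERVER1\SalesDB\LOG\',
         @ContinueLogs = 0,
         @RunRecovery = 1;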

Designing Backups for Faster Restores

sp_DatabaseRestore requires the database backups to be laid out a particular way in advance, though. It’s designed to work best with Ola Hallengren’s free maintenance scripts, which can handle backups, checking for corruption, doing index maintenance, and more.

In the past, you’ve probably used the built-in maintenance plans for backups. Those aren’t a bad way to start – I mean, compared to the alternative of not having backups at all – but now that you’re going through a full blown DBA training plan, it’s time to graduate to something more powerful. Ola’s scripts are much more flexible and capable than maintenance plans, even integrating with 3rd party backup apps your sysadmin team might be using.

Your homework for this episode:

In the next episode, we’ll think about the implications that these things have on the way we do our backups – how often we back up, how we approach failovers, how we do disaster recovery for the cloud, and more.

And let’s pause here for just a moment to notice: isn’t it cool that all of the stuff above is open source? It’s all totally free, built by other members of the database community just like you. During the course of this training plan, we’re going to be referring to a lot of community & open source stuff. As you work with it, think about joining the communities for each of those tools to meet other folks like you and start expanding your career horizons.

If you’re a member of our Recorded Class Season Pass and you want to watch the training videos on this stuff, here are your next steps for learning:

In the next post, we’ll talk about why you probably won’t want to restore the master database.

DBA Training Plan 1: Build a Server Inventory

At your company, walk into the VP of Sales’s office and ask them how many salespeople they have.

NO, I mean, don’t actually DO that, because he’s going to ask you why the sales app is so slow. But I mean, imagine if you COULD walk into his office and ask him that. I bet he would have an instant answer. He wouldn’t wait for a single moment. Or walk into the CEO’s office and ask how many employees he has. Or ask the CFO how much the annual budget is.

My point is that when you’re in charge, you need to know exactly what you’re in charge of.

Make a Spreadsheet Inventory

Let’s start by making a spreadsheet. Across the top, make columns for:

  • SQL Server Version (2017, 2016, Azure SQL DB… well, maybe 2008 too, sadly)
  • Edition (Standard, Enterprise, Developer)
  • Environment (Production, QA, development, disaster recovery)
  • Department (sales, HR, accounting, IT, mixed use)
  • Business Users Affected (list of people to email when the server dies)
  • Application Names (internal or external product names)
  • Plan B

That last column gets a little tricky – it means, if this server dies in a fire, what’s our Plan B? Are we going to restore the databases from another server? Will we fail over to a log shipped copy? Or will we update our resume and head out for an early lunch? As we go farther into the training, we’re going to get much more specific about Plan B.

There are no wrong answers here: this step is about understanding where we’re at today, not where we’d like to be. We’re never where we’d like to be. (Me personally, I’d like to be at a poolside bar right now, but noooo.)

If you’d like to get ambitious, add additional columns for Core Count, CPU Count, and Memory. The core and CPU counts will get you a head start on licensing, although I have to confess that we’re not going to cover licensing as part of our training plan. To get even more ambitious, read the comments on this post about what should be in your DBA inventory.

What We’ll Do With This Spreadsheet

Right now, you probably sleep well at night thinking you know everything that’s happening in these servers. Hoooweee, have I got bad news for you. Over the next six months, we’re going to progressively add more and more columns to this spreadsheet as we learn more about our environment, uncover problems, and learn how to solve them.

For bonus points, add a column for What Scares Me. Write a quick note about the one thing that scares you most about this server. Maybe it’s blocking problems, maybe it’s the failing jobs, maybe it’s code you don’t understand. Six months from now, I bet you’ll be proud of how this column has changed.

How to Survey Your Network for Servers

Put a row in the spreadsheet for every server you have – whether you’re in charge of it or not. We want to start with a good inventory of what we have.

The easiest way – if you’re just starting at a company and you have no idea what the network holds, check out the free Microsoft Assessment and Planning Toolkit. It’s designed for licensing compliance, but it works great for building server inventories. It scans your network looking for whatever programs you pick – just confine it to SQL Server.

The PowerShell way – if you’re into PowerShell, Kendal Van Dyke wrote the open source SQL Power Doc to discover instances on the network and export the list to Excel. He since joined Microsoft and the tool hasn’t been updated much, but it may be enough to get the job done if you’re comfortable with PoSH.

The GUI way – Idera’s SQL Inventory Manager ain’t cheap, and I haven’t used it myself, but if you want a sealed-box-product tool to do it, there you go.

Got another way? If you’ve got a way that you’ve used successfully, leave a comment so we can add it to the list for other readers. Bonus points if it’s free, open source, and/or super-easy to use.

In the next post, we’ll use this inventory spreadsheet to assess how bad of shape we’re in.

Welcome to the DBA Training Plan.

Foreword: 8 years ago, I launched an email series with a 6-Month DBA Training Plan. I sent one email per week, bringing people up to speed on the most important topics that nobody taught ’em along the way. It’s hard to believe it’s been 8 years! This month, I’m revisiting the emails, updating their content, and publishing ’em as blog posts too to make ’em easier to find. Buckle up: here come 24 straight blog posts to take you from zero to…well, at least a hero who’s smart enough to wear the underpants on the inside.

I learned to be a DBA the hard way. The hard, crappy way. Our SQL Server was in trouble, and I was the kind of person who would roll up my sleeves and figure out whatever was broken. Next thing you know, I was the one responsible for managing it.

And boy, did that suck.

I didn’t know about any free videos or blogs or e-books. I didn’t have the budget to go to a class, and even if I did, I didn’t know where to find a good one. DBA wasn’t in my job title, and it wouldn’t be for years.

I want to make your learning experience much better than mine was.

Every Wednesday for the next six months, you’re going to get an email introducing an important DBA topic. We’ll cover the basics of the topic, plus link to our favorite free training material on that topic.

Your journey will start with Ozar’s Hierarchy of Database Needs – like Maslow’s Hierarchy of Needs, but it’s more about databases than pooping. (Although we’ll probably sneak in a few poop jokes.)

In the next six months, we’ll take you from the bottom of the pyramid up to the top. You may not be able to fix everything in your environment during those six months, but at least you’ll understand the work involved and how to confidently get started. It’s all about making your journey to Professional Database Administrator easier than ours was.

If You Have Questions

Oh, and you will! When you have questions about what you’re reading, start by Googling your questions. It sounds obvious, but you’d be amazed at how much good stuff there is out there to help. (I’m not being sarcastic. This is exactly how we get started whenever we have our own questions.)

If you’d like to post a question, try DBA.StackExchange.com or SQLServerCentral’s forums. Yes, both of these require registration, but they’re totally worth it. On both of these sites, there are hundreds – sometimes thousands – of people who are itching to help answer your questions. They react fast, too – make sure to go back and revisit your question every 10-15 minutes for the first few hours to see what’s happening. Answer their clarification questions, and include as much detail as you can. For more instructions, read Getting Help with a Slow Query.

If you still can’t get the answers you need, hit Reply to any of our emails. They come from a real email address manned by, uh, me – Brent. This isn’t one of those emails where it says, “Don’t hit respond because nobody cares.” Seriously, I care, and that’s why I put these emails together. Just please don’t use that as your FIRST resort – there are only so many hours per week that I can spend answering questions. By using the above methods first, you’ll be able to leverage the whole community’s expertise instead of just me. I’m all about teaching you how to fish.

On to Part 1: Building a Server Inventory

Training Class Update: Fancier Audio/Video Setup

I’m constantly looking for ways to raise my game, so I made some investments this month to take my video class audio & video to the next level.

I’m now doing green screen effects on the recordings. I picked up an Elgato retractable green screen that mounts to the wall so that I can pull it down when I’m doing a training class, and roll it back up when I’m doing normal client work:

Home office 2019

I’d used floor-standing green screens in the past, but they were really frustrating for a lot of reasons: wrinkly, tough to get crisp background separation, had to light them separately, took up a lot of space on tripods, etc. Because of the hassles, I abandoned it for several years. This Elgato one totally pays off:

Old setup:

New hotness:

Second, you’ll notice that the recording doesn’t have a big microphone in the way. That’s because I’ve switched to a Sennheiser wireless lapel microphone hooked up to my computer via a Focusrite Clarett USB interface.

The lapel microphone picks up more room noise, so I’ve also added the iZotope RX 7 audio plugin to my production process to clean up the audio. It integrates well with Telestream Screenflow, the screen capture tool that I use, kinda like Camtasia. (Screenflow is a little more polished than Camtasia in that it allows for things like these 3rd party audio plugins.)

Between the chroma key (green screen) effect and the audio plugin, though, my CPU goes up to 100% during video editing as I scrub around the timeline, editing stuff, and the fan kicks off. I try to edit every class’s videos together during the bio breaks so that the day’s training videos are encoded/uploaded as soon as class finishes. This personal goal keeps my editing timeline tight, minimizing the amount of effects & things that I can add. I’m riiiight on the edge of what I’m comfortable doing with laptop power – I’m only on a 2018 6-core MacBook Pro, and the newer ones are calling to me. I’m waiting to see what the rumored 16″ version looks like.

The green screen effects and audio clean-up only happen in the recordings: live attendees see the green screen behind me, and the audio isn’t quite as crisp (since the cleanup happens in post-processing.) The next level-up will be figuring out how to run those effects live. Streamers do the chroma key thing with OBS, and I’m starting to privately experiment with some live streaming of coding, so we’ll see how that goes. OBS is great for public streams, but I need to figure out how to make it work well with private invite-only training classes.

Got a Live Class Season Pass or Team Membership? You can start watching the latest Mastering Query Tuning class’s fancy recording now, and the recording of today’s Performance Tuning in 21 Demos class will be up this weekend. Enjoy!

[Video] Help! SQL Server 2008 is Still Here!

Your company is ignoring the news: SQL Server 2008 and 2008R2 are officially out of support as of today, but nothing’s changing at your company. You still have SQL Server 2008 in production, and you’re a little nervous. How should you approach the conversations with management? Brent Ozar will help: he’s been there too, fighting managers who want to leave an old server hiding in the corner. We’ll role play: you be the manager, and Brent will show his techniques for convincing management to take the plunge and move to a newer SQL Server version.

Slide deck:

Using Implicit Transactions? You *Really* Need RCSI.

Implicit transactions are a hell of a bad idea in SQL Server: they require you to micromanage your transactions, staying on top of every single thing in code. If you miss just one little DELETE/UPDATE/INSERT operation and don’t commit it quickly enough, you can have a blocking firestorm.
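Here’s the kind of thing that bites you – using the Stack Overflow Users table as an example:

    SET IMPLICIT_TRANSACTIONS ON;

    UPDATE dbo.Users
       SET Reputation = Reputation + 1
     WHERE Id = 22656;   -- any single row will do

    -- No COMMIT yet: SQL Server quietly opened a transaction for that update,
    -- and its locks are still held. Everyone else who needs that row is now
    -- blocked until this session gets around to committing.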

The ideal answer is to stop using implicit transactions, and only ask for transactions when you truly need ’em. (Odds are, you don’t really need ’em.)

But if you can’t change the code, you need the delicious goodness of Read Committed Snapshot Isolation. Here’s a video demo showing why:

You can learn more about RCSI, its gotchas, and how to implement it here.
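Turning it on is a one-liner, but do it in a quiet window – the ROLLBACK IMMEDIATE option kicks out any open transactions:

    ALTER DATABASE YourDatabase                 -- placeholder name
        SET READ_COMMITTED_SNAPSHOT ON
        WITH ROLLBACK IMMEDIATE;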

SQL Server 2008 and R2 Support Ends Tomorrow. Let’s Talk About That.

Yes, literally, extended support ends tomorrow.

  1. Take this 2-question poll about the oldest and newest versions you’re running
  2. Read the poll results
  3. Register for my DBA Fundamentals webcast tomorrow about 2008, and we’ll talk about the poll and the results

I’ve got a pretty good idea of what the poll results are going to look like, and it won’t be pretty. During my conference sessions, I often ask for a show of hands about the oldest version that people are still supporting. There are usually a *lot* more people running SQL Server 2008 & R2 in production than are running the most recent version.

I understand that there are a lot of reasons why 2008 & R2 are still here:

  • SQL Server just works – like anything in IT, it works until it doesn’t. As long as it’s still working, companies are hesitant to spend money on it.
  • SQL Server licenses are expensive – companies often bought it, but then didn’t buy Software Assurance because they didn’t have any plans to upgrade anytime soon.
  • Migrations are expensive – involving planning, prep work, and downtime.
  • Testing is expensive, too – developers aren’t sure if the new version will break their application. When given the choice between paying for testing SQL Server 2017 versus paying for development of new application features, users often pick the latter.
  • Change equals risk – if we’re not testing the app, there’s a risk that it’ll break on a newer version of SQL Server. (I think that risk is exceedingly small as long as you stick to the older compatibility level, but there’s an educational cost to get users to understand that.)

And for me, maybe the biggest reason is that database administrators believe that they’re powerless to change the business’s mind. I think you’re wrong: I think you have a little bit more power than you’re aware, and in the DBA Fundamentals webcast, I’ll talk about how you can exercise some of that power.

Poll Results: Yep, Developers are Using Unmasked Production Data.

Earlier this week, I asked where your developers are getting their development databases from. The poll results are in:

  • 9% of development happens on the production database server
  • 57% are copying the production data to another server, and using it as-is
  • 31% are copying the production data, but then masking/scrambling private data before developers get access to it
  • 25% are using made-up data for development

The totals add up to over 100% because y’all were allowed to pick multiple responses when development was happening in more than one way in your organization.

To put it another way, 2/3 of the time, developers are seeing the production data as-is.

You left comments, too.

And I got a chuckle out of a lot of these, so I’m putting in my favorites verbatim. Based on the comments, you can probably guess what their answer was:

  • A Dev environment would be nice
  • Also, it’s on their own workstation.
  • and it pains me so.
  • And then we have problems with the test DB being tiny and developers missing needed indexes.
  • Application Development is a copy of production. Reporting teams develop directly in production environment.
  • But the data is of poor quality!
  • data is generally not copied from production database, but is loaded via etl from production sources
  • Depends on what i am doing, if it’s major then i will work on a copy of prod on my dev server but most of the time it’s just straight prod. I will caveat this by saying i’m a one man band here, doing dev and dba in a small company.
  • dev databases use to be replicas which caused devs to come up with stupid work arounds to deal with the possible overwrites and sometimes promote code to production that queried both dev and prod – or sometimes write dev code to production to replicate it down. am forcing them to use stale data now that is only updated ad hoc after a request, and the response to requests to get the data updated is usually “no.”
  • Developers are also Tier 2 & 3 support.
  • Development databases that import data from 3rd party databases, which in turn may or may not have sensitive data deleted/masked.
  • Gonna get better, soon. Boy, the users will hate it. Screw ’em. Growth hurts.
  • hooooooooooooooooooooooooooooboy
  • How do we remove the data but keep the data?
  • How does one get the devs to work in test after they’re all working directly in prod? I just started here…
  • I don’t always develop, but when I do, it’s in production. 😉
  • I try to keep my working development data sounding at least vaguely realistic, Pete’s Pretzels, Cyndi’s Cinnamon, Bill’s Barley, etc. so I can tell from names what relationships there are between different data items as I see them, and if a client drops by, it won’t be terribly embarrassing, and they won’t be confused by “Company 1” and “Company Xyz” that makes “Product 1” and “Product Test 3” and so forth. This way, too, I can be sure that no live data accidentally gets into my local database, and if I have to set the flags on it to pretend to be “production”, for testing, emails will never inadvertently sent to real clients. The downside is I can never generate more than a tiny fraction of the volume of data we have in production, so troubleshooting performance issues requires working in production.
  • It depends! if its a brand new app then no live data exists. if its to fix a bug related to specific data then a copy of prod – that may or may not get deleted
  • It is a copy of production data but we do not store sensitive data in our production database.
  • It’s a copy of Production from before the days when most of today’s functionality existed, therefore it has a lot of made up data and test orders created by developers and QA personnel.
  • it’s a pain in the neck having to generate so much synthetic dev data (c;
  • Its where the best data is 😉
  • Masked prod data is useful but too large for Dev env. Prod level data is tested at the prod support deploy stage
  • moving to dev db with make up contents soon
  • No sensitive data in production database
  • Not refreshed on a schedule so very out of date.
  • Oh – how we laughed!?!?!?!
  • Our client’s production data is free from PII, so that is not a major concern. We do try to use updated copies of production data whenever possible. In our case it doesn’t need to be up-to-the minute accurate data – it can be many months old and still meet our development and testing needs.
  • Production = a database per customer so there is no copying a single production database
  • Production copies with masked data are only used when needed to debug complex issues which are data dependent and unable to be reproduced on our standard development data.
  • Sensitive data remains unchanged, and devs are given full read access. Go figure.
  • Small development team; no sensitive data
  • small team — devs == operational staff
  • so much prod data in violation of internal security policy as well as good practices.
  • Some sensitive data is removed but not all.
  • Sometimes a combination, depending on how much the developer is trusted
  • Sometimes some of the above, but mostly a dev database on a dev server, but with data imported as per prod import process. Sometimes obfuscated, depending on data and client.
  • Switch between full prod backup for debugging and blank with only reference data for unit/integration testing
  • The “made-up” contents may mimic the patterns we see in production (e.g. some records may have only minimum fields populated, the description fields may have really large HTML comments copied from a webpage or an E-mail, etc), but are fully synthesized. Our legacy applications used restored copies of production data, but anything new that we have developed in the last 5-6yrs use synthesized data.
  • The data needs to match across multiple data sets
  • The database is updated once a year, developers will add their own test data as well.
  • the Dev/UAT/SIT databases can’t be rolled back at all due to interaction with other servers!!
  • The production database, in 2–3 different schemas that mirror the production tables structure. Don’t ask!
  • There’s almost no sensitive data in the production database
  • They actually all work on a local instance. We are trying to change that to a centralized development database. There was no data team here before!
  • Trying to go to a model where Dev is Prod with sensitive data removed.
  • usually sensitive data is not stored in my databases and does not make it into the SQL production databases.
  • we are trying to move to masking sensitive data
  • We do sometimes run production data through the development database. Purging it when testing is done.
  • We have a single database structure used by 150+ different companies with very different data profiles, so using a copy of live data is a must in some circumstances. If we just had made up test data, we could easily miss potential problems. We only have 4 developers in the company, one of whom is the boss and 2 of us do db development/dba work as well as code development, so there is a fair bit of cross over of job roles.
  • We have Dev + QA. So they work in Dev which is “A development database with made-up contents, not restored from production” and test in QA “A development database copied from production”. Right at this moment i am working on a FULL refresh of the QA system from production. All databases with production data.
  • We have development databases on a “production” server – it mostly just houses non-client (internal use) databases. Increasingly, our developers are using Docker with a database with fake contents in a Docker image.
  • We have multiple “development” environments for the various phases of development (QA, UAT, Staging) and the later environments are copies of production with sensitive data removed.
  • We really do not contain sensitive data in our SQL Server environment. However if that changes we would mask the sensitive data.
  • We require financial info to be tied back to it’s respective client, employees and other reporting systems. Masked data would unfortunately be nonsense for our testers.
  • We want to go to the made-up content, but the developers don’t take the time to define the test case data.
  • You skipped, “development database on prod server”

These are all good comments about hard problems.

I wish there was an easy, free, quick answer to solve these problems.

There isn’t one: the answers involve process changes and extra labor. I get the feeling this situation is going to continue for a decade or more. I salute those of you who are fighting the good fight to keep data safe from breaches and stolen backups. It ain’t easy.

Updated First Responder Kit and Consultant Toolkit for July 2019

Nothing too dramatic, mostly bug fixes this month. Been pretty quiet on the development front the last couple of months. Some stability is nice around here, eh?

To get the new version:

Consultant Toolkit Changes

  • Improvement: the sp_BlitzCache recent compilations tab is now populated first to make sure that it shows user queries, not queries from the Consultant Toolkit itself.
  • Improvement: if deep dive is turned on, also turn on sp_BlitzIndex’s @BringThePain = 1 so it still gathers data when the server has more than 50 databases.
  • Fix: on the Plan Cache tab, the contents of column G are now accurate. (They were being overwritten by the contents of column H before.)

sp_Blitz Changes

  • Improvement: easier troubleshooting when implicit transactions are running. (#2042)
  • Improvement: @OutputType = ‘XML’ will now export the entire result set as a single XML field. (#2047, thanks Rich Benner.) See the usage sketch after this list.
  • Fix: fixed link typo on alerts URL. (#2055, thanks Rich Benner.)
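
If you’re curious what that new option looks like in practice, here’s a minimal sketch – only the @OutputType parameter comes from the changelog above, and the rest is a plain default call:

-- Minimal sketch: return sp_Blitz's findings as one XML value
-- instead of the usual row-per-finding result set.
EXEC dbo.sp_Blitz @OutputType = 'XML';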

sp_BlitzCache Changes

  • Fix: excludes readable secondaries on 2016 by default. Otherwise, it was throwing an error when the database wasn’t available without specifying read-only intent. (We could probably improve this code to check for that as well.) (#2027, thanks sivey42.) Note that Adrian Buckman is still having issues with logging sp_BlitzCache to table on a readable secondary, and he’s tracking that down in issue #2072 if you want to give him a hand over there.
  • Fix: many display fields are now rounded to ints (rather than formatting with the money datatype.) An example of the output is below – the top is before, below is after, showing the new rounded numbers. (#2046, thanks Ian Manton.)

sp_BlitzFirst Changes

  • Improvement: easier troubleshooting when implicit transactions are running, and now only shows lead blockers. (#2042)

sp_DatabaseRestore Changes

  • Fix: when @BackupPathFull was null, no transaction log backups were being applied. (#2036, thanks Troy Jennings for the bug report & code.) There’s a usage sketch below for context.
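
For context, here’s a hedged sketch of the scenario that fix covers – restore the most recent full backup, then roll the log backups forward. The database name and paths are made up for illustration:

-- Hypothetical example: restore the newest full backup from one folder,
-- then apply every transaction log backup from another.
EXEC dbo.sp_DatabaseRestore
    @Database       = N'StackOverflow2013',
    @BackupPathFull = N'\\backupserver\sql\StackOverflow2013\FULL\',
    @BackupPathLog  = N'\\backupserver\sql\StackOverflow2013\LOG\',
    @ContinueLogs   = 0,   -- start from the full backup, then roll logs forward
    @RunRecovery    = 1;   -- bring the database online when finished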

sp_ineachdb Changes

  • Improvement: compatibility with 2008 and R2, and with VMware snapshots where @@SERVERNAME returns the wrong name. (#2061, thanks sm8680.)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes us!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.

Yet Another Way Missing Index Requests are Misleading

Graduates of my Mastering Index Tuning class will already be familiar with the handful of ways the missing index DMVs and plan suggestions are just utterly insane. Let’s add another oddity to the mix: the usage counts aren’t necessarily correct, either. To prove it, let’s take MattDM’s Stack Exchange query, “Who brings in the crowds?” He’s querying to see which users’ questions bring in the most new views:
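
The real query lives on the Stack Exchange Data Explorer; here’s my rough approximation of the idea (not MattDM’s exact SQL) so you can see the shape of it:

-- Rough approximation only: which users' questions have pulled in the most views?
SELECT TOP 100
       u.Id,
       u.DisplayName,
       SUM(p.ViewCount) AS TotalViews
FROM dbo.Posts AS p
INNER JOIN dbo.Users AS u
    ON u.Id = p.OwnerUserId
WHERE p.PostTypeId = 1                 -- questions only
  AND p.CommunityOwnedDate IS NULL     -- skip community wiki posts
GROUP BY u.Id, u.DisplayName
ORDER BY TotalViews DESC;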

When I run that against the Stack Overflow database without any indexes on the Posts table, SQL Server 2017 CU15 (the latest) helpfully pipes up with a missing index request.

Make that two missing index requests.

And they’re identical twins:

Identical twins

HEY BUDDY!

In my head, I hear Clippy saying:

“HEY BUDDY! IT LOOKS LIKE YOU’RE TRYING TO QUERY BY COMMUNITY OWNED DATE AND POST TYPE ID! YOU KNOW WHAT WOULD MAKE YOUR QUERY 97.73% FASTER? AN INDEX ON THOSE FIELDS!”

“OH AND YOU KNOW WHAT…”

“WHILE YOU’RE AT IT, YOU KNOW WHAT WOULD MAKE YOUR QUERY 98.1201% FASTER? AN INDEX ON COMMUNITY OWNED DATE AND POST TYPE ID! DOESN’T THAT SOUND AMAZING? WHAT DO YOU MEAN YOU’VE HEARD THAT BEFORE?”

This artificially inflates the missing index DMVs.

When this query runs just once, sys.dm_db_missing_index_details records not one, but two requests for this same index, so sp_BlitzIndex looks like this:

sp_BlitzIndex output

The query only ran once – but it’s showing 2 uses for that missing index. <sigh>
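
If you want to check the raw numbers on your own server instead of taking my screenshot’s word for it, this is the usual way to join the missing index DMVs – a sketch, not sp_BlitzIndex’s exact query:

-- A sketch of the raw missing index DMVs: watch how the same
-- equality/inequality column combo can appear more than once, and how
-- user_seeks climbs even when the query only ran a single time.
SELECT d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       gs.unique_compiles,
       gs.user_seeks,
       gs.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
INNER JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats AS gs
    ON gs.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY gs.user_seeks DESC;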

There are parts of me that would love to see inside SQL Server’s source code to know how this kind of thing happens. The other parts of me are glad that I don’t know.
