
How to Get Your Very First Full Time DBA Job

It’s about time.

Step 1: get a job as a developer or a sysadmin.

Step 2: stand near the database.

Step 3: wait.

Almost every DBA I know got their start this way. Sooner or later, something went wrong with the database, and management grabbed the nearest person and yelled, “You! You look smart and helpful! Can you figure out how to get this thing back online?”

This person then spent more and more time doing stuff related to the database until they began calling themselves an Accidental DBA. At that point, the acronym DBA still wasn’t on their email signature, and they still didn’t feel comfortable using that acronym by itself. They were still a developer or sysadmin.

But wait a little longer, and that person’s next job – either at the same company, or another – was full time DBA.

There are no shortcuts. Here’s why.

Almost everyone who isn’t a DBA thinks they can somehow shortcut this process. They think they can present a near-empty resume and say, “I’ve learned a lot about databases by studying them and playing around with them. Give me a job managing your databases.”

Thing is, when companies have enough databases to hire a full time DBA, they’re saying, “Our data is valuable enough to warrant a full time professional watching over it.” That means they don’t want someone without real-world experience. After all, if they have that many databases, they likely also have a lot of developers and/or sysadmins, and those folks have likely already been filling the Accidental DBA role. The company isn’t likely to hire you in as a full time DBA if they have people on staff with more real-world experience than you.

Certifications aren’t going to get you there, either, because the Internet is chock full of braindumps – sample questions and answers that are sometimes from the actual tests themselves. Seasoned IT managers understand that certifications are icing, not cake – they don’t prove that a candidate can do a job. Certification tests aren’t even related to the job tasks that IT workers perform, as I ranted about on the Away From the Keyboard podcast.

If you want a DBA job, start with the job you have.

Stand near the database server. Look smart. Be helpful.

And above all, remember that the people who will hire you for your next DBA job are already around you.


Cheat Sheet: How to Configure TempDB for Microsoft SQL Server

SQL Server, TempDB
105 Comments

The short version: configure one volume/drive for TempDB. Divide the total space by 9, and that’s your size number. Create 8 equally sized data files and one log file, each that size. Presto, the drive is full and your TempDB is configured for easy performance.

The long version is a little more complicated.

If you’re on SQL Server 2012, get on SP1 Cumulative Update 10 or newer. Microsoft made performance improvements to SQL Server 2014 so that it writes to TempDB less frequently. Because you were so good this year, they even backported it to SQL Server 2012, too. Get on this level – and by this level, I mean the latest updates shown on SQLServerUpdates.com.

Create a volume/drive/array for TempDB. You want this to be a separate volume from your user database & log files – not just for performance reasons, but also for logistical management reasons. Any user can dump a ton of stuff into TempDB without your control, and they can even run TempDB out of space. If TempDB lives on the same volume as your user data and log files, and the drive runs out of space, you can have a tough time restarting SQL Server. However, if TempDB is on its own volume that runs out of space, no big deal – just restart the instance and you’re back in business.

If you’re using a SAN-backed cluster, use local SSDs for TempDB. Since SQL Server 2012, Microsoft has fully supported using local solid state drives even in a cluster. This has three benefits: the local SSD is often cheaper, it’s often faster, and it keeps your SAN path traffic freed up for your valuable user data file & log file reads and writes.

Create 8 equally sized data files. This helps you avoid the GAM/SGAM/PFS page contention issues described in Microsoft KB 2154845. In your journey through the blogosphere, you may have read that at one point you were supposed to create one data file per core. That’s no longer the case – just start with 8, and only add more when you’re facing the page contention issues described in that KB article.

TempDB files need to be sized evenly to avoid page contention.

Size them (and the log file) to fill up the drive. If the drive’s dedicated to TempDB anyway, why leave free space? You don’t want SQL Server pausing user activity to stop to grow the log file out, and you don’t want to have to hassle with possibly uneven data file growth sizes, and you haven’t got time for the pain. Just set it up and walk away with confidence.
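If you want to script it by hand, here’s a minimal sketch, assuming tempdb already lives on a dedicated 90GB drive (nine 10GB files) and the default logical file names; adjust names, paths, and sizes for your own server:

    USE master;
    GO
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 10GB);
    ALTER DATABASE tempdb MODIFY FILE (NAME = templog, SIZE = 10GB);
    ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdb2.ndf', SIZE = 10GB, FILEGROWTH = 0);
    ALTER DATABASE tempdb ADD FILE (NAME = tempdev3, FILENAME = 'T:\TempDB\tempdb3.ndf', SIZE = 10GB, FILEGROWTH = 0);
    -- ...continue adding files through tempdev8 so you end up with 8 equally sized data files.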

Make it easier with this script. Mike Petri created a gist that takes parameters for the size of the drive, number of files you want, and the folder path, and generates the T-SQL for you.

And while you’re setting up the server, check out our free SQL Server Setup Checklist in our First Responder Kit.

Want to learn more? Take my class.

My Fundamentals of TempDB class is for curious folks who want to learn:

  • What uses TempDB, and how that affects performance: not just temp tables and table variables, but also triggers, cursors, sorting indexes, workspace spills, the version store, and more
  • How to host TempDB depending on how you use it: understanding what goes in the log file, the data files, why the number of data files matter, and whether your particular workload should be on SAN, local storage, or ephemeral storage, and why
  • How to monitor it: after you understand what uses it and how to host it, you need to keep tabs on how its performance changes over time using DMVs, Perfmon counters, and the First Responder Kit scripts

Learn more and register now.


How to Talk People Out of the SA Account, Option 2

SQL Server
13 Comments

In my last post, I explained the kind, gentle way to talk people out of the SA account.

Today, I’ll describe a time when Option 1 didn’t work, and I had to get ugly.

Power User: “You can’t change the SA password. It’s embedded everywhere. Everything we have relies on it, and we all use it for all kinds of processes.”

Me: “I see. Alright, well, talk to you soon.”

(The next day)

Power User: “EVERYTHING IS DOWN! THE SA ACCOUNT PASSWORD ISN’T WORKING! DID YOU RESET IT?”

Me: “Of course not. You told me not to.”

Power User: “THEN WHO DID IT?”

Me: “Oh, I have no way of knowing. Anyone who uses the account can change the password with the ALTER LOGIN command. And you said everyone has it, right?”

Power User: “YES, BUT…CHANGE IT BACK. NOW. SET IT TO P@SSW0RD1 THE WAY IT ALWAYS WAS.”

Me: “Sure, but I can’t do that kind of thing without a ticket. Just create a help desk ticket, and I’ll set it to whatever you want.”

(The next day, after the password was reset)

A surprisingly helpful read for administrators

Power User: “EVERYTHING IS DOWN AGAIN! THE SA PASSWORD GOT RESET AGAIN!”

Me: “Oh, my. That’s unfortunate.”

Power User: “YOU ARE DOING THIS AREN’T YOU?”

Me: “No, but I’m worried because at this point, if it’s embedded all over the place, someone might have scripted out the ALTER LOGIN command. They might be doing it programmatically. It might happen again, at any time. Right when you least expect it.”

Power User: “I HATE YOU!”

Me: “Well, we certainly can’t have that. How about I give you a nice, new, separate account of your own, not SA, and you start using that instead? No one else will have the password but you, and that way you won’t have to worry about anyone else changing it.”

Power User: “NO, I JUST WANT YOU TO RESET THE….ACTUALLY, YES, I SEE WHAT YOU DID THERE. GIVE ME MY OWN ACCOUNT.”


How to Talk People Out of the SA Account, Option 1

SQL Server
29 Comments

So you’ve got a bunch of people using the SA account all over the place. It’s hard-coded into connection strings, embedded in linked servers, and the password is on post-it notes throughout the data center.

Step 1: Create an alternative SA account.

In a perfect world, you’d create a minimally-privileged AD login that only has limited access to specific databases.

However, when you’ve got a toddler running with scissors and razors, sometimes you’re happy just to get the razors out of their hands first, and then you’ll work on the scissors next. One step at a time. Preferably not running.

For now, create another SQL account with DBO permissions on all of the databases involved with the application. (If you’re dealing with multiple different tenants on the same server, give them each their own SQL login.) Let them be complete owners of their databases for now.
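A rough sketch, with made-up names (SQL Server 2012 syntax; use sp_addrolemember on older versions):

    CREATE LOGIN [AppThing_Login] WITH PASSWORD = 'UseALongUniquePassphraseHere!';
    GO
    USE [AppThingDB];
    GO
    CREATE USER [AppThing_Login] FOR LOGIN [AppThing_Login];
    ALTER ROLE db_owner ADD MEMBER [AppThing_Login];  -- owner of this database only, not the server
    -- Repeat per database (or per tenant) as needed.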

Step 2: Explain your concerns to the stakeholders.

If several different departments have databases on the same server, and the SA account is being used, email all of the departments to explain what’s going on:

Hi! Your application, ___, lives on SQL Server ___. During security checks, we discovered that ___ uses a highly privileged account. This means they can drop your database, delete your data, shut down the SQL Server, or leak all of the data to the public.

I’m totally sure they would never do this on purpose – but I’m worried about accidents. I want you to be aware that as long as they’re using this account, there is nothing I can do to prevent accidents like that.

If you’re not okay with that risk, I’ve got an easy plan to secure the server with no disruption to the end users. Are you interested in hearing more about that?

Most stakeholders will hit the roof when they read something like this, and they’ll become your political allies to help get things done. They’ll practically demand that the other users let go of the SA account – but you need to be ready with the next step.

Step 3: Give the users the new account – and an SA expiration date.

Armed with the political support of your stakeholders, send an email to the SA-equipped users, and copy your stakeholders:

Hi! We’re going to be changing the SA account’s password on YYYY/MM/DD. To prepare, here’s a new account that you can use for your applications: username ___, password ___. That account has full owner permissions on all of your databases.

I know changing your connection strings will take some time, so we made sure to set that SA date two months from now to give you plenty of time to prepare. On the password change date, I’ll be right here in my office for four hours after the change to help you troubleshoot any problems that arise.

Inevitably, you’re going to get grumbles from the users who have to make the change. They may even say they have to be system admin on their SQL Server. Be prepared to offer them options:

  1. Postpone the transition until YYYY/MM/DD, or
  2. Move to your own SQL Server, thereby avoiding the risk for the other stakeholders, or
  3. Get the other stakeholders to sign off that they’re okay with the continued use of SA (but at the same time, they’re also okay with me no longer being on call to support this server, because I can’t support servers where non-DBAs have SA permissions)

That last one takes the most backbone – you have to stand up for yourself. There’s absolutely nothing wrong with the business letting lots of people be SA on the server – as long as they’re in the on-call rotation with me.

Step 4: Only people in the on-call rotation get SA access.

Facepalm Jesus reacts to your SA permissions request

Not along with me – no, in the rotation.

I’ll gladly take more people in the SQL Server on-call rotation with me. And I’m perfectly willing to present that to the stakeholders. The conversation inevitably goes like this:

Me: “OK, cool, so you’ll be in the rotation then? You’re on next week, and I’ll be off. If jobs fail, backups fail, cluster goes down, etc, it’s on you.”

Developer: “No, wait, what? No, I don’t wanna be on call for SQL Server. I’m a developer – I don’t know how to troubleshoot that stuff.”

Me: “No? Are you saying you don’t know how to manage the database? But you want the rights to do it?”

Stakeholder: “Yeah, no, let’s take that SA permission away then.”

The conversation doesn’t always go that way, and in my next post, I explain Option 2.


sp_Blitz® v44: Reorganized and Reprioritized Results.

SQL Server
25 Comments

Angie Walker, our new Triage Specialist, had the fun experience of sitting in and watching several of our SQL Critical Care® engagements. When I asked her if she noticed any patterns or ways we could improve, she said, “The order of sp_Blitz’s output needs to be reworked for 2016. There’s some stuff at the top that isn’t really high-priority anymore, and there’s stuff that used to be low priority but is now pretty important.”

She was totally right – and that’s what I focused on for this month’s version.

We moved some of the findings into Monitoring, some into File Configuration, set less ambitious thresholds for storage, and removed some things that just don’t matter as much these days.

Our goal is to give you a really actionable set of findings – when you run sp_Blitz®, I want you to be able to say, “Here’s the next things I need to go work on fixing in my environment.”

Grab it and the newly updated sp_BlitzFirst® in our First Responder Kit.


Meet Our Next Consultant: Tara Kizer

Company News
21 Comments
Would you buy a used database from this cartoon?

Tara Kizer is starting a new job today, and I’m proud to say that it’s with Brent Ozar Unlimited.

You might know Tara from:

She has two decades of experience working with SQL Server. I’ll let that sink in for a moment. Two decades. I don’t know if this is true, but I heard that when Santa migrated his naughty and nice lists into a database, Tara did the design.

Brent: Welcome to the team! It’s great to have you here. Twenty years with SQL Server – wow, that’s awesome. It sounds like you’ve found your calling. What is it that calls to you about databases?

Tara: It was a natural fit. When I was a student worker in college doing paperwork, a friend got me in touch with a manager in the IT department. The manager was in charge of the database team, and they were looking for a new student worker. He hired me with zero experience. He liked my attitude and how I presented myself. After I graduated from college, they hired me full time, and the rest is history.

Brent: You’ve been giving back to the community for quite a while too. How’d you get started, and what keeps you going?

Tara: I had some free time at a new job and was eager to learn more about SQL Server. I decided I would start answering questions on the Internet and would seek out questions that I didn’t yet know the answer to. It grew into a passion and later extended into a blog.

Brent: When you’re not working with SQL Server, what are you up to?

Tara: When I’m not working, I am spending time with my family or hiking. Sometimes if I’m lucky, the family comes hiking with me. I also like to camp and run. I love the outdoors.


Still Serial After All These Years

SQL Server
30 Comments

With each new version of SQL comes a slew of new stuff

Some changes are cosmetic, others bewildering, and the rest fall somewhere between “who cares about JSON?” and “OH MY GOD TAKE MY MONEY!” (but not really my money, because I only buy Developer Edition). Aaron Bertrand has done a better job of finding new features, and teaching you how to find them, than I could. Head over to his blog if you want to dive in.

What I wanted to look at was something much more near and dear to my heart: Parallelism.

(It’s so special I capitalized it.)

In the past, there were a number of things that caused entire plans, or sections of plans, to be serial. Scalar UDFs are probably the first one everyone thinks of. They’re bad. Really bad. They’re so bad that if you define a computed column with a scalar UDF, every query that hits the table will run serially even if you don’t select that column. So, like, don’t do that.
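If you want to see it for yourself, here’s a minimal sketch with made-up table and function names:

    CREATE FUNCTION dbo.SalesTax (@Amount MONEY)
    RETURNS MONEY
    WITH SCHEMABINDING
    AS
    BEGIN
        RETURN @Amount * 0.08;
    END;
    GO
    ALTER TABLE dbo.Orders ADD TaxAmount AS dbo.SalesTax(TotalDue);
    GO
    -- Even though this query never touches TaxAmount, the whole plan runs serially:
    SELECT Id, TotalDue FROM dbo.Orders WHERE TotalDue > 100;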

What else causes perfectly parallel plan performance plotzing?

Total:

Zone:

  • Backwards scan
  • Recursive CTE
  • TOP
  • Aggregate
  • Sequence
  • System tables
  • OUTPUT clause

Other:

  • CLR functions (that perform data access)
  • Dynamic cursors
  • System functions

We’ll just look at the items from the first two lists. The stuff in Other is there because I couldn’t write a CLR function if I had Adam Machanic telling me what to type, cursors don’t need any worse of a reputation, and it would take me a year of Sundays to list every internal function and test it.

I’m going to depart from my normal format a bit, and put all the code at the end. It’s really just a mess of boring SELECT statements. The only thing I should say up front is that I’m leaning heavily on the use of an undocumented Trace Flag: 8649. I use it because it ‘forces’ parallelism by dropping the cost threshold for parallelism to 0 for each query. So if a parallel plan is possible, we’ll get one. Or part of one. You get the idea.

Just, you know, don’t use it in production unless you really know what you’re doing. It’s pretty helpful to use as a developer, on a development server, to figure out why queries aren’t going parallel. Or why they’re partially parallel.
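For reference, the hint looks like this (the table is a stand-in, and the trace flag is undocumented and sysadmin-only):

    SELECT o.Id, o.TotalDue
    FROM dbo.Orders AS o
    WHERE o.TotalDue > 100
    OPTION (QUERYTRACEON 8649);  -- removes the cost barrier to a parallel plan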

All of this was run on 2016 CTP 3.1, so if RTM comes along, and something here is different, that’s why. Of course, backwards scans are probably close to 15 years old, so don’t sit on your thumbs waiting for them to get parallel support.

Backwards scan!

This is what happens when your ORDER BY is the opposite of your index.

Scalar!

Not only do they run serial, but they run once for every row returned. Have a nice day!

Table with computed column

Hint: the Id column isn’t the computed one.

MSTVF

Multi-statement Table Valued Garbage

Table Variable Insert

You probably can’t name one good reason to use a table variable.

Top

Top o’ the nothin’!

Aggregating

You’re making me wanna use Excel, here.

Row Number (or any windowing/ranking function)

You’re gonna get all those row numbers one by one.

Accessing System Tables

Twisted SYS-ter

The recursive part of a recursive CTE

This part was fun. I liked this part.

Picture time is over

Now you get to listen to me prattle on and on about how much money you’re wasting on licensing by having all your queries run serially. Unless you have a SharePoint server; then you have… many other problems. If I had to pick a top three list that I see people falling victim to regularly, it would be:

  1. Scalar functions
  2. Table variables
  3. Unsupported ORDER BYs

They’re all relatively easy items to fix, and by the looks of it, we’ll be fixing them on SQL Server 2016 as well. Maybe Query Store will make that easier.

Thanks for reading!

Das Code


Announcing the Office Hours Podcast (And 78% Off Videos)

SQL Server
5 Comments

Every Wednesday, we get together on a live webcast to take your questions. (We don’t actually give answers, we just admire the questions.) Hundreds of folks attend every week to hear us flail about.

People kept emailing in saying, “I really wanna attend Office Hours, but I’m working/asleep/drunk. Can you turn it into a podcast so I can download it and play it on the way to church/home/therapy?”

You got it.

And if you rate us on iTunes before January 31, 2016, we’ll give you a coupon for 78% off any of our online courses. (Why 78? Because we record the show masters on 78 rpm vinyl. We’re hipster like that.)

Here’s how to rate us and claim your 78% off coupon. Enjoy the podcast!


Does index fill factor affect fragmentation?

Everybody wants to know about index fragmentation

It is an inescapable vortex of malaise and confusion. Like that swamp in The Neverending Story that killed the horse. Sorry if you haven’t seen that movie. The horse wasn’t that cool, anyway.

Neither is index fragmentation, but it’s not worth losing sleep over. Or a horse.

I see a lot of people messing with the fill factor of their indexes. Sometimes you gotta. If you use GUIDs for a clustering key, for example. If you don’t lower fill factor from 100, you’re going to spend a lot of time splitting pages when you insert records. GUIDs are hard to run out of, but they’re even harder to put in order.

Setting fill factor under 100 tells SQL to leave free space on index pages at the leaf level for new records to be added to. If you don’t, and a record needs to go onto a full page, that page splits roughly 50/50 into two pages.

When does it hurt?

Like most things, not at first. To prove it, let’s rebuild an index at different fill factors, and insert some fragmentation information into a table. It’s pretty easy. Create a table, rebuild the index, insert a record into the table. I could have done this in a loop, but I’m kind of lazy today.
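The test script is along these lines (hypothetical table and index names; repeat the last two statements for each fill factor you want to compare):

    CREATE TABLE dbo.FragResults
        (fill_factor INT, page_count BIGINT, avg_fragmentation_in_percent FLOAT);
    GO
    ALTER INDEX PK_Posts ON dbo.Posts REBUILD WITH (FILLFACTOR = 100);
    INSERT dbo.FragResults (fill_factor, page_count, avg_fragmentation_in_percent)
    SELECT 100, page_count, avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Posts'), 1, NULL, 'LIMITED');
    -- ...then again with FILLFACTOR = 90, 80, 70, changing the literal in both places.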

Put on your thinking cap. Fragmentation percent doesn’t budge. Granted, we rebuilt the index, so that’s expected. But look at page counts. Every time we reduce fill factor, page count gets higher. Why does that matter? Each page is 8KB. The more pages in your index, the more you’re reading from disk into memory. The lower your fill factor, the more blank space you’re reading from disk into memory. You could be wasting a lot of unnecessary space both on disk and in memory by lowering fill factor.

Space Age Love Song is probably the best Flock of Seagulls song, just so you know.

Let’s do math!

Because everyone loves math. Let’s take page count, multiply it by 8, and then divide it by 1024 twice to get the size of each index in GB.
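That math, straight out of the DMV, looks something like this:

    SELECT  OBJECT_NAME(ps.object_id) AS table_name,
            i.name                    AS index_name,
            ps.page_count,
            ps.page_count * 8 / 1024.0 / 1024.0 AS size_gb   -- 8KB pages converted to GB
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
    JOIN sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id;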

Even reducing this to 80 takes up about an extra 600MB. That can really add up. Granted, disk and memory are cheap, but they’re not infinite. Especially if you’re on Standard Edition.

Ack! Numbers!

It’s in everything

It’s not just queries that all those extra pages slow down. DBCC CHECKDB, backups, and index and statistics maintenance all have to deal with them. Lowering fill factor without good reason puts you in the same boat as index fragmentation does, except regular maintenance won’t “fix” the problem.

You can run sp_BlitzIndex® to help you find indexes that have fill factor set to under 100.


What’s Covered in the Senior DBA Class of 2016?

SQL Server
13 Comments

You’re a SQL Server DBA who is ready to advance to the next level in your career but aren’t sure how to fully master your environment and drive the right architectural changes. That’s about to change in one week of learning and adventure with me, Brent Ozar.

Join me to learn how to size, architect, maintain, and troubleshoot your SQL Servers.

During the training, we’ll provide you with breakfast, mid-morning breaks, lunch, and afternoon snacks. Dinner is on you, but during the training, we’ll organize Birds of a Feather groups so you can dine with fellow folks who share the same interests.

Schedule subject to change in the event of natural disasters, food poisoning, or exploding demos:

Monday – Enterprise-Grade Processes

Building an Inventory and Support Matrix – You have dozens of SQL Servers, and you don’t have the time to dig into each one of them to understand what’s going on. You need a fast way to explain to management, “Here’s what we’re up against, and here’s how much time I need to get us to a safe place.”

Architecture Deep Dive for High Availability and Disaster Recovery – You’re building a new SQL Server, and you need to protect it. You want to learn when it’s right to use clustering, AlwaysOn Availability Groups, virtualization, or storage replication. You don’t have an unlimited budget, either, so you need to learn how 2014-2016 licensing impacts these choices.

How to Test Your High Availability and Disaster Recovery Setups – Once you’ve built your new HA/DR setup, you need to test planned failovers, unplanned failovers with data loss, and planned fail-backs without data loss. Get Brent’s battle-tested checklists for testing high availability inside the same data center, plus disaster recovery failovers to another data center.

Activity: Deciding Between Availability Solutions – You’ll play architect, and Brent will give you a set of requirements. Work together in groups to sketch out what you believe is the right HA and DR setup, and then see Brent’s recommendations. We’ll discuss the options chosen by different groups, and their pros and cons.

What’s New in SQL Server 2014 and 2016 – Microsoft is pouring on new features – even in service packs! You’ll get a fast-paced tour of what’s new, understand the drawbacks, and learn when you should consider – or avoid – each of the new features.

Tuesday – The Plumbing: Hardware, Storage, and Virtualization

Performance Workload Analysis – Before you try to guesstimate how much hardware a SQL Server needs, you need to understand its workloads. You’ll learn how to analyze an existing server’s Perfmon counters and DMVs to determine whether it’s doing OLTP queries, DW queries, or what the rough mix is between the two.

Server Hardware Sizing – Armed with details about your workload, now you can make a better prediction about the kinds of hardware required. You’ll see the differences between 2-socket and 4-socket servers, learn how much memory Enterprise Edition needs, and understand when local solid state storage is a better fit than shared storage.

Shared Storage for SQL Server DBAs – Your data lives on the SAN, but…what does that even mean? Learn the components of shared storage, what connects them together, and how to do real performance testing to identify the bottleneck. You’ll understand shared storage and its impact on SQL Server, including tiering, snapshots, and replication.

Advanced SAN Features – Shared storage offers automated tiering, snapshots, and replication. Learn how these features work, discover why the data and log file choices are different, and understand when these features make sense for your SQL Server sizes and workloads.

Virtualization Management & Troubleshooting – Virtualization changes the way you build, size, and troubleshoot SQL Server. Understand the basic differences, and learn how to use wait stats to identify the most common VMware and Hyper-V bottlenecks.

Activity: Size Your Environment – You’ll get a list of client requirements, then pick the right hardware size, VM vs. physical, local or shared storage, and SQL Server version and edition.

Wednesday – High Availability – Clusters, AGs, Mirroring, Cloud

How to Design Quorum for Windows Failover Clusters – Before you touch anything with AlwaysOn in the name, you need to understand how cluster quorum works and how it can bring your SQL Server down. Learn how dynamic quorum and dynamic witness should be configured and see 3 real-world scenarios.

Triaging Failures in Availability Groups – You’ve built an AlwaysOn Availability Group, and it hasn’t failed yet – so what’s the big deal? Watch as Brent deals with a broken Availability Group, stepping through his thought process during a failure, and get a glimpse into just how complex clustering can become – even with a simple 2-node AG.

Field Medic’s Guide to Database Mirroring – Sure, database mirroring is technically deprecated – but it still works just as great as ever, and it’s still a good option for single-database failover scenarios. You’ll see why we still recommend it, discover a few common implementation gotchas, and learn how to test your failovers before going live.

Cloud HA and DR Options – Your executives have mumbled the terms “AWS” and “Azure” at meetings, and you want to know how to react. Running a SQL Server up there really is different, and just knowing what HA and DR options exist can help you talk more comfortably with management.

Activity: Recovering from Failover – You just failed over from one data center to another – how much data are you really going to lose? Can you get any of it back? You’ll role-play a scenario from Hurricane Sandy using simple pieces of paper and see exactly how aftermath recovery really works.

Thursday – The Daily Grind: Backups, Maintenance, Monitoring

Troubleshooting Backup and Restore Problems – You’re already familiar with full, differential, and log backups – but what happens when someone says they accidentally deleted 15 records an hour ago? What happens when one log file backup disappears? What happens if you need to restore to an exact point in time? Learn some surprising answers before you get bitten by a real emergency.

Optimizing Transaction Log Shipping – In theory, log shipping is easy: just restore your transaction log backups. In reality, you need to understand how to deal with unplanned failovers, reversing log shipping the other direction, and fixing a lagging secondary.

Optimizing CHECKDB – DBCC CHECKDB is easy with small databases – just run it every day. But as your database grows, this quickly stops becoming an option, and you need to start cutting corners. Learn how to cut them safely by running DBCC CHECKDB at a realistic frequency, discover how to run it faster, and see which SQL Server options like backup checksums and page checksums can give you CHECKDB-like protection without the long maintenance windows.

The Right Counters for Monitoring Performance and Availability – Managers say things like “We want five nines of uptime” and “We can’t build an SLA, just make queries fast.” Senior DBAs are stuck in the middle between screaming users and moaning managers. Learn what metrics to monitor in order to reduce the noise.

Index Maintenance for Enterprise Environments – You’ve learned over time that maintenance plans are the wrong way to go, and you’ve implemented index maintenance scripts from Ola Hallengren, the MidnightDBAs, or Michelle Ufford. However, you’re not really sure exactly what you’re fixing, or if it’s the least invasive way to improve performance. Learn the difference between internal and external fragmentation, see how fill factor destroys performance, and discover a simple set of parameters that can keep everybody happy.

Activity: AlwaysOn Availability Group Backup and CHECKDBs – You’ve decided to implement AlwaysOn Availability Groups, but now the really tricky part starts: where should you run backups and DBCCs? You’ll get a set of client requirements, design your own maintenance strategy, and then Brent will throw surprise curveballs at your design to see how it handles various real-world emergency scenarios.

Friday – Troubleshooting and Building Health Check Reports

Design Your Troubleshooting Process – When the brown stuff hits the fan, reach for an easy-to-understand, time-tested troubleshooting checklist. You’ll get a starter document that Brent Ozar Unlimited hands out to their own clients, learn how to use it, and see how to modify it based on your own environment. When you get back to the office, you can give this same checklist to your junior staff and sysadmins in order to make your on-call rotation easier.

Building a Health Check Report with sp_Blitz® – Your end users, managers, and sysadmins want a simple, easy-to-read report that shows whether or not their SQL Server is screwed. Learn the easy process we use every week to build assessment reports for clients around the world.

Building a Performance Check Report with sp_BlitzFirst®, sp_BlitzCache®, and sp_BlitzIndex® – You’ve got a slow SQL Server – but what’s the primary bottleneck? You don’t need in-depth performance analysis, T-SQL tuning advice, or index improvements, but you just need to know where to focus your efforts. Learn how to use our completely free stored procedures to get started performance tuning.

Afternoon: Open Q&A – Bring your toughest senior DBA questions, your project requirements, and your end user complaints. As long as you can talk through your evidence in front of the group, Brent will discuss his approaches to the problem. Everybody learns from real-world scenarios. (Or, if you need to fly out early, that’s fine too.)

Here’s the 2016 Cities and Dates

  • January 11-15, 2016 in Newark, NJ – at the Hilton Penn Station, an easy train ride from the NYC area
  • March 7-11, 2016 in Chicago, IL – at one of our favorite training spots, Catalyst Ranch downtown
  • May 2-6, 2016 in Denver, CO – at The Curtis Doubletree, a really fun, funky hotel downtown with lots of great eating options in walking distance
  • August 1-5, 2016 in Philadelphia, PA – at the Hilton at Penn’s Landing

These are the only cities & dates for that class in 2016, so if you’re interested, here’s your chance to burn up the last of your 2015 training budget – or next week, jump in to make sure you get a seat with your fresh 2016 budget.

Register now. See you there!


Who’s Really Doing Continuous Integration of SQL Server Databases?

SQL Server
52 Comments

One of our training class attendees asked a great question: who’s really doing this stuff? We talked about it in class, and I figured it’d make a great blog post question too.

So if you’re doing CI (automatic deployment of database changes), answer in the comments with:

  1. How many database developers do you have?
  2. How many person-hours did it take to get CI working?
  3. What tools do you use?
  4. What benefits have you achieved by using CI?
  5. (New) Is the application deployed in only one database, or across many nearly-identical copies of the same database? (Could be internal or external.)

(Update 12/24 – I added question #5 because there’s a bit of a separate topic there. If you manage an application that gets deployed to customers – either hosted internally or externally – like if you call yourself an ISV, or sell access to these databases & applications to multiple clients – mention that in your answer.)


What happens to transaction log backups during full backups?

SQL Server
27 Comments

TL;DR

  • Unless you’re on SQL 2000, don’t worry about scheduling log backups during full backups
  • Log backups during full backups won’t truncate the transaction log
  • You want to keep taking log backups in case your full backup fails

The first time I ever set up backups

Was, unfortunately, using a maintenance plan. All of the databases were in simple recovery. It was largely used for staging and tenderizing data during ETL. No log backups need apply.

Fast forward a bit, and I’m setting up backups for a server where losing 30 minutes of data could set a project back several hours. We’re now hiring log backups.

You’re hired!

So there I was, dutifully creating extravagant maintenance plans, pondering the miracle of the differential backup, and the grand eloquence of log backups. They were running every 10 minutes, those log backups.

Every 10 minutes.

Even during full backups.

I’m a developer and what is this?

  • What is SQL going to do with those?
  • Do I have to restore them?
  • Should I pause log backups during full backups?
  • Will this break something?
  • How much Laphroaig do I have to drink to forget I thought about this?

This was confounding to me. So I did some digging. Back in the SQL 2000 days, this could have gotten weird. But I was, thankfully(?) on 2005. Heh. Yeah. 2005. I know.

After 2005, you can totally, absolutely, 100% take log backups during full backups. You can take them and nothing will break and you won’t have to restore them (unless your full backup fails).

Wanna see a demo?

Of course you wanna see a demo. This one is a little more complicated than usual. It will require several-SSMS-tab technology.

I usually keep my SO database in simple, because I do horrible things that I don’t want to fully log. Fun fact: if you switch from simple to full recovery model and don’t take a full backup, you’re basically still in simple recovery. You should think about that for a minute and then take steps to avoid getting fired. Backing up SO takes me about 3.5-4 minutes.

In another window, kick off some log backups 1 minute apart. Note that doing this won’t overwrite log backups, it will stack them all within a single file.
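A sketch of what that looks like (database name and path are placeholders):

    DECLARE @i INT = 1;
    WHILE @i <= 5
    BEGIN
        BACKUP LOG [StackOverflow]
            TO DISK = N'C:\Temp\StackOverflow_log.trn'
            WITH NOINIT;                -- NOINIT stacks each backup into the same file
        WAITFOR DELAY '00:01:00';
        SET @i += 1;
    END;

    RESTORE HEADERONLY FROM DISK = N'C:\Temp\StackOverflow_log.trn';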

You can verify this behavior by running the RESTORE HEADERONLY command at the end of this block. If you want to restore a particular backup out of a file with multiple backups in it, you use the position column and specify it with FILE = [n], which you can read more about here.
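For example, picking the third backup out of the stacked file would look like this (assuming the database is already in a restoring state):

    RESTORE LOG [StackOverflow]
        FROM DISK = N'C:\Temp\StackOverflow_log.trn'
        WITH FILE = 3, NORECOVERY;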

Just so you know I’m not pulling any shenanigans, let’s generate some log activity. This will dump 100 rows into a dummy table every 30 seconds. It is neither pretty nor elegant.
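Something like this, with a made-up table name:

    CREATE TABLE dbo.LogFiller (Id INT IDENTITY PRIMARY KEY, Padding CHAR(1000) NOT NULL DEFAULT 'x');
    GO
    DECLARE @i INT = 1;
    WHILE @i <= 10
    BEGIN
        INSERT dbo.LogFiller (Padding)
        SELECT TOP (100) 'x' FROM sys.all_objects;
        WAITFOR DELAY '00:00:30';
        SET @i += 1;
    END;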

When that’s all done, you can run something like this to see what happened. You’ll probably have to replace the date. I wrote this, like, two or three weeks ago by now.
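The query digs through msdb’s backup history, roughly like so (swap in your own database name and date):

    SELECT  database_name, type, backup_start_date, backup_finish_date,
            checkpoint_lsn, database_backup_lsn, first_lsn, last_lsn
    FROM    msdb.dbo.backupset
    WHERE   database_name = N'StackOverflow'
      AND   backup_start_date >= '2015-12-01'
    ORDER BY backup_start_date;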

When you take a full backup, the first thing it does is issue a checkpoint. That’s why the full and all subsequent log backups have the same checkpoint LSN. The first four log backups all have the same database backup LSN because they occurred during the full backup. That doesn’t change until the full is done.

RED SQUARES AT NIGHT

For toots and snickers, I ran this all a second time, and cancelled the full backup halfway through. The full backup issued a new checkpoint, so the checkpoint LSN changes, but the database backup LSN never changes, because it got canceled. That means taking log backups during full backups is totally useful. If your full backup fails for whatever reason, these things keep the chain alive.

NOTHING CHANGES

If the third time is to be a charm, and it is, the same thing occurs as the first run. New checkpoint LSN, and the database backup LSN runs through until the backup finishes. You can verify that by looking at the start and end times columns.

I’m always convinced that whoever came up with the term LSN Chains really liked 90’s Grunge.

If you still don’t believe me

Just look at sys.databases while you’re running a full backup.
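Specifically, watch the log_reuse_wait_desc column:

    SELECT name, log_reuse_wait_desc
    FROM sys.databases
    WHERE name = N'StackOverflow';
    -- While the full backup is running, log_reuse_wait_desc reports ACTIVE_BACKUP_OR_RESTORE.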

Yes, I made you read all that to get here.

The result is acknowledged, though not documented, here. This does indeed mean that log truncation will not occur during a full backup even if you take log backups. It will happen when you take the first log backup after the full finishes. You may want to consider this when scheduling maintenance items that may hog up log space alongside full backups.

Recap

Backups are beautiful things. You should take full ones, and probably differential ones, and if you’re in full recovery model, definitely log ones. How often you take them is up to you and your boss. Or maybe their boss. But it’s definitely not up to you. Unless you’re your boss.

Log backups during full backups won’t hurt anything, and may end up helping things if your full backup fails, and you need to restore something.

Log backups during full backups will not truncate the log. That has to wait until the first log backup after the full finishes.


Give Your T-SQL a Semicolonoscopy

SQL Server, T-SQL
38 Comments

In theory, all of your T-SQL statements are supposed to end with a semicolon, like this:
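For example (dbo.Config is a made-up table):

    SELECT TOP (10) name
    FROM sys.databases
    ORDER BY name;

    UPDATE dbo.Config
    SET LastCheckedDate = GETDATE()
    WHERE ConfigId = 1;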

Why? Well, SQL Server 2005’s Books Online says:

Many code examples use a semicolon (;) as a Transact-SQL statement terminator. Although the semicolon is not required, using it is considered a good practice.

You know, like flossing your teeth, and puff-puff-give. But SQL Server 2008 marked a change:

Although the semicolon is not required for most statements in this version of SQL Server, it will be required in a future version.

Whoa. When exactly is “a future version?” The current BOL page for T-SQL syntax conventions still has 2008’s vague language, so at least Microsoft has given us a heck of a long grace period to get our act together.

The gambler in me doubts that Microsoft will ever be able to enforce this rule. A ton of legacy T-SQL code would suddenly start failing, which means businesses just wouldn’t adopt a new version of SQL Server (or Azure SQL DB) that included this rule. However, the DBA in me isn’t a gambler, so when I write new code, I try to end every statement with a semicolon.


We Are Ready For Risk-Taking Presenters

June 2, 2011. I woke up around 5:00 AM, got dressed, left my cruise ship cabin and went downstairs to find a place to write about the dream I’d just had: A database murdered. Suspects isolated together on a ship. Technical sleuthing.

I knew immediately this was a presentation I needed to give. It would be a huge departure from anything I’d done before. And I wanted to do it on the biggest stage I knew of: the PASS Summit.

My first murder mystery session at the 2013 SQLSaturday in Denver (left), and at the 2015 PASS Summit in Seattle (right)

Four and a half years later, I did exactly that; I presented SQL Server Mystery Hour: Dead Reports Don’t Talk at the PASS Summit. It’s been a couple of months since the Summit; the euphoria has worn off, and I want to share some of the risks I took — both in getting there and delivering the session — and why that matters to you as a presenter (or potential presenter).

Risks I’m glad I took

Field Testing: I had to run this session several times before feeling reasonably confident I could do it at the Summit. That meant delivering it at SQL Server User Group meetings and SQLSaturdays. They didn’t always go well — sometimes I was done in 45 minutes (too quickly), sometimes I read my lines very obviously just so I didn’t miss something important. I had to learn what worked and what didn’t, and the only way to find out was to do it live.

Call for Speakers: I started submitting my SSRS mystery session to events in 2012, but I also submitted conventional sessions at the same time. I was competing with myself as well as all the other abstracts in that track. This year, if selected, would be my fifth year speaking at the PASS Summit. I hoped my established speaker history might make the Program Committee trust me with a more off-beat abstract (as long as it was well-written). I decided to go all-in this year and submit nothing but murder mystery abstracts. Attempting to up my chances, I did some A/B testing by not explicitly titling them all mysteries (compare the SSRS abstract with this one: Living and Dying by Dynamic SQL). I was incredibly fortunate to get two of them selected for the 2015 Summit.

(Honestly, I’m still shocked they trusted me twice.)

Multimedia: I got a lot of positive comments on the A/V aspect of it. Both sessions begin with a faux Skype call from the CEO (the one and only Buck Woody!) explaining the situation. The SSRS mystery ends with a dramatic re-enactment video featuring the perpetrator. There’s background music while attendees discuss the case. I think that went well and hopefully made the session (and its content) more memorable.

Casting: For a mystery like this to go smoothly, I can’t do all the talking or recap suspect interviews. I have to do them live. I got experienced speakers to fill the cast: Mark Vaillancourt, Jes Borland, Mickey Stuewe, Jason Strate, Gina Meronek, Bill Fellows, Hope Foley, Jason Horner, and Allen White. There’s no way I could have pulled it off without their help. (Thank you, cast!)

Risks I wish I had mitigated

Rehearsal: Even though I chose experienced speakers to be the suspects, the sessions felt unpolished to attendees. For SQL Saturdays and user group meetings (where I have audience volunteers read the parts), unpolished is okay. At a major conference, I needed to do more to make sure we didn’t stumble through lines. It’s on me to make sure that, as busy as other speakers are with their own sessions, we get together — if only once briefly — to run through our lines together.

Having said that, spontaneity created the most memorable moments — Mark Vaillancourt’s stream of puns, Jes Borland’s unicorn-flipping exit, and a joke that had Mark crying just a few minutes into the session.

Outline and notes taken during my first dynamic SQL mystery. Discussion gaps were problematic right away.

Not planning the gaps well enough: There are breaks in between each chapter of the mystery where attendees turn to each other and discuss the clues and interviews they just saw. This can result in dead time if the groups either veer off-topic or just don’t talk much. I need to do a better job of keeping things on track, perhaps by shortening those discussion intervals.

Pushing the Networking Aspect: The abstract defines the murder mystery as part technical presentation, part networking event. I had roughly 80 and 50 attendees for the two sessions. This was great because people who showed up were willing to talk to others. However, I could have toned down the networking angle and drawn more attendees who weren’t expecting a networking event.

Risks I wish I had taken

Slide Decks: Before presenting at the PASS Summit, I had given the SSRS session several times at user group meetings and SQL Saturdays. Each time, I’d done them without any supporting slides. No bullet points, no summary slides. The only reasons I needed a projector at all were for the demo and re-enactment video at the end.

At the Summit, I panicked a little and decided not to go with an empty deck. I saw this as a make-or-break year for my mystery sessions and I wasn’t willing to risk screwing it up by forgetting material. I also didn’t want to risk getting skewered by attendees for not having bullet points to follow. I’m not afraid of that anymore (and I shouldn’t have been in the first place, honestly).

Marketing: I went out of my way to say as little about myself and my real-life company as I could. There was already some chatter about the free magnetic poetry we were giving away and I didn’t want to make any more waves. I removed the About Me slide from my decks and didn’t mention my BlitzRS script, even though it would’ve been a natural fit (and of some benefit) to those in my SSRS session. I did have SQL Sleuth badge ribbons that I offered to people for having come to the session, but those don’t advertise anything except the session itself. In hindsight, I could’ve left the About Me slides in without any fuss.

We are ready for risk-takers and storytellers

Whether it’s a user group presentation or a major worldwide conference, our SQL Server community has settled into a comfortable spot regarding session format. The expectation is we sit in a crowd and give a speaker our attention for 60-75 minutes. We take notes, maybe mention something about the session on twitter. That’s perfectly all right.

But what would happen if we took more risks with that model, or broke from it entirely? I found at least one way it can be done. There’s another group who’s been doing something similar (for much longer than I have), presenting a collection of short stories — with demos even! — and the audience absolutely loves it.

We need more storytellers. Audiences love storytellers.

Be a storyteller.

Weaving technical details into a story makes your content memorable for months after it’s delivered. If you have the technical topic in mind but need help with the storytelling or how to convey the material more memorably, read these books:

I’m only beginning to transition from presenter to storyteller. I’m still looking for ways to make the content I share stick in your mind long after the session is over. I’ve found these books to be invaluable and will be working more of their concepts into my 2016 talks.

Risks are risky — what if I bomb?

You’ll always be taking risks, but preparation and practice will mitigate the largest ones. You can prepare alone, but if you’re going to try something our community has never seen before, you need to practice with a real-live audience. Find a user group to present to; that’s about as low-stakes as you can go while still having your target audience. If sixty minutes of daring is too much, try thirty. Try ten. Just give it a shot. From there, you’ll get feedback on what works and what doesn’t. Even if you bomb, it won’t hurt much, and you will be closer to realizing your vision.

Stories want to be told

There wasn’t a single day out of the 1,610 days between conception and realization that I didn’t think about this mystery session. I couldn’t get it off my mind. Instead of me having the idea, it’s like the idea chose me — I was just along for the ride.

Is there an idea, a story that has you captivated? Something nagging that won’t let you rest? Stop trying to rest.

Write it!

Accept that you will have to take risks. Know that the risks are worth it. Tell us a story the way only you can. We’re ready for you to take risks.


Improved diagnostics for query execution plans that involve residual predicate pushdown

Execution Plans, SQL Server
6 Comments

I love stuff like this!

Even though it’s not on my list of dream features, it’s pretty neat. Getting new views into what SQL is doing when queries execute is pretty cool. You can read the short and gory details at the KB here: Improved diagnostics for query execution plans that involve residual predicate pushdown in SQL Server 2012

The wording might seem a little odd at first. The phrase “predicate pushdown” might make your face scrunch like you just stepped in something wet when the only thing you have on your feet are your lucky socks, and “residual” might just remind you of the time you left Tupperware in your desk drawer before a long weekend.

I would imagine this is coming to SQL Server 2014 in an upcoming CU/SP, but it hasn’t yet. As of this writing, the newest CTP for 2016 is 3.1, and it is working there as well.

You’re rambling, man. Get on with it.

Let’s take a look at what this ol’ dog does, and how we can use it to troubleshoot query problems. We’ll start with a small-ish table and some indexes.
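The setup script isn’t reproduced here, but a stand-in looks something like this (hypothetical table, column, and index names):

    CREATE TABLE dbo.PurchaseOrders
    (
        Id INT IDENTITY PRIMARY KEY CLUSTERED,
        PONumber VARCHAR(20) NOT NULL,
        ProcessDate DATE NULL
    );
    GO
    INSERT dbo.PurchaseOrders (PONumber, ProcessDate)
    SELECT TOP (1000000)
           RIGHT('0000000000' + CAST(ABS(CHECKSUM(NEWID())) AS VARCHAR(10)), 10),
           DATEADD(DAY, ABS(CHECKSUM(NEWID())) % 365, '2015-01-01')
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b;
    GO
    CREATE NONCLUSTERED INDEX ix_PONumber
        ON dbo.PurchaseOrders (PONumber)
        INCLUDE (ProcessDate);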

A million rows and nothing to do

Before the addition of this element, the only way to get any information like this right in SSMS was to SET STATISTICS IO ON, but that only gave us a partial story about the pages read by our query. Let’s look at a couple examples.
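Here’s the kind of query in question, run against the stand-in table above, with a double wildcard search:

    SET STATISTICS IO ON;

    SELECT Id, PONumber, ProcessDate
    FROM dbo.PurchaseOrders
    WHERE PONumber LIKE '%0099%';   -- leading wildcard means the whole index has to be scanned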

Running the above query with the actual execution plan turned on, the tool tip that pops up over the index scan looks like this:

We read all million rows! Bummerino duder.

This is something Brent wrote about years ago. Great post, Brent! It’s further evidenced here, by the new Number of Rows Read line. We had to scan the index, all million rows of it, to search out the double wildcard LIKE predicate.

If we alter the query slightly, we can cut down dramatically on the number of rows we’re looking at. I get that this changes the logic, but really, you need to take care when allowing people to search full strings like above.
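Dropping the leading wildcard looks like this:

    SELECT Id, PONumber, ProcessDate
    FROM dbo.PurchaseOrders
    WHERE PONumber LIKE '0099%';    -- no leading wildcard, so a seek is possible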

 

Note that we’re no longer scanning the index, either.

Granted, getting away from double wildcard searches and towards more sane search methods is a big leap. What if we just tighten our last query’s predicates up a bit? Say that we only needed POs that start with ’00’, and we only needed results since June. We’re filtering on [ProcessDate] to make sure that the order was actually fulfilled, or something. It’s a dream!
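Something like:

    SELECT Id, PONumber, ProcessDate
    FROM dbo.PurchaseOrders
    WHERE PONumber LIKE '00%'
      AND ProcessDate >= '2015-06-01';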

That’s way less, boss.

Now we’re down to reading just a few thousand rows to find what we need.

So what?

If you’re on SQL 2012, or, for some reason on CTP 3.1 of SQL 2016, you should take a new look at troublesome queries. Perhaps you can track down similar predicate inefficiency using this new tool. You may be reading way more data than you think. Anything you can do to cut down on data movement will very likely speed things up. The queries, in order, ran for an average of 110ms, 47ms, and 3ms respectively. Small changes can make differences.

To get this to work, I had to be on the SSMS November preview. It wasn’t showing up in other versions of SSMS for me.

I’d also like to thank everyone who voted for my Connect Item. It’s nice to know that 162 of you are at least as weird as I am about tempdb.


How to Make SQL Server Backups Go Faster

At its core, backup speed boils down to just two numbers.

Question 1: How fast can SQL Server read data from disk?

You can measure this by doing a full backup to disk, but use the NUL: option:
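For example (database name is a placeholder):

    BACKUP DATABASE [StackOverflow]
    TO DISK = N'NUL:'
    WITH STATS = 10;   -- add COPY_ONLY if you don't want to reset your differential base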

This does a backup, but doesn’t actually write it anywhere, so it’s a measure of how fast the data files can be read from storage. If you run this, know that you’re taking a real backup: this backup can break your differential backups.

Question 2: How fast can SQL Server write to the backup target?

You can measure this using disk benchmark tools like CrystalDiskMark or Diskspd. If you’re using a UNC path for your backup target, you may need to temporarily map a drive to get these tools to work. If you’re the only server/user who does backups to that location, focus on the sequential write speed numbers. If multiple servers/users will be doing backups at the same time, focus on the random write speed numbers.

Your backup speed is determined by the bottleneck – either question 1’s answer, or question 2’s.

How to Read Data from Disk Faster

Make the drives faster. Tune your storage. Ah, I wish it was as easy as typing that three-word sentence.

Make it smaller. Use Enterprise Edition’s data compression to shrink the size of your database on disk. Archive tables and indexes that you’re no longer using.

Read it from somewhere else. If you’re using AlwaysOn Availability Groups, consider doing your backups from the fastest replica available.

Back up less data. If you’re doing daily full backups, you can switch to weekly full backups instead, and differentials every day. Just know that the differential by itself is useless – you need to make sure you have both the full and the differential available to do a restore to the most recent point in time.

Back up more often. If you’re only doing log backups once an hour, try nibbling off the logs in smaller increments instead.

How to Write Data to Disk Faster

Use backup compression. This is included free with modern versions of SQL Server, and it gets even better with third party backup tools that often have tweakable levels of compression. Experiment with different settings to find the right compression setting for your biggest databases. It’s a balance of how much CPU the tool uses, versus how much smaller the file output becomes. If your bottleneck is slow writes, and you have lots of idle CPU time, then the tradeoff makes sense.
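With native backups, it’s just one more WITH option (placeholder path and database name):

    BACKUP DATABASE [StackOverflow]
    TO DISK = N'B:\Backups\StackOverflow.bak'
    WITH COMPRESSION, STATS = 10;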

Totally clears out your backup problems.

Write to a faster target. While SATA RAID 5 gives you lots of capacity for backups, it may not be fast enough to get the job done.

Don’t involve third party appliances. If you’re using a dedupe appliance like a Data Domain, it may not be fast enough to keep up with SQL Server’s reads. Try doing a temporary backup to a plain ol’ file share, and compare the backup speeds. Then work with your sysadmins to see if there’s a way to tune your backup appliance.

Team multiple network cards together. If you’re using 1Gb Ethernet and backing up to a network share, try teaming multiple 1Gb cards together on both the SQL Server and on the backup target on the other side.

Use multiple backup files. Try striping the backup to multiple files. If your backup target is fast enough to keep up, writing to 4 files can often improve your backup throughput by 4x.
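For example, striping across four files on the same share (placeholder paths):

    BACKUP DATABASE [StackOverflow]
    TO  DISK = N'\\BackupShare\SQL\StackOverflow_1.bak',
        DISK = N'\\BackupShare\SQL\StackOverflow_2.bak',
        DISK = N'\\BackupShare\SQL\StackOverflow_3.bak',
        DISK = N'\\BackupShare\SQL\StackOverflow_4.bak'
    WITH COMPRESSION, STATS = 10;

Remember that you’ll need all four files present to run the restore.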

And when you’re all done tuning backups, it’s time to tune restores. After all, a really fast backup doesn’t matter when the system is down – the business is more concerned about how long the restore will take.


We’re now a Microsoft Gold Partner.

SQL Server
13 Comments

In our never-ending quest for world domination (for values of “world” and “domination”), we’re now a Gold Data Platform partner.

The Original Gold Partner
(Photo by Rafael Marquez)

What does this mean? Well, it’s an achievement badge, and we like unlocking new achievements, so, uh…yeah.

No, seriously, some company executives only want to deal with Gold Partners, so we wanted to make it easier for DBAs to get the help they want. Now, when the executive asks, “But are they a Gold Partner?” you can say, “Yep.”

Still the same great SQL Critical Care® and training classes, just now with a little shinier wrapping.

And maybe green M&Ms in the dressing room.


Filtered Indexes: Just Add Includes

I found a quirky thing recently

While playing with filtered indexes, I noticed something odd. By ‘playing with’ I mean ‘calling them horrible names’ and ‘admiring the way other platforms implemented them‘.

I sort of wrote about a similar topic in discussing indexing for windowing functions. It turns out that a recent annoyance could also be solved by putting the column my filter expression is predicated on in the included columns definition. That’s the fanciest sentence I’ve ever written, BTW. If you want more, don’t get your hopes up.

Ready for Horrible

Let’s create our initial index. As usual, we’re using the Stack Overflow database. We’ll look at a small group of users who have a Reputation over 400k. I dunno, it’s a nice number. There are like 8 of them.
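The starting index looks roughly like this (hypothetical name, against the Users table):

    CREATE NONCLUSTERED INDEX ix_Users_HighRep
        ON dbo.Users (DisplayName)
        WHERE Reputation > 400000;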

With that in place, we’ll run some queries that should make excellent use of our thoughtful and considerate index.

If you were a betting organism, which ones would you say use our index? Money on the table, folks! Step right up!
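I won’t reproduce the exact queries here, but picture four variations along these lines (hypothetical, matching the results below):

    SELECT DisplayName FROM dbo.Users WHERE Reputation > 400000;                           -- matches the filter exactly
    SELECT DisplayName FROM dbo.Users WHERE Reputation > 400000 AND Reputation < 1000000;  -- adds an upper bound
    SELECT DisplayName FROM dbo.Users WHERE Reputation > 450000;                           -- narrower than the filter
    SELECT DisplayName FROM dbo.Users WHERE Reputation >= 500000;                          -- narrower still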

Yes, Sorta, No, No.

That didn’t go well at all. Only the first query really used it. The second query needed a key lookup to figure out the less than filter, and the last two not only ignored it, but told me I need to create an index. The nerve!

Send me your money

Let’s make our index better:
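Roughly, rebuild it with the filtered column in the INCLUDE list:

    CREATE NONCLUSTERED INDEX ix_Users_HighRep
        ON dbo.Users (DisplayName)
        INCLUDE (Reputation)
        WHERE Reputation > 400000
        WITH (DROP_EXISTING = ON);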

Run those queries again. You don’t even have to recompile them.

Can’t you tell by the way I run every time you make eyes at me?

They all magically found a way to use our New and Improved index.

What was the point?

When I first started caring about indexes, and filtering them, I would get so mad when these precious little Bloody Mary recipes didn’t get used.

I followed all the rules!
There were no errors!

But why oh why didn’t SQL use my filtered indexes for even smaller subsets of the filter condition? It seemed insane to me that SQL would know the filter for the index is on (x > y), but wouldn’t use it even if (z > x).

The solution was to put the filtered column in the include list. This lets SQL generate statistics on the column, and much like getting rid of the predicate key lookup, allows you to search within the filtered index subset for even more specific information.


How many servers can one person manage?


HW – if all you do is rack & stack hardware, you can manage thousands of boxes. You see this at Facebook/Google type scales.

OS – when you’re responsible for troubleshooting operating system problems (as opposed to simply re-imaging the box from scratch), you can manage hundreds of boxes or VMs.

SQL – if you have to install, configure, and troubleshoot standalone instances of SQL Server, you can manage 50-100 instances.

HA/DR – when you have to design, build, and troubleshoot high availability features like clustering, Availability Groups, replication, etc, then you can manage 10-50 SQL instances. (Here, it’s important to know that we’re talking about boxes critical enough to have a high availability solution.)

PERF – if you have to tune queries and indexes for performance, you can do a great job on 1-5 instances of SQL Server. Doing a good job of performance tuning means understanding the underlying data model of the databases you’re tuning, plus having a good relationship with the application developers.

These numbers aren’t absolute: tools like training and automation can raise them, and a lack of those tools can make matters worse.


Tracking tempdb growth using Extended Events

SQL Server, TempDB
37 Comments

Everyone knows tempdb is a weird place

Brent refers to it as a public toilet. I agree with that sentiment. It will let anyone in to do whatever they want.

Recently I was trying to track down what was causing tempdb’s log to grow. I was thinking about using a Trace to do it, but then I remembered that it’s at least 2015, and I should be using Extended Events. Don’t worry, there won’t be any PowerShell. You can keep reading.

I use the below command to fire up my session. It worked on my 2012 and 2014 instances. Anything older or newer than that, and YMMV. You will likely have to change the output directory to whatever exists on your server. Most people have c:\temp, though.
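The session looks something like this (a sketch; session name and output path are placeholders, and database_id 2 is tempdb):

    CREATE EVENT SESSION [tempdb_growth] ON SERVER
    ADD EVENT sqlserver.database_file_size_change
    (
        ACTION (sqlserver.sql_text, sqlserver.session_id)
        WHERE (sqlserver.database_id = 2)   -- tempdb
    )
    ADD TARGET package0.event_file
    (
        SET filename = N'C:\temp\tempdb_growth.xel'
    );
    GO
    ALTER EVENT SESSION [tempdb_growth] ON SERVER STATE = START;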

With that fired up, let’s kick tempdb around a little. Don’t do this in production, it’s going to suck. We’re going to shrink files and run an insert and update in a loop. You can increase the number of loops if you want, but 3 is good enough to illustrate how this works.
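The abuse script is roughly this shape (a sketch; again, development servers only):

    USE tempdb;
    GO
    DBCC SHRINKFILE (tempdev, 1);   -- part of the abuse; please don't do this for real
    GO
    CREATE TABLE #junk (Id INT IDENTITY, Stuffing NVARCHAR(2048));
    DECLARE @i INT = 1;
    WHILE @i <= 3
    BEGIN
        INSERT #junk (Stuffing)
        SELECT TOP (100000) m.[text] FROM sys.messages AS m;   -- pushes the data files
        UPDATE #junk SET Stuffing = REPLICATE(N'x', 2000);      -- pushes the log file
        SET @i += 1;
    END;
    DROP TABLE #junk;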

Now you can either stop the XE session, or keep it running. I don’t care. It’s your ~~~DEVELOPMENT SERVER~~~ and definitely ~~~NOT PRODUCTION~~~, right?

Right.

Let’s see what kind of germs are on the toilet seat

Look, I never said I was good at this XML stuff. If you are, feel free to tidy up all the VARCHAR(MAX) and BIGINT types into something more sensible. If you don’t like my King Krab emoji, blame Jeremiah.
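One way to shred the file target looks like this; the field names (file_name, size_change_kb) are assumptions on my part, so double-check them against sys.dm_xe_object_columns on your build:

    ;WITH events AS
    (
        SELECT CAST(event_data AS XML) AS x
        FROM sys.fn_xe_file_target_read_file(N'C:\temp\tempdb_growth*.xel', NULL, NULL, NULL)
    )
    SELECT
        x.value('(event/@timestamp)[1]', 'DATETIME2')                          AS event_time,
        x.value('(event/data[@name="file_name"]/value)[1]', 'NVARCHAR(260)')   AS [file_name],
        x.value('(event/data[@name="size_change_kb"]/value)[1]', 'BIGINT')     AS size_change_kb,
        x.value('(event/action[@name="sql_text"]/value)[1]', 'NVARCHAR(MAX)')  AS sql_text
    FROM events;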

But hey! The results are in. That didn’t take long.

The first file to get it is the log file. All sorts of helpful stuff got collected. If you can’t figure out what was going on from this, go home. Seriously. The query text is right there.

Meaningful data.

Yay. There’s more. The log file wasn’t the only one that blew up. All four tempdb files got used, as well. Is there an important lesson here? Probably. Even on my laptop, multiple tempdb files help. Don’t let me catch you out there with one tempdb data file. Or two tempdb log files. I’m watching you.

Happy Birthday!

In case you were wondering…

Yes, it will capture shrinking files, too. So if anyone is being a horrible, you can catch them, and throw them out several windows.

For the love of Milton Berle, please stop shrinking files.

You can use this on other databases as well, just change the database ID in the XE session definition. Just, you know, use it sparingly. There’s overhead for any type of monitoring, and XE is no XcEption (GET IT?). If you have a growth-heavy environment, capturing them all and the accompanying information could be a real burden on your server.

Building a better BAND-AID®

Stuff like this is good to diagnose problems short term, but it’s not meant to be a replacement for a full time monitoring solution. A lot of people spent a lot of time making nice ones. Go try them out and find one you like.

Thanks for reading!