Attention Season Pass Holders: New Fundamentals of Parameter Sniffing Videos Available!

If you’ve got a Recorded Class Season Pass or Live Class Season Pass, it’s time to get your learn on: yesterday’s Fundamentals of Parameter Sniffing class recordings are now available in your account here. As a reminder, here’s the class abstract:

Sometimes the exact same query goes slow out of nowhere. Your current fix is to update statistics, rebuild indexes, or restart the SQL Server. It works, but you don’t know why.

You’re facing a problem called parameter sniffing: SQL Server “sniffs” the first set of parameters for a query, builds an execution plan for it, and then reuses that same plan no matter what parameters get called. It’s been the bane of our performance tuning for decades, and it doesn’t appear to be going away anytime soon – in fact, in newer versions of SQL Server, it even gets worse!

In this one-day class, you’ll follow along with me in the Stack Overflow database on your own laptop or desktop as you learn:

  • What kinds of queries are vulnerable to parameter sniffing
  • How to test your code to see if it’s vulnerable
  • Why tactics like rebuilding indexes and updating statistics fix one query, but make others worse
  • How to proactively patrol your server to uncover the problems before they bring you down

This course is 100% demos: the only slides are the introductions at the start of the day, and the recap at the end of the day. The rest of the time, we’ll be working in SQL Server Management Studio. Roll up your sleeves: let’s do this.

After this class, you’re ready for Mastering Parameter Sniffing on June 3-5 (iCal). To attend that one, you’ll need a Live Class Season Pass: when you’ve got one of those, you get to attend any live online class I do throughout the span of your membership, even new ones that I bring out like this one. Just make sure you read the logistics, do the prerequisites, and set up your lab server. See you in class!


SQL ConstantCare® Now Tells You Which Indexes to Drop.


Indexes slow down data changes.

SQL Server doesn’t have the concept of asynchronous indexes – copies of the data that it could update later when it’s bored. It has to keep every nonclustered index up to date, every single time you change data, like a delete/update/insert.

Mo’ indexes, mo’ problems.

That’s why in my Mastering Index Tuning class, I start my D.E.A.T.H. Method with Deduping and Eliminating indexes that aren’t getting used. You wanna get rid of the dead weight that’s only slowing you down, and it’s especially important before adding more indexes.

The safest indexes to remove are ones that:

  • ARE being written to (because SQL Server has to keep them up to date) – I’m not as worried about archived or read-only tables
  • Aren’t used for uniqueness constraints (because they might be influencing execution plans even though they’re not used)
  • Aren’t columnstore indexes (because the presence of those can affect execution plans in other ways)
  • Haven’t been used at all for reads in over a month straight (ensuring they’re not used in month-end processing, either)

But you’re busy, you have a lot of databases and servers, and you don’t set yourself a reminder to check which servers have been up for 31+ days and then go review their index utilization. And even if you did, you wouldn’t want to script out all the undo scripts to back out your changes.
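If you want to spot-check a server yourself in the meantime, here’s a minimal sketch using the standard DMVs – this is not SQL ConstantCare’s actual logic, and the filters are just assumptions based on the criteria above:

    /* Step 1: make sure the server has been up long enough for the usage stats to mean anything. */
    SELECT sqlserver_start_time FROM sys.dm_os_sys_info;

    /* Step 2: nonclustered rowstore indexes in the current database that are being
       written to, aren't unique, and haven't been read at all since startup. */
    SELECT  OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
            OBJECT_NAME(i.object_id) AS table_name,
            i.name AS index_name,
            us.user_updates AS writes,
            ISNULL(us.user_seeks + us.user_scans + us.user_lookups, 0) AS reads
    FROM    sys.indexes AS i
    LEFT JOIN sys.dm_db_index_usage_stats AS us
            ON  us.object_id = i.object_id
            AND us.index_id = i.index_id
            AND us.database_id = DB_ID()
    WHERE   i.type_desc = N'NONCLUSTERED'   /* rowstore only - excludes columnstore */
      AND   i.is_unique = 0                 /* skip uniqueness constraints */
      AND   i.is_primary_key = 0
      AND   ISNULL(us.user_seeks + us.user_scans + us.user_lookups, 0) = 0
      AND   ISNULL(us.user_updates, 0) > 0  /* it IS being written to */
    ORDER BY us.user_updates DESC;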

SQL ConstantCare® now includes drop & recreate commands in your daily prescription.

Every day when your diagnostic data comes in, we examine your index usage and server performance. When we find good candidates for removal, we script out both the drop & recreate commands in your prescription file for easy running:

As with all of our prescription recommendations, you just wanna read through ’em as a sanity check, then run ’em. Note that the indexes are even fully qualified with database names: that’s because we want to make it super-easy for you to fix all of your databases just by hitting execute.

Sign up for SQL ConstantCare®, and then get started in minutes. We’ll help you take out the trash.


[Video] Watch Brent Write T-SQL

This morning I worked on a new check for sp_BlitzFirst to find statistics that were updated in the last 15 minutes, possibly causing plan caching issues and parameter sniffing. I streamed it live, and you can watch.
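If you’re curious what that proof-of-concept roughly looks like, here’s a hedged sketch along the same lines – not the code that actually shipped in sp_BlitzFirst, just the general idea:

    /* Statistics in the current database that were updated in the last 15 minutes. */
    SELECT  OBJECT_NAME(s.object_id) AS table_name,
            s.name AS stats_name,
            sp.last_updated,
            sp.rows,
            sp.modification_counter
    FROM    sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE   OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
      AND   sp.last_updated >= DATEADD(MINUTE, -15, SYSDATETIME())
    ORDER BY sp.last_updated DESC;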

In part 1, the first hour, I write the proof-of-concept query:

In part 2, the second hour, I move the query into sp_BlitzFirst, test it, debug it, and check it into GitHub:

And finally in the third hour, I switch to open Q&A and do a few live demos:

And now, I’m enjoying the well-earned bagel that I discussed on the webcast:


[Video] How to Think Like the SQL Server Engine: All-Demo Edition

You’ve heard of my free How to Think Like the Engine class, and maybe you even started watching it, but…it has slides, and you hate slides.

Wanna see me do the whole thing in Management Studio, starting with an empty query window and writing the whole thing out from scratch live? This session is for you.

I’ll teach all of the concepts without a single slide. You’ll still want to print out this 5-page PDF so you can understand the index pages I’m referring to onscreen – it’ll make a world of difference. This was a lively live-stream session on my Twitch channel, full of questions and jumping-off-topic demos: if you prefer a rigidly structured, not-meandering session, stick with the conventional version of this session.

Part 1 (50m)

We start with the clustered index on the Users table in StackOverflow, then gradually layer queries in, talking about execution plans, STATISTICS IO, and why queries probably look different between production & development:

 

Part 2 (30m)

Now that you’re done with part 1, let’s switch to SELECT *, add an ORDER BY, and see how SQL Server struggles with repeated queries. To make ’em go faster, we’re gonna need an index, so we build one of those too.

Part 3 (28m)

We finished part 2 by adding a nonclustered index. Now, let’s see all the ways that makes our query go faster. Then, we’ll add more columns to our query so that the index isn’t covering anymore, and we’ll force SQL Server to make a tough decision: do lots of key lookups, or just scan the table? Then we’ll learn how SQL Server uses statistics to make that decision.

Part 4 (13m)

We finished part 3 with two similar-looking queries that produce very different behavior: one does a seek, and the other does a scan. Now you’ll learn why, plus add yet another index to make this query go faster, and then discuss the drawbacks of too many indexes.

Enjoyed this session? Follow me on Twitch to be alerted whenever I’m streaming.


Parallelism Can Make Queries Perform Worse.

While I was building lab queries for my all-new Fundamentals of Parameter Sniffing course – first live one is next week, still time to get in – I ran across a query with delightfully terrible behavior.

I’m always torn when I build the hands-on labs. I want to make the challenges easy enough that you can accomplish ’em in the span of an hour, but hard enough that they’ll … well, challenge you.

It’s tricky walking that line.

This query, though…this query, even though it’s just one table with one nonclustered index, crosses the line way past what you’d probably pull off in a Fundamentals class. It’s just so beautifully diabolical that I gotta share it somewhere, so here we are, dear reader.

Let’s query the Posts table to find new questions & answers.

I’m using the 50GB 2013 version of the Stack Overflow database – large enough that it’s got some meat to it, but small enough that you can still download it quickly to follow along (it’s just 10GB compressed).

Every day, all day long, people are asking questions and posting answers at StackOverflow.com. As new questions & answers are posted, they start with CreationDate = right now, and Score = 0.

Let’s say that our users want to find posts with a minimum amount of Score points, Answers, and Comments, and we’ve written this stored proc to help ’em:
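The proc itself appears as a screenshot in the original post, so here’s a hedged reconstruction based on the description above – the proc name and exact column list are assumptions:

    CREATE OR ALTER PROC dbo.usp_FindInterestingPosts
        @ScoreMin INT,
        @AnswerCountMin INT,
        @CommentCountMin INT
    AS
    BEGIN
        /* Newest posts first, filtered by the caller's minimums. */
        SELECT TOP 1000 Id, Title, CreationDate, Score, AnswerCount, CommentCount
        FROM dbo.Posts
        WHERE Score >= @ScoreMin
          AND AnswerCount >= @AnswerCountMin
          AND CommentCount >= @CommentCountMin
        ORDER BY CreationDate DESC;
    END
    GO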

Sometimes our users want to catch brand-new questions before anyone else has jumped in. Those users call it with a very LOW ScoreMin, AnswerCountMin, and CommentCountMin. Other users are looking for a leaderboard – they just wanna see the highest-ranking questions so they can go read to learn. Those users call it with a very HIGH ScoreMin, AnswerCountMin, and CommentCountMin.

With that in mind, here are the two parameter sets we’ll test with:
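Using the values that come up in the walkthrough below, the two test calls look something like this (assuming the hypothetical proc sketched above):

    /* The new-post hunters: very low minimums. */
    EXEC dbo.usp_FindInterestingPosts @ScoreMin = 1, @AnswerCountMin = 0, @CommentCountMin = 0;

    /* The leaderboard readers: very high minimums. */
    EXEC dbo.usp_FindInterestingPosts @ScoreMin = 250, @AnswerCountMin = 10, @CommentCountMin = 10;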

We give those a shot, and we get lucky – our query runs pretty quickly. It just so happens there’s an index that our query will use, so we’re satisfied with these actual plans since the two queries combined run in just a few seconds:

We do a quick glance, and it’s using our index, and the actual numbers are pretty small – we’re good here, right? Ship it.

Eagle-eyed readers will note that this query does an index scan, but – that’s not necessarily a bad thing at all! In this case, it’s quite appropriate. SQL Server sniffs the parameters for the first query (@ScoreMin = 1) and says, “It’ll be really easy to find rows with such low Score & Comment & Answer numbers. I could just scan the posts from most recent CreationDate down, and do key lookups for each of ’em. I won’t have to look up many rows at all before I find enough rows with the right minimums for CommentCount & AnswerCount!” And he’s right. In case you’re following along at home, here’s the definition for that index on Posts:
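The index definition is a screenshot in the original post; based on the CreationDate_Score name mentioned below, it’s something along these lines – treat the exact key columns as an assumption:

    CREATE INDEX CreationDate_Score ON dbo.Posts (CreationDate, Score);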

But try running them in the opposite order,
and parallelism makes things terrible.

Free that plan from the cache, and then run them in the opposite order:
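Here’s one hedged way to do that, using the hypothetical proc name from the sketch above – sp_recompile evicts just this proc’s plan rather than nuking the whole cache:

    EXEC sp_recompile N'dbo.usp_FindInterestingPosts';

    /* Big parameters first this time, so their plan gets cached... */
    EXEC dbo.usp_FindInterestingPosts @ScoreMin = 250, @AnswerCountMin = 10, @CommentCountMin = 10;
    /* ...and then reused by the small parameters. */
    EXEC dbo.usp_FindInterestingPosts @ScoreMin = 1, @AnswerCountMin = 0, @CommentCountMin = 0;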

And, uh oh – this isn’t good – the first query is quick, but the second query takes almost a minute. The secret is in the sauce, and by sauce I mean actual execution plans:

Let’s walk through what happened when the first proc call ran:

  • This time, SQL Server optimized the proc’s plan looking for Posts with Score >= 250, AnswerCount >= 10, and CommentCount >= 10
  • It knows there are WAY fewer Posts that match those predicates
  • It still didn’t want to do a table scan – it still wanted to use the luscious CreationDate index because it satisfies the ORDER BY
  • But it knew it would have to read a lot of rows from that index in order to find the matching ones
  • So it decided to parallelize the scan on the CreationDate_Score index, letting multiple threads simultaneously work on doing all those key lookups
  • But because of that, when the CreationDate index is suddenly scattered across multiple worker threads, the threads’ output is no longer in index order, and it has to be sorted again

This works out just fine for those parameters because there are relatively few rows with Score >= 250, AnswerCount >= 10, and CommentCount >= 10. It’s not hard to sort those rows. Piece of cake, the query’s done in under a second. This is a completely appropriate use for parallelism.

However, let’s walk through what happens when the second proc call runs:

  • We’re reusing the plan built for Score >= 250, AnswerCount >= 10, and CommentCount >= 10
  • But there are way MORE rows with Score >= 1, AnswerCount >= 0, and CommentCount >= 0 – millions, in fact
  • Originally, when we only used 1 CPU core, we could just scan the index backward on one thread and stop doing key lookups as soon as we’d found enough rows to match the TOP 1000 in our query
  • But now, since the work was parallelized, all of the threads have to check all of the rows in the index, doing key lookups for them
  • And then of course the poor sort is screwed too, because he didn’t allocate enough memory to sort all these rows

And bam, we’re in a world of hurt.

So what’s the fix?

There are lots of different options here: query tuning, index tuning, hints, plan guides, you name it – but I’m going to use one you probably wouldn’t expect just to show you that it works. I’m gonna stick an OPTION (MAXDOP 1) hint on the query:
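Here’s a hedged sketch of what that looks like, again using the hypothetical proc from earlier – the hint goes on the SELECT statement inside the proc:

    CREATE OR ALTER PROC dbo.usp_FindInterestingPosts
        @ScoreMin INT,
        @AnswerCountMin INT,
        @CommentCountMin INT
    AS
    BEGIN
        SELECT TOP 1000 Id, Title, CreationDate, Score, AnswerCount, CommentCount
        FROM dbo.Posts
        WHERE Score >= @ScoreMin
          AND AnswerCount >= @AnswerCountMin
          AND CommentCount >= @CommentCountMin
        ORDER BY CreationDate DESC
        OPTION (MAXDOP 1);   /* keeps just this statement single-threaded */
    END
    GO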

With that change in place, there are no differences between the plans no matter which parameters go in first. Both queries get the first plan we discussed, and both queries run in a second or less. A similar change would involve setting Cost Threshold for Parallelism and MAXDOP appropriately at the server level to reduce the blast radius, but of course that would affect many more queries.

Is a parallelism hint the right fix? Well, there’s a whole buffet of options to choose from. Heck, one could spend an entire day just discussing the fundamental issue at hand, or even three days mastering it. Which reminds me – this demo’s too hard for the fundamentals class, but if I add in another index and use a third set of parameters, introducing 3 unique query plan combinations – yep, that’s perfectly diabolical for a Mastering level lab. See you in class!


Can I Offload DBCC CHECKDB To Another Server?

You want to check for corruption, but you don’t want to slow down your primary production server.

In this post, I’m specifically talking about offloading the corruption checking process. I’m not talking about doing corruption checking on both the primary and other servers – that’s wonderful, and if you’re doing that, you should hug yourself. You’re doing a good job. Who’s a good dog? You are! Good dog.

Now, for the rest of you – your production server is slow, and you wanna know what the drawbacks are of running CHECKDB on other servers instead. Here are the issues to consider. I’m going to be talking in general terms about business risk. I wish I could give you an idea of the risk you’re taking there, but without seeing your servers and practices, that’s tough. Start by putting your RPO & RTO in writing, and then these risks will start to make sense based on the size of your data, the amount of data you’re willing to lose, and how long you’re willing to be down:

If you offload production work, you need a production license. Developer Edition isn’t gonna cut it. Depending on your SQL Server version and the database features you’re using, you may be able to run Enterprise in production and then Standard in your offload environment, saving yourself some licensing costs. 2016 SP1 moved a ton of developer-focused Enterprise features down to Standard, including partitioning, compression, columnstore indexes, and more, and 2019 even added Transparent Data Encryption. That’s pretty cool! You can cut the costs on your maintenance jobs, so to speak.

If you don’t run CHECKDB on the same server that takes your backups, you’re running a very, very serious risk: your backups themselves may be corrupt. SQL Server will happily back up corrupt pages all night long without throwing an error. Even the restore process doesn’t check for corruption. It’s imperative that you get a corruption check done on the backup as quickly as possible. If you’re using Availability Groups and you’re offloading backups to a different replica, you gotta run CHECKDB on that same replica at the bare minimum. I’ve seen situations where the backup and CHECKDB were run on two different replicas, and the CHECKDB was passing on one replica, but the backups were backing up corrupt data pages on a different replica.
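If you’re wondering what that bare-minimum check might look like, here’s a minimal sketch – the database name is a placeholder, and PHYSICAL_ONLY is a common compromise when the full check won’t fit your maintenance window (it skips the logical checks, so it isn’t a complete substitute):

    DBCC CHECKDB (N'StackOverflow') WITH PHYSICAL_ONLY, NO_INFOMSGS;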

If you restore full backups elsewhere, you’re introducing a lag time risk. Say you’re offloading the work because you’re dealing with a large database – that also likely means you’re dealing with slow backups and restores. So if your timeline looks like this:

  • 6PM-8PM – full backup runs on production, pointed at a file share
  • 8PM-10PM – full restore runs in another environment, restoring from the file share
  • 10PM – CHECKDB starts
  • Midnight – CHECKDB finishes with a failure, indicating corruption

Okay, now what? Was it a problem with the backup, the restore, or the hardware in the CHECKDB environment? Your troubleshooting is only just beginning, and while you’re troubleshooting, corruption might be getting worse. On the other hand, if you’d run CHECKDB directly in production, you’d know with confidence where the problems are.

If you run CHECKDB on a server that had log backups applied, you’re testing different data pages than what lives on the primary. You can run CHECKDB on a log shipping secondary, for example, but only the logged changes are moving from the primary to that secondary, not the data page changes. That means it’s possible for the primary to be chock full of nuts at the same time the log shipped secondary is fine. What’s the big deal? Well, imagine two problems:

  • If the primary encounters corruption, sure, you can fail over to the log shipped secondary. Yay! Clean data pages.
  • If you need to restore something from backup, like for compliance purposes or to unwind an “oops” delete, think about where your full backups have been taken: that’s right, the primary! Those full backups were backing up corrupt data pages, and they’re worthless. You haven’t been taking full backups on the log shipped secondary, so while it has clean data pages, it only has clean data pages as of right now.

If you’re running CHECKDB somewhere that you won’t fail over to, you’re even more exposed. In our scenarios above, let’s say we offload CHECKDB to a restore server with less cores & memory, running cheaper Standard Edition. If our primary suddenly reports corruption, it doesn’t do us a lot of good to turn to the restore server and say, “Alright, buddy, time to promote you into the primary role.” Remember, we’re already offloading CHECKDB because the primary itself wasn’t fast enough to keep up with our workloads and our maintenance jobs: odds are, our lil’ buddy isn’t going to be able to do it, either. At that point, when you hit corruption on the primary, you already have to wipe it and restore, and hope that corruption doesn’t come back – but without doing root cause analysis on why the corruption hit, well, it’s probably gonna come back. Because of that, I’m not a fan of having only one big primary, and offloading CHECKDB to one tiny restore server. At that point, all our eggs are in one basket, and all CHECKDB is good for is to confirm that yes, the basket is broken. The business is down.

But if you’re using AGs, not restores, and you’re willing to fail over to that replica, the risk is more tolerable. If you offload both CHECKDB and backups to a replica that you’re comfortable failing over to if the primary encounters corruption, then you don’t have as much of a lag time risk. You just need more than 2 servers in the AG at this point, like this:

  • Replica1: normally our primary, super-busy, can’t tolerate maintenance slowdowns
  • Replica2: not queried, only used for backups & CHECKDB
  • Replica3: async, ready to jump into duty if necessary

Then if Replica1 starts reporting corruption, you can fail over to Replica2, make Replica3 the new backup/CHECKDB replica, and take Replica1 down while you troubleshoot its IO or SQL Server problems.

Automatic page repair isn’t a long term fix. In the scenario above, folks might say, “It’s okay if Replica1 hits corruption – it’ll just automatically fetch clean copies of the pages from Replica2 and Replica3, and heal itself in real time.” APR is like a bulletproof vest: it’s great for temporary protection for a lot of parts of your body, but if you get shot in places that the vest doesn’t cover, you’re gonna have a bad day. APR can’t heal corruption in all types of pages, in all databases. Once you start getting shot, you need to evacuate the area as soon as possible, and a bulletproof vest buys you that time. Same thing with APR – your IO subsystem or SQL Server itself is trashing your data pages, and you need to evacuate your data out of that area as quickly as possible to do root cause analysis.

tl;dr – if your only choices are either not running CHECKDB at all or offloading CHECKDB elsewhere, then obviously you should offload it. I’d rather you do it on the primary replica, but if you can’t, you can’t – just be mindful of the above gotchas.


I’m On the 6 Figure Developer Podcast

The 6 Figure Developer podcast with John Callaway, Clayton Hunt, and Jon Ash covers not just technical topics, but career topics so you can learn to leverage that technical knowledge and build a better career for yourself.

I’m on Episode 144 talking about career development. We talked about why I sucked so bad at development, my favorite resources for folks who are just getting started in their database administration careers, and I make a joke about going from a six-figure-developer to a seven-figure-database-administrator.

They stream their podcasts live on their Twitch channel as they’re recording, so if you wanna see ’em before they go live (plus see video feeds of the interview), subscribe to their Twitch channel or their YouTube channel.


How I Use Streamlabs OBS for Streaming and Training Classes

I’ve written a lot about the hardware & gear I use for streaming, and now let’s talk about the software side.

There are a few thousand words in here. Don’t think that you have to tackle all of this just to get started streaming, dear reader. I’ve been working on my streaming setup for months now, and I’m just trying to save you some time if you really wanna take the streaming thing seriously. Most of you aren’t going to need this post in any way, shape, or form – but then you might find it kinda interesting to see what happens behind the scenes. It’s surprisingly hard work to make live webcasts seem so fun, casual, and off-the-cuff!

The point of all the work we’re about to do is simple: make live webcasts as fun and dynamic as in-person sessions. Let’s get started.

Streamlabs OBS is like Azure Data Studio for streaming.

You know how you’re really familiar with SQL Server Management Studio, and you know how to use it to connect to SQL Server and get your job done? And you know how you’re kinda-sorta vaguely aware that Azure Data Studio is the new hotness, and it runs on Windows & Macs, and it connects to all kinds of databases, not just SQL Server?

Yeah, that’s kinda like Streamlabs OBS.

Streamlabs OBS is a cross-platform application that connects to a lot of different streaming back ends (Twitch, Facebook, YouTube, Mixer, etc.) and can even be used for screen-sharing apps like Zoom, GoToMeeting, and WebEx. Because you’re just now getting started with streaming, you’re gonna wanna use a cross-platform, multi-service tool to make your life easier because over the next couple/few years, you’re probably going to see different streaming services come and go. You wanna use one tool that works with all of ’em. I’ll talk more about the back end specifics later on in the post.

Streamlabs OBS is where you do your stream “development” – where you lay out different scenes, each of which is a collection of components. For example, here’s my live scene during a recent stream:

There are 3 components here:

  1. My SQL Server VM as the background
  2. My main camera at the bottom right, with a chroma key (green screen) effect so that my background disappears, and I can pretend to be a SQL Server weatherman, gesturing at specific things on query plans
  3. Live chat messages, which come up as toast popups above my head and disappear automatically after 15 seconds

This is where Streamlabs OBS really starts to shine: they make it easy for you to choose between lots of overlays & premade themes, and installing them is as simple as clicking a button. As you edit your scenes to make ’em yours, they’re automatically backed up to the cloud, too.

And I say “scenes” plural, because you’re going to want more than one scene. You probably wouldn’t watch a TV show that used a single fixed camera angle for an hour straight, and likewise, it’s difficult to focus on a streamer who uses exactly the same screen display for an hour straight. You wanna mix things up with some variety. My main live scene is pictured above, but as another example, here’s a chat scene:

Here, the viewer gets a different experience: there’s a different camera angle showing my office and San Diego in the background, and a big chunk of the screen is dedicated just to the chat. I switch to this scene when I want to have a longer discussion with the viewers about a topic, and where we don’t need to have the demo up onscreen.

I started streaming using the Black / White theme by OWN3D because it was the one that most similarly matched my own site. After streaming for a few weeks, I made notes of what I wanted in my scenes and what I didn’t, and then hired a designer to make a custom set of scene overlays to match the company branding. I’m really excited about ’em, but I didn’t quite get ’em in time for this post, so I’ll publish another post about those soon.

An Elgato Stream Deck helps you switch between scenes more easily.

To switch between scenes, you can use system-wide hotkeys (configurable in Streamlabs OBS), or configure buttons on an Elgato Stream Deck to send those same hotkeys. I use the right side keys for this so that I can memorize where each scene is at, and touch ’em with confidence with one hand without looking at the Stream Deck. Just makes transitions look a little more seamless.

The Stream Deck can even stack multiple actions up for each button press. When I switch from camera 1 to camera 2, I want the lights in my room to change as well. My background lights are plugged into Wemo Mini outlets, and the WSwitch app lets me use hotkeys to turn groups of lights on & off.

For example, when I hit the “Starting” button, it turns on all my office lights, switches to the Starting scene in Streamlabs OBS, tells OBS to start recording, and opens Pretzel, the music app I use for stream-safe music.

When I hit the buttons for other scenes, like the Cam2 scene above, it triggers other hotkeys, like telling WSwitch to turn on just the appropriate lighting for that scene. It sounds like overkill, but I stream anywhere between 5AM and 2PM Pacific, and during that time, lighting in my office changes a lot. I need to make sure lights aren’t pointing directly at the live camera, but I also need to make sure the green screen behind me is evenly lit. Erika got pretty frustrated with my continued purchase & exchanges of different pieces of lighting gear and light stands to get the combination just right, without having the lights be too visible to the viewers, hahaha.

The Stream Deck also helps a lot with viewer engagement and interactivity. I’ve customized the left side’s keys to do a variety of sound effects to help reinforce the concepts that I’m teaching – from cheers to jeers. I use these a lot, so I’ve arranged the first two columns so that I can access ’em by feel. They’re arranged from happy at the top to angry at the bottom. Since I use ’em so often, each key calls a random sound with a similar emotional feel: for example, the “Laughter” button triggers one of a dozen different crowd sounds so that it doesn’t sound the same every time.

When you’re using sounds online, you wanna be careful to honor the original sound producer’s copyright. I use royalty-free sounds purchased from Pond5.com, Sound-Ideas.com, and Storyblocks. Music presents a similar challenge: if you’re going to stream to public services like YouTube, make sure you don’t play copyrighted music during your stream. Streaming services will automatically mute your audio during the recordings if you violate copyright laws. I use the $5/mo Pretzel service with stream-safe music, but you might not like their mostly-electronic-dance-music stuff. I happen to love it.

While we’re looking at my Stream Deck, shout out to the free Stream Deck icons from Visuals By Impulse. Lots of beautiful button sets with nice themes.

Now that we’ve got all these fancy scenes set up, the next step is to stream ’em out so folks can see ’em.

Don’t stream directly to Twitch or YouTube:
stream to Restream.io.

Streamlabs OBS is like the Azure Data Studio of streaming: it’s a free client app, but it also requires a server to connect to. In the database world, that’d be SQL Server. In the streaming world, Streamlabs OBS connects to a lot of different back ends: Twitch, YouTube, Facebook, and more.

If you’re doing this to help as many people as possible, then you need to reach ’em where they are. If you don’t already have a big following, then you need to just get your work out to as many places as possible simultaneously. You know how it is: you probably spend your day in just one, maybe two social network sites, and you rarely keep multiple tabs open with all of ’em at the same time. If you stream to multiple services simultaneously, you stand a better chance of catching people no matter what service they use.

Streamlabs OBS kinda-sorta supports the ability to stream to multiple services at the same time, but it’s going to require a lot of bandwidth on your end. Instead of configuring Streamlabs OBS to broadcast directly to Twitch, you have Streamlabs OBS send your stream to Restream.io up in the cloud. Then, you configure Restream to relay your stream on to other providers, including Twitch, YouTube, Facebook, and others if you’re so inclined. As of this writing, you can probably get away with the freakin’ free plan for individuals, which completely blows me away. (I’m using a paid plan.)

Restream gives you metrics to show how many people were watching on each service. Here’s last Saturday’s stream:

I get the majority of my viewers on Twitch and YouTube, with only a small portion on Facebook and Mixer. I’m pushing video to LinkedIn, but they haven’t approved me to their private beta program for livestreaming yet, so I’m shouting into the void on that one. (If anybody knows how to get me into that program, holla – I’ve already submitted an application.)

Restream.io Chat centralizes the discussion, too.

Restream Chat centralizes your viewers’ live comments from Twitch, YouTube, Facebook, and even Discord all at the same time, in one stream, and you can overlay that stream onto your live broadcast! Everybody can see the comments/questions/chat from everybody else, and it all has one consistent look/feel.

This is especially important when you’re just getting started because frankly, you just won’t have a lot of watchers. When anyone on any service says something in chat, and it shows up onscreen, that triggers other folks on any service to jump in and start responding too. That’s interactivity. You want that, or else your stream feels like tumbleweed. As a presenter, there are few things as frustrating as doing a great job, but feeling like nobody’s listening or giving feedback.

First, install the Restream chat client on your local machine, log in, and then you see all the chat in one window while you’re broadcasting.

Next, you want your viewers to see what’s happening, too. In the Restream chat client, go into Settings, Embed in Stream. You can customize how your chat comments will appear onscreen for your viewers, with lots of themes available for different game styles. I went with the default (because it’s the least game-y), but for me, two settings are important:

  • Chat Background Opacity = 0% – because I want the chats to appear to float in midair on the screen, and not take up space permanently.
  • Hide messages after 15 seconds – because I want them to disappear quickly after I mention ’em. I want as much screen real estate available as possible for my demos and presentation slides.

Then copy the embed URL, go to Streamlabs OBS, add a new browser source, and paste in the URL for the Restream chat embed. Here are the official instructions. The result makes your recordings look fantastic.

Once your chat messages are up onscreen, you can encourage interactivity so much more. You can ask fast off-the-cuff questions like, “Where’s everybody at in the world today?” or “If you’ve encountered a problem like this, say yes in the chat” or “I’m about to solve this one of two ways: would you rather watch me use technique A, or technique B? Let me know in chat.” Next thing you know, your live webcasts feel just as interactive as your in-person presentations – or frankly, for me, they feel even MORE interactive. Many people are much more comfortable typing something into a chat window than they are raising their hand in a large group of people and asking a question out loud.

In theory, you could use this for private training classes.

When you join meetings with conventional apps like Zoom, WebEx, and GoToMeeting, you would normally share your desktop directly. However, if you want to leverage Streamlabs OBS itself to handle your screen transitions, jumping from app to app, and even popup notifications for things like chat messages, you theoretically could.

To do that, fire up Streamlabs OBS, right-click on the preview window, and tell it where to send the preview of your stream:

In my case, my second monitor is a small, cheap 1080p monitor (and quite good-looking, too, I might add), and that’s where I output my preview. It shows my current live scene, like me floating over my desktop. Here’s the key: when you share a monitor with apps like Zoom, share that monitor. Presto, you don’t have to share your camera separately – this second monitor has a beautifully produced feed.

The problem, though, is that I’ve had wildly unpredictable latency. In small meetings, it’s worked great – my onscreen video and my microphone audio have been beautifully in-sync. However, when I’ve tried it in larger meetings, like GroupBy last week, the audio & video were way out of sync. To fix that, I would have to try tuning the audio & video delay every time I’m in a meeting – and I just don’t want to bother with that. Instead, for private class streaming, I’ve abandoned Zoom/GoToMeeting/WebEx for something different.

But instead, for private stuff, check out Mux.

Initially, I was only going to use Streamlabs OBS for free public streams, but…wow, did I fall in love with it. The multiple camera angles, the scene transitions, the built-in high-definition recording – my public webcasts were so gorgeous and lively, but my GoToWebinar classes felt so lame in comparison. I had to figure out a way to use Streamlabs OBS for my training classes.

The problem is that Twitch, YouTube, and Facebook are all about public streams. They don’t charge you money because they make money off your viewers, either by charging them subscription fees or showing them ads. That’s great for public classes, but if you want to run private training classes and charge money for them directly, you’re gonna need a different solution.

You could set up private Zoom/GoToWebinar/WebEx meetings, and charge for access for those. The screen-sharing approach I described earlier would work okay. I just wasn’t a wild fan of how most of those worked, especially in the way they handled viewer data. I wanted something that would be more GDPR-friendly down the road – I’m still not to the point where I’m comfortable selling to the EU, but I’m getting a lot closer.

I wanted a streaming video provider that would let me embed videos inside BrentOzar.com instead. I’ve written about the WordPress plugins I use to sell training classes. I’m really happy with that infrastructure – it works great. I just wanted a video feed that I could embed directly in BrentOzar.com, and control access to that live training page based on whatever I’m teaching that day, and the memberships that the logged-in viewer has. If they’ve bought the course, let them see the video, that kind of thing. I tried Vimeo (since I’m already using them for the class recordings), but I was really disappointed with the 30-40 second latency the viewers were getting.

There are open source streaming platforms that you can download and run yourself in VMs, but…I wanted something really easy, with someone else managing the infrastructure. I didn’t mind paying extra to have someone else deal with the headaches. I don’t wanna learn how streaming servers work.

Enter Mux.com. They do streaming-as-a-service for developers who are building things that leverage online video. They have an API where you can create a stream, and then get back a URL that you can embed in your own site. They even offer a low-latency streaming option, and they seem to be hard at work building something with even lower latency.

For an 8-hour stream, Mux costs about $34, plus $0.63 per attendee. A 30-attendee class is about $52. Their pricing page has a nice calculator on it – just make sure you use the Live Streaming part.

It’s super-easy to create a stream on your Mux dashboard – click on this to watch it animate:

Click to watch me creating a stream (kinda boring, actually)

In that gif, I’m creating a new live stream, and you see me copy/paste in one line. By default, Mux streams aren’t low-latency: your viewers see things about 30 seconds after they actually happen. For a live training class, I want faster feedback than that because my attendees often ask questions about what I’m doing, so I add the reduced_latency = true parameter, which brings it down to around 10-15 seconds. Reduced latency isn’t the default because reduced-latency streams have a big disadvantage: if you get disconnected from the stream (or if you end it temporarily), you can’t pick back up. Reduced latency streams just end right there.

In that gif, I need to copy two things from Mux. First, I need the stream key, which I paste into Streamlabs OBS along with the Mux server name. I do wish that Streamlabs OBS would let me store multiple profiles, like one set of servers for my public webcasts and one set for the private ones, but I can deal with this.

Second, I copy Mux’s playback URL, and then I open BrentOzar.com. I’ve got the FV Flowplayer Video Player plugin installed, and it plays streams from Mux with zero configuration. I’m not going to go into detail on my e-commerce, subscription, and membership configuration – we’re hitting the edge of what it makes sense to publish publicly given the low number of people who will be leveraging that part – but hopefully this gets you close enough where you can figure out the rest if you wanna sell your own live classes.

Whew. The setup is a lot of work.

I’m three thousand words into this, and between the hardware & software, the evaluations & decisions & configurations & testing, I’ve probably got 100 hours into my streaming setup. I’m still very, very much a n00b compared to streaming pros.

Hopefully though, armed with what I’ve shared in this post, you’ll have a real leg up and a much quicker on-ramp to a brilliant online presentation experience. The whole reason I do this – blog and present – is to share the stuff I’ve learned along the way to help you have an easier time getting started than I did. Sharing what I’ve learned about streaming is no different.

I definitely don’t think of streaming as a direct income source – if I did, I’d be screwed. In my last 30 days of spotty streaming, I’ve earned about $77 on Twitch, most of which comes from subscriptions. (I spent more than that on a single light stand!) Professional streamers have to work a lot harder at this, streaming more content on a more regular basis, and taking specific streaming actions to trigger people to spend money, like buy sounds or cheer.

I do view streaming as an indirect revenue source, though: I regularly plug my training classes on there, and I sell sponsorships which produce the ads you see on the stream & blog from time to time. (I’m taking a month off third party ads during May to promote my new Fundamentals of Parameter Sniffing and Mastering Parameter Sniffing training classes instead – you’ll see the third party sponsorships kick back in in two weeks.)

For me, streaming is a better way of giving back to the community and raising awareness about my personal brand – the same thing I used to travel to local user groups and SQL Saturdays to do. Those in-person events aren’t happening again anytime soon – California won’t allow conferences until there’s a vaccination or a treatment – so I figure I might as well get good at this streaming thing.

I miss y’all, and I can’t wait to see you safely at in-person events again. In the meantime, I’ve tried to make my online training be the next best thing, and I hope y’all enjoy the work that I put into it. It ain’t easy – but I think the results are pretty freakin’ awesome.

Get out there and make. – @garyvee


I’m on the TRY/CATCH Podcast

While on a recent trip out to Farm Credit Services in Omaha, I sat down to talk with FCS’s Natalie Gerdes & Tim Goergen to record an episode of their podcast, TRY/CATCH.

You can listen to the podcast here, or hear it on iTunes, Google Play, and Spotify.

I talk about how people say you should have a 3-year or 5-year career plan, and … that’s just really hard for people like me to do. I don’t think there was ever a point in my career where I could have accurately predicted what I was going to be doing in 3 or 5 years – and any prediction I had was completely wrong.

Instead, do two things: do good things, and tell people. You’ve gotta do both of those if you want to get ahead in the long term. You can’t just do good things (without telling anybody), and you can’t just tell people you’re doing good things (when you’re not.)


Free Upcoming Live Classes

So, we’re still under lockdown, huh?

I see.

Well, I guess I’ve got no choice but to keep y’all busy. Let’s get together for some free live classes:

May 12 (Tues) Noon Eastern: The New Robots in SQL Server 2017 and 2019 – You’ve been hearing about all these robots that are coming to take your job. They’re going to automate all the SQL performance tuning and make your life way easier, right? Or harder, I guess…since you’ll be looking for a job. Thing is, most of that is just plain old marketing hype that Microsoft is using, trying to sell your management on upgrading to newer versions or moving to the cloud. In this session, Brent Ozar will blow the marketing smoke away, show you these features in action, and show you which ones are ready for prime time. You’ll walk away better equipped to have conversations with management about why you should (or shouldn’t) upgrade, and how you can use these features not just to stay employed, but have a better career. Register here by checking the box for the May 12 session, then put your email address in the box at right, and click Attend Event. If you can’t make it to this live class, the recording link will be sent to the registrants as well.

May 13 (Weds) 2PM Eastern: Dynamic SQL Pro Tips – Dynamic SQL is one of the most powerful tools in the database developer’s arsenal. When you need a complex search stored procedure that takes all kinds of parameters (price, category, location, color), dynamic SQL can run extremely quickly by leveraging better indexes. However, when done wrong, it’s extremely painful to troubleshoot. I’ll teach you how to do it right. Register here, and while you’re at it, check out the full GroupBy lineup – it’s awesome! All kinds of good free stuff.

May 16 (Sat) 9AM Eastern: How to Think Like the SQL Server Engine: All Demo Edition – You’ve heard of my free How to Think Like the Engine class, and maybe you even started watching it, but…it has slides, and you hate slides. Wanna see me do the whole thing in Management Studio, starting with an empty query window and writing the whole thing out from scratch live? This session is for you. I’ll teach all of the concepts without a single slide. You’ll still want to print out this 5-page PDF so you can understand the index pages I’m referring to onscreen – it’ll make a world of difference. This will be a lively session full of questions and jumping-off-topic demos: if you prefer a rigidly structured, not-meandering session, stick with the conventional version of this session. No registration required – just show up here at the class time, but you can also download a calendar file for a reminder. If you can’t make the live class, subscribe to my YouTube channel or my Twitch channel, and the recording will be posted there.

May 18 (Mon) Noon Eastern: Top 3 Pitfalls of Managing Your Cost in Azure SQL DB – You manage servers, not invoices. Your company has started to move their databases into Azure, and they’re asking questions that you’re just not ready to answer. How big does your server need to be when you’re moving an existing database up to the cloud? How are servers measured? How do you measure to know if your new Azure SQL DB is overwhelmed, or if it’s a code or index problem? How do you make the right adjustments once you’re up there? Register here. If you can’t make it to this live class, the recording link will be sent to the registrants as well.

May 30 (Sat) 9AM Eastern: Watch Brent Query PostgreSQL – I’ll write the spring 2020 version of the SQL ConstantCare Population Report, using Azure Data Studio to query Aurora PostgreSQL, and write a blog post in WordPress. The resulting post will drop on Monday, June 1. No registration required – just show up here at the class time, but you can also download a calendar file for a reminder. If you can’t make the live class, subscribe to my YouTube channel or my Twitch channel, and the recording will be posted there.


[Video] Updated First Responder Kit and Consultant Toolkit for May 2020

We have quarantined all of the bugs. Many of the bugs. Okay, 7 of the bugs. Hey, we actually added more improvements than we fixed bugs this month! My favorite new things are that sp_BlitzIndex @Mode = 2 now makes it even easier to do disconnected index tuning purely inside a spreadsheet, and sp_BlitzCache runs faster on big servers.

The “Deprecated” folder now has versions of sp_Blitz, sp_BlitzCache, and sp_BlitzIndex that work on SQL Server 2005. This isn’t new code: it just so happened that during one of my classes, a student asked for a 2005 version, and Oddvar Eikli mentioned that he happened to have all of ’em going back years. He graciously shared the versions that still worked with 2005, so I added ’em in. They aren’t updated by any means – they’re just here for historical purposes for folks who don’t have any other choice. Thanks, Oddvar!

To get the new version:

Consultant Toolkit Changes

Updated the First Responder Kit with this month’s script updates, plus:

  • The Indexes M2 tab adds a new Forwarded Fetches column. This is useful if you wanna sort by that descending to find the heaps that need to be rebuilt the most urgently.
  • The Indexes M2 tab’s Drop TSql and Create TSql columns now use fully qualified database names, making it even easier to copy/paste commands into a client prescription to tell them what indexes to drop, and give them quick undo scripts.

sp_DatabaseRestore: Fixes Awaiting Your Testing

Users have contributed 3 changes/improvements that need testing before we can get ’em into the dev branch. If you can test these by downloading the code and seeing if they work in your environment, please leave a note on the issue with your thoughts on it, and if a user or two agrees that they’re production-worthy, I’ll merge ’em in:

sp_Blitz Changes

sp_BlitzCache Changes

  • Improvement: when a plan isn’t found in cache, include a link to this DBA.StackExchange.com question so folks understand what could have caused it. (#2347, thanks Ajay Patel.)
  • Improvement: faster memory analysis using sys.dm_os_memory_clerks instead of sys.dm_os_buffer_descriptors. (#2345, thanks Daniel Mrowiec.)
  • Improvement: default output now adds the Remove Plan Handle from Cache column that used to require @ExpertMode = 1. Makes it easier to teach how to fix parameter sniffing issues. (#2346)
  • Fix: allows installation on Azure SQL Managed Instances. Remember, though, we don’t support anything in Azure SQL DB or Managed Instances, simply because Microsoft can (and has) changed the DMV contents without warning and without documentation. If it works, great; if it doesn’t, getting compatibility is left as an exercise for the reader. (#2353, thanks agentKnipe and Erik Darling.)

sp_BlitzFirst Changes

  • Improvement: @OutputType = ‘Top10’ now outputs the top 10 wait types since startup in a screenshot-friendly way, so when I’m doing Twitch streams analyzing your wait stats, I don’t have to give you as many instructions. (#2342, thanks Nick Kirby.)
  • Fix: if the SQL Server was renamed without doing it right, the global variable @@SERVERNAME wouldn’t be correct, so sp_BlitzFirst wasn’t deleting history in the sp_BlitzCache table for this server. We now use SERVERPROPERTY(‘ServerName’) instead. (#2320, thanks Sixten Otto and Adrian Buckman.)
  • Fix: the wait stats “per core per hour” number was rounding down too low due to an extra division, and it wasn’t Joy Division. (#2336)
  • Fix: looks like fileproperty() was showing up as DBCC in sys.dm_exec_requests, causing a false positive that CHECKDB was running. Not too sure about this one since I can’t test it easily, but if you find an issue with this, holla at ya boy by creating a new issue with what you’ve discovered. (#2350)

sp_BlitzIndex Changes

  • Improvement: @Mode = 2 now includes a column with Forwarded Fetches. The output to table doesn’t, though, because we’re not currently doing change detection to see if inserting a new column would break the table. (#2314)
  • Improvement: @Mode = 2 now includes fully qualified database names for the drop & create T-SQL so that you can do easier index tuning with the Consultant Toolkit, fully disconnected, and send clients recommendations that they can execute faster. (#2357)
  • Improvement: Obsessive Constraintive: Serial Forcer warning about UDFs now points to an explanation page. (#2340, thanks C.Hambrick.)
  • Fix: now handles identity seed values bigger than bigint, like DECIMAL(38,0). (#2268, thanks jbarnard84.)

sp_BlitzWho Changes

  • Improvement: for Agent jobs, the Program Name column now shows the Agent job name. (#2328, thanks Jerry Hung.)

Live Stream from the Coding Session

Bored? Want to make fun of my code? Here’s the recording of the 3-hour live stream when I worked on a few of this month’s fixes:

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes us!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


No, You Can’t Calculate the Tipping Point with Simple Percentages.

This morning, Greg Gonzalez (who I respect) posted about visualizing the tipping point with Plan Explorer (a product I respect), and he wrote:

The tipping point is the threshold at which a query plan will “tip” from seeking a non-covering nonclustered index to scanning the clustered index or heap. The basic formula is:

  • A clustered index (or table) scan will occur when estimated rows exceeds 33% of the number of pages in the table
  • A non-clustered seek plus key lookup will occur when estimated rows is below 25% of the pages in the table
  • In between 25% and 33% it can go either way

I’ve heard a few varieties of this over the years, but it’s fairly easy to disprove. Let’s take the Users table in the 50GB Stack Overflow 2013 database, and create a skinny version with a few NVARCHAR(MAX) columns that we’re not actually going to populate:
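The table-creation script is a screenshot in the original post; here’s a hedged reconstruction of the idea – the table and column names are assumptions rather than the post’s exact script:

    /* A narrower copy of Users that still declares a few NVARCHAR(MAX) columns
       we never populate, so the rows stay skinny and the page count stays low. */
    CREATE TABLE dbo.Users_Skinny
    (
        Id INT NOT NULL PRIMARY KEY CLUSTERED,
        DisplayName NVARCHAR(40) NULL,
        Reputation INT NOT NULL,
        AboutMe NVARCHAR(MAX) NULL,     /* declared, never populated */
        WebsiteUrl NVARCHAR(MAX) NULL   /* declared, never populated */
    );

    INSERT dbo.Users_Skinny (Id, DisplayName, Reputation)
    SELECT Id, DisplayName, Reputation
    FROM dbo.Users;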

Then let’s see how many pages are in our newly created table:
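One way to check, assuming the hypothetical table name above:

    SELECT SUM(ps.in_row_data_page_count) AS pages_in_table
    FROM sys.dm_db_partition_stats AS ps
    WHERE ps.object_id = OBJECT_ID('dbo.Users_Skinny')
      AND ps.index_id IN (0, 1);   /* heap or clustered index only */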

It has 7,795 pages:

Let’s create an index, and then run a query that could use it, but has to make a tough decision about the tipping point. I’m using 1 = (SELECT 1) to avoid questions about autoparameterization:
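Here’s a hedged sketch of that setup – the indexed column and the predicate value are illustrative assumptions, not the exact ones in the post’s screenshots:

    CREATE INDEX IX_Reputation ON dbo.Users_Skinny (Reputation);
    GO
    SELECT *
    FROM dbo.Users_Skinny
    WHERE Reputation = 2
      AND 1 = (SELECT 1);   /* defeats simple parameterization so the estimate stays put */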

The actual execution plan does an index seek plus thousands of key lookups:

Ah, the French key lookups have always been celebrated for their excellence

So how does the math work out on that:

  • Pages in the table: 7,795
  • Estimated number of rows: 3,551 (45.6%)

That’s why I get so fidgety when I see someone give a black-and-white rule like “a clustered index scan will occur when estimated rows exceeds 33% of the number of pages in the table.” It’s just not true, and in some cases – like this one – SQL Server pursues key lookups way, way beyond the point where it makes sense. We’re doing more reads than there are pages in the table:

Ouch. So two things: first, that “rule” is just an urban legend, and second, when you hear a rule – no matter who the source is, me included – stop to ask yourself, “If I had to disprove that rule, how would I do it?” Then go grab the Stack Overflow database and do some critical thinking for yourself. Write a demo script. Bust a myth. It’s how you learn, and it’s how you make sure that what you’re learning from is actually right.

Update 2020/05/07 – Greg responded in the comments and he’s corrected his blog post. It now says “will often occur” rather than just “will occur.” Thanks!


Your Views Aren’t The Problem. Your Code Is.

“I hate views,” the DBA said. “They kill performance. And nested views are even worse.”

Wrong. I’ll prove it.

I’m going to use the 50GB StackOverflow2013 database, and I’ll start by creating a couple of indexes on the Users table to help our queries, then creating a view on top of it:
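The actual script is in the post’s screenshots; here’s a hedged sketch with made-up index and view names so you can follow along:

    CREATE INDEX IX_Location    ON dbo.Users (Location);
    CREATE INDEX IX_DisplayName ON dbo.Users (DisplayName);
    GO
    CREATE OR ALTER VIEW dbo.vwUsers AS
    SELECT Id, DisplayName, Location, Reputation, CreationDate
    FROM dbo.Users;
    GO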

First, compare plans for the table and the view.

If I run two identical queries looking for users in a specific city:
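Something along these lines – the city is just an example value:

    SELECT Id, DisplayName, Reputation
    FROM dbo.Users
    WHERE Location = N'San Diego, CA, USA';

    SELECT Id, DisplayName, Reputation
    FROM dbo.vwUsers
    WHERE Location = N'San Diego, CA, USA';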

Both query plans use the index, estimate the same number of rows, and work the same way:

Both queries do the same number of reads, and take the same amount of time:

There’s no monster hiding under your bed.

Same thing with nested views:
there’s nothing innately wrong with them.

Let’s go deep and create a whole series of nested Babushka dolls ten layers deep:
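Here’s one hedged way to stamp those out without writing ten CREATE VIEW statements by hand – the layer names are made up, and the last query hits the outermost doll:

    /* Build ten nested views, each one wrapping the previous layer. */
    DECLARE @i INT = 1, @sql NVARCHAR(MAX);
    WHILE @i <= 10
    BEGIN
        SET @sql = N'CREATE OR ALTER VIEW dbo.vwUsersLayer' + CAST(@i AS NVARCHAR(2))
                 + N' AS SELECT * FROM '
                 + CASE WHEN @i = 1 THEN N'dbo.Users'
                        ELSE N'dbo.vwUsersLayer' + CAST(@i - 1 AS NVARCHAR(2)) END
                 + N';';
        EXEC sys.sp_executesql @sql;
        SET @i += 1;
    END
    GO
    SELECT Id, DisplayName, Reputation
    FROM dbo.vwUsersLayer10
    WHERE Location = N'San Diego, CA, USA';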

And then query the outermost one. It still uses the index, still estimates row counts correctly, and still does just 6 page reads in 0ms.

If you’re purely using views for the purpose of abstraction, you’re fine. You can use them for assigning permissions, for example, and you don’t run into any problems at all – until you start getting fancy with the code.

The problem isn’t the view.
The problem is what you put in it.

Let’s go back to a simple non-nested view and start putting some business logic in it. For example, let’s say we want to hide users who’ve left a comment that got highly downvoted:
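Here’s a hedged sketch of what that might look like, altering the earlier example view – the downvote threshold is made up for illustration:

    CREATE OR ALTER VIEW dbo.vwUsers AS
    SELECT u.Id, u.DisplayName, u.Location, u.Reputation, u.CreationDate
    FROM dbo.Users AS u
    WHERE NOT EXISTS
        (
            SELECT 1
            FROM dbo.Comments AS c
            WHERE c.UserId = u.Id
              AND c.Score < 0   /* "highly downvoted" - threshold is a guess */
        );
    GO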

Now, all of a sudden, the query takes longer to execute and does over a million logical reads, because it needs to scan the Comments table. The execution plan shows that it’s a hot mess and desperately needs an index:

But the problem still isn’t the view, as evidenced by the fact that if we inline that code into a single query, not referencing the view, we get the same execution plan and behavior:
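Here’s the same logic inlined, using the hedged example from above:

    SELECT u.Id, u.DisplayName, u.Reputation
    FROM dbo.Users AS u
    WHERE u.Location = N'San Diego, CA, USA'
      AND NOT EXISTS
          (
              SELECT 1
              FROM dbo.Comments AS c
              WHERE c.UserId = u.Id
                AND c.Score < 0
          );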

Views aren’t killing performance.
That code sucks by itself, or in a view.

Stop blaming views for crappy code. If you copy/paste all the code from the views and inline it directly into a single query, you’re likely to recoil in horror. The problem isn’t the views: the problem is that someone put bad code into a view, and therefore made it reusable.

Views are great for reusability.

If you choose to reuse crappy code, well, that says less about the view, and more about you.


Announcing Two New Parameter Sniffing Classes

During the quarantine downtime, I’m working on a couple of new classes that I think you’re gonna like. Both of them are 100% demos, no slides (other than introductions at the start of the day), complete with hands-on lab assignment challenges for you to solve. They’re both half off this month to celebrate the launch, too.

Fundamentals of Parameter Sniffing

Sometimes the exact same query goes slow out of nowhere.

Your current fix is to update statistics, rebuild indexes, or restart the SQL Server. It works, but you don’t know why.

You’re facing a problem called parameter sniffing: SQL Server “sniffs” the first set of parameters for a query, builds an execution plan for it, and then reuses that same plan no matter what parameters get called. It’s been the bane of our performance tuning for decades, and it doesn’t appear to be going away anytime soon – in fact, in newer versions of SQL Server, it even gets worse!

In this one-day live online class, you’ll follow along with me in the Stack Overflow database on your own laptop or desktop as you learn:

  • What kinds of queries are vulnerable to parameter sniffing
  • How to test your code to see if it’s vulnerable
  • Why tactics like rebuilding indexes and updating statistics fix one query, but make others worse
  • How to proactively patrol your server to uncover the problems before they bring you down

Upcoming class dates are May 26, August 4, November 16, and more. You can register for this class now, plus Live Class Season Pass holders can attend this class free (here’s how), and Recorded Class Season Pass holders will get the recorded version of the class free after the first class delivery.

Mastering Parameter Sniffing

You’ve conquered my Fundamentals of Parameter Sniffing class, and now it’s time to get to work. You know you have parameter sniffing problems in your environment – and now you need to learn how to fix ’em.

In this 3-day course, you’ll learn:

  • How to capture good, bad, and terrible plans with the plan cache and Query Store
  • How to spot the common places where plans diverge: parallelism, memory grants, index selection, and join types
  • How to prioritize which plan differences you need to correct, based on your server’s wait types, and which changes are harmless bystanders
  • Ways to mitigate parameter sniffing with index changes, query tuning, and database-level settings
  • How SQL Server 2017 and 2019 try (and usually fail) to reduce it with Adaptive Joins, Adaptive Memory Grants, and Automatic Tuning

Upcoming class dates are June 3-5, August 5-7, and more. You can register for this class now, plus Live Class Season Pass and Black Friday Level 2 & 3 holders can attend this class free (here’s how), and Recorded Class Season Pass and Black Friday Level 1 holders will get the recorded version of the class free after the first class delivery.

Both classes are half off this month to celebrate the launch.

I’m so excited to share these with y’all because parameter sniffing is just such a fun topic. When I first sat down to write these courses, I thought I’d only have a day of material, tops – but the more I wrote, the more I realized I’m not even sure I can cover it all in four days, especially with the hands-on homework labs. Your first lab in Fundamentals requires you to write your own single-table query that’s going to be susceptible to parameter sniffing, and then prove it. It only gets more challenging from there, and I can’t wait to see how you do.


Starter/Good/Great/Bananas Live Streaming Setups

For the next year or two, we’re going to be learning and sharing online rather than in person. I wanna look and sound as good as my budget (and my face) will allow, so in this post, I’m going to lay out starter ($220), good ($1,500), and great ($3,000) online video/audio setups and show you what each level looks & sounds like.

The links in this post are all Amazon referral links, so I get a kickback if you buy through ’em. If you buy your gear through these links, I appreciate your generosity – it’s like tipping me for putting the time into building posts like this. You rock. I love you. Now let’s get started.

The easiest way to understand how investments can pay off is to talk about microphones. In this 11-minute video, I use four different microphones so you can hear the difference between different investment levels:

The gear I show in that video:

  1. Crappy: Logitech Brio webcam ($200) – sold out everywhere, but that’s okay, you probably don’t wanna buy it after hearing it
  2. Starter: Jabra Evolve 40 headset ($100) – and make sure to pick whether you want mono (1 headphone) or stereo (2 headphone speakers), and USB-A or USB-C.
  3. Good: Electro-Voice RE20 microphone ($400) – which also needs the Focusrite Solo or 2i2 (whichever is in stock), an XLR cable, and a boom arm (whatever’s in stock)
  4. Great (for me): DPA headset ($1,000) – highly configurable for one or two earpieces, different colors for different skin tones, etc. This one isn’t a referral link because the buying process is kinda tricky because they’re so configurable: DPA assembles these to order whenever your order is placed through one of their resellers. When you get to the point of picking out the audio adapter, you want the XLR version, which you can then plug into a Focusrite Solo or 2i2 (whichever is in stock) using an XLR cable.
  5. More microphones: Marco Arment’s podcasting microphone review – with samples from dozens of microphones.

See – or hear, I guess – what I mean about starter, good, and great? Everything about streaming has that gradual progression. Now let’s zoom out and talk about what influences our decisions as we pick out our gear and upgrade it over time.

My home office has constraints.
Maybe yours does too.

Erika and I chose to live in downtown San Diego, California, which means we can’t afford a big apartment with a huge home office. My office is just 10′ x 12′, about 11 square meters. We also like clean, modern architecture, so we’re in a building with floor-to-ceiling windows, concrete walls, and wood(ish) floors. My office has a 12′ wall of glass and a 4′ wall of exposed concrete.

My office means good audio is tougher to get. When you record audio in a small space with hard surfaces, you get a lot of echoes and reverb. I could work around that by covering the home office walls in audio-absorbing materials, but…frankly, I don’t want to. They’re ugly.

My office also means good video is tougher to get. The 12′ wall of glass means I have dramatically different lighting through the course of the day. I typically start classes at 6AM Pacific (9AM Eastern), so within the first hour of class, that 12′ wall goes from pitch black to brightly lit. I don’t have a whole lot of space for bulky light box gear to make up for it, either.

If you have a giant home office with carpeted floors and tiny windows, then you should probably stop reading here. You would be able to use different kinds of microphones, like boom mics that wouldn’t show up onscreen, and use different kinds of lighting that would get better results. You’ll be better off searching for general video recording studio setups.

With those constraints in mind, let’s start picking out our gear.

Starter Setup: $220

You can get started streaming with any phone headset and the webcam built into your laptop. Before this whole virus thing started, I’d have told you to spring for a slightly better headset and webcam, but all the kinda-sorta-upgraded gear is sold out now anyway. Just use whatever webcam you have (worst case scenario, the one built into your laptop), and buy two things:

Microphone: Jabra Evolve 40 Headset ($100). I like this because it’s available in either USB-A or USB-C, and in both two-ear or one-ear versions. I like the one-ear version because I don’t like having both of my ears covered in long meetings. I like still being able to hear what’s going on in the room. The single-earphone version easily flips from left to right ear orientation. I don’t like wireless headsets in this price range: you’ll deal with audio lag time, lower microphone quality, and constant recharging of batteries. If you spend $100 just on a microphone alone, you can probably do better in terms of quality – but you’re not going to remember to keep your mouth as close to the microphone as possible. A headset takes care of that problem.

Light: Neewer 18-inch ring light, stand, and tripod ($120). Your favorite beauty YouTubers (don’t lie, I know you watch them) use adjustable, dimming ring lights because they cast very complimentary light patterns. Trust me, our faces need all the compliments we can get. If you have a standalone USB webcam, you can figure out how to mount it in the middle of the ring light, so when you look at the camera, you’ll have a good lighting position, and you’ll look great. If you try mounting the webcam & light on your monitor, then when the monitor wobbles as you’re typing, so will the webcam & light. (You haven’t noticed your monitor wobbling? Your viewers will, especially when you mount a webcam and a light to the top of it.) Don’t put the light’s tripod on your desk, either, because your desk wobbles too. Put the tripod on the floor, behind your desk.

You need a microphone, webcam, and a light. All three are required. I don’t care if you don’t agree with my specific gear choices in this post, but just as you wouldn’t try to present without a microphone, you can’t present without a webcam and a light. Many viewers will tune out quickly if you’re invisible.

Here’s what a Zoom session looks like with that setup:

Optional upgrade webcam: Logitech Brio 4K ($200), but really anything you can get right now. The Logitech is the best USB webcam out right now: best resolution, best low-light sensitivity – but due to everyone working from home right now, it’s pretty much impossible to find these in stock. You may have to settle for something else. That’s okay – the light is going to make a lot of crappy webcams look great.

Optional: collapsible chroma key screen ($65) or wall-mounted Elgato green screen ($160). If you have a cluttered or ugly background, just put this up behind you. Many webcast apps like Zoom and Teams will be able to swap out your background in real time much more easily with this, and it’ll actually look good. If you try using their built-in features to replace your background and you DON’T have one of these, it looks terrible and amateurish.

I don’t think you should invest more than this until you’re ready to commit to the entire package of kit in the next section. You can go a really, really long way with the above gear. Build your audience and build your technical content, but track how much revenue you earn from it. When you’re ready to make an investment and take your game to the next level – but you’re also confident that it’s going to result in increased income – then hit the next level of gear.

Software: Streamlabs OBS (Free)

The more you present, the more you’re going to want to lay your video directly on top of your slides. Here’s what it looks & sounds like with the starter hardware setup listed above:

 

In the streaming world, the default standard is the free, open source OBS Studio, but…it’s not very friendly or easy to use. I had a pretty rough time when I first started with it, and I decided I didn’t want to keep futzing with it just to get it working.

I use Streamlabs OBS instead. It’s also free, works with both Windows & Mac, and Streamlabs has added a lot of cool stuff to make the streaming process easier. They do try to upsell you to Streamlabs Prime, which costs $12/mo and adds all kinds of themes and apps. I think it’s worth it for a better-looking stream, but don’t spend any money on it when you’re first getting started. Just download plain ol’ Streamlabs OBS and go from there. Their 5-minute tutorial is super straightforward, and you can start streaming to YouTube, Twitch, or Facebook in a matter of minutes.

But like I discuss in that video, we’re specifically talking about streaming to YouTube, Twitch, or Facebook here. This setup doesn’t work for meeting software like Zoom, WebEx, or GoToMeeting. There are ways to get it to work, like obs-virtual-cam on Windows and obs-mac-virtualcam on the Mac, but that’s outside the scope of this blog post.

Good Streaming Setup: ~$1,500

You’ve been streaming for a while, and it’s earned you a consulting engagement, or you would like it to. You put work into your presentations, and you’ve decided that you want people to notice both the quality of your material, and the quality of how you deliver it.

While the stuff in the starter setup was meant to get you off the ground, it’s not really reusable or extendable. From here on out, we’re talking about foundational building blocks that you’ll never throw away – you’ll only build on top of them.

Camera: Sony A6100 ($700). Think of it as a webcam with a really nice 4K sensor, but instead of a USB cable, it has an HDMI output. (It has a micro USB input, but that’s just for charging.) We’ll need an HDMI converter to make it act like a webcam – more on that in a second. The A6100 is one of the cheapest interchangeable lens cameras that shoots 4K video continuously and has a little pop-up selfie screen:

This makes it easy to stay in-frame, and when you’re webcasting, your eyes are naturally drawn to your own face. That helps you stay focused on looking at your viewers, and raises the quality of your video.

Camera adapter: Elgato Cam Link 4K ($175). This takes a camera’s HDMI output and turns it into a webcam input for meeting software. It’s super simple, just plug and play, no settings to muck with. If you add in multiple cameras later, you can use multiple Cam Links, but they do need to be plugged into different USB root hubs.

Microphone: Electro-Voice RE20 ($400). This thing sounds buttery and amazing. The drawback is that, like any professional microphone, it’s kinda sensitive to placement. You’re going to have to mount it on a microphone arm that’s visible inside your videos, and you need to be right up on top of it to get the best sound. However, for me with my small, echo-y office, that’s also a benefit! It picks up way less of the noises in the room.

Microphone Interface: Focusrite Scarlett Solo ($150-$200). Professional microphones have XLR cables, and they need a device that accepts analog XLR signals, turns them into digital signals, and sends them over USB. The Solo is a great cheap way to do it, and it only handles 1 microphone, but that’s all you need. If the Solo isn’t in stock, the 2i2 and 4i4 also work fine – they’re just overkill because they handle more microphones, and you don’t need that.

The most anal-retentive readers might catch that I’m using a different lens than the kit 16-50MM one that ships with the Sony A6100. I’m using a nicer Sigma 16MM F/1.4 lens, but you shouldn’t do that: use the kit lens so you can use its zoom to frame you & your green screen as closely as possible, filling as much of the camera’s sensor as possible, and not cropping out pixels in OBS like I’m doing in the video below. (Long story as to why I have the Sigma, more on that in a future post when I add the next Sony camera. I’m gradually building out my setup for 3 cameras.)

Great (for me) Dual-Camera Streaming Setup: ~$3,000

This tier is about really starting to look professional: having multiple cameras, and switching between them like a live TV news studio. Here’s what it looks and sounds like:

We’re reusing almost all of the gear from the Good tier above:

This gear will be new:

Customizable keyboard control: Elgato Stream Deck ($250). LCD keys so you can configure what they show & do, like switching between different cameras. Could you do this by memorizing a bunch of hotkeys? Probably, but as you start to rely on more advanced OBS functionality, like playing sound effects, this will come in super handy.

Headset: DPA In-Ear Broadcast Headset ($1,000). If we’re going to pivot back and forth between cameras, we’ll need a microphone that works wherever we turn, and doesn’t get in the way of our video. This headset has a very good directional microphone, very small so it doesn’t look like you’re wearing an air traffic controller’s headset, and very configurable. These are basically assembled-to-order based on your choices:

  • Different colors depending on your skin tone
  • One earphone or two – I like one, since I like to hear the sounds in my office
  • Lots of adapters – if you’re just going to tether directly to the Focusrite Scarlett, you can get a cable to connect the DPA to an XLR cable, or you can get the DPA with different connectors for different wireless packs, like what I’m about to describe next.

Optional: wireless audio connectivity:

  • For the microphone: Sennheiser AVX ($900). The only reason I got this is so I can move around much more easily, and walk away from my desk without worrying about clipping/unclipping different pieces of gear. This is basically an industry standard for wireless audio, and it’ll last me the rest of my life. It actually comes with a clip-on lav microphone, but it’s just alright, not great.
  • For the headphone: Mpow Transmitter/Receivers (2, $30 each). The DPA headset above has an earphone that terminates in a 3.5mm headphone plug. You could plug it directly into your computer, but then you’d be tethered to your desk, defeating the purpose of the Sennheiser AVX.

Here’s what the office looks like from the other side of the screen:

Bananas Setup: Tens of Thousands of Bucks

Adam Saxton is part of the Guy in a Cube phenomenon, and as part of their channel hitting 100,000 YouTube subscribers, Adam gave a video tour of his home studio in Texas and posted the video gear he & Patrick use:

Of course, Adam didn’t jump to that overnight: he began with a starter streaming setup just like you will. He continued to invest in his skills and his gear, saw the returns, and continued to invest more. You don’t need Adam’s gear to get started, and neither did I.

Just get a ring light and a green screen, install Streamlabs OBS, and start sharing your presentations on YouTube or Twitch for free.


Free Webcast on Dynamic SQL Pro Tips

Dynamic SQL is one of the most powerful tools in the database developer’s arsenal.

When you need a complex search stored procedure that takes all kinds of parameters (price, category, location, color), dynamic SQL can run extremely quickly by leveraging better indexes.

However, when done wrong, it’s extremely painful to troubleshoot.

I’ve been working with dynamic SQL for over a decade. At the next free GroupBy online conference on May 13th, I’ll show you how to use sp_executesql, sp_BlitzCache, and my favorite tricks for easier debugging and commenting. The full session list is out, and it’s a great lineup, especially for free.

Register now for free. See you there!


Can You Use NVARCHAR As a Universal Parameter? Almost.

A perfect storm of unusual problems caused a client to ask, “What if we just used NVARCHAR(4000) as the default datatype parameter for any query, regardless of what datatype the table has in it – like numbers or dates?”

It actually works pretty well in most situations, believe it or not.

The Users table of the Stack Overflow database is helpful for demoing this because it has strings, integers, and dates. We’ll create a few indexes:
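
For example, one index each on a string, an integer, and a date column (exactly which columns you pick doesn’t really matter, so these are just my choices):

CREATE INDEX IX_DisplayName ON dbo.Users (DisplayName);
CREATE INDEX IX_Reputation ON dbo.Users (Reputation);
CREATE INDEX IX_LastAccessDate ON dbo.Users (LastAccessDate);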

And then we’ll create a stored procedure with a single NVARCHAR(4000) parameter. I could use three different parameters, but one is all I need here to prove the point:
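
A minimal sketch, with the proc and parameter names made up for illustration:

CREATE PROC dbo.usp_SearchUsers
    @SearchValue NVARCHAR(4000)
AS
BEGIN
    SELECT Id, DisplayName, Reputation, LastAccessDate
    FROM dbo.Users
    WHERE DisplayName = @SearchValue
       OR Reputation = @SearchValue
       OR LastAccessDate = @SearchValue;
END
GO

/* Sample call - pass something that also converts cleanly to INT and DATETIME: */
EXEC dbo.usp_SearchUsers @SearchValue = N'20130101';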

The actual execution plan is delightful: it does index seeks against all three columns, regardless of their datatype:

And there’s no yellow bang warning on the SELECT operator about implicit conversions. SQL Server is able to implicitly convert the NVARCHAR parameter to the appropriate INT and DATETIME datatypes to match what’s in the table. The length of the NVARCHAR(40) DisplayName column doesn’t matter either: there’s no implicit conversion problem between different datatype lengths. (Of course, if you were doing an insert or update and tried to cram more data into a column than it could hold, you’d have a problem, but that’s rather obvious.)

Problems start to arise
with VARCHAR columns, though.

The Users table doesn’t have an ancient, barbaric datatype like that, so we’ll have to add one:
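
Something like this, with the length and index name being my guesses:

ALTER TABLE dbo.Users ADD Location2 VARCHAR(100);
GO
UPDATE dbo.Users
SET Location2 = Location;   /* copy the existing data over */
GO
CREATE INDEX IX_Location2 ON dbo.Users (Location2);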

And then we’ll modify our stored procedure to query just that one column:
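
Sticking with the hypothetical proc from earlier:

ALTER PROC dbo.usp_SearchUsers
    @SearchValue NVARCHAR(4000)
AS
BEGIN
    SELECT Id, DisplayName, Location2
    FROM dbo.Users
    WHERE Location2 = @SearchValue;   /* VARCHAR column meets NVARCHAR parameter */
END
GO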

Now, the actual plan es mal:

Now, we’re seeing implicit conversions, and these have three problems:

  1. We get an index scan, not a seek: we’re reading millions of rows instead of a few
  2. SQL Server has to do the CPU-intensive work of upconverting every VARCHAR Location2 value to match the NVARCHAR parameter of the stored procedure
  3. Our estimates can be bad because SQL Server doesn’t know what it’s going to find after that upconversion happens

That alone is bad, but it gets worse.

Mo’ columns, mo’ problems.

Let’s go back to our multi-use stored procedure that searches across multiple columns, and this time let’s add in Location2:
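
Again using the hypothetical proc name from earlier:

ALTER PROC dbo.usp_SearchUsers
    @SearchValue NVARCHAR(4000)
AS
BEGIN
    SELECT Id, DisplayName, Reputation, LastAccessDate, Location2
    FROM dbo.Users
    WHERE DisplayName = @SearchValue
       OR Reputation = @SearchValue
       OR LastAccessDate = @SearchValue
       OR Location2 = @SearchValue;   /* the new VARCHAR column */
END
GO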

Now, the actual execution plan doesn’t even bother with nonclustered indexes at all, and it’s gone from mal to muy mal:

Ouch. At least SQL Server gives us the courtesy of the yellow bang, though.

The moral of the story: NVARCHAR parameters can be alright if you’re only searching NVARCHAR, date, and integer columns – but be wary of which datatypes can be implicitly converted, and remember that even when they can be, you can still end up with bad estimates and scans.

Update 2020-05-27: David Barbarin blogged about a real-world scenario where this exact problem struck: someone used an NVARCHAR parameter thinking it would be universally converted (and it is), but it backfired due to implicit conversion.


If You Have Foreign Keys, Don’t Update Fields That Aren’t Changing.

If you update a row without actually changing its contents, does it still hurt?

Paul White wrote in detail about the impact of non-updating updates, proving that SQL Server works hard to avoid doing extra work where it can. That’s a great post, and you should read it.

But foreign keys add another level of complexity. If you have foreign keys, and you tell SQL Server that you’re updating the contents of those related columns, SQL Server will check ’em even when the data isn’t changing.

To illustrate it, I’ll take the Stack Overflow database and add a foreign key between Posts and Users:
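
Here’s a sketch (the constraint name is just my placeholder):

ALTER TABLE dbo.Posts WITH NOCHECK
    ADD CONSTRAINT FK_Posts_Users_OwnerUserId
    FOREIGN KEY (OwnerUserId) REFERENCES dbo.Users (Id);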

I’m saying that every Post’s OwnerUserId column must map up to a valid Users.Id. I’m using NOCHECK to get the implementation done faster because I’m not worried about the quality of the existing data in this case – I’m not trying to write a demo about foreign key join elimination.

Let’s grab a Posts row and see who the current OwnerUserId is:
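
Any post will do; I’m grabbing the lowest Id just so it’s repeatable:

SELECT TOP 1 Id, OwnerUserId, Score
FROM dbo.Posts
ORDER BY Id;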

Then set the Posts.OwnerUserId to the same value, and look at the actual execution plan:
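
The Id and OwnerUserId below are placeholders; plug in the values you got back from the query above:

UPDATE dbo.Posts
SET OwnerUserId = 8    /* the same value the row already has */
WHERE Id = 4;          /* placeholder post Id */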

The plan shows a clustered index seek against the Users table even though the data isn’t changing. Why? Because SQL Server has to go check that the Users row is still there.

The same thing happens even if I don’t pass in a literal value at all, and just set the column to itself – that update also hits the Users table just to check. Here’s the query and its actual execution plan:
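
Using the same placeholder Id:

UPDATE dbo.Posts
SET OwnerUserId = OwnerUserId   /* no new value, just set the column to itself */
WHERE Id = 4;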

But if I update an unrelated column, like Score, then SQL Server doesn’t have to check the Users foreign key, and the actual plan avoids touching the Users table:
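
Same placeholder Id again, but this time touching a column with no foreign key on it:

UPDATE dbo.Posts
SET Score = Score
WHERE Id = 4;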

What this means for query authors

In a perfect world:

  • Tables all have foreign keys that lay out their relationships
  • You only update columns that are actually changing
  • Therefore, you only incur the overhead of foreign keys when the related columns change

In an imperfect world:

  • Your queries update every column, every time, regardless of whether they’re changing or not – maybe due to an ORM, or maybe due to a quick/easy/reusable stored procedure design
  • SQL Server thinks you’re changing foreign key values every time
  • You incur the overhead of foreign keys on every single update statement

This is especially problematic for transactions that update several tables in a row, each in their own update statement. As your concurrency needs grow, and you do thousands (or tens or hundreds of thousands) of updates per second, this can lead to blocking problems. At that point, you’re faced with a tough choice: either fix the T-SQL to only update columns that are changing, or…remove the foreign keys. In that case, removing the foreign keys is probably the easier quick fix.

DBAs: I’ll write your comment
to save you some time.

“Brent – this is completely unacceptable. We need foreign keys in the database to keep the bad data out. Those idiotic developers need to fix their T-SQL! Why would you update columns that aren’t changing?!? It’s their problem. Stupid ORM. I told them not to use it, and to craft every query by hand instead. I’m not changing our perfect database to lower its standards. Signed, Determined DBA”

And here’s my response.
Might as well get this over with now.

“Dear DBA: If we drop the foreign keys, our web site can sell more products right now. If we fix the code, it’ll take months, and in the meantime, our customers’ money will go to our competitors. Are you sure this is the hill you want to die on? Signed, The Consultant”


Free Webcast Today: How to Measure Your SQL Server

If you get into a car, how do you know if the car is fast or not? You hold down the gas pedal, and you time how long it takes before you’re breakin’ the law.

Now what about SQL Server: how do you know if yours is fast…or a clunker? Database performance tuners need to know three metrics about their SQL Server: how fast it’s going, how hard it’s working, and how big it is. I’ll explain where to get those numbers, and what normal ranges are. You’ll learn why advice is so different, depending on the kind of server you’re driving.

Register here to watch it live, or to see the recording instantly if you missed it.

I’m getting ready as we speak:

