#TSQL2sday 136 Wrap-Up: Your Favorite (and Least Favorite) Data Types

Development
9 Comments

For this month’s T-SQL Tuesday, I asked you to blog about your most-loved and least-loved data types.

Crazy, right? How could people possibly love or hate data types? Well, if you’ve been working with them for a while, I figured you’d have built up an array of tips or pain points, and y’all delivered with 29 interesting blog posts.

Secretly, when I picked the topic, I hoped we’d end up with a couple of data types where people disagreed so much that it was both a favorite AND a least favorite. Indeed, y’all delivered on that too!

Here’s what you wrote, grouped by data type, each in lower case as Aaron Bertrand intended:

bigint

bit

  • Bit versus other data types by Kevin Chant – “So, as you can see in this slightly biased post the bit data type wins here.” (I laughed pretty hard throughout this one because it was not what I was expecting.)

date

datetimeoffset

  • Datetimeoffset Uniqueness by Rob Farley – “the time zone thing isn’t about whether they are actually identical or not. It’s about whether they are considered the same based on the things that the SQL engine uses to consider them equal.”
  • My two least favorite data types by Aaron Bertrand – “it’s easier to convert from UTC to some other time zone than it is to convert from some time zone that may observe DST to some other time zone that may observe DST.”

float

int

  • 4 Things I Love About Integers by Brent Ozar – “Integers are clean, elegant datatypes that do exactly what they’re supposed to. They’re the Golden Retrievers of the database world: they have a few shortcomings, but so, so few.”

money

smallint

spatial

  • Mapping visualizations and geo-spatial coding by Tom Huguelet – “the fact that we got so much for nothing, and the fact that these data types behave like objects with methods, and the fact that mapping is such an important type of visualization for analytics, make SQL Server Spatial Geography Types my favorite.”
  • Spatial is My Favourite, Yet Most Pointless Datatype by Greg Dodd – “I’d rather store Latitude and Longitude as decimal fields and have an application layer do the hard lifting, where I can send things parallel, where I can do some maths to work out the locations I want and index them if I need to.”

sql_variant

text

timestamp

  • The Datatype Blog by Steve Jones – “It has been deprecated, but the timestamp type is still around. It’s not in the list, but it is mentioned as a synonym for rowversion. This is a unique binary number in each database, which is often used to detect changes in a row.”

uniqueidentifier

varbinary

varchar

xml

They’re all good datatypes, Brent

Photo by Barn Images on Unsplash

My thoughts overall

In my training classes, I often compare our work to carpentry.

When you’re just getting started with carpentry, you only have a tool or two. The longer you work in the field, the bigger the collection of tools you amass, and the better you understand how to use them.

This is a discussion we could have had 20 years ago, or 10 years ago. The answers change, too: just because a tool had a valid purpose once doesn’t mean it still does, because better options arise. Sometimes, new options arise, but…they’re not actually better. (I’m looking at you, JSON.)

We’ll still be having this discussion 20 years into the future. I’m celebrating the 20th anniversary of BrentOzar.com, and I can’t think of a better T-SQL Tuesday topic: I could have published this in 2001, 2011, 2021, 2031, or 2041. We’re still facing the same carpentry projects, but the tools we use and the ways we use them gradually change over time.

If you had fun writing (or reading) about this, and you’ve got an idea for a future T-SQL Tuesday episode that you’d like to host, tweet @way0utwest or leave a comment over here.


How I Worked Toward GDPR Compliance: The Really Long Story

Company News
17 Comments

BrentOzar.com is 20 years old, so I’m spending 20 days reflecting. Yesterday, I wrote about how my online classes are available to Europeans now, and today I’m going to talk about why that was a problem for a while, and how I solved it.

The short story is that I outsourced the whole thing to Teachable.

For the long story, buckle up: there’s over 2,500 words of navel-gazing in here.

Twenty years ago, BrentOzar.com ran on my own code.

Me at Florida Code Camp, 2007
Back in my code-slinging days

Back in 2001 when I started writing it, BrentOzar.com served two purposes: it helped me share stuff I’d written, and it helped me become a better developer. I liked coding, and I wanted to do it in my spare time, too, and since I wasn’t satisfied with the blog platforms circa 2001, I wrote my own.

When you’re facing a choice of build vs buy, like building your own blogging platform or using an existing one, there are a lot of variables involved:

  • The quality of the buy-able options out there (including the free ones)
  • Their cost (which is sometimes licensing, but also the hardware it runs on, and the maintenance costs)
  • The quality of a solution you could build yourself
  • That self-built solution’s cost (because your time is worth money, even if that number is small)

In 2001, when I made the build-vs-buy decision, the available blogging platforms just weren’t that good – the market was too immature – and my personal time cost was really cheap. I wasn’t making a ton of money at my day job as a developer & database guy, so I figured sharpening my development skills would raise my day job skills anyway.

After a few years, that got old,
and I switched to blogging software.

In 2003, I switched to Movable Type, and then in 2006, I switched to WordPress. Blogging had taken off, and the blogging platform market had grown. The quality of what I could get off-the-shelf was way better, so I revisited the build-vs-buy decision.

I decided that if I was going to spend a Saturday sitting in front of a computer, I could either spend that time tinkering with my own blogging platform – OR, I could spend that same time writing blog posts, sharing my knowledge. I decided that the latter was a better use of my time.

Developers make the build-vs-buy decision constantly today, way more than they ever did before, because it presents itself not just with each software platform decision, but with every piece of logic you wanna write. Wanna, say, process XML? You have to decide whether to use a library, call an online service, or write the code yourself – and those choices can come back to haunt you if a service or library gets abandoned, or reinvents itself in a new version that requires tons of code changes. With every framework decision they make, developers are practically business owners making long-term decisions that affect the cost of a project. Get the best ones you can. (Yay, Richie!)

When we started selling online training,
we had the build vs buy decision again.

At the time we made the decision, we thought the online training sales market was mature. There were a lot of learning management systems (LMS’s) out there, with a wide variety of price ranges and features. We hired an LMS consultant to help us navigate the choices, and we settled on a pretty expensive one.

Me circa 2011, by Matt Slocum

It was a horrible mistake.

The software was garbage. It charged some students, yet told them their credit card was denied. It double-charged some students. It let some students in without even charging them at all. We worked with their support teams over and over, and nothing got better. (Years later, I found out that particular vendor was a bit of a hollow showpiece: they had one really big paying client, and they focused all of their development & support on that one client. Everyone else was an afterthought.)

As the angry customer emails kept piling up, we were forced to revisit the decision.

We moved the training sales in-house,
using WordPress and WooCommerce.

We certainly couldn’t afford to build an e-commerce system from scratch, but we could do a bit of a hybrid:

  • WordPress for the content and users
  • WooCommerce for the e-commerce
  • A variety of WordPress plugins to enable training classes
  • Some custom code to glue it all together – some Zapier, some custom WordPress code

This worked really well for years – until 2017, when the EU’s General Data Protection Regulation (GDPR) approached. Back in 2017, it just wasn’t clear how the GDPR would be enforced, but the penalties were terribad: up to €20M or 4% of your company’s annual worldwide revenue, whichever is higher. As a small business owner, I looked at those numbers and went, “Nnnnnope.” Only 5% of my sales revenue came from the EU. Just to continue selling to the EU, I would have to invest much more than that 5% of my revenue – and it simply wasn’t worth the investment at that time.

I knew I wanted to be fully GDPR-compliant someday – I do love the ideas behind the law – so I took the steps that made sense for me, like requiring folks to opt in before emailing them, implementing a privacy policy, and processing GDPR portability & erasure requests.

However, I didn’t see a way for the combination of WordPress and WooCommerce to be fully GDPR-compliant. For example, customer data isn’t encrypted – so if someone gets your WordPress database backups, you’re screwed. Even worse, in the WordPress plugin world, it’s very common for plugin developers to say, “For me to debug the problem you’re having with my code, I need full access to your production servers, or else I need a copy of your database.” I just couldn’t do that and still feel confident about GDPR compliance, not without doing a whole lot more legal prep work with each and every plugin developer.

I also didn’t want to dedicate Richie (my developer) to working on it. BrentOzar.com is PHP & MySQL, and that’s not what I hired him to do. He uses serverless technologies to enable PasteThePlan, SQL ConstantCare®, and another project we’re working on, and he already deals with enough context switching. I’ve got enough work to keep him busy for years, and I just wouldn’t want him working on the joyless tasks involving GDPR compliance. I’d have to hire another developer, and the ROI just wasn’t there.

So rather than change our sales platform, I stopped selling to the EU and EEA. That raised a lot of eyebrows at the time, but I still think it was the right thing to do as a small business owner. I heard a lot of database folks say, “Just become compliant – how hard can it be?” I had to laugh, especially when those same folks had wildly non-compliant WordPress blogs, hahaha.

COVID-19 changed the training landscape.

I was probably thrown out of Iceland for this photo

In March 2020, I was leaving Iceland based on the US State Department’s recommendations for US citizens to get home as quickly as possible before borders locked down completely. I knew the conference & training game was about to change dramatically: everybody was going to be learning at home for the foreseeable future.

To deal with the new reality, I raised my online training game: I switched from GoToWebinar (which did a terrible job of GDPR compliance) to streaming with OBS, invested in cameras & hardware, and set up scenes to make training way more interactive. Those changes were successful, and helped the classes sell really well to the American audience. That was my single biggest goal for 2020: taking care of the home crowd. In March 2020, when I was making those decisions, I had no idea how long the quarantines would last, or when vaccines would be available. I figured I’d just take it one year at a time.

The SQL Server business wasn’t the only one impacted by the quarantines, obviously: everyone has to get their training remotely now. 2020 was the year when we saw massive improvements in all kinds of software and SaaS platforms targeted at remote meetings and training.

So this year was a really good time to revisit the build-vs-buy decision. I wanted a training sales platform with:

  • The ability to host both recorded and live classes
  • GDPR compliance
  • EU & UK VAT payment & filing – because I didn’t want to hassle with the complexities of international tax filing in order to gain just 5% more revenue
  • Powerful & flexible pricing: subscriptions, one-time pricing, and coupons

And I decided that I would be flexible on any other features – just because I’d gotten used to something in my current WordPress/WooCommerce setup didn’t mean it would be a requirement in the new training class sales platform.

After a lot of research and testing, I settled on Teachable.com. I migrated my training classes over there, and I’m going to process all class sales through there from here on out. Customers who have existing subscriptions in my WooCommerce/WordPress store can still continue to access their material exactly the same way, and their renewals will continue to work. New customers will just be signing up on Teachable.

Teachable handles a lot for me.

International sales are a giant pain in the rear because the EU and UK charge value-added tax (VAT) on online sales these days. If you sell any goods, even virtual ones, the governments want their piece – even if you don’t have a business presence in the EU or UK.

In the past, this meant registering with every single government where someone bought something online, or registering for a VAT Mini One Stop Shop (MOSS) where you filed all your EU taxes, plus there’s probably something separate now for the UK due to Brexit, and frankly, in the words of a wise philosopher, ain’t nobody got time for that.

Even worse, even when we went through all that several years ago, we still constantly fielded tax questions from people. “I’m trying to buy from South Africa using a French credit card over a Swiss VPN, and I believe I’m tax exempt because I wear white shirts on Thursdays.” I’m sure those questions are still going to come in, but…now I can just point them over to Teachable support and call it a day.

Teachable figures out where you’re buying from, handles the VAT number exemptions, collects the taxes, and even files the taxes and pays them for you. They just pay you out daily, weekly, or monthly (your choice) – and then you deal with your own corporate or income taxes based on your overall royalties. You still get all the control in terms of your course pricing, coupons, and availability, but Teachable just handles the accounting part. It’s doable for them at their scale – whereas for me as a small business, it totally wasn’t doable in a profitable way.

They charge for it, of course.

When I used WooCommerce & WordPress, we paid fixed costs for our plugins and hosting, plus a credit card processing fee to Stripe for each purchase. (When we sold to the EU/EEA, we also paid variable overhead to accounting folks based on how much tax we were filing.)

Teachable has monthly pricing, plus fees per transaction. I took a screenshot of a Recorded Class Season Pass sale to illustrate how things get a little complex, but before the screenshot, here’s the simple math:

  • Course price: $99.86
  • I got paid: $92.99 (about 93%)
  • Teachable got: $2.00 (2%)
  • Stripe got: $4.87 (5% – it looks higher on here, but that’s because they also got a cut of the tax)

The details, which also include currency conversion and taxes and coupons, oh my:

In the past, I’ve repeatedly said that the EU was about 5% of my revenue, so the question becomes: is it worth paying 7% of everyone’s revenue in order to get 5% additional revenue? Obviously not – but that’s not a fair question. We’ve always paid Stripe’s credit card processing fees, so that 5% is a wash. It’s really just a question of, “Is it worth paying 2% more on everyone’s revenue in order to get an additional 5% of revenue?” But that’s not really fair either: the situation is more complex because I can’t use Jilt for revenue recovery, for example, so revenue might go down, too. It’s hard to make these judgement calls in the fog of war. I don’t think I can boil it down to a simple equation, and the numbers are too close to call.

But if they’re too close to call, then I need to do it, because I want to get my training into the hands of more people.

Like any build-vs-buy decision,
there were compromises.

Some of the compromises were related to the GDPR, and others were related to Teachable’s product.

For example, when someone abandoned their shopping cart, I used Jilt.com to send them a reminder, and then a coupon to entice them to finish the checkout process. Jilt worked amazingly well, paying for itself many times over. However, this just isn’t an option with Teachable.

Similarly, when folks bought a BrentOzar.com training class, I automatically added them to Mailchimp, the service I use for email newsletters. I didn’t actually subscribe them to any newsletters – I just wanted them in the database so that when I sent out an update about their training class, I could email them through Mailchimp. Teachable has the ability to do that – but I just decided not to because for the GDPR, I want to minimize the number of places where customer data is kept.

Another compromise is that Teachable doesn’t really specialize in live training classes. You can embed live streams in the course content, but it’s not like Teachable has event calendars and the ability to buy different specific dates. I wouldn’t have a problem selling seats in individual classes on individual dates, but it would have been painful enough in Teachable that I just skipped it. Plus, the vast, vast majority of my live class attendees just buy a Live Class Season Pass anyway. The folks looking to save money or pick and choose specific classes usually ended up buying the recorded versions, which were cheaper – and they can still do that. I didn’t change the pricing on those. If Teachable ups their game and offers easier individual live ticket sales, I’d embrace that – but for now I’m keeping it simple.

I’m fine with compromises like that. If I wanted the perfect online training sales platform for me, then I’d have to hire a team of developers – and the cost on it would be stratospheric, and I’d end up spending too much of my time managing the developers and the platform – and not enough time delivering value to attendees.

I liked the WooCommerce/WordPress combination more than I like Teachable, but Teachable enables me to get the classes in the hands of more students, with less time taken on my part. COVID-19 changed the equation there: I just don’t see in-person conferences happening anytime soon, and the customer population in the UK, EU, and EEA continues to grow. I had to be able to service those customers.

Those customers being, uh, you, dear reader.

The work was worth it.

After spending weeks researching tools, picking one, migrating my content over to it, building a bunch of marketing, and going live, the numbers are starting to come in.

Before we stopped selling to the EU, the EU/EEA represented about 5% of our training revenue. In last year’s Black Friday sale, we sold $1M USD. 5% of that would have been another $50K. So leading up to selling to the EU/UK, I defined success as $50K of revenue in the March 2021 launch sale.

48 hours into the sale, I’m happy with the numbers. It’s a success, and the work paid off.

Wanna join in? During March you can use coupon code Happy20th to save 75% off my Live Class Season Pass, Recorded Class Season Pass Fundamentals, or Masters recordings, or any of my online classes.

You can buy in Euros, GBP, or US dollars, whichever currency is more convenient for you. When you click the Buy button, you get your choice of currency, and it calculates your VAT and helps you with exemptions as well.

I’m also offering live online classes in Europe-friendly times: 8:00-16:00 UTC. (UTC keeps it simple since I’m living in Iceland at the moment.)

See you in class!


Wow: BrentOzar.com is 20 Years Old.

Company News
18 Comments
My first avatar, by Jin Wicked

In 2001, I was a developer, and I liked to write.

I’d had web sites for a number of years, but in 2001, I decided to switch from wickedlife.com (archive) over to BrentOzar.com. I wanted a domain name that would more appropriately reflect my interests as they changed over time, and I figured that I wouldn’t always be interested in the goth lifestyle, hahaha.

There were a couple of blogging applications available at the time – this was years before WordPress – but being a developer, I wanted to write my own as a learning experiment. Armed with Macromedia Dreamweaver and a few books, I wrote my own blog platform and started blogging.

In the early years, the blog posts ranged from going to Fry’s, using technologies I’ve long since forgotten, building a car computer, wishing Mom & Caryl a happy Mother’s Day, marketing slogans for girlfriends, and, uh, buying pantyhose to make my own aquarium filters. I kept a webcam on 24/7, which led to some amusing photos of me working. In later years, I started writing the personal stuff over at Ozar.me, and I focused BrentOzar.com on technology, specifically Microsoft SQL Server.

A lot has changed in the last 20 years. Hell, a lot has changed in the last one year!

But some things remain the same:

  • We – you and I – are a community, a tribe.
  • We support each other and help each other grow.
  • As we grow, our job roles may change, so we’ll drift out of some tribes over time. That’s totally okay.
  • You get what you give, and there are so many ways to give: write blog posts, give presentations, volunteer with open source projects, or help coordinate events, virtual or in-person. (Okay, maybe that last one has changed a little.)

Over the next 20 days, I’m going to spend some time here reflecting on things we’ve learned over the last 20 years – stuff that’s changed, stuff that hasn’t, moments that I’ve been proud of, and where I think things are heading. Some of it will be technical, some of it about the industry, and some of it will be navel-gazing.

Thank you for being here for this wonderful couple of decades. I really appreciate you. You make this fun for me.


Now Europeans Can Buy My Training Classes Too!

Company News
53 Comments

Good news, party people: we’re finally selling to the European Union, European Economic Area, and the United Kingdom again!

You can also buy my classes now in Euros, GBP, or US dollars, whichever currency is more convenient for you. When you click the Buy button, you get your choice of currency, and it calculates your VAT and helps you with exemptions as well.

So, what changes did I make for GDPR compliance? Read the next post in the series: How I Worked Toward GDPR Compliance: The Really Long Story.


4 Things I Love About Integers #TSQL2sday

Development
4 Comments

I love integers.

Furthermore, I think we’ve all been taking them for granted.

Integers are clean, elegant datatypes that do exactly what they’re supposed to. They’re the Golden Retrievers of the database world: they have a few shortcomings, but so, so few. And so for this month’s T-SQL Tuesday, I wanna share my love for integers with you.

1. Integers have clear contents right from the start.

You can declare an integer variable or column datatype, then set it to a string – and it’ll work great as long as the string can be converted into a number:
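
A minimal sketch of the happy path (the variable name is just for illustration):

    DECLARE @UserId INT;
    SET @UserId = '12345';      /* the string converts cleanly to a whole number */
    SELECT @UserId AS UserId;   /* returns 12345 */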

But if you try to stuff something into an integer that doesn’t belong, you get a clear error right away:
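
And a sketch of the failure case – the error surfaces the moment the statement runs:

    DECLARE @UserId INT;
    SET @UserId = 'banana';
    /* Msg 245: Conversion failed when converting the varchar value 'banana' to data type int. */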

I can guarantee that the only thing in there is gonna be whole numbers, and that I won’t have to play any of the finicky games involved with floating point math or someone trying to store dashes in there to format their social security numbers. If you wanna do formatting on integers, you can – there are functions to help – but it’s usually pretty clear that any formatting should be done on the application side.

2. Integers rarely have implicit conversion issues.

If someone passes in a string, or even a unicode string, and tries to do an equality search of a number field, SQL Server is smart enough to convert the string to an integer first. SQL Server even leverages statistics to understand what the string number will produce in a search!
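
A hedged sketch against the Stack Overflow Users table used throughout this blog – the conversion lands on the parameter, not the column:

    DECLARE @Id NVARCHAR(40) = N'26837';   /* a unicode string, not an int */

    SELECT DisplayName
    FROM dbo.Users
    WHERE Id = @Id;   /* SQL Server converts @Id to int once, so the index seek survives */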

The resulting execution plan is beautiful because SQL Server converted the parameter, not the contents of the table, and accurately estimated the number of rows:

3. Integers are small – and compress down even smaller.

Integers take up just four bytes of space, and if a few billion values isn’t enough for you, their big brother BIGINTs take up just eight bytes. Eight bytes! I see people constantly slapping NVARCHAR(100) all over the place, storing un-normalized descriptions in row after row after row, just flushing space down the drain when they could be linking to a table with the handful of descriptions that they always use.

Need to save space? Row compression is a piece of cake to implement, and even BIGINT columns are just automatically stored in the smallest possible number of bytes. Row compression makes it easy to justify using BIGINTs as identity columns on tables where you’re worried about a large number of rows in the future because they just don’t take any extra space to store.

Got a lot of repeated values, and want even more compression? Enable page compression.
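
A sketch of both options, assuming a dbo.Users table you’re allowed to rebuild:

    /* Row compression: integers are stored in the fewest bytes they actually need */
    ALTER TABLE dbo.Users REBUILD WITH (DATA_COMPRESSION = ROW);

    /* Page compression: adds prefix & dictionary compression for repeated values */
    ALTER INDEX ALL ON dbo.Users REBUILD WITH (DATA_COMPRESSION = PAGE);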

And when it’s time to build fact tables for reporting systems and data warehouses, buckle up: columnstore indexes bring really bananas levels of compression, like 70% or higher with repeated integers like order line item quantities, prices, or part numbers.

4. Integers even work in URLs.

If you’ve surfed Stack Overflow, you might have noticed how the URLs have integers in them, like this:
https://stackoverflow.com/questions/7395915/does-inserting-data-into-sql-server-lock-the-whole-table/

Well, you know what’s really cool? Lop off everything after the numbers, like this:
https://stackoverflow.com/questions/7395915

And go to that URL:

Stack Overflow automatically pulls up the full URL. Wanna know how they do it? Everything after the number is extraneous, because the number corresponds to the actual ID they’re storing in the Posts table! You can see the database version of the question you’re looking at by running a query:
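
A minimal sketch against the Stack Overflow database – the number in the URL is simply the Id in dbo.Posts:

    SELECT *
    FROM dbo.Posts
    WHERE Id = 7395915;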

And presto:

Try doing that with your fancypants strings, bucko, and welcome to URL encoding hell.


Download the Current Stack Overflow Database for Free (2021-02)

Stack Overflow
12 Comments

Stack Overflow, the place where most of your production code comes from, publicly exports their data every couple/few months. @TarynPivots (their DBA) tweets about it, and then I pull some levers and import the XML data dump into SQL Server format.

Stack Overflow’s database makes for great blog post examples because it’s real-world data: real data distributions, lots of different data types, easy to understand tables, simple joins. Some of the tables include:

  • Users – now up over 14 million rows
  • Posts – over 52 million rows’ worth of questions & answers, 143GB in just the clustered index alone
  • Votes – over 208 million rows, making for fun calculations and grouping demos

This isn’t the exact same data structure as Stack Overflow’s current database – they’ve changed their own database over the years, but they still provide the data dump in the same style as the original site’s database, so your demo queries still work over time. If you’d like to find demo queries or find inspiration on queries to write, check out Data.StackExchange.com, a public query repository.

I distribute the database over BitTorrent because it’s so large. To get it, open the torrent file or magnet URL in your preferred BitTorrent client, and the 53GB download will start. After that finishes, you can extract it with 7Zip to get the SQL Server 2016 database. It’s 4 data files and 1 log file, adding up to a ~411GB database.

Want a smaller version to play around with?

  • Small: 10GB database as of 2010: 1GB direct download, or torrent or magnet. Expands to a ~10GB database called StackOverflow2010 with data from the years 2008 to 2010. If all you need is a quick, easy, friendly database for demos, and to follow along with code samples here on the blog, this is all you probably need.
  • Medium: 50GB database as of 2013: 10GB direct download, or torrent or magnet. Expands to a ~50GB database called StackOverflow2013 with data from 2008 to 2013. I use this in my Fundamentals classes because it’s big enough that slow queries will actually be kinda slow.
  • For my training classes: specialized copy as of 2018/06: 47GB torrent (magnet.) Expands to a ~180GB SQL Server 2016 database with queries and indexes specific to my training classes. Because it’s so large, I only distribute it with BitTorrent, not direct download links.

If you only have a limited amount of bandwidth, you don’t have to keep seeding the database after you get it – I’ve got it hosted on a handful of seedboxes around the world.

As with the original data dump, these are provided under cc-by-sa 4.0 license. That means you are free to share it and adapt it for any purpose, even commercially, but you must attribute it to the original authors (not me):


Happy demoing!


It’s March. We’re still quarantined. Let’s do some free classes for Europeans.

Company News
26 Comments

Last year, during the first few months of quarantines, I ran free training classes to help y’all pass the time and raise your skills.

Thankfully, I don’t need to wear this during the online classes.

It’s March 2021, and so many of us around the world are still stuck indoors, not going into work, not seeing our friends, dealing with mask breath.

So let’s do another round of free live classes – this time around, in Europe-friendly times, 8:00-16:00 UTC.

No registration required – just do your prerequisites, then show up at www.BrentOzar.com/training/live/ a few minutes before the class starts and join in on the fun.

The classes will be recorded, but the Instant Replay recordings will only be available to folks who bought the classes or my Recorded Class Season Pass.

You might be asking, “Brent, how can you keep taunting Europeans about things they can’t buy? Why bother doing classes in European times if the Europeans can’t get the recordings?”

And I’ve got some very good news about that. Stay tuned!


#TSQL2sday 136 Invitation: Blog About Your Favorite Data Type (Or Least Favorite)

T-SQL
36 Comments

T-SQL Tuesday is a monthly blogothon where we get together and write about a different topic. I’m hosting this month’s episode.

Your mission: write a blog post about your favorite data type, and schedule it for next Tuesday, March 9.

Mad about money? Feverish about float? Tell us what you use your favorite data type for, and what people need to know before they get started using it. Data types have so many interesting edge cases, like when Aaron Bertrand taught us to use lower case for our data types. (I still don’t do that, but for the record, I do feel guilty.)

The rules:

  • Your blog post has to go live on Tuesday, March 9
  • Include the T-SQL Tuesday logo, and if somebody clicks on the logo, it has to link back to this post
  • Include a link to this blog post, or after you publish it, leave a comment here so I can track you down for the roundup blog post next week
  • If you want to host a future T-SQL Tuesday episode, tweet @way0utwest or leave a comment on this post

Don’t worry about somebody else picking the same data type – this is the beauty of blogging. Readers are here for your unique take on things, in your voice. And don’t worry about repeating things that are in the documentation – nobody reads that. You could seriously copy/paste the contents of the documentation and you’d get comments saying, “Whoa, I never knew that.”


Free Webcast: Planning a SQL Server to Azure Migration in 2021

Microsoft Azure
9 Comments

You manage database servers for your company, and they’re thinking about migrating everything to Microsoft Azure. You’ve searched the web for information and advice, but everything you’re finding is outdated.

You just want a list of things to review on your servers, an overview of what won’t work in the cloud, and how to pick alternatives or workarounds.

In this one-hour session on Thursday, March 18, we’ll cover pure technical details, zero marketing, and get you up to speed quickly with none of the fluff.

Register now for the free webcast. If you can’t be there live, no worries – links to the recordings will be sent to the registrants. And if you’ve got questions you’d like to see me touch on during the webcast, you can get a jump on the Q&A by asking ’em below in the comments.


New SQL ConstantCare® Feature: One Serious Query to Tune

SQL ConstantCare

Our monitoring tool, SQL ConstantCare®, is different.

Some monitoring tools send you a whole bunch of emails for every little thing that’s happening on every one of your servers – and you end up just creating an Outlook rule to dump them all into a folder that you never read. PAGE LIFE EXPECTANCY IS LOW! DISK QUEUE LENGTH IS HIGH! It all just turns into noise.

SQL ConstantCare® is minimal: just one email per day. We tell you when there’s something seriously wrong that demands attention.

So in the interest of minimalism, we’ve got a new feature: One Serious Query to Tune. When >50% of your SQL Server’s workload is caused by just one query, we tell you about it. We know you’re too busy to tune every query or index that could help performance, but in cases like this, we call out that there’s a serious issue that will probably be worth your time to fix.

You might be thinking, “Brent, how often does that really happen?”

Well, last week alone, it happened for 328 of our users!

Sign up now and get a free 14-day trial. You can cancel anytime without charge before your trial ends. What will we discover together about your servers?


Updated First Responder Kit and Consultant Toolkit for February 2021

First Responder Kit Updates
0 Comments

Along with the usual bug fixes & improvements, this month’s release includes a brand new sp_BlitzAnalysis script from Adrian Buckman. It’s for folks who have an Agent job to log sp_BlitzFirst to tables every 15 minutes, saving wait stats, Perfmon counters, file stats, etc into tables. sp_BlitzAnalysis reads those tables and sums up activity in your chosen time period, like this:
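
The prerequisite, if you’re not doing it already, is that Agent job step logging sp_BlitzFirst to tables – here’s a hedged sketch of one (parameter names come from sp_BlitzFirst’s documentation; the database and table names are just placeholders):

    EXEC dbo.sp_BlitzFirst
        @OutputDatabaseName = N'DBAtools',
        @OutputSchemaName = N'dbo',
        @OutputTableName = N'BlitzFirst',
        @OutputTableNameFileStats = N'BlitzFirst_FileStats',
        @OutputTableNamePerfmonStats = N'BlitzFirst_PerfmonStats',
        @OutputTableNameWaitStats = N'BlitzFirst_WaitStats';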

To install it, run the sp_BlitzAnalysis.sql file included in the First Responder Kit. It’s not a complex installation or anything – I just didn’t want to include it in the main installer scripts because this is the first public release, and I haven’t tested it myself yet either. (Been a little busy with the whole Iceland move.) To learn how to use sp_BlitzAnalysis, read the documentation. For questions or support on it, hop into the #FirstResponderKit Slack channel. (If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.)

When you find a bug or want something changed, read the contributing.md file.

How I Use the First Responder Kit
Watch and learn

To get the new version:

Consultant Toolkit Changes

I updated it to this month’s First Responder Kit, but no changes to querymanifest.json or the spreadsheet. If you’ve customized those, no changes are necessary this month: just copy your spreadsheet and querymanifest.json into the new release’s folder.

sp_Blitz Changes

  • Fix: the check for the most recent CHECKDB dates was running even when @CheckUserDatabaseObjects = 0. This was the only instance where we were using sp_MSforeachdb with @CheckUserDatabaseObjects = 0, so it’d slow sp_Blitz down on instances with hundreds or thousands of databases. (#2779)
  • Fix: put comments around the term xp_cmdshell to fool a firewall. (#2788, thanks TLovie.)
  • Fix: changed typo of BrenOzar.com to BrentOzar.com on one check. (#2786, thanks Michael Gowan.)

sp_BlitzAnalysis Changes

  • All new! This is the first release that includes sp_BlitzAnalysis by Adrian Buckman. If you want to install this script, install sp_BlitzAnalysis.sql – it’s not included in the normal installation scripts yet just to limit the blast radius in case something goes wrong.
  • For tips on how to use it, check out the documentation.
  • For questions or problems, see the “For Support” section of this doc.

sp_BlitzCache Changes

  • Improvement: default output now includes PlanGenerationNum from sys.dm_exec_query_stats. Helps troubleshoot scenarios where the plan is frequently recompiling due to statistics changes. (#2792, thanks Tom Lovie.)
  • Fix: a recent release broke @SortOrder = ‘recent compilations’, and that’s fixed. (#2772)

sp_BlitzFirst Changes

  • Improvement: if more than 20% of the queries are runnable, waiting on CPU power to make progress, we warn you as shown above. sp_BlitzFirst makes two passes through to check – once at the beginning of the sample defined with @Seconds (defaults to 5), and again at the end. (#2776, thanks Erik Darling.)
  • Fix: if you passed in @OutputTableRetentionDays because you wanted to store more or less history in the output tables, that change wasn’t being passed to sp_BlitzWho, which was still just storing 3 days. (#2758, thanks Emanuele Mezzo.)
  • Fix: the high compilations/sec and recompilations/sec warnings had way too high of thresholds. They were looking for 1,000 batch requests per sec, but you can see high CPU usage due to compilations with even just 10 big ugly plans per second getting generated. Lowered the threshold to either 10 compilations/sec, or more compilations than batch requests per second – which happens when a proc has multiple statements with recompile hints in them. (#2770)

sp_BlitzIndex Changes

  • Improvement: duplicate indexes are now sorted by row count descending so that the big bang-for-the-buck stuff comes first. (#2762, thanks Todd Chittenden.)
  • Improvement: Aggressive Indexes warnings are now sorted by total lock wait time descending. (#2768)
  • Fix: a few releases back, I added support for SQL Server 2019’s new missing index DMV that tells you which queries are triggering the request. I had some challenges getting it to work reliably, and Microsoft just now documented it so I’m removing support for it temporarily. It’ll be back. If you want to help with coding for it, check out #2185. (#2780)

sp_AllNightLog Changes

  • Fix: the RPO and RTO now honor overrides set in the config table. (#2775, thanks Alin Selicean.)

sp_ineachdb Changes

  • Improvement: Azure SQL DB compatibility in the sense that it’ll run on Azure SQL DB, but it can’t actually change databases since Azure SQL DB doesn’t allow that. I’m referring to Azure SQL DB “Classic Flavor” here, not Managed Instances. (#2790)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


How to Trace a Nested Stored Procedure Without Using Profiler

Extended Events
5 Comments

You’ve got a stored procedure that calls other stored procedures that build dynamic SQL that call other functions, and there’s so much calling going on that you would like to phone a friend.

Your query plans look like piles of spaghetti, and when you zoom in, you just see more tiny piles of tiny spaghetti.

Finding the slow parts is really easy thanks to sp_HumanEvents by Erik Darling (Blog | @ErikDarlingData). Download it, install it in your master database, and you’ll be ready to go.

In one window, get ready to run the nasty stored procedure. Make a note of which session you’re in – it shows up at the bottom of your SSMS window like this:

See the “SQL2019\Brent (60)”? That number in parentheses (60) is my session ID.
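
If you’d rather not squint at the status bar, a one-liner in that same window works too:

    SELECT @@SPID AS MySessionId;   /* the session ID of the current window */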

In another window, fire up sp_HumanEvents:
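
A hedged sketch of the call – parameter names are taken from sp_HumanEvents’ documentation, so double-check them against the version you download – targeting session 60 for a 30-second sample:

    EXEC master.dbo.sp_HumanEvents
        @event_type     = N'query',
        @session_id     = 60,
        @seconds_sample = 30;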

The parameter names & values are pretty self-explanatory there. After 30 seconds, you’ll get a nice rundown of what queries ran, how long they took, their execution plans, and much more.

You could do this same thing with Profiler, but it’s more painful to set up, has more overhead, and doesn’t give you the nice, clean, filtered result set. You could also do this same thing by rolling your own Extended Events session, but seriously, who’s got time for that? Erik’s done all the hard work for you.

If you like this, you’ll love how easy it is to find queries using option recompile.


Changing Statistics Cause Longer Compilation Times

Execution Plans, Statistics
19 Comments

I need to be up front with you, dear reader, and tell you that you’re probably never going to need to know this.

I try to blog about stuff people need to know to get their job done – things that will be genuinely useful in your day-to-day performance tuning and management of SQL Server. This, however, is not one of those things. This is fun, obscure trivia, documenting a situation you’ll probably never run into, prompted by Forrest’s post proving that compilation time includes more than parse time.

Start with a hard-to-compile query plan.

I’ll use any Stack Overflow database and the hard-to-compile query plan from my post, Bad Idea Jeans: Building Big Query Plans, modified to join to the Users table:
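
A scaled-down stand-in with the same shape (the real Bad Idea Jeans query nests far deeper): an empty table variable self-joined through stacked CTEs, then joined to Users, so compilation dwarfs execution:

    SET STATISTICS TIME ON;

    DECLARE @t TABLE (Id INT PRIMARY KEY);

    WITH c1 AS (SELECT a.Id FROM @t a JOIN @t b ON a.Id = b.Id),
         c2 AS (SELECT a.Id FROM c1 a JOIN c1 b ON a.Id = b.Id),
         c3 AS (SELECT a.Id FROM c2 a JOIN c2 b ON a.Id = b.Id),
         c4 AS (SELECT a.Id FROM c3 a JOIN c3 b ON a.Id = b.Id),
         c5 AS (SELECT a.Id FROM c4 a JOIN c4 b ON a.Id = b.Id)
    SELECT c5.Id
    FROM c5
    JOIN dbo.Users u ON u.Id = c5.Id;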

Run the query, and check the Messages tab:

That complex query plan is hard to build, so it took some time:

The table variable doesn’t have any rows in it, though, so execution time is nearly instantaneous. The hard part here is clearly building the query plan.

But it gets much worse.
Try updating stats while it runs.

In one window, free the plan cache and start rebuilding the indexes on the Users table with online = on so the CTE query can execute:
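
A sketch of that setup (ONLINE = ON needs Enterprise or Developer Edition):

    DBCC FREEPROCCACHE;   /* dev/test only - clears every cached plan */
    GO
    ALTER INDEX ALL ON dbo.Users REBUILD WITH (ONLINE = ON);
    GO 10                 /* keep rebuilding so the stats keep changing mid-compile */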

Then in another window, run the CTE query again. Make a pot of coffee or hit the loo, because you’re gonna have some time. Several minutes later, both queries finish, and the output of the CTE’s statistics is pretty bizarre:

What’s happening? Every time the Users rebuild finishes, it’s also updating stats. The query compilation sees those updated stats, and…decides that the plan it just got done building is invalid, so it’d better go try to build another plan! Thus, the repeated messages about parse & compile time.

Yes, I actually hit this problem.

It sounds crazy, but to recreate a client issue, I was loading a 10 billion row table, which takes a while. As it was loading, its statistics were constantly changing, because as it grew, it was constantly hitting the automatic stats updates threshold.

As a result, simple queries with no where clause or order by could finish instantly, but add in even the simplest where clause, and the query would never finish. It was constantly recompiling as more data was loaded.

Shout out to Joe Obbish and Paul White who helped me troubleshoot this problem last year – all credit for solving the mystery goes to them:


What’s New & Undocumented in SQL Server 2019 Cumulative Update 9

SQL Server 2019
26 Comments

Last week’s SQL Server 2019 Cumulative Update 9 snuck in some undocumented things.

We got two new DMVs for Always On Availability Groups, sys.dm_hadr_cached_replica_states and sys.dm_hadr_cached_database_replica_states. They have a subset of data from other AG DMVs:
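
They’re queryable like any other DMV – a quick peek on a CU9-or-later instance:

    SELECT * FROM sys.dm_hadr_cached_replica_states;
    SELECT * FROM sys.dm_hadr_cached_database_replica_states;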

I sure wish I could point you to the documentation or KB article on those.

But we have no idea what changed in CU9 here, or why, or what to do about it.

I’m gonna be honest with you, dear reader: I’m starting to tell clients to choose SQL Server 2017 over SQL Server 2019. 2019 just doesn’t seem like it’s fully baked, and 2019’s Cumulative Updates have just become too large of a risk: the release notes aren’t honest about what you’re deploying. That’s not fair to organizations that depend on SQL Server for mission-critical deployments: they need to know what changed so they can be on the lookout for new, unusual behaviors.

If your own developers were changing production without telling you how or why, you’d call them into the office, close the door, and have a conversation.

In lighter news, there are also a few new messages in sys.messages:

  • 898: Buffer Pool scan took %I64d seconds: database ID %d, command ‘%ls’, operation ‘%ls’, scanned buffers %I64d, total iterated buffers %I64d, wait time %I64d ms. See ‘https://go.microsoft.com/fwlink/?linkid=2132602‘ for more information.
  • 7877: Backup on secondary replica is not supported in the connection to contained availability group. Reconnect to SQL Instance and retry the operation.
  • 8659: Cannot continue columnstore index build because it requires %I64d KB misc memory per thread, while the total memory grant for query is %I64d KB, total Columnstore bulk insert memory is limited to %I64d KB in query, and the maximum memory grant is limited to %I64d KB per query in workload group id=%ld and resource pool id=%ld. Retry after modifying columnstore index to contain fewer columns, or after increasing the maximum memory grant limit with Resource Governor.
  • 35528: unable to read page due to invalid FCB
  • 46933: PolyBase configuration is invalid or corrupt. Re-run PolyBase setup.

Query Store is the Flight Data Recorder. It’s Obvious What We Need Next.

Humor
31 Comments

When Microsoft brought out Query Store, they wrote:

Similar to an airplane’s flight data recorder, Query Store collects and presents detailed historic information about all queries, greatly simplifying performance forensics by reducing the time to diagnose and resolve issues.

I have a straight face under here, for real

It certainly works, making it easier for you to understand when a query’s execution plan suddenly goes downhill, taking your server with it.

But it isn’t enough.

In airplanes, it’s not enough to know that the throttles were powered down and the altitude plummeted. It’s not enough to know what happened: we need to know why it happened. What problem did the team think they were facing? Was there a misunderstanding in the cockpit about a metric? A disagreement between team members about how to solve the problem?

That’s why we need Voice Store.

Just as a cockpit voice recorder helps investigators better understand why decisions were made, SQL Server Management Studio needs a Voice Store to capture what the team was trying to do in the moments leading up to the crash.

To help speed up accident investigations, we simply can’t afford delays. That’s why it’s also important that Voice Store be connected directly to YouTube, automatically uploading recordings to help the community get to the root cause analysis.


What’s a Size-of-Data Operation?

Database professionals say, “That’s slow because it’s a size-of-data operation.”

A size-of-data operation is when the database has to write all of the affected data again. What you’re asking for might seem small, but for the database, it causes a lot of work.

For example, let’s say you live in a strict household, and your parents have told us that under this roof, everyone’s hair can only be black, blond, brown, or red. (Don’t blame me. I don’t make the rules. They’re your parents, not mine. Go talk to them. And tell your mom I said hi.)

If your parents suddenly loosen up and say, “We’re now allowing you to dye your hair pink,” that doesn’t mean we have to touch everyone’s head. It’s a metadata operation: we’re just making a new note so that when someone walks in the door and we check their hair color, we have a new acceptable option.

However, if your parents freak out and say, “Everyone’s hair must be pink,” that means we’re gonna need to dye every person’s hair. We’re gonna have to gather everybody together and do some real work.

That’s a size-of-data operation: however many people live in the house, we’re gonna need to work on all of ’em.

How that translates into databases

I’ll use the Users table in a large Stack Overflow database to explain.

If I alter a table and add a new nullable column, it happens instantly, and SQL Server doesn’t have to touch any pages:
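
A sketch against the Stack Overflow Users table (HairColor is a made-up column for this example):

    ALTER TABLE dbo.Users
        ADD HairColor VARCHAR(20) NULL;   /* metadata-only: finishes instantly */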

Adding a column like that is just a metadata change. It tells SQL Server, “When new rows walk into the house, they’re allowed to specify their HairColor now. When rows leave the house, if they don’t already have a HairColor set, just say it’s NULL.”

However, if I populate that column and dye everyone’s hair pink, that’s gonna take some time, because we’re gonna have to touch every row:
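
Continuing the sketch:

    UPDATE dbo.Users
        SET HairColor = 'Pink';   /* touches every row - a size-of-data operation */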

That’s a size-of-data operation, so it’s slower.

How that affects development & deployment

When you’re just getting started developing an app, and all your tables are empty, you don’t really have to worry about this kind of thing – because size-of-data operations are fast when there’s, uh, no data involved.

But the larger your data becomes, and the slower your storage becomes (like if you move from nice on-premises flash storage up to the cloud), the more you have to pay attention to this. You’ll want to run your deployment scripts on a real production-sized database, not just an empty development shell, and time the deployment scripts. The longer an operation takes, the more likely it is a size-of-data operation. On larger databases – at 100GB and higher – and at 24/7 uptime shops like online stores, you’ll likely have to tune your deployment scripts to avoid size-of-data operations.

I haven’t seen a good rundown of all the operations that are size-of-data operations for Microsoft SQL Server, but it’s changed over the years. Hey, if you’ve got a blog – there’s a potential idea for you! Not me though. The hot tub is calling.


I Sat Down for an Interview with Forrest Brazeal.

Cloud Computing
6 Comments

Here’s how the interview started:

Forrest Brazeal: Brent, you’re a performance specialist who’s worked with massive SQL Server deployments; how does it make you feel when you hear cloud architects saying things like “relational databases don’t scale?” What do you wish they knew?

Eyes up here, kid
“Time for a smoke break.”

Brent Ozar: They’re right. Nothing scales worth a damn. Even stuff that looks easy, like Amazon S3 file storage, can run into scaling problems if you design it without an analysis of how AWS partitions reads, and how you have to watch out for your object prefixes.

But the thing is, nobody reads the documentation.

Read the interview here. I had a lot of fun with it.


Partitioned Tables Cause Longer Plan Compilation Times.

Partitioning
10 Comments

Folks sometimes ask me, “When a table has more indexes, and SQL Server has more decisions to make, does that slow down execution plan generation?”

Well, maybe, but the table design choice that really screws you on compilation time is partitioning. If you choose to partition your tables, even tiny simple queries can cause dramatically higher CPU times. Even worse, as the famous philosopher once said, “Mo partitions, mo problems.”

We’ll start with any Stack Overflow database, create a numbers table, and then dynamically build a partition function that partitions our data by day:

Then we’ll build a partitioned copy of the Users table, partitioning it by CreationDate:
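
A simplified sketch of the idea – the post builds the boundary list dynamically, one value per day, so only a few boundaries are shown here, and the object names are illustrative:

    CREATE PARTITION FUNCTION pf_Daily (DATETIME)
        AS RANGE RIGHT FOR VALUES ('2008-07-31', '2008-08-01', '2008-08-02');   /* ...one per day */

    CREATE PARTITION SCHEME ps_Daily
        AS PARTITION pf_Daily ALL TO ([PRIMARY]);

    SELECT *
    INTO dbo.Users_partitioned
    FROM dbo.Users;

    CREATE CLUSTERED INDEX CIX_Users_partitioned
        ON dbo.Users_partitioned (CreationDate)
        ON ps_Daily (CreationDate);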

Let’s compare partitioned and non-partitioned performance.

I’ll create a non-partitioned index on both the Users and Users_partitioned tables. Note that even if I don’t specify partitioning on the Users_partitioned index, it still ends up partitioned by default, which is kinda awesome:
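
A sketch – the same definition on both tables, with an illustrative index name and key column:

    CREATE INDEX IX_DisplayName ON dbo.Users (DisplayName);
    CREATE INDEX IX_DisplayName ON dbo.Users_partitioned (DisplayName);   /* ends up partition-aligned */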

Now we’ll run an exceedingly simple query that only returns one row:
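
A stand-in for that query – the exact predicate in the post may differ; the point is a simple, selective filter on the indexed column:

    SELECT Id, DisplayName, CreationDate
    FROM dbo.Users
    WHERE DisplayName = N'Brent Ozar';

    SELECT Id, DisplayName, CreationDate
    FROM dbo.Users_partitioned
    WHERE DisplayName = N'Brent Ozar';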

At first glance, the plans look the same – but notice how the non-partitioned table is 0% of the cost, and the partitioned table is 100% of the cost:

That’s because the non-partitioned estimated cost was way less than 0.001, and the partitioned query’s estimated cost is over 15. What’s worse, the compilation time, execution time, and logical reads are totally different – the top one is the non-partitioned query, and the bottom is the partitioned one, both cleaned up for readability:

Hubba hubba. Just compiling the partitioned table’s plan took 27ms of CPU time. I know what you’re thinking: “Brent, who cares about 27ms of CPU time?” Well, remember, this is an overly simple query! In real-life examples, it’s not unusual to see 250+ milliseconds of CPU time spent just compiling the plan – which means you can only compile 4 queries per second, per CPU core. That’s when plan cache pollution due to unparameterized queries really screws you.

I’m writing this particular post because I had a client who chose to partition every table in their data warehouse, regardless of size, and used exactly the same partition granularity on all sizes of tables to keep their dynamic ETL queries simple. Unfortunately, even querying simple configuration tables was taking 250+ milliseconds of CPU time just to compile a plan.

Fine – but what about non-partitioned indexes?

I’ll drop that partitioned index and create a non-partitioned index on the Users_partitioned table – note that I have to specify ON PRIMARY as the filegroup for the partitioned table, or else any nonclustered index will by default automatically be partitioned as well – then try the queries again:
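
A sketch, reusing the illustrative index name from above:

    DROP INDEX IX_DisplayName ON dbo.Users_partitioned;

    CREATE INDEX IX_DisplayName
        ON dbo.Users_partitioned (DisplayName)
        ON [PRIMARY];   /* forces a non-aligned, non-partitioned index */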

Now we’ll run an exceedingly simple query that only returns one row:

Now the estimated costs are neck and neck:

But that’s just an estimate – the only thing neck and neck here are the logical reads:

Designing and executing a query plan takes longer when any of the objects involved are partitioned.

This overhead isn’t large as long as you’re comparing it to the overhead of big data warehouse queries where partitioning data can reduce reads. But if you’re comparing it to smaller objects – say, 100GB or smaller rowstore indexes – that are queried more frequently, with more varying queries that require building new execution plans – then the overhead of partitioning starts to add up. Mo partitioned objects, with mo partitions in each object, mo problems.

Partitioning is one of those features where I never hear people say, “Wow, every which way I turn, partitioning has just made performance better!” Rather, it’s a feature where people keep saying, “I had no idea partitioning was going to cause me problems over THERE, too.”

In my Mastering Index Tuning class, we spend just enough time on partitioning to explain where it’s a good fit – but much more time on indexing techniques that are more appropriate for 99.9% of workloads out there.


Unusual Parameter Sniffing: Big Problems with Small Data

Parameter Sniffing
22 Comments

Normally when we think about parameter sniffing, we think about a tiny data vs big data problem: the tiny data’s plan goes in first, and when we try to process big data, performance is terrible.

But sometimes, it’s the exact opposite.

I’ll start with the large Stack Overflow database (I’m using the 2018-06 version, but any large one will do) and write a stored procedure to find the most recent posts authored by folks in a specific location:
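
A hedged sketch of that kind of proc – the post’s exact code may differ; the TOP 200 and the CreationDate sort come from the discussion below:

    CREATE OR ALTER PROCEDURE dbo.usp_RecentPostsByLocation
        @Location NVARCHAR(100)
    AS
    BEGIN
        SELECT TOP 200 p.Id, p.Title, p.CreationDate
        FROM dbo.Users u
        JOIN dbo.Posts p ON p.OwnerUserId = u.Id
        WHERE u.Location = @Location
        ORDER BY p.CreationDate DESC;
    END;
    GO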

To help SQL Server out – I’m a giver like that – I’ll create a few indexes to give SQL Server some choices:
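
A sketch of the supporting indexes implied by the plans described below:

    CREATE INDEX IX_Location     ON dbo.Users (Location);
    CREATE INDEX IX_OwnerUserId  ON dbo.Posts (OwnerUserId);
    CREATE INDEX IX_CreationDate ON dbo.Posts (CreationDate);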

Call it for tiny data first…

When we call the proc for a less-populated location first, SQL Server chooses to look up the people who live in Iceland (because there aren’t many), and then go find their posts. Small data finishes in under 1 second:
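
Sketch of the first call, using the hypothetical proc name from the earlier sketch:

    EXEC dbo.usp_RecentPostsByLocation @Location = N'Iceland';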

If we now run it for a larger location, like Germany, the query takes 24 seconds to run because we find so many people in Germany, and they’ve posted so many answers. (I’m sure they’re not asking a lot of questions. Germans are really smart.)

And the sort spills to disk because we only granted Iceland-size memory.

This is the typical parameter sniffing problem that people blog about: put the tiny data parameters in memory first, and SQL Server just isn’t equipped to deal with big data.

But if the big data plan goes in first…

Do things perform better? Let’s free the plan cache, then start with Germany:
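
A sketch, again with the hypothetical proc name:

    DBCC FREEPROCCACHE;   /* dev/test only */
    EXEC dbo.usp_RecentPostsByLocation @Location = N'Germany';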

This time around, SQL Server says, “Ah, Germany’s really common. You’re asking for these posts to be sorted by CreationDate – as a reminder, here’s your query and your indexes:”

“So since you’re asking for the top 200 by CreationDate descending, I’ll just scan the CreationDate index from newest to oldest. For each post I find, I’ll go look up the Users table to see where that user’s from. I bet I won’t have to look up too many before I’ll find 200 of ’em that were posted by Germans.”

And he’s right: he only had to look up about 20,000 of them before he found 200 that were written by Germans. Very cool.

But…when we run this for Iceland…

SQL Server has to read over a million Posts, doing over a million key lookups on Posts and over a million checks against the Users table, before it finds 200 posts that were written by Icelanders. Ten seconds isn’t so bad, but if you pass in an even rarer location, like the charming and easy-to-pronounce Hafnarfjordur, Iceland:
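Same sketch proc, rarer parameter – the exact spelling of the location string in the Stack Overflow dump is a guess on my part:

```sql
EXEC dbo.usp_PostsByLocation @Location = N'Hafnarfjordur, Iceland';
```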

SQL Server again scans the index on Posts, doing a key lookup on Posts for every single row that it finds, and I bet that sounds goofy. Oh it gets worse: for all 40,700,647 Posts, it does a clustered index seek against the Users table. We do over 124 MILLION logical reads on a table that only has 143K pages in it. We read the Users table 871 times over, and the Posts table 6 times over due to all the key lookups:

That sure is a lot of work – would be nice if SQL Server parallelized that query across multiple threads. No can do, though – the query plan was designed with Germany in mind, when SQL Server thought it’d find rows quickly. The whole time this query runs, it’s just hammering one of my CPU cores while the rest sit idle:

There is no one good plan here.

I always laugh when I see demos that talk about “the good plan” and “the bad plan.” I wish the real world was so simple, so black-and-white. In reality, queries have many possible plans, and many of ’em just don’t work for other sets of parameters.

SQL Server 2017’s “Automatic Tuning” does nothing here, either: it can try throwing in a different plan when it thinks there’s been a regression, but it’s not smart enough to pick different execution plans for different parameters. It picks one plan and tries to make it work for all of ’em – and that simply doesn’t work here.
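For reference, this is the feature in question – automatic plan correction, which requires Query Store to be enabled. Turning it on is easy; it just doesn’t solve this particular problem:

```sql
/* Enables automatic plan correction (SQL Server 2017+, Query Store required).
   Shown only for reference - as described above, it won't fix this scenario. */
ALTER DATABASE CURRENT
    SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);
```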

It’s up to us to fix it, folks.

That’s why I teach my Fundamentals of Parameter Sniffing and Mastering Parameter Sniffing classes. Folks with a Live Class Season Pass can start watching the Instant Replays whenever they want, or drop in live. I’m teaching a bunch of classes over the next few weeks:

See you in class!


My Live Streaming Setup for 2021: Iceland Edition

Home Office, Streaming
10 Comments

Erika and I moved to Iceland for a while. For those of you who don’t follow my personal blog, here’s why we moved, and how we got here with a teleworker visa. You can spy on our adventures via my Instagram, Facebook, or Twitter.

We’re working half the time while we live here, and sightseeing the other half. It’s easy to telecommute from Iceland because it’s one of the most connected countries in the world. Most houses have symmetric gigabit fiber available – even though we’re extremely isolated, we still have great bandwidth:

This isn’t a permanent move – probably just 6-12 months – and we’re doing some traveling through the country, so I didn’t want to pack my full home office setup with my Mac Pro and big monitors. I downsized my streaming setup, and I figured some of y’all might find it interesting. It’s closer to what most folks would use for remote presenting.

Laptop: MacBook Pro 16″ – the fastest Mac laptop available today, a 2.4GHz 8-core Intel Core i9, 64GB RAM, but…it’s a long, long way from the 16-core Mac Pro I usually use. When I live stream, I really put the CPU fans to work. I was tempted to try one of the new MacBooks with the Apple Silicon M1 processor, but I’m holding out for the higher-horsepower versions that will surely follow. I brought my Keychron K8 keyboard and an Apple Magic Trackpad at the last minute at Erika’s insistence – I was trying to pack lighter, heh.

External monitor on right: 15″ generic 1080p. Normally for monitors, you want high resolution, but when you’re streaming, you actually want a plain 1920×1080 because that’s the max size you’d stream out to your audience. The monitor has inputs for USB-C, HDMI, Mini DisplayPort, etc., but the nice thing is that it can use a single USB-C cable for both power and the display signal.

When I’m streaming, the MacBook Pro’s high resolution display has Slack (for attendee chat & questions) and OBS’s control panel and preview. The 15″ 1080p monitor is what’s showing to the audience, so I’ve either got SSMS on there, or a PowerPoint.

I usually teach while standing up, but I’m trying the dining room table to see if I can make that work. If it turns out I just absolutely gotta stand, we’ve got a high bar table picked out at a local furniture store – if necessary, we’ll grab that and leave it in the rental house when we leave.

Cameras: two Sony A6100s, each plugged into an Elgato Cam Link 4K: in my normal home studio setup, I plugged the Sonys into a Blackmagic Design HDMI recorder, but that’s a PCI Express card. The Elgato is an easier/smaller USB solution for laptops. I do prefer a two-camera setup, alternating between a head-on camera and a from-the-side camera for casual discussions. I still use my iPhone with the NDI camera app as a scenery camera during the breaks – it’s fine for that, but not for a main camera because it has unpredictable latency, meaning your speech and moving lips can drift in & out of sync, and I can’t stand that.

Microphone: DPA In-Ear Broadcast Headset plugged into a Focusrite Clarett. I’m only reusing these because they’re part of my desktop setup, so I don’t need to buy them again, but they’re overkill for most streaming setups. I have a Rode VideoMic GO ($100) on-camera microphone as a backup, but the sound on that isn’t nearly as good since it also picks up echoes from the room, keyboard noises, laptop fan, etc.

Customizable keyboard control: Elgato Stream Deck. LCD keys so you can configure what they show & do, like switching between different cameras. Could you do this by memorizing a bunch of hotkeys? Probably, but as you start to rely on more advanced OBS functionality, like playing sound effects, this will come in super handy.

USB-C dock: CalDigit TS3 Plus: if you count the above devices, that’s 5 high-speed USB ports right there, plus Ethernet. The MacBook Pro only has 4 USB-C ports, plus it needs one for electric power, so I needed a USB-C dock. The CalDigit is by far the most stable dock I’ve found – a lot of USB-C docks flake out during sustained 4K streaming with multiple cameras.

Tripod mounting: Manfrotto Magic Arm plus Super Clamp: I could theoretically clamp these directly to the dining room table that I’m using for a desk, but they’d shake when I jostle the table. Instead, I brought a couple of tripods, plus these Magic Arms so I can mount multiple devices to the tripod, like a camera plus lights.

Green screen: collapsible with stand: at home, I used a wall-mounted pull-down green screen, but obviously that’s not gonna work on the road. This folds up, and I’ll have it behind me while I’m teaching. I expect to be teaching sitting down while I’m on the road, but I’d be completely delighted if I could put together a setup where I could keep standing & teaching. It collapses, but not small enough to fit in a regular checked bag – it’s still an odd shape, and the box it shipped in was exactly the max dimensions for standard airline checked luggage.

Carrying the electronics: Pelican 1620 case: on one of our Iceland flights, the airport crew unloaded the baggage during a serious rainstorm. Our bags and their contents were all completely soaked with water. We didn’t have any electronics in ’em, thank goodness, but for this trip, I wanted something waterproof that I could check as luggage. (It won’t have any lithium batteries in it – I take those in my carryon.)

If you wanna see the results and you’ve got a Live Class Season Pass, hop into my live class page during any of my classes this month:

See you in class!

