If you want to run the labs on your own (either during class, or later), you’re going to need:

  • A server
  • The Stack Overflow database
  • The indexes & scripts set up
  • Then ideally, back the database up, and set up an Agent job to rapidly restore the database between labs
  • SQLQueryStress

Let’s talk through each piece.

Building a SQL Server

In class, we use AWS EC2 VMs:

  • Mastering Index Tuning & Mastering Query Tuning use an i3.xlarge: 4 cores, 30GB RAM, local NVMe SSD storage.
  • Mastering Server Tuning uses an i3.2xlarge: 8 cores, 61GB RAM, local NVMe storage. We use more cores & RAM here because we run heavier stress tests, since we’re doing server-level discussions

Each server needs at least 500GB local SSD space to deal with the 350GB Stack Overflow database, index creation space, TempDB, backups, and restores. It needs to be fast storage, too: for a rough idea, you’ll need to be able to restore the ~350GB backup in 15 minutes or less. (I’m just using restore speed as one example here – you’ll be doing lots of storage-intensive tasks, like creating indexes on large tables.)

Before you install anything on your VM, run a test with CrystalDiskMark. You want at least 1,000 MB/sec for sequential reads and 400 MB/sec for sequential writes. Results for an AWS i3.xlarge like we use in class is shown at right: if you can’t get the top line’s results to be at least that high, stop: you’re not going to be able to keep up in class.

If you want to use an Azure VM, use the L8s v2 instance type with 8 cores, 64GB RAM, and 1.8TB of local NVMe storage. It’s the smallest instance type Microsoft offers with NVMe storage, which is what you’ll need to keep up with the labs.

Can you use faster hardware? Yep, and as long as you’re not caching the entire database in RAM, you’ll probably still have the same basic challenges that we tackle in all of the classes.

After building the server, install:

Getting the Stack Overflow database

Download the 2018-06 version of the Stack Overflow database: 38GB torrent (magnet.) If you’re not familiar with BitTorrent, here are detailed instructions, but just make sure to get the Stack version mentioned earlier in this paragraph – it’s important so you reproduce demos the same way. Your company may block BitTorrent – many do – but it’s the most efficient way to get a >300GB database out there. There aren’t a lot of file hosts willing to share a file that big, heh. You may need to download it from home.

When you extract it, it’ll expand to a ~304GB SQL Server database that you can attach. There are other smaller versions of the database too, but be aware that you’ll get radically different query plans, and some of the demos won’t work the same way since we’re querying for specific date ranges.

It ships in SQL 2008 compatible format, but you usually wanna learn on the most recent Cardinality Estimator (CE). To set that, in SSMS, right-click on the database, go into Options, and change Compatibility Level to the most current version you’re on. SQL Server 2019 does do some things dramatically differently – sometimes better, sometimes worse – and we’ll talk about that during class.

Setting Up the Stored Procs and Indexes

We use the First Responder Kit scripts for performance analysis, and we use the below procs to help the demos along. Create these in the Stack Overflow database:

Don’t install that in production, bucko.

Next, run the index creation scripts below. It’s going to take a while, depending on how fast your lab’s storage is: on my crazy-fast 2019 MacBook Pro, it took 26 minutes. I know, that’s a long time – but you only have to do this once. The way I’ve built this lab, I have you create ALL of the possible indexes up front, and then when it’s time to do each lab, I just drop the specific indexes that you won’t need for that one lab. This means you can restore the database quickly to get back to the base config, drop specific indexes, and be good to go. (No, you don’t wanna try to avoid the restore – during some of the labs, we actually alter the database’s config, like add or remove columns from tables.)

After creating these indexes, we’ll take a backup of the database so that you can restore it each time to set back to a known starting point.

Back It Up and Set Up a Restore Job

During class, between each lab, the students restore their database to this starting point. For sheer performance, you want to run the backup across 4 files – SQL Server backups & restores can actually go faster when you stripe them across more files, even when writing to the same storage.

Here’s a quick and dirty backup script – just change the drive/folder for your machine:

Then, set up an Agent job to restore them. Here’s the script I use – just modify the file path locations for your backups. In my example below, I’m doing my restores from E:\MSSQL\BACKUP – you may need to modify your backup path and file names.

Note that mine also has a step to set Cost Threshold for Parallelism and MAXDOP back to normal defaults for the VM we’re working with in the cloud. I also set compatibility level to 140 (SQL Server 2017) – that’s just because I do most of my client work on 2016-2017, so I want the database to behave that way. If you do most of your work on a different version, you can use a different compat level. We’ll talk during the class about how some labs need specific compat levels.

Test your restore job to get an idea of how long it’ll take – if it takes more than 15 minutes, you probably want to use a faster VM to be able to keep up in class. If the restore takes, say, 20-30 minutes, you’re going to have a tough time keeping up when you’re doing storage-intensive things like creating indexes.

SQLQueryStress for load testing

SQLQueryStress is an easy, free, open source load testing utility. Download the latest version (zip), and extract it to a folder. You don’t have to run a setup or anything – it’s just a single .exe file that you can run whenever you want to do a load test.

Then, download my load test files and extract those to the same folder. Your folder will end up looking like this:

SQLQueryStress folder

Run the First Lab for Testing Purposes

Let’s act like it’s time for you to do the first lab in class. Follow the setup instructions on the Lab 2 page. It’s totally okay if you get some exceptions, like 5%-10%. If you get 100% exceptions, that means there’s something wrong with your setup, like you didn’t apply all of the setup scripts. Click the little … button to see what the exceptions are, and troubleshoot those.

You can click on the … button to view the exceptions at any time without stopping the test. These exceptions are totally normal and okay:

  • Deadlocks
  • Divide by zero errors
  • Transaction count after EXECUTE indicates a mismatching number of BEGIN and COMMIT statements

If you see any other kinds of errors, there’s probably a setup problem. Stop and double-check your work.

You don’t have to wait for it to finish – your goal is just to make sure that it starts running without producing 100% errors. It won’t be fast, believe me! That’s the point of these labs, to really push your machine hard.

You can cancel SQLQueryStress at any time. You may hit a bug in SQLQueryStress where you close it, but it’s still running – you may need to check in Task Manager if it’s still showing up even when it’s not running. It kinda disappears like a zombie process. In that case, you’ll want to end-task it:

And that’s it! See you in class!

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.