If you’re taking Fundamentals of Columnstore or any of my Mastering SQL Server classes, you’re going to need your own SQL Server VM to follow along with the labs. The setup is exactly the same for all of those classes, so as long as you’ve set up for any of ’em, you’re good to go.

Here’s how to set it up. You’re going to need:

  • A fast SQL Server on Windows (not Linux, not Azure SQL DB, not containers, etc)
  • The right Stack Overflow database
  • An Agent job to rapidly restore the database between labs
  • SQLQueryStress

Let’s talk through each piece.

Building a SQL Server

When I do the demos & labs onscreen, I’m using an AWS EC2 VM, size i4i.xlarge with 4 CPU cores, 32GB RAM, and local NVMe SSD storage. Very often during class, I say things like, “You know you’ve finished the lab successfully when it runs in under 1 minute.” For that to work, you need a VM like mine. The slower yours is, the harder it’s going to be to finish the lab in the time you’ve got allotted.

Each server needs at least 300GB local SSD space to deal with the ~200GB Stack Overflow database and index creation space. It needs to be fast storage, too: for a rough idea, you’ll need to be able to restore the backup in 15 minutes or less. (I’m just using restore speed as one example here – you’ll be doing lots of storage-intensive tasks, like creating indexes on large tables.)

If you want to use an Azure VM, use the L8s v2 instance type with 8 cores, 64GB RAM, and 1.8TB of local NVMe storage. It’s the smallest instance type Microsoft offers with NVMe storage, which is what you’ll need to keep up with the labs.

Before you install anything on your VM, run a test with CrystalDiskMark. On whatever drive where the SQL Server data & log & backup files will live, you want at least 1,000 MB/sec for sequential reads and 400 MB/sec for sequential writes. Results for an AWS i3.xlarge like we use in class is shown at right: if you can’t get the top line’s results to be at least that high, stop: you’re not going to be able to keep up in class.

Can you use faster hardware? Yep, and as long as you’re not caching the entire database in RAM, you’ll probably still have the same basic challenges that we tackle in all of the classes.

After building the server, install:

  • SQL Server 2017, 2019, or 2022 Developer Edition (Express Edition won’t work)
  • Use the SQL Server Setup Checklist in my First Responder Kit. For example, you’ll want to set max memory to leave 10% of the box’s memory free for SSMS & SQLQueryStress, and during the install, check the box for “Grant Perform Volume Maintenance Task” – this lets SQL Server restore databases way faster, which will be important during the class
  • Use the default collation during install, SQL_Latin1_General_CP1_CI_AS (don’t get fancy with binary collations)
  • Use at least 4 equally sized TempDB data files, each at least 10GB in size (more files is okay, but just definitely get at least 4 to avoid PFS/SGAM contention during the labs)
  • The latest Cumulative Update
  • The most recent SQL Server Management Studio (and it really does have to be the most recent one)
  • The most recent First Responder Kit scripts in the master database – use Install-Core-Blitz-With-Query-Store.sql

Wanna Use a VM in Azure?

Kevin Hill (@Kevin3NFDallasDBAs.com) recorded a video in December 2020 how to do it – just be aware that there may have been changes in the Azure portal since then, but the basics should still work:

Getting the Stack Overflow database

In class, we use a specific version of the Stack Overflow database that I’ve customized with indexes, procs, etc. I only distribute it via BitTorrent because it’s so large. To download it, install a BitTorrent client, and then open one of these: torrent or magnet. That’ll kick off the download process, which will eventually download 4 12GB backup files.

It’s 4 backup files because if you stripe a backup across 4 files, it generally goes faster (as do the restores.) You’ll need all 4 of them in order to restore the database. If you’re working in a cloud VM, save these backup files on the permanent storage of your VM, not the local ephemeral storage (which is erased each time your VM shuts down.)

The database is only available via BitTorrent due to its size. Please don’t email me asking me to email you a copy: I can’t email 50GB worth of data.

Set Up a Restore Job

Don’t just restore it once manually: you’ll need to automate this process. During class, between each lab, the students restore their database to this starting point. To do that, set up an Agent job to restore them. Here’s the script I use – just modify the file path locations for your backups. In my example below, I’m doing my restores from E:\MSSQL\BACKUP – you may need to modify your backup path and file names. I’m also moving the files to my Z drive, and you may need to change that too.

If you’re working in the cloud, put the backup files on your permanent storage, but restore the database onto your ephemeral (fast) storage. That way, whenever you start your SQL Server VM up for the first time in a while, you can load the data into the fast storage.

Note that mine also has a step to set Cost Threshold for Parallelism and MAXDOP back to normal defaults for the VM we’re working with in the cloud. I also set compatibility level to 140 (SQL Server 2017) – that’s just because I do most of my client work on 2016-2017, so I want the database to behave that way. If you do most of your work on a different version, you can use a different compat level. We’ll talk during the class about how some labs need specific compat levels.

Run your restore job to get an idea of how long it’ll take – if it takes more than 15 minutes, you probably want to use a faster VM to be able to keep up in class. If the restore takes, say, 20-30 minutes, you’re going to have a tough time keeping up when you’re doing storage-intensive things like creating indexes.

SQLQueryStress for load testing

SQLQueryStress is an easy, free, open source load testing utility. Download the latest version and extract it to a folder. You don’t have to run a setup or anything – it’s just a single .exe file that you can run whenever you want to do a load test.

Then, download my load test files and extract those to the same folder. Your folder will end up looking like this:

SQLQueryStress folder

And that’s it! See you in class!

29 Comments. Leave new

  • Hi Brent, We have some developers doing your fundamentals training that don’t have sysadmin on the SQL Instance there using, what permissions are needed for he sp_Blitz scripts to work?

    Reply
  • Hi.
    I followed the prerequisites and tried running SQL stress. I’m getting the following error “Could not find stored procedure ‘usp_IndexLab1’.”
    Are these procedures supposed to come along with the 2018/06 stack overflow database? or where else do I find them?

    Reply
  • I am in a paid training. I just couldn’t find the procs in any of the prereq setups.

    Reply
  • I’m setting up my VM now, using an Amazon EC2 i3.xlarge instance like yours, but a bit confused on the storage – should we be using Amazon’s EBS volume storage or the instance’s ephemeral storage? With a 300 GB EBS volume, my CrystalDiskMark speeds are ~ 100 MB/s for sequential reads, not fast enough for Mastering classes even copying your instance configuration. Ephemeral storage is much faster, as you note, but also less reliable. Thanks for any clarification

    Reply
  • The local storage. That’s why it says “local NVMe SSD storage”.

    Reply
    • I really appreciate the quick reply, but can you clarify what local storage refers to. When configuring an EC2 instance, there’s instance storage (ephemeral), and separately AWS has Elastic Block Storage, but nothing is called local. The C drive for a default EC2 instance is mapped to an EBS volume, which is what got the slow reads.

      Reply
  • Hi Brent, Is there cost involve in the AWS/Azure VMs that i need to pay? If Yes, do you know how much?

    Reply
  • I downloaded the latest version of SQLQueryStress.zip from https://github.com/ErikEJ/SqlQueryStress/releases/download/102/SqlQueryStress.zip
    When running on Microsoft Server 2019 Datacenter edition, I got the following error:
    To Run this application, you must install .NET Core.
    The framework ‘Microsoft.WindowsDesktop.App’, version ‘3.1.0’ was not found
    Would you like to download it now?

    Tried downloaded but still get the same error.

    Reply
  • Jose Galeano
    May 17, 2022 2:41 am

    Hi Guys,

    I tried twice to download and restore the database, 2018-06 version of the Stack Overflow, not sure what happened with the backup, below the error msg.

    Error Msg:
    RESTORE detected an error on page (0:0) in database “StackOverflow” as read from the backup set.
    RESTORE DATABASE is terminating abnormally.

    Reply
    • That means you’re getting download corruption. You’ll want to try with a more reliable internet connection or a Torrent client that better supports retries.

      Reply
  • Hi Brent, hi fellow users,

    I’m fairly new to azure vm creation. I was wondering if there’s a script to setup this azure vm rather than going through the azure portal GUI. thanks

    Reply
  • I’m trying to setup azure VM with NVME drives but as far as I can see NVMW drives are not available no mater which size I choose.
    Tried with L8s v3 but still no NVME support, only premium SSD.
    Microsoft is saying that NVMe interface is supported only with Ebsv5 and Ebdsv5 VM families and not with L8sv2/3.
    As a test I installed L8sv3 as recommended here as showen in youtube video but disk performance was very bad compared to yours Brent so no point using it.

    Ideas?

    Reply
  • Thank you Brent for the information, according to url it does support NVME but as I mentioned before I tried L8sv3 with 2019 DataCenter windows and SQL 2019 and when I looked at disks description it stated premium SSD and not NVME. Maybe I am doing something wrong? There is no option to choose NVME at all. Need this if I am going to follow along with and workshop in Oslo in september. Please advise.

    Reply
    • My guess – and this is just a guess – is that you’re looking at the network disks attached to the VM rather than the local ephemeral storage. I’m also going to guess that you’ve never set up an Azure VM with ephemeral storage before.

      I do need you to be fair with me: your ticket to the Oslo class doesn’t include personalized Azure support from me, and this isn’t an Azure training class. If you don’t have the necessary Azure experience to set up the VM there, you’ll either need to use a method you’re familiar with, or simply watch the class (as most attendees do) rather than trying to follow along with the labs.

      Fair?

      Reply
  • I’m sorry if I came accross as someone asking for step by step guide. That was not my intention. Just stated what I saw with these L8sv3 regarding disks.
    Anyways, its fair and I respect that.

    Reply
  • SQL Server 2017, 2019, or 2022 Developer Edition – Do you expect to install all three versions or the one I am primarily working on and interested in doing the lab with? Same question about restoring the database. Restore in all three instances or just one?

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.