Even if your SQL Server is the only guest on a host, it still might not be as fast as bare metal.
One of the reasons is NUMA, which stands for Not Ur Momma’s Architecture. Okay, no, smart reader, you caught me – it actually stands for Non-Uniform Memory Access. In your momma’s architecture (Symmetric Multi-Processing), any CPU could access any memory all at the same low price. In today’s NUMA servers, a single motherboard with two CPUs and 128GB of memory can actually be divided into different nodes.
When a process running on CPU #1 wants to access memory that’s directly connected to it, that’s local access, and it’s fast. However, when that same process wants to grab data stored in CPU #2’s memory, that’s remote access, and it’s not as fast.
The performance penalty of remote memory access varies greatly from system to system, and you can measure it with Coreinfo from Sysinternals. (That Russinovich knows everything.) Blogger Linchi Shea went so far as to test the overhead of local versus remote access on one particular system, and he saw about a 5% performance reduction. He considered that the worst-case scenario for the server hardware he was using, but keep in mind that the situation will be much worse on servers with higher remote memory access costs, like IBM’s x3850 and x3950.
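For a back-of-the-napkin feel for where a number like 5% comes from, blend the local and remote latencies by how often accesses go remote. The latencies and the 10% remote fraction below are made-up illustration numbers, not measurements from any particular box:

```python
# Hypothetical NUMA latencies -- illustrative numbers only, not benchmarks.
LOCAL_NS = 100.0    # latency to memory on the local NUMA node
REMOTE_NS = 150.0   # latency to memory on a remote node (1.5x is a common ballpark)

def blended_latency(remote_fraction):
    """Average memory latency when some fraction of accesses go remote."""
    return LOCAL_NS * (1 - remote_fraction) + REMOTE_NS * remote_fraction

# If 10% of memory accesses land on the remote node:
penalty = blended_latency(0.10) / LOCAL_NS - 1
print(f"{penalty:.0%}")  # prints "5%" -- a 5% blended slowdown
```

The worse the remote latency multiplier (or the more memory spills across nodes), the bigger that blended penalty gets – which is the whole point of SQL Server trying to keep work node-local.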
Windows exposes NUMA configuration details to applications, and it’s up to the app to tune itself appropriately. SQL Server has been NUMA-aware since 2005, and Microsoft’s continued to add improvements for it through 2008 and 2012. To learn more about how SQL Server handles NUMA, check out Gavin Payne’s SQLbits presentation, The NUMA Internals of SQL Server 2012.
How Virtualization Screws Things Up
The good thing about virtualization is that it abstracts away the hardware. You can run any virtual SQL Server on any server in the datacenter without a reinstall. You can even move virtual machines from one host to another, live, without a service restart – even if the underlying hardware is completely different. You can use multiple VMware hosts with completely different NUMA architectures – different numbers of cores per NUMA node, different amounts of memory per node, etc.
In order to pull this off, virtualization just presents a lump of CPUs and memory to our guest. Our virtual machine has no idea what the underlying NUMA configuration is – and it can’t, because it could change at any time when we’re moved from one host to another. This isn’t a performance problem for most apps because they don’t need to know anything about NUMA. They just want a lump of CPUs and memory.
Unfortunately, this is a performance problem for SQL Server because it actually wants to know the underlying configuration – and wants to tune itself for it. This is why, even when running on a host with no other guests involved, performance still won’t match bare metal.
How vSphere 5’s Virtual NUMA Fixed Things Up Again
There are three key decisions that will make your life easier (and possibly your performance better).
First, isolate your virtual SQL Servers onto their own hosts. With SQL Server 2012’s licensing, when you buy Enterprise Edition for the host’s CPUs, you get unlimited virtual machines on that host. For a while, this wasn’t easily doable in VMware because of their incredibly stupid memory limits with licensing, but thank goodness they fixed that license stupidity recently. I can’t imagine a software vendor being dumb enough to limit their product to 64GB of memory in this day and age. <cough>sqlserverstandardedition</cough> I’m so glad VMware listened to their end users and fixed that limitation. <cough>microsoftpayattention</cough> Restricting a database server to just $500 worth of memory, why, that’d be like releasing a tablet with 4 hours of battery life. <cough>mylastpostasanmvp</cough>
Second, in that pool of hosts, use identical hardware running vSphere 5. All of the hosts need to have the same NUMA architecture. This does come with a drawback: it’s harder to do hardware refreshes. Most shops just buy new hardware as it becomes available, throw it into the VMware cluster, and let VMware DRS automatically rebalance the load. Unfortunately, the SQL Servers won’t be able to vMotion onto this hardware if it has a different NUMA configuration. The guests will need to be shut down at the next maintenance window, get a different NUMA config, and then be booted on the appropriate hosts.
Finally, configure vSphere 5’s Virtual NUMA on your guests. This is done automatically for guests with more than 8 vCPUs, but at 8 or fewer, you’ll need to enable it manually. Presto, SQL Server will see the underlying architecture and tune itself appropriately. (Well, not entirely appropriately – now SQL Server just solves the easy problems for you, and creates new hard problems.)
To enable virtual NUMA on VMs with 8 or fewer vCPUs, follow the instructions on page 41 of the Performance Best Practices for VMware vSphere 5.1 PDF. And hey, while you’re in there, get your learn on – it’s an excellent resource for SQL Server DBAs who want to know if their shop is doing things right.
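If you’d rather not dig for the page, the relevant knob is an advanced VM setting. This is a sketch based on vSphere 5.x behavior – the option name and threshold are as I remember them, so verify against the PDF for your exact version:

```
# In the VM's .vmx file (or Edit Settings > Options > Advanced > General >
# Configuration Parameters). vNUMA is only exposed automatically at 9+ vCPUs;
# lowering numa.vcpu.min exposes it to smaller VMs, e.g. an 8 vCPU guest:
numa.vcpu.min = "8"
```

The VM also needs to be at a hardware version that supports vNUMA, so check that prerequisite in the same doc.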
Wanna learn more? Check out our VMware, SANs, and Hardware training videos. It’s a 5-hour training video series explaining how to buy the right hardware, configure it, and set up SQL Server for the best performance and reliability. Here’s a preview:
>> With SQL Server 2012’s licensing, when you buy Enterprise Edition for the host’s CPUs, you get unlimited virtual machines on that host.
AFAIK, you also need to be paying Software Assurance to be license-compliant for unlimited VMs on that host.
Sean – no, you only need Software Assurance to get License Mobility – the ability to migrate VMs around quickly.
Microsoft’s SQL Server 2012 Virtualization Licensing Guide is pretty clear about the need to buy SA for “Maximum Virtualization”.
See page 12
Yes, and “maximum virtualization” means the ability to dynamically shift workloads (read also: VMs) from one physical machine to another, which is what Brent is referring to for License Mobility.
You do NOT get unlimited VMs per host with Enterprise core licenses UNLESS you have SA. Per Microsoft’s licensing document: “With the SQL Server 2012 Enterprise Edition, customers that have licensed all physical cores on the server can run an unlimited number of instances of the software in a number of OSEs (physical and/or virtual) equal to the number of core licenses assigned to the server.” If you have twelve core licenses, you are limited to twelve VMs.
Maximum Virtualization allows you to run unlimited VMs if the entire physical box is licensed. Maximum Virtualization requires Enterprise Edition AND SA.
Maximum Virtualization is different than Application Mobility or License Mobility. Microsoft seems to use the two Mobility terms interchangeably. SA gives you both Maximum Virtualization and License Mobility.
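To make the head-count concrete, here’s a sketch of the rule as the thread above reads it – this encodes the commenters’ interpretation, not licensing advice, so verify against Microsoft’s licensing guide before buying anything:

```python
def max_vms(core_licenses, has_sa):
    """Max virtual OSEs on a fully licensed Enterprise Edition host.

    Without SA: capped at the number of core licenses assigned to the server.
    With SA (Maximum Virtualization): unlimited, represented here as None.
    """
    return None if has_sa else core_licenses

print(max_vms(12, has_sa=False))  # 12 -- capped at the core-license count
print(max_vms(12, has_sa=True))   # None -- unlimited with Software Assurance
```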
So are you saying not to virtualize my SQL Servers? The licensing aspect seems like a nightmare to me, and my boss wants to virtualize everything. I keep saying no, but how long can I keep that up?
Glenn – no, not at all. I’ve got a lot more info about virtualization over at https://www.brentozar.com/go/virtual including reasons why you should (and shouldn’t) virtualize your servers.
Long time follower, first time poster here. I’d like to start by thanking you for the wealth of SQL information you’ve shared. Being an accidental DBA over the years, your blog has been exceedingly useful. With that said, I felt compelled to share a few thoughts with respect to VMware + SQL – and bear in mind I’m a systems engineer, so I’m a little biased towards the systems side of things.
The virtual NUMA recommendation that you’ve suggested reminds me of this really great article I read once: https://www.brentozar.com/archive/2011/12/consulting-lines-pilot-dog/. This is not something I would recommend you just turn on by default – at least not without a complete picture of what’s going on.
To start, SQL or not, the VM needs to be sized correctly. That isn’t just a matter of nailing down the vCPUs and memory needed from a performance standpoint – it also means understanding the underlying physical hardware. The reason I say this is that going through the hassle of adjusting one of these non-default settings should have some sort of positive effect, right? So take your example of 8 vCPUs, not enough to trigger vNUMA. What if your server has 8 cores per socket? In that case vNUMA pretty much won’t do anything: VMware is almost always going to fit that VM into a single NUMA node. VMware will always try to fit a VM into one node; only under tremendous resource pressure, or with VMs larger than a single NUMA node, should you see a VM spanning multiple nodes. More likely, VMware would move the VM to a different socket while still keeping it within a single node.
Additionally, vCPUs aren’t the only thing to be careful about when sizing a VM with respect to NUMA – memory can also be an issue. For example, if you have a dual 8-core server with 32GB of total memory and you configure a VM with 24GB of memory and 4 vCPUs, the VM will span multiple NUMA nodes (each node only has 16GB of memory) – not from a vCPU standpoint, but from a memory standpoint. In this case, I’m honestly not sure enabling vNUMA in VMware would make a difference either. Based on my understanding, VMware would try to keep the VM’s 4 vCPUs in a single node and simply hop over the memory bus when needed. Part of the reason is that on physical hardware you wouldn’t be able to do this at all (another pro for virtualizing).
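A VM fits inside one NUMA node only if both its vCPU count and its memory fit. Here’s a quick sketch of that check, using hypothetical numbers from the dual 8-core, 32GB example above (8 cores and 16GB per node):

```python
def fits_in_one_node(vm_vcpus, vm_mem_gb, node_cores, node_mem_gb):
    """True if the VM can be placed entirely inside a single NUMA node:
    it must fit on BOTH the vCPU axis and the memory axis."""
    return vm_vcpus <= node_cores and vm_mem_gb <= node_mem_gb

# Dual-socket host, 8 cores and 16GB per NUMA node:
print(fits_in_one_node(4, 24, 8, 16))   # False -- memory spills to the remote node
print(fits_in_one_node(8, 12, 8, 16))   # True  -- fits on both axes
```

The 24GB VM fails the check even though its 4 vCPUs fit comfortably – exactly the memory-side trap described above.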
In addition to all of this, you should know that to get vNUMA, you’ll give up hot add in your SQL Server – or rather, if you enable hot add, vNUMA is disabled: http://www.yellow-bricks.com/2012/01/16/enabling-hot-add-by-default-cc-gabvirtualworld/
My summary is simply that you need to understand the full picture before tweaking those advanced settings: what are you losing, and is there a performance gain? Is 5% really going to make a noticeable difference? I’m not saying don’t do it – as usual, it depends…
With great respect,
Eric – yep, absolutely, I always recommend getting the full picture when making changes. Thanks!
Brent, when you talk about vCPUs, are you talking about virtual sockets or the number of cores (vSphere 5)?
Does 1 socket and 4 cores = 4 vCPUs?
Does 2 sockets with 2 cores each = 4 vCPUs?
Pam – virtual cores. Thanks!
In vSphere, VMs get a slice of a thread or a full thread (on a non-HT CPU, 1 core = 1 thread; on an HT-enabled CPU, 1 core = 2 threads). Each thread is referred to as an LCPU, and a slice of a thread (LCPU) is called a vCPU. The physical socket is referred to as a PCPU (physical CPU).
In a nutshell:
PCPU = physical CPU (socket)
LCPU = thread
vCPU = slice of a thread
In vSphere 5.1, 1 LCPU can be shared by up to 25 vCPUs, and in 5.5 it can be shared by up to 32. So if you have a quad-core non-HT CPU, you can have up to 100 vCPUs (25×4) for your VMs in vSphere 5.1, and 128 vCPUs (32×4) in vSphere 5.5.
Hope it helps.
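Those ceilings are just multiplication – logical CPUs times the sharing ratio. A quick sketch using the ratios quoted above (the 25:1 and 32:1 figures are scheduler limits from the comment, not sizing recommendations):

```python
def max_vcpus(sockets, cores_per_socket, hyperthreading, vcpus_per_lcpu):
    """Upper bound on schedulable vCPUs: logical CPUs (LCPUs) x sharing ratio.
    HT doubles the LCPU count (1 core = 2 threads)."""
    lcpus = sockets * cores_per_socket * (2 if hyperthreading else 1)
    return lcpus * vcpus_per_lcpu

# Single quad-core, non-HT host:
print(max_vcpus(1, 4, False, 25))  # 100 (vSphere 5.1 limit)
print(max_vcpus(1, 4, False, 32))  # 128 (vSphere 5.5 limit)
```

Running anywhere near those limits would be a scheduling disaster for SQL Server, of course – they’re ceilings, not targets.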
FYI, the YouTube video on this page for the VMware training preview is now private.