tl;dr – I do not recommend this book.

I was so incredibly excited when it was originally announced. A book published by VMware Press, written by prominent VMware, SQL, and storage consultants? GREAT! So much has changed in those topics over the last few years, and it’s high time we got official word on how to do a great job with this combination of technology. Everybody’s doin’ it and doin’ it and doin’ it well, so let’s get the best practices on paper.

When it arrived on my doorstep, I did the same thing I do with any new tech book: I sat down with a pad of post-it notes, hit the table of contents, and looked for a section that covers something I know really well. I jumped directly to that and fact-checked it. If the authors do a great job on the things I know well, then I’ve got confidence they’re telling the truth about the things I don’t know well.

I’ll jump around through pages in the same order I picked ‘em while reading:

Page 309: High Availability Options

Here’s the original. Take your time looking at it first, then click on it to see the technical problems:


Page 309 – read it yourself critically first, think about what you know might not be right, then click for my annotated notes

OK, maybe it was bad luck on the first page. Let’s keep going.

Page 111: Database File Design


Page 111 – read it critically first, then click here for my annotated version

The “Microsoft Recommended Settings” are based on a 2006 article about Microsoft SQL Server 2005. I pointed this out to the book’s authors, who responded that Microsoft’s page is “published guidance” that they still consider to be the best advice today about SQL Server performance. Interesting.

Even so, the #3 tip in that ancient Microsoft list is:

3. Try not to “over” optimize the design of the storage; simpler designs generally offer good performance and more flexibility.

The book is recommending the exact opposite – a minimum of one data file per core for every single database you virtualize. That’s incredibly dangerous: it means on a server with, say, 50 databases and 8 virtual CPUs, you’ll now have 400 data files to deal with, all of which will have their own empty space sitting around.
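To put rough numbers on that advice, here’s a quick back-of-the-napkin sketch. The database and vCPU counts are illustrative only, not from any particular server:

```python
# Illustrative math on the "one data file per core, per database" guidance.
# These counts are hypothetical - plug in your own server's numbers.
def data_file_count(databases: int, vcpus: int, files_per_core: int = 1) -> int:
    """Total data files if every database gets files_per_core files per vCPU."""
    return databases * vcpus * files_per_core

print(data_file_count(databases=50, vcpus=8))  # 400 files to create, monitor, and grow
```

And that’s the *minimum* recommendation – bump files_per_core up and the count multiplies from there.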

I asked the authors how this would work in servers with multiple databases, and they responded that I was “completely wrong.” They say in a virtual world, each mission critical database should have its own SQL Server instance.

That doesn’t match up with what I see in the field, but it may be completely true. (I’d be curious if any of our readers have similar experiences, getting management to spin up a new VM for each important database.)

So how are you supposed to configure all those files? Let’s turn to…

Page 124: Data File Layout on Storage


Page 124 – read it, think on it, and then check out my notes

Imagine this setup for a server with dozens of databases. And imagine the work you’d have to do if you decide to add another 4 or 8 virtual processors – you’d have to add more LUNs, add files, rebalance all of the data by rebuilding your clustered indexes (possibly taking an outage in the process if you’re on SQL Server Standard Edition).
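The math on that resize is easy to sketch. Again, these are made-up numbers, just to show how a vCPU bump ripples into storage work under this scheme:

```python
# If file count must track vCPU count, every vCPU bump ripples into storage work.
# Hypothetical numbers: a 50-database server going from 8 to 16 vCPUs.
def files_to_add(databases: int, old_vcpus: int, new_vcpus: int) -> int:
    """New data files needed to keep one file per core on every database."""
    return databases * (new_vcpus - old_vcpus)

print(files_to_add(databases=50, old_vcpus=8, new_vcpus=16))  # 400 more files
```

Each of those new files needs a home on a LUN, and then you still have to rebalance the existing data into them by rebuilding clustered indexes.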

What’s the point of all this work? Let’s turn to…

Page 114: You Need Data Files for Parallelism


Page 114 – you don’t even have to click for my thoughts this time. See, I’m all about you.

No, you don’t need more data files for parallelism. Paul Randal debunked that in 2007, and if anybody still believes it, make sure to read the full post including the comments. It’s simply not true.

I asked the authors about this, and they disagree with Paul Randal, Bob Dorr, Cindy Gross, and the other Microsoft employees who went on the record about what’s happening in the source code. The authors wrote:

You can’t say Microsoft debunked something when they still have published guidance about it…. If in fact if your assertions were accurate and as severe then we would have not had the success we’ve had in customer environments or the positive feedback we’ve had from Microsoft. I would suggest you research virtualization environments and how they are different before publishing your review.

(Ah, he’s got a point – I should probably start learning about SQL on VMware. I’ll start with this guy’s 2009 blog posts – you go ahead and keep reading while I get my learn on. This could take me a while to read all these, plus get through his 6-hour video course on it.)

So why are the authors so focused on micromanaging IO throughput with dozens of files per database? Why do they see so many problems with storage reads? I mean, sure, I hear a lot of complaints about slow storage, but there’s an easy way to fix that. Let’s turn to page 19 for the answer:

Page 19: How to Size Your Virtual Machines


Page 19 – read critically, and then click for my annotated notes

Ah, I think I see the problem.

To make matters worse, they don’t even mention how licensing affects this. If you’re licensing SQL Server Standard Edition at the VM guest level, the smallest VM you can pay for is 4 vCPUs. Oops. You’ll be paying for vCPUs you’re not even using. (And if you’re licensing Enterprise at the host level, you pay for all cores, which means you’re stacking dozens of these tiny database servers on each host, and managing your storage throughput will be a nightmare.)
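The waste is easy to sketch. This is only back-of-the-napkin math – the four-core-license minimum per guest-licensed VM is the key constraint:

```python
# SQL Server core licensing (2012+) requires a minimum of 4 core licenses per VM
# when licensing at the guest level, regardless of how few vCPUs the VM has.
CORE_LICENSE_MINIMUM = 4

def licensed_cores(vcpus: int) -> int:
    """Cores you pay for on a guest-licensed VM: at least the 4-core minimum."""
    return max(vcpus, CORE_LICENSE_MINIMUM)

# A fleet of 1-vCPU VMs pays for 4x the compute it actually uses:
for vcpus in (1, 2, 4, 8):
    print(vcpus, "vCPU ->", licensed_cores(vcpus), "core licenses")
```

Follow the book’s one-database-per-VM advice with tiny 1-vCPU guests, and every one of those VMs is paying for three cores it never touches.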

In fact, licensing doesn’t even merit a mention in the Table of Contents or the book’s index – ironic, given that it’s the very first thing you should consider during a virtual SQL Server implementation.

In Conclusion: Wait for the Second Edition

I’m going to stop here because you get the point. I gave up on the book after about fifty pages of chartjunk, outdated suggestions, and questionable metrics (proc cache hit ratio should be >95% for “busy” databases, and >70% for “slow” databases).

This is disappointing because the book is packed with information, and I bet a lot of it is really good.

But the parts I know well are not accurate, so I can’t trust the rest.

  1. >>They say in a virtual world, each mission critical database should have its own SQL Server instance.

    We definitely have more than 1 DB on a VM instance….

    • Yeah it’s hard to imagine one DB per instance. Separating the disks/mount points for each database on that critical instance so they don’t cause cascading outages… that sounds more useful.

      Still, I didn’t know and don’t have a stance on any of the more technical issues above. That’s for the architects to worry about :-) I do sympathise with the struggle though. Just a few weeks ago I was in an argument where someone said, “If you cared about performance you wouldn’t be putting SQL Server in a VM”.

      But we got interrupted by a phone call from 2005 asking for its DBA back.

    • In highly automated environments I have seen an instance per application or business unit, but one database per VM/SQL Server instance seems extremely excessive and unnecessarily complex.

      I have watched some of Jeff and Michael’s presentations from VMWorld and thought that they were quite informative. Many of their recommendations from those presentations held up well in lab testing.

      This book is on its way to me and I will still be reading it, but I am disappointed to hear that these oversights and outdated recommendations made it into the book.

      • Kyle – agreed, I was really shocked by the book’s contents. The authors have great reputations in the Oracle and VMware community, and I was really looking forward to seeing what they produced for SQL Server.

    • Denis – yeah, that’s been my experience as well.

    • I used to work for one of the biggest public companies in the world, and I actually saw (and even configured) single SQL instances for only one mission critical database.

      Complex? Yes. Expensive? Of course. But if it goes down, it won’t take anything else down with it. This is something that medium and small size companies can’t afford, but for someone who has worked for billion-dollar companies, it is a possible scenario. The exception to the rule was when the application required several databases to operate, or when the application itself was not considered MC. And that was established based on how many millions in lost revenue an hour of downtime could generate.

      By the way, there was a lengthy and bureaucratic process before granting the MC status and/or deciding whether it was going to run alone in its own SQL instance or not.

  2. Very interesting review.

    So I went to the Amazon site and noticed the book was just published on August 4th, but already has two 5-star reviews dated August 6th and 10th.

    I’m a little skeptical that someone was able to read the book so quickly and give a proper review. That means the reviewers received the book ahead of time, which makes me question their objectivity.

  3. Thanks Brent, you made my day. I laughed a lot when I read the book content. I will buy this book and read it when I am free. Haha

  4. Interesting timing for this post seeing as how we are in the midst of planning our migration to virtual servers on VMware. You hit the nail on the head – if licensing wasn’t mentioned, that’s a huge oversight seeing as how that’s one of the most compelling reasons to virtualize SQL Server to begin with. I can’t fathom creating all those data files, and then “manually/proactively” monitoring their growth rates? Holy administrative nightmare batman, and not just for the DBA. Thanks for the link to Paul’s post also – that was very informative.

  5. Let me see if I understand this correctly. I’m supposed to manually manage all these file locations and sizes in the name of optimizing parallelism, then give them crippling amounts of RAM (1GB for basic tier, 4GB for medium tier)? That’s pure insanity.

    • Doug – and if I was really a good consultant, I would promote this book because it would bring us in a whole lot more business. Folks would be so frustrated because they’d put so much work into micromanaging these files, and yet performance would still be awful.

      • But you would promote it under a pseudonym, let’s say Orent Beezar, otherwise it would fall back on you.

  6. Interesting to see those snippets of the book. We had a consultant come in a few months ago from the company who manages our infrastructure. He tried to tell us that we should align any of our virtual SQL Servers with the same recommendations outlined on page 111. He claimed that SQL Server will never use more than 1 CPU or 4-8 GB of RAM regardless of the database load, and that if we aligned the virtual resources we would see performance increases. He nearly hit the floor when he saw our virtual OLAP environment (2 guests, each with 300GB of RAM and 32 vCPUs, completely consuming a dedicated host).

    Luckily he lost a lot of credibility quickly here and no one bought into his recommendations – at least when it came to our virtual SQL boxes.

    • Ian – wow, I’m stunned that a consultant would ever make a statement like that. Yowza. Although on the bright side, it means folks like us will always have work. (sigh)

  7. Excellent article, and it’s thought-provoking how many other books are misleading people. On that note, what books would you recommend around this topic (or others for that matter – obviously yours), if there are any?

    • Martin – thanks, glad you liked it. I actually don’t have a book I would recommend on virtualizing SQL Server with current versions of VMware, sadly, and that’s why I was so excited about the book.

  8. …and we trust Brent more than we trust the author.

    If you won’t write your own books, would you kindly do a fine-tooth-combing of every-SQL-Server-book-ever-written, and write a detailed review? I’ll buy you a beer for your efforts.

    Many thanks.

    • On the topic of SQL Server + VMware – yes, I trust Brent (and Paul Randal). I don’t know the author, but I do understand the points Brent brought up and they are very valid points. You will find many SQL Server experts agree on those very topics, not that Brent needs such confirmation from the DBA community.

      • Allen – thanks sir. You know, it’s funny, when I read something like this book, I really start to question my own experiences. Before publishing the review, I had several discussions with other community folks (and of course the authors) as a sanity check. When I read these types of guidelines (1vCPU, 1GB RAM) I really wonder if it’s me, or if the world has gotten success with that type of approach.

    • Justin – thanks sir. It’d better be a really big book.

  9. We are in the process of deploying SQL Server 2012 EE on VMware, and I have been looking for good resources. Thanks for qualifying this book – I won’t bother buying it.

    Here’s to hoping Allan Hirt’s (@sqlha) new book covers virtualization.

  10. Thanks for reviewing this book Brent. I was also interested in picking this up, but with so many issues on just those few pages I’ll wait. I pulled our VMware Admin over to look at your review too, we both got a good laugh at the notion of 1GB of RAM on a VM and having one VM Client per database.

  11. Where I work we have a ‘large’ SQL Server cluster designed by a Microsoft consultant in order to group all the small boxes into a big cluster (3 years later it still uses less than 20% of its resources – what a waste).

    The CPD failover design consists of a VM for each database, and they want each to have the same amount of resources assured at the VMware level, so that means we need to waste double the resources just in case.

    Maybe we didn’t get the best consultant, but this is what we have to deal with every day.

    OTOH, we do have SQL Server VMs with 1vCPU and 1GB of RAM and they work as expected. Believe me, they have enough with that, but also believe me that even though those are production DBs, the amount of work they do is minimal.

    And from a VMware admin POV, it’s easier to start low and give more resources as needed. Unused vCPUs tend to produce ‘CPU Ready’ problems.

    • Urko – I’d be curious to hear what the end users thought about those VMs with 1vCPU and 1GB RAM, and hear about how you do DBCCs and backups on those VMs. Sounds like a great blog post or case study!

  12. Great review Brent.
    I found it funny-sad how one can find that many issues in a few pages.
    We have a growing VM SQL deployment here and are trying to move to it as much as we can. When it comes to spec’ing out the boxes, it is always “interesting” when vendors want physical-box requirements on a VM. It takes some effort to trim them down to what we, the DBAs, think is enough :) In the beginning, when we didn’t have much experience, we let some of these go that way. Talk about wasted resources in vCPU cycles! Thankfully we’ve either gotten rid of them or trimmed them down as much as possible.
    On the back flip, VMs with 1vCPU and 1GB RAM have been a no-go for any we tried. As you mentioned in another post response, doing any “intensive” task, like DBCC, backups, or even RDP’ing to the box, just kills it…

  13. Oh cool, one CPU and gig of RAM. Can’t wait to play Myst on that.

    Does this mean I can run an awesome production SQL Server on my Galaxy S5?

    This is why I tug on my collar when my boss asks if I’ve read anything about using VMotion on a SQL cluster.

  14. Pingback: (SFTW) SQL Server Links 22/08/14 - John Sansom

  15. Seems like they were confusing SQL Server with MySQL? I do have some (xsmall) MySQL VMs with 1 vCPU and 4GB RAM. And because the licenses are so much cheaper, and we’re SOA, MySQL is mostly one service DB per instance. That is certainly not the way any of my SQL Server instances are set up.
    Nice writeup and informative as always Brent.

  16. I agree that virtualizing SQL does mean you need to size VMs appropriately, which is different to physical servers. I am actually a VCP5-DCV, and I can understand why they would say to put one database on each VM running SQL (I understand their point, though I would never recommend this), as I am sure I heard that many years ago.

    VMware do sell their software expecting you to use their HA features, and pretty much tell you to forget Microsoft Clustering (I have done most of the VMware sales and technical sales training), etc. However, I have never seen these recommendations in the field.

    Licensing costs would be a nightmare, let alone having to patch every single instance of SQL Server for every application you are running in your virtualized environment. It’s good for only having a single application become unavailable should a SQL instance reboot itself (or hit some other outage), leaving all your other applications available, but a massive NIGHTMARE when you need to patch Windows and SQL on those VMs (hint: you WILL need to patch them).

    Currently 124 people have said your review has helped them, although I would imagine it hasn’t helped sales, which is a shame, as there is very little available elsewhere. It does, however, leave me without a decent book on virtualizing SQL Server, which is annoying, since one is highly sought after. At this rate I might have to do some research and write one myself, and I much prefer writing code. :)

    • Charles – thanks for stopping by. Yep, the licensing and patching would be pretty tricky, especially if you’re using any kind of third party monitoring or backup or management software for all these VMs.

  17. Thanks for continuing to expose this crap. Our profession needs better empirical standards.

  18. MEM/CPU recommendations: Actually, our VMware guy said that VMs usually have over-allocated vCPUs (this can be checked via Operations Manager if needed), and they might actually benefit from having just one vCPU instead of many. It was something to do with clock rates and how things are scheduled on hosts. I wonder if this is explained in the book in the parts you did not read… :/ I think VMware does a lot of memory operations behind the scenes (compression, etc.). I don’t know if that has anything to do with the recommendations made in the book, but the memory recommendations seem odd. I haven’t seen these kinds of values in any production machine.

    RPO/RTO values: I don’t understand these. RPO depends on the DB settings, and what the customer needs has to be discussed with them. RTO: in a failover cluster I would say ~20s to fail over, and then you have to add the time the application takes to recover. I’ve never watched/experienced this from the client side, so I cannot confirm the exact time needed. If you had SQL Server 2014 AND CSV usage (not mentioned in the charts, though), I think that would save time. But these are not the numbers I would tell my customers, because they must be able to tell users when to log back into the application, and boy, if we give wildly optimistic numbers, those users who try to log in to applications that aren’t available when you said they would be WILL call you :)

    Anyways, this was a book that I was thinking of buying but if you have to second guess everything you read, I think I will wait for another book to come out. Maybe you should write your book on the matter :)

    • Jaana – the 1vCPU recommendation is “mansplained” in the parts I did read, but it completely ignores basic SQL Server tasks like DBCCs, backups, and index rebuilds. In a world where we don’t have to maintain databases, sure, 1vCPU can work. Back here in reality…

      RPO and RTO *goals* depend on what the customer needs. RPO and RTO *deliverables* depend on how you set up the technology. The chart on page 309 says that RPO is different between SQL 2008R2 failover clusters and SQL 2012 failover clusters, and this is simply and demonstrably not true. It’s not even debatable – it’s just flat out wrong.

  19. Jaana – I agree with your VMware guy, as most VMs are actually over-provisioned; there is a difference between what a system needs as a VM compared to a physical box. I imagine that Operations Manager can help resolve that over-provisioning, but I have not yet used it to see how it allocates resources to VMs running SQL Server.

    The main reason that vCPUs are over-provisioned is how VMware schedules the physical CPU resources (as well as how people still perceive a core from the physical world). Each vCPU is actually a representation of time on the physical processor, so giving a VM more vCPUs might sound like it will give it more time to run (and it does, per se), but those slices all have to be allocated together, so your VM may actually find that it has to wait longer for 2 slices to become available at once, rather than just waiting for one slice, which the scheduler could allocate much faster and much easier. If 4 vCPUs are allocated, the scheduler has to wait for 4 time slices to become available before it can give the VM access to the physical processors, causing an even longer wait before it gets to do any processing and making the VM appear to work much slower.

    The edition of SQL Server also determines how many vCPU cores it can use (it sees physical and virtual cores the same way), and those limits have been slowly increasing over time. I believe enabling hyper-threading on the physical CPU can also present some issues, but I’m happy to be corrected on any of the above.


  20. Brent – cheers for the link. It’s a bank holiday here, so I needed some more stuff to read :)


  21. Brent,
    I can’t say I am surprised. I worked for five years as a DBA on a federated cluster of SQL Servers/databases that used replication to support a global shipping application for tracking shipping containers.
    Due to issues with the recommendations from the VMware “experts,” they could never migrate any of these SQL Servers to a VMware host.
    At one point we had to ask for another support tech, because the one we were working with said it was a waste of time and resources to place the tempdb, log files, and data files on different VHDs and different SAN LUN sets. He swore that the SQL Server process could not read/write data fast enough for this to make a difference.
    IMHO: VMware collected some very bad information during unique and custom SQL 2000 and 2005 early-adopter implementations, and mixed it with some poor responses from the support available at that time. It is embarrassing that they refuse to listen to any rhyme or reason that comes from currently published facts and implementations.

  22. Hi Brent, would you recommend any books about virtualizing SQL Server with Hyper-V 2012? I know everyone gets very excited about VMware, but we just don’t have the funds for it. We are just about to start virtualizing our SQL Server 2005 Std Edition, and I don’t see many books written about this using Hyper-V, mainly VMware. Or is it because VMware is so much better than Hyper-V that it’s not worth it?

    • Naz – it’s just due to market share. If you’re a book author, you don’t really make money on books – you do it to gain consulting revenue. If most folks are using VMware, then that’s what you want to write for. For Microsoft Hyper-V books, you’ll want to look at Microsoft Press (who publishes books for a different reason.) Enjoy!

  23. Thank you for this article!
    I had a chat with a colleague about the ridiculous things in the book.
    You’ve just confirmed our view.
    I will read it with caution.

  24. I’m currently reading the book, and while it’s disappointing to read your review, I believe it can still be a useful tool – as long as one is aware of its shortcomings. The book, like any other source of information, should be vetted for accuracy and completeness against other sources. To be effective, our toolkit should always include a variety of tools and information sources (if you only have a hammer, everything looks like a nail). Each tool should be carefully scrutinized to identify strengths and weaknesses, and we should never rely exclusively on the judgment of others to determine the value of a tool. Even the best tool is only as good as the person wielding it – a master carpenter with an old crappy hammer is still going to build better furniture than the first-time homebuyer who bought the “best” hammer at the home improvement store.
    I would also like to respond to some of the comments posted regarding the impracticality of managing an environment where each application or database had its own SQL instance. This really is dependent on a variety of factors–what “problem” you’re trying to solve with virtualization, how much preparation and planning is put into architecting the VMware environment, the design of your database, etc. This is my opinion, but it’s an informed opinion based on my own experience running a completely virtualized SQL 2008 EE distributed database environment for the past 4 years.
    Most organizations follow a common path to virtualization–dipping their toes into the water and virtualizing the least critical applications first, with server consolidation and cost savings the primary business drivers. We took a slightly different route to virtualization–we dove head-first off a cliff into the ocean! We virtualized the most complex, performance intensive workload we had—the database tier of our web-based SaaS platform—our most valuable technology asset!
    The primary drivers in our case were almost completely opposite of everyone else’s. We virtualized for performance and scalability—we didn’t care about consolidation or cost savings (okay–the CFO and finance guys cared about the money –they were in the middle of an IPO after all!) Virtualizing the database tier and switching to a distributed architecture allowed us to scale vertically and horizontally, rapidly and efficiently, and keep pace with the business’ rapid growth. No more slow, painful migrations to new physical hardware to increase capacity or performance!
    Virtualization allowed us to easily upgrade the underlying hardware, and increase the total pool of available resources without having to take the platform off-line. It allowed us to allocate additional vRAM, vCPU, or storage resources on the fly (This does require a reboot of the VM, but a virtual reboot is MUCH faster than a physical server reboot!) Admittedly, cost was NOT the primary consideration –the finance guys did set limits—we couldn’t spend money like drunken sailors in Bangkok (as guidelines go, that left a lot of room for interpretation—they really should have been more precise.)
    Each VM contains a single SQL 2008 EE instance containing a pair of user databases – one of which is an “archive” database containing data written once and then rarely ever accessed again. Each VM is configured with 8 or 12 vCPU (depending on the load on that particular server) and 56 GB RAM (48 GB allocated to SQL – being doubled to 96 GB next week). There is a 1:1 ratio of virtual to physical memory and CPU resources (using memory reservations, etc.). The SQL VMs sit on top of VMware ESXi HA clusters, and policies are configured so that there are always sufficient unallocated physical resources available on the physical hosts within a cluster to maintain the 1:1 ratio of physical to virtual resources if a VM fail-over occurs (each physical host can support at least 2 of the SQL VMs without oversubscribing memory or CPU).
    The disk space for the VMs all resides on an EMC SAN with tons of SSD & FibreChannel drives and multiple I/O paths. O/S files & application binaries, data files, transaction logs, tempdb, and pagefile are all segregated on different LUNs. The physical ESXi hosts are all connected to a fully redundant 10 GB network with dual NICs, and separate VLANs segregate data, VMware management, and vMotion traffic. Four years later, the original 16 SQL Server VMs have grown to 22. We have also completely virtualized our web and application tiers, Exchange servers, and most of our utility and other servers. There are almost no physical Windows servers left in the environment.
    We’re currently in the planning stages of the next iteration of the architecture – SQL 2014, with 256 GB of RAM per SQL instance, Cisco UCS with the new, faster processors for the ESXi host servers, and an upgrade to Cisco 7000 series switches in order to extend the virtualization to the network layer, and eventually make our D/R site the second node in an active/active, geographically load-balanced configuration, extending our HA to the site level. Exciting times!

  25. Brent, I’ve truly enjoyed this review. An additional small item to consider under the section “Data File Layout on Storage”: the TempDB file location during SAN-to-SAN replication. Normally, the granularity for replication is on a per-LUN basis. Having TempDB on its own separate LUN allows the SAN admin to ignore it during replication, decreasing replication bandwidth utilization in an environment where multiple virtual SQL Servers are spawned on a per-application-database basis. Besides, as the enterprise grows, it is nice to storage-vMotion TempDB LUNs onto their own flash/SSD RAID groups with ease (isolating database file IOs on their own RAID groups instead of sharing the same storage pool with every other server).

  26. I had a chance to read the entire book. While there are minor flaws, I think it’s a pretty good book. A DBA should never take every recommendation in a book.

    • Robert – great, thanks for your comment. I would disagree about the word “minor” obviously, given the things I covered in this review alone, but I found a lot more challenges than that. Just gotta stop and cut my losses after finding dozens of errors.
