Secret New SQL Server Denali Feature

You heard it here first, folks – I’ve got the scoop on what might be the most exciting new feature in Microsoft SQL Server Denali.  I hope you’re sitting down.

SQL Server 2008 R2 introduced data-tier applications (DACs) – packaged databases that could be deployed on Azure or full-blown SQL Server.  The initial idea was that databases could be moved from place to place and upgraded from afar.  At the time, I wrote that while the initial version wasn’t worth exploring, future versions could bring us virtualization for databases.

SQL Server Denali’s new contained databases seemed interesting at first, but like the DAC packs, were more of a down payment than an actual deliverable.  For example, they don’t really separate out TempDB per contained database – sure, they create objects in the right collation to avoid join problems, but if you’ve got one poorly-behaving app that abuses the buffer pool or TempDB, you’re still screwed.  These databases are contained in much the same way that velociraptors were contained in Jurassic Park.

Fun for the whole family. Especially your ex.

But hold on to your butts.

Enter the Data Director

What if you had a console that let you create or deploy contained databases that were really contained – not by deploying them on an existing server, but by creating a new virtual machine for each individual database?

That day is here.

With Windows Core, we’ve finally got lightweight virtual machines that can be completely locked down and managed.  With Hyper-V, we’ve got the ability to light up VMs quickly and easily via an API, which means we can do it inside SQL Server Management Studio.  Now, when you deploy a database, you get to pick how many CPUs it gets, how much memory it gets, and what tier of storage it gets.

It’s hard to guess the number of CPUs and amount of memory, though.  Project managers lie about schedules and user counts.  Developers lie about their code being optimized.  New hardware comes in and we have to move things around.  Fortunately, we can change these numbers on the fly: SQL Server’s hot-add CPU and memory capabilities haven’t seen much real-world adoption yet, but virtualization makes them a no-brainer.  Change the dropdown for the number of CPUs and memory, and the virtual hardware is instantly added through the hypervisor, recognized by the OS, and added to SQL Server as well.
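To make the mechanics concrete, here’s a rough sketch – my own T-SQL, not anything the console ships – of how you’d nudge SQL Server into using hot-added hardware once the hypervisor and Windows have recognized it.  The 24000 MB figure is just an example value for a resized guest.

    -- Tell SQL Server to create schedulers on the newly hot-added CPUs
    RECONFIGURE;

    -- Raise the memory ceiling so the buffer pool can grow into the new RAM
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'max server memory (MB)', 24000;  -- example value only
    RECONFIGURE;

    -- Verify what SQL Server can see now
    SELECT cpu_count, scheduler_count
    FROM sys.dm_os_sys_info;

In the console, of course, the dropdown takes care of all that for you.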

Denali’s new AlwaysOn Availability Groups add the ability to scale out to multiple replicas for more read performance and easier disaster recovery.  It’s scriptable, so you know what that means – yep, just pick the number of additional replicas you want, and the console takes care of the rest, spinning up additional VMs for you and configuring the scale-out.
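Here’s roughly what that generated script might look like – a sketch on my part, with hypothetical server names, database name, and endpoint URLs, and skipping the endpoint creation and the step of joining the secondaries:

    CREATE AVAILABILITY GROUP [SalesAG]
    FOR DATABASE [SalesDB]
    REPLICA ON
        N'SQLVM01' WITH (
            ENDPOINT_URL = N'TCP://SQLVM01.corp.local:5022',
            AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
            FAILOVER_MODE = AUTOMATIC),
        N'SQLVM02' WITH (
            ENDPOINT_URL = N'TCP://SQLVM02.corp.local:5022',
            AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
            FAILOVER_MODE = MANUAL,
            SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));  -- readable replica

Pick three replicas instead of one, and the console just generates more REPLICA ON entries and spins up the matching VMs.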

Backups?  Not only can we take full backups of the database, we can take snapshot backups of the VM host too.  We can use storage replication (built into the hypervisor, no matter what storage we’re using) to seamlessly replicate the entire server from our production datacenter over to a disaster recovery datacenter without the hassles of mirroring, log shipping, or replication.  Just check a box, and it’s taken care of.  All of this integrates with Policy-Based Management – set a policy for production, and all of the new production-class databases you create will inherit this policy.

That license really has teeth.

OPEN SOURCE IS COMING FOR YOU!

One of the reasons we need those backups is to restore – whether it’s to development, or to test a new version of SQL Server.  With this new feature set, you can simply restore to a new database server name in a matter of seconds thanks to virtualization snapshots.  This means when you need to test a new version of Linux, you…

Oh, wait, you caught me.

VMware Killed the DBA Star

This is going to be a hard paragraph for you to read, but here goes.  Data Director isn’t a feature of SQL Server Denali.  It’s VMware vFabric Data Director.  And, uh, it’s for Postgres, not SQL Server.  And it might be cheaper than SQL Server Standard Edition for some companies.  Here’s a demo video:

I KNOW, right?  I shook my head when Microsoft introduced the DAC Pack two years ago, I shook my head at Denali’s contained databases, but my floor shook when I saw what a virtualization vendor managed to pull off in Version 1 of their database appliance.  This looks fantastic for run-of-the-mill infrastructure databases.

I know what you’re thinking: who wants one OS per database?  Infrastructure managers, that’s who.  They want to avoid the hassles of databases stepping on each other just like you do, and they don’t mind throwing hardware at the problem.  Hardware is cheap – especially compared to salaries.  Why not throw another blade in whenever we add another dozen databases?  Let VMware manage the load by moving things around automatically.

If you’re a DBA, and you’re not learning about the cloud – whether it’s public clouds like SQL Azure or private clouds like VMware vSphere – you’re never going to see your career shift coming.  And believe me, it’s coming – not this year, maybe not next year, but if you wait until it’s a no-brainer for the CIO to deploy it, then it’s going to be a no-brainer for him to let you go and hire someone who understands these new technologies.

And the dinosaur’s gonna be you.

  1. Nice post, but have you by chance heard of the HP appliance known as the Database Consolidation Appliance? http://h30507.www3.hp.com/t5/Converged-Infrastructure/New-HP-Database-Consolidation-Solution-eliminates-SQL-database/ba-p/93527

    Given this post I may have to write a post of my own about the appliance; it’s definitely interesting, and I look forward to sitting down with you at PASS and talking about it. :-D

    • Jorge – thanks! Yeah, I got a preview of that and I just didn’t buy it. It’s got proprietary written all over it: it’s locking companies into one make/model of server, and one database platform. Given HP’s abandonment of PolyServe (and insert TouchPad joke here), I can’t imagine betting that much on yet another proprietary HP solution that doesn’t really add much value over regular virtualization.

  2. Sounds like this is for cloud providers who manage a lot of customers’ databases. Either with tight security or with multiple instances.

    We have VMware where I work, and you are still bound by hardware, license costs for each new host, DR, etc. The cloud is not as elastic as it’s hyped to be.

  3. So in essence, is this just a pre-built and configured VM with Postgres on it? Couldn’t you achieve the same thing with a scripted install and VirtualBox, Hyper-V, etc.? Not arguing, just trying to understand. This doesn’t look that different to me from a VM with SQL Server on it. I see those all the time.

    • Sure, great question. There are several differentiating features here, but you have to know the whole VMware stack to get how big this is. Here are a few:

      Automatic storage policies and tiering: starting with VMware vSphere 5, I can specify response time policies for storage, and put VMs in different tiers. When storage begins underperforming for a particular guest (which also means when the SQL load takes off), it can be automatically moved to faster storage in real time while the guest is up. The online storage migration has been around since vSphere 4, but 5 makes it automated.

      Hot-add CPU and memory – Postgres has struggled with this in the past, and EMC’s improved the source code to make it work. If I want to hot-add CPUs and memory to a SQL Server VM guest, and I expect it all to respond smoothly on the fly, that doesn’t work particularly well. I can reboot the guest and it’ll reconfigure, but not on demand.

      Policy-based backups – don’t even get me started on how incredibly difficult it is to manage SQL Server backups at scale. I’m personally frustrated that in the year 2011, I have to turn to a third party vendor to be able to set policies across my enterprise to get my dang databases backed up. That’s server management 101.
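      To give you an idea, the do-it-yourself version is something like this – a quick sketch of my own, not any vendor’s tooling, with a made-up backup share path:

        -- Build one BACKUP statement per online user database, then run them
        DECLARE @sql nvarchar(max) = N'';
        SELECT @sql = @sql + N'BACKUP DATABASE ' + QUOTENAME(name)
            + N' TO DISK = N''\\backupshare\sql\' + name + N'.bak'''
            + N' WITH COMPRESSION, CHECKSUM;' + NCHAR(13)
        FROM sys.databases
        WHERE database_id > 4            -- user databases only
          AND state_desc = N'ONLINE';
        EXEC sys.sp_executesql @sql;

      And that’s before you even get to differentials, log backups, retention, or verifying the files actually restore – which is exactly the policy layer I want the platform to own.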

  4. Ah – that helps. I know the System Center folks have a “Cloud in a Box” offering themselves that does some of this, but I’m not as familiar with that, being in the Public Cloud space.

    So it’s the OS and management that you like about this, rather than the actual SQL Server platform. That makes sense.

    • Yeah, it’s the service angle. This turns the database into a more manageable service. When you use the SQLfire database provider, it even becomes a full-blown private cloud service, scaling and sharding automatically as you add more VMs. That’s gorgeous.

      Now let’s see if it actually works when it comes out, heh.

  5. Are there any successful deployments of VLDB SQL Servers on VMware? I know you can do one OS instance per host with a DR server somewhere, but has anyone done it dynamically in VMware – where you just put OS instances hosting 200GB or larger databases in the cluster and let VMware figure things out?

    We just replaced some SQL Servers with ProLiant G7s with 72GB of RAM and saw some big improvements in performance. We weren’t hurting before, but any improvement is nice.

    • Alen – yep, I’ve got clients doing >1TB databases in vSphere, and one’s actually got a >10TB database in there. Performance tuning is critical on a database of that size regardless of whether it’s physical or virtual, of course.

  6. How do they DR it? Is the data on a SAN, or do they use something like Veeam?

  7. And if you have to do all kinds of tuning to assign resources to it, what’s the point of VMware for it then, except for DR?

    • I’d always recommend using shared storage with virtualization. That’s kind of the whole point – freedom from host failure. Besides, you don’t want to back up >1TB via traditional methods – you want to do tricks like SAN snapshots, and then present those snaps to whatever’s doing the tape backups.

      One point of virtualizing those boxes is that they tend to have very bursty resource utilization. They get hit hard during nightly ETL, then they sit idle unless someone’s running a report. CIOs don’t like to see big hardware sitting around idle.

  8. We are in the process of deploying a new Cognos server and went with a standalone server. We looked at VMware, but IBM said you had to have dedicated hardware to run it on VMware. I think the exact wording was “dedicated resources for the time it needs them.” The way it’s coded, if you let VMware manage it all and it asks for more resources, then it will start a paging storm.

    My point is that in a lot of these cases you have to dedicate resources that will sit idle either way, so what does it matter whether they are on VMware or a physical server? Current hardware is dirt cheap these days and very power- and heat-efficient.

    • Alen – ah, if IBM’s telling you that you have to have dedicated hardware to virtualize something, then that takes away all the advantages.

      vSphere 4’s DRS eliminates the “paging storm” problem. It learns your app’s load patterns. If you do a large amount of work every night at 11PM, for example, it’ll vMotion all of the other guests off your hardware a few minutes early, and then you’ve got dedicated hardware when you need it. When you add newer/faster hardware into your VMware cluster, vSphere automatically decides which guests could make the best use of that hardware, and rebalances the loads for you. Your guests move around from generation to generation of hardware with no reinstallation required.

      Like you point out, new hardware is dirt cheap, and that’s actually an argument FOR virtualization. Why not take advantage of those new processors as soon as they come out? Just toss a few new machines into the cluster, evacuate a few old ones, and your whole cluster is faster without reinstalling any operating systems. Try that with bare metal. ;-)

  9. We actually thought about it, but then it would cost another $10,000 for a VMware server license and more storage on our EMC SAN. And we DBAs were worried about I/O contention. We have 2 servers on tier 2 or 3 storage on our EMC that don’t perform as well as those with faster storage. Same thing: BI servers with a lot of burst traffic.

    Before the week is out I actually have to tweak my NetBackup schedules to separate the backups of 2 SQL Servers in a cluster to different times. I’ve noticed that when they back up at the same time, it’s always slower than at different times.

    • Alen – absolutely, if your current storage or servers are underperforming, then you won’t want to use virtualization.

      About the backups – that’s much easier to handle in virtualization since we can coordinate backups at the host level, doing each guest serially, or capture SAN snapshot backups in a matter of seconds.

  10. Hi Brent – given your extensive experience, and the limited resources newbies to cloud technologies may have in the workplace, what in your opinion would be the best way to find affordable resources and learn/get hands-on with these new technologies?

    Much appreciated.

    • Buis – right now, because the technologies are changing so fast, the best way to find resources is to work directly with the vendors. They have developer programs to help you get started. Read their marketing material, read their getting-started documentation, and see which one calls to you most with its features and limitations. Then dig deeper into that particular platform.

  11. Pingback: Something for the Weekend – SQL Server Links 02/09/11

  12. While I understand the excitement around the “Data Director”, I think the title of the post is very misleading as it is titled “Secret New SQL Server Denali Feature”, which it certainly isn’t.

    I’d gladly hear more details on why you shook your head at Data Tier Applications and Contained Databases.
    Other than that – great post, would love to see it working.

    • Dandy – yep, the title of the post is misleading on purpose. Sorry if my humor didn’t translate well, but I was leading the reader into thinking Denali had something exciting when in fact that cool feature is provided by a competitor. (Not that Denali doesn’t have exciting stuff – I’m quite looking forward to it when it ships.)

      If you click on the DAC links in the post, I’ve written several posts about why I wasn’t impressed with Data Tier Applications when they first shipped. I haven’t heard of any improvements to those in Denali yet, oddly.
