Jason Massie and Brent Ozar work with SQL Server for a living and write blogs for fun, but that’s where the similarities end. Jason’s a fan of shared storage SANs: putting SQL Server data and logs on the same set of physical hard drives. Brent’s a cheerleader for dedicated configurations where the arrays are separated by purpose. We locked the two of them in a virtual room to see what would happen. No database administrators were harmed in the making of this article.
Brent Says: You’re crazy. Well, I mean, you’re crazy just in general – anybody who writes a comic about SQL Server needs some medication – but I mean you’re wrong on this issue. Since the dawn of time, we’ve been telling SQL Server DBAs to put their data, logs and TempDB on different arrays for a reason: they get better performance.
Jason Says: We you say “need medication” I take it you mean more medication. Anyway…. Yes, that SQL Server 6.5 book we read a decade ago is, in fact, outdated. We once thought the earth was flat and we were the center of the universe. You can now file “dedicated raid 1 for logs” along with those busted myths. That book was written back when servers had a small number of locally attached drives, and we were lucky if we got two hard drives in a mirrored pair for the logs. Today, SQL Servers are attached to huge SANs with large numbers of hard drives. Are you saying a shared piece of fifty or a hundred drives isn’t enough?
Brent Says: For small servers, you’re completely right, but let’s talk about an extreme example: established multi-terabyte data warehouses need very high I/O for long periods of time, several hours a night, during their nightly load windows. They need to load as much data as possible, as fast as possible, and these windows often coincide with the nightly backup windows of other systems on that same storage. They have dedicated DBAs who monitor performance closely, and they want to get every last dollar out of their SAN. They carefully partition the data to balance load across different arrays and make it easier to do simultaneous ETL and reporting.
Jason Says: In situations like this when money for hardware and manpower is not an object, you can just throw enough resources at it that it is hard to mess up. However, you should spend some resources on testing and make sure you’re not flushing money down the toilet. Separate drives for data, logs and ETL load files might best but having a higher total spindle count might still be better. The proof is in the test numbers, and blanket statements are not safe. Not to mention by designing for the ETL, you will have idle drives when you are not loading – and you don’t want to have expensive SAN hardware sitting idle, do you?
Brent Says: You have a point there. It’s easy to recommend dedicated drives in situations where there’s a consistently high utilization rate, like for 24×7 OLTP systems, email archiving databases or very heavily used data warehouses. It’s harder to recommend them when the database servers just aren’t being that heavily used around the clock. But talking about virtualization reminds me – I don’t want any SQL Servers sharing spindles with virtual servers, ever. Virtual servers have such a high I/O utilization already that it’s like playing with fire.
Jason Says: It’s like buying a new house – you want to check out the neighborhood first and find out what your neighbors are like. You wouldn’t build a brand new mansion in a sketchy part of town, and you wouldn’t put a data warehouse and a bunch of VMware guests on the same spindles. I would suggest going to the Mayor (SAN Admin) and demanding a performance SLA. I have found by doing that, with executive sign off, that the Mayor makes sure the streets are clean and the riff raff out of sight.
Brent Says: Getting a good performance SLA is the first step to peace and harmony in the datacenter. Data warehouses are a special case: they have lots of dedicated staff, generally experienced people, who can focus on things like IO throughput and latency. But what about the rest of the databases, like your typical OLTP or application databases? A single DBA has to manage dozens, sometimes a hundred instances. They don’t have time to benchmark.
Jason Says: Hey, just because they’re a DBA doesn’t mean they spend every weekend in the datacenter, alone with their servers. Not all of them are single.
Brent Says: Don’t be so sure. My girlfriend refused to even think about marrying me until after I got out of the production DBA on-call rotation. She says it’s a coincidence, but I dunno.
Jason Says: If you don’t have the time to benchmark, then you don’t have the time to micromanage storage performance, and you really don’t have the time to build a good estimate of how many individual drives you need for TempDB, logs, and so on to begin with. Using a shared set of drives is the easiest way to achieve high performance. If you’re not getting enough performance, add more drives, and it’ll even out. Don’t try this with, say, 6 hard drives and expect to get good results.
Brent Says: That’s the same with dedicated drive setups, too. I’ve been at shops where the developers swore SQL Server couldn’t scale – only to find out they had the data, logs and TempDB all on a handful of drives. Whether you’re using dedicated drives or shared drives, don’t expect to scale a database server on any software platform without adding storage. If you want a real eye-opener, go read the TPC benchmark records and look at the number of hard drives that were used. It’s not unusual to see drive utilization rates of 1-5% – meaning, on a 146gb drive, they’re only storing a few gigs of data. More spindles equals more speed, period.
Jason Says: Another problem is changing app requirements like adding audit trails or replication. These adding read IO to you dedicated log drives. There is also reporting mayhem(large sequential scans) being added to your OLTP app. These are the bad elements but once they are in your ‘hood, how do you move them out?
Brent Says: That’s actually a problem with some dedicated spindle SAN configurations. When each array is configured on just a few disks, management can be a nightmare. I’ve worked with one major SAN vendor who could only add drives two at a time to arrays, no matter how large the array was, and all drive management had to stop while that array was restriped – sometimes taking hours. Adding more drives for more performance to handle new load was a real pain point, and I would have gladly traded off some performance in exchange for easier management.
Jason Says: Easier management is a big selling point for shared drive configurations. When multiple servers draw out of a common pool, it’s easy to allocate space in smaller increments. You can build multiple pools with different tiers: a Solid State Disk tier, a 15k RAID 10 tier, a SATA RAID 5 tier, and so on.
Brent Says: Speaking of Solid State Disk, when SSDs go mainstream, all SANs will just use shared drive configurations. With SSDs, the random access problem goes away, because there’s no heads to move around, no platters to spin. We’ll be able to gauge SAN utilization strictly on the number of queued-up requests instead of worrying about drive latency, and then it’ll be easy to tell when we need to add more drives.
Jason Says: Are you saying that maybe then you’ll get around to admitting the earth isn’t flat and shared drive configurations are always the best?
Brent Says: Get off my lawn, whippersnapper.
Jason Says: This is me getting in the last word. ?
Updated November 2008…
Now there’s more – we revisit it by taking a look at how virtualization affects SAN configuration.
Interesting read, and I have to ask, is that King Kong Bundy? 🙂
Yes! That is King Kong Bundy.
Brent – would you and Jason like to shoot a video of this exchange, or at the minimum a podcast? I like the idea very much – could go viral.
Type: 'We you say “need medication”' should be 'When you say “need medication”'
There are other (better) options available. Unlike a flat or round world, SANS are kind of cubed. Sort of like the refridgerator under the tom cat.
Current SANs offer improved performance, less planning time, high availability and give the DBA/Storage admin a chance to “play with the mouse” (GUI) interface.
I heard Brent ask once, which disks do you put a particular SQL Server file on. I got some dog faced looks from others as I replied from the audience “All of them”. The current generation of SANs provide auto spreading, auto tuning capabilities combined with quicker self healing, thin provisioning, snapshot and replication capabilities. Selected current generation SANs offer great performance, without the need to decide where to place selected dtabase files. All of our major databases are “virtualized” (on AIX). These newer breeds of SAN support improved DR by allowing you to place “Virtualized” (AIX dynamic partitioning using POWER Hypervisor, VmWare, OracleVM, MS HyperV)workloads onto the same SAN that replicates to DR. By spreading all files across a large number of drives (Like XIV with 180 drives), your workloads are likely to perform above the level of placing them on multiple SANs, or multiple raid sets. Fast, agile, responsive, alert, independent, …. actually rather like a bunch of felines after prey.
Some of the areas mentioned in the discussion are largely overcome. Placement becomes a non issue. The time worrying about it is better spent on other tasks. Idle drives? The reduced “waste” from partially populated raid ranks from separating different types of workloads is reduced. The peak times for OLTP, DW, ETL, backup, etc frequently vary. SANs that autospread the files in small chunks across a large number of disks, continue to due that regardless of the tasks currently asking for the most I/O. With multiple SAN controller modules(more than 2) the I/O s are less of a bottleneck. The multiple CPUs and larger cache of each module serve up the more balanced load. The ability to have cache on more modules improves overall cache performance. A failed drive is rebuilt in less time and with less impact on performacne than a disk lost in a RAID rank (of equal size disk) because the second copy of the file is rebuilt across all drives not in the module with the failing disk.
With 4 different RDBMS software in house, we checked out adding SSD to a traditional SAN. We sat through a demo of SSD. Rather than 30 times the speed, we saw that the sequential “write” time was only three times faster than a single disk. With RDBMS logs being one of the bottlenecks, and only a 3 times faster write speed than disk, spreading across multiple disk backed modules should provide adequate throughput. If you have small hot indexes that might fit on a small SSD 128 to 256 GB, then SSD might be important. Our largest database had hot spots that could not fit. Rather than dedicated SSD drives, placing SSD in the cache path might be a better fit for larger databases. Particularly if this can be done for spread modules.
Today’s newer generation SANs provide the ability to get easier management “with” (rather than trading off) better performance.
This is fun … Put Brent and Jason in the steel dog crate and catch them with the occaisional feline claw between the bars as they wrestle inside the cage. Time for me to use that extra time I got from newer SAN management and do what cats do best. Purrrr—zzzzzzzzzzzzzzzzzzz
Furball – the advantages you cite only work if the company buys enough drives.
Imagine instead the following scenario: you have two departments, one run by Bernie Madoff and the other by a database administrator. They both initially need 30 drives each, so they pool their resources and buy a SAN with 60 drives. Performance is great because they follow your pooled storage recommendation.
The database administrator does a great job of pruning out historical data, watching growth, and making sure he doesn’t overwhelm the storage.
Bernie’s department, on the other hand, takes in more and more data, eventually turning into a data warehouse, but never invests a single cent in adding additional drives. They just use every single bit of available capacity on each drive. They brag to their investors – uh, I mean, users – that they haven’t had to pay anything despite their growth.
Performance will go south pretty quickly for the poor database administrator, whose performance suffers due to Bernie’s underprovisioning. This is where shared storage falls flat.
The challenges of storage administration.
First, please allow me to join you in patting your green thumb gardening DBA for all of that trimming and removing of dead wood. Actually scratch her/him between the ears for me.
If you find data on all of those BM disks, do you think they will find money? I hope so for the investors.
There are others drooling over the new SAN’s. In most shops, bringing 30 TB in as you describe above has the immediate repsonse of “I hear you rolled a new SAN with multiple terabytes…. Give me 20 TB or more”. They need a place to keep their images or MP3s. That is best corrected with a chargeback system. I would be glad to, they are $## per gigabyte, plus monthly charges. That tends to put things on appropriate media.
Just think of how many teir 1/teir 2 spindles you have in the shop today supporting critical and important applications. Most of the time, they are simply spinning without working. A number of SAN Vendors have recognized this and provided better performance to the most critical applications by spreading the I/O across many, many spindles.
I remember punching holes in cards for storage. Brent, it was hard for me to overcome my many years of “hand placing files for performance” and embrace the idea of spreading data across different types of data on many spindles. Giving the highest demand applications the higher performance. How could I think software could do it as well as I could? It wasn’t easy for this old DBA and I still have a way to go. The vendors of this new generation of SANs understand we have “Loved Ones”. If not already available, they will develop cache algorythms to favor those we love. As SSD or other large affordable memory is implemented into the cache chain, this should become more affordable and very effective.
For most shops who purchase “Best/Worst of Breed” applications, rather than develop their own, the thought of SQL Server consolidation has not been able to be realized. With the variety of virtual offerings today, at least we can finally put them into fewer servers. By using shared consolidated storage, the cost of administation and the ability to respond more quickly to changing business needs may be obtainable. Time will tell. Consolidation savings, combined with the promise of better disaster recovery, failover and snapshots make this a fun time for the SQL Server Admins in small to medium businesses.
Amazing citing I must say… Its like walking through a list of pros and cons on a topic… and what best other than two maestro’s playing it together… Thanks for putting it up.. (and hopefully I will be tuning in more timely in furture, for my better… ;-P )
I would be hard pressed to change a standard recommendation just because the SAN has improved. I still don’t manage the SAN and those who do are not necessarily looking to get the highest IOPs. An example, recently where all database files as well as all other files were spread out across all the spindles on the SAN (major vendor not named) worked fine until there were not enough IOPs left due to the rest of the enterprise selling space not IOPs. You will generally run out of IOPs on these types of systems and still have massive amounts of usable space leftover.
Keith – when you say you wouldn’t change your recommendations just because storage improves, does that also mean you wouldn’t change your storage best practices when using SSDs? That seems surprising.
Brent your girl friend is right.I wish i had studied in medicine.I have started learning SQL since it was in its 6.5 version.now i think it was all waste of time .I would have been happier if I had spent my time on another subject!please do not remove my comment maybe it help someone to find a better way.I am always worry about the future if i loose my job ( for any reason such as economic and the like),I will not be able to find a proper job soon as the vast majority of employers think younger DBA are better than old DBA.and that’s a harsh truth
Just an FYI. i clicked on the Comic about SQL Server link and got a junk landing page. Forefront also disabled a script from running that it thought was trying to infect my computer.
Michael – thanks for the heads up! Jason let his domain expire and someone else got it, so I removed the link.