The old advice went something like this: “Put your data and log files on separate drives and your server will be more reliable. If you lose the data drive, you can still do a tail-of-the-log backup, and you won’t lose any data.”
It’s advice. But is it actually good advice?
Let’s think through it.
- If SQL Server loses connection to shared storage, you’re still screwed. No surprise there.
- If it loses connection to just one volume, and it happens to be the log file volume…you’re still screwed.
- But if it happens to lose connection to just your data file volume, you’re safe! Well, you’re down, but you didn’t have any data loss (assuming you know how to do a tail-of-the-log backup).
At first, that sounds like you’ve cut your risks by 50% – but let’s dig deeper. This scenario correctly assumes that a single volume can fail. I’ve certainly had that happen:
- A SAN admin accidentally remapped one of my volumes to a different server
- A messed-up snapshot ran out of space (the SAN admin had accidentally taken a snapshot of one of my server’s volumes)
- A RAID array became corrupt
I can’t even begin to estimate how often these things happen, so just to pick a number, let’s say any given volume has a one-year mean time between failures.
So it’s time for a quiz:
- If you put all of a SQL Server’s data files & logs on a single volume, how many failures will that server experience per year?
- Bonus question: what kinds of data loss and downtime will each of those failure(s) have?
- If you split a SQL Server’s data files onto one volume, and log files onto another volume, how many failures will that server experience per year?
- Bonus question: what kinds of data loss and downtime will each of those failures have?
Think carefully about the answers – or read the comments to see someone else’s homework, hahaha – before you move on.
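If you want to check your homework before peeking at the comments, here’s a quick Monte Carlo sketch. (This is an illustrative aside, not part of the original quiz: it assumes each volume independently has a 1% chance of failing in a given year – a stand-in number chosen only to keep the output readable.)

```python
import random

random.seed(42)  # reproducible runs

def simulate(split, years=200_000, p=0.01):
    """Estimate annual downtime and data-loss rates.

    split=False: data + log share one volume.
    split=True:  data on one volume, log on another.
    p is each volume's (illustrative) annual failure probability.
    """
    downtime = data_loss = 0
    for _ in range(years):
        if split:
            data_fails = random.random() < p
            log_fails = random.random() < p
            if data_fails or log_fails:
                downtime += 1        # either volume failing takes you down
            if log_fails:
                data_loss += 1       # log gone: no tail-of-the-log backup
        else:
            if random.random() < p:  # the single volume failed
                downtime += 1
                data_loss += 1
    return downtime / years, data_loss / years

print("one volume:  downtime %.4f, data loss %.4f" % simulate(False))
print("two volumes: downtime %.4f, data loss %.4f" % simulate(True))
```

The punchline the simulation surfaces: splitting roughly doubles the downtime rate while leaving the data-loss rate unchanged, because the log volume is a single point of failure either way.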
“I disagree with your volume failure rate idea.”
I hear some of you saying, “Wait – I believe failure rates are not tied to volumes. It’s not that each volume can fail – it’s that a server will have a failure rate. I believe a server will lose a volume once a year.”
OK, hotshot, let’s say that once a year (again, just picking a number), your server will lose one of its volumes. In that case, which design would give you the least data loss and downtime?
- Just 1 volume, with both your data & logs on it
- 2 volumes, 1 with data files and 1 with logs
- 10 volumes, 9 with data and 1 with logs
- 100 volumes, 98 of which are empty, then 1 with data and 1 with logs
If you’re arguing for answer #2, keep in mind that when your server has its annual volume failure, you stand a 100% chance of downtime and a 50% chance of data loss.
Whereas #4 has a 2% chance of downtime and a 1% chance of data loss. Brilliant! 1,000 empty volumes probably equals five 9s of uptime, woohoo!
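Those percentages fall straight out of the “the server loses exactly one volume per year” model. Here’s a quick sketch of that model (the assignment of which volume holds what is my own arbitrary choice for illustration):

```python
import random

random.seed(7)

def server_model(n_volumes, trials=100_000):
    """'The server loses one volume per year' model: each year, one of
    the n volumes is picked uniformly at random to fail. The last two
    volumes hold the data files and the log; any others are empty padding."""
    data, log = n_volumes - 2, n_volumes - 1
    downtime = data_loss = 0
    for _ in range(trials):
        victim = random.randrange(n_volumes)
        if victim in (data, log):
            downtime += 1    # a volume SQL Server actually needs is gone
        if victim == log:
            data_loss += 1   # log gone: no tail-of-the-log backup
    return downtime / trials, data_loss / trials

print("  2 volumes:", server_model(2))    # ~100% downtime, ~50% data loss
print("100 volumes:", server_model(100))  # ~2% downtime, ~1% data loss
```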
Except now you stand a 100% chance of getting a bottle to the face from your storage admin. Ask for this configuration, and they’ll be happy to explain why adding more empty volumes to your server doesn’t magically protect any valuable volume.
If you still have a single point of failure in your log file volume, adding other volumes to do other things doesn’t help you. You’re just adding more single points of failure with their own failure rates.
On a related note, check out my 2009 post, Adding Reliability to Your Infrastructure.
Probability is hard and often counter-intuitive 🙂
With only one hard drive, the AFR (Annualized Failure Rate – the probability that the drive will fail in a given year) is, let’s say, 1% (you can check the AFR in the hard drive’s specs).
Adding a second drive, you’re adding a second point of failure with a probability equal to the first one’s, so they sum up, reaching 2% (for the probability that one drive fails during the year).
In the 1st case, you have 1% probability for downtime and total data loss
In the 2nd case, you have a 2% probability for downtime, with a 50% chance of data loss (if it hits the Log Volume), so a 1% probability for total data loss.
That’s why a single hard drive at home fails rarely, why having a RAID 0 of many drives is a bad idea and why at the office, with hundreds of drives around, you’ll always see at least one or two get fried every year.
Emanuele – very good! You passed the test. Good job! (That’s also why I gave the one-failure-per-year sample number to start with, just to keep the math easy.)
The odds don’t sum up. The odds of one or more failing would be one minus the odds that neither fails. That would be 1 – (.99 * .99), or .0199, a.k.a. 1.99%. That pesky .01% difference!
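A quick sanity check of that arithmetic, for anyone following along:

```python
p = 0.01                  # each volume's annual failure probability
summed = 2 * p            # the naive "they sum up" estimate
exact = 1 - (1 - p) ** 2  # 1 minus the odds that neither fails
print(summed, exact)      # 0.02 vs ~0.0199
```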
Just so other readers are clear: you’re agreeing that the odds of failure are higher when the database relies on two volumes being up, rather than just one.
I think the real question is what is the probability during the year that you have to execute some type of data recovery. That is what this question is asking. Then next issue is does your backup/recovery model make the loss of data minimal no matter which drive fails.
Richard – yep, this is what high availability and disaster recovery is all about: talking to the business about the downtime & data loss they’re willing to accept, and then designing a solution to match that.
I only recommend splitting the data and logs for performance reasons. DR is DR, and that’s what backups (and testing restores) are for. If they have lots of spindles or flash storage, put everything on one volume – honey badger don’t care. Just know that the RTO and RPO are what matter, and those are what define the technical implementation.
Randolph – innnnnnteresting. And why do you recommend splitting them for performance reasons?
My client base is small to medium business. In many cases, they’re putting everything on a single VM with minimal RAM and 7200rpm drive, maybe doing RAID 5, and all on the system drive, sharing with the page file. Latency is awful. So even moving tempdb to its own drive makes a difference. I explain how logs have sequential writes, etc., and it can make the world of difference even switching to RAID 10 from RAID 5.
Gotcha. The 7200rpm magnetic hard drive thing is tough to understand in this day and age, though, when 1TB SSDs are under $500: http://www.anandtech.com/show/9799/best-ssds
Oh, I know that. You know that. AnandTech knows that. “Get faster drives” is in my recommendations too, but people don’t like change. These are the same class of folks running entire businesses on Windows 98 desktops with no backups. If it ain’t broke… Most recently I had to explain why Express Edition isn’t appropriate for a production environment due to AutoClose. They see the problem, they see it’s slow – even IntelliSense causes hanging – but $x000 is too much for a licence because “free is better”. Now sell those folks a 1TB drive. When they finally have data loss and no backups, that’s when the chequebook comes out. Until then, they don’t like change. Some folks listen, some folks don’t. They can understand moving a database file to another drive, but not replacing the drive. It’s “too expensive”. ¯\_(ツ)_/¯
Maybe in the USA they’re that price (and certainly MUCH cheaper now), but here in Australia – especially where businesses set up “preferred supplier” agreements, so much of the time you can only purchase through a third-party reseller – a 1TB SSD for a server goes from that few hundred dollars to a few thousand, and is then rejected by the business as an “unnecessary expense”.
I’m with Randal – I always thought the separate-drives advice was based on performance. I had never tied separate drives to reliability advice. I personally have seen significant improvements after separating data and logs onto different drives (but that was a while ago, and I believe magnetic drives were involved). So who knows – maybe it’s time to do another test, performance-wise.
*Randolph (so so sorry)
Oh absolutely – I’ve seen improvements on it too for single-database workloads, dedicated drives, or storage with the ability to tweak caching settings based on volume. (For example, different read/write cache allocations for the data volume vs the log volume.) The catch is that you have to actually identify those scenarios. Otherwise, if you just take a 20-database workload and split the data and log volumes, keeping the same settings…
You’re fired, Marcel.
We have 200 databases on 1 server. We put the MDF and LDF files on separate drives for performance, and our company isn’t switching to SSD.
Just having some fun with this.
Number 4 still has the same chance of data loss. You can’t say “if one of the hundred goes down, I’m good for another year” – it assumes you’re going to have a fairer chance that it won’t pick the one(s) that are important. Murphy was too much of an optimist, so I’m not buying it. That drive (or two) would get more use anyway if it’s spinning rust. Why not simply mirror 49 logs and 49 data files in #4? Or 49 mirrored or weplicated sites? Aside from the write penalty, that is. Of course I’m taking it to extremes – that’s half the fun.
Going Full Elmer, for Fuddsake!
Cujo – if only SQL Server had the ability to mirror logs – which it doesn’t. Wherever your log file is, that’s a SQL Server single point of failure. (Until you implement things like mirroring or AGs to a different SQL Server.)
Brent, SQL may not be able to mirror logs, but if you’ve got a controller that can handle 100 drives, likely the controller could.
Set up a massive RAID-1 array consisting of 49 drives and Bob’s your uncle.
Of course, now you’re paying the cost of 49 drives but only getting the capacity of 1 drive…
Jason – yep, except that single array can fail. (Hoo, boy, I could tell you stories about RAID controllers failing and killing volumes.)
‘disk mirror’ was available up to 6.5. Microsoft removed it for 7.0 recommending using the OS mirroring instead. Up till then one could mirror the transaction log devices.
Thanks, Grandpa! 😉
you’re welcome sonny. LOL they call it progress 🙂
It’s still available in Sybase. Quite useful for storage migrations according to the guy sitting next to me.
@James, yes it is, one can mirror a device, make new one the prime, break the mirror, then drop original, without having to stop the database server.
To be very honest, I am shocked. My knowledge may be a little stale. Until now I’ve emphasized putting data and log files on separate drives in my shop, even though we use a SAN. Brent, are you saying this has no performance benefit? All that random and sequential access theory we learned in the past…????
Kenneth – performance is a tricky discussion too. You were taught that log file access is sequential, so it should be put on a separate volume, right?
Think through these questions:
1. How many databases do you have on the server?
2. Do you take turns using them, or are they all active at once?
3. Do you also have log backups happening?
4. Now given those answers, is your log volume access REALLY sequential?
5. And even if it is, are the underlying drives in that volume really dedicated? Or are you using a big pool of drives that lots of servers are accessing simultaneously, doing different things?
Phew… disruption, lol.
Kenneth, this isn’t a discussion about performance, it’s about reliability. And with faster storage layers these days, it matters less than it used to.
here’s what we do:
1. ~30 databases of 2-8 GB each
2. multiple users accessing all of them a lot in both query and updates all the time
3. a Primary 8 core Server with SQL 2008 Standard and DB and 10 Logfiles on the same drive
4. Raid 10 on 4x600GB 15K RPM SAS with 1TB total
5. a Backup Server exact replica of the primary server
6. Nightly full backup from Primary to Backup Server
7. Log backups every 1/2 hour from Primary to Backup Server
8. Restore all DB’s every night to verify a complete Backup/Restore process.
9. We can restore and switch over to the backup server in under 10 minutes.
10. We also move the full backups offsite to Box every night for protection against a facility meltdown.
Obviously we have a pool of drives with the Raid 10 where all the log files
are on the same drive.
We seem to get very good performance and are well protected too.
A very effective argument for AlwaysOn Availability Groups, database mirroring, synchronised replicas, etc.
If I lose a drive, that’s alright, got another on my secondary server. Lose a server, that’s OK, got another server.
The issue of having the data and logs on different drives now becomes a question of speed and efficiency.
This is the problem you pose to your ICT and finance managers. Then ask them how much the downtime will cost for the once-a-year failure. If it’s more than the cost of a duplicate server, then you’re on a winner.
Graeme – exactly. Making redundant copies of your data (via mirroring, AGs, etc) can add reliability. Adding more single points of failure does not.
But about speed and efficiency – make sure you read the other comments and start thinking through those questions. There are some surprises in there.
Wasn’t the separation of data and log file volumes about better performance rather than failure? Nobody mentioned anything about that. I haven’t read all the comments though. :o)
Believe it or not, there’s blog posts out there that actually recommend separating data and logs onto different volumes for better recoverability – not understanding that by adding volumes, they’re causing the very outages they’re touting recoverability for. (sigh)
You implement separate OS, Data, Transaction Logs, TempDB, and sometimes ETL volumes for performance, reliability, *and* monitoring.
I’ve gone to client’s aid that did not consistently separate Data and Transaction Log volumes, and TempDB for that matter. This is especially dangerous if you do not have consistent transaction log backups running. Often times a runaway transaction log was causing downtime in production because a data file, or another TLog file could not expand/auto grow. I’ve also seen poorly placed TempDB files on the C: drive or the TLog drive, and overzealous use of TempDB, take a server down.
It helps monitoring because you know what the purpose of the volume is, and allows you to predict and monitor the data access patterns, i.e. Sequential vs Random, Read vs Write, *and* predict and analyze storage growth more easily which can also affect reliability.
Jason – I’m a little confused. You’re saying that by putting data, logs, tempdb, and ETL on separate volumes, they won’t run out of space?
You’re not saying that, right? Can you rephrase how separating things on different volumes makes sure they can’t possibly run out of space?
Brent – No, I’m not saying that at all. A logical design doesn’t prevent anything, except maybe the chaos and confusion that occurs when people follow this “advice” and store everything on a single volume. I’m saying that by separating these artifacts, you make it easy to monitor, predict, and respond to problems – for example, running out of disk space, or poor performance in the storage subsystem. If a TLog volume is sending alerts of any kind, it’s pretty obvious where to look for the root cause of a problem. With what you’re recommending, not so much.
What’s next, Brent? Should we create all of our tables in a single database because, like, RAID controllers fail and stuff? I think you’re wrong here, and that really surprises me.
No offense though. Love you! XOXOXO
Jason – again, I’m going to push you on this:
> I’m saying that by separating these artifacts you make it easy to monitor, predict and respond to problems, like for example running out of disk space
You’re saying that it’s easier to monitor three volumes’ disk space instead of one?
> If a TLog volume is sending alerts of any kind, it’s pretty obvious where to look for the root cause of a problem.
Whereas if you only had one volume sending out alerts of low drive space, you’d have a harder time? Do you really start troubleshooting by looking at drive space metrics?
I’m really open to hearing more about this, but…you’re gonna have to bring much better answers than that.
Jason – also, just to be clear, if you’re doing it for ease-of-use, I’m 100% okay with that. I just don’t want anyone thinking their database servers are going to be more reliable solely because they put the files on different volumes.
You’re being a bit unfair. Let’s give the people as much credit as possible when they say, “Wait – I believe failure rates are not tied to volumes. It’s not that each volume can fail – it’s that a server will have a failure rate. I believe a server will lose a volume once a year.”
If we assume those people, rather than being crazy, are assuming volume failure rate is based on IO amount, then scenario 2 and 4 are identical, and both are safer than scenario 1.
In fact, the usage rate of a drive does impact its failure rate…but I doubt it’s enough of a factor to justify trying to improve reliability by splitting the data and logs. Also, anyone who wants to improve reliability by buying more drives should maybe consider RAID 1 instead. My personal experience has been that RAID volumes prefer to fail simultaneously in order to guarantee data loss.
Unfortunately, volume failure rate isn’t solely based on IO. In the examples I cited in the post, the volumes would fail regardless of IO volume.
This has gone to the top of my bookmark list.
I’ve been looking for something that negates the whole “separate volumes for performance” mantra.
When storing on shared storage, that is genuinely shared with other servers/processes, I’ve really not been able to come up with any benefits to splitting volumes, but constantly have trouble changing people’s opinions, and therefore start doubting myself.
The idea that adding extra volumes also increases the chance of failure is so obvious that it really hadn’t occurred to me.
Steve – thanks, glad you enjoyed it!
Delighted with most comments, but it seems there’s still confusion between the concept of “volume” and “physical disk”. NOT the same thing. 😉
Yeah, which is even more of an eye-opener, if we’re honest: if you don’t understand the difference between volume, array, and physical disk, then you should leave this kind of architecture to your storage admins.
I think some of the terminology may be getting in the way of some of the thinking. (System v. server, volume v. array v. lun v. etc.) The reliability of any “system” is a function of the reliability of each of the parts of the system, and there are robust ways of calculating such things if you know whether you have serial or parallel reliability relationships. I had to do such calculations in my satellite design class in college. Here’s what looks like a decent site on the topic if you really want to go down that rabbit hole: http://reliawiki.org/index.php/RBDs_and_Analytical_System_Reliability
I tend to agree with most of the spirit of the post though, regardless of numbers. If you’re dealing with a SAN, as a DBA you likely don’t have much of a voice, you’ll get what they give you, whether you like it or not. And it’s abstracted enough that if a hardware component fails you likely won’t be impacted beyond temporary slowness while the spare kicks in, unless you start getting nasty stuff like lun or volume corruption, (which was the cause of a really long weekend for me about 10 years ago.)
If you’re dealing with local storage, then you may want to do some of that back of the envelope number crunching. If 1 vol is 99% reliable, as is a second, but the system is down if either fails, then you consider that a serial relationship, so the reliability is not 99%, but (.99)(.99) = .9801 or 98%, not 99%. But are you then focusing on the number of volumes when maybe you should be looking at the difference in RAID levels? Is it really the “volume” that fails or the underlying array? Then again, how many DBAs get a chance to configure hardware like that? I’m not sure, but at my last place I was lucky enough to be able to work together with my server and storage admins on such things, though the goal was performance and space maximization more than reliability.
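That serial-versus-parallel framing boils down to two one-liners. A sketch using the 99%-reliable-volume number from the comment above:

```python
def series(*components):
    """Serial reliability: the system works only if EVERY component works
    (two single points of failure, e.g. a data volume plus a log volume)."""
    r = 1.0
    for c in components:
        r *= c
    return r

def parallel(*components):
    """Parallel reliability: the system works if AT LEAST ONE component
    works (true redundancy, e.g. mirroring or an availability group)."""
    f = 1.0
    for c in components:
        f *= (1 - c)
    return 1 - f

print(series(0.99, 0.99))    # ~0.9801 – two SPOFs are LESS reliable than one
print(parallel(0.99, 0.99))  # ~0.9999 – redundant copies are MORE reliable
```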
Chuck – thanks for the thoughtful note. And yep, totally agree about the serial vs parallel reliability relationships. That was the big eye-opener for me: using two single points of failure (a data volume and a log volume) don’t make your system MORE reliable, they make it LESS reliable. That was pretty counter-intuitive at first.
The most mind-blowing part for me was that if you put your data onto a separate single point of failure, and then it fails, you might think, “See, I saved myself by having my logs on a separate volume!” When in reality, you were the one who caused the data volume loss, basically, and the downtime is your own design’s doggone fault.
This type of articles just confuse less technical proficient people. What a waste of time and dangerous advice. Also.. loosing the access to the log drive (even without AG) doesn’t mean there is loss of data just service interruption (if you adopt the right measures at the right time) only the in-flight transactions will be in lost which is not a problem since the client will be notified of the failure.. Try to read a little more documentation other than the Microsoft Docs or your poor articles..
Self – generally speaking, if you lose access to your volumes, there’s a pretty good chance of either corruption, or someone else gaining access to your volumes. Make sure you read the post carefully about the scenarios where I lost access to my volumes – for example, when they were mapped to another server, which proceeded to initialize & format them for use (because they assumed those were their volumes) or when the volume was corrupted.
Thanks for stopping by, though, self!
Heh, he said “loosing”
I agree with your reasoning about why the specific advice you quoted doesn’t make sense anymore. However, my main concern with putting everything on one volume relates to running out of free storage, which is a much more common problem in my experience than someone messing up a SAN volume.
Here’s why I think splitting the same amount of storage over multiple volumes can improve reliability. On a few occasions I’ve seen databases become corrupt (i.e. wouldn’t finish crash recovery and transition to an online state even after space was recovered) after a volume holding data files filled up. I’ve yet to see this happen due to logs running out of space. Most rapid onset out-of-space issues I’ve seen are caused by tempdb or log growth, so I think it can be useful to segregate transaction logs onto separate volumes. It also can reduce the surface area of what’s broken in SQL Server if a log fills a disk — databases that don’t have full logs will still be taking writes, tempdb will be OK — whereas if a log fills a disk and everything is on that disk, any file that needs to grow won’t be able to take writes.
It’s true that you could set autogrowth and maximum file sizes so that you won’t run into such issues, but if you’re managing many databases with unpredictable data growth patterns it’s likely that you’re as likely to cause problems (by setting the limits too low in some places) as you are to prevent them.
James – gotcha. You’d rather have 100GB of free space on two volumes than have 200GB of free space on one volume. I’m the opposite: TempDB aside, I’d rather have less volumes to manage and more free space per volume. For me, that’s been a better solution to avoid the “oh no I’m out of space again” issue. Heck, you even say it yourself in your last line:
“it’s likely that you’re as likely to cause problems (by setting the limits too low in some places) as you are to prevent them.”
You’re referring to file sizes – but the exact same concept holds true for volumes as well. Give yourself some breathing room. (And of course, the best solution is storage gear that automatically grows volumes as you approach capacity limits, but that’s another story.)
It’s more that I’d accept increasing the probability of a bad problem (a log filling up) to reduce the probability of a much worse problem (a corrupted data file after the data volume fills up). But once you introduce the fact that using separate data and log volumes makes it more likely that either volume will fill up, it gets difficult to tell whether this practice is helping and the simplicity and flexibility of a single volume looks more appealing.
I think there is a lot of motivated reasoning on this topic and people (probably myself included) are trying too hard to defend this legacy practice.
Yeah, and along those same lines, I usually see this kind of logic in shops where:
* There’s no alerts set on data file growths (easy to do with Agent or monitoring apps)
* Management won’t allow leaving 25-30% free space on volumes despite wild data growths
* There’s no max sizes set on files
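For what it’s worth, the free-space alert in that first bullet doesn’t require a monitoring suite. A minimal sketch using Python’s standard `shutil.disk_usage` – the 25% threshold and the volume paths are illustrative, not a recommendation:

```python
import shutil

def low_space_volumes(paths, min_free_pct=25.0):
    """Return (path, free %) for any volume below the free-space threshold."""
    alerts = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        free_pct = 100.0 * usage.free / usage.total
        if free_pct < min_free_pct:
            alerts.append((path, round(free_pct, 1)))
    return alerts

# Example: check the system drive; on a SQL Server box you'd pass the
# data/log/tempdb mount points instead (e.g. "D:\\", "L:\\").
print(low_space_volumes(["/"]))
```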
Do you mean that their stated concern about running out of space isn’t entirely sincere?
Heh heh heh – or they do the absolute minimum work possible to say they’ve done something. “See, I put the data on a separate volume, so my work here is done, right? Now, back to defragmenting my pages every day, generating more log file activity and expanding the size of my differential backups. A DBA’s work is never done!” 😉
25% – 30% free space? How luxurious! In my experience most places want you to keep things around 5% free. At my old place, the “enterprise monitoring tool” had a default 10% free space alarm that people (including me on occasion, I won’t deny it) repeatedly asked to have adjusted to 5% just for their snowflakes, and even less for large drives (10% of 1TB is 100GB, which for a slowly growing db seems like a waste for precious SAN space) or even turned off completely for more “static” drives (think historical data) which were sized “just so” (1% free)
Yep! This is where Dave Stein’s excellent trick comes in handy: leave a backup file (say 10-20GB) on each of your volumes. That way, when you get the alarm, you can delete the backup and buy yourself some more time to fix the emergency.
“Here’s why I think splitting the same amount of storage over multiple volumes can improve reliability.”
I believe It CAN! It’s called RAID, isn’t it? 🙂
But trying to stay on topic, the original question is why we have this “best practice” of separating data and log to separate volumes “for reliability”. I don’t think the point is to discredit using separate volumes for various other non-reliability reasons, but rather to question doing it strictly for reliability, when mathematically you can show a system which depends on 2 things, where that system is down when either is down, is mathematically less reliable than a system having only one.
Charles – nope, you missed a pretty important word in the post: VOLUME.
Drive = a physical device, like a hard drive or solid state drive.
Volume = a logical device typically consisting of multiple drives in a RAID array.
Go ahead and check out Storage Networking 101 here: http://www.enterprisenetworkingplanet.com/netsp/article.php/3698291/Storage-Networking-101-Configuring-Disk-Arrays.htm
I think you missed my smiley face. I apologize, it’s a failure of written tone rather than spoken. What I meant was James was describing RAID, rather than what you were describing. I wasn’t disagreeing with you. This whole conversation is just another way of saying redundancy is what’s important, and that the more single points of failure you have in a system the less reliable it is.
(Also I have to fix my form autofill settings, I only go by Charles when my mom yells at me…)
It’s strictly about reliability in the case I described, though — the only difference is I’m talking about different types of failure, some worse than others. (I don’t agree with the conclusion in my original comment anymore btw)
I can’t really comment on reliability of a server or instance by separating the disk allocations although I have seen vast performance benefits by separating files/filegroups/databases onto different DASD.
There are arguments for and against lumping everything onto one and I will defer to the traditional DBA answer of “It Depends” because there are too many “what-ifs” that can determine how things are set up.
The site where I am at the moment has over 300 instances and, with the performance benefits shown over the years of splitting up the files, every instance is configured the same way to use mount points such that 24×7 support is made much easier and quicker without having to go through a discovery process of working out where everything is at 3am while half-asleep. Of course, some instances have special requirements and they are known and documented to their differences.
When in a large organisation – and especially when dealing with government – thanks to the silo mentality, it can be difficult to get your systems set up a certain way, as another area will think you are telling them how to do their jobs. You then need to get standards drawn up and ratified so that you can enforce consistency.
For reliability and/or recoverability purposes though, I don’t see any benefit of having the files split out or bundled into one.
Hi, I know this is about reliability, thanks to this post I have a good idea about it, thanks.
I have one server running two databases, with a RAID 1 configuration. I was just talking with the ERP provider, because they did a very poor job with the stored procedures in the database (cursors everywhere), so for now, when users call these procedures, the server works very hard. As an immediate solution, I will add another RAID 1 array, but with SSDs, and move the log files and tempdb to this new RAID 1 for better performance. Is this wrong?
Well, it depends if the bottleneck is indeed I/O , or not.
Guillermo – for performance advice, you’ll want to run sp_BlitzFirst @SinceStartup = 1 and look at your primary wait types. If you’re not waiting on storage, then moving the disks around may not help your workload. (Classic example: cursors aren’t usually waiting on storage.)
Hi, I run the script with that option, there’s a lot of fun!.
What I can see in your explanations, I have problems in many queries (stored procedures), and that’s what the provider must fix, obviously with a lot of reading Brent’s site. Also, I can see some options who in descriptions say’s I will have better performance, at first, if I use a fasters HD. I can see in sp_blitz, Slow Storage Writes on Drive.
Wonderful tool and guidance, Thanks,
Sorry for my english…
No problem. Interpreting these results is pretty far outside of the scope of what I can do in a blog post comment, but you’re on the right track. Hope we helped!
I know you have already done a lot of work in your posts and videos. I was reading all of that, and it only remains to say THANK YOU. I’m not a DBA, but you really helped me understand what the job of one is. Thank you, and thanks to the people on your team – they are great also, and they do a great job.
Ok Brent. One hard drive is not acceptable for a production database. Period. For Dev, I’d even recommend it. Backup (and test the restores somewhere else).
But how many volumes is enough? This is a very good conversation. More volumes = more administration. Less data loss = keeping a job. What does this instance need? We can shard, we can add data files, we can horizontal partition and on and on…. It all depends for the customer.
Kurt – yep, that’s what the post is about. Thanks!
I always learn something reading Brent’s posts. I get the reliability argument.
There is a very narrow case where putting a log on its own volume gives a performance boost. Since magnetic disk seek time is very expensive and logs are written sequentially, putting one log alone on a magnetic disk avoids seeks and gets a boost. Putting two logs on one disk kills the boost because the disk has to seek between the logs. Having one log on an SSD gets no boost because seeks are cheap on SSDs. Having one log on a SAN volume may see no gain if multiple LUNs share the same disks. One and only one log on a magnetic-disk-backed volume can get a boost from avoiding seeks.
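A back-of-the-envelope model of that seek argument, using purely assumed numbers (roughly 8 ms per seek and 100 MB/s sequential throughput for a spinning disk), might look like this:

```python
# Rough model: time to write N log blocks of 60 KB each to a spinning disk.
# The seek time and throughput figures below are assumptions, not measurements.
SEEK_MS = 8.0         # assumed average seek time, ms
SEQ_MB_PER_S = 100.0  # assumed sequential write throughput, MB/s

def write_time_ms(blocks, block_kb, seeks_per_block):
    """Transfer time plus seek overhead for a batch of log writes."""
    transfer_ms = blocks * block_kb / 1024 / SEQ_MB_PER_S * 1000
    return transfer_ms + blocks * seeks_per_block * SEEK_MS

# One log alone on the disk: the head stays put, ~0 seeks per write.
one_log = write_time_ms(1000, 60, 0)
# Two logs sharing the disk: worst case, a seek before every write.
two_logs = write_time_ms(1000, 60, 1)

print(f"one log alone:    {one_log:,.0f} ms")
print(f"two logs sharing: {two_logs:,.0f} ms")
```

Under these assumed numbers, the seeks dominate the transfer time by more than an order of magnitude, which is the commenter's point: the boost only exists when exactly one log owns the spindle.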
Awww, thanks sir!
The end result of specific IO patterns from many volumes on many VMs is a random IO pattern at the SAN. Some products claim to fix this. Even so, do I want one pipe from my VM to the SAN? One set of volume-disk-HBA queues to the SAN? One ALU size for everything? Thick or thin for everything?
If the IO patterns run in parallel on different pipes…how many paralyzed pipes can an IO pattern paralyze if an IO pattern could paralyze paralyzed pipes?
I am going to create a new, huge database (>3 TB and growing) on a server with 3PAR fast-class SCPG disks and 1 TB of memory. There will be heavy writes and reads against this database, and some tables may contain nearly 300 columns. Kindly suggest the best recommendations for setting up this new OLTP database.
Should my data file and log file be placed on separate drives? What are the best autogrowth settings for the files?
Where can I keep my tempdb files?
Where can I place the index files?
Can we have different files for 3 different schemas used in the database?
Hi! For questions, head to https://dba.stackexchange.com.
Posted, kindly respond
I still need a final verdict on this one. I see the normal case where companies with legacy on-prem databases set up a single drive: D:\SQL\Data, Log, Backups. For performance, I'm about to relocate the log and backups onto separate local drives, and now I'm in doubt whether what I'm doing is right and proper. Could you elaborate on why separating data and log will likely cause downtime? What is the most likely reason that separating them creates downtime? Both capacity planning and performance are typically experienced from an operations perspective. Thank you for responding.
I think the theory these days is that in most cases, the disks backing your different volumes are the same.
You might have a D: and an E: drive for data and logs, but ultimately it’s the exact same disks so you’re not getting any performance benefits.
For reliability, I feel it’s a little more grey.
There’s an x% chance that your server will lose access to a volume. The more volumes you have the more likely it is that the server will fail to see one of them one day.
I would definitely want my backups on a different volume though.
Again, it’s not really a performance thing, it’s more about availability if there’s some massive unpredicted database growth. If a database grows and there’s suddenly no space for backups, you end up with no backups and possibly a data drive that’s now out of space because you have old backups on it.
If you have physical drives attached to your SQL Server (as opposed to SAN/NAS), then there's likely more benefit to separating the volumes.
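Steve's reliability point can be sketched numerically. Assuming, as the post does, that each volume fails independently with some yearly probability, the chance of at least one outage-causing failure grows with volume count. The probability used here is an illustrative assumption, not a measured number:

```python
# If each volume independently has probability p of failing in a given year,
# the chance a server loses at least one of its n volumes is 1 - (1 - p)^n.
def p_any_failure(p_per_volume, n_volumes):
    return 1 - (1 - p_per_volume) ** n_volumes

p = 0.5  # assumed, illustrative yearly failure probability per volume
for n in (1, 2, 4):
    print(f"{n} volume(s): {p_any_failure(p, n):.0%} chance of at least one failure")
```

Whatever value you pick for p, the chance of *some* volume failing only goes up as you add volumes; splitting files across volumes trades a lower chance of data loss for a higher chance of downtime.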
Thanks much Steve :). It's much clearer now.
I have some additional questions:
1) If the disks are physical and configured in RAID 5, would you expect a performance improvement from doing the following?
2) In SQL Server 2014, does the process require taking the database offline and back online, or simply changing the path via SSMS and later copying the trn files into E:\SQL\Log?
3) Do you recommend enabling backup compression? Or will it just cause some CPU performance degradation?
Thanks for any input.
Those are great questions, and exactly the kinds of things I work on with clients in my consulting work. Feel free to click Consulting at the top of the page.
Oh wow. Didn’t know you’re paid. Anyway thank you!
What about Microsoft recommendations if you ran into the 665 error? https://support.microsoft.com/en-us/help/2002606/os-errors-1450-and-665-are-reported-for-database-data-files
They recommend splitting the database files on VLDBs, but of course they don't say anything about different volumes.
That’s about an old, unsupported version of SQL Server.
It applies to SQL Server 2017. We're getting it on SQL Server 2017 (14.0.3281.6) and Server 2016. I think it's still supported 😉 or I believe we're in big trouble.
OK, I would call Microsoft support – I don’t think this is your only problem here.
(Just to be clear – this is pretty far beyond what I can diagnose in a blog post comment. Thanks for understanding.)
I get it, thanks anyway.
Great post, and an interesting read. As a DBA, I think it's important to stick to the old adage "it depends" when answering questions about storage reliability. From a cost perspective ($$$ for features and capabilities) and the existing support infrastructure (are you a solo DBA, does the application have retry features, which version of SQL Server, etc.), each environment's optimal design "depends".
I love Microsoft's perspective from its Cloud team about what a "Well-Architected Framework" looks like for them. It's not all-inclusive and not meant to be universal, but as they write:
These concepts are not all-inclusive. But focusing on them will help you build a reliable, secure, and flexible foundation for your application.
The Azure Well-Architected Framework consists of five pillars:
Credit (Microsoft – Azure Well-Architected Introduction)
Clearly, there are multiple pillars to consider when deciding on your "best" storage infrastructure. However, if your focus is on a single pillar, you probably are either overthinking your design, don't have enough knowledge or experience as a DBA, or have become overzealous about a single concept that may not even have a strict enough SLA to be worth worrying about. Worse, you may be betting on the wrong approach and forgetting that as a DBA, your job is to help your clients understand what is at stake and give them the opportunity to decide what works for them, while making sure you still have a job when things go south and the blame train travels.
As an example, by choosing to rely on a single drive, the C drive, Security is probably not high on your list. With the root drive, you're choosing to trust your own security design: that the vulnerabilities and bugs that occur in the operating system (rogue activity, shadow copy growth even when "disabled", malware, design vulnerabilities, etc.) and the applications running on top of it don't pose much risk. You've also made a statement about the long-term longevity of your current OS and the updates necessary to keep it supported (if that matters to you), among other things.
Does this make you a bad DBA? Maybe…but hopefully Brent's post shows us we can always learn new things and shouldn't be too rigid in our thinking. Maybe a single drive (just not the C drive, please) is the answer. Or maybe it's the cloud. Wherever it takes you, is that design right for your environment, and will your stakeholders understand the decisions made?
Section 2.5 of your Setup checklist in the First Responder Kit recommends multiple separate volumes. I’m confused.
Great catch! That setup checklist is years old, and we haven’t revised it in forever. I’ll go remove that section for the next release. Thanks!
One funny thing: whether or not you use a separate volume for data and log on SAN storage, most of the time they're hosted on the same RAID array anyway, as your system admin will never build separate arrays just for that purpose, nor allocate them to a single VM. Whatever you do on the "logical" side, pay attention to the physical implementation under the hood (and sometimes you have no idea at all, in companies with strict separation of duties).
We have a couple of SANs. I don't do SANs, but our SQL Servers do. The RAID or whatever magic goes on in the SAN for the drives is rarely an issue for me. However, the path can be. I can tell when a controller's firmware is being updated. It's amazing to see a write delayed 15 minutes and not be lost when something goes wrong. I like partitions to ensure there's a queue big enough and not shared, so one busy IO type doesn't queue out other IO types (e.g., a busy data read and a busy log write). Perhaps this is no longer an issue with recent Windows Server editions. Old habits. When I see slow IOs, I wonder if the mysterious VM-virtual-disks-to-host-virtual-disks-to-HBA-to-SAN-controller stack is at fault. I asked for host LUNs to match the VM LUNs, but I don't know if it's required. Perhaps the larger ALU and other improvements in SAN software mean that 6 LUNs each on 12 busy servers won't have a queue issue on one host LUN. Perhaps I should become a SAN expert. Oh wait…need to do a bunch of SQL and TFS stuff first.
Still using spinning magnetic rusty frisbees, eh?
Storage guy here. You might be overthinking it but it’s hard to say without knowing your setup. From my own experience working with VMware on six different storage platforms over the past ~12 years, exceptionally slow I/O is typically an over-saturated back-end (spinning rust that’s 100% active and can’t keep up) or an over-saturated storage processor (array “head”) with CPUs pegged to 100%.
Bottlenecks at the physical server level, VMware level, or in the storage network can also occur but latency at these points — in my experience — tends to degrade I/O performance in a linear and predictable fashion. In other words, I/O latency rocketing to the moon when exceeding a certain activity level is likely a back-end issue. I/O latency moving from, say, 2ms to 10ms, could be anywhere.
With modern arrays the streams will get crossed *somewhere*. Multiple disks will serve the same pool. Multiple volumes/datastores will reside on a pool. Multiple VMware disks will reside on a volume/datastore. You’ll never achieve true performance segregation for your SQL server’s volumes even if you carve out groups of physical disks to service separate workloads (with many modern platforms you simply cannot).
The modern way of enforcing fairness is to use Storage I/O Control (SIOC) at the VMware level and/or QoS features within the array. Even then, I *strongly* urge you to avoid both unless there's a clear and present problem and these features can be proven to help solve it. In other words, don't be proactive with these and try to fix a problem that doesn't yet exist. VMware and storage vendors do a really good job of engineering fairness into their systems; let them do their thing.
The only tweak I might recommend — and it’s likely to make only a small improvement, if at all — is to present multiple virtual storage controllers to the VM and spread the Windows (or Linux) volumes across them. The idea is to achieve better parallelism and it’s recommended as a best practice by VMware. I do it regularly even though I’ve yet to see it help by more than 5%.