Tag Archive: sql server training

When should you put data and logs on the same drive?

A reader emailed me with a question: if you only have two drive arrays in a server (C and D, let’s say), and you also have an application on that server, how should you configure SQL Server?  Best practices say to keep your data and log files on separate drive arrays, but is it OK to put them on the same drive and put the application files on the other drive?

The reason the data file (mdf) and log file (ldf) should normally be on separate drives is that when you’re doing inserts/updates/deletes, they both have to write at the exact same time.  When the SQL Server engine starts to write data, it essentially:

  • Writes to the log file that it’s going to change the data
  • Writes to the data file
  • Writes to the log file to mark that the transaction finished

That means putting those on the same set of drives is going to make writes much slower, because the writes will always be random: the disks will jump from the log file location to the data file location and back to the log file location.  Random activity is slower than sequential activity, so the penalty is even worse.

So when is it appropriate to put the data and log files on the same drive, and use the leftover drive for something else?

  • When the database is rarely updated. If it’s mostly read-only, you won’t incur the heavy penalty on writes.
  • When one drive is dramatically faster than the other drive. Say you’ve got one RAID 10 array with 8 hard drives, and one RAID 5 array with 3 hard drives.  It may make sense to have everything on the fast drive.
  • When your application hammers TempDB more than anything else. I had a job interview once where the company said TempDB was more active than their data or log file drives by far.  Ouch – in that case, it might make sense to put TempDB on its own array, and the user database data & log files on the other array.
  • When the application does heavier disk activity than SQL Server. Notice that I said heavier disk activity: it’s not just enough to say that the application is used a lot, or does a lot of work – it has to be disk activity.  Some applications just use the disk when they first start up, and in that case, they don’t merit a separate drive array.

Regardless of your decision, though, use Perfmon after the system goes live and track the drive activity.  If one of the two arrays is being overwhelmed with load while the other one sits idle, then it’s time to rethink the decision.  Moving data or log files is as easy as detaching the database, copying the physical file (don’t move it – something could go wrong) and reattaching the database.  Granted, it involves downtime, but it’s better than being stuck with your decision for life.

Update on moving database files: JMKehayias, KBrianKelley and SQLDBA on Twitter all chimed in that users would be better off using ALTER DATABASE rather than a detach/reattach.  At first I was suspicious, not seeing an advantage, but SQLDBA sold me when he said they’d moved a log shipped database that way: alter database to change the file paths, take it offline, copy the files to their new locations, and bring it back online.  It kept log shipping intact.  That’s a win.

More about SQL Server Data and Log Files

Should SQL Server use shared drives or dedicated drives on a SAN?  I’ve talked about it in a few posts:

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

Five Tips to Get Better Vendor Support

So you’re completely pissed off that your third party software isn’t working, and you’re getting ready to call the vendor’s support line and let loose a stream of obscenities.  I feel you: I’m using Symantec Antivirus too. <rimshot> But before you pick up the phone, stop and think about a few things that will help you get a much better support experience.

1. Did you check the Event Logs?

Don’t leap to conclusions and assume that everything else in the system is working absolutely perfectly.  Go into Control Panel, Administrative Tools, Event Viewer.  Look at the Application and System event logs.  If you see so many red and yellow icons that it looks like a pinata, then maybe you’ve got a bigger problem to troubleshoot.

Don’t just check the exact date/time of your problem, either: sometimes a long-running problem will affect what you’re trying to do.

Example: a user’s backups are taking far longer than expected.  Scrolling back through the event viewer, the RAID card noted that a drive had failed, and the system was running under degraded performance while the array was being rebuilt with the hot spare drive.

2. Does it work without the vendor’s product?

If you’re troubleshooting, say, a SQL Server backup product, and you’re getting all kinds of errors when you try to do a backup or a restore, try doing it without the third party product.  Do a native-only backup/restore/whatever, and see if the same issues happen.

If they do, don’t bother calling the vendor, because you’ve got a bigger problem.  But don’t call Microsoft yet, because you might not have removed enough unknowns from the equation.

3. Narrow it down as far as possible.

Make a list of everything involved with the issue, and then rule out each of those parts.  The more specific you can be about the problem, the faster you can get a fix.

The worst example of this is when someone calls in saying, “I get an error now and then.”  What error?  What are you doing when the error occurs?  What’s running?  What changed?  Let’s help by ruling things out.

Example: let’s say our nightly jobs to back up a database keep failing.  My list of moving parts might include:

  • The backup script
  • The SQL Server Agent job
  • The backup software
  • The database
  • The SQL Server instance
  • The server hardware

So then to rule out each of those, I would do the following:

  • The backup script – try using a simple “BACKUP” command, outside of any other scripts, or use the SQL Server Management Studio GUI
  • The SQL Server Agent job – try a different time of day.  True story – had a customer whose jobs failed every night, and we eventually figured out that somebody had an OS-level scheduled task to stop SQL Server at a specific time – which happened to be during the backup time window.
  • The backup software – try using only native backups, not third party apps.
  • The database – try backing up a different database.  If it’s a big database, try a small one, like model/msdb/master/etc.  If those work, then try another large database.
  • The SQL Server instance – it can be SQL Server-level settings like service account permissions.  Try using another instance with the same database and see if it breaks.
  • The server hardware – don’t just think the server itself, but break this down into pieces too.  If you’re using iSCSI as a backup target, try backing up to local disks.  If you’re using local disks and you’ve got SAN space available to test, try backing up to the SAN.

If you’re running something from your machine, then try it from another workstation.

4. Write down the exact steps to reproduce the problem.

To get the absolute fastest support possible with the best end result answer, boil the issue down to an exact set of steps that anyone can follow on their own machine.  Support organizations are huge, and these days people are working from home a lot.

When you write down the steps, your target audience is a developer who just got emailed about your case from his PDA and he’s curious to see if he can reproduce it on his laptop, which doesn’t have an internet connection at the moment.  If you can give him enough information to reproduce the issue without connecting to the internet, you won.

Bonus points if they don’t have to use any of your data – that way, they can’t blame your data as the problem. Even better, this lets people test that they’ve actually fixed the problem without getting your data involved.

Example: check out this problem I had in the Microsoft data mining forums.  I wrote up exactly how to reproduce the problem by entering random data in an Excel spreadsheet.  Anybody, from a developer to a support engineer, can test this case on their own machine.

5. Study the Art of War.

I love this book, and it comes in useful in so many business interactions.  You can read The Art of War for free online.  I recommend saving a copy in a text file and putting it on your PDA.  The next time you’re waiting in line somewhere, bored out of your gourd, open it up and start reading.

Some of the principles won’t seem obviously beneficial to IT workers, so stroll down to your local bookstore and you’ll probably find a version of The Art of War targeted at businesspeople.  There’s dozens of versions of this book with different interpretations of what it means for modern life.

In terms of a support call, here’s what it means: you don’t want to go to war. If you call support, you may not like the results.  I abhor calling support because I’ve had some ugly experiences with some unqualified staff.  As a result, if I can figure out how to fix the problem myself, I’ll probably learn things that will make me a better warrior – I mean, DBA.

If you have to wage war, bring overwhelming force to the battlefield. Come armed with every single event log for your system, documentation about its configuration, screenshots of errors, and the scripts that produced the error.  When the support call starts, I like to ask, “Can I email you the information I’ve gathered?  It’s all in a zip file.”  I start that email before I even start talking, because it’ll take a while for this data to course through the internet.  By the time I’ve finished my explanation, they’ve got my files.

If you didn’t bring enough force, you have to be fast. If support asks for something you didn’t get ahead of time, don’t say you’ll call them back – gather the information immediately. Don’t hang up, don’t arrange a later meeting, just get it right now.  Otherwise, the person you’re working with won’t be on duty later, or they’ll have forgotten about the specifics of your case.  Show that you’re willing to focus, and they’ll return the favor.

Sometimes your business managers won’t allow you to stop production to gather the information you need.  In that case, make it abundantly clear to management that you want to move forward, but they’re choosing not to.  That way, when a manager comes in and asks why you haven’t fixed the problem yet, you can point to the exact email where the business management said you had to delay until a given date.

I’m sure I’m missing more links between The Art of War and calling support – heck, I could have made an entire blog entry out of that – but I’m off to follow up with my next support call….

Bonus Tip from K. Brian Kelley: Know your escalation levels.

K. Brian Kelley (web site, Twitter) suggested another tip: ask what your options are for escalation, and exercise them.

Escalation is the process of moving things up the chain to get to a higher level of support.  There’s usually a documented process: first level will work on it a given amount of time or until they feel like they can’t make any more progress, and then they’ll escalate the call up to the second tier of support.  That second tier is more experienced and has more tools at their disposal.  If they can’t solve it, they escalate it up another tier, and eventually you’re talking to the product developers.

When the call starts moving, make it clear what your timeline requirements are with support.  When I have a production-down situation, I ask the support engineer to agree at the beginning that we’re going to escalate the call if we don’t have significant progress within one hour.  (That time can be different depending on how severe the outage is.)  Then every time we escalate, I repeat that process with the new engineer.

On the other hand, if I’m dealing with a non-urgent server, say a QA box where I’ve got a replacement available for my staff, then I’ll tell the support engineer that I can wait 24 hours before we need to escalate it.  Putting a time on it helps both of you work with appropriate urgency.

You don’t have to wait for the time to lapse, either: if you’re getting frustrated with the support engineer and you don’t feel like your case is going in the right direction, ask it to be escalated.  I feel horrendously guilty whenever I pull that, but the sad fact is, sometimes it’s the best thing for everybody involved.  The support engineer doesn’t want to waste your time – heck, they want you off the phone too!

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

What do DBAs learn in their spare time?

I got a question from a junior SQL Server DBA who wanted to make his way up to expert level.  He wanted to know if it would help his career to learn the free/express versions of other database platforms – MySQL, Oracle, Postgres, etc – or if it would be a distraction.

I have mixed opinions on this.  I don’t think it hurts, and I think you should pursue whatever you’re passionate about.  If you’re dying to learn more about other databases, then you should do it, because it won’t feel like work.  It’ll feel like fun, and the time will fly by.

On the other hand, if you’re doing it to get ahead in your career, I’d do two things.  First, decide how much time per week you’re going to spend learning new things.  Different jobs and different personal lives can end up with very different learning times per week.  If you’ve got a hectic day job and a full family life, you might be lucky to squeeze in ten hours a week of learning.  In that case, every hour counts, and you may want to hone the skills you need to use every day.

If you’ve got the luxury of a lot of time on your hands, then the second thing I’d do is make a list of the technologies you want to learn, in the order you want to learn them.  Look at what skills your coworkers and your seniors are using, the skills you wish you had in your arsenal, and think about when you would use those skills.

For example, I don’t know jack about PowerShell, and it’s got a lot of good buzz.  I’d like to learn it – but with what I do right now, I’m just not going to use it very often if at all.  Interesting technology, but for my priority list, it has to sink down farther.

If I was a junior DBA today in a small shop, making my way up to a larger shop or a senior position, my priority list would look something like this:

  • Business intelligence (because it’s really hard to find people who are good at it, and it’s a lot of fun to show the boss things you discovered in BI)
  • Performance tuning T-SQL code (because it’s never fast enough, and you can look like a hero when you speed up code)
  • C# (because it helps you work better with developers who rely on SQL Server)

When I was a senior DBA in the field last year, my list looked like this:

  • Storage area networks (because SANs really drive SQL Server performance)
  • Virtualization (because it’s coming at us like a freight train, and more SQL Servers will be virtualized every year)
  • Policy-based management (because I couldn’t get enough help, and I needed to automate more proactive checks)

While I’d love to learn more about other database platforms, it just wasn’t high enough on my priority list.  Your mileage may vary.  I’d be curious to hear what’s on other DBAs’ priority lists though!

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

Steel Cage Blogmatch Part Two: The SQL

Jason Massie and I had a steel cage blogmatch about whether to configure a SAN with shared or dedicated drives, and we both passed out for exhaustion.

This debate is starting up again but with a slightly different spin: virtualization.  In an environment with Microsoft Hyper-V or VMware ESX, more than one SQL Server may have its virtual drives on the same physical hard drives.

If we have two SQL Servers, are we better off keeping their virtual hard drives on separate physical hard drives, or using one larger pool of hard drives shared between the two?  Microsoft’s technical article “Running SQL Server 2008 in a Hyper-V Environment: Best Practices and Performance Considerations” attempts to answer that burning question with Figure 16 on page 24:

Shared Disks vs Dedicated Disks

Shared Disks vs Dedicated Disks

This graph shows two things:

  1. Nobody spell checked the graph title, and
  2. Dedicated drives were faster.

However, they’re only 3.5% faster on average.  Management of the shared disk approach is much more than 3.5% easier, so I’d have to vote for the shared drive approach, much to Jason Massie’s glee.

Plus, keep in mind that your servers probably won’t all be under full load at all times.  During periods where only one server is under heavy load, that server should achieve higher performance throughput since they’ll have more idle spindles available to them.

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

The answer to SQL Server training problems

Last Monday, I talked about the problems with SQL Server training as it existed last week.

Starting today, things are changing.

Today, you can get a sneak peek at the new SQLServerPedia, but I’m not going to give you the link until the end of the post because I want you to read what – HEY – WAIT – QUIT SCROLLING DOWN!  Finish reading this, you cheating cheater.

Okay, where was I?  Oh yeah, I want you to read what’s different about this site.

Real experts “own” each section and put their name on it.

Each topic area of the site, like Performance Tuning or Architecture & Configuration, has a name and a face.  That person is responsible for the accuracy and quality of the content in that section.  These people aren’t sitting around copy/pasting content from Books Online like some other SQL blogs I’ve seen out there – these are real SQL Server experts with real jobs and real experience solving really big problems.  This isn’t “BigDaddy31″ or “TriggerzRool” on Experts Exchange – every one of our experts has at least 10 years experience with SQL Server.

If you’ve got something to say about a page, you can say it.

Every page in the wiki has a “Discussion” tab where you can give your feedback straight to the editors – and to other readers.  You can talk about your experiences with a particular topic, suggest things the editor should do to enhance the page, or trash talk, I suppose.  But if I was you and I had something to say, I’d read the next feature…

If you want to enhance a page, you can do that too.

If you want to add content to a page, just create an account on SQLServerPedia.com and click the Edit button on a page.  Add your information.  Hit Save.  It’s just that easy.  An editor will moderate your changes, make any improvements they want, and then approve your content to go live.  That’s what makes us different – real people are double-checking the answers before they go live, because we don’t want a bad solution misleading a lot of DBAs.  When you see something in the SQLServerPedia wiki, you’re going to know that it’s passed the BS test: all of the editors have been working with SQL Server for at least a decade, and they know what’s the right thing to do and what to avoid.

We’re hosting video and audio podcasts of SQL Server training presentations.

You can subscribe to our SQL Server podcasts using your iPod or Zune, or you can just watch streaming versions in your web browser.  We’ve got video and audio copies of lots of technical presentations that we’re starting to push through our podcast feed, with new ones coming out twice a week.  If you’ve got a presentation that you’d like to record, let me know and I’d love to help you record it and host it.  There’s so many DBAs that are desperate for training and knowledge, and this is our chance to make a difference.

Everything is free – no registration required.

That’s right, free SQL Server training.  Why should you have to pay to access the work of the community?  If you want to join in and help the wiki, then you’ll need to register, but as long as you’re reading, you can keep your tin foil hat on your head and your credit card in your pocket.

Here’s to the Editors, the people making a difference.

These are the people who are pushing the buttons to make it happen: the editors at the new SQLServerPedia.com.  These are seriously experienced database experts who are taking the time out of their lives to help build community knowledge.  None of these guys are getting paid for this – they all volunteered because they believe in giving back to the community.  I can’t emphasize enough how thankful I personally am to these guys, and how thankful our readers should be.  They’re helping us all become better DBAs, and nobody’s charging admission.

And here’s to the people who helped get it off the ground.

We couldn’t do this without the dedicated contributions of Andy Grant and Christian Hasker of the Quest Software SQL Server team.  Andy and Christian pulled strings to get the entire SQL Server training content from our former product KnowledgeXpert ported over into the wiki.  If you wanted this content before, you had to pay to get it, and now Quest is giving it away for free in the wiki.  That’s the kind of commitment Quest has to getting DBAs trained, and that’s why I came to work for them.  The SQL Server business unit is about one thing: making DBAs look good, and that’s where the new SQLServerPedia comes in.

Okay, okay, here’s the link.

To get a sneak peek, go to http://SQLServerPedia.com/wiki.  We’re doing a soft launch right now, and a full launch at PASS – at that point, the main pages on SQLServerPedia.com will have the same top navigation bar as the wiki does right now.

One thing you need to know: during the soft launch, the “Search” box at the top right only shows results from the blog, not the wiki.  That’s on purpose – we don’t want to be in Google’s indexes just yet.  We’re giving the editors time to poke around in their content, and we’re working with a search engine team to optimize our HTML.

Every day this week, I’ll blog about a particular area of the site that stands out for me – something we’re doing that takes training to the next level.  Tomorrow, I’ll be talking about our vision of what a good wiki article means.

Go check out the site and let me know what you think!

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

Steel Cage Blogmatch: How to Configure a SAN

Jason Massie and Brent Ozar work with SQL Server for a living and write blogs for fun, but that’s where the similarities end.  Jason’s a fan of shared storage SANs: putting SQL Server data and logs on the same set of physical hard drives.  Brent’s a cheerleader for dedicated configurations where the arrays are separated by purpose.  We locked the two of them in a virtual room to see what would happen.  No database administrators were harmed in the making of this article.

Brent Says: You’re crazy.  Well, I mean, you’re crazy just in general – anybody who writes a comic about SQL Server needs some medication – but I mean you’re wrong on this issue.  Since the dawn of time, we’ve been telling SQL Server DBAs to put their data, logs and TempDB on different arrays for a reason: they get better performance.

Jason Says: We you say “need medication” I take it you mean more medication. Anyway…. Yes, that SQL Server 6.5 book we read a decade ago is, in fact, outdated. We once thought the earth was flat and we were the center of the universe. You can now file “dedicated raid 1 for logs” along with those busted myths.  That book was written back when servers had a small number of locally attached drives, and we were lucky if we got two hard drives in a mirrored pair for the logs.  Today, SQL Servers are attached to huge SANs with large numbers of hard drives.  Are you saying a shared piece of fifty or a hundred drives isn’t enough?

Brent Says: For small servers, you’re completely right, but let’s talk about an extreme example: established multi-terabyte data warehouses need very high I/O for long periods of time, several hours a night, during their nightly load windows.  They need to load as much data as possible, as fast as possible, and these windows often coincide with the nightly backup windows of other systems on that same storage.   They have dedicated DBAs who monitor performance closely, and they want to get every last dollar out of their SAN.  They carefully partition the data to balance load across different arrays and make it easier to do simultaneous ETL and reporting.

Jason Says: In situations like this when money for hardware and manpower is not an object, you can just throw enough resources at it that it is hard to mess up. However, you should spend some resources on testing and make sure you’re not flushing money down the toilet.  Separate drives for data, logs and ETL load files might best but having a higher total spindle count might still be better. The proof is in the test numbers, and blanket statements are not safe.  Not to mention by designing for the ETL, you will have idle drives when you are not loading – and you don’t want to have expensive SAN hardware sitting idle, do you?

Brent Says: You have a point there.  It’s easy to recommend dedicated drives in situations where there’s a consistently high utilization rate, like for 24×7 OLTP systems, email archiving databases or very heavily used data warehouses.  It’s harder to recommend them when the database servers just aren’t being that heavily used around the clock.  But talking about virtualization reminds me – I don’t want any SQL Servers sharing spindles with virtual servers, ever.  Virtual servers have such a high I/O utilization already that it’s like playing with fire.

Jason Says: It’s like buying a new house – you want to check out the neighborhood first and find out what your neighbors are like.  You wouldn’t build a brand new mansion in a sketchy part of town, and you wouldn’t put a data warehouse and a bunch of VMware guests on the same spindles.  I would suggest going to the Mayor (SAN Admin) and demanding a performance SLA. I have found by doing that, with executive sign off, that the Mayor makes sure the streets are clean and the riff raff out of sight.

Brent Says: Getting a good performance SLA is the first step to peace and harmony in the datacenter.  Data warehouses are a special case: they have lots of dedicated staff, generally experienced people, who can focus on things like IO throughput and latency.  But what about the rest of the databases, like your typical OLTP or application databases?  A single DBA has to manage dozens, sometimes a hundred instances.  They don’t have time to benchmark.

Jason Says: Hey, just because they’re a DBA doesn’t mean they spend every weekend in the datacenter, alone with their servers.  Not all of them are single.

Brent Says: Don’t be so sure.  My girlfriend refused to even think about marrying me until after I got out of the production DBA on-call rotation.   She says it’s a coincidence, but I dunno.

Jason Says: If you don’t have the time to benchmark, then you don’t have the time to micromanage storage performance, and you really don’t have the time to build a good estimate of how many individual drives you need for TempDB, logs, and so on to begin with.  Using a shared set of drives is the easiest way to achieve high performance.  If you’re not getting enough performance, add more drives, and it’ll even out.  Don’t try this with, say, 6 hard drives and expect to get good results.

Brent Says: That’s the same with dedicated drive setups, too.  I’ve been at shops where the developers swore SQL Server couldn’t scale – only to find out they had the data, logs and TempDB all on a handful of drives.  Whether you’re using dedicated drives or shared drives, don’t expect to scale a database server on any software platform without adding storage.  If you want a real eye-opener, go read the TPC benchmark records and look at the number of hard drives that were used.  It’s not unusual to see drive utilization rates of 1-5% – meaning, on a 146gb drive, they’re only storing a few gigs of data.  More spindles equals more speed, period.

Jason Says: Another problem is changing app requirements like adding audit trails or replication. These adding read IO to you dedicated log drives. There is also reporting mayhem(large sequential scans) being added to your OLTP app. These are the bad elements but once they are in your ‘hood, how do you move them out?

Brent Says: That’s actually a problem with some dedicated spindle SAN configurations.  When each array is configured on just a few disks, management can be a nightmare.  I’ve worked with one major SAN vendor who could only add drives two at a time to arrays, no matter how large the array was, and all drive management had to stop while that array was restriped – sometimes taking hours.  Adding more drives for more performance to handle new load was a real pain point, and I would have gladly traded off some performance in exchange for easier management.

Jason Says: Easier management is a big selling point for shared drive configurations.  When multiple servers draw out of a common pool, it’s easy to allocate space in smaller increments.  You can build multiple pools with different tiers: a Solid State Disk tier, a 15k RAID 10 tier, a SATA RAID 5 tier, and so on.

Brent Says: Speaking of Solid State Disk, when SSDs go mainstream, all SANs will just use shared drive configurations.  With SSDs, the random access problem goes away, because there’s no heads to move around, no platters to spin.  We’ll be able to gauge SAN utilization strictly on the number of queued-up requests instead of worrying about drive latency, and then it’ll be easy to tell when we need to add more drives.

Jason Says: Are you saying that maybe then you’ll get around to admitting the earth isn’t flat and shared drive configurations are always the best?

Brent Says: Get off my lawn, whippersnapper.

Jason Says: This is me getting in the last word. ?

Updated November 2008…

Now there’s more – we revisit it by taking a look at how virtualization affects SAN configuration.

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

SQL Server on a SAN: Dedicated or Shared Drives?

A reader wrote in and asked:

We’re running SQL Server with blades and a NetApp SAN. We have a few hundred databases from 100mb to 200gb.  All data, logs, tempdb, etc. are located in the same 30-disk pool.

Apparently this was setup using NetApp’s guidelines. NetApp recommendations are to put everything in one aggregate, or a couple of aggregates with the caveat that performance will suffer if you don’t use all disks in the same aggregate. They actually recommend putting data and log files together!

Have you ever seen anything like this before? Is this a stable system or is it craziness?

By the way, your blog is absolute genius!  I love everything you’ve ever read, and your SQL Server expertise is surpassed only by your good looks!

(Okay, so maybe I added that last paragraph myself.)

First off, yeah, I’ve seen that before, and it’s not craziness.  There’s times when this is the right way to do a SAN configuration, and there’s times where it’s wrong.  Some SAN vendors (like EMC and NetApp) recommend this type of configuration as a default, and they only recommend changing it when you can justify better performance in a dedicated drive setup – meaning, physical drives are dedicated to specific tasks, like a six-disk raid 10 setup for your transaction logs.

Before I give an answer for this one specific scenario, let’s talk about the decision factors.

When you’re choosing between two things, you want to be as specific as possible about the options.  If someone asks you what’s better, a Chevy or a Mercedes, you can’t just say Mercedes.  They may be asking about a late-model Corvette versus a clapped-out 1970′s Mercedes wagon.  Or that Mercedes might be pristinely restored, whereas the Corvette’s been in an accident.  You get the idea.  If you were choosing between cars, you’d want to know the age, the mileage, and what the person wanted the car for – performance, family travel, reliability, etc.

When you’re choosing between SQL Server storage options, here’s what you need to know about each of your choices:

  • The number of drives – sounds simple, but try to get that information out of your SAN administrator in a shared-drive environment.  Then, just for laughs, ask them, “If I was going to take my drives out of the shared pool and switch to my own dedicated spindles, how many spindles will I get?”  It’s probably going to be a pretty small number, and you’ll need to know this before you decide to switch.
  • The RAID level – some SAN vendors will say that the RAID level doesn’t matter anymore, and in huge shared pools, that’s vaguely true.  A 30-drive RAID 5 and a 30-drive RAID 10 are going to overwhelm your HBAs anyway.  However, in dedicated drive setups, we’re probably talking about much lower numbers of drives, and then it starts to matter.
  • The peak load windows for other apps on those drives – if you’re sharing drives with a couple of tiny servers, and you’re the biggest load in the group, then this probably isn’t an issue.  On the other hand, if you’re sharing drives with a large ERP system with hundreds of users that all log in around 8am, you’re going to want to expect that.
  • The backup method – if you’re using SAN snapshots, you want as many spindles as possible to make your backups less invasive.  I’m not saying SAN snapshot backups are invasive, but if you suddenly present that snapshot to another server on the SAN, your load on those hard drives just doubled.  That changes your SAN bottleneck, and for this, you’ll want shared spindles.
  • The current performance bottleneck – I could write a whole session just on finding your current SAN bottleneck.  Oh wait – I did, and I’m delivering it via videoconference soon!  Well, to condense it into a single bullet point, you need to find out where your performance bottleneck is, because it may not be your hard drives.  You want to focus on your bottleneck first, eliminate it, and then move on to the next bottleneck.  If the current one is the drive arrays, then even moving to dedicated spindles may not be the answer if we don’t get enough of them (see our first bullet point).
  • And there’s more – like the amount of time you want to spend managing the SAN, whether you want to use SQL Server partitioning, and so on – but these are a good start.

Now that we know the questions, let’s look at some of the answers for this one scenario.

The number of drives – right now you’ve got 30 drives in a single pool for a cluster.  If you carved it up into dedicated drives, you might do something like this:

  • 12 drives in a raid 10 for data
  • 10 drives in a raid 10 for logs
  • 8 drives in a raid 10 for TempDB

I’m just pulling these out of the air to illustrate that you’ve got a lot of drives, and you can play around with the config here.  To make a good design decision, you’d want to know the read/write mix and the activity on TempDB.  Bottom line, you’ve got enough drives that you could get some performance with dedicated drives instead of shared.  If you were running, say, 10 drives altogether in the pool, then the decision changes.

The backup method – NetApp’s snapshot solution has some great SQL Server integration, so just generally, if you’re using NetApp, I would use a shared drive config.  You may not be using their snapshots now, but as your databases get bigger, it’s nice to have the snapshot option available.  If your backup window gets out of control, call your NetApp guys about getting a demo.

The current performance bottleneck – it takes a lot of work to figure this out, but the key word in the question was “blades”.  I love blades, but I’ve seen a few implementations where people have shoved a bunch of SQL Servers in a single blade chassis, only put two HBAs in each blade, and only used two fiber cables to connect the entire blade chassis to the SAN.  That means all of the servers are choking on a small amount of bandwidth.

When we look at these answers together, I’d say that the shared drives are not holding back this server’s performance.  Before changing from shared to dedicated drives, I’d add HBAs in each SQL Server, enable & test multipathing software, and connect the blade chassis to the SAN infrastructure with as many fiber cables as feasible.  Otherwise, changing from shared to dedicated spindles for this setup won’t make a performance difference.

At the same time, I would only make these changes if you’re seeing disk performance bottlenecks on the SQL Server.  (This reader mentioned that he used Quest Spotlight, and it’s easy to see this in Spotlight.)

Does that mean NetApp’s original recommendation was right?  Not necessarily, but here’s why they do it: most people have setups just like this.  People will throw hundreds of thousands of dollars into a storage controller and a bunch of drive enclosures, only to kill the performance by not giving it any bandwidth.  The more time and effort you put into the setup around the edges, the more it’ll pay off in performance.

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

New webcasts & videos up at Quest Backstage

Got a bumper crop of webcasts and videos up this week courtesy of the fine folks who refill my checking account:

Consolidation & virtualization roundtable with me, Kevin Kline and Ron Talmage – we raced through about twenty slides at a breakneck pace – about an hour.  We seriously could have talked for two hours easily about these topics, because there’s so much to cover even at a high level.  If you’re thinking about doing your first consolidation or SQL Server virtualization project, we talk about what makes a successful one and what to watch out for.  There’s no hard-core technical stuff here, just project planning info, but it’s good stuff.  I actually learned things while we were presenting, and that’s kinda scary.

CSI:DBA – ways to find out who deleted data & how to get it back – this was originally supposed to be me & Jason Hall, but he went on vacation (lucky devil) and you get an hour of non-stop Brent.  I covered the native options for auditing database activity, and then demoed how to do it with LiteSpeed v5′s new Log Reader.  I gotta be honest: the native methods for doing this totally suck.  I know, I work for Quest, but seriously, I wish the LiteSpeed v5 Log Reader would have been out when I worked in the field.  It could have saved my bacon a couple of times.

C.Y.A. – Cover Your Apps – DBAs often get called in to do backups before an application upgrade.  Wouldn’t it be nice if you could back up the database AND the application files at one time and package them all in a single backup file?  LiteSpeed does this with the Attached File Backup.

Kevin Kline’s first videoblog entry is online at SQLServerPedia, and while it doesn’t feature Yours Truly, it’s still worth a watch.  I’m looking forward to his next videos.

In a mildly related Backstage note, I’m shopping for rockstar clothes this week in California in preparation for my rockstar photo session with Tracy Manford (BlogFlickrTwitter).  I’ve been using the same crappy head shot for the last five years, and it’s time to update the photos to reflect my personality.  Tracy’s hilarious, and if you want to understand why she’s the right person to photograph me, just check out this one photo of hers.  End of story.

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

Budgeting 101 for SQL Server database administrators

It’s that beautiful time of year when everybody watches the colors. No, not the changing browns and reds of the foliage, but the red and black numbers on P&L statements. That’s right – it’s budget time.

In order to build a good database budget, we have to ask our managers a few key questions:

Are we going to archive old data, or keep it online?

Database growth can be controlled by keeping a limited amount of history in the database. As DBAs, though, our job isn’t to limit the business, but to give them options. The business can decide whether keeping more data online is worth the additional expenditure for hard drives.

Example: take a data warehouse with 3 years of history. Near year-end, DBAs may be tasked with pruning this data out, but users often complain that they’ve decided they want the extra data online forever. Budget time is the perfect time to present this question to the end users. If the warehouse needs to go from 3 years to 4 years of storage, that’s an increase of 33%. Take the current storage costs, add 33%, and present that as the cost to keep an additional year of history online. Every year, recalculate storage costs and ask the same question again.

Include backup costs in these calculations, too. A truly savvy DBA will organize their filegroups so that old, historical data lives on read-only filegroups that rarely get backed up, but if we were all that good, we wouldn’t be having this conversation.

How has sales growth changed over the last year?

Database size can sometimes be changed based on business growth. If an e-commerce book store sold 15% more books this year than last year, and if they expect to grow by another 15% next year, then that’s a lot more transactions and customers that have to be stored in the database.

I’ve found that it’s easier to budget by past growth instead of future predicted growth because – well, let’s face it, predictions are hard. It’s much easier for a manager to understand and agree that we’ve already grown by 15% over the last year, and that our infrastructure has to keep pace.

Has our company headcount grown this year, and what’s the projected growth?

If a database application supports 1,000 users, and the company projects a growth of 5% next year, then that’s 5% more users that the application has to support. Granted, most applications are built in a way that scales easily, but big growth happens fast. Changes in headcount give DBAs a clear, concise way to show why their hardware has to be able to support more load.

What pilot applications are being rolled out to more users this year?

In small to midsize companies, database administrators aren’t always brought into meetings early enough during the planning phase of application development. Proof-of-concept applications become pilot applications, which become widely-rolled-out applications. At budget time, survey the landscape to see which small applications are loved by the users, and what kinds of wide adoption their managers expect to see.

This is where being a DBA gets hard: database servers are often shared between multiple applications. Being a good DBA means knowing what percentage of the database server load is being consumed by a particular application. A new application might be a nasty resource hog, and it’s not obvious until the number of users scales up. Now is the time to run profiles and performance logs to get insight into each pilot application. The more a DBA knows about each pilot application, the better job they can do budgeting for the future.

Finally, what more could you do with more time?

Make a list of day-to-day and month-to-month DBA chores that are getting overlooked due to a lack of time. Everybody has them – whether it’s fire-drill test restores, performance tuning, or even plain old career training. Give management a list of things that the department could accomplish with one or more additional head count. The list should be unbiased, matter-of-fact items, not “give me another body or I’m going to die.” The less emotion and the more raw facts the list includes, the easier it is for managers to forward it on to their managers and get approval.

Even if another person isn’t needed, make the list anyway. That way, a year or two from now when the department really does need another DBA, it’s easy to compare last year’s list with this year’s list and see that the company’s needs are changing.

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube