How to Interview Storage Vendors

Storage
14 Comments

Buying SAN gear is hard work.  You have to compare vendors based on performance, capacity, reliability, ease of implementation, quality of support, and of course, price.  When I’m working with a client buying new storage, here’s the questions I have them ask each prospective storage vendor.

Q: If we buy SSD, SAS, and SATA drives, how does the tiering work?  Some vendors can automatically move data between fast and slow drives, while other vendors only allow this manually – if at all.  If the storage does all the work for you, find out what kind of reporting it offers – some tiering solutions will automatically use SSDs where possible, but you’ll want to know when it’s time to buy more drives for each tier.

Q: Is SSD used for caching or for storage? How much do you recommend?  There’s no wrong answer here, but listen to their recommendations.  This is often the fastest way to improve performance.  It’s going to be expensive, but that’s not part of the technical decision.

Q: Do you support active/active multipathing for both sending and receiving on a single LUN?  If you have just one 1TB volume, and you’ve got a handful of connections to the storage, can you use multiple connections simultaneously to get more throughput during read/write intensive operations?  Most vendors just don’t support this, so ask them to be as specific as possible – it’ll help you design the right connectivity for your storage.  No multipathing capabilities means your big databases may need 8Gb FC or 10Gb iSCSI connectivity, and they may need to be broken up across multiple files on multiple LUNs.

Q: Does the system include snapshots? Are they writable?  Do they integrate with SQL Server?  SAN snapshots are one of my favorite tricks to speed up multi-terabyte database backups.  We can take a snapshot of entire SAN volumes in a matter of seconds – but only if the vendor includes this support.  Some vendors include it, and some vendors charge an arm and a leg.

Q: Exactly how much can we grow before we have to make a major upgrade?  Storage gear is a lot like servers: if you need to get an incremental performance upgrade you may be able to throw in a few more drives, but if you need to double or quadruple your performance, you might get stuck replacing the whole thing.  Ask exactly how much cache, how many ports, and how many drives you can add before you have to make big changes.

Q: What are your best practices for Microsoft SQL Server?  Ask for their technical documentation on configurations.  They may not have an up-to-date version for SQL Server 2012, for example, but they should at least have a copy for 2005 or 2008.  Examine that documentation for things like RAID levels, multipathing, drive separation, and so forth, and take those guidelines into account as you spec out hardware.  If they don’t have guidelines for Microsoft SQL Server, ask for Oracle, but if they don’t have either of those, be aware that you’re going to be on your own for performance troubleshooting.

Q: How frequently have firmware upgrades come out in the last 6 months, and what’s the upgrade process like?  Insist on specifics from their technical documentation, not general sales and marketing info.  If they’re putting out firmware upgrades every few weeks, and if they recommend these upgrades in order to get support, and if you’re required to take all attached servers down in order to perform the upgrade, that’s a problem.  If, on the other hand, they recommend performing the upgrades online, administration will be much easier.  Insist on the specifics, though, because I’ve seen SAN vendors say, “Well, you could do the upgrade online…but we don’t recommend that.”

Q: Can I talk privately to a similarly-sized customer?  Ask for a nearby reference that you can have a private conversation with – without the storage vendor’s staff around.  Try to get someone technical from the reference company, not a C-level executive, because you want someone who’s working with this thing every day.  Ask about the company’s experience with performance, capacity, reliability, ease of implementation, and quality of support.  Do it in person, not over the phone or over email, because you’ll be more likely to get the off-the-record truth.  Ask them, “If you were going to do it all over again, would you buy this product – or if you were going to buy something else, what would it be?”

After you’ve compiled everyone’s answers together, you should be able to narrow down the field to a few strong contenders.  From there, ask each vendor to provide access to a system similar to the one you’d be purchasing.  If you’re spending enough money, either you can access the system remotely, or they’ll loan you a demo set of gear to install in your datacenter.  Asking the right questions ahead of time ensures that you’ll spend the least amount of time kicking the tires of gear that just won’t perform.

Previous Post
How Fast Does Your SAN Need to Be for SQL Server?
Next Post
Hash Partitioning, SQL Server, and Scaling Writes

14 Comments. Leave new

  • Very timely post. I just happen to have a meeting with a storage vendor next week. Thanks Brent!

    Reply
  • David Forck
    April 5, 2012 3:39 pm

    Aaaaaaaaaaaand… forwarded this to my Sys. Admin (he also does the san stuff).

    Reply
  • Nice! After last week’s SQL Connections presentation on SAN for DBAs I had a better understanding and now am sending this article to the team setting up my current project’s servers and SANs in the data center. It will be interesting to hear the responses, if any from the data center.

    Reply
  • Brandon Leach
    April 6, 2012 10:49 am

    Great post and thanks for the advice the other day on twitter. We’ve set up a little task force and will be meeting soon with our vendor to discuss SQL Aware snapshots.

    Reply
  • Hello Sir,

    I don’t know whether this is a right place to ask you question or not. I have one small question regarding disk IOPS. it would be great help if I get suggestions from you.

    I am querying sys.Dm_io_virtual_file_stats to get information. I used to read information from this DMV every 5 seconds and later on, aggregate it.

    Here is the data I got:

    Sample_ms: 98463062
    Num_of_read: 5727829
    Time: 7/21/2012 3:54:08 AM

    after a minute or so, I again query the same DMV and the number I have received it something like this:

    Sample_ms: 98530718
    Num_of_read: 5733363
    Time: 7/21/2012 3:55:16 AM

    So, I am calculating it something like this:

    ReadPerMs=(num_of_read2-Num_of_Read1)/(Sample_ms2-Sample_ms1)

    read/sec=ReadPerMs*1000 to convert read from MS to Second.

    so based on above formula, figures would something like this:

    ReadPerMs=(5733363-5727829)/(98530718-98463062)
    ReadPerMs=0.081796

    Read/Sec=0.081796*1000
    Read/Sec=81.79615

    Is this right way to find read/sec? because the figure I saw here was in development server and from our live server in business hours I used to get, sometime, 3000 to 4000 read/sec based on above calculation. we have RAID-5 with three disk of 10000 RPM so it should handle more than approx 300 IOPS than why my calculation is showing numbers like 3000 or more?

    I am really stuck here in this calculation and highly need your help.

    Thanks,

    Reply
    • Hi, RA. Rather than building your own, check out this 5-minute IO script from Microsoft’s Mark House:

      https://blogs.msdn.com/b/mhouse/archive/2012/02/14/5-minute-io-script.aspx?Redirected=true

      That’ll probably be easier to interpret, and it’s already tested.

      Reply
      • Hello Brent,

        Thank you very much for your very prompt response, I have checked the script of Mark and found matching number with the script I have developed for our environment.

        I am still confused why SQL Server is giving high number of read then the disk space? we are using following hard disk which can’t support more then 125 read/sec whereas I sometime get near about 1300 read/sec. how is it possible?

        “146G, Serial Attached Scsi, 10K RPM, 2.5, Seagate Firefly”

        Reply
        • RA – do you have cache on your storage controller?

          Reply
        • Hello Brent,

          Let me give you some more information about the disk we have.

          Raid Controller: RC1
          Throughput: 6GBit
          RAID Set: RAID-5
          RAID Capacity: 272.25 GB
          Windows: Disk 1
          Disk: 0.0.2, 0.0.3, 1.0.4
          Disk Capticity: all three of the above disk has 136.12 each.

          Stripe Element Size: 64
          RPM: 10000

          Reply
          • RA – for me to get the answer, I would need to know more about the RAID controller. I don’t know what RC1 is.

          • BTW, we have 16MB cache as “On-Board Cache”.

          • OK, most RAID controllers have more than 16MB cache. I think you’re talking about the cache on the drives rather than the RAID controller.

            Let’s take a different approach – can you work with your system administrator on this? I think it’s going to require more back-and-forth than we’ve got time to do here in blog comments.

          • ok. sure, I will work with system admin. BTW, big thanks for spending your valuable time on my issue.

  • we have two RAID controller on the server and we have named it “RC1” and “RC2”. SQL Server database is configured on “RC1” controller. these are not SAN. it is DAS.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.