
Solid state drives (SSDs) have gotten a lot of press because they can be dramatically faster than magnetic hard drives.  They use flash memory instead of spinning magnetic platters.  Since any bit of memory can be accessed at any time without moving a hard drive’s head around, SSD random access is insanely fast.

Fusion-IO ioDrive

How fast are SSDs?  So fast that good ones overwhelm the capacity of the connection between the server and the drive.  The SATA bus maxes out at around 300 MB/s, and a good SSD can saturate that connection.  In order to get your money’s worth out of an SSD, you have to connect it with something faster than SATA.  Value is especially important given the pricing of solid state drives – more on that in a minute.
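
As a back-of-the-envelope check on that 300 MB/s figure, here is a minimal Python sketch. It assumes the 3 Gbit/s generation of SATA and 8b/10b line encoding (10 bits on the wire for every 8 bits of data); neither detail is spelled out above, and the function name is made up for illustration.

```python
# Rough payload ceiling of a serial link that uses 8b/10b encoding.
# ASSUMPTIONS (not stated in the post): a 3 Gbit/s SATA link and 8b/10b encoding.

def payload_mb_per_sec(line_rate_gbit: float) -> float:
    """Usable megabytes per second for a given line rate in Gbit/s."""
    bits_per_sec = line_rate_gbit * 1_000_000_000
    data_bits_per_sec = bits_per_sec * 8 / 10   # strip the 8b/10b overhead
    return data_bits_per_sec / 8 / 1_000_000    # bits -> bytes -> megabytes

sata_3gbit = payload_mb_per_sec(3.0)
print(f"3 Gbit/s SATA ceiling: {sata_3gbit:.0f} MB/s")
```

Real drives land somewhat below this ceiling because of protocol overhead, which is why "around 300 MB/s" is the practical limit.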

Fusion-IO ioDrives get around this limitation because they’re not SATA drives; they plug directly into your server’s much faster PCI Express bus.  These cards can push several times more data per second than SATA drives can.  Other products using this approach include OCZ Z-Drives and RamSan.  Of course, this connection method only pays off when the drive uses top-notch memory chips, and after briefly testing some of Fusion-IO’s products in their lab, I can vouch that they’re using the good stuff.

How Fast Are Fusion-IO Drives?

As a former SAN administrator, I’m anal retentive about reliability and uptime.  I’ve heard FusionIO drives sold as a “SAN in your hand,” but with a single drive, there’s not enough failure protection for my personal tastes.  I wouldn’t run any storage device without redundancy, so I ran most of my tests in a RAID 1 configuration – a mirrored pair of Fusion-IO SSDs.  Keep in mind that since these devices have their own built-in controllers, any RAID setup must be a software RAID setup managed by Windows.  Software RAID has a bad reputation, but it’s the only choice available when working with these drives.  I was initially worried about the performance impact of software RAID, but I didn’t have anything to worry about.

I tested several different ioDrive models using my SQLIO scripts as seen on SQLServerPedia and got blazing results.  Here’s a fairly typical set of results from a pass doing random reads:

C:\SQLIO>sqlio -kR -t2 -s120 -o8 -frandom -b64 -BH -LS P:\SQLIO\TestFile1.dat
sqlio v1.5.SG
using system counter for latency timings, 2929716 counts per second
2 threads reading for 120 secs from file P:\SQLIO\TestFile1.dat
using 64KB random IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 10240 MB for file: P:\SQLIO\TestFile1.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 22787.54
MBs/sec:  1424.22
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 0
Max_Latency(ms): 106
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 100  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
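
The two throughput lines in that output are tied together by the block size: IOs per second times the 64KB block size gives megabytes per second. A minimal Python sketch of that arithmetic follows; the function name is illustrative, and it assumes SQLIO counts a megabyte as 1024 KB, which is consistent with the numbers shown.

```python
# Cross-check SQLIO's MBs/sec line from its IOs/sec line and the -b block size.
# ASSUMPTION: 1 MB = 1024 KB, which matches the reported figures.

def mb_per_sec(ios_per_sec: float, block_kb: int) -> float:
    """Throughput in MB/s for a given IO rate and block size in KB."""
    return ios_per_sec * block_kb / 1024

read_mb = mb_per_sec(22787.54, 64)   # values from the random read pass above
print(f"{read_mb:.2f} MB/s")
```

The result agrees with the 1424.22 MBs/sec that SQLIO reported, which is a handy sanity check when comparing runs that use different block sizes.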

I know what you’re saying, though.  “Wow, Brent, that’s faster than any SAN I’ve ever seen before, but solid state drives are slower for writes, right?”  Yes, here’s a set of results for writes:

C:\SQLIO>sqlio -kW -t2 -s120 -o1 -frandom -b64 -BH -LS P:\SQLIO\TestFile1.dat
sqlio v1.5.SG
using system counter for latency timings, 2929716 counts per second
2 threads writing for 120 secs to file P:\SQLIO\TestFile1.dat
using 64KB random IOs
enabling multiple I/Os per thread with 1 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 10240 MB for file: P:\SQLIO\TestFile1.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 10114.24
MBs/sec:   632.14
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 0
Max_Latency(ms): 54
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 100  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
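
When you run a lot of passes, reading these summaries by eye gets old quickly. Here is a minimal Python sketch that pulls the summary metrics out of SQLIO output shaped like the text above. The sample text is copied from the write pass; the function and variable names are hypothetical, and the pattern only handles the "name: number" lines shown here.

```python
import re

# Summary section copied from the write pass above.
SAMPLE = """\
throughput metrics:
IOs/sec: 10114.24
MBs/sec:   632.14
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 0
Max_Latency(ms): 54
"""

# Matches lines of the form "name: number"; section headers and the
# histogram rows do not match and are skipped.
METRIC_LINE = re.compile(r"^([A-Za-z_/]+(?:\(ms\))?):\s*([0-9.]+)\s*$")

def parse_sqlio_summary(text: str) -> dict:
    """Return the summary metrics as a {name: value} dictionary."""
    metrics = {}
    for line in text.splitlines():
        match = METRIC_LINE.match(line.strip())
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics

write_run = parse_sqlio_summary(SAMPLE)
print(write_run)
```

Collecting each pass into a dictionary like this makes it easy to dump the results into a spreadsheet and compare configurations side by side.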

Write performance was about half of read performance, but that wasn’t because Fusion-IO drives were significantly slower at writing.  In a mirror, every write has to land on both drives, so write throughput stays at single-drive speed: the drives managed the same 500-600 MBs/sec write performance when I used one drive alone (rather than RAID).  When I put them in a RAID 0 (striped) configuration, I frequently got over 1,000 MBs/sec.  To put things in perspective, the fastest solid state drives in Anandtech’s recent showdown only achieved 337 MBs/sec in their absolute best case scenarios.

The only way to outperform a Fusion-IO drive is to invest six figures in a SAN and hire a really sharp SAN admin. These drives consistently outperformed every storage system I’ve ever seen short of SANs wired up with true active-active multipathing software, because typical 4Gb fiber connections can’t sustain this kind of throughput.  Random access, sequential access, lots of threads, few threads, reads, writes, you name it – once we had them configured properly, I couldn’t get them to perform any slower than 550 MBs/sec, which is faster than you can drive 4Gb fiber.  Connecting directly to the PCI Express bus really pays off here, and makes everything simple.
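
To put the multipathing point in numbers, here is a rough Python sketch of how many 4Gb fiber paths it would take to carry the throughput figures above. It assumes roughly 400 MB/s of usable bandwidth per 4Gb path and perfect load balancing across active-active paths; both are simplifying assumptions, not measurements from these tests.

```python
import math

# ASSUMPTION: about 400 MB/s usable per 4Gb Fibre Channel path (nominal figure).
FC_4GBIT_MB_PER_SEC = 400

def paths_needed(target_mb_per_sec: float,
                 per_path_mb_per_sec: float = FC_4GBIT_MB_PER_SEC) -> int:
    """Minimum number of perfectly balanced paths to carry the target load."""
    return math.ceil(target_mb_per_sec / per_path_mb_per_sec)

# Throughput figures quoted in this post.
for label, mb in [("slowest run", 550),
                  ("random writes", 632.14),
                  ("random reads", 1424.22)]:
    print(f"{label}: {mb} MB/s needs {paths_needed(mb)} path(s)")
```

Even the slowest run needs more than one path under these assumptions, and real multipathing rarely balances perfectly, so the practical path count would be higher.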

Configuring SQL Server on a SAN is hard.  To really wring the best performance out of it, you have to design the pathing, the queue lengths on the host bus adapters, the RAID arrays, the SAN caching, the database filegroups and files, and sometimes SQL Server table partitioning in order to get everything to work in concert.  Fusion-IO drives return SQL Server configuration to its easiest: plug it in, put data and logs on the same array, and just create databases with the GUI, using one data file and one log file.  As long as your hardware environment is solid, it’s pretty darned hard to screw up a database configuration on Fusion-IO drives.

The Drawbacks of PCI Express Drives

Connecting data storage devices as PCI Express cards isn’t all unicorns and rainbows. Since they’re PCI Express cards, they can’t be used in blades unless you’ve got special PCI Express expansion blades.

These cards are connected directly to one server, which means they can’t be used in clustering environments.  If your data is mission-critical, you’re probably using a cluster to protect yourself from server failures.  Protecting from failures is even more important when you’ve got a single point of failure – like, well, a single PCI Express card with all your data.  If something should happen to go wrong with one of these devices, you can’t simply fail over to another server unless you’re doing synchronous database mirroring, something I rarely see in production.

You can get some form of protection by using software RAID 1: building a simple mirror with Windows between two Fusion-IO drives in the same server.  Whenever data is written to the volume, Windows will automatically write it to both Fusion-IO drives.  Software RAID gets a bad rap, but in my brief testing, I saw no performance penalty when using this configuration.

However, when a drive fails, you probably won’t be hot-swapping these while the server is in operation.  With server-class RAID arrays, you can pull a failed hard drive out of a server and replace it on the fly.  The RAID controller will rebuild the array while the server is still online.  Data access speeds will be slower while the array is rebuilt, but at least the server can stay up the entire time without an outage.  Not so with PCI Express cards: the server will have to be pulled out of the rack and opened up in order to access the drives.  This requires careful cabling – something I don’t see often in datacenters.

And Yes, You Need to Be Paranoid

During my testing, before Fusion-IO ironed out all of the configuration issues, I kept having drives drop offline.  Normally I’d blame my own stupidity, but my tests were run in FusionIO’s datacenter, on their servers, configured by their staff.  I connected via remote desktop, set up SQLIO per my SQLIO tutorial at SQLServerPedia, and ran the tests.  During my tests, ioDrives appeared to have failed and FusionIO staff had to replace them.  It took several weeks for us to narrow down several unfortunate problems.

If you truly try to push your IO subsystems to the limit, a Fusion-IO subsystem will expose more weaknesses than other storage subsystems because it has so much more throughput.  Some of the problems included motherboard issues, driver problems, OS configuration errors, and even insufficient power supplies that couldn’t handle the load of multiple drives.

Buyers need to be aware that this is a version 1 product with version 1 best practices and documentation.  When you put something like this into your infrastructure, make sure you’re actually adding reliability.  In my post about adding reliability to your infrastructure, I pointed out:

“The only way a twin-engine plane is more reliable is if just one of the two engines is enough to power the airplane safely. If the airplane requires both engines in order to maneuver and land, then the second engine didn’t add reliability: it just added complexity, expense and maintenance woes.”
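
The twin-engine argument maps directly onto a two-drive array, and the arithmetic is short enough to sketch in Python. This assumes each device fails independently with the same probability over some period; the 1% figure is purely hypothetical, and the function names are made up for illustration.

```python
# Two identical devices, each failing independently with probability p.
# ASSUMPTION: independent failures; p = 0.01 is a made-up illustrative value.

def failure_if_both_required(p: float) -> float:
    """Striped (RAID 0) case: losing either device takes the array down."""
    return 1 - (1 - p) ** 2

def failure_if_one_suffices(p: float) -> float:
    """Mirrored (RAID 1) case: the array is lost only if both devices fail."""
    return p ** 2

p = 0.01
print(f"single device:       {p:.4f}")
print(f"both required:       {failure_if_both_required(p):.4f}")
print(f"one device suffices: {failure_if_one_suffices(p):.4f}")
```

The independence assumption is the weak spot. Anything the two devices share, such as a power supply, a driver, or a motherboard, can take both out at once, and then the mirror adds far less protection than this math suggests.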

The ironic part about my FusionIO testing woes was that they only happened in RAID scenarios.  The drives were actually more reliable without RAID – when I added RAID, I could knock a drive offline in minutes.  The faster we wanted to go, the more careful the team had to be with other pieces of the infrastructure.

FusionIO drives solve a real problem, and they can deliver staggering performance, but just like any other new technology, you should test them thoroughly in your own environment before deploying them in production.  Make sure to test them in the exact configuration you plan to deploy – if you’re going to deploy them in a RAID configuration, test them that way, rather than testing individual drives and assuming they’ll hold up in RAID configs.  In the case of Fusion-IO drives, you should probably even test using similar power supplies to production in order to improve your odds.
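
One way to cover a range of access patterns is to generate the SQLIO command lines rather than typing them by hand. The Python sketch below builds a small test matrix using the same flags as the commands shown earlier. The thread and queue-depth values are hypothetical, and `-fsequential` is assumed to be the sequential counterpart of the `-frandom` flag used above, so check it against your SQLIO documentation before relying on it.

```python
from itertools import product

def sqlio_commands(test_file: str, seconds: int = 120, block_kb: int = 64):
    """Yield one sqlio command line per combination in the test matrix."""
    operations = ["R", "W"]               # -kR reads, -kW writes (as shown earlier)
    patterns = ["random", "sequential"]   # ASSUMPTION: -fsequential is accepted
    threads = [2, 8]                      # hypothetical thread counts
    outstanding = [1, 8]                  # outstanding IOs per thread
    for op, pattern, t, o in product(operations, patterns, threads, outstanding):
        yield (f"sqlio -k{op} -t{t} -s{seconds} -o{o} "
               f"-f{pattern} -b{block_kb} -BH -LS {test_file}")

commands = list(sqlio_commands(r"P:\SQLIO\TestFile1.dat"))
for command in commands:
    print(command)
```

Run the same matrix against the exact configuration you plan to deploy, RAID included, and run it long enough to catch problems that only show up under sustained load.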

Where I’d Use Fusion-IO Drives in Database Servers

If you’re experiencing heavy load problems in TempDB, and if you’re not using a cluster, a Fusion-IO drive can probably solve the problem with less engineering effort than any other solution.  Simply shut down the server, drop in an ioDrive, change SQL Server’s TempDB location to point to the ioDrive, and start SQL Server up again.  Your TempDB IO won’t travel over the same storage paths that your database needs, which frees up more bandwidth for your data and log traffic.  TempDB requests stay on the PCI Express bus and don’t hit your SAN.
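
For the "change SQL Server’s TempDB location" step, the usual approach is an ALTER DATABASE ... MODIFY FILE statement per TempDB file. Here is a minimal Python sketch that builds those statements as text. It assumes the default logical file names `tempdev` and `templog` and a hypothetical `T:\TempDB` folder on the ioDrive; check your own file names in sys.master_files first.

```python
def tempdb_move_statements(target_dir: str) -> list:
    """Build the T-SQL that points TempDB's files at a new folder."""
    # ASSUMPTION: default logical names; an instance with extra TempDB
    # data files needs one statement per file.
    files = [("tempdev", "tempdb.mdf"), ("templog", "templog.ldf")]
    target_dir = target_dir.rstrip("\\")
    return [
        f"ALTER DATABASE tempdb MODIFY FILE "
        f"(NAME = {logical}, FILENAME = '{target_dir}\\{physical}');"
        for logical, physical in files
    ]

statements = tempdb_move_statements(r"T:\TempDB")  # hypothetical ioDrive path
for statement in statements:
    print(statement)
```

The statements run against the live instance, but the new location only takes effect the next time the SQL Server service starts, so the folder has to exist and be writable by the service account before the restart.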

If you’ve got power, cooling, and space constraints in your datacenter, but you need to add more storage performance (but not necessarily capacity), a Fusion-IO drive makes sense.  You’ll gain more storage throughput this way than by adding several shelves of bulky, hot hard drives that require lots of power.  On a cost-per-gigabyte basis, this won’t make economic sense, but if you’re buying storage for performance needs, the cost equation is different.

If you need to scale an OLTP database that doesn’t require high availability, you might consider skipping clustering and trying database mirroring instead.  Use two SQL Servers in the same datacenter, both equipped with Fusion-IO drives, and make sure they’ve got plenty of bandwidth between them to keep up with synchronous (or asynchronous) mirroring.  You could argue that this provides a higher availability than clustering, since it uses two different storage back ends.  I’ve had SANs fail, and I can see how this might be attractive in some environments.  StackOverflow strikes me as an excellent candidate – the databases are small enough to fit on Fusion-IO’s drives, and the servers are rack-mounted as opposed to blades.

I’m also intrigued by the ioDrive’s potential to offload transaction log load.  I can envision a scenario where databases are in the full recovery model, but the load of logging and transaction log backups is starting to put a strain on the server’s IO.  Moving the transaction logs onto the Fusion-IO drive eases the load on the SAN (not just the hard drives, but the cabling between the servers and the SAN controller).

I wish I had SQL Server benchmarks for these scenarios to share with you, but the testing process ended up taking several weeks, and I ran out of time.  Thankfully, Paul Randal is blogging about his experiences with Fusion-IO drives.

My verdict: these drives can solve some tough challenges. I’m not saying that because I’m being paid to, either; Fusion-IO was gracious enough to give me access to their labs for my testing, but I didn’t get compensated.  Quite the opposite – I sank a lot of time into this project.  Folks who follow me on Twitter may remember that I struggled with Fusion-IO during the initial stages of the testing as we went through one hardware failure after another.  After the problems we encountered and the weeks of investigation, I’m glad to finally be able to say without hesitation that you should check out FusionIO’s products.  Their throughput may push the rest of your infrastructure to its limits, but hey – that can be a good problem to have!

This post is a part of T-SQL Tuesday, which is focusing on storage/IO this week.  If you liked this post, head over there to see what else is happening in the blogosphere this week!

  1. I’ve also got an article on SQL Server Central that should be up in a week or so.

  2. Your review leads me to see some great options for read-only partitions as well, provided they are small enough to fit on the drives.

    The HASSUG is talking with Fusion-IO today. This review will help me formulate some great questions for them today.

  3. Nice write-up, Brent. I might disagree about the prevalence of synchronous database mirroring. We are using it at NewsGator with good results, and I know of several much bigger customers who are using it as well.

    • Glenn – I’m interested to hear more about your thoughts there. Out of all your SQL Server databases, what percentage use synchronous mirroring?

      My experience has been something like 5% of customers (max) use synchronous mirroring, and out of those, they only use it on 5% (max) of their databases, giving us an overall number of .25% (max) market penetration for synchronous mirroring. That’s not 25%, that’s one quarter of one percent. It’s great in specific circumstances, but it’s something I would call rare to see in production.

      • We have done both synchronous and asynchronous mirroring at NewsGator. We are now using synchronous mirroring for 100% of our production databases. These databases are all in the same data center, with nearly identical I/O capacity on both sides.

        We like it because of how easy it makes rolling upgrades (with a 5-10 second failover), and because it gives us two copies of the data, above and beyond our backups.

        One thing to keep in mind is that SQL Server 2005/2008 Standard Edition only supports synchronous mirroring, so that might mean that more people are using synchronous mirroring than you think.

    • Glenn, I am interested to hear more about how you guys use Sync. I have seen lots of Async use but never seen a production system that uses synchronous.

      • I’ve seen it used for truly mission-critical, customer-facing databases that involve revenue-producing applications. Run a cluster for high availability, then do synchronous mirroring to another server in the same datacenter (or extremely nearby). This buys you nearly transparent protection from SAN failures, SQL patches, OS reboots, etc.

        You still need another solution for disaster recovery like log shipping or SAN replication to a disaster recovery datacenter, though.

  4. If only my company could afford these things! Sheesh. Great review, though.

  5. Great review Brent! I am looking forward to Paul Randal’s findings as well and comparing them to the SQLIO Numbers I got in my tests.

    I also recently helped a customer set up mirroring on 2008 for failing over from one data center to another. One data center recently failed, so they were able to test the solution, and it worked well for them.

    pat

  6. Question – in your testing did you ever sustain a test where one or both of the drives filled up? What happened then?

    We used to run tests and then reformat the drive for the next test – bad idea.

    We note now that performance drops when the drives need to do “recovery” under write pressure – that seems well understood.

    Curious how that works in a SW RAID scenario where one drive might be 3/4 full and the other new. How sustainable is the performance when one of the drives has to perform “grooming” while the other is still clean – does performance drop to the level of the slowest drive?

    • I reformatted the drives between every test, actually, but I just used the quick format for NTFS.

      I’m not sure what you mean about a SW RAID scenario where one drive has more data than the others. Can you elaborate on that? I’m trying to come up with a mental scenario where one drive has more data than others in RAID, but I don’t see how that could happen. RAID fills the drives proportionally, unless of course you’re using one drive that’s bigger than the others, and that’s a bad idea.

  7. I am the Principal Solutions Architect at Fusion-io and would like to start by thanking Brent for posting such an objective review of our technology. This is indeed very useful to the SQL Server community. I am also grateful to Brent for the patience he showed us while testing our product. Ultimately, we were able to make his tests work by replacing an unstable server.

    I would like to take this opportunity to list out a few options that customers have with respect to providing High Availability on SQL Server Databases that are running on Fusion-io technology:

    1. As some of you have mentioned, mirroring is a viable way to achieve high availability. In fact, Wine.com (www.wine.com) has its entire database on Fusion-io drives and uses mirroring in synchronous mode to deliver high availability. I was the VP of IT/DBA at Wine.com at that time and therefore have first-hand experience of their pain, the Fusion-io implementation, and the benefits achieved. Here is the case study that anyone can reference:
    http://community.fusionio.com/cfs-filesystemfile.ashx/__key/CommunityServer.Components.PostAttachments/00.00.00.01.81/Wine_5F00_Dot_5F00_Com_5F00_Case_5F00_Study.pdf

    2. With Windows 2008 R2, there is an option of setting up multi-site clustering that does not require shared storage. This does require third-party software to keep the data in sync between the two servers, but it does allow a SQL Server/Fusion-io solution to deliver high availability.

    3. The Gridscale appliance from xkoto.com delivers a true SQL Server Load Balancer solution, where a number of SQL Server nodes (where all nodes have direct attached storage) are covered by a load-balancer. If one node fails, the business continues as usual.

    In conclusion, I think that with the advent of new technologies like Fusion-io, we are bound to see new ways of accomplishing the good old objectives. The bottom-line is that both OLTP and OLAP environments are completely ripe for Fusion-io transfusion towards highly performing and scalable architectures.

    Thanks.

    sumeet@fusionio.com

    • Hi, Sumeet. Thanks for responding. I want to clarify a couple of things for readers.

      2. When you say third party software, you mean solutions like Doubletake to handle the SQL Server file synchronization. I just want to make sure that’s clear to the readers.

      3. I haven’t used Gridscale myself, but I’ve heard through other users that there are suggested limitations around the read/write ratios – for example, it may perform best with a database that does 90% reads. I’m not sure if that’s true with their current version – I’d like to hear from anyone who’s used it if that’s still the case. I’ve also heard that not all stored procedures or functions are compatible with Gridscale.

      Thanks again for the comment though!

  8. Hi Brent,

    You are absolutely correct about the utility of Doubletake software in this solution. There is also a software product called DataKeeper from Steel Eye Technologies that can be used for this purpose.

    I don’t have any customers to reference for Gridscale, but I saw a demo of it at the Boston MTC and it definitely looks promising. The way it works is that the load balancer app executes the same SQL statements on all the nodes. Therefore, there are some cases (especially when there is a high incidence of dynamic stored procedures) where Gridscale is not an option. They do have a great analyzer that can scan your database and then tell you whether it will be a good fit or not. In terms of the read/write breakdown, Gridscale combined with the Fusion-io technology will result in a system that can accept more or less any kind of workload.

    Thanks.

    sumeet@fusionio.com

  9. Was this tested on Windows 2008 R2 or on Windows 2003? The only reason I’m asking is that I’ve read the TRIM command makes SSDs a lot faster in Windows 7/2008 R2. Does the Fusion-IO drive support or depend on TRIM?

  10. Writeup was a good read, appreciate your efforts.

    I have been following Fusion’s product from quite literally across the hall, as I work for a company that used to be in the same office building as Fusion. I’m a big fan of theirs, and having looked over the performance numbers, I expected they would revolutionize the data market. After reading this it’s clear that their product, while still awesome, has quite a few challenges in front of it. Nevertheless, I would still like to get my hands on one of their cards :)

    Given the incredible amount of data throughput these drives manage I wonder if it would be possible to customize a mobo to use an allocation of the drives themselves as system memory? It would be pretty nifty to add additional memory to a box in roughly the same manner used to add virtual memory.

  11. Pingback: T-SQL Tuesday #4 - IO, IO It's Off To Disk We Go | SQL Server Blog - StraightPath Solutions

  12. Pingback: In-Memory Databases | Kevin E. Kline

  13. We tested the card for longer than 120 seconds. After 48 hours of our stress test under SQL Server, the write latency was poor. We talked to the support team and still saw poor write performance over time. Our database doesn’t run for just 120 seconds, so we returned the cards.

    • Bill – actually, only *each* test I ran lasted 120 seconds, but I ran hundreds of tests in a row, lasting over a day. It just got to the point where I knew exactly which 120-second test would break the card.

  14. “The only way to outperform a Fusion-IO drive is to invest six figures in a SAN and hire a really sharp SAN admin.”

    Really? We must be undercharging, then. We deploy half-rack HA SAN nodes with 11TB of usable ZFS storage; they get 45K IOPS out and 10K IOPS in, maintain 10Gb links to up to six nodes in a VM cluster, and come installed with a full year of 9-5/5 support for under 70K.

    If only we had known that people like spending so much money!

    • Well, I’ll give you a hint – sarcasm usually doesn’t work well in marketing. ;-) But it is indeed a sure way to bias someone against you and your product. Good luck with it, though – let me know how that approach works out for you.

  15. Do you have any recommendations for setting up the software RAID?

    • David – no, I haven’t done enough of them to establish any best practices, but I’m sure Fusion-IO has. They’ve been very supportive of the SQL Server community, and I bet they can hook you up with someone who’s done it a lot.

  16. I’ll look into that, thanks …

    I just threw an OCZ Revo in a client’s server with spectacular results (dropped from 40ms avg wait times to 0.6ms with all files on one drive!)

    Kudos on the idea though, poor man’s SAN lmao!

    *** not trying to plug anything : ) wish client could afford fusions

  17. Pingback: What to do about Storage? - blog.serverfault.com

  18. Brent,

    Do you think block-level replication (like DRBD) between physically discrete servers would be viable, each having 10GbE and an ioDrive?

    I can imagine this constellation of technology providing a relatively inexpensive yet redundant and fast system. Reads would, in theory, be just as quick…but writes? I really wouldn’t know.

    Stu

    • Stu – that wouldn’t be something that Microsoft would support, and that throws it out the window for me. Anything I implement has to be supported by Microsoft – that’s just my own personal standards. That solution might be interesting, but if we had to fail over to the other server and we didn’t have consistent files, the support call would be over pretty quickly when I described the environment.

      • Brent, Ah, OK…was not aware of your MS focus, which is fair enough. (Is there not a similar block-level replication technology for MS platforms? I would not know.)

        The thing about block-level replication is that your files *should* be consistent, as I understand it. At least with the way I have DRBD configured, the write call does not return until the second machine completes the write. Hence my mention of the slower writes above.

        Anyway, thanks for taking the time to respond. It’s edumacational for me.

        Stu (a coder with administrator tendencies)

        • There are block-level replication solutions from third parties, but I haven’t had good results with those, to put it mildly. There are great SAN-level block replication tools, but those are in another price league altogether.

  19. Pingback: Our Storage Decision - blog.serverfault.com

  20. How many Fusion-IO drives do you need to get maximal performance and redundancy on a SQL 2008 R2 SP1 server with synchronous replication to a secondary server?
    And do I install the whole SQL database and binaries on the Fusion-IO drives, and how do I split tempdb, logs, and data when I have a SAN and some Fusion-IO cards?
    Are there any white papers on installing SQL Server on this type of card? Do we take SLC or MLC?

    Thanks
    K

    • Hi, Koen. There’s a ton of questions I’d have to ask to get you the right answer for your particular environment. If you’d like to engage with us for consulting help, feel free to email us at help@brentozar.com and we can get you started on the right track.

  21. Hi Brent

    Thanks for the very informative article.

    I have a question on the following situation:
    For the transaction logs, which exhibit heavy sequential writes, have you heard of using Fusion-IO with RAID 5 for the TLOGS?

    I have found out that the SAN admin configured Fusion-IO with RAID 5 for the placement of the TLOGS and TEMPDB. I would think it’s a bad idea for the TLOGS due to the write penalty of the parity calcs for RAID 5.
    My colleague states it’s not an issue because more discs are involved in striping, and says they have tested it.

  22. Pingback: Our Storage Decision - Server Fault Blog

  23. any updates on this?
    have you played w/ the fusionIO appliance?
    Thanks!!!
    jo

  24. Great write-up! Any particular consumer-side config to make the Fusion-io card perform well? With another vendor’s card, increasing page size from 4k to 64k and disabling the power-limiting features made for huge performance gains. And, similar to your tests… it took weeks to figure that out.
    I really like the idea of on-board PCI-e flash for SQL Server. For the workloads I work with, using the onboard flash to create a write-through (or write-around, or tier 0 depending on vendor terms and implementation) cache in the server can dramatically increase total throughput as well as driving latency down. After saturating fibre channel at 1.6 gigabytes/sec with a workload, the same workload on the same server and SAN storage with pci-e flash and tier 0 cache (from a vendor other than Fusion-io) total data throughput went to approximately 4 gigabytes/second with nice low average latency.
    Putting tempdb on the card was good for us because we wanted not to send large amounts of query spill to the SAN.
    Of course if the whole thing fits in onboard flash – that’s the best of all worlds. But onboard PCIe can work great to accelerate SAN performance when the enterprise features of a SAN are needed.
    Other options to consider in addition to Fusion-io + their software for a tier 0 on-board cache: EMC xtremsw, Intel 910 + Intel CAS, QLogic FabricCare (two linked-slot PCI-e device combining flash and FC HBA), and there are probably others out there as well.

    • Hi, Lonny. Thanks, glad you liked the post. No, I don’t have anything else I can share here, but glad to hear you’re having good experience with the gear. Consider writing your own post to share your experiences – this type of post is usually very popular with readers. Enjoy!

  25. Any recent tests with these? I’ve heard they have improved.
