Getting Started with Diskspd

Diskspd, or diskspd.exe, is Microsoft’s replacement for SQLIO. While I’m not going to replace our SQLIO tutorial, you can use the information here to replicate the same type of tests on your systems and get the information you need. During Dell DBA Days, Doug and I used diskspd as one of our techniques for getting a baseline of raw performance. We wanted to get an idea of how fast the servers and storage were before running SQL Server specific tests.

How do I get diskspd?

You can download diskspd directly from Microsoft – Diskspd, a Robust Storage Testing Tool, Now Publically Available. That page has a download link as well as a sample command.

The upside is that diskspd is a fully self-contained download. You don’t need Java, .NET, or anything else installed to run it. Apart from Windows – you’ll still need Windows.

How do I use diskspd?

With the command line, of course!

In all seriousness, although diskspd is the engine behind CrystalDiskMark, it stands on its own as a separate tool. Download the archive and unzip it to an appropriate folder. There are going to be three sub-folders:

  • amd64fre – this is what you need if you have a 64-bit SQL Server
  • armfre – ARM builds; not what you want for SQL Server
  • x86fre – 32-bit builds, for 32-bit Windows

I took the diskspd.exe file from the appropriate folder and dumped it in C:\diskspd so I could easily re-run the command. Let’s fire up a command prompt and try it out.

Here’s a sample that we ran: diskspd.exe -b2M -d60 -o32 -h -L -t56 -W -w0 O:\temp\test.dat > output.txt

Breaking it down:

  • -b2M – Use a 2 MB I/O size. For this test, we wanted to simulate SQL Server read-ahead performance.
  • -d60 – Run for 60 seconds. I’m lazy and don’t like to wait.
  • -o32 – 32 outstanding I/O requests per thread, per target – in other words, a queue depth of 32.
  • -h – This disables both hardware and software buffering. SQL Server does the same thing, so we want to match it.
  • -L – Grab disk latency numbers. You know, just because.
  • -t56 – Use 56 threads per file. We only have one file, but we have 56 cores.
  • -W – Warm up the workload for 5 seconds.
  • -w0 – No writes, just reads. We’re pretending this is a data warehouse.
  • O:\temp\test.dat – our sample file. You could create a sample file (or files) by running diskspd with the -c<size> flag – see the example after this list.
  • > output.txt – I used output redirection to send the output to a file instead of my screen.
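
If the test file doesn’t exist yet, diskspd can create it for you as part of the run. Here’s a sketch of the same test with file creation added – the 10 GB size and the O: path are assumptions to adjust for your environment:

diskspd.exe -c10G -b2M -d60 -o32 -h -L -t56 -W -w0 O:\temp\test.dat > output.txt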

How do I read diskspd results?

You’re going to get a lot of information back from this command. You’re going to want to close the window and back away quickly. Don’t. This is good stuff, I promise.

The first thing you’ll see is a recap of the command line you used. Then you’ll immediately see a plain-English summary of the parameters diskspd ran with.

That’s a lot easier than trying to read a set of command line flags. Six months from now, I can review older runs of diskspd and understand the options that I used. diskspd is already winning over SQLIO.

Next up, you’ll see a summary of CPU information. This will help you understand whether your storage test is CPU bottlenecked – if you know the storage has more throughput or IOPS capability, but your tests won’t go faster, you should check for CPU bottlenecks. The last line of this section (and every section) provides an average across all CPUs/threads/whatevers.

After the CPU roundup, you’ll see a total I/O roundup – this includes both reads and writes.

Look at all of those bytes!

If the I/O numbers initially seem small, remember that the data is split up per worker thread. Scroll down to the bottom of each section (total, reads, writes) and look at the total line – it sums up the overall volume of data you moved. The I/Os are recorded in whatever unit of measure you supplied; in our case, they’re 2 MB I/Os.

Important sidebar: Your storage vendor probably records their I/O numbers with a smaller I/O size, so make sure you do some rough translation if you want to compare your numbers to the advertised numbers. For more discussion, visit IOPS are a scam.
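
To make that translation concrete: one of our 2 MB I/Os moves as much data as 256 of a vendor’s 8 KB I/Os (2,048 KB ÷ 8 KB = 256). So if this test reports 1,000 IOPS, that’s the same throughput as 256,000 advertised 8 KB IOPS – numbers invented purely for illustration.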

Finally, latency! Everybody wants to know about latency – this is part of what the end users are complaining about when they say “SQL Server is slow, fix it!”

This table shows the min, max, and a variety of percentiles for how the storage performed while you were beating on it. This information is just as helpful as the raw throughput data – under load, your storage may have increased latencies. It’s important to know how the storage will behave and respond under load.

How often should I use diskspd?

Ideally, you should use diskspd whenever you’re setting up new storage or a new server. In addition, you should take the time to use diskspd when you make big changes to storage – use diskspd to verify that your changes are actually an improvement. No, diskspd doesn’t include the work that SQL Server does, but it does show you how your storage can perform. Use it to make sure you’re getting the performance you’ve been promised by your storage vendor.
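
If you’re baselining new storage, it can help to sweep several I/O sizes in one pass. A rough sketch from a command prompt – double the percent signs (%%b) if you put it in a batch file, and treat the paths, sizes, and thread count as placeholders to tune for your hardware:

for %b in (8K 64K 2M) do diskspd.exe -b%b -d60 -o32 -h -L -t8 -W -w0 -c10G O:\temp\test.dat > results-%b.txt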


Comments

  • “-t56 – Use 56 threads per file. We only have one file, but we have 56 cores.”

    Does this mean that the number of threads to use in the testing depends on the number of processors in the machine?

  • Hi Jeremiah,
    It’s interesting to get new tools to test performance; we’ve been playing with SQLIO on our new data warehouse server and have had some good results. What we can’t figure out is how to test multiple disks at once across multiple LUNs.
    For instance, if we want to spread our SQL data files over 7 disks, is it possible to emulate this and get a combined throughput using the diskspd tool?
    We’ve been looking at tools recently and, as yet, haven’t found anything suitable for our requirements.

    • Great question – you can just list multiple files when you write the command. For example, you could run diskspd -b2M -d60 -o32 -h -L -t56 -W -w0 file1.dat file2.dat file3.dat > output.txt

      • Fantastic, thanks.
        I’ll give it a crack once I figure out the file access errors I’m seeing 🙂

        • The -t parameter is per target, whereas the -F parameter is a fixed total. So you want to be careful in how you use the -t parameter with multiple targets, or you could wind up with more threads than you intended.

          Example:
          -t56 target1 target2 target3 will create 3 sets of 56 threads; 168 threads total.
          -F56 target1 target2 target3 will create 56 threads total that test all three targets at the same time.

          So if you’re testing 8 targets simultaneously, how you specify the thread count could make the difference between 56 threads sharing 8 targets or 448 threads attacking in dedicated groups of 56.
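
          Applied to the multiple-file example above, a fixed-total variant might look like this (a sketch, reusing the earlier file names):

          diskspd -b2M -d60 -o32 -h -L -F56 -W -w0 file1.dat file2.dat file3.dat > output.txt

          That’s 56 threads total, shared across all three targets, instead of 168.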

    • How is this better than SQLIO and are there any graphical interpretation tools like the SQLAnalyzer?

      Thanks for all the articles JP!

      • George Walkey
        June 26, 2017 7:44 am

        It’s not better. There are many I/O testing tools: SQLIO, perfmon, this thing, Process Explorer (Delta I/O), Iometer.
        diskspd’s numbers are often wrong by a factor of 10 compared to the others. Unusable.
        Then there is Resource Monitor; they all give different numbers for measuring I/O.
        Process Explorer / SQLIO numbers are correct and easy to prove:
        SQLIO moves a fixed-size file around – start and stop your watch.
        Pretty easy to do the math and see that diskspd gets it wrong.

        Then you find yourself asking which tool is broken. This one is.
        You find yourself testing the tools, not the I/O subsystem.

  • Jeremiah,
    Nice description of how diskspd works. I am looking for ways to determine the ‘simulated workload’ for existing servers. Most customer servers I see are multi-use, so CPU and I/O patterns are all over the place. I want to go to the Wintel and storage admins with this tool and a plan for testing the systems, to make sure the hardware is performing as required before I tune SQL. I am considering using the plan cache to look for CPU threading through parallelism, and sys.dm_io_virtual_file_stats for I/O patterns. How do you determine your test parameters for both new and existing servers?

    • I ask the users what kind of patterns they need to see and if they can repeat those I/O patterns on a test system. If they can, I monitor that I/O pattern and build tests to mimic that pattern. If they can’t repeat the I/O pattern, then I devise multiple artificial stress tests that hit various axes of performance.
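
      For example, two artificial patterns along those lines – a sketch, where the block sizes, read/write mixes, and paths are assumptions you’d tune against what you actually observe:

      diskspd.exe -b8K -d120 -o8 -t8 -h -r -w30 -L O:\temp\test.dat > oltp.txt
      diskspd.exe -b512K -d120 -o4 -t8 -h -w0 -L O:\temp\test.dat > scan.txt

      The first approximates a random OLTP-style mix with 30% writes; the second, a sequential, scan-heavy read.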

  • Hello Jeremiah,

    Thanks for the write-up! I didn’t know that this tool is replacing SQLIO. Just wanted to know if you have loaded the results into a database table for analysis and, if so, is there a lot of cleanup to do?

    Thanks,

    Rudy

  • Thanks Jeremiah

    I’ve been using CrystalDiskMark for so long, and I didn’t know that little detail.

    Kinda unrelated, but could you also talk sometime about Iometer, and which you find to be more accurate?

  • Hi Jeremiah

    Thanks for an excellent post. I’m trying to simulate a typical data warehouse load on SQL Server 2014, but I’m unsure about which parameters to use.

    I’m thinking:
    -b64
    -d60
    -o8
    -h
    -L
    -t8
    -W
    -w0

    Do you have an opinion if these parameters can be used to simulate a “typical” workload in a datawarehouse?

  • Hi Jeremiah
    Nice write-up… I have a few questions.

    1. If I have multiple drives/mount points presented to a SQL environment, and data/log files will be spread across these mount points, how do I run concurrent tests? Can you share a sample script?

    2. Can we run test scripts concurrently?
    e.g.
    one script pointing to a data file with random reads/writes (30/70), and
    another script pointing to a log file with sequential writes

  • Can I set a timeout value while running the Diskspd tool?

  • Super great thread. Thanks!

    Question: When I am running tests with diskspd, why am I seeing different values in the bytes column? Here is what I mean:

    diskspd -d15 -F1 -w0 -r -b32k -o10 C:\mytestfile.dat

    Total IO
    thread | bytes | I/Os | MB/s | I/O per s | file
    ——————————————————————————
    0 | 1413742592 | 43144 | 1776.97 | 56863.00 | C:\myDelete\testfile.dat (5120KB)
    ——————————————————————————
    total: 1413742592 | 43144 | 1776.97 | 56863.00

    vs

    diskspd -d15 -F1 -w0 -r -b64k -o10 C:\mytestfile.dat

    Total IO
    thread | bytes | I/Os | MB/s | I/O per s | file
    ——————————————————————————
    0 | 35270623232 | 538187 | 2242.34 | 35877.48 | C:\mytestfile.dat (5120KB)
    ——————————————————————————
    total: 35270623232 | 538187 | 2242.34 | 35877.48

    Why am I seeing a difference in bytes between the two runs?

    • Andrew – let’s think through that. You’re saying when you run it with two different sets of parameters, it does two different things. So what are those different parameters that you’re using, and how might those impact the amount of IO performed?

  • Hi Brent, thanks for your reply. I have the commands shown above, but I’ll put them here again:

    First run:
    diskspd -d15 -F1 -w0 -r -b32k -o10 C:\mytestfile.dat

    Second run:
    diskspd -d15 -F1 -w0 -r -b64k -o10 C:\mytestfile.dat

    The only difference is the block size I am using. All other parameters are the same, and I’m targeting the same 5MB file in each case.

    My question is, during the first run it shows 1413742592 bytes, but in the second run it shows 35270623232 bytes. If I am using the same file, then why is the value in the bytes column different?

    I guess I don’t know what the bytes column represents, and that is why I am questioning it.

    • Hmm, so if you make the same number of requests, and you use a larger block size, I wonder if that would impact the amount of IO that you’re doing?

    • Or to put it another way – if I hand you a book with 1,000 pages,
      and I ask you ten times to get me a 5-page section,
      and then I ask you ten times to get me a 500-page section,
      would you read a different amount of pages during those two tests?

      • Thanks again Brent. That is one heck of an analogy. Here is the crux of my question: I was thinking that the number of bytes in that column represents the amount of data moved. If I am moving the same amount of data in each case, wouldn’t the number of bytes be the same in each case?

        My apologies if this is an ignorant question.

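        For what it’s worth, the bytes column is just I/O count multiplied by block size, which the two runs above confirm:

        43,144 I/Os × 32,768 bytes = 1,413,742,592 bytes
        538,187 I/Os × 65,536 bytes = 35,270,623,232 bytes

        The runs completed different numbers of I/Os at different block sizes, so the byte totals differ.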
  • Is 2MB really an indicative byte size for a SQL Server workload even in a DW scenario?

  • Great tool, if only I could get it working!

    Is there something obvious I’m missing with using this tool? I get “error opening file” no matter what target I specify on the command line. Am I supposed to generate a file beforehand?

    • Hi Jim – I was having the same problem – then discovered the -c<size>[K|M|G|b] option (which generates the test file for you).
      Not sure if the option was recently added, but the example in the doco at 2.1 will not work without it.

      Try:
      diskspd -d15 -F1 -w0 -r -b4k -o10 -c50M testfile.dat

    • Hi,
      you have to use the -c option in order to create the test file:
      e.g.: diskspd.exe -b2M -d60 -o32 -h -L -t56 -W -w0 -c1G E:\temp\2iotest.dat > output.txt

  • JAMES YOUKHANIS
    August 10, 2016 8:23 am

    What parameter should I pass if I want to test a specific drive?

  • Creighton Simmons
    December 2, 2016 10:52 am

    Hey Brent!!!!!!!!!!!!!!!!!

    Hey Man, now… what about that AvgLat column?? It’s over 1000. Does that mean 1000ms+ average latency?

    If so, I would think that would tell you just about all you need to know. That the disk subsystem you were working with is S.L.O.W.

    Now, I don’t have a clue what that really means, but if it does mean 1000+ ms Avg Latency, then the tool did its job and you can go into the server room and start replacing that SAN with 5.25″ floppies.

    Thanks for your contributions to the community!

    • Creighton – for starters, the post was written by Jeremiah, not me. Reading puts the fun in fundamental!

      Second, no, average latency of over a second doesn’t actually tell you all you need to know. You may be dealing with an over-saturated HBA, for example.

      Enjoy the journey!

      • Creighton Simmons
        December 5, 2016 12:56 pm

        To Whom it may Concern:
        Well, sure… it doesn’t tell you the reason for the latency, but you know you have a problem, and I think that’s the point of the tool. So now you know you have a big problem: troubleshoot it – or, if you’re a DBA like most people looking at this stuff, get the network/storage administrator involved if you see that kind of number for AvgLat. The article itself states: “Finally, latency! Everybody wants to know about latency – this is part of what the end users are complaining about when they say ‘SQL Server is slow, fix it!’” – so you have this column called “Average Latency,” but you don’t mention it. You can find out how to run the command in a lot of places; a good interpretation of the results is what people are looking for. I guess “Look at all those bytes” is the takeaway from this post.

        Do we have to follow the same format????!!!!!!!!

    • Have a deeper think about what this might mean.

      Block size is 2M, queue depth is 32, and thread count is 56.

      Per the diskspd documentation on github (see https://github.com/Microsoft/diskspd/blob/master/DiskSpd_Documentation.pdf ), the queue depth is per-target per-thread.

      So, once the queues for all threads have filled up, a new request has to wait behind about 31 operations in its own thread’s queue plus 31.5 operations in each of the other 55 threads’ queues. That’s 31 + (31.5 × 55) = 1,763.5 ops, or about 3,527 MB at 2 MB apiece. Now look at the throughput: 2,669 MB/s. So we’d expect latency of about 3,527 ÷ 2,669 ≈ 1.32 seconds – and it actually narrowly beat that mark, which tells me the tool isn’t quite able to keep all the queues full all the time.

      So depending on what was expected of the disk subsystem – 2.7 GB/sec isn’t paltry; it might well be within expectations for a test rig – that latency could very easily be purely an artifact of the test parameters.

  • What would be optimal settings to see if the drive is suitable for tempdb with 8 files 8GB each?

    • Check your current TempDB workloads, and then see if you can maintain those same IOPS & latency numbers on the new array.

      Without a baseline, it’s really impossible to tell if a drive is “suitable” – suitability is based on your workloads. Some servers just don’t use TempDB at all, and some really hammer it.
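
      If you do want a synthetic stand-in while you gather that baseline, something like this mimics the 8-files-of-8-GB layout (a sketch – the 50% write mix, block size, and T: paths are assumptions, not a measured TempDB profile):

      diskspd.exe -b8K -d60 -o8 -F8 -h -r -w50 -L -c8G T:\t1.dat T:\t2.dat T:\t3.dat T:\t4.dat T:\t5.dat T:\t6.dat T:\t7.dat T:\t8.dat > output.txt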

  • When I try to capture the results for single and multiple files using DiskSpd, I get these error messages.

    For single file, here is the error message:
    c:\diskspd>diskspd -b8K -d30 -o4 -t8 -h -r -w25 -L -Z1G -c20G c:\iotest.dat > DiskSpeedResults_01.txt
    Error opening file: ûb8K [2]
    Error opening file: ûw25 [2]
    Error opening file: ûZ1G [2]
    Error opening file: ût8 [2]
    Error opening file: ûh [2]
    Error opening file: ûL [2]
    Error opening file: ûd30 [2]
    Error opening file: ûr [2]
    The file is too small or there has been an error during getting file size
    Error opening file: ûc20G [2]
    Error opening file: ûo4 [2]
    There has been an error during threads execution
    Error generating I/O requests

    For multiple files, here is the error message:
    c:\diskspd>diskspd -b8K -d60 -o4 -t8 -h -r -w25 -L -Z1G -c20G Test1.dat Test2.dat Test3.dat Test4.dat Test5.dat > DiskSpeedResults_01162018
    .txt
    Error opening file: ûb8K [2]
    Error opening file: ûo4 [2]
    Error opening file: ûh [2]
    Error opening file: ûw25 [2]
    Error opening file: ûL [2]
    Error opening file: ûr [2]
    Error opening file: ût8 [2]
    Error opening file: ûZ1G [2]
    Error opening file: ûd60 [2]
    Error opening file: ûc100G [2]
    Error opening file: Test1.dat [2]
    Error opening file: Test2.dat [2]
    Error opening file: Test3.dat [2]
    Error opening file: Test4.dat [2]
    Error opening file: Test5.dat [2]
    There has been an error during threads execution
    Error generating I/O requests

    What am I doing wrong?

    Thanks in advance.
    Yoh!

  • Should my MS SQL Server be running while I am testing with diskspd?

  • For those just getting started with this: in the download zip are two Word documents that cover everything you need to know. One of the docs is specific to SQL Server and includes many sample scripts for different workloads. I recommend familiarizing yourself with them.

  • TechnoCaveman
    March 15, 2019 7:10 am

    (as said on TV) Thank you.
    BTW, the Microsoft docs *really* need updating. I’m trying to educate my boss and some app dev folks, but the docs say “SQL 2005” or 2012. Even if the documentation has not changed, it’s better to see SQL 2016 or 2019.

  • Can we run diskspd for 12 hours?

    • Kasthuri – it’s not really a burn-in test tool.

      • Diskspd can be used as a “burn-in” tool if your SAN policy starts your LUN(s) down in lower-tiered 7200 RPM storage and, through use, the array’s auto-tiering policy will move your data into higher-tier flash drives. And yes, I know this could be alleviated through better SAN management, but this is not in my team’s control! It would be a perfectly appropriate, underhanded DBA way to use the tool for long periods of time – to force your precious data to a higher tier of flash storage on the array. Especially since in some shops there can be a bit of a knowledge gap on some infrastructure teams as it relates to database systems and the need for higher-speed storage for certain production systems. It took me a month to get my infrastructure team to realize that they could force the array to move my set of 6 x 4 TB LUNs up to the highest tier of flash storage, after claiming it was *NOT* possible. These are not dumb folks – just the opposite – maybe a bit naive when it comes to database needs!
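
        If you do go down that road, note that -d takes seconds, so a 12-hour run would look something like this (a sketch – every flag except -d is a placeholder for your own test): diskspd.exe -b8K -d43200 -o8 -t8 -h -r -w30 O:\temp\test.dat > output.txt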

  • S. Hamilton
    May 23, 2019 11:03 am

    Hi, I have been struggling with understanding my results and wondering if they are expected. I am getting AvgLat of 40 ms with these switches: -b64k -r1M -w100 -o8 -F20 -h -L -d600 -c50G. However, when I use -b8K instead, I get AvgLat of under 1 ms. Is the high AvgLat expected? I have run it against two SSD drives in a mirror, and also against 8 SAS 15K drives in a RAID 10.

  • Luis Agustin Azario
    November 3, 2019 5:05 am

    I have a comparison of two servers, each one at a different client. We are having performance issues with the first server; we migrated to a new VM and got only about 50% over 422.17, approx. 600.

    Server 1:

    Total IO

    thread | bytes | I/Os | MiB/s | I/O per s | file
    ——————————————————————————
    0 | 276692992 | 4222 | 26.39 | 422.17 | testfile1.dat (1024MiB)
    1 | 308412416 | 4706 | 29.41 | 470.56 | testfile2.dat (1024MiB)
    ——————————————————————————
    total: 585105408 | 8928 | 55.80 | 892.73

    Server 2:

    Total IO
    thread | bytes | I/Os | MiB/s | I/O per s | file
    ——————————————————————————
    0 | 21538865152 | 328657 | 2054.12 | 32865.97 | testfile1.dat (1024MiB)
    1 | 21165637632 | 322962 | 2018.53 | 32296.47 | testfile2.dat (1024MiB)
    ——————————————————————————
    total: 42704502784 | 651619 | 4072.65 | 65162.44

  • BTW, for anyone getting those multiple “Error opening file” errors:
    DISKSPD only accepts ASCII characters. If you copy and paste a command from the web, or pass it to DISKSPD as an argument from an application, the dashes can come through as Unicode – hence the error. As some have mentioned, retype the hyphens, or use char(45).
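
    That’s exactly what the “Error opening file: ûb8K” messages earlier in this thread show: a dash pasted from the web arrives as a non-ASCII character, so diskspd treats each “flag” as a file name instead of an option. Retyping a plain hyphen-minus in front of each flag fixes it.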

