Years ago, when I was frustrated with my SQL Server’s performance, the SAN admin kept saying it was a SQL Server problem. When the SAN admin quit, I took his job – and discovered the ugly truth. In the years since, I’ve done a lot of projects with SQL Server and VMware on SAN storage, and I know how to find out right away if it’s really a SAN problem. Learn how to use disk testing tools in this session:
Along the way, we discuss:
- My Blitz: SQL Server Takeover Script – this T-SQL script helps you rapidly assess the health of your servers. I’ve added new backup throughput sections to show your backup throughput over time.
- Glenn Berry’s DMV Queries – uses SQL Server wait stats to prove if the server’s waiting on storage. Also check out Glenn’s book on SQL Server hardware.
- Microsoft SQLCAT disk latency recommendations – Top OLTP Issues and OLTP Best Practices. Yes, these these posts are old, and if your SAN guy complains about that, respond with, “Oh, are our disks not supposed to be as fast as they were in 2006?”
- CrystalDiskMark – the easy button of storage testing. It doesn’t mimic SQL Server at all, but that’s okay, because remember that we’re proving it’s a SAN problem.
- SQLIO – nowhere near as easy as CrystalDiskMark, but gives you much more fine-grained control. I still don’t really try to mimic SQL Server IO patterns exactly, but just use this to get a sanity check after the quick results from CrystalDiskMark.
- Tom’s Hardware Disk Performance Charts – to get quick ballpark numbers to compare your storage throughput. Yes, these are only single hard drives, but disturbingly, you’ll often find that poorly configured storage isn’t all that much faster than a single enterprise hard drive.
- Wes Brown’s Fundamentals of SQL Server IO – great educational series that goes into detail with lots of links.
- My SAN Best Practices – a collection of my posts about SQL Server on shared storage.
Wes Brown August 31, 2011 | 8:23 am
Thanks again Brent for the mention! I wasn’t sure how much you would be able to do in 30(45) minutes but I was impressed. Simple, concise and relevant. Did I learn something yesterday? Yep, that you don’t have to get mired in the weeds to prove you have a storage problem.
Brent Ozar August 31, 2011 | 8:24 am
Thanks man, glad you liked it! It definitely is tough to cram all that info into a short presentation. I originally started with 30 minutes of material, and I kept going back and adding Just One More Slide….
Thirster42 September 1, 2011 | 1:48 pm
During a part of the video you mention show notes and some stuff by Bill about sys.dm_os_buffer_descriptors, i’m not seeing it. Am i blind?
Brent Ozar September 1, 2011 | 1:48 pm
Thirster – not Bill, but Glenn, actually. It’s the Glenn Berry link.
sqlglenn September 1, 2011 | 10:59 pm
Trying to find the script you mention, but I can’t seem to get there by using brentozar.com/go/blitz
Suggestions?
Brent Ozar September 2, 2011 | 7:03 am
Click harder maybe?
Here’s an alternate URL: http://www.brentozar.com/sql/blitz-minute-sql-server-takeovers/
Pingback: Something for the Weekend – SQL Server Links 02/09/11
David Bobko December 26, 2012 | 4:33 pm
Brent,
When running SQLIO against a clustered sql server with a net app san, I am seeing some strange behavior. When running the following command sqlio -kW -t2 -s120 -dG -o1 -fsequential -b64 -BH -LS -Fparam.txt for o (1,2,4,8,16,32,64,128) I am seeing the IOPS and MBSec speed relatively low until I reach the 64 and 128 outstanding IO? Any reason why this may happen?
This drive is the drive carved out for my transaction logs.
Brent Ozar December 27, 2012 | 8:41 am
Hi, David. Unfortunately this is beyond something I can troubleshoot in a blog comment. If you’re interested in bringing me in to help, shoot me an email at help@brentozar.com. Thanks!