|
|||||
SQL Server Perfmon (Performance Monitor) Best PracticesDatabase administrators need to know each SQL Server’s weakest link so we know where to focus their hardware budgets and time. On any given server, we need to know what the slowest component is, and where it’ll pay off to invest hours and dollars. This is performance tuning, and the first place to start is by using Performance Monitor. Performance Monitor, or Perfmon, measures performance statistics on a regular interval, and saves those stats in a file. The database administrator picks the time interval, file format, and which statistics are monitored. After the stats are gathered over a period of time (hours or days), we can do analysis by opening the results in Excel and setting up some basic formulas. Perfmon isn’t just for SQL Server tuning: system administrators use it to monitor performance on Windows itself, Exchange, file & print servers, and anything else that can run into bottlenecks. As a result, it’s easy to find Perfmon information online, but it’s not always specific to SQL Server. Since every application has its own set of statistics, it helps to get SQL-related tips. Perfmon Tutorial VideoHere’s a short video explaining what I’m about to go over. You can either watch the video or read the text – they both explain the same concepts, although the text goes into more detail.
Get the Flash Player to see the wordTube Media Player.
Setting Up Perfmon for SQL Server TuningDatabase administrators should have an always-on workstation for things like performance logs and traces. If you don’t have one, read my article on setting up a DBA workstation, and then come back here. On your always-on workstation, go into Control Panel, Administrative Tools, Performance. (If you’re using Windows Vista, that location will be different, but I won’t cover Vista here since SQL 2005 SP2 isn’t even out yet, so you shouldn’t be using Vista as your always-on workstation.) The first thing that comes up is a 1990’s-looking line chart with a few basic performance stats on your workstation itself. Interesting, but not what we’re after. On the left side, expand Performance Logs and Alerts, and click on Counter Logs. Counter Logs let us pick a set of performance counters and log them to a file periodically. One counter log is listed by default, but we’re going to add a new one. Right-click on Counter Logs and click New Log Settings. Name it with the name of your database server, because each server should have its own log settings. We could theoretically build a single counter log for all of our database servers, but then anytime we want to run the counter log against one of our servers to test its performance, it will log data for ALL servers, and that’s not usually how we want to do performance tuning. After typing in a name, we can start setting up the counters we want to log. The Basics of SQL Server Perfmon CountersClick the Add Counters button, and change the computer name to point to the SQL Server name instead of your workstation’s name. (If you have multiple SQL Server instances on the machine, don’t worry about it, that doesn’t apply here.) After typing in the server name, hit Tab, and the workstation will pause for a few moments. It’s gathering the list of performance objects available on that server. Each server will have different lists of performance objects depending on what software is installed on that server: for example, SQL Server 2005 offers a different set of counters than SQL Server 2000. In the Performance object dropdown, choose the “Memory” object. The list of counters will change. Select the “Pages/sec” counter, and click the Add button. It will seem like nothing happened, but try clicking the Add button again. You’ll get an error saying that the counter was already added. It’s not exactly an elegant user interface, but it works. Technically. Now scroll up to the “Available MBytes” counter, highlight it, and click Add. Those are the only two memory counters we’re going to monitor for now. In the Performance object dropdown, choose the “Processor” object, and in the counters list, highlight the “% Processor Time” counter. Notice that in the right hand window, we now have more instances to choose from. We can track the % Processor Time statistic for each individual processor, or for all of them combined (_Total). Personally, I like to highlight each individual processor, and then click Add. I don’t find the _Total statistic useful because it’s simply an added number combined from all of the individual processors. On a 4-CPU box (single cores, no hyperthreading), that means our _Total statistic can be from 0 to 400. However, 100 can mean two different things: it could mean that each processor is running 25% utilization, or it could mean that one processor is pegged at 100% while the rest are twiddling their thumbs waiting for work. It could also mean any other number of combinations, like two processors at 50%. Therefore, the _Total number usually gives me more questions than answers, so I don’t bother logging it. Highlight all of the processor instances except _Total, and click the Add button. In the Performance object dropdown, choose Physical Disk, and choose the “% Disk Time” counter. Notice that again on the right side window, we get multiple instances; this time, we get one per physical disk. In performance terms, physical disk means one disk shown in Computer Management’s Disk Management tool. One physical disk may have multiple partitions, each with its own drive letter, but for performance tuning, we want to know how hard that one physical disk is working. This one “physical disk” may have a bunch of actual physical drives, like in RAID systems. However, Windows isn’t quite smart enough to know exactly how many drives are in the RAID array, so the term “physical disk” is a little misleading here. Highlight all of the physical disks in the instance list (again, leave off the _Total instance) and click the Add button. Then, in the counter list, choose the “Avg. Disk Queue Length” counter and add it too. The Best SQL Server Performance Monitor Counters to AnalyzeNow that you’ve got the hang of adding counters, here’s the full list we need to add, including the ones mentioned above:
After adding those, click the Close button, and we’re back to the counter log setup window. Under “Sample data every”, the default should be 15 seconds, and that’s fine. In the “Run As” entry, type in your domain username in the form domainnameusername, and click the Set Password button to save your password. This lets the Perfmon service gather statistics using your domain permissions – otherwise, it tries to use its own credentials, and they probably won’t work on the remote server. Click on the Log Files tab, and change the log file type to Text File (Comma delimited). This lets us import it into Excel easier. Click on the Configure button, and set the file path. I save mine in a file share called PerformanceLogs so that I can access it remotely via a mapped drive, and so that I can share it with other users who want to know how their server is performing. Click OK, and click OK again, and you’ll be back at the main Performance screen listing the counter logs. If all went well, your counter log is now running. To make sure, right-click on your new counter log’s name, and the Start option should be grayed out. If it’s not, most likely there was a permissions problem with the username & password you used. Click Properties, and try setting the username & password again. Let Perfmon run for a day or two to gather a good baseline of the server’s activity. It’s not that invasive on the SQL Server being monitored, and the in-depth results will pay off. The more data we have, the better job we can do on analyzing the Perfmon results. Turning SQL Server Perfmon Logs Into Pretty Excel SpreadsheetsAfter Perfmon has run for a day or two, go back in and right-click on the counter log and click Stop. That will cease monitoring of the database server. Open the results CSV in Excel. It’s going to look ugly at first: ![]() Perfmon Counters In Excel This is the data from Perfmon. Each column represents one of the metrics, and each row represents one time period. It’s hard to see that with the way it’s currently formatted, but we’ll solve that. Time to put some lipstick on this pig. We’re going to:
Excel 2007 pros can figure out how to do this, but for the folks who don’t spend too much time in spreadsheets, I’ll explain how: Getting Rid of the Half-Empty Row and Cell A Then delete the contents of cell A1, which we don’t need. Format the Dates in Column 1 Click on cell A2 and look at the formula bar, and you’ll see that it’s a date – it’s just that Excel picks the wrong format to display it. Highlight all of column A (by clicking on the “A” button at the top of the column). Right-click on that A button and click Format Cells. Pick a Date format that includes both the date and the time, and click OK. At that point, the column will probably show all “#######” entries, which just means it’s not wide-enough. Double-click on the line between the A and B columns, and Excel will autosize the column. Format the Numbers In our spreadsheet, some of the numbers have tons of stuff after the decimal place and it’s hard to read. Highlight all of the columns except A (our dates), and right-click and Format Cells. Choose the Number format, zero decimals, and check the box for Use 1000 Separator. Some of our numbers will have valid data less than zero, but we’ll fix that later. Right now we’re just getting the basics cleaned up. At this point, your spreadsheet should look like this: ![]() Perfmon Results Formatted Now let’s solve that mystery row of headers. Cleaning Up the Headers In row #1, notice how all of the cells in my screenshot start with \MYSERV. That’s because the server is called MYSERVERNAME. (Well, not really, but I faked it for this demo.) Click on one of those cells, and you’ll see the full text that Perfmon saves in the header, like “\MYSERVERNAMEMemoryAvailable MBytes”. Do a find & replace by hitting Control-H. In the “Find What” box type your server name with two backslashes, like \MYSERVERNAME. In the “Replace with” box, erase everything. We want to just delete everywhere that says \MYSERVERNAME. Click Replace All, and Excel should tell you how many replacements it made. Now do the same for these strings, but be careful to note which ones have spaces:
Now you’ll be able to tell a little more about what’s in each header, but to make it better, change the font size to 8 for that header row. Then highlight all of the columns and make them wider – not terribly wide, but just wide enough that you can see most of the headers for easier analysis. It’ll end up looking like this: ![]() Perfmon Counter Names Now Visible Getting better. Before the next step, go ahead and save the spreadsheet in Excel format, because the calculations will be slow. Add Formulas Across the Top Here comes the secret sauce: we want formulas across the top to quickly summarize our findings. It’ll be easier to understand by looking at the final desired result first: ![]() Perfmon Results with Formulas In this particular example, don’t be fooled by the zeroes in columns C through E – that’s because this server didn’t actually have any activity for those metrics. Start by right-clicking on the Row 1 button at the far left of the screen and click Insert, and repeat that process seven times so that we’ve got seven empty rows at the top. Then type in the labels in cells A2-A6 to match my screen shot. In the B column, put in the formulas. In my typed examples below, I assume that your data goes from row 9 to row 100, but you’ll want to change that 100 number to wherever your last data row is.
Finally, the icing on the cake: hit Control-Home, then move your cursor to cell B9, which should be your first cell with data. Hit Alt-W-F-Enter, which should Freeze Panes. That way we can move around our spreadsheet while we still see the dates on the left and the formulas & headers across the top. Now, we’ve got a gorgeous spreadsheet that’s easier to roam around and analyze. Analyzing the Performance Monitor Results to Find SQL Server BottlenecksNow for the detective work! There are a lot of smart people out there who have some great ways of interpreting Perfmon results. My favorite is the Microsoft SQL Server Customer Advisory Team (SQLCAT), who published two fantastic blog posts that sum up what counters to look at, and what they mean:
So now that we’ve established how the smart people do it, I’ll tell you how I do it, ha ha ho ho. Look for Obvious CPU Bottlenecks First, look at the Processor Queue Length for CPU pressure. If this number is averaging 1 or higher (except during the SQL Server’s full backup window if you’re using backup compression), this means things are waiting on CPUs to become available. I’m suspicious when this number is over 1, because it often means that people have installed other software on the SQL Server such as applications or web sites. That’s a problem. If you get pushback from management saying that they don’t want to buy new servers, point out that two CPUs of SQL Server Enterprise licensing cost around $50-60k – which would pay for a separate server for the web app. If you can eliminate applications from the SQL Server, then you don’t have to use as much CPU power, and less CPUs mean less licensing costs. There are more in-depth Perfmon metrics that you can add to your capture if you see the Processor Queue Length showing up, but for junior DBAs, the first thing I would recommend is simply remote desktop into the SQL Server. Right-click on the taskbar, click Task Manager, and click on the Processes tab. Check the box that shows processes for all users, and then click on the CPU column to sort the CPU percentage from high to low. Sit and watch it for a minute or two. Which processes are using CPU power? If it’s SQLServer, then we need to do more research, but there’s a good chance it’s another application, and we need to get that app off this server. Generally speaking, enabling hyperthreading on a SQL Server is not going to fix that kind of a problem. Again, all of this is generally speaking, not specific to your environment (unless you email me with info about your environment). Look for Obvious Memory Problems Look at the % Usage metric, which monitors how much of the page file Windows is using. Generally speaking, you don’t want to see a SQL Server swapping memory out to disk. If this number is averaging 1% or more, then this server would benefit from more memory or setting SQL to use less memory. 1% doesn’t mean that money should be spent on more memory right away: in cost-sensitive shops, every server hits the page file sooner or later. It just points to a possible issue, and keep an eye on it. On the Memory – Available MBytes statistic, look for fluctuations of a couple hundred megabytes or more. If that’s happening, then either the SQL Server’s memory is being adjusted dynamically (probably a bad idea for performance) or users are actively logging into the SQL Server via remote desktop and running software. Correlate these fluctuations with disk activity: when available memory drops, is disk activity also increasing? Is this disk activity affecting the page file drive? If so, use this as a demonstration to people using remote desktop – show them that this is why people shouldn’t remote desktop into database servers. If the Memory – Available MBytes dips below 100mb, that’s an indication that the operating system may be getting starved for memory. Windows may be paging out your application to disk in order to keep some breathing room free for the OS. Make sure you have SQL Server’s service account set up with permissions to lock pages in memory. Look at Disk Metrics Last The reason we don’t look at disk performance metrics first is because memory problems can trigger disk problems. If a SQL Server doesn’t have enough memory, or if the SQL Server account doesn’t have the necessary permissions to lock pages in memory, then disk activity may be artificially high on the page file drive. As we examine the disk numbers, make a mental note of which drive has the page file – and there may be multiple page files as well. Also, find out if one physical drive array is shared by multiple drive letters. This is especially important for servers that boot from SAN: the boot drive may contain the page file, and that boot drive may be on a shared set of spindles with several drive letters. Heavy page file activity will slow down access to all of the affected drive letters in Windows. Another caveat: mount points. If the shop uses mount points, a drive letter may be broken up into several sets of physical drive arrays. If you don’t know what a mount point is, then you’re probably not using them, and you can ignore that advice. Recommended Book: SQL Server 2005 Performance TuningIf you want more help, I highly recommend this book. It goes in depth into all of SQL Server’s performance bottlenecks, explains how to troubleshoot which one is affecting your particular server, and how to remove that bottleneck. It’s a great book, and I’m helping to write a similar one for SQL Server 2008 with a couple of the authors of this book. Two thumbs up. Third-Party Tools For SQL Server Performance MonitoringThis whole thing probably sounds like a lot of time-consuming work. That points out one of the drawbacks of Perfmon: it only gives information when the DBA sits down to analyze the logs. If you don’t mind spending some money to save some time, then check out third-party tools that do SQL performance monitoring. They run at all times, constantly watching the performance of each server, and can give faster alerting when a problem starts to occur. I work for Quest Software, so of course my favorite performance monitoring tool would be Quest Spotlight on SQL Server. It has a fantastically easy-to-use and easy-to-comprehend dashboard showing problem areas of each server with red, yellow and green color-coding. What I really like, though, is that we’re building in analytical info from our partners at Solid Quality Mentors. SolidQ has an army of SQL Server consultants that see some of the toughest environments around, and they know what metrics to look at and what they can indicate. We’re bundling that into Spotlight so DBAs can draw upon this expertise at a glance and see immediately where the tough problems are hiding. More articles on SQL Server performance and Best PracticesHere’s a few of my other related articles on performance monitoring:
Want my help with your server?No problem! Follow the instructions in this article to gather performance metrics on your server, and then send me the CSV file. I’ll take a look at it, let you know what I think is going on with the server, and give you some ideas of where to look for problems. I can’t always turn these around quickly – sometimes it takes a week – but I’ll do my best. Feel free to contact me and I’ll give it a shot! 32 comments to SQL Server Perfmon (Performance Monitor) Best Practices |
|||||
|
Copyright © 2009 Brent Ozar – SQL Server DBA - All Rights Reserved |
|||||
“This one “physical disk†may have a bunch of actual physical drives, like in RAID systems. However, Windows isn’t quite smart enough to know exactly how many drives are in the RAID array, so the term “physical disk†is a little misleading here.
Highlight all of the physical disks in the instance list (again, leave off the _Total instance) and click the Add button. Then, in the counter list, choose the “Avg. Disk Queue Length†counter and add it too.”
This is a correct statement. However you should note that if you have multiple disks in an array from a SAN Windows will show high queuing numbers. If the SAN array has 5 disks and each disk is showing a queue of only one, then Perfmon will report it as 5. That can lead you to think you have high disk queuing (anything steadily above 2 is considered high) You have to divide the reported queue number by the number of disks in the array to get an accurate read.
Yeah, I’m going to cover that in the analysis section when I write that up in a few days. Watch out – I might make you write it, hahaha.
Thank you for your mention of Spotlight on SQL Server in this article. I am the Product Manager for Spotlight and I agree with your assessment that DBAs should be comfortable with all the tools at their disposal. We position our products to augment rather than replace native tools such as Perfmon. Spotlight can be configured either to run constantly in background or only in real-time when the Spotlight client is open.
awesome document, very helpful..
thx again…
JohnE
[...] across the blog of Brent Ozar. I read an excellent article he authored with instructions for using Perfmon, and he’s been a great source of information and direction for me since. He’s a SQL [...]
Brent, When trying to add SQLServer:Buffer Manager – Buffer cache hit ratio
SQLServer:Buffer Manager – Page life expectancy
SQLServer:General Statistics – User Connections
SQLServer:Memory Manager – Memory Grants Pending
I cannot find SQLServer as a choice. What am I missing?
Brent.
Found my issue.
I have Win 2003 Enterprise 64-bit
A guy on Microsoft community for SQL cluster gave me a tip
perfmon /32
Now everything shows up.
Thanks Brent,
It helped me a lot…
Hi:
Brent, Is there a more detailed document about the counters? I read your article and found it very interesting but you didn’t put many info about the others counters and I’m really needing the help!
Thanks anyway!
FJM
Jr DBA
Hi, Francisco! No, sorry, I don’t have a more detailed document online, but if you’d like to email me questions I’d be happy to answer. Thanks!
Thanks for this Brent, I’m consulting yours and Denny C’s blogs and so on more and more often; far more readable and interesting than the dry text-book like stuff we’ve been putting up with for so long.
Hi Brent,
Thanks for putting this magnificent article in the 1st place
.
1st of all, i’m really am a noob on SQL server, but i do hope to get to know how am i able to generate comparison charts, forecasting and so on using “Analyze (in the Cloud)”.
If you did posted for this, could you please reply a link for me?
thanks in advanced.
best regards,
Jack Chong
Hi, Jack. Unfortunately, you’ll be on your own to create comparison charts and forecasting – that would be an advanced-level tutorial, and I haven’t recorded one of those. Sorry!
Brent,
I couldn’t see the below counters in perfmon,
SQLServer:Buffer Manager
SQLServer:General Statistics
SQLServer:Memory Manager
Instead I could see the below counters,
SQL : Catalogs
SQL : FD
SQL : Services
Where can I see those counters. please help.
Also it is SQL server 2005 in Windows 2003. It is having two DB instances. How to monitor for each instance.
Pappitha – hmm, something doesn’t sound right. It doesn’t sound like you’ve got the full SQL Server installed on that machine (like maybe you’ve got MySQL or SQL Server Express Edition), or you may not have the permissions to view the Perfmon counters.
I have the admin rights to the machine. Please let me know whether we need to enable anything to monitor the SQL server perfmon counters.
I could monitor the OS level performance counters with the same login credentials,
Nope, that’s all you need!
Today is my first visit to this site and my experience is awesome.
I am always interested in finding threshold values for perfmon counters ( Asyou mentioned above for example Buffer cache hit ratio,% Usage metric, etc…).
Is there a complete list , which documents the threshold values for most of the counters that will be used in perfmon.
Hi Brent,
I was struggling to find what counters to use for zeroing in on my performance issues. THanks for your article, hopefully, I’ll be able to find something.
-N
Brent,
Thanks for the article. It has helped a lot!
Is there a place where Microsoft has published acceptable limits for these performance counters? I thought I had seen it at some point but I can’t seem to find it now.
Thanks,
Jonathan
Microsoft Press released a training book for the 70-444 exam called: Optimizing and Maintaining a Database Administration Solution Using SQL Server 2005. This book contains acceptable limits for various performance counters.
Adam, thanks for the tip. Do you know if they have one for SQL 2008? Thanks again!
Jonathan, I have not been able to find an equivalent SQL 2008 book. I hope when they release it that Microsoft uses Windows Server 2008 for the performance counters. The BI and Developer books just came out so probably in the next few months the DBA book should be released.
Adam, Thanks for following up with me. I will keep an eye out for the book too.
Thanks for the excellent article!
However, the first comment still leaves me a little confused. You say that your statement is correct, that Windows doesn’t know about how many physical drives may represent a logical drive, and that makes sense. Yet then you say the wait queues need to be divided by the number of spindles, even though one request is sent to the SAN, where the controller will divide it into multiple write operations. I don’t see why I would divide anything? Thanks again for your comprehensive articles.
Rod
I’ve used your steps to get a spreadsheet of metrics. The problem is with the formulas. Do I have to apply the same formulas for colums C, D, E, etc? Otherwise I dont see any ifo up top and the locking rows doesnt make much sense.
Aaron – yes, absolutely. Copy the formulas for C, D, E and the rest.
Rod – it’s probably a little more than I can illustrate here in a comment, but your best bet is to pick up the book SQL Server 2005 Performance Tuning, which does a good job of explaining that.
Brent,
I just found your blog – it is great and this is what I need.
Can you (or anybody else who is reading it) tell me which free tool is better for performance tuning: Perfmon or SQL 2005 Performance Dashboard? And when we run it, which one affects the performance least?
Thanks,
Alex
Alex – thanks! Both of those don’t affect performance at all. Perfmon will give you better server-related statistics (CPU % busy, disk metrics) but SQL Server Performance Dashboard will give better database-related statistics (locks, queries, transactions) with more detail. You should use both.
Thanks a lot for the quick reply. I’ve already started PerfMon following your recommendations. Do you think 3 hours of run will be enough to see the results? We have active-passive cluster on x64 platform with 16 processors, 48Gb RAM, running Win-2003 SQL-2005 SP2. Over the weekend our developers have added new data and new source code. Since that SQL server became pretty slow, it consumes about 40Gb memory (even if it was rebooted 2 days ago) and CPU is twice higher as it used to be. Any advice where I should look first?
Thanks,
Alex
Alex – if they deployed new code and the server slowed down, it would make sense to check out their code. You can use Profiler to watch the queries that are running to see why they’re slow.
Thank you !