The Law Of The Minimum

Monitoring
15 Comments

I like science fiction

Not enough to go to conventions, or memorize every line, but I have some opinions on things. Dune is probably my favorite series of stories in the genre. Yes, I think Paul would beat Luke in a fight. Sorry, Star Wars people. He’s just kind of a wimp.

Throw grenade, digress

One of my favorite quotes from Dune is from a fella named Kynes. It’s somewhere around page 136, and it goes something like this:

You are dealing, you see, with the Law of the Minimum… growth is limited by that necessity which is present in the least amount.

I love this line, because it’s practical and realistic. In a nutshell, there’s no such thing as infinite access to a finite resource. Keeping this in mind has helped me countless times, and you can apply it anywhere. Your career, your personal growth, heck, even your SQL Servers, if you’ve got nothing better to think about all day.

Since I started working here, I realized just how much time was the minimum resource before. Now my troubles are stuff like “my bar cart doesn’t have enough space for me to grow my scotch collection” and “learning French by going to French restaurants doesn’t work but is delicious”. I suppose the ‘moral of the story’ here is to make sure that your resources are being consumed by the right things. They should matter to you. If they don’t, it’s time to reevaluate.

But since you want to know about SQL

I can’t keep writing this obituary to the years of my life spent responding to work emails and problems into the wee hours. Plus, some of that was kind of fun. So without further ado, here’s a tolerably short list of signs your server is dealing with The Law Of The Minimum. I guess that makes it ‘the list of the minimum’, huh?

CPU

This one can be tough! If you’re not looking at some performance assessment tool that says your CPU is constantly at 100%, what else might clue you in? The first thing that comes to mind is THREADPOOL waits. That’s basically your SQL Server running out of worker threads to give to new tasks, and holy smokes is that a tough situation. You can hit this wait hard just by having enough concurrent queries running, especially if they’re going parallel and hanging out for a while.

Memory

This is fairly easy to spot, but not always easy to remedy. You may see waits on RESOURCE_SEMAPHORE or RESOURCE_SEMAPHORE_QUERY_COMPILE. The former means your queries can’t get memory to start up, and the latter means your queries can’t even get enough memory to get an execution plan. The worst part about it is that you could be constrained to a limited amount of memory by so many things. Old OS or hardware, Standard Edition, or a crowded VM host. And the list of things in SQL Server that use memory is outlandish. Buffer pool, various caches and DMVs, queries compiling, queries executing. And then there’s Windows. If Windows doesn’t get enough memory, more than SQL is in trouble.

Disk

This is related to memory, because if data isn’t in memory, it has to come from disk. You may see a lot of PAGEIOLATCH waits if you’re short on memory, but you’ll see really long waits if your storage is old and crappy. WRITELOG may also pile up here as well. More RAM can help reads, but it can’t help writes. For that, you need better disks. Your SAN admin may send disdainful emails about the response time of the disks your troublesome database files sit on. That’s when it’s time to start asking about how your server is connected to those disks. Perhaps that single 1Gb connection wasn’t the best idea.

Figuring it all out

Very few servers are waiting on nothing, sitting around bored. The ones that are often still get blamed for performance problems, when it’s the application that stinks. You can usually prove out either scenario, but you’re going to have to watch your server a bit. For this, you need something that measures wait stats, and aggregates them in some meaningful way. Whether it’s a free script or a monitoring tool is up to you.

After all, that’s your minimum.

Thanks for reading!

Previous Post
Update On Connect Item To Replace DBCC SHOW_STATISTICS
Next Post
[Video] Office Hours 2016/10/05 (With Transcriptions)

15 Comments. Leave new

  • Nice Post! “The spice must flow….”

    Reply
  • thanks for the post. Once I had a debate over threadpool causing 100% cpu. Glad to confirm that I was right about it. the developers had a crazy loop going over a query several times. fixing the loop saved the day.

    Reply
  • Hi Eric, nice post.
    By the way you are referring to the “Liebig’s Law of the Minimum”, something very common in agriculture.

    Reply
  • Loved this post. Thanks Erik.

    Reply
  • I always thought of “The law of the minimum” in terms of “your scalability is limited by your next bottleneck”.

    This blog post is exactly the thing I tried (but failed) to get across when I wrote this article. I think you hit the nail on the head here.

    (and Fremen are the better moisture farmers)

    Reply
    • I don’t think you failed at all. Your post is a scalpel (I think I emailed you about how cool I thought it was when it was published), and mine is a mound of spitballs on the ceiling.

      Reply
  • I think learning and using Performance Analysis of Logs (PAL) from Codeplex really helps to identify where the bottlenecks are located in addition to the wait stats. SOS_SCHEDULER_YEILD is one I look at for CPU.

    Reply
    • Is there anything specific about those waits you look at to decide if they’re a problem? What do you do to reduce them?

      I’ve never used PAL, but it looks interesting.

      Reply
      • I look at the relative weight of each wait stat. For instance, if SOS_SCHEDULER_YEILD is the waitstat with the most wait time, I think CPU contention and I start looking at threads, %user time per process (something other than SQL using CPU), %system time incase there is too much I/O going on (bad disk architecture), and then internal DB issues like compiles/recompiles, etc. I also look at CPU cores to see if the load is evenly distributed (is parallel query processing happening well, TempDB threads using all cores)

        What to do to reduce a wait stat is more varied than the number of wait stats. It depends on too many variables to name a single thing. Adding CPU cores is the last result though.

        Reply
  • You can only put 5 pounds of ‘stuff’ into a 5 Pound bag – no matter how space-time is folded or in which ‘virtual’ reality one exists. I sometimes feel like we DBA’s are playing a three cup shell game where we don’t even start holding the pea.
    Thanks for the reminder.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.