
SQL Server ships with sane defaults, right? Well, sort of. SQL Server ships with pretty sane defaults for a single-socket server or a development workstation. Honestly, the defaults are good enough for many things, but there are a few gotchas. Changing these settings is quick, relatively painless, and can result in all-around stability and happiness.

MAXDOP… 9?!

Maximum Degree of Parallelism

By default, SQL Server will use all available CPUs during query execution. While this is great for large queries, it can cause performance problems and limit concurrency. A better approach is to limit parallelism to the number of physical cores in a single CPU socket.

If you’re saying “What?” right now, don’t feel bad. Basically – on a SQL Server with two CPU sockets and a 4-core CPU that does hyperthreading, the maximum degree of parallelism option should be set to 4.

Don’t believe me? Check out Recommendations and guidelines for the “max degree of parallelism” configuration option in SQL Server.
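If you’d rather script the change than click through the GUI, here’s a minimal sketch using sp_configure (the value of 4 matches the two-socket, four-core example above; substitute whatever fits your hardware):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;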

Cost Threshold for Parallelism

Speaking of parallelism – what about that cost threshold for parallelism setting? The default is 5; is that a Good Number®? Not so much.

The optimizer uses that cost threshold to figure out when it should start evaluating plans that can use multiple threads. Although there’s no right or wrong number, 5 is a really low setting. It’s appropriate for purely OLTP applications, but as soon as you add a modicum of complexity, ka-boom!

I recommend starting with this setting at 50 or so and tuning up or down as appropriate. Make sure you measure for the critical queries in your application, of course.
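If you prefer T-SQL, a sketch of the change looks like this (starting at 50, then tune from there):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;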

Instant File Initialization

Technically this is a Windows permission, but whatever. By granting the “Perform volume maintenance tasks” permission to the SQL Server process, we give SQL Server a boost when it comes time to grow out data files.

The default behavior in Windows is to write a bunch of zeros whenever a user asks for space. If I create a 500MB file, Windows will write 500MB of zeros to disk to properly initialize the file.

By giving the SQL Server process these extra permissions, SQL Server can use a different API when requesting space for data files. These permissions tell Windows to mark the space as used and immediately hand it back to the calling program. In short – you can grow data files faster (which includes TempDB on system startup).
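There’s no sp_configure switch for this one. You grant the permission through Local Security Policy (secpol.msc) or Group Policy and then restart the SQL Server service. If you want to verify it took effect, one rough sketch is to use trace flags 3004 and 3605 to log zero-initialization messages to the error log while creating a throwaway database (IFI_Test is just a placeholder name):

DBCC TRACEON (3004, 3605, -1);          -- log zeroing activity to the error log
CREATE DATABASE IFI_Test;               -- throwaway database, any name will do
EXEC sp_readerrorlog 0, 1, 'Zeroing';   -- only the log file should show zeroing messages
DBCC TRACEOFF (3004, 3605, -1);
DROP DATABASE IFI_Test;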

Backup Compression

Free compression! If you’re using SQL Server starting with SQL Server 2008R2, you can flip a checkbox and get yourself some backup compression. This is a no-brainer, really – backups are smaller, they take less time, and restores take less time too. What’s not to love?
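To make compression the default for every backup, it’s one sp_configure change (a sketch; the database name and path below are placeholders, and you can always add WITH COMPRESSION to individual backups instead):

EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;

BACKUP DATABASE YourDatabase                 -- placeholder database name
TO DISK = N'D:\Backups\YourDatabase.bak'     -- placeholder path
WITH COMPRESSION;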

The best place to observe a problem is from far away.

The Remote DAC

No, I don’t mean a DACPAC, I mean the dedicated administrator connection. When you connect through the DAC, SQL Server provides a dedicated connection, CPU scheduler, and memory, all of which make it easier to troubleshoot when SQL Server’s gone completely crazy. By default you have to be on the SQL Server’s desktop to use it, either physically at the console or remotely over RDP. When I’ve got a runaway SQL Server, I know that I don’t want to wait for a sluggish GUI to respond to my commands.

Enter the remote dedicated administrator connection.

By enabling the remote DAC, a DBA can make use of the dedicated administrative connection from any computer that’s allowed to connect to the SQL Server – from your workstation, bastion server, or even your home PC over the VPN. It doesn’t matter, as long as you’ve got rights. Remote troubleshooting systems that are stuck at 100% CPU utilization is much easier when you’ve got dedicated CPU resources and memory at your disposal.
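Turning it on is quick (a sketch; no restart required). Afterward, you connect by putting admin: in front of the server name in a query window or sqlcmd:

EXEC sp_configure 'remote admin connections', 1;
RECONFIGURE;

-- then, from another machine (the server name is a placeholder):
-- sqlcmd -S admin:YourServerName -E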

What are you waiting for?

Only one of these changes (Instant File Initialization) requires a restart, so get out there and start switching things up. (Your mileage may vary.)

Interested In Learning More?

In our in-person course, How to be a Senior DBA, we cover the SQL Server settings you should watch and the tools that will identify even more settings to change. Read more about our 2014 training!

  1. Good advice to be sure! I would also add the max server memory setting to your list of must-change options. Obviously there are others, but where do you stop???

    I do have a question about Cost Threshold for Parallelism. I always had the seat-of-the-pants feeling about 5 being too low, so I have used 10 as my default for the last 4 years. I can’t really defend that number but things have been much smoother since the change. I have a mixed bag of OLTP / heavy batch file processing. I realize that this is a “your mileage may vary” situation but can you elaborate on the 50 setting as being a good starting point?

    • Sure thing! I recommend 50 for Cost Threshold for Parallelism for a few reasons, but the big one is that 5 was the recommendation back in 1999 when having two CPU sockets made you a king among people deploying Windows. Here in the future, we can buy a cell phone with a quad core chip on it.

      50 is an arbitrary number, but it is significantly higher than the default. The upside is that a cost threshold of 50 gives you some wiggle room after you’ve convinced management to let you make the change. If things get worse you can always cut it in half and still be better off than you were before.

      I didn’t recommend setting max server memory simply because most people need to buy more memory ;)

    • Thank you for the suggestions.

      Just a comment: changing sp_configure options wipes out the plan cache, so, even though a sql restart is not required for most of these options, the changes recommended should still be planned carefully in a prod environment: perhaps at the start of a slow period, such as a weekend.

    • Just add the OPTION (FORCE PARALLELISM) hint!! Oh, wait … we don’t have one of those! :-D

  2. Nice list Jeremiah! 1-4 are super important and pay dividends right from the get-go. 5 is one of those gotta-have-it-when-you-need-it-but-you-hope-you-don’t kinda things.

    I really think the list should include a 6th item: default file growth increments. 1MB for data files and 10% for tlog files has devastated the IO response time and throughput of almost every client I see. Having hundreds of thousands or even millions of 1MB file chunks is going to hurt your IO performance, I don’t care how many disks on the SAN you have backing your LUN! :-)

  3. Great points….How about adding “Optimize for Ad hoc Workloads” to the list if we are using 2008 or above…?
    Also, If I recall correctly, Backup Compression is an Enterprise Only feature for SQL Server 2008. I think MSFT made this available to Standard edition as well starting 2008R2…

    • I can’t recommend “optimize for ad hoc workloads” as a blanket recommendation. While this can control the overall growth of the size of the plan cache, it doesn’t control growth of the count of entries in the plan cache. Since search time in a hash table is O(n) (it grows linearly with hash table size), this can become a big problem when you have a lot of ad hoc queries and a high query volume. Basically – you can end up performing enough plan stub lookups to cause performance problems.

      Compression is an EE only feature in SQL Server 2008 but is free in Standard starting with 2008R2. That’s why I mentioned “If you’re using SQL Server starting with SQL Server 2008R2…”

      • I think I just had a flashback about how ‘Optimize for Adhoc Workloads’ burned me once. That was not amusing.

        • Kendra, can you elaborate? The “optimize for adhoc workloads” option seems pretty harmless, but knowing possible caveats would help me choose it more carefully.

          Thanks!

        • I’d love to hear more about this too!

          • Good news! You already heard about it in the comment Kendra replied to.

            Optimize for Ad Hoc can end up causing huge wait types when you’re looking up plans when you have a sizable plan cache.

            Let’s say you have a table and all reads on the table are performed on an indexed BIGINT column. Queries from the application are unparameterized and come in as fast as SQL Server can process them.

            In this situation, you can end up with hundreds of thousands (if not more) plan stubs in the execution plan cache. Whenever a new query comes in, SQL Server has to check to see if that plan has ever been seen before. This lookup operation isn’t free, neither is inserting a plan or plan stub into the cache.

            As the plan cache grows, you can expect the time it takes to perform a lookup or insertion to grow linearly with the cache. Busy server + optimize for ad hoc = LOLSLOW.
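            If you want to see whether this is happening to you, a rough sketch is to break the plan cache down by type and look for a pile of Compiled Plan Stub entries:

            SELECT cacheobjtype,
                   objtype,
                   COUNT(*) AS cached_entries,
                   SUM(CAST(size_in_bytes AS bigint)) / 1024 / 1024 AS size_mb
            FROM sys.dm_exec_cached_plans
            GROUP BY cacheobjtype, objtype
            ORDER BY cached_entries DESC;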

          • I have had issues with slowness from non-cacheable plans before, on a proc that ran about twice a second and brought a good-sized server to its knees. However, that was at a point in my career where I was clueless about using wait types to their full potential.

            What wait types would you expect to see in both my case and the case of ‘Optimize for Adhoc Workloads’ gone wild?

            BTW, from what I remember, the non-cacheable proc created a temp table, loaded data into the temp table, then created indexes on it. We switched the order of operations to get it to go to cache.

          • Can you give some detail about how searching through hundreds of thousands of plan stubs is going to be worse than searching through hundreds of thousands of actual plans (which will be much larger and steal away much more buffer pool memory)? I would guess there would be fewer to search through because the real plans are larger, but then you have to balance that with the overhead of storing the new big plan, more flushing out of unused single-use plans and the loss of buffer pool memory leading to more IO which is really slow.

            Given the number of people I respect recommending turning on optimize for ad hoc workloads (Tim Chapman, SQLSkills, Bob Pusateri, Grant Fritchey, etc, etc) I would consider it an “edge case” to not want it on.

            Note that with ALL of the changes mentioned on this post the reader is obligated to TEST on THEIR SYSTEM to validate that they are making a change for the better!! :-)

          • Kevin – that’s actually an interesting question. Often when I see MCM-level people recommending things that *they* would do to *their* servers, I take a step back and ask who their target audience is.

            For example, on any server I manage, I like to have a particular vendor app that does a phenomenal job of correlating plan cache changes to database object changes. However, it takes a rocket surgeon to interpret the user interface. I’ve tried getting clients to install that tool, but when they get it, their eyes glaze over.

            I’m an MCM – but the way that I manage my systems is different than how I’d recommend the public manage their own systems. I make changes and take risks by default, but only because I know I’m monitoring things really closely, and I can detect when there’s a problem. (I’m sure you’re the same way.)

            Another example – out of all of the race car drivers I respect, none of them have airbags in their race cars. Should I take the airbag out of my car too?

            Now, reread this post with the context of the reader, not someone whose email address is TheSQLGuru. ;-)

          • Kevin, my take is this – if you don’t have a reason to make a change, you shouldn’t be making it. The settings that we’ve talked about changing all have clear reasons to make the switch.

            I can’t make a recommendation without justification. Just saying “turn on optimize for ad hoc because… reasons” doesn’t fly. If there’s a good reason to flip the switch, I’ll flip it. But, in general, I try to address the underlying problem.

          • JP: “I can’t make a recommendation without justification. Just saying “turn on optimize for ad hoc because… reasons” doesn’t fly.”

            Wait a minute – that is EXACTLY what you did with this blog post! :-) Change CTFP because 5 is too low because…. Change maxdop because 0 can cause xyz problems. Etc. I merely pointed out that many others (respectable others, just like yourself and Brent) have stated that optimize for ad hoc workloads solves a known and common problem the majority of the time. And like I said, one must still test – even with changes such as CTFP and maxdop – because the defaults COULD be optimal for their environment, however low the probability of that may be.

            Anyway, enough of this beleaguered horse! We will just have to agree to disagree. Move along people, there’s nothing to see here … ;-)

          • Kevin – uh oh, can you point me to anything I’ve got showing everyone should turn on optimize for ad hoc across the board? I would definitely want to edit that to make it more clear that it’s not an across-the-board recommendation.

          • Wow….not what I expected here.

            Anything above is worth looking into, putting it through Dev, Beta, and load testing. However, if you’re blindly turning these knobs without knowing what they do or how your environment would act, that’s not good.

            Microsoft set most of the defaults so they’ll work OK for default installs on small servers; larger servers will have DBAs testing other settings. There are several settings that should be tested out and changed on most servers, and Jeremiah is pointing out those settings and where he would start out testing them.

            If you read a blog and jump straight to making the change in Prod without knowing what it does or why you’re doing it, please don’t read my blog. In this case they mentioned Cost Threshold for Parallelism, which wasn’t on my list of things to look into, but it is now. I did not change that in prod, but I’m going to learn more about it and determine if that’s the best thing for me now.

          • Brent: I didn’t say you said to turn on optimize for ad hoc, I said you (and Jeremiah) were respectable sources like others who HAD said to do that. :-)

            Steve: I think everyone here has said to test any changes to ensure it is optimal/appropriate for your system!! You are correct that to do otherwise is foolhardy, REGARDLESS of the “respectability” of the source!! LOL

            I do disagree with one small point you make though, Steve: that Microsoft has set (some) defaults to work well on default installs on smallish systems. Some of those values are what they are simply from inertia, often well over a decade out of date. They just don’t want to accept the “risk” of changing them, even though updated defaults would reflect modern hardware and data sizes much more helpfully. I think CTFP, MAXDOP, 1MB data file growth and likely others fit that scenario.

          • Kevin,

            Yeah, small and out-of-date are about the same recommendations, so that’s probably all it is. Even then, I can’t imagine 1 MB or 10% growth being a positive thing, same as a flat 300 recommendation for PLE that Microsoft is known for.

            I feel that this was a great article by Jeremiah with the exception that I wish it was longer. Most of these things are on my new server build checklist, and I’m always adding to that.

            I’m still hoping to meet you in person on Friday for the Pittsburgh SQL Server Saturday presenters’ dinner, and will definitely be seeing you this Saturday!

          • Well, it WAS a post on just FIVE things!! He picked some good ones. And the best thing is that the nice collection of comments he picked up is now part of the blogosphere for future consumption.

            AWESOME about meeting you! I will be there with my 8yo daughter Bean, who will be attending her 10th or 11th SQL Saturday! :-)

  4. It would be great to see a longer list that includes more features. I can’t say Cost Threshold for Parallelism was on my list, and I’m probably missing more.

    The only thing to keep in mind with instant file initialization is that it applies specifically to data files only. Those log files need to be zeroed out before they can be used since there is no EOF marker on them. See Microsoft’s site for a quickie showing it doesn’t work on log files, but you’ll have to dig through Google to find a descriptive reason. Since logs don’t grow too often, you really only notice this when you start SQL with a huge default log size on TempDB.

  5. Great post! But I’m missing the code to change the settings – would you please be so kind to provide it for the rookies like me? :) Thanks Jeremiah!

  6. Good List.

    I’d add a quick check that the OS partitions are properly aligned with the stripe size of your disk array. Not as big an issue as it used to be since Microsoft changed the defaults when creating a new OS partition, but well worth checking just in case.

    On the MAXDOP, I’m slightly confused. You’re suggesting limiting the value to the number of cores on a single socket whereas the document you link to suggests

    “For servers that have hyperthreading enabled, the MAXDOP value should not exceed the number of physical processors.”

    They don’t appear to mention the single socket restriction; have I read either you or the document incorrectly?

    Thanks

    Conan

    • Conan – if we just keep adding one more quick thing, then it turns into a book, hahaha. Gotta draw the line somewhere.

      Reread that Microsoft knowledge base article carefully – there are several caveats, like the number of cores in a NUMA node.

  7. How would the MAXDOP advice apply to a SQL Server hosted on an Amazon EC2 instance? Would it be configured via the vCPUs hosted on that EC2 instance?

    Thanks.

    • Short version – it’s complicated and depends entirely on the hardware that your VM is running on (which can change across reboots).

      Longer version – MAXDOP in AWS depends on the size of the EC2 instance that you’re using. You can see all of the instance sizes at ec2instances.info. I’ve put together an eBook on SQL Server in AWS that should help with more SQL Server configuration. And, if you’re really curious, you can always use CPU-Z to figure out which CPU is under the hypervisor and attempt to configure MAXDOP appropriately.

      • Thanks Jeremiah for the quick list. Great information! I would just reiterate what you briefly mentioned, and that is TEST and RE-ADJUST as needed! The beauty of SQL Server is that one size never fits all (and that’s why we all have jobs).

        A quick add-on question to Gerardo’s question about Hosted SQL Server. What about VMs in general? What should MAXDOP look like on a VM?

        • Hi Francis, setting MAXDOP on VMs is a pretty big question. One thing to keep in mind is that your VM can be moved between hosts with different core configurations without you being aware of it. That hardware could have completely different performance characteristics. Under virtualization (and in the cloud), we all need to stop micromanaging our SQL Servers and start thinking of them as appliances or plumbing.

          Determine the resources you need to accomplish your task and use that as a guide.

          • I will disagree a bit with that comment Jeremiah. MOST people can stop micromanaging their SQL Servers. But anyone who wants/needs an optimized service (database, web server, file server, BigDataServer, whatever) simply must micromanage said servers to some varying degree.

          • I think what Jeremiah is saying is that we should pick a size for our VMs and leave it alone. Revisit that decision after the first month, and possibly quarterly after that. Also, make it easy on yourself by having the hosts configured identically.

            When he refers to them as appliances or plumbing, that is a very good analogy. You put them in place, leave them alone as much as possible if you want to rely on them working properly, and make changes very rarely and when necessary. If you do this, MaxDOP becomes one of those things you can stop micromanaging and start spending your time worrying about things that need your attention.

            This may involve you investing more upfront to set it up to the point that you don’t need to tweak it at every turn, but it saves you time. With money and time being pretty much the same thing, it’s going to be cheaper to put in a solution that doesn’t need a babysitter.

          • You’re right on the money, Steve. Virtual/cloud servers are resources that should be sized once and then left alone. Setting MAXDOP in an environment where it’s easy to start the day on 3GHz cores and end it on 1.7GHz cores doesn’t make a lot of sense.

            In an environment like that it’s more important to worry about business level metrics (orders per second) vs product level metrics (flanging skrills per hectare of whatever).

  8. Pingback: (SFTW) SQL Server Links 06/09/13 • John Sansom

  9. One caveat about backup compression – if you’re using TDE (transparent data encryption), you’re going to see no benefit from compression. All it’s going to do for you is make your backups take longer, and possibly even produce a larger backup file.

    For that matter, I haven’t found a third-party tool that will compress a backup from a TDE-enabled database either…

    • That’s a feature of any type of encryption – it renders compression useless by transforming plain text into meaningless garbage.

      For many people, they won’t be using TDE. For those that are using this Enterprise Edition only feature, they probably have enough capital available that it’s possible to overcome the lack of backup compression by adding additional files and I/O paths.

  10. Re the comments about changing initial-size and auto-increment size. I’d like to agree, and in fact used to change those defaults (on Model DB). However, I got burned a couple times by app-vendor installation routines that failed. If they hard-code a size that’s smaller than the system default, their install routine will flat-out fail.

  11. Does your recommendation of setting Cost Threshold for Parallelism apply to both physical machines and VMs? I’m specifically interested in a VMware VM. Does it depend on host settings at all?

  12. Pingback: Why I Work With SQL Server | Art of the DBA

  13. Hi, Jeremiah.

    I have revisited this article frequently to make sure I’m choosing reasonable settings as I start troubleshooting performance problems. I have a SQL 2005 server that’s running on a Win2k3 virtual machine. It has 4 CPUs, but the config is 4 sockets, each with one core. If I understand your MAXDOP guideline, I think my MAXDOP setting should be 1. Does that sound right?

    Thanks – you guys rock.
    Bil

    • Bil – I assume you’re talking about a VMware virtual guest, in which case the number of cores and sockets is a bit of an abstraction. Your VMware administrator can set your 4 virtual cores to be either 1 socket with 4 cores, 2 sockets with 2 cores, or 4 sockets with 1 core each. It doesn’t line up to real processors at all.
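      If you’re curious what SQL Server thinks it’s running on, a quick sketch is to check the CPU counts it reports:

      SELECT cpu_count, hyperthread_ratio, scheduler_count
      FROM sys.dm_os_sys_info;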

  14. Hello, Thank you for your good advice. I agree with all settings recommendations.

    Regarding the MAXDOP, I would say that according to the Microsoft KB you should check how many cores per NUMA node you have, and then you should set the MAXDOP to that maximum value or less. I mean, the constraint is not the number of cores per physical socket but the number of cores per NUMA node.

    KB reads as follow:
    For servers that have NUMA configured, MAXDOP should not exceed the number of CPUs that are assigned to each NUMA node.

    Thank you.
    Hernan.

    • You’re right, but I won’t be changing the post. Only AMD chips ship with multiple NUMA nodes per socket and I can’t recommend running AMD chips based on recent benchmarks.

      • Well, an admin could manually configure NUMA (and it isn’t “physical cores” then obviously), but then they best damn well know way more than what 5 things they need to configure on their SQL Server!! :-)

        And Jeremiah is definitely correct about AMD chips. Search out blog posts by Glenn Berry for some very good advice on which Intel chips to pick for various SQL Server hardware configurations (and why).

  15. So I am in a VMware environment using SQL 2008R2 Standard. The processors are configured to be presented to the OS as 4 “physical” CPUs with 2 cores each (8 vCPUs total). I have 8 CPUs showing in NumaNode0. From reading this post I believe I should set the MAXDOP to 8, but a little voice is saying maybe 2. Can someone point me in the right direction?

    Thanks.

    • You need to enable NUMA in VMware. This is detailed in the “Performance Best Practices for VMware vSphere” guide – just search for that phrase and the version of VMware that you’re using. From the 5.5 guide:

      You can enable vNUMA for virtual machines with eight or fewer vCPUs by adding to the .vmx file the line:
      numa.vcpu.min = X
      (where X is the number of vCPUs in the virtual machine)

      Just remember that changing this could cause SQL Server to see certain configurations as “illegal” and take some CPUs offline (e.g. 8 x 1 core sockets). Regardless of NUMA, configure SQL Server so that parallelism lines up along physical processor boundaries.
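      If you want to see how your schedulers line up per NUMA node from inside SQL Server, a rough sketch:

      SELECT parent_node_id, COUNT(*) AS visible_online_schedulers
      FROM sys.dm_os_schedulers
      WHERE status = 'VISIBLE ONLINE'
      GROUP BY parent_node_id;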

      • I’m in an IAAS environment where I don’t know the actual physical processors being used to serve up the 8 vCPU, so I’m not sure I can “configure SQL Server so that parallelism lines up along physical processor boundaries”.
