Blog

Announcing a New 2-Day Event: Make SQL Server Apps Go Faster

We started with our top-selling PASS Summit pre-con, then doubled it in length to two full days!

You’re a developer or DBA stuck with a database server that’s not going fast enough. You’ve got a hunch that the biggest bottleneck is inside the database server somewhere, but where? In just two days, you’ll learn how to use powerful scripts to identify which queries are killing your server, what parts of the database server are holding you back, how to tackle indexing improvements, and how to identify query anti-patterns.

On Monday and Tuesday before the PASS Summit conference, we’ll dive deeply into:

  • How wait stats tell you where to focus your tuning
  • How the plan cache shows you which queries are the worst
  • How to make fast improvements by picking the right indexes
  • How to identify and fix the most common query anti-patterns
  • And you’ll even get execution plan, query, and index challenges to solve during the class

Make yourself comfortable and get your learn on.

You’ll be plenty comfortable because we’re holding this independent event in the Big Picture Theater in downtown Seattle, one mile from the Convention Center. Coffee, beverages, snacks, and lunch are all covered. (Sorry, no popcorn – you know we wouldn’t be able to stop eating it once we get started.)

Registration is $695, and for another $200, we’ll throw in our Developer’s Guide to SQL Server Performance video class. This way, you can get started learning right away without waiting for the class!

Act fast, save $100 – if you register during July with coupon code GOFAST, registration is only $595.

Space is limited to just 100 seats – far fewer than we had last year – so you’d better move fast if you want to get in. This is a totally independent training event run by Brent Ozar Unlimited – you can’t get in with your PASS Summit ticket, and you can’t register for it as part of the PASS check-out process.

Check out the details now.

Capturing Deadlocks in SQL Server [Video]

If you’re experiencing deadlocks in SQL Server, you want to know more about them. What queries are causing them? What tables and indexes are at the root of the problem? You can find this information by using tools such as trace flags, Profiler, and Extended Events. Jes shows you how in this 25-minute video:

Download the scripts here.

How to Get “Senior” in Your Title [Video]

In my blog post Sample SQL Server Database Administrator Job Descriptions, I explained how production DBAs are like fridge maintenance engineers, and development DBAs are like chefs.

In this 35-minute video, I explain how this plays into the concept of a “senior DBA” – you can’t just say “senior fridge user” any more than you can say “senior DBA.” There are many kinds of database administrators, and each has different job duties.

In the video, I use this grid to lay out common job duties:

[Image: DBA job duties grid]

Then in the video, I use a 10-question self-assessment to check your skills across each job duty, figure out where your strengths lie, and guide your training toward one of the most common job descriptions.

Sizing SQL Server for AWS

Let’s skip the shenanigans and get right down to it – you’re responsible for SQL Server and someone in your company wants to move things into Amazon Web Services (AWS). You’ve got SQL Server setup covered thanks to our helpful SQL Server setup checklist and you’re confident you’ve got hardware under control, but things are different up in the clouds.

The question on your mind is “What’s the best way to size a SQL Server in the cloud?”

The Basics: Sizing the Server

Amazon gives you a set list of server sizes to choose from. It’s up to you to pick the right server for the job. The available choices can be overwhelming at first glance, but it gets simpler when you consider different classes of SQL Server and map them to different families of AWS instances.

By the way: we call servers instances in the cloud. Yes, it’s confusing. No, we can’t change it.

Instances are grouped into instance types – you can think of an instance type as a general category of performance. This is going to be your starting point for deploying SQL Server. Don’t worry, though, you can usually change the instance type after you’ve configured SQL Server.

Possible SQL Server instance types:

  • M3 General Purpose – These are basic servers with up to 8 cores and 30GB of memory. M3 instances won’t win any awards for speed, but they are cheap.
  • R3 Memory Optimized – These instances ship with up to 32 cores (2 sockets) and 244GB of memory. Two local SSDs are thrown in – seems like a good place for TempDB to me!
  • I2 IO Optimized – Sounds great at first, right? All the memory of the R3 with 6.4TB of local SSD. At only $7 an hour, it’s a steal!

So, which one do you need? Remember that you don’t want to just choose the biggest and baddest instance right off the bat – there’s a noticeable cost associated with any of these instances.

We’ll get back to picking an instance type after covering a few other details.

Support Level

In many organizations, DBAs are lucky to have tiered support. Mission-critical applications, like POS systems or credit card processing, get put in the highest tier of support. Internal applications may end up in a second tier. And, finally, legacy applications, or applications that aren’t critical to the business in any way, end up in the third tier.

We can’t ask AWS to give us faster disks, SAN replication, or some other magical enterprise HA feature. But we can use this support information to figure out whether this SQL Server requires the attention of a full-time DBA, or whether it can be offloaded to something like Amazon RDS. Making that decision is outside the scope of this article, but knowing the option exists is enough.

  • What’s the support level required for this application?
  • Does the application require regular manual tuning?
  • What’s the support SLA?
  • Is there an incident response plan?
    • Has anyone ever reviewed the incident response plan?
    • Have you ever had to follow it? That is: does it actually work?

Networking

I bet you don’t even think about networking.

The first thing you need to know is: by default, AWS instances use a single network card (NIC) for all traffic. Storage and communication use the same pipe. If you want to use database mirroring and rebuild your indexes, you may be in for a nasty surprise when mirroring drops in the middle of a rebuild.

Second: not all instance types are networked equally. Look in the Networking Performance column to get a vague idea of what’s going on. “Moderate” equates to 500 Mbps, “High” is 1000 Mbps, and “Low”/”Very Low” shouldn’t be entering the equation for a SQL Server.

Watch the storage and network throughput of your SQL Server to verify how much total throughput you need.
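
One way to measure the storage half of that equation is from inside SQL Server itself – a sketch using sys.dm_io_virtual_file_stats. The numbers are cumulative since startup, so sample twice and subtract to get an actual rate:

-- Cumulative I/O volume per database file since SQL Server last started
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_bytes_read,
       vfs.num_of_bytes_written
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON  mf.database_id = vfs.database_id
    AND mf.file_id = vfs.file_id
ORDER BY vfs.num_of_bytes_read + vfs.num_of_bytes_written DESC;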

For SQL Server, it’s better to guarantee a certain level of both network and disk performance. AWS offers what are called EBS-Optimized Instances, which use a separate network to communicate with storage in AWS. Traffic on the storage network won’t cause problems for traffic on the regular network.

Use EBS-Optimized Instances for your most demanding applications.

Other applications may not require as much throughput, but you should adjust maintenance to prevent the shared network from becoming a bottleneck.

Network considerations are:

  • What is the application’s typical network throughput?
  • Do you need separate storage and networking pathways?

Storage

I’m going to repeat myself: Use EBS-Optimized Instances for your most demanding applications. You can change this after the instance is up and running but it will require a restart. Plan carefully.

For I/O-intensive workloads, AWS offers Provisioned IOPS (PIOPS). The PIOPS option lets you decide on a guaranteed number of IOPS (+/- 10%) available to the system using that disk. PIOPS disks have an increased cost associated with them – they’re 2.5 times more expensive per GB than regular storage, and you pay $0.10 per provisioned IOP per month.
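
To make that concrete, here’s a rough worked example (assuming regular storage at about $0.05 per GB per month, so 2.5x is $0.125): a 500GB PIOPS volume provisioned at 2,000 IOPS runs roughly 500 × $0.125 + 2,000 × $0.10 = $62.50 + $200 = $262.50 per month, versus about $25 per month for the same 500GB on regular storage. You’re paying for the IOPS guarantee, not the space.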

Stripe drives with caution. In a conventional data center, it’s a best practice to use RAID to configure large volumes and to stripe data across many disks for additional performance. Although RAID striping is possible in AWS, keep in mind that any RAID volume is only as fast as the slowest disk in it. Since all permanent storage in AWS is network-attached (and subject to unpredictable latency), think very carefully before setting up RAID volumes.

Storage is somewhat limited in AWS, too. Each AWS disk (EBS volume) can be up to 1TB in size, and up to either 16 or 25 disks can be attached to an instance (the limit depends on driver limitations). At most, then, 25TB of data can be stored on an AWS instance using EBS volumes. I bet you were worried for a minute, right? Anyone wanting to store more than 25TB of data will need to scale out across multiple servers.

The storage considerations are:

  • How much storage do you need for data, logs, and backups?
  • How fast does that storage connectivity need to be?
  • How many times per second are you going to access that storage?
  • Do you need to guarantee those accesses per second (IOPS)?
  • What is the rate of data growth? (see the sketch after this list)
  • What are the data retention requirements?
  • What is the total volume of data being stored?
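
If you don’t already track data growth, your backup history can reconstruct it for you – a sketch against msdb, assuming you take regular full backups:

-- Full backup sizes over time are a decent proxy for data growth
SELECT database_name,
       backup_start_date,
       backup_size / 1048576 AS backup_size_mb
FROM msdb.dbo.backupset
WHERE type = 'D'  -- full database backups only
ORDER BY database_name, backup_start_date;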

Picking an AWS Instance Type for SQL Server

So how should you go about picking an AWS instance type for SQL Server?

Take a look at your memory requirements for SQL Server and compare that to the instance types on offer in AWS. This may immediately exclude several instance types.
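
To see what the current server actually uses, the Memory Manager counters are a quick check – a sketch; if Total hovers right at Target, the server wants every bit of memory it’s been given:

-- How much memory SQL Server wants (Target) vs. currently has (Total)
SELECT counter_name,
       cntr_value AS kb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Memory Manager%'
  AND counter_name LIKE N'%Server Memory%';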

After answering the storage questions, determine whether you need Provisioned IOPS, dedicated storage connectivity, or even 10Gb networking. This further narrows down your instance choices.

Once you’ve reached this point, you should have several choices of instance size available.

Put It All Together:

  • Network
    • What is the application’s typical network throughput?
    • Do you need separate storage and networking pathways?
    • Are features in place that require additional network throughput?
      (This might be AlwaysOn Availability Groups, database mirroring,
      or log shipping.)
  • Storage
    • How much storage do you need for data, logs, and backups?
    • How fast does that storage connectivity need to be?
    • How much are you going to access that storage in any given second?
    • Do you need to guarantee the accesses per second (IOPS)?
    • What is the total volume of data being stored?
  • Memory
    • What are the memory requirements of the existing SQL Server?
    • What is the use case for this SQL Server? OLTP? DSS?
    • What other services will run on this system?
    • Do you need Enterprise Edition?
  • CPU
    • What is the current CPU count? (see the sketch after this list)
    • What is the current CPU speed?
    • Do you require any Enterprise Edition features?
  • Support
    • What is the support agreement for this SQL Server?
    • What’s the support level required for this application?
    • Does the application require regular manual tuning?
    • What’s the support SLA?
    • Is there an incident response plan?
      • Has anyone ever reviewed the incident response plan?
      • Have you ever had to follow it? That is: does it actually work?
  • Instance Size: If you still have choices left to make between one of several instance sizes, ask a few more questions:
    • What is our current rate of activity growth? (e.g. as data volume doubles, will user activity double?)
    • How much will this instance cost over 1 year? 3 years?
    • Is a reserved instance a possibility (taking into account your growth rate)?
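
For the CPU and memory inventory questions above, sys.dm_os_sys_info answers most of them in one query – a sketch; note the memory column is named physical_memory_kb on SQL Server 2012 and later, and physical_memory_in_bytes on older versions:

-- Core counts and physical memory on the current server (SQL Server 2012+)
SELECT cpu_count,                               -- logical CPUs visible to SQL Server
       hyperthread_ratio,                       -- logical-to-physical core ratio
       physical_memory_kb / 1024 AS memory_mb   -- physical_memory_in_bytes pre-2012
FROM sys.dm_os_sys_info;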

When you put it all together, sizing a SQL Server in AWS is a lot like sizing an on-premises SQL Server, just with a few extra constraints around sizing and capacity.

Bonus questions:

  • Performance Profile
    • Is there a performance SLA?
    • What are the top wait types of an existing SQL Server?
    • What aspects of the application can be changed?
  • Doing Less Work

Ignorable AWS Instance Types

Instance types we can completely ignore:

  • G2 GPU – This is for people who do math on graphics cards.
  • C3 Compute Optimized – Instances with a lot of CPU, some local SSD scratch space, and not a lot of memory. Great for HPC, not great for SQL Server.
  • HS1 High Storage Density – These monsters can have up to 24 2TB local drives delivering 2.6 gigabytes per second of disk throughput. At first glance, they look like a dream. However, the data stored on the local disks will be erased if you stop and then start your AWS instance. You could try to use AlwaysOn Availability Groups, but a total AWS Availability Zone or Regional outage could lead to massive data loss. These are primarily intended for clustered computing like Hadoop or distributed data warehouses (GreenPlum).

Capturing Deadlocks in SQL Server

What’s a deadlock? Well, let’s say there’s a fight going on between Wonder Woman and Cheetah, and, in the same room, a fight between Batman and Mr. Freeze. Wonder Woman decides to help Batman by also attempting to throw her lasso around Mr. Freeze; Batman tries to help Wonder Woman by unleashing a rope from the grappling gun at Cheetah. The problem is that Wonder Woman already has a lock on her opponent, and Batman has his. This would be a superhero (and super) deadlock.

When a deadlock occurs in SQL Server, two or more tasks are running and holding locks on data. Then, each task requests to lock the data the other task is already holding. Both tasks wait for the other to give up. Neither does. SQL Server could let this showdown continue indefinitely, but it won’t. It picks one task – usually, whichever will be the least expensive to roll back – as the victim, and that task is killed.

How do I know if it’s happening to me?

You might notice slowness in your application. Users might complain about slowness or error messages. And, you’ll see a message in the SQL Server Log that says “Transaction (Process ID 103) was deadlocked on resources with another process and has been chosen as the deadlock victim.”

Your server has encountered a deadlock.

How can I capture more information about it?

You have several options: you can enable a trace flag to write more information to the log, you can capture deadlock graphs using Profiler or Extended Events, and you can track the number of deadlocks occurring using Performance Monitor.

Use trace flags to write to the log

Two trace flags can be enabled to capture more information in the log: 1204 and 1222. 1204 lists the information by node; 1222 lists it by process and resource. You can enable both simultaneously. To enable the flags globally – the -1 argument makes them apply server-wide, which is required for deadlocks from any session to be logged – use the command

DBCC TRACEON (1204, 1222, -1);
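
To confirm which flags are actually running globally, check trace status:

-- Lists every trace flag currently enabled at the global level
DBCC TRACESTATUS (-1);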

Here’s a small sample of what would appear in the log:

[Screenshot: sample trace flag 1222 output in the SQL Server Log]

No one wants to read through that, manually parsing it to figure out what database it occurred in, what tasks were involved, and which was the victim. Wouldn’t it be great if this was represented graphically?

Capture a deadlock graph with Profiler or Extended Events

It is – you just have to know where to look for a deadlock graph. In SQL Server 2005, 2008, and 2008R2, I prefer to use a Profiler trace; in SQL Server 2012 and 2014, I rely on Extended Events.

Profiler

When you set up a Profiler session to gather deadlock information, select the “Deadlock graph”, “Lock:Deadlock” and “Lock:Deadlock Chain” events. Make sure you go to the Events Extraction Settings tab and select the option “Save Deadlock XML events separately”. This ensures that the .xdl files – the graphs – will be saved as separate files. When you start the session, if a deadlock occurs, you will see “Deadlock graph” captured as an event.

[Screenshot: a deadlock graph captured in Profiler]

The task with the blue X on it is the victim. By holding your mouse over the process oval, you can see what statement was being executed by that task. The Key Lock rectangles will help you find the object and index that the locking and blocking occurred on.

Extended Events

In the future, Profiler will be removed from SQL Server. Extended Events (EE) is taking its place. You have two options for getting deadlock information using EE. First, there is the system_health session that, by default, runs continuously. You can mine this for deadlocks that have happened in the past. You can also set up your own session to capture only deadlock information (and any other relevant events).

If using the system_health session, filter on “xml_deadlock_report”. If setting up your own session, capture the same event. If you are using the GUI available in 2012 and 2014, you can open the file and view the deadlock graph on the “Deadlock” tab.

[Screenshot: viewing a system_health deadlock graph in the Extended Events GUI]
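
If you’d rather run a dedicated session than mine system_health, a minimal sketch looks like this (SQL Server 2012+; the session and file names are placeholders of mine – adjust to taste):

-- A dedicated session that captures only deadlock reports to a file target
CREATE EVENT SESSION capture_deadlocks ON SERVER
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.event_file
    (SET filename = N'capture_deadlocks.xel', max_file_size = (50))
WITH (STARTUP_STATE = ON);  -- restart the session with SQL Server
GO
ALTER EVENT SESSION capture_deadlocks ON SERVER STATE = START;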

If parsing the results using T-SQL and XQuery, you would use the .query method to extract the deadlock XML, like so:

SELECT DeadlockEventXML.query('(event/data[@name="xml_report"]/value/deadlock)[1]') AS deadlock_graph

This generates the XML, which you would then save as an .xdl, then close and re-open with SSMS to view the graph.
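
Here’s a fuller sketch of that approach, reading deadlock events straight out of the system_health file target (SQL Server 2012 and later; the file name pattern is the default – adjust if your error log directory differs):

-- Pull every deadlock graph out of the system_health session's files
WITH deadlocks AS
(
    SELECT CAST(event_data AS XML) AS DeadlockEventXML
    FROM sys.fn_xe_file_target_read_file(N'system_health*.xel', NULL, NULL, NULL)
    WHERE object_name = N'xml_deadlock_report'
)
SELECT DeadlockEventXML.query('(event/data[@name="xml_report"]/value/deadlock)[1]') AS deadlock_graph
FROM deadlocks;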

Track Performance Counters

Performance Monitor (PerfMon) is valuable for tracking many statistics. One of the metrics you can track for SQL Server is SQLServer:Locks – Number of Deadlocks/sec. This is the least informational of the methods mentioned here – it only gives you a count of how many are happening per second in your system. It can be helpful, however, to track this to see what times of the day produce the most deadlocks. You can then use the other methods mentioned here to research what is happening during that time to cause the deadlocks.
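
You can also read that same counter from inside SQL Server – a sketch; the value is cumulative since startup, so sample it twice and subtract to get a rate:

-- Total deadlocks since startup, across all lock types
SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Number of Deadlocks/sec'
  AND instance_name = N'_Total';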

Don’t be a victim!

Take charge of your SQL Servers! Become familiar with detecting deadlocks, finding out what caused them, and fixing them! For more tools and advice, visit Locking and Blocking in SQL Server.

Our SQL Server Performance Troubleshooting Class: Attendee Feedback

We’ve already told you what’s in our SQL Server Performance Troubleshooting class, but what did the attendees in Chicago think?

“Great information.  If I had to learn this myself, I would have taken years to understand these concepts.” – Kishore P., Application Admin

“So, you’re pretty good at performance tuning?  Yeah?  Well, this course will make you better.  These funny, smart and attractive people will push your knowledge of each little dirty trick SQL server likes to pull and how you can counter-punch.  It’s by far the best laid-out and executed training class I’ve ever attended.” – Drew Furgiuele, DBA

“The course gives you a lot of valuable, practical tips.  The presenters are all very knowledgeable, helpful and friendly.” – Kevin Clark, Developer

“Amazing class, learned so much that I can take back and use.  One of the best training classes I have been to.” – Frank Garofalo, DBA

How many of you would like us to stop stuffing you with desserts?

“A much better value for your training dollar than a local training course or even a 3-day conference.  Awesome advice that will be immediately useful once I get back to the office.  Terrific venue & food.  I didn’t have to spend a ton on dinner because of a petrified chicken satay lunch.  I have been working with SQL Server for 15 years and I still learned tons.  When I get home I am locking this workbook with my notes in my safe full of gold (that I don’t have).  It is literally priceless.” – Andrew Notarian, Lead BI Engineer

“Awesome! Very informative.  Excellent delivery of technical content. Way more fun than any other tech course I have taken.  Looking forward to getting back and fixing a bunch of issues.” – Greg Noel, COO/CIO

“If just 10% of what these bright, personable people talked about stuck to the sides of my brain, I will be far more effective at work. Highly recommended.” – Bill Litman, Developer/DBA

“This class is better to get attention and easily ask questions get answer right way without waiting.  Brent and 4 team members working together very well.  Good for any level of BA. Comfortable, friendly and covered all topics about performance.” – Soojin Yoo, DBA

“This class was great because it covered a lot of the questions I’ve wondered about while trying to make the db and queries fast.  The bestest part (the whole class is best already) is being able to gauge and capture metrics which will help me show existing problems w/ the db.  Thank you!” – Marie Kolodz, Development DBA

“This course gave me the info I need to go back to the office and tackle my performance woes.  The presentations were engaging, and everyone (Brent, Kendra, Jes, Jeremiah, and Doug) were approachable and would discuss our issues with us.” – Joanne Snyder, Architect/Enterprise DBA

“This is a great survey course into the most common (and not so common) problems you will encounter in your server environment.  Calling it just a survey course doesn’t capture its full value.  If you want the best bang for your training budget buck, sign up for this course.” – George Stocker, Developer

This class sold out in Chicago, and it’s on track to sell out for Philadelphia as well.  Register now.

Meet the Artist Behind the TempDB Toilets

Brent says: beware public toilets with only one data file.

Occasionally people ask me if I’m responsible for the cool artwork on our site.

I wish I had those kinds of skills!

The artist behind the artwork on our website is the one and only Eric Larsen. Eric’s been great to work with: he doesn’t laugh at our crazy ideas, he makes thoughtful suggestions, and he’s always open to feedback like, “Could you make the public toilet larger, please?”

Check out this writeup on theispot.com that includes a fun collection of the work Eric has done for our website.

Treasure Hunt: SQL Server Error Log [Video]

What secrets lie within SQL Server’s error log? Many! This day-by-day record of your server’s actions can be a treasure trove of information. Here you’ll learn what to look for, how to effectively search it, and how to get alerted of potential problems.
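
As a taste of what the video covers, here’s one way to search the current log from T-SQL – a sketch using the undocumented (but widely used) xp_readerrorlog procedure; the parameters shown are the log number (0 = current), the log type (1 = SQL Server error log), and an optional search string:

-- Search the current SQL Server error log for lines mentioning deadlocks
EXEC sys.xp_readerrorlog 0, 1, N'deadlock', NULL;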

What Amazon RDS for SQL Server Users Need to Know about Multi-AZ Mirroring

Amazon Relational Database Service (RDS) now supports multi-availability-zone SQL Servers. This means you can have a primary database in Northern California, and a secondary replica over in Oregon.

Those Portland folks were always a little bit backwards anyway.

When your primary server goes down, OR when the entire AZ goes down, you can fail over to the secondary with zero data loss and keep right on going. When the primary comes back online, you can fail right back, again with no data loss.

Amazon uses replication for their MySQL, Oracle, and PostgreSQL multi-AZ features, but they chose a different database technology when it came time for SQL Server. They’re using database mirroring, and there are a few things you need to know before you rely on it.

Technically, database mirroring is deprecated. This means support will be removed in a future version of SQL Server. I’m here to say you shouldn’t care – if it meets your needs today, great. Keep right on using it, because it’s still around in SQL Server 2014, and that’s going to be supported for at least a decade. If the next version of SQL Server actually did remove mirroring support, you’d still have years to make Plan B.

Cross-database transactions are not supported. Distributed transactions and multi-db commits mean your application changes data in multiple databases inside a single transaction. Books Online explains the gotcha. Same thing here – if you’re already using Amazon RDS for SQL Server, and you’re doing cross-database transactions, and you don’t have a Plan B yet, you could start using multi-AZ RDS just to get yourself the beginnings of a safety net. However, you need to start working on Plan C.

Amazon’s implementation picks safety over performance. They’re using synchronous mirroring, which means whenever you do a delete/update/insert:

  1. The data is written to the primary’s log file
  2. The data is sent across the network wire to the secondary
  3. The data is written to the secondary’s log file
  4. The confirmation is sent back to the primary
  5. The transaction is considered committed, and the app is informed

This is going to add latency to your existing application because steps 2-4 aren’t being done today. If you’re worried about performance, this means you have to pay particular attention to each of those steps.
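
Once mirroring is up, you can quantify that added latency from inside SQL Server – a sketch using the Database Mirroring performance counters; Transaction Delay reports time spent waiting on the mirror’s commit acknowledgement (step 4 above):

-- Per-database delay spent waiting for the mirror to acknowledge commits
SELECT instance_name AS database_name,
       cntr_value AS transaction_delay_ms
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Database Mirroring%'
  AND counter_name = N'Transaction Delay';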

SQL Server also offers asynchronous mirroring where steps 2 through 4 are done in the background without impacting throughput. However, you don’t get automatic failover with async – after all, you wouldn’t want to automatically fail over to a replica that has less data than your primary. Also, async mirroring is an Enterprise Edition feature, and that’s only available in RDS if you bring your own licensing.

Mirroring protects you from some storage corruption. When the primary reads a page off disk and finds corruption, it requests a clean copy of the page from the mirror. This works because mirroring sends log transactions, not data file pages, over to the replica. The secondary responds (hopefully) with a clean copy of the page, and SQL Server fixes the corruption with nary an error message or outage. It’s called automatic page repair, and it gives you another measure of insurance for database reliability.
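
If you’re curious whether automatic page repair has ever fired on your databases, there’s a DMV for it – a sketch, available in SQL Server 2008 and later:

-- One row per automatic page-repair attempt on mirrored databases
SELECT DB_NAME(database_id) AS database_name,
       file_id,
       page_id,
       error_type,
       page_status,
       modification_time
FROM sys.dm_db_mirroring_auto_page_repair;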

And finally, mirroring is actually pretty cool for this. You’re probably using Amazon RDS to reduce your systems administration overhead. You want to get more availability and performance without having to run your own SQL Server, and that means you probably don’t have your own full-time DBA. Database mirroring meets those needs pretty well, and I can see why Amazon picked it. You’re definitely going to see slower transactional throughput – but hey, that’s where Provisioned IOPS comes in. The faster you want to go, the more quarters you put in the front.

I wouldn’t necessarily recommend database mirroring if you were going to manage this infrastructure yourself. Mirroring has some challenges around cryptic error messages and scaling to large numbers of databases, but neither of those is a weakness in this particular implementation. With Amazon managing it, and with a relatively small number of databases involved (fewer than 100), it makes perfect sense.

Learn more about multi-AZ SQL Servers in Amazon RDS.

Backups, Restores, Corruption, and You: How Strong are You?

You and your databases have a strong relationship. You back them up, and check them for corruption. They provide the business with information day after day.

Don’t let your relationship come to this!

Then, one day, your relationship is rocked. The backups are too slow. The restores are too slow. There is…gasp…corruption. Are you prepared to handle this? Will your working relationship continue? Take this quiz to find out!

  • 0 points – I’ve never done this in real life before
  • 1 point – I’ve done it once or twice in a test environment 
  • 2 points – I’ve done it in production
  • 3 points – I’ve done it in production and I’ve documented it

Here are your questions!

  1. You run DBCC CHECKDB regularly. Your job fails one day, with an error that corruption was detected. You need to figure out what object has the problem – a clustered index? a nonclustered index? a metadata page? – and repair it. You can identify the affected object and the corruption, and know several ways to fix it. (See the sketch after this quiz.)
  2. After this incident, the business owners of the database would like extra assurance that their data is safe. You’ve identified the steps you could take to validate backups and restores.
  3. You have a database that is very large. You have a window of 60 minutes to complete the backup each day, but it’s taking much longer. You implement a process to make backups faster.
  4. Due to its size, this database is also exceeding its RTO in a disaster recovery test. You re-architect the database and the restore process to make it faster.
  5. There are several related single-tenant databases for an application. You need to ensure that all related databases can be rolled back to immediately before a single transaction when upgrades occur.
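
For question 1, a starting point might look like the sketch below – the database name is a placeholder of mine, and never run a repair option until you understand what it discards:

-- Report every corruption error, not just the first few
DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;

-- Pages SQL Server has already flagged as suspect (checksum failures,
-- torn pages, 823/824 errors) - maps corruption to database/file/page
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages;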

0-2 points: Your relationship is doomed. You need to get my Backup & Recovery, Advanced training stat. As a matter of fact, you may want to start with my Backup & Recovery Step by Step training, to make sure you have the basics down.

3-6 points: Your relationship is rocky at best. Review my Backup & Recovery, Advanced training to make sure you’re prepared for all of the potential crises you could encounter.

7-10 points: Solid, but you still have room for improvement. Perhaps a few days in a test environment, with a little one-on-one attention and a review of all of the modules in my Backup & Recovery, Advanced training is just what you two need.

11+ points: You clearly have a great relationship. Take it to the next level by reviewing all of the scenarios outlined above and documenting those you haven’t tackled yet. Check out my video training for additional tips and ideas.
