Hey SQL Server DBAs — we’re hiring!

Here’s a quick five-question quiz. Give yourself one point for every “yes”:

  1. Have you been a database administrator for a few years?
  2. Does planning how to keep an application reliable and safe from disasters sound like fun?
  3. Do you love helping people make SQL Server go faster?
  4. Would you like to have a cartoon character of yourself?
  5. Can you see yourself dropping into our Office Hours sessions occasionally to help people for free?

If you got five points, you just might be the next Brent Ozar Unlimited employee!

Here’s what to do next:

Read more about the job here to find out what we’re looking for and what benefits we offer.

Then tell us about you!

We plan to accept applications through March, but don’t wait too long to submit — life moves fast sometimes.


Dear Brent Ozar Unlimited readers: thanks for following me for three great years. I’m off on a new adventure, but I hope to see you at a user group, a SQL Saturday, or a conference in the future!

I’m leaving with a lot of great memories – and here are a few of my favorites from the last three years.

My first company retreat was also my first SQL Cruise!

Coffee at Victrola before leaving port

Lunch in Ketchikan with Jeremiah and Kendra and other cruisers

I got to co-present my first full-day pre-con at PASS Summit with Brent, Jeremiah, and Kendra in 2013.

This was the only time I’ve ever been told I wasn’t talking loudly enough.

Body boarding at the company retreats in Mexico is always a blast.

We could do this all day. And we did.

We had awesome training weeks in Chicago in 2013 and 2014.

Teaching in Chicago. Talking with my hands, as usual.

We ate at Little Goat on May the 4th. Or, Doug usually dresses up for nights out. You can decide.

Little Goat! We love Stephanie Izard.

And I’ll always fondly remember our mascot from Mexico: donk!



Kendra says: Jes has been amazing and fun to work with. But you can tell that from the pictures! We wish her all the best and are proud to have her as an alumna.

Doug says: Thanks to you, Jes, I’ve learned a whole new repertoire of workplace-appropriate donkey jokes. (Not appropriate for other workplaces. Just this one.) We’re gonna miss you!

Jeremiah says: It’s been a blast working with Jes – for once I wasn’t the loudest person in the room! I’ll miss your energy and your love of log shipping.

Brent says: I can’t believe it’s been three years already since we first talked to Jes about the job – and in a coffee shop, of course. I’m so happy we got to be part of her journey, and I can’t wait to see what she does next.


How often are “bad” statistics to blame for a “slow” or “bad” query? Chances are, not nearly as often as you think.

What are “bad” statistics?

Most often, when people say statistics are “bad”, what they mean is “out-of-date”. But what does that mean?

Let’s compare two tables – dbo.OrderStatuses and dbo.Orders. OrderStatuses is a 25-row table that contains a master list of all possible order statuses. Only one or two changes occur per year. Orders is a 2-million-plus-row table that contains information about every order placed in our system. There are, on average, 10,000 new orders placed per day (currently).

Let’s say that statistics were last updated on dbo.OrderStatuses 90 days ago, and last updated on dbo.Orders 9 days ago. Are either of those “out of date”? The date the statistics were last updated isn’t as important as how many changes have occurred since the last update.

Roughly, any table with more than 500 rows needs a number of changes equal to 20% of its rows, plus 500, before its statistics will automatically update. (The formula for updating statistics is documented here.) This formula applies whether the table has 1,000 rows, 560,000 rows, or 2,000,000 rows. Larger tables require more changes to trigger an automatic update – thus, large tables are more susceptible to “bad” statistics.
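To make that concrete, here’s a quick back-of-the-napkin calculation (the row counts are just the example numbers from above):

```sql
-- Pre-2016 auto-update threshold for tables over 500 rows:
-- changes needed = (20% of rows) + 500
DECLARE @OrdersRows BIGINT = 2000000;

SELECT @OrdersRows AS table_rows,
       (@OrdersRows * 0.20) + 500 AS changes_before_auto_update;
-- 2,000,000 rows => 400,500 changes.
-- By contrast, a 1,000-row table needs only 700 changes.
```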

How do I tell if statistics are my problem?

Or, as the question is sometimes asked another way: is there a way to tell if statistics are my problem without looking at the query plan? When asked that way, I would answer, “Not easily.”

I most commonly find tables that have statistics problems by looking at a query’s actual execution plan and comparing the estimated and actual rows returned.

Finding out-of-date stats in other ways would require me to do some digging. I’d have to find the number of rows in an index, calculate the change threshold, and use sys.dm_db_stats_properties modification_counter to determine if it’s close to the threshold. Then I’d have to decide if I should manually update the stats or not.
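If you do want to dig, a query along these lines does most of that work in one shot (the table name here is hypothetical; the DMF requires SQL Server 2008 R2 SP2 or 2012 SP1 and later):

```sql
-- How close is each statistic on dbo.Orders to its auto-update threshold?
SELECT s.name AS stats_name,
       sp.last_updated,
       sp.rows,
       sp.modification_counter,
       (sp.rows * 0.20) + 500 AS approx_auto_update_threshold
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Orders');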

I prefer to approach the problem on a query-by-query basis instead.

What can you do to combat this?

Leave the default options to auto-create and auto-update stats turned on.
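Both are database-level options. If someone has turned them off, a sketch like this (database name assumed) checks and restores the defaults:

```sql
-- Check the current settings...
SELECT name, is_auto_create_stats_on, is_auto_update_stats_on
FROM sys.databases
WHERE name = N'YourDatabase';

-- ...and re-enable the defaults if needed.
ALTER DATABASE YourDatabase SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS ON;
```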

Run Ola Hallengren’s index maintenance script – or, his entire set of scripts. Ola’s index maintenance script has the parameter @UpdateStatistics, which allows you to choose index stats, column stats, or all. You can also choose to set @OnlyModifiedStatistics to true, so stats are only modified if rows have changed since the last stats update. This is a great way to include stats maintenance along with index maintenance in your regular routine.

If you have large tables which have frequent updates, but automatic stats updates and a weekly stats job aren’t enough to keep up, it’s acceptable to create a SQL Server Agent Job that updates the stats on that table on a regular basis – perhaps once a day.
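The job step itself can be as simple as this (the table name and sample rate are hypothetical – tune them for your data):

```sql
-- Daily targeted stats refresh for one busy table.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- On very large tables, a sample may be a better trade-off:
-- UPDATE STATISTICS dbo.Orders WITH SAMPLE 25 PERCENT;
```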

In SQL Server 2008 R2 SP1, Microsoft introduced trace flag 2371. With this trace flag enabled, rather than a set 20% change threshold, the more rows you have in a table, the lower the change percentage gets for an auto update. What is the formula? That’s not published yet, but one example is given in the KB – “For example, if the trace flag is activated, update statistics will be triggered on a table with 1 billion rows when 1 million changes occur.” Be careful with this option, though, and test it thoroughly before putting it in production. Frequent stats updates can cause frequent query recompilations – queries can take a bit longer to run, and CPU will be used more.
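If you do decide to test trace flag 2371, you can turn it on globally until the next restart, then verify it took effect:

```sql
-- Enable trace flag 2371 globally (lasts until the instance restarts).
DBCC TRACEON (2371, -1);

-- Verify it's enabled.
DBCC TRACESTATUS (2371);
```

For a setting that survives restarts, add -T2371 as a SQL Server startup parameter instead.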

What if statistics aren’t my problem?

The chances of poor query performance being caused by out-of-date statistics are very low.


There are so many things that factor into query performance that you should look at first. Is the query slow only for you, or for one set of values? The problem could be parameter sniffing. Evaluate the overall health of your SQL Server by checking your wait statistics, or using a tool like sp_AskBrent to see what is happening in your server right now. Is your server suffering from blocking? Review the query execution plan – are there obvious issues such as large scans, index seeks combined with key lookups, large sorts, or other common problems? And last but not least, do you have baselines to compare to?

Remember: bad (out-of-date) statistics can cause problems, but it’s rare. A poorly-performing query is always worth tuning!

Want to learn more about statistics? Check out my video Statistics: Beyond Performance. Need performance tuning help? Check out Developer’s Guide to SQL Server Performance!

Kendra says: I completely agree with Jes – I was once a “statistics blamer” and thought it was the source of all my slow query problems. It turns out that there was just a lot I didn’t understand.


Think AlwaysOn Availability Groups are right for your environment? Take my 6-question quiz to find out:

That’s from our in-person training classes, where we cover what DBAs and developers need to do for a successful AG implementation.

In those classes, here are some of the questions I get most often:

Q: How much network bandwidth will I need?

For a really rough estimate, sum up the amount of uncompressed transaction log backups that you generate in a 24-hour period. You’ll need to push that amount of data per day across the wire. Things get trickier when you have multiple replicas – the primary pushes changes out to all replicas, so if you’ve got 3 replicas in your DR site, you’ll need 3x the network throughput. Calculating burst requirements is much more difficult – but at least this helps you get started.
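msdb keeps backup history, so you can get that rough number with a query like this (it assumes your backup history retention goes back at least 24 hours):

```sql
-- Uncompressed log backup volume over the last 24 hours, in MB.
SELECT SUM(backup_size) / 1024 / 1024 AS log_backup_mb_last_24_hours
FROM msdb.dbo.backupset
WHERE type = 'L'  -- log backups only
  AND backup_start_date >= DATEADD(HOUR, -24, GETDATE());
```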

Q: What’s the performance overhead of a synchronous replica?

From the primary replica, ping the secondary, and see how long (in milliseconds) the response takes. Then run load tests on the secondary’s transaction log drive and see how long writes take. That’s the minimum additional time that will be added to each transaction on the primary. To reduce the impact, make sure your network is low-latency and your transaction log drive writes are fast.

Q: How far behind will my asynchronous replica be?

The faster your network and your servers are, and the less transactional activity you have, the more up-to-date each replica will be. I’ve seen setups where the replicas are indistinguishable from the primary. However, I’ve also seen cases with underpowered replicas, slow wide area network connections, and heavy log activity (like index maintenance) where the replicas were several minutes behind.

Q: What’s the difference between AGs in SQL 2012 and SQL 2014?

SQL Server 2014’s biggest improvement is that the replica’s databases stay visible when the primary drops offline – as long as the underlying cluster is still up and running. If I have one primary and four secondary replicas, and I lose just my primary, the secondaries are still online servicing read-only queries. (Now, you may have difficulties connecting to them unless you’re using the secondary’s name, but that’s another story.) Back in SQL 2012, when the primary dropped offline, all of the secondaries’ copies immediately dropped offline – breaking all read-only reporting queries.

Q: How do I monitor AlwaysOn Availability Groups?

That’s rather challenging right now. Uptime monitoring means knowing if the listener is accepting writeable connections, if it’s correctly routing read-only requests to other servers, if all read-only replicas are up and running, if load is distributed between replicas the way you want, and how far each replica is running behind. Performance monitoring is even tougher – each replica has its own statistics and execution plans, so queries can run at totally different speeds on identical replicas. There’s not really a good answer to this question right now.

Q: How does licensing work with AlwaysOn Availability Groups in SQL 2012 and 2014?

All replicas have to have Enterprise Edition. If you run queries, backups, or DBCCs on a replica, you have to license it. For every server licensed with Software Assurance, you get one standby replica for free – but only as long as it’s truly standby, and you’re not doing queries, backups, or DBCCs on it.

Q: Can I use AlwaysOn Availability Groups with Standard Edition?

Not at this time, but it’s certainly something folks have been asking for since database mirroring has been deprecated.

Q: Do AlwaysOn AGs require shared storage or a SAN?

No, you can use local storage, like cheap SSDs.

Q: Do Availability Groups require a Windows cluster?

Yes, they’re built atop Windows failover clustering. This is the same Windows feature that also enables failover clustered instances of SQL Server, but you don’t have to run a failover clustered instance in order to use AlwaysOn Availability Groups.

Q: Do I need a shared quorum disk for my cluster?

No, and check out Kendra’s video on failover cluster quorum to learn your options.

Q: What version of Windows do I need for AlwaysOn AGs?

We highly recommend Windows Server 2012R2, and here’s why.

Q: Can I have different indexes or tables on my replicas?

No, the replica database contents will be exactly the same as the primary.

Q: If I fail over to an asynchronous replica, and it’s behind, how do I sync up changes after the original primary comes back online?

That is left as an exercise for the reader. When I go through an AG design with a team, we talk about the work required to merge the two databases together. If it’s complex (like lots of parent/child tables with identity fields, and no update datestamp field on the tables), then management agrees to a certain amount of data loss upon failover. For example, “If fifteen minutes of data or less is involved, we’re just going to walk away from it.” Then we build a project plan for what it would take to actually recover >15 minutes of data, and management decides whether they want to build that tool ahead of time, or wait until disaster strikes.

Kendra says:  I wish I’d had this list when I started learning about AGs. I was also surprised to learn that crazy things can happen in an AG when you create indexes. Check out more on that here.


For the past two years, Red Gate’s Tribal Awards have let you, the SQL Server community, nominate and vote for your favorite community voices. Here are the 2014 winners:

This is the second year in a row that we’ve won the Best Blog award, and there are three sets of people I want to thank.

First, I want to thank you, dear reader, for giving us the award. We toil away over a hot keyboard to bring you all kinds of useful stuff in a humorous way. It’s great to know that you enjoy our work.

Second, I want to thank my coworkers. Whenever they finish a blog post, they send it to the rest of the team for feedback, and I learn something in every single blog post I read. This job is so much fun.

Finally, I want to thank the other Tribal Award winners and nominees. This community is so uplifting, sharing, and generous. I’m proud to be a member of the SQL Server community.

More @redgate Tribal Awards! This year we won Best Blog again, plus Best Twitterer.



Every time someone tells me, “This database is mission critical – we can’t have data loss or downtime,” I just smile and shake my head. Technology is seriously difficult.

To illustrate, here’s a collection of client stories from the last few years:

  1. The DBCC CHECKDB job ran every week just like it was supposed to – but it failed due to corruption every week. No one got email alerts because the SQL Agent mail was no longer valid – internal email server changes meant the mail was just piling up in SQL Server. CHECKDB had been failing for three years, longer than the backups were kept. Data was permanently lost.
  2. The DBA configured his backups to write to a file share. The sysadmins never understood they were supposed to back up that file share. When the DBA asked for a restore, he was surprised to find there were no backups.
  3. Three SQL Servers were all replicating data to each other. When I asked the DBA where the backups were run, he looked at one server, then another, then the third. He sheepishly admitted – in front of his manager – that there were no backups done anywhere.
  4. The DBA set up full backups daily, plus log backups of all databases in full recovery mode. Later, she put a few databases into simple recovery mode in order to fix an issue. She forgot to put them back into full recovery mode. When problems struck and she needed to recover a database, she lost all data back to the prior full backup.
  5. The SQL Server ran out of space on the C drive. During emergency troubleshooting, someone deleted a bunch of BAK files. The server started up, but databases were offline and corrupt. Turned out the user databases were on the C drive, as were all of the backups – the very backups that were just deleted to free up space.
  6. The DBA started getting odd corruption errors on one of his servers, then more, and quickly all of them. The SAN admin had flashed the storage with new firmware – which had a bug. The DBA was writing his backups to that same SAN, and sure enough, some of the corrupt databases had corrupt backups too.
  7. The admin wanted to restore the production databases onto another server. He tried, but it kept saying the files were in use. He stopped the SQL Server service, deleted the files, started it again, and finally his restore worked – but his phone lit up. Turned out he’d remote desktopped into the wrong server – he was on production.
  8. The developer did a deployment on Friday afternoon, tested it, and went home. However, the deployment had an UPDATE script that wrote invalid data to a percentage of the rows in a critical table. Over the weekend, people worked in that table, putting in valid data and changing some invalid data. On Monday, by the time the team figured out what had happened, the data in the table was a total mess – with real valuable data that shouldn’t be restored over.
  9. The team built a complex AlwaysOn Availability Groups infrastructure for really reliable databases, with full backups every night and log backups every 4 hours. When someone accidentally issued a delete statement, that delete was instantly replicated across to the other replica. They could still recover back to a point in time – but it meant serious downtime for the AG, since you can’t restore a database while it’s part of an availability group. While they debated what to do, more valid data went into their production database – meaning they could no longer simply do a point-in-time restore.

Each of these teams thought their data was safe.

They were wrong.

Kendra says: OK, I’ve got to admit it – I’ve been part of teams where we’ve fallen victim to more than one of these scenarios. And I work on teams with smart people! This can happen to you. Look at these stories and your own environment slowly and carefully.


Both Oracle and SQL Server offer several ways to support disaster recovery scenarios. One of the simplest ways to handle DR in the SQL Server world is to use SQL Server log shipping. Oracle doesn’t offer log shipping as a part of the core product, but it’s possible to set up log shipping in Oracle.

I thought it would be fun to take a look at Oracle and SQL Server log shipping side by side, so let’s take a look!

Log Shipping Set Up

SQL Server wins this one. Reason: there’s a wizard. Even a novice DBA can get log shipping working using the wizard.

Congratulations, Oracle professional – you’re setting this up by hand! If you’re at all handy with scripting, though, once you’ve built your log shipping scripts for the first time, you can port them to any other Oracle installation and be up and running in a few minutes. It’s only that first setup that might take you a while.

Although SQL Server wins for ease of set up, it’s worth reminding you – Oracle log shipping set up can be easily moved from server to server by deploying scripts and changing one or two variables.

Day to Day Operations

SQL Server’s log shipping adds extra requirements for database administrators. Log shipping sets up its own transaction log backup jobs – existing log backup jobs will need to be disabled or deleted. If a DBA takes a log backup outside of the normal log shipping jobs, that can potentially break log shipping. Switching a database into the SIMPLE recovery model will also break log shipping – and yes, this is more common than you’d think.

Oracle’s log shipping works by copying archived redo log files. There are no extra backup jobs to add. Instead, the log shipping jobs copy archived redo log files from the flash recovery area. Rotating out active redo log files will move them into the archive redo log file area. DBAs can take advantage of their existing backup strategy. It is still possible for an Oracle DBA to break log shipping by using NOARCHIVELOG mode or adding tablespaces or files without adding them on the secondary.

Of course, a DBA can also use the FORCE LOGGING option to prevent users from switching to NOARCHIVELOG mode and breaking the log shipping.

One other note – Oracle log shipping will move security between both instances while SQL Server’s log shipping won’t.

Reversing Log Shipping

In both cases, there’s a lot of work that has to be done when you want to reverse log shipping.

With SQL Server, you can gracefully reverse log shipping by using a “tail of the log” backup, but not a lot of people know how to do this properly so that they can avoid the overhead of re-initializing with a full backup. In the end, people frequently re-initialize log shipping, lose data, or just get very upset until they stumble upon the answer.
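Here’s a rough sketch of the graceful approach (the database name and paths are hypothetical):

```sql
-- 1) On the old primary: back up the tail of the log and leave the
--    database in RESTORING state, ready to become the new secondary.
BACKUP LOG MyDB
    TO DISK = N'\\backupshare\MyDB_tail.trn'
    WITH NORECOVERY;

-- 2) On the old secondary: restore that tail, then bring the database
--    online as the new primary. No full backup, no data loss.
RESTORE LOG MyDB
    FROM DISK = N'\\backupshare\MyDB_tail.trn'
    WITH RECOVERY;
```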

To reverse Oracle log shipping, we can use the same set up scripts, but change the name of the primary and standby servers. Depending on how things are set up, it may be possible to rapidly start the Oracle instance using the fast recovery area to bring the standby online and immediately resume log shipping.

If you’ve set up log shipping using robust scripts, the Oracle log shipping approach can be easily reversed. If you haven’t, reversing log shipping poses just as many problems on one system as the other.

Querying the Replica

It’s not uncommon to use log shipping to provide a readable replica. This can be very helpful when users can tolerate stale data for reports, but you have not been able to develop a reporting schema.

With SQL Server, we have to put the SQL Server into STANDBY mode. As Kendra Little explains in Reporting From a Log Shipping Secondary in STANDBY Mode, putting SQL Server into STANDBY mode is necessary to read from the secondary, but it can make restores take longer, which could, in some scenarios, put you outside of your recovery point objective.
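In case you haven’t seen it, a STANDBY restore looks roughly like this (paths hypothetical) – the undo file is what makes the database readable between restores:

```sql
-- Restore a log backup but keep the database readable between restores.
RESTORE LOG MyDB
    FROM DISK = N'\\backupshare\MyDB_20150101.trn'
    WITH STANDBY = N'S:\LogShipping\MyDB_undo.tuf';
```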

With Oracle, this process is easy – at any point during log shipping, we just mount the database in read only mode using the command ALTER DATABASE OPEN READ ONLY. Users will be able to read up to the last restored transaction. Once it’s time to start restoring data, you can start the database in recovery mode.


Licensing

Licensing is annoying and complex. If I get this wrong, sound off in the comments and I’ll do my best to clarify and correct.

SQL Server licensing can vary depending on when you licensed your SQL Server, but in general you get one “free” standby instance, as long as you don’t read from that instance. You do have to pay for Software Assurance on the log shipping secondary.

Oracle’s licensing is simpler: if it’s running, you pay for it. Also, you pay for support.

Oracle wins through simple licensing. If you’re using Oracle Enterprise Edition, you have many more options for disaster recovery, and much more entertaining ways to pay.

Compression and Encryption

When you’re moving backups across the network, compression can help meet your recovery point objective.

SQL Server can compress backups in the Standard Edition of the product. This can either be enabled as a default SQL Server level setting or in the log shipping jobs.
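Either way is a one-liner. A sketch (the per-backup form assumes you’re scripting your own log backups; names and paths are hypothetical):

```sql
-- Instance-wide default: compress all backups unless told otherwise.
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;

-- Or per backup:
BACKUP LOG MyDB
    TO DISK = N'\\backupshare\MyDB.trn'
    WITH COMPRESSION;
```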

Oracle’s compression is only found in either Oracle Enterprise Edition or customers using Oracle’s backup to the cloud feature – Oracle Database Backup Service. However, it’s trivial to leverage in-flight compression when moving files between Oracle instances. In a recent deployment, we used rsync to move files between primary and standby and enabled the -z flag to ensure we got compression.

Starting with SQL Server 2014, SQL Server supports backup encryption. Oracle Standard Edition, on the other hand, does not have backup encryption. In the Oracle world DBAs are far more likely to use SSH, scp, or rsync to move files between servers – all of these support encrypting data transferred between servers.

There’s no clear winner here – both compression and encryption are possible with both products, but they are built in to SQL Server. The Oracle approach lets system administrators adopt the practices already in use across the enterprise.


Oracle and SQL Server both have log shipping options available. Both tools are built on different sets of technologies and they both have different strengths and weaknesses.

Here’s how the two stack up:

Set Up: SQL Server has a wizard; Oracle is manual, but scriptable.

Operations: SQL Server has additional caveats to prevent breaking the log chain; Oracle is mostly automatic – caveats apply, but preventative measures are available.

Reversing Log Shipping: In SQL Server, it can be done with rocket science and custom scripts; in Oracle, scripts can be deployed on both primary and standby, and reversing requires switching several commands.

Querying the Standby: SQL Server – yes, but only in STANDBY mode, which can make restores slow (mitigate with complex scripts); Oracle – yes, in READ ONLY mode, but Oracle must be restarted to resume log shipping.

Licensing: SQL Server requires Software Assurance; Oracle requires a second, fully licensed instance.

Compression: SQL Server has backup compression built in; Oracle relies on OS, storage, or network compression.

Brent Says: WOOHOO, SQL Server actually wins a couple of these!

Kendra Says: Ain’t nothin’ wrong with a wizard, especially when you’re just getting started with something.


Recently, I wanted to play around with the auto_stats event against a test system running SQL Server 2014. I ran through the session setup GUI and added the auto_stats event. I configured it with a filter (predicate) to only show me auto_stats event in a database named AutoStatsTest. There’s a cost to events that may fire frequently and a cost to filters, but this is my test box and I was just using this to learn– so no biggie, right?

The wizard worked just fine. It created a session which scripted out as this:

CREATE EVENT SESSION [AutoStatsTest] ON SERVER  -- session name assumed
ADD EVENT sqlserver.auto_stats(
    WHERE ([database_name]=N'AutoStatsTest'))
ADD TARGET package0.event_file(SET filename=N'S:\XEvents\Traces\AutoStatsTest');

I started up the session and started running some code to create the AutoStatsTest database, populate data in a table, and then run some SELECT queries. I left the auto_create_statistics property enabled on the database so that the SELECT queries should trigger automatic creation of stats. I then deleted data and ran the selects again.
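My test script was along these lines (the table, row counts, and values here are a reconstruction for illustration, not the exact script):

```sql
CREATE DATABASE AutoStatsTest;
GO
USE AutoStatsTest;
GO
CREATE TABLE dbo.StatsDemo (Id INT IDENTITY PRIMARY KEY, Amount INT);

INSERT dbo.StatsDemo (Amount)
SELECT TOP (100000) ABS(CHECKSUM(NEWID())) % 1000
FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;
GO
-- Filtering on a non-indexed column should trigger auto-creation of a
-- column statistic, which fires the auto_stats event.
SELECT COUNT(*) FROM dbo.StatsDemo WHERE Amount = 42;
```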

After running my code, I opened up my trace file and saw…. nothing.

Nothing to see here, folks

Hmm. I thought maybe it was a latency issue. I waited and re-checked the file. I reran the code. No change. I set it up from the beginning, same thing happened. I drank more coffee.

Then I realized I had probably screwed up my filter.

I had set up a filter on my “Filter (predicate)” tab, but I hadn’t actually told it to collect database_name. I went back in and checked that off.

Whoops, I didn’t ask for that!

Now the session scripts out a bit differently — we’ve got SET collect_database_name=(1) in there:

CREATE EVENT SESSION [AutoStatsTest] ON SERVER  -- session name assumed
ADD EVENT sqlserver.auto_stats(SET collect_database_name=(1)
    WHERE ([database_name]=N'AutoStatsTest'))
ADD TARGET package0.event_file(SET filename=N'S:\XEvents\Traces\AutoStatsTest');

After recreating the trace with the new definition and saving it off, I re-ran my test script and this time it worked as expected: I got trace data to explore!

Oh, that’s interesting.

The Extended Events wizard can be a little rough around the edges. It didn’t warn me that I was doing something problematic. But it also made it pretty easy to set up my trace, and also test my theory about what was wrong with it.

Summing up: when you set up an Extended Events trace, just because you set a filter or predicate on a field doesn’t mean the session can use that filter. You may also have to specify that the field be collected.


Extended Events have been available since SQL Server 2008, but they are still a black box to most database administrators. Rather than focusing on the technicalities of sessions and targets, Jes will show the value of Extended Events by showing you what problems can be solved. Metrics can be gathered, performance problems can be targeted, and events can be tracked. Let me show you how easy it is!

Brent Says: The timeline in here showing XE changes from SQL Server 2008 to SQL 2014 is really helpful. Profiler is a dead man walking, and XE is more like a zombie that rose up from the dead and became a powerful superhero working for the good guys. (Look, writing analogies isn’t my day job.)

Kendra Says: I agree– Profiler stinks! I especially love Jes’ analysis of what SQL Trace had that we don’t have in Extended Events: super helpful and not something you’d notice right away. I’m totally on Team Extended Events, I just like knowing the little things I may need to code or manage around.


You can’t do everything with filtered indexes in SQL Server. For instance, you can’t create the following index:

CREATE INDEX IX_Votes_filter ON dbo.Votes (PostId) 
    WHERE (VoteTypeId = 1 OR VoteTypeId = 2);

If you try, you’ll get the error message:

Msg 156, Level 15, State 1, Line 5
Incorrect syntax near the keyword 'OR'.

Instead, you can use ‘IN’ and create the index this way:

CREATE INDEX IX_Votes_filter ON dbo.Votes (PostId) 
    WHERE (VoteTypeId IN (1,2));

That works– and good news, even queries written with ‘OR’ can use that filtered index, because SQL Server is clever like that. Here’s an execution plan that shows it in action.
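For reference, this is the kind of OR-based query that can still seek on the IN-filtered index:

```sql
-- Written with OR, but still able to use IX_Votes_filter.
SELECT PostId
FROM dbo.Votes
WHERE VoteTypeId = 1 OR VoteTypeId = 2;
```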

Index found!

The filtered index can also be used for just a subset of the VoteTypeIds. Here’s an example where I use it for VoteTypeId = 1 only. (Note that I had to force it to use the index – it preferred a different plan.)


Takeaway: SQL Server’s filtered indexes can be complicated and limited sometimes, but don’t get too tripped up by the fact that they don’t allow “OR”.

Need to learn more about indexes?
