Doctors Without Borders is a truly awesome charity. Medical professionals volunteer their own time to go to war-torn countries and tackle urgent health crises.

SQL Server community member, MCM, and all-around good guy Argenis Fernandez organizes Argenis Without Borders, a giving event that supports this charity. Last year, the SQL Server community helped raise $13,000 for this deserving cause.

Now it’s your turn. Please donate $25 if you can. It would mean a lot to me, and I know it’d mean a lot to people all over the world who urgently need medical help. You’ve got a great job – it’s your turn to give back.

Donate now. Seriously, right now.

Doug demonstrates what he’s going to do to the failed replica.

One of your SQL Servers is going to fail.

When one of your AG members goes down, what happens next is just like opening a new SSMS window and typing BEGIN TRAN: from that moment forward, the transaction log starts growing.

And growing. And growing.

SQL Server is waiting for that failed replica to come back online and start synchronizing again. The transaction log has to contain all of the data necessary to bring that once-failed replica back up to speed, right up to the current point in time. It doesn’t matter whether the replica is sync or async, an important failover partner or an unimportant reporting server.
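You can see who’s holding the log hostage right from the primary – a minimal sketch using the standard sys.databases DMV:

-- Which databases can't truncate their log because an AG replica is down or behind?
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE log_reuse_wait_desc = N'AVAILABILITY_REPLICA';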

After every 4 hours of outage troubleshooting (at 4 hours in, 8 hours, 12 hours), I look at the facts:

  • How much free space do I have left on the log file drive?
  • How long will it take to reseed the failed replica by using a full backup?
  • Based on these, should I remove the failed replica from the Availability Group?

This means at minimum, my AG replicas all need enough space to have 4 hours of transaction log activity. I can buy myself more time by disabling index defrag/rebuild jobs as soon as a replica fails, too. (In a perfect world, I’d code that into my index rebuild jobs, but that’s left as an exercise for the reader.)
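If you want a head start on that exercise, here’s a minimal sketch – the table name is hypothetical, and you’d wire this into your own job framework – that bails out whenever any replica is unhealthy:

-- Skip index maintenance while any AG replica is disconnected or unhealthy.
IF EXISTS (SELECT 1
           FROM sys.dm_hadr_availability_replica_states
           WHERE synchronization_health_desc <> N'HEALTHY')
BEGIN
    RAISERROR (N'An AG replica is unhealthy - skipping index maintenance.', 10, 1);
    RETURN;
END

ALTER INDEX ALL ON dbo.Posts REBUILD; -- hypothetical maintenance work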

After the emergency finishes, I don’t bother shrinking the log files back down – because after all, another replica is going to fail. It’s not a matter of if – it’s just a matter of when.

Knowing that all this is going to happen, it makes sense to pre-grow/size your transaction log files large enough to handle at least 4 hours of activity without a log backup. After all, when I’m troubleshooting an availability emergency, I don’t want to waste time troubleshooting a performance emergency. If my AG databases are constantly freezing up while they grow the log files, it’s going to be a bad distraction from the problem I really need to focus on.
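Pre-growing is a one-time ALTER DATABASE – a sketch with made-up names and a made-up size (base yours on 4+ hours of your actual log generation):

-- Grow the log once, on your schedule, instead of during an emergency.
ALTER DATABASE StackOverflow
MODIFY FILE (NAME = N'StackOverflow_log', SIZE = 50GB);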


Every Wednesday, we get together for half an hour and take open questions about SQL Server. Here’s a recording of today’s webcast.

Want to join us on the next one? Register now.


Decisions are emotional, right? Brent loves Sanka and I love Postum. We fight about it all the time. But when we wanted to settle the debate once and for all, we engineered a test to figure out who was right. You can do the same thing to take the personal investment out of technical decisions at work.

Check Yourself at the Door

The first thing you need to remember is that your opinions are just as valid as everyone else’s. You still need to set them aside and be ready to be wrong.

If the other people involved don’t want to play along, just tell them “Heck, I’d be happy to be wrong, because I’ll learn something.” It’s not a bet or a contest – you’re just offering up your willingness to be wrong. Being wrong is great, especially when it’s your turn to be right later on.

Test, Test, Test

This dog knows more science than I do

The next step to making that decision is to design a test. The test’s outcome has to hinge on which opinion is right – its whole purpose is to get your opinions out of the conversation.

Doing this correctly is really hard. You need to figure out:

  • What are both sides saying?
  • Which metrics will prove both points?
  • What won’t prove anyone’s points?
  • What’s the fastest way to test both options?
  • What’s a realistic scale for testing?
  • What’s the worst that could happen if either side is right?
  • What’s the worst that could happen if either side is wrong?
  • If you can only run one test, which test should you run?

Hey, You Said This Was Easy!

All of this sounds like a lot of work. It turns out being factually right is just as much work as being factually wrong. If you really want to make sure you’re choosing the right solution to a problem, you need to figure out which option is the most right – both solutions could be good, but one might be better. The only way to get to proof is to test everything.

The next time there’s an argument on your team, or between two teams, figure out the best way to test each side’s ideas instead of spending your time arguing about which solution is the best.

Of course, you could always just fight in the parking lot like school kids. I hear that works well, too.


When you build SQL Servers in your data center, and you license the host with Enterprise Edition, you get the ability to run as many virtual SQL Servers as you want on that hardware.

In the cloud, though, you haven’t had that luxury. In the past, with Azure VMs and Amazon Web Services’ EC2 VMs, you had to license each individual VM. You couldn’t guarantee that your VMs all lived on the same host, so you had to pay through the nose.

Soon, Amazon is fixing that with the ability to get your own EC2 Dedicated Hosts.

This is gonna be huge. You can license one AWS host, and then run as many SQL Server VMs on there as you want. Use the same cool AWS management tools, and dramatically cut your costs.

No word on release dates or pricing yet.

Away From the Keyboard Podcast

Cecil Phillip (@cecilphillip) and Richie Rump (@jorriss) interviewed me for their podcast, Away From the Keyboard.

In the half-hour discussion, we talked about:

  • Why I’m a huge believer in giving away stuff for free
  • How I got started with Topspeed Clarion, then moved to SQL Server
  • When I started blogging, and how that turned into a consulting company

Head on over to AFTK and listen, and stay tuned for part 2.


Got questions about how to design indexes for performance in SQL Server? You’re not alone! Join Kendra to get answers to the top questions she gets about indexes. You’ll learn how to decide on key column order, how to tell if you should drop an index, whether it’s bad to use the clustered index key in your nonclustered index definition, and more.

From the Q&A: Grant Fritchey’s article on querying plans from the cache using query text.


Time to make SQL Server demos a little more fun. Now you can download a torrent of the SQL Server database version of the Stack Overflow data dump.

As with the original data dump, this is provided under the cc-by-sa 3.0 license. That means you are free to share this database and adapt it for any purpose, even commercially, but you must attribute it to the original authors (not us).


How to Get the Database

  1. Install a BitTorrent client – I recommend Deluge, a fast, free, easy open source one.
  2. Download & open this .torrent file – it’s a small metadata file that tells your BitTorrent client where to connect and start downloading the files.
  3. Wait. The 9GB file may take a few hours to download depending on your internet connection and how many other people are seeding the torrent.
  4. Extract the .7z file with 7-Zip – it will create the database MDF, LDF, and a Readme.txt file, about 70GB in total. Don’t extract the files directly into your SQL Server’s database directories – instead, extract them somewhere else first, and then move or copy them into the SQL Server’s database directories. (This just avoids permissions hassles.)
  5. Attach the database (there’s a scripted version in the sketch after this list) – it’s in Microsoft SQL Server 2005 format, so you can attach it to any 2005 or newer instance. It doesn’t use any Enterprise Edition features like partitioning or compression, so you can attach it to Developer, Standard, or Enterprise Edition. (If your SSMS crashes or throws permissions errors, you likely tried extracting the archive directly into the database directory, and you’ve got permissions problems on the data/log files.)
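If you’d rather script the attach than click through SSMS, here’s a minimal sketch – the paths and file names are examples only, so point them at wherever you copied the files:

-- Attach the extracted data and log files.
CREATE DATABASE StackOverflow
ON (FILENAME = N'D:\Data\StackOverflow.mdf'),
   (FILENAME = N'D:\Data\StackOverflow_log.ldf')
FOR ATTACH;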

Please leave the torrent up and running – seeding the torrent helps other folks get it faster.

Why I’m Using BitTorrent

BitTorrent is a peer-to-peer file distribution system. When you download a torrent, you also become a host for that torrent, sharing your own bandwidth to help distribute the file. It’s a free way to get a big file shared amongst friends.

The 9GB download is relatively large, so it would be expensive for me to host on a server. For example, if I hosted it in Amazon S3, I’d have to pay around $1 USD every time somebody downloaded the file (9GB at roughly nine cents per gigabyte of data transfer out). I like you people, but not quite enough to go around handing you dollar bills. (As it is, I’m paying for a seedbox to get this thing started.)

Some corporate firewalls understandably block BitTorrent because it can use a lot of bandwidth, and it can also be used to share pirated movies/music/software/whatever. If you have difficulty running BitTorrent from work, you’ll need to download it from home instead.

What’s Inside the StackOverflow Database

I want you to get started quickly while still keeping the database size small, so:

  • All tables have a clustered index
  • No other indexes are included (nonclustered or full text)
  • The log file is small, and you should grow it out if you plan to build indexes or modify data
  • It’s distributed as an mdf/ldf so you don’t need space to restore a backup – just attach it
  • It only includes Stack Overflow data, not data for other Stack Exchange sites
  • The PostHistory table is not included

To get started, here are a few helpful links:


An interesting question came up in our SQL Server Performance Tuning course in Chicago: when creating an indexed view, does it require an exclusive lock on the underlying table or tables?

Let’s test it out with a simple indexed view run against a non-production environment. (AKA, a VM on my laptop running SQL Server 2014.)

Tracing the Locks

To trace the locks, I used one of the built-in XEvents templates for tracing locks, added a filter so it only captured locks from my current session (53), and set it to write to an event file. The query scripts out like this:

CREATE EVENT SESSION LockTrace ON SERVER
ADD EVENT sqlserver.lock_acquired (
    SET collect_database_name = (1), collect_resource_description = (1)
    WHERE (
        [package0].[equal_boolean]([sqlserver].[is_system], (0))
        AND [package0].[equal_uint64]([sqlserver].[session_id], (53))
    )
)
ADD TARGET package0.event_file (SET filename = N'S:\Xevents\Traces\LockTrace.xel');

Creating the Indexed View

I test it out by creating and dropping the indexed view this way:

CREATE VIEW dbo.IndexMe
WITH SCHEMABINDING -- indexed views require schemabinding
AS
SELECT PostTypeId, COUNT_BIG(*) AS PostCount -- and COUNT_BIG(*) when grouping
FROM dbo.Posts
GROUP BY PostTypeId;
GO

-- This is where it becomes an indexed view.
CREATE UNIQUE CLUSTERED INDEX ix_indexme ON dbo.IndexMe (PostTypeId);
GO

DROP INDEX ix_indexme ON dbo.IndexMe;

Now let’s analyze the locks!

I stop my trace and open up the event file. I click around in the magic XEvents toolbar and group by object_id and mode (that’s the lock mode). And here’s what I get:



The low object_ids are internal system tables. Object_id 526624919 = dbo.Posts. Object_id 843150049 = dbo.IndexMe.
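If you’d rather query than click, a sketch like this reads the same event file and does the grouping in T-SQL (the XML paths follow the lock_acquired event’s standard fields):

-- Group the traced locks by object and lock mode.
WITH raw_events AS (
    SELECT CAST(event_data AS xml) AS event_xml
    FROM sys.fn_xe_file_target_read_file(
        N'S:\Xevents\Traces\LockTrace*.xel', NULL, NULL, NULL)
)
SELECT event_xml.value('(event/data[@name="object_id"]/value)[1]', 'int') AS locked_object_id,
       event_xml.value('(event/data[@name="mode"]/text)[1]', 'varchar(20)') AS lock_mode,
       COUNT(*) AS locks
FROM raw_events
GROUP BY event_xml.value('(event/data[@name="object_id"]/value)[1]', 'int'),
         event_xml.value('(event/data[@name="mode"]/text)[1]', 'varchar(20)');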

Translating this out:

  • dbo.Posts required S and SCH_S locks
  • dbo.IndexMe required X, SCH_S, and SCH_M locks

Creating the indexed view didn’t require exclusive locks on the underlying table. It did require shared locks, and it could require lots of IO or be part of a blocking chain with other transactions. But it only needed exclusive locks on the indexed view itself.


You can download the next preview of SQL Server 2016 today, and the announcement blog post touted a few improvements to maintenance plans. Oddly, the screenshots in the announcement were apparently done with Microsoft Paint, so I fired up a VM to see how things really look.

The DBCC CHECKDB step in the wizard now has checkboxes for physical only and tablock:
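Those checkboxes map to real DBCC CHECKDB options – in T-SQL terms, the step boils down to roughly this (the database name is just an example):

-- Physical-only check, taking table locks instead of using an internal snapshot.
DBCC CHECKDB (N'StackOverflow') WITH PHYSICAL_ONLY, TABLOCK;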



The index reorg task now lets you do LOB compaction, scan index stats, and then only optimize the index based on statistics:

Index Reorg

The rebuild task has similar options in the GUI, plus options about keeping the index online:

Index Rebuilds
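For reference, the reorg and rebuild checkboxes correspond to real ALTER INDEX options – a sketch with hypothetical table and index names:

-- Reorganize and compact LOB data (the new reorg checkbox):
ALTER INDEX ix_example ON dbo.Posts REORGANIZE WITH (LOB_COMPACTION = ON);

-- Rebuild while keeping the index online (Enterprise Edition only):
ALTER INDEX ix_example ON dbo.Posts REBUILD WITH (ONLINE = ON);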

Updating statistics – I honestly don’t remember what older versions looked like. (I don’t step through that wizard often.)

Updating Statistics

The options on the full backup task got some attention as well:

Full Backup Options

While improvements are usually a good thing, I gotta be honest: these screens are pretty haphazard. Stuff doesn’t line up, warnings don’t fit into their boxes, and it feels slapped together. That doesn’t inspire confidence – certainly not enough to steer me away from Ola Hallengren’s much more powerful and flexible free scripts. I get it – not everybody wants to use add-on scripts – but these screens don’t make the built-in option any more appealing.

That could mean we’re still quite a ways away from a SQL Server 2016 release, which is good news and bad news.
