Doctors Without Borders is a truly awesome charity. Medical professionals volunteer their own personal time to go to war-torn countries and solve issues of world health.
SQL Server community member, MCM, and all-around-good-guy Argenis Fernandez organizes Argenis Without Borders, a giving event to help. Last year, the SQL Server community helped raise $13,000 for this well-deserving charity.
Now it’s your turn. Please donate $25 if you can. It would mean a lot to me, and I know it’d mean a lot to people all over the world who urgently need medical help. You’ve got a great job – it’s your turn to give back.
One of your SQL Servers is going to fail.
When one of your AG members goes down, what happens next is just like opening a new SSMS window and typing BEGIN TRAN. From this moment forwards, the transaction log starts growing.
And growing. And growing.
SQL Server is waiting for that failed replica to come back online and start synchronizing again. The transaction log has to contain all of the data necessary to bring that once-failed replica back up to speed, to the current point in time. It doesn’t matter whether this replica is sync or async, important failover partner or an unimportant reporting server.
After every 4 hours of outage troubleshooting (at 4 hours in, 8 hours, 12 hours), I look at the facts:
- How much free space do I have left on the log file drive?
- How long will it take to reseed the failed replica by using a full backup?
- Based on these, should I remove the failed replica from the Availability Group?
This means at minimum, my AG replicas all need enough space to have 4 hours of transaction log activity. I can buy myself more time by disabling index defrag/rebuild jobs as soon as a replica fails, too. (In a perfect world, I’d code that into my index rebuild jobs, but that’s left as an exercise for the reader.)
After the emergency finishes, I don’t bother shrinking the log files back down – because after all, another replica is going to fail. It’s not a matter of if – it’s just a matter of when.
Knowing that all this is going to happen, it makes sense to pre-grow/size your transaction log files large enough to handle at least 4 hours of activity without a log backup. After all, when I’m troubleshooting an availability emergency, I don’t want to waste time troubleshooting a performance emergency. If my AG databases are constantly freezing up while they grow the log files, it’s going to be a bad distraction from the problem I really need to focus on.
When you build SQL Servers in your data center, and you license the host with Enterprise Edition, you get the ability to run as many virtual SQL Servers as you want on that hardware.
In the cloud, though, you haven’t had that luxury. In the past, with Azure VMs and Amazon Web Services’ EC2 VMs, you’ve had to license each individual VM. You weren’t able to guarantee that your VMs all lived on the same host, so you had to pay through the nose.
Soon, Amazon is fixing that with the ability to get your own EC2 Dedicated Hosts.
This is gonna be huge. You can license one AWS host, and then run as many SQL Server VMs on there as you want. Use the same cool AWS management tools, and dramatically cut your costs.
No word on release dates or pricing yet.
In the half-hour discussion, we talked about:
- Why I’m a huge believer in giving away stuff for free
- How I got started with Topspeed Clarion, then moved to SQL Server
- When I started blogging, and how that turned into a consulting company
Head on over to AFTK and listen, and stay tuned for part 2.
Time to make SQL Server demos a little more fun. Now you can download a torrent of the SQL Server database version of the Stack Overflow data dump.
- tl;dr – here’s a torrent with SQL Server database of the StackOverflow data
- Data export as of: 2015 August
- Torrent contents: 9GB 7Zip file containing the StackOverflow database MDF, LDF, and a Readme.txt
- Database size after extracting the 7z file: 70GB
As with the original data dump, this is provided under cc-by-sa 3.0 license. That means you are free to share this database and adapt it for any purpose, even commercially, but you must attribute it to the original authors (not us):
How to Get the Database
- Install a BitTorrent client – I recommend Deluge, a fast, free, easy open source one.
- Download & open this .torrent file – it’s a small metadata file that tells your BitTorrent client where to connect and start downloading the files.
- Wait. The 9GB file may take a few hours to download depending on your internet connection and how many other people are seeding the torrent.
- Extract the .7Zip file with 7Zip – it will create the database MDF, LDF, and a Readme.txt file, about 70GB in total. Don’t extract the files directly into your SQL Server’s database directories – instead, extract them somewhere else first, and then move or copy them into the SQL Server’s database directories. (This just avoids permissions hassles.)
- Attach the database – it’s in Microsoft SQL Server 2005 format, so you can attach it to any 2005 or newer instance. It doesn’t use any Enterprise Edition features like partitioning or compression, so you can attach it to Developer, Standard, or Enterprise Edition. (If your SSMS crashes or throws permissions errors, you likely tried extracting the archive directly into the database directory, and you’ve got permissions problems on the data/log files.)
Please leave the torrent up and running – seeding the torrent helps other folks get it faster.
Why I’m Using BitTorrent
BitTorrent is a peer-to-peer file distribution system. When you download a torrent, you also become a host for that torrent, sharing your own bandwidth to help distribute the file. It’s a free way to get a big file shared amongst friends.
The 9GB download is relatively large, so it would be expensive for me to host on a server. For example, if I hosted it in Amazon S3, I’d have to pay around $1 USD every time somebody downloaded the file. I like you people, but not quite enough to go around handing you dollar bills. (As it is, I’m paying for a seedbox to get this thing started.)
Some corporate firewalls understandably block BitTorrent because it can use a lot of bandwidth, and it can also be used to share pirated movies/music/software/whatever. If you have difficulty running BitTorrent from work, you’ll need to download it from home instead.
What’s Inside the StackOverflow Database
I want you to get started quickly while still keeping the database size small, so:
- All tables have a clustered index
- No other indexes are included (nonclustered or full text)
- The log file is small, and you should grow it out if you plan to build indexes or modify data
- It’s distributed as an mdf/ldf so you don’t need space to restore a backup – just attach it
- It only includes StackOverflow.com data, not data for other Stack sites
- The PostHistory table is not included
To get started, here’s a few helpful links:
You can download the next preview of SQL Server 2016 today, and the announcement blog post touted a few improvements to maintenance plans. Oddly, the screenshots in the announcement were apparently done with Microsoft Paint, so I fired up a VM to see how things really look.
The DBCC CHECKDB step in the wizard now has checkboxes for physical only and tablock:
The index reorg task now lets you do LOB compaction, scan index stats, and then only optimize the index based on statistics:
The rebuild task has similar options in the GUI, plus options about keeping the index online:
Updating statistics – I honestly don’t remember what older versions looked like. (I don’t step through that wizard often.)
The options on the full backup task got some attention as well:
While improvements are usually a good thing, I gotta be honest: these screens are pretty haphazard. Stuff doesn’t line up, warnings don’t fit into boxes, and it feels terribly slapped together. It doesn’t give me a high level of confidence, certainly not enough to steer me away from Ola Hallengren’s much more powerful and flexible free scripts. I get it – not everybody wants to use add-on scripts – but I’m just not filled with confidence when I see these screens.
That could mean we’re still quite a ways away from a SQL Server 2016 release, which is good news and bad news.
We just finished up the last of our 2015 training classes, and the reviews have been glowing:
“I have attended many training classes in my 19 years as a DBA, including Oracle and SQL Server. This is the first class where 75+% of the information I learned is applicable to my current work environment.” – Christina Z.
“I thought the class was much more useful than the Microsoft Official Curriculum. Those classes seem to teach to the book and instructors often can’t answer questions that aren’t included in the curriculum. You guys have real-world experience that you can share with the students. Very valuable!” – Kris C.
“They’re really best of the best on the market. And always open to QA sessions. And they asked do you have any questions? And ALWAYS have answers!!!!” – Vlad B.
And now, registration is open for next year’s training class lineup. Drum roll please:
Advanced Querying and Indexing – 5 days, $3,995
- Jan 11-15, Newark
- August 8-12, Portland
SQL Server Performance Troubleshooting –
4 days now expanded to 5 days, $3,995:
- February 22-26, San Diego
- April 11-15, Newark
- June 6-10, Chicago
- August 15-19, Portland
- December 5-9, Philadelphia
The Senior DBA Class of 2016 – 5 days, $3,995
- March 7-11, Chicago
- May 2-6, Denver
- July 11-15, Philadelphia
We’ve been updating each class to reflect new technologies and techniques, and we’re excited to share what we’ve been teaching our clients. And of course, if you’d like to bring one of us onsite to teach a custom class lineup, check out our training catalog and email us.
Sure, you’d much rather work with SQL Server 2014, but what’s more prevalent out in the real world? At Dell DBA Days, I saw a chart that shocked me:
That chart shows the number of servers running Dell Spotlight Essentials, a free SQL Server monitoring tool. Note that SQL Server 2014 has 4% of the population, about 1/4 as much as SQL Server 2005.
But the terrifying part: for every 2 SQL Server 2014 instances, there’s one SQL Server 2000 instance still kicking around! That’s not exactly a rapid adoption rate.
I wonder why we aren’t we upgrading our SQL Server 2000 instances? (And I’m not pointing the figure at you – it’s likely the business that’s making you keep it around.)
There’s a fine art to running a productive meeting, keeping the discussion focused and scoping things tight to meet your scheduled finish time. Today’s consulting lines post is about keeping people happy while still finishing meetings on time.
Let’s join a troubleshooting session where I’ve been brought in to help a client whose AlwaysOn Availability Groups are failing over randomly in the middle of the day.
Me: “According to the network control panel, this server has two Ethernet ports, but only one of them is plugged in?”
Sarah the Sysadmin: “Yeah. All our physical servers are set up that way, and I’m not happy about it either. We’re going to be virtualizing these servers early next year, and I’ve got a preliminary plan for the networking switches that I can show you. I can just present one network card to the guest and do the teaming at the host layer, right?”
Me: “That’s a great question. I’m going to set up a section of the whiteboard over here called the Parking Lot, and I’m going to write that down.”
What That Line Does
It conveys that the question is indeed a good one, and you’re qualified to answer it, but that the answer is going to take more than a minute or two.
It also sets up a normal protocol that can help you through the rest of the meeting – and heck, even the client relationship. One of my long-term clients got so trained that when I walked into their conference room, they’d already white boarded out the day’s agenda ahead of time – and set up a parking lot with a few items!
If you’re working remotely, you can use a shared Google Drive document to track the meeting’s agenda, notes, resources, and parking lot. (Just be careful with the Google permissions – those get kinda tricky when you’re working with multiple clients.)
What Happens Next
Sarah: “But this will only take a minute – ”
Me: “Are you willing to virtualize those servers this weekend?”
Sarah: “Oh definitely not, we have to buy the hosts and – ”
Me: “Can we just leave the failover problems as-is, and wait for the virtualization replacements?”
Mark the Manager: “No, we need the failovers fixed now, like yesterday.”
Me: “OK, so here’s the deal. I definitely want to help you configure the virtualization project, and the answer involves sketching out whether you’re using iSCSI or not. My contract for today’s engagement depends on me solving this failover problem, so I kinda gotta focus on that for now. If I don’t fix this, the executives won’t pay my bill, and my wife is getting tired of ketchup sandwiches, ha ha ho ho. So let’s put this in the parking lot, and we can talk about it as soon as we’ve got the root cause of the failover identified. Is that fair?”
For more meeting tactics, check out my consulting lines.