StackOverflow went through a few infrastructure upgrades this week. Jeff Atwood blogged about the changes, and Kyle Brandt goes into more detail. Here’s a short recap of some of the SQL Server changes:
- Changed physical server models (from IBM/Lenovo to Dells)
- Went from two quad-core Xeons to two six-core Xeons
- Enabled hyperthreading (so it looked like 24 cores)
- Went from SQL Server 2008 to SQL Server 2008 R2
- Went from 6 SATA drives to 8 SATA drives (yes, they’re really doing over 2 million page views a day using a handful of cheap drives, but it’s all in the l33t programming, because they do their own caching)
- Did some index tuning
- Moved from one datacenter to another
That’s a lot of moving parts. When the dust settled, the CPU use on the SQL Server had gone up, not down. So where do we start troubleshooting?
I love good developers because they do unit testing. (Yes, readers, that means if you’re not doing unit testing, you’re not a good developer.) The most reliable way to troubleshoot is to boil things down to the simplest repeatable test possible. They identified two different queries that had dramatically different results between the old server and the new server. I looked at the execution plans produced by these queries on both the old and new servers. Everything was identical – same plans, similar reads, similar results – except for one thing: CPU time. You can measure CPU time for a query by executing SET STATISTICS TIME ON in SQL Server Management Studio. From that point forward, all queries in that session will have a bonus set of metrics in the Messages tab listing the milliseconds of CPU time used.
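The mechanics are simple. Here’s a minimal sketch (the table and query below are just placeholders, not the actual StackOverflow queries we tested):

```sql
SET STATISTICS TIME ON;

-- Run whatever query you want to measure (placeholder query shown here):
SELECT COUNT(*) FROM dbo.Posts WHERE Score > 10;

-- The Messages tab then reports something like:
--   SQL Server Execution Times:
--      CPU time = 1234 ms,  elapsed time = 1456 ms.
```

Run the same query on both servers with the same data, and the CPU time numbers become directly comparable.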
The same query doing the same amount of work took about twice as many milliseconds on the new server as it did the old server. That meant we had two possibilities: either something under the hood of SQL Server 2008 R2 had gone horribly awry and was doing twice as much CPU work for the exact same execution plan, or the new CPUs were twice as slow. The server manufacturers said the new CPUs ran at 3.33GHz – but did they really? I busted out the trusty free CPU-Z tool and ran it on the new servers:
Compare the “Specification” line to the “Core Speed” line. As we say in the business, wah wah wah waaaaah.
The new CPUs were Intel’s diabolical power-saving Nehalems, which have given me no end of grief. My hat is off to them for saving the planet by using less power, but my pants are down to them for killing performance without informing the users. In the last few weeks, I’ve seen several cases where server upgrades have resulted in worse performance, and one of the key factors has been throttled-down CPUs. In theory, the servers should crank up the juice according to demand, but in reality, that’s rarely the case. Server manufacturers are hiding power-saving settings in the BIOS, and Windows Server ships with a default power-saving option that throttles the CPU down way too often.
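If you want to check the Windows side of the equation, the `powercfg` command-line tool shows which power plan is active and lets you switch it. A minimal sketch (run from an elevated command prompt; the GUID is the stock High Performance plan that ships with Windows):

```bat
:: List the available power plans and show which one is active
powercfg /list

:: Switch to the built-in High Performance plan so Windows stops
:: throttling the CPUs down during light load
powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c
```

Keep in mind the BIOS power-saving settings can override whatever Windows asks for, so check both.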
I was able to prove the problem by running several simultaneous load-test queries that push SQL Server’s CPU hard. When running just one or two of them, the processors didn’t speed up, but if I ran enough of them, the server said, “Alright, alright, I get it. I’ll speed my cores up. Jeez.”
The instant I reduced the number of running queries, the CPUs got lazy again. If you think that’s bad, there might be servers in your datacenter right now running at just 800MHz, and they won’t even speed up when they’re under load:

There are a few morals to this story:
Periodically test your CPUs with CPU-Z. Know their spec speed and their current speed. Saving power by throttling down is fine, but understand that they may not run at full speed until they’re under very heavy load. StackOverflow was consistently running 50-60% CPU load, and the processors still weren’t speeding up. Since the power-saving speed was 1.6GHz, their CPU-intensive queries were literally running half as fast as they did on the old server.
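The math on that is worth spelling out. A quick back-of-the-napkin check (using the clock speeds from this story) shows why “half as fast” is no exaggeration for CPU-bound queries:

```python
spec_ghz = 3.33       # rated clock speed of the new Xeons
throttled_ghz = 1.6   # the power-saving speed CPU-Z actually reported

# For a CPU-bound query, throughput scales roughly with clock speed,
# so throttled cores deliver about half the rated performance.
ratio = throttled_ghz / spec_ghz
print(f"Running at {ratio:.0%} of rated speed")  # about 48%
```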
When you change your infrastructure, minimize variables. If you change half a dozen things at once, troubleshooting performance is much more difficult. I used to work at a company that did periodic outages to fix infrastructure issues like battery backups and air conditioning, and one of our IT team loved to take advantage of those excused outages to upgrade all his servers. More memory, different network cabling designs, new software versions, you name it, he’d do it. When the power came back on and his servers didn’t work, his users kept asking, “But I thought all we were doing was repairing the air conditioning?”
A/B test your solution with the smallest unit test possible. In StackOverflow’s case, we narrowed it down to just two queries, and we set up a separate testing environment where we could try out different solutions. When we narrowed it down to the CPU clock speeds, we were able to instantly say with confidence, “Yep, we found it, and now we can take production down to fix it.”
At this year’s PASS Summit, we’re adding a series of sessions called Lightning Talks. In a normal one-hour time slot, you can sit and watch several presenters give rapid-fire, 5-minute micro-sessions on various topics.
My Lightning Talk is “Storage Area Networks Simplified” – I’m covering the two most important things you need to know about SANs in just five minutes. Here’s a video of my rehearsal at home. I’m playing with iMovie’s green screen effects, but I forgot to wear a shirt that didn’t have green in it, so it has some sparkly action in the video. Ah, well, that’s what you get with a rehearsal.
My rehearsals ended up being a little over four minutes long, so I had to cut it even tighter. In a live environment, people will be (hopefully) laughing at some slides, so I won’t be able to deliver it this quickly. I just turned in my final deck, and I sliced out a couple of slides.
I’m excited to see how these turn out! The Lightning Talks schedule is:
- Tuesday 3PM-4:15PM – Room 201 (144)
- Wednesday 1:30-2:45PM – Room 201 (144)
- Thursday 1:00-2:15PM – Room 201 (144)
You can view which presenters are in each time slot at the PASS Lightning Talks page.
For the first time this year, the Professional Association for SQL Server will be broadcasting the PASS Summit keynote live on the web!
Each day of the Summit, Microsoft staff will open up the day with their keynote speeches:
- Tuesday – Ted Kummert will be talking about Microsoft’s commitment to mission critical applications and easier business intelligence. Given that the other two keynotes mention the next version of SQL Server, I bet Ted will also cover it here, meaning this will be your first look at vNext.
- Wednesday – Quentin Clark will be showcasing the next version of SQL Server, including “powerful features for data developers and a unified database development experience” – every time he says DAC Pack, you have to take a shot of espresso.
- Thursday – David DeWitt will cover query optimization and better execution plans in future versions of SQL Server
And it’s frrrrrreeee. Register now to watch the keynote live on November 9th. I’ll also be liveblogging it with my own trademarked irreverent take on the announcements. And maybe some Lady Gaga pictures. No promises, though.
When I deliver my performance tuning recommendations to my clients, I want them to be able to go back to my findings again and again over time, learning more each time they read. One of my favorite tools to help their journey is a list of recommended reading material.
I rarely give out links to whitepapers because they’re, uh, boring. Really boring. I won’t ask my clients to read something I hate reading.
Instead, I put a slide like this in my presentation to the client’s DBAs and developers:
When this slide comes up, I say:
“During our performance tuning work, we came across a lot of issues with how to read execution plans and why your queries aren’t performing as fast as you’d like. The biggest challenge in your environment was SQL Server’s inability to use your indexes to satisfy your queries – a concept called SARGability. It’s a big topic, and you’re going to want to learn more about this over time. These are the blogs and speakers I’d recommend that you check out to learn more.”
I even recommend that my clients contact bloggers directly if they’ve got questions about a particular post. I explain to them that bloggers WANT to help the community, and that’s why they blog.
I hope I’m getting you more exposure to the right people – the kind of people who are willing to pay money to get access to what’s in your head. I do that because I want to reward bloggers for sharing, and because there’s plenty of work to go around. It shows my clients that I want to give them the best possible advice at the lowest cost, and that I want to open their eyes to the SQL Server community. This community is just fantastic, and it keeps getting better.
I just fell in love.
If you hook up a projector to your laptop, you can configure the projector to be a second screen rather than a copy of what’s on your laptop display. When you do, your presentation shows up on the projector, but your laptop display shows something else – Presenter View. This has been around for several versions of PowerPoint on both Windows and Mac, but the new Office 2011 for the Mac just kicked it up a notch.
Here’s what my laptop screen shows while I’m presenting:
At the top, there’s a bar showing how far along you are in your presentation – giving you an at-a-glance idea of whether you’re on track. There are buttons for Swap Displays (for easier dual-monitor setup), Tips (showing keystroke shortcuts like B to black out the audience screen), and Exit Show.
On the left, “Step 3” is the current slide that the audience is seeing.
To the right, “One Approach: Wait” is the next slide that the audience will see. This helps you build better segues.
At the bottom left are my slide notes for the current slide.
At the bottom right, I can type in meeting notes live without exiting my presentation. This is great when someone asks me a question that I want to address later, or if there’s something I want to add to my presentation the next time I give it.
Move your mouse to the bottom of the screen, and PowerPoint pops up thumbnails for your slides, giving you a quick way to jump around from slide to slide without the audience getting dizzy:
Even if Office 2011 doesn’t bring any other improvements, that’s enough for me. This is the kind of tool that helps me be a better presenter. YAY!
If you’re an MSDN or TechNet subscriber, you can download Office 2011 for the Mac through your subscriptions downloads. Amazon’s selling Office 2011 for $175 today.
Bob Pusateri (Blog – @SQLBob) just posted his Bucket List – the list of things he wants to do before he kicks the bucket. Aaron Bertrand followed up with his. If you haven’t seen the movie The Bucket List starring Morgan Freeman and Jack Nicholson, I’d highly recommend it. Immediately after seeing that movie, I started assembling my own bucket list, and I’ve been gradually crossing off items ever since.
Here are the uncompleted items on my bucket list:
Take a road racing class – I’ve always been fascinated by race cars. Ironically, I drive really slowly on public roads, rarely getting a speeding ticket, but I still love going fast. I want to take a multi-day course by Skip Barber or Bob Bondurant.
Drive a Porsche 911 on a twisty road – I don’t really wanna own a 911 because the maintenance costs are crazy, I’d rarely use the performance, and downtown Chicago parking is expensive. Another parking spot would cost me $300-$400 per month. I’d like to just have a great driving experience with a 911, though, and only after taking the road racing class.
Visit Tokyo – I’ve traveled the world and seen almost every city I’ve ever wanted to see, but Tokyo still calls to me. I like the density, the strange cultural differences, and everything else I’ve learned about Tokyo.
Take everyone in our family on a cruise – not all at once, but individually. I’ve taken Mom, Dad, and Caryl (my stepmom) so far. Next up – my sister & crew (her husband, their son), then it’s over to Erika’s side of the family.
Have the #1 highest rated session at a conference – I’m not competitive. It’s not about being better than anybody else – it’s just about being as good as I can possibly be. I made the Best of the 2009 PASS Summit list with my high availability & disaster recovery session at #8, but something in me really wants to be el numero uno.
Retire while I’m young enough to enjoy it – after all, that’s what this is all about, right? I don’t want to work until I’m dead. I love what I do – and I mean, I really, really love it – but I could stay busy for the rest of my life doing things that aren’t on this list (but *are* on my GTD list). I’d like to learn more about architecture, engineering, wine, writing, snorkeling, photography, and boats. I don’t want to change the world, but I want to appreciate more of the wonderful, amazing things happening around me every day.
If you haven’t built a bucket list, go rent the movie, and start making your list. Work can wait. It’ll be there when you get back.
Consultants need to get around.
One of the key differences between consultants and contractors is that contractors tend to stay at the same company for weeks or months, while consultants jump around very quickly from project to project. Most of my engagements are just 3-4 days long: I parachute in, put out the burning fires, train the employees how to use my fire extinguishers, and then jet off to the next company. I get to see a lot of cutting-edge SQL Server implementations, people doing wild and crazy things with SQL Server. Sometimes the most valuable thing in my report is a line that says, “You’re outside the bell curve of normal implementations.” Because I see a lot of shops, and because I talk to a lot of DBAs when I travel to present, I’ve got a pretty good handle on what’s a normal SQL Server environment versus someone who’s pushing the boundaries.
I recently spent a day at Fog Creek Software in Manhattan training their developers. These guys (including the StackOverflow team) are among my favorite clients because they’re really solid developers. Joel Spolsky (Blog – @Spolsky) co-founded both companies, and he wrote Smart and Gets Things Done – a guide to hiring the best coders. You would expect that his team would push SQL Server a little harder than the average company.
One of their design patterns is to use a separate database per client. This approach has its pros and cons, but the important thing to note is that they’re running over 4,000 databases per SQL Server.
I’ll let that sink in – over 4,000 databases per server.
And did I mention that they’re using log shipping for disaster recovery?
It works, but SQL Server doesn’t ship with tools that make life easy for their developers or database administrators. Managing version deployments across thousands of databases is a nightmare, but they’ve built their own schema tools. DBA tasks like index tuning, index defragmenting, statistics updates, and procedure cache analysis – you name it – are all harder to manage at this scale. As a consultant, I can’t waltz in and say, “Well, golly, you need to throw all that out and rewrite your application to host lots of clients in a single database.” I can make a case for it, but there’s a big business cost at this point, and as long as it’s working for them, it’s tough for me to justify it.
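To give a sense of the scale problem, here’s a minimal sketch of what running one maintenance task across every database looks like (this is not Fog Creek’s actual tooling, and dbo.MyMaintenanceProc is a hypothetical placeholder procedure):

```sql
-- Build one EXEC per user database, then run the whole batch.
-- dbo.MyMaintenanceProc is a hypothetical placeholder procedure.
DECLARE @sql NVARCHAR(MAX) = N'';

SELECT @sql = @sql + N'EXEC [' + name + N'].dbo.MyMaintenanceProc;'
    + NCHAR(13) + NCHAR(10)
FROM sys.databases
WHERE database_id > 4;  -- skip master, tempdb, model, msdb

EXEC sp_executesql @sql;  -- against 4,000+ databases, even this takes a while
```

Every routine DBA chore turns into a loop like this, and anything per-database – backups, DBCC checks, statistics updates – multiplies by four thousand.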
My most valuable advice in that case is to say something like:
“You’re outside of the typical bell curve for SQL Server implementations. I usually see servers up to around 100-200 databases, but beyond that point it’s usually easier to manage by splitting the databases into multiple instances or multiple virtual servers. I’m not recommending that you simply build more SQL Server instances and divide the load – I don’t see a gain there – but there may be tools or features you can’t utilize due to this design pattern. For example, database mirroring won’t scale to this number of databases, and some off-the-shelf performance tools will have a tough time aggregating similar queries across databases. Your infrastructure isn’t broken – but you’re probably in the top .01% of SQL Server implementations in terms of number of databases per server.”
Clients need to know when they’re doing something unusual – even when you’re not recommending that they do anything about it.
(I got Joel’s approval before writing this, by the way – ordinarily I would never mention a client by name in public.)
Email servers restrict how big your file attachments can be. I frequently ask my clients to send me trace files and diagnostic logs that are hundreds of megabytes (even compressed), and doing this over email just doesn’t work. In the old days, we set up FTP servers to move files around, but FTP uploads aren’t intuitive for most users.
Web services like Filedropper.com and drop.io sprouted up to help alleviate this pain. Users just go to those sites, upload the file, and they get a URL they can share with anyone to download the file. They’re easy and useful, but they reserve the best features – like big file sizes or privacy – for paying users.
Now you can host your own file-upload service on your web server and use Amazon S3 for the cloud back end. The whole thing is pretty simple:
Step 1 – get an Amazon S3 account. I’ve long been a fan of Amazon’s Simple Storage Service (S3), a cheap cloud file server. Bandwidth is free until November 1st, 2010, after which it’s just $0.15 per GB, and storage costs are $0.15 per GB.
Step 2 – download the HTML files – Ricky Matata did the hard work of building a form with a Flash-based file upload component and hooking it up to Amazon S3.
Step 3 – edit the parameters in the files – there are a few parameters for things like the maximum file size you’ll allow, the number of files you’ll allow, and your Amazon S3 credentials. If you wanna get fancy, you can also edit the form itself – I simplified mine so that it only shows a file upload and nothing else.
Step 4 – upload the files to your web site & S3 – and presto, you’re done. You can point your clients to the web page whenever they need to send you files. As users upload files, they’ll be available in your Amazon S3 account – you can download them via the web. Depending on how you set up your privacy on your S3 folders (aka buckets), the public can also download those files – handy if you want to share files with multiple people.
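Under the hood, a browser-based S3 upload form like this works by presenting Amazon with a signed policy document: the policy says what the form is allowed to upload (bucket, key prefix, file size), and an HMAC-SHA1 signature proves it came from someone holding your secret key. Here’s a minimal sketch of that signing step in Python (the bucket name and secret key are placeholders – substitute your own):

```python
import base64
import hashlib
import hmac
import json

# Placeholder credentials -- substitute your own S3 secret key and bucket.
AWS_SECRET_KEY = "example-secret-key"
BUCKET = "my-upload-bucket"

# The policy limits what the signed form can do: which bucket,
# what key prefix, and how large an upload to accept.
policy = {
    "expiration": "2011-01-01T00:00:00Z",
    "conditions": [
        {"bucket": BUCKET},
        ["starts-with", "$key", "uploads/"],
        ["content-length-range", 0, 500 * 1024 * 1024],  # up to 500MB
    ],
}

# S3's browser-based POST scheme wants the policy base64-encoded, then
# signed with HMAC-SHA1 using your secret key; both values get embedded
# in the upload form's hidden fields.
policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8"))
signature = base64.b64encode(
    hmac.new(AWS_SECRET_KEY.encode("utf-8"), policy_b64, hashlib.sha1).digest()
)

print(policy_b64.decode("ascii"))
print(signature.decode("ascii"))
```

Because the secret key never appears in the form itself, visitors can upload straight to your bucket without being able to do anything the policy doesn’t allow.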
Good coders might add things like email alerts so that they get an email whenever someone uploads a file for them, or perhaps automatic deletion of the uploaded files after X days. I’m not a good coder, and I’m quite happy with this setup. I just tell the clients to let me know after they’ve sent me the files I needed.
Tomorrow evening, I’ll be in-person live at the New England SQL Server User Group. They meet in the Microsoft offices in Waltham, MA at 6:30PM. I’ll be covering one of my favorite topics, virtualizing SQL Server:
Virtualization isn’t here to make things faster – it’s here to make things cheaper. The more you know about the compromises it makes, the better you can configure your SQL Servers. Brent Ozar, a Microsoft Certified Master, knows this firsthand because he’s been both a VMware administrator and a SQL Server DBA. He’ll explain how virtualization changes CPU, memory, networking, storage, and high availability. You’ll learn what questions to ask your virtualization sysadmins and how to know if you’re leaving performance on the datacenter floor.
This is a whopping two-hour presentation covering VMware, Hyper-V, Xen, storage, high availability, and disaster recovery. I’ve got hours more of material with me, too, so if you’ve got questions, I’ve got slides.
To attend, RSVPs are required. The easiest way to RSVP is to sign up for their mailing list. This presentation will not be webcast – I’m packin’ light for this trip.
I just got my MVP prize package in the mail. The package is artificially big – it’s 95% cardboard, 5% content – but that’s because the really good stuff doesn’t come in boxes.
MVP awardees get a lot of free software licensing from not just Microsoft, but other software vendors as well. All of the SQL Server vendors give MVPs free individual licenses for most of their software, but it gets even better. Remember that the MVP program spans more than just SQL Server – vendors like VMware and TechSmith (Camtasia) support the MVP program too! There’s no way I could do as much community work without the help of these vendors, because my home lab is totally licensed thanks to them.
MVPs get access to some internal resources at Microsoft – things like webcasts about upcoming software changes or improvements. I’m hesitant to say that MVPs get access to Microsoft team members, though, because the public can get that same access. Microsofties won’t give you NDA information, but they’ll come to the public via blogs or conferences and ask for your opinion on how the software should work. Heck, anybody can just go to connect.Microsoft.com and submit ideas or changes they’d like to see. I’ve never had a Microsoft person say to me, “Well, I thought your idea sucked, but because you’re an MVP…” (Actually, they just stop after that first part of the sentence.) I’m not bashing the MVP program by any means – I’m saying that unlike other vendors I know and love <cough>Apple</cough>, the public can participate (somewhat) in the development process.
The biggest benefit by far, though, is communication with other MVPs. Your fellow MVPs are interesting people doing interesting things with Microsoft technology. They aren’t necessarily the smartest or best users of a particular product, but they’re the kinds of people who are willing to talk with you about what they’re doing. They’re helpful to the community and to each other.
You would imagine that Microsoft, being the vendor that brings us such killer communication tools as SharePoint, Outlook, Exchange Server, Messenger, Office Communication Server, and RoundTable, would really pull out all the stops when it comes to helping MVPs communicate with each other. You would be wrong. Until recently, we had an NNTP newsgroup server, and now we’ve got an email distribution list. It’s not the sexiest method of communication, but it works, and I love being a fly on the virtual wall when these people get into passionate arguments about niche features.
If – or when – I lose my MVP status, the one thing I’ll miss is that email list.