Monthly Archives: November 2007

Dinner with Wil Wheaton at Child’s Play

I know there’s absolutely no chance I’m going to win this, but I just have to try. For at least one brief moment in time, I’m the high bidder. Hooah!

My favorite web comic, Penny Arcade, runs a charity called Child’s Play that gives toys, games, books and cash for kids in hospitals. Every year, they throw a big live auction fest party thing, and this year they’re auctioning off a seat at Wil Wheaton’s table.

Any one of these things (meeting the real-life Tycho and Gabe, hanging with a geek legend, donating to a charity for sick kids, attending an amazing party in Seattle, etc.) is a good reason to get out the wallet, but when all of these factors come together, it’s a dangerous time for Mr. Savings Account.

(Note to any of my relatives reading this blog: don’t even think about bidding for this to surprise me, because it’s going to go for at least ten grand. I know you love me beyond any forms of monetary compensation, but no showing off. A Starbucks gift card is fine.)

Update on 12/3 – it went for only $1,035?!?  I’m heartbroken.

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

SQL Backup Software: Part 2 – Quad Cores are Changing the Game

In my last post, Why Native SQL Backups Suck, I talked about the weaknesses of native SQL Server backups. Today, I’m going to extend that a little by talking about one of the greatest surprises for DBAs in recent history: the advent of dirt-cheap multi-core processors that don’t cost extra for SQL licensing.

How SQL Server Licensing Is Affected by Quad Cores

Microsoft SQL Server is licensed by the CPU socket, not the core, so it costs the same to license a single-core CPU as it does a quad-core CPU. I’ve used that logic to convince executives to upgrade older single-core database servers to new multi-core hardware because they can often pay for the server hardware via license savings. It costs half as much to license a brand new 2-CPU quad-core box as it does to license a 4-CPU box with single cores, and the license savings can completely pay for the cost of a new server.
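To put rough numbers on that argument, here’s a back-of-the-envelope sketch in Python. The per-socket price below is a made-up placeholder, not a real Microsoft price, so plug in your own quote.

```python
# Hypothetical per-socket license price; substitute your own quote.
PRICE_PER_SOCKET = 25_000


def license_cost(sockets: int, price_per_socket: int = PRICE_PER_SOCKET) -> int:
    """Per-socket licensing: the number of cores per socket does not change the cost."""
    return sockets * price_per_socket


old_box = license_cost(sockets=4)  # four single-core CPUs = 4 cores
new_box = license_cost(sockets=2)  # two quad-core CPUs = 8 cores
savings = old_box - new_box

print(f"old: ${old_box:,}  new: ${new_box:,}  savings: ${savings:,}")
```

Whatever the real price is, the two-socket box costs half as much to license while carrying twice as many cores.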

Most of the time, quad-core CPUs aren’t really a compelling feature for database administrators because SQL Server runs into I/O bottlenecks far more often than CPU bottlenecks. We pour money into drives and HBAs and preach the benefits of RAID 10, but we don’t spend a lot of time comparing processors in great detail. I/O is the big bottleneck. This is especially true during the backup window. Backing up a SQL Server database consists of reading a lot of information from drives, and then writing that same information to another set of drives (either local or across the network).

So during the backup window, we have all these extra cores sitting around idle with nothing to do.

Let’s Do Something With Those Cores!

Why not use that extra idle CPU power to compress the data before we send it out to be written?

The users won’t notice because they’re already waiting on I/O anyway, especially during backup windows when we’re taxing the I/O subsystems.

If we dedicate this extra CPU power to data compression, we now have smaller amounts of data being sent out for writes. Our backup size gets smaller, which in turn decreases our I/O load! In effect, we’re trading CPU power for I/O power. The more CPU power we have for data compression, the more I/O we free up.

The equation gets interesting when we start to relate how much I/O speed we buy with each additional processor core. Going from a single-core CPU to a quad-core CPU enables a massive amount of backup compression power, which means much less data needs to be written to disk. If less data is being written to the backup target, then we have two options: our backup windows become shorter, or we can use cheaper/slower disks.
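Here’s a rough sketch of that tradeoff in Python. It assumes the backup target’s write speed is the only bottleneck and that the CPUs can compress fast enough to keep up. The database size, compression ratio, and throughput figures are invented for illustration.

```python
def backup_minutes(data_gb: float, compression_ratio: float, write_mb_per_sec: float) -> float:
    """Minutes to write a backup, assuming the write target is the bottleneck.

    compression_ratio is original size / compressed size (1.0 = no compression).
    """
    compressed_mb = data_gb * 1024 / compression_ratio
    return compressed_mb / write_mb_per_sec / 60


# Hypothetical 500 GB database.
native = backup_minutes(500, compression_ratio=1.0, write_mb_per_sec=100)
shorter_window = backup_minutes(500, compression_ratio=4.0, write_mb_per_sec=100)
cheaper_disks = backup_minutes(500, compression_ratio=4.0, write_mb_per_sec=25)

print(f"native, fast disks:     {native:.0f} min")
print(f"compressed, fast disks: {shorter_window:.0f} min")
print(f"compressed, slow disks: {cheaper_disks:.0f} min")
```

With a 4:1 ratio, the same array finishes in a quarter of the time, or an array a quarter as fast finishes in the original window. Those are the two options described above.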

Using Backup Compression To Save Money

Choosing the latter method means that the shiny new quad-core database server may pay for itself. I’ve been able to say, “You need more drives for your new project? I’ll sell you my RAID 10 of high-end, 73GB 15K SAN spindles because I’m downsizing to a RAID 5 SATA array.” Trading off those expensive drives enabled me to buy more quad-core database servers, which could compress the backup files better, and I could live with the SATA drives as a backup target. My backup time window stayed the same, and I gained faster CPU power outside of my backup window because I had more cores.

Cheap quad-core processors enable a database administrator to trade CPU power for I/O speed in the backup window – but only when using those newfound cores to actively compress the backup data. SQL Server 2000 & 2005 can’t natively do that, and that’s where backup compression software comes in.

The same quad-core power works in our favor at restore time, too. During restores, the SQL Server has to read from the backup file and then write those objects out to disk. With backup compression software, the server does fewer reads from the backup file because the backup is smaller. This means faster restores with less I/O bottlenecking, and fast restore times are important to a DBA’s career success. The faster we can restore a database in an emergency, the better we look.

Old Servers Trickle Down to Dev & QA

This pays off in another (albeit obscure) way: development & QA servers. At our shop, we’re constantly replacing big, multi-CPU (but single-core) servers with smaller quad-core servers. As a result, we have a lot of 4-way and 8-way servers lying around that are relatively expensive to license in production. They make absolutely perfect development & QA SQL Servers, though, since SQL Server Developer Edition isn’t licensed by the socket, but instead at a flat rate. I’ve been able to take these 8-way servers by saying, “No one else can afford to license these for their applications, but I can use them for development.” Then, those 8 cores pay off in faster restores from our production database. I’m able to refresh development & QA environments in shorter windows because I can uncompress the backups faster than I would on a smaller server.

If faster backup & restore windows were the only tricks available in backup compression software, those alone would be a great ROI story, but there’s more. In the next part of my series, New Features for SQL Backup and Restore, we’ll look at ways backup software vendors are able to jump through hoops that native backups can’t.

Continue Reading New Features for SQL Server Backups

SQL Server Backup Software: Part 1 – Why Native SQL Backups Suck

Before we start looking at SQL Server backup compression software, we need to spend a few minutes looking at the weaknesses of the native SQL Server backup process. In order to judge the fixes, we have to know what’s broken.

Native SQL Server backups take the same disk space as the data.

When we back up 100GB of data with a native backup, we’ll end up with a 100GB backup file. If a database has 100GB allocated, but it’s half empty (for example, because the data files were pre-grown and still hold free space), then the backup size will be roughly 50GB – the size of the data.

Large amounts of data take a long time to write to disk.

The slowest thing in the backup process is usually writing the backup file, whether it’s over the network or to local disk. Reads are typically faster than writes, so unless the database is under heavy transactional load at the time of the backup, the reads won’t be the bottleneck. As a result, the more data that has to get written to disk, the longer the backup will take.

We could alleviate that by purchasing faster and faster arrays for our backup targets, but that gets pretty expensive. Our managers start to ask why the DBA’s fastest RAID array is being used for backups instead of the live data!

Large amounts of data take a REALLY long time to push over a network to a DR site.

This affects log shipping or just plain copying backup files over the WAN. Compressing the data by as little as 25% cuts transmission times by roughly that same amount, and cuts the amount of bandwidth required to replicate the application data. In a large enterprise where multiple applications are competing for the same WAN bandwidth pipe, other teams will ask why the SQL DBA can’t compress their data before sending it over the wire.
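For a feel of what that means in hours, here’s a minimal Python sketch. It assumes the WAN link is the only bottleneck and that the backup job gets the whole pipe. The 100 GB backup and 45 Mbps link are hypothetical.

```python
def transfer_hours(backup_gb: float, link_mbps: float, size_reduction: float = 0.0) -> float:
    """Hours to copy a backup over a WAN link, assuming the link is the bottleneck.

    size_reduction is the fraction removed by compression (0.25 = 25% smaller).
    """
    megabits = backup_gb * (1 - size_reduction) * 1024 * 8
    return megabits / link_mbps / 3600


uncompressed = transfer_hours(100, link_mbps=45)
compressed = transfer_hours(100, link_mbps=45, size_reduction=0.25)

print(f"uncompressed: {uncompressed:.1f} hours")
print(f"25% smaller:  {compressed:.1f} hours")
```

On a shared link the real numbers will be worse, which only makes the case for shrinking the files stronger.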

We can work around that problem by installing WAN optimization hardware like a Cisco WAAS appliance, but these have their own drawbacks. They must be installed on both ends of the network (the primary datacenter and the DR site), require a lot of management overhead, and they’re expensive. Really expensive.

Another workaround is to compress the backup files with something like WinZip after the backup has finished, but that’s a manual process that has to be automated by the DBA, actively managed, and adds a lag time for the compression before the data can be sent offsite.
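Automating that workaround might look something like the Python sketch below. It uses gzip from the standard library as a stand-in for WinZip, and it assumes backups land as `.bak` files in a single folder. It’s a sketch of the idea, not a hardened job.

```python
import gzip
import shutil
from pathlib import Path


def compress_backups(backup_dir: Path) -> list[Path]:
    """Gzip every .bak file in backup_dir and return the compressed paths.

    Each original is deleted only after its .gz file has been fully written.
    """
    compressed = []
    for bak in sorted(backup_dir.glob("*.bak")):
        target = bak.with_name(bak.name + ".gz")
        with bak.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        bak.unlink()
        compressed.append(target)
    return compressed
```

Even scripted, this has the drawbacks described above: it can’t start until the backup has finished (it would happily compress a half-written file), and the offsite copy has to wait for the compression pass.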

SQL Management Studio doesn’t come with reports about the backup process.

Business folks like to say, “You get what you measure.” The idea is that if you start numerically measuring something in an objective way, that number will start to improve simply because you’re focusing on it and talking about it with others. SQL Server native backups are something of a black box: there’s no quick report to show how long backups are taking per database, how often they’re failing, and how long it would take to do a full restore in the event of an emergency.

I find it hilariously ironic that my job as a database administrator revolves around storing precise metrics for others, enabling them to do dashboards and reports, but SQL’s native backup system doesn’t offer any kind of dashboard or report to show its own backup & restore times and successes. SQL 2005 SP2 started to offer some database performance reports inside of SQL Server Management Studio, but they still don’t address the backup/restore metrics.

An ambitious DBA could build their own reports, but they have to manually consolidate data from all of their database servers across the enterprise and keep it in sync. Whew – I get tired just thinking about that. (I should probably subtitle my blog as “The Lazy DBA” come to think of it.)
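For the ambitious, the raw material is the backup history SQL Server keeps in `msdb.dbo.backupset`. Here’s a rough Python sketch of the consolidation step. It assumes you’ve already run the query on each server and tagged every row with the server name; how you connect and collect those rows is left out, and the function name is my own invention.

```python
from collections import defaultdict
from datetime import datetime

# History query to run on each server; type 'D' means a full database backup.
HISTORY_QUERY = """
SELECT database_name, backup_start_date, backup_finish_date
FROM msdb.dbo.backupset
WHERE type = 'D'
"""


def average_backup_minutes(rows):
    """Average full-backup duration in minutes per (server, database).

    rows: iterable of (server, database, start, finish) tuples, where start
    and finish are datetime values.
    """
    durations = defaultdict(list)
    for server, database, start, finish in rows:
        durations[(server, database)].append((finish - start).total_seconds() / 60)
    return {key: sum(vals) / len(vals) for key, vals in durations.items()}


# Hypothetical rows gathered from two servers.
sample = [
    ("SQLPROD1", "Sales", datetime(2007, 11, 1, 1, 0), datetime(2007, 11, 1, 1, 30)),
    ("SQLPROD1", "Sales", datetime(2007, 11, 2, 1, 0), datetime(2007, 11, 2, 2, 0)),
    ("SQLPROD2", "HR", datetime(2007, 11, 1, 1, 0), datetime(2007, 11, 1, 1, 10)),
]
print(average_backup_minutes(sample))
```

Note what’s still missing: the history only covers backups that completed, so failure rates and restore-time estimates would need yet more plumbing.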

Cross-server restores are a ton of manual work.

If the DBA wants to bring a development server up to the most current production backup, including transaction logs, they either have to write a complicated script to parse through a list of available backups, or they have to do a lot of manual restores by picking files.
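The core of that “complicated script” is picking the right files in the right order. Here’s a simplified Python sketch of just that step. It assumes you already have a list of backups with their type and finish time (the `Backup` record is hypothetical), and it ignores differential backups and doesn’t verify that the log chain is unbroken.

```python
from datetime import datetime
from typing import NamedTuple


class Backup(NamedTuple):
    kind: str  # "full" or "log"
    finished: datetime
    path: str


def restore_chain(backups: list[Backup]) -> list[Backup]:
    """Return the newest full backup followed by every later log backup, oldest first."""
    fulls = [b for b in backups if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available")
    latest_full = max(fulls, key=lambda b: b.finished)
    logs = sorted(
        (b for b in backups if b.kind == "log" and b.finished > latest_full.finished),
        key=lambda b: b.finished,
    )
    return [latest_full, *logs]
```

The restore itself would then apply each file in that order, leaving the database in a restoring state until the final log has been applied.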

Even worse, in the event of a disaster, the database administrator has to scramble through directories looking for t-logs, writing restore scripts, and hoping they work. Really good DBAs plan this scenario out and test it often, but let’s be honest: most of us don’t have that much time. We write our DRP scripts once, test them when we have to, and cross our fingers the rest of the time.

That frustrates me because the restore process has been the same since I started doing database administration back in 1999 as a network admin. For years, I looked for the most reliable restore scripts I could find, played with them, spent time tweaking them, and wasted a lot of time. In my mind, this is something that should be completely integrated with the shipping version of SQL Server just because it’s so central to a DBA’s job.

Enough Backup Problems: Show Me Solutions!

So now we’ve seen some of the weaknesses in SQL Server 2005’s native backups. In my next couple of blog posts, I’ll talk about how third party backup compression software gets around these obstacles and offers better features with more capabilities.

Continue Reading with Part 2: Quad Cores are Changing the Game

Flock 1.0: whoa, that’s some vision!

A long time ago in a galaxy far, far away, I understood a vision of the alternative web browser Flock.

I spent more than a few (but fewer than a few tens) of hours testing the latest builds and faithfully reporting bugs.  I used it as my primary browser, and I liked what I saw.  In late 2006, I stopped using it because it looked like Flock was selling out to Yahoo.

But I say “a” vision, not “the” vision, because I’m just now trying Flock 1.0 after not using it for a few months, and the vision I saw back in 2006 doesn’t at all match up with what Flock 1.0 delivers.  The vision of Flock 1.0 is looking way, way down the road, much farther than the earlier builds.  Yep, it has still kinda sold out to Yahoo, but it’s not that intrusive.  Plus, now that Yahoo has bought out both Del.icio.us and Flickr, I’m not so sure that selling out to Yahoo doesn’t represent a pretty good vision.

Flock 1.0 includes a home page called “My World” that centralizes my contacts from Facebook, Flickr, YouTube, Twitter, Blogger and more.  On one page, I can see everybody’s recent updates (no matter what site they updated).  Within a few hours of installing Flock 1.0, I found myself digging deeper into each of my friends’ online presences, examining what sites they’ve joined and trying to see more about their lives.  For example, I had no idea Lan was uploading photos to Facebook, or that she had a YouTube account.  I started wondering how web geeks were ever going to survive without a single, consolidated view of everybody’s presence.  I wanted to start pushing all of my presence tools (Twitter especially) onto everybody I knew, and I wondered when Flock would support more sites (LinkedIn.com first, Last.FM next).

I get it!  There’s some great vision going on here.

As an avid iPhone user, I didn’t want to like Flock 1.0.  I’d read Daryl’s post about Flock 1.0 and thought I had to at least try it, but I figured I wouldn’t like it because I love Bloglines.  With Bloglines, I can subscribe to RSS feeds, and then read them from any web browser.  When I read articles on my iPhone, they’re automatically marked as read, so when I later peruse feeds from my office Mac, they’re marked as read.  Same thing with when I read feeds from my home Windows machine.  Flock is a workstation-based feed reader, so when I read feeds on my home Windows machine with Flock, those feeds aren’t marked as read on Bloglines (or on my Flock setup on the Mac), so I have to mark them as read again.  That’s nowhere near elegant.

But you know what I discovered?  The integrated, people-aware browser is so gosh-darned compelling that I don’t mind.  I use Flock at home on my Windows box, and it’s fantastic.  It doesn’t replace Bloglines for feed reading, but for being people-aware, there’s nothing even close to it. I haven’t tried the Me.dium plugin yet, but I’m eying it lustfully.

Flock’s got potential.  What’s missing?  Well, it needs to sync profiles across machines.  When I mark an article as read from my work Mac, it needs to mark that article as read on my home Windows machine, and on my iPhone.  Yeah, I know, that’s almost impossibly difficult, and there’s an extremely low number of folks like me that run Windows, Mac and an iPhone, but that’s what being an early adopter is like.

But here’s the kicker for me: that’s the only thing missing so far.

I’ve been giving it a shot as my primary browser at home for the last couple of days, and I’m going to go for it at work too.  I won’t say I’ll give up Bloglines by any means, but Flock 1.0 is pretty darned good.  Good enough that I’m going to try this blog post from Flock’s built-in blog editor.

Blogged with Flock

SQL Server 2008 November CTP is out

Hot fresh downloads, get your hot fresh downloads here.

“Buckwheat” apology rejected by Hazel Boykin

Louisiana State Representative Carla Blanchard Dartez apologized today for her ‘Buckwheat remark.’ She ended a phone conversation with Hazel Boykin, the NAACP’s local president, by saying, “Talk to you later, Buckwheat.”

Boykin’s office responded with a cryptic remark, saying that Dartez was “wookin pa nub in all the wong pwaces.”

(And lest anyone think I’m poking fun at Hazel Boykin, I’m not – I’m definitely rolling my eyes at Dartez and anybody who would think of voting for her. Throw her out of office immediately.)

SQL Performance Tuning: Estimating Percentage Improvements

When I’m doing performance tuning on an application in the early stages of its lifecycle (or any app that’s never had DBA attention before), I end up with a ton of recommendations within the first day of performance tuning.  The resulting to-do list can seem overwhelming to project managers and developers, so I include one of the following two sentences as a part of each recommendation:

  • This change will improve performance by a percentage, or
  • This change will improve performance by an order of magnitude

I know, I know, that phrase doesn’t come up too often, so it helps to check out Wikipedia’s definition of order of magnitude:

“Orders of magnitude are generally used to make very approximate comparisons. If two numbers differ by one order of magnitude, one is about ten times larger than the other.”

So the two sentences translate into:

  • This change will improve performance by 10-90%, or
  • This change will improve performance by 10-100x

I could use those latter two sentences instead of the “percentage versus order of magnitude” sentences, but those latter sentences make me sound like I’m taking wild, uneducated guesses.  In reality, sure, I am taking wild, uneducated guesses, but on an informed basis – I’m just not putting a lot of time into categorizing the improvements.
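One way to see how the two buckets relate is the Python sketch below. It reads “improve performance by N%” as “N% less elapsed time,” which is my assumption rather than anything stated above. Under that reading, a 90% improvement is exactly a 10x speedup, which is where the percentage bucket ends and the order-of-magnitude bucket begins.

```python
def speedup_factor(percent_reduction: float) -> float:
    """Convert an N% cut in elapsed time into an 'Nx faster' multiplier."""
    if not 0 <= percent_reduction < 100:
        raise ValueError("percent_reduction must be in [0, 100)")
    return 100 / (100 - percent_reduction)


def category(percent_reduction: float) -> str:
    """Bucket an estimate into one of the two sentences."""
    if speedup_factor(percent_reduction) >= 10:
        return "order of magnitude"
    return "percentage"


for pct in (10, 50, 90, 99):
    print(f"{pct}% less time = {speedup_factor(pct):.1f}x faster ({category(pct)})")
```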

Jeff Atwood’s excellent Coding Horror blog has a two-part post about estimation that should be required reading for every DBA.  Part 1 is a quiz, and Part 2 explains the answers.

So why am I leaving so much gray area in my recommendations?  Why break suggestions into such widely varied categories?  Is it smart to lump a 10% improvement in with an 80% improvement?  In the early stages of performance tuning, yes, because the DBA can’t necessarily predict which changes will be the easiest to implement, or the order in which they’ll be implemented.  When a database administrator first looks at an application, queries, stored procedures and database schema, some things pop out right away as massive opportunities for performance gains.  These recommendations are so instrumental to application performance that they often have wide-ranging impacts across the entire app.  After those changes are made, everything speeds up so much that the other recommendations have even less of an impact than they might have originally had.
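To illustrate that last point with invented numbers, here’s a small Python sketch of a nightly job. The workload split and the savings percentages are hypothetical. It shows the same “percentage” recommendation saving far fewer minutes once the big fix is already in place.

```python
# Hypothetical nightly job: minutes spent in each area before any tuning.
baseline = {"big_tables": 540.0, "everything_else": 60.0}


def tune(workload: dict, area: str, fraction_saved: float) -> dict:
    """Return a copy of workload with one area's time cut by fraction_saved."""
    tuned = dict(workload)
    tuned[area] *= 1 - fraction_saved
    return tuned


def minutes_saved(workload: dict, area: str, fraction_saved: float) -> float:
    """Total minutes a change would save if applied to this workload."""
    return sum(workload.values()) - sum(tune(workload, area, fraction_saved).values())


# Recommendation A: the order-of-magnitude fix (90% off the big-table work).
# Recommendation B: a percentage fix (20% off the same work).
b_alone = minutes_saved(baseline, "big_tables", 0.20)
after_a = tune(baseline, "big_tables", 0.90)
b_after_a = minutes_saved(after_a, "big_tables", 0.20)

print(f"B applied first saves about {b_alone:.0f} minutes")
print(f"B applied after A saves about {b_after_a:.0f} minutes")
```

Recommendation B didn’t get any worse; there’s simply much less time left for it to save.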

For example, in a project I’m currently tuning, I found that the three largest tables in a database (which had ten times more records than all of the remaining tables combined) were constantly queried by a single field, the equivalent of a DivisionID integer field.  All of the queries hitting those tables included the DivisionID, and the application frequently did huge update statements that affected all records with a single DivisionID number.  Partitioning those three tables by DivisionID and putting each DivisionID on its own set of disks would result in a staggering performance improvement and a tremendous increase in concurrent nightly ETL processing, since more divisions could run simultaneously.

I made other performance recommendations as well, but frankly, if the developers implemented every other recommendation except the partitioning, they would still have been struggling with their nightly windows, and the implementation time would have put the project way behind on their deadlines.  On the other hand, if they just implemented the partitioning, they would sail through their nightly windows and make their project delivery deadline.  That’s the definition of an “order of magnitude improvement.”

RememberTheMilk.com + Jott.com = GTD on the go

I’ve been a longtime user of these two services, and now they’ve got complete integration.

www.Jott.com gives me a dial-in phone number.  When I call that number from my cell phone, I can leave myself messages. Jott transcribes those messages and emails them to me as text.  The email includes a link to the original audio recording in case they goofed up the transcription, which they rarely do, even though I have to yell messages into my iPhone over wind noise in the Jeep.

www.RememberTheMilk.com manages my tasks by giving me a simple, fast web-based to-do list accessible from anywhere.  They’ve even got an iPhone-friendly user interface.

Here’s where the Jott/RTM integration comes in: now, I can call Jott and send messages directly to my RememberTheMilk inbox.  Hooah!  If I get any more productive, I’m going to be two people.

Sunday Buffet at The Lady & Sons

As part of our road trip this week to Oklahoma City, we stopped at Paula Deen’s The Lady and Sons in Savannah. For those of you unfamiliar with The Food Network, Paula Deen and her two sons are food celebrities, great people with a great story.

The restaurant doesn’t take advance reservations: instead, hopeful diners start lining up in front of the restaurant, waiting for the hostess to arrive and begin taking names for the day’s seatings. On Sundays, the hostess arrives at 9:30 AM, and the buffet opens at 11 AM. We took our place in line around 8 AM, and we were sixth in line. By 9 AM, the line stretched down the block, and by 9:30, it was down to the next block. We gave the hostess our name, and left to do some window shopping and photography.

At 11 AM, an unbelievably loud woman came out with a clipboard and yelled instructions to the crowd of maybe a hundred people. No bullhorn, no drama, just huge pipes. She explained that the restaurant had seating on the first and third floors (with steam tables on both floors), but that the elevators only carried 15 people at a time, so we should be patient while she called out a few names at a time. That process might sound unfriendly, but the environment was so jovial and amusing that everybody had a great time.

Erika and I took our seats at a third floor table, placed our drink orders, and headed for the buffet. The steam tables were much smaller than I’d expected, with maybe a dozen choices in all, but the staff kept all of the foods replenished quickly. I’ll cover the items one at a time.

Macaroni and cheese – this was, hands down, the very best macaroni and cheese I’ve ever put in my mouth. In fact, this shouldn’t even be called macaroni and cheese. There should be a different culinary term for this masterpiece, because it’s in a league of its own. I think they thicken it with eggs, because it has a bit of a loose-egg feel to it like the eggs in Pad Thai. When I went back for my second plate at the buffet, there was only one thing on it. That’s right – macaroni and cheese. I have resolved to track down this recipe and reproduce it, and then eat it every day for the rest of my life. Okay, maybe not.

Fried chicken – I’ve read reviews of The Lady and Sons fried chicken before, and they were right – it’s good. It’s not the life-changing experience of the macaroni, but it’s good. I will say that it’s the best fried chicken I’ve had off a steam table.

Mashed potatoes – Erika said it best when she said, “I’ve never tasted butter before in cooking, but I taste the butter in this.” Creamy texture, perfect spices, great stuff. I wasn’t as impressed with the gravy.

Roast beef – mmmm, juicy.

Everything was ever-so-slightly salty. If I didn’t tell you, you wouldn’t recognize it, and I probably only caught it because I’d read other reviews prior to our arrival. They could back off the salt just a tiny, teeny, wee bit, but it didn’t detract from the food. I don’t think Erika caught it.

Biscuit & hoe cake – the hoe cake is basically a pancake, but denser and with a more mealy texture. Good, but I gotta be honest – these take up space in a stomach, and that precious space should be saved for macaroni and cheese.

I didn’t try the greens, the grilled chicken, salads, or desserts. I wanted to, but I couldn’t do it in good faith. I’m still training for the Disney marathon in January, and it’s hard to gorge myself when I’ve got ten mile runs on the weekends.

Some of the reviews I’ve read said that Paula’s buffet is just a buffet, just like any other Southern buffet. I beg to differ, and I know how to illustrate it. Erika and I stopped several times at Cracker Barrels during the course of our road trip, and we went there for dinner the same day that we visited The Lady & Sons. Just to check, I ordered some of the same foods we’d had at Paula’s, and wow, what a difference. Paula’s food is famous for a reason – she makes ordinary food amazing.

I resisted the urge to pick up a t-shirt from the Paula Deen store, but its tagline deserves repeating here: “I’m Your Cook, Not Your Doctor.”

Idera SQLsafe Followup: Problems with Restoring Databases

It’s now been a month since I initially ran into a few nasty bugs in Idera SQLsafe v4.5, and it’s time to post a followup.

Within a couple days of my original blog entry, “Idera SQLsafe v4.5 Review: Too Many Showstopper Bugs”, I was contacted by staff at Idera. They agreed about the issues, and pledged to correct them in the next build of SQLsafe. They’re great people – I’ve contacted them for support before – and I enjoyed interacting with them.

There was another support issue that I didn’t blog about, because I didn’t want to spread panic (and still don’t). However, it’s still not fixed, and if I were a DBA shopping for backup products, I’d want to hear about it, so here goes.

SQLsafe Problems Restoring Large Databases

I have a two terabyte SAP BW data warehouse on SQL Server that I backed up with SQLsafe v3.1 for a year. I regularly restored the production database over to our QA server several times a year, and the restores always went flawlessly. SQLsafe paid for itself countless times over because it cut my backup size from 2TB to under 400GB, giving me a smaller backup window, lower tape costs, and the ability to back up straight to a network share. I never had a problem with it.

When we deployed SQLsafe v4.5, I didn’t test the 2tb restores right away, because our developers were working on the QA server and I couldn’t overwrite their work.

On October 1st, I tried a restore, and … it failed. I tried several troubleshooting steps, beating my head against the wall thinking it was a problem on my end (because it usually is).

Problems with Idera Support

On October 5th, I gave up and opened a support ticket with Idera. I got one bad apple on their support desk, and his answer was for me to apply a SQL 2000 hotfix. After I pointed out that my support ticket was for a SQL 2005 instance, he asked me to change memory settings on SQL, and ended with this quote (bold emphasis is mine, typos are theirs):

“I would not expect fragmentation to matter too much in 2005, but you van still get a large amount of the memory stuck in allocation. Increasing the memory size should also help resolve this. To do either you will need to restart SQL server I am afraid. I wish I had better news, but this is a SQL server side issue so other then dropping MemToLeave(as you have already done) there is not mush of anything more that can be done on the SQL safe side.”

I was furious: I’d never had this issue before with SQLsafe v3.1, so in my mind, it couldn’t be a SQL problem. The only thing that had changed was Idera SQLsafe.

Tip for IT professionals: if you don’t get the support quality you want, escalate the issue. I knew Idera had solid, professional support, so I pressed and escalated the issue at Idera. Sure enough, I got the answers I needed. Later that day, they admitted that they’d reproduced the issue internally and they were working to fix it. They hoped to get me a repaired build that afternoon, and then that afternoon, they said it would be October 8th before I could get a working build. I said I was willing to take an alpha version to test the restores – even if it crashed my QA server, that would be OK, since this was the only database on the server anyway, and the server was down for the count, waiting on the restore.

On October 8th, they still didn’t have a working version ready.

Running Out Of Time, Switched to LiteSpeed

I was running out of options. I didn’t have 2tb of free drive space on the SAN to do a native SQL backup. I didn’t have any other backup products that could handle it, and I couldn’t just stop the production data warehouse so that I could back up the ldf/mdf/ndf files to tape. The developers were getting frustrated because their server had been down nearly a week, and I didn’t have any good options that didn’t involve spending money. My managers were unhappy (to say the least) at the prospect of buying another backup product when we already owned one, and it was looking like I’d have to downgrade my production servers from v4.5 to v3.1 – live – without a reboot. That made me pretty nervous.

Then, I got lucky: Quest stepped in and offered me free licensing for their Litespeed product. I tried it, and it worked the first time. I dodged a bullet, although my reputation at Southern got wounded. For a while, I was the DBA who couldn’t restore from a backup.

I have to be honest: I wanted (and still want) the Idera product to work. I genuinely like SQLsafe, because the user interface is fantastic. My developers rave about how quick and easy it is to refresh development and QA servers from production. And frankly, we paid for it, and it was installed & working on a lot of servers already – why switch from one product to another if I could avoid it? Therefore, I kept in touch with the Idera folks and said I wanted to test the SQLsafe fixes when they were implemented. As soon as the product could restore my 2tb data warehouse again, I wanted to switch back to it. Nothing against the folks at Quest – Litespeed is great too – but I don’t change horses very often.

Idera Found the Bug in SQLsafe

On October 17th, Idera let me know that they’d found the bug in a different place than they’d originally expected to find it. I completely and utterly applaud their honesty, and it echoes my usual support experiences with Idera. They’re honest, they’re professional, and they know their stuff. I had the one bad set of interactions on October 5th, but I can forgive that from tech companies because it’s so hard to get great support people. They found a possible workaround (disabling status reporting in SQLsafe) but I didn’t have another 2tb environment that I could use for testing.

So I was running out of time: Idera said, “This will NOT be fixed in our 4.6 release simply because the amount of testing/validation is quite large and we’re still working on the fixes. It will be in 4.7 which will come out as soon as possible.”

Ouch – I couldn’t wait for another regularly scheduled release just to be able to do database restores on my data warehouses. To make matters worse, I had a vacation scheduled at the start of November, and I wanted to have one consistent backup product in place. I had to keep the whole backup/restore thing as easy as possible for the rest of my staff, because I’m the only dedicated production DBA in the shop. While I’m gone, I need the development DBAs and our Windows staff to be able to do restores on demand, and I didn’t want to have to train them on two different products.

So, I switched to Quest Litespeed. I’m not happy about switching only because I’m not happy about any change that I can avoid. I just couldn’t avoid this change. It’s not that I’m disappointed with Litespeed (after all, it saved my bacon and it works great) – I just hate giving up on a product.

On November 6th, when Idera found out that I’d switched, they offered me a private beta version of the 4.6.1 code and promised that it fixed the restore issues. Unfortunately, I was already out on vacation, and it was too late for me to switch back. In their minds, they’ve bent over backwards to accommodate a customer, and they’re right. However, in my mind as the customer, the product didn’t work, and I didn’t get a fixed version until over a month later. I could theoretically put this beta code on my production servers, and it might work fine, but at this point, I can’t afford to gamble any more. When I get back from vacation, I have two days left with an eval LeftHand Networks SAN, and I’ll put those two days into testing the Idera 4.6.1 bugfix. If the restores work, I’ll definitely note it here, because if I was a DBA considering the purchase of SQLsafe, I’d want to know that it works. I hate reading internet posts that leave an open-ended question, leaving me wondering if the product ever worked or not, and I’m not the kind of guy to leave that post hanging without an answer. (Update: we ran into problems with the LeftHand SAN and were not able to test the Idera bugfixes.)

In the coming weeks, I’ll post a series of blog articles comparing the two products since I’ve now got a lot of experience between the two. They’re both great products, they both have their shortcomings, and they’re both worth the money. Shops without a backup compression product don’t know what they’re missing, and it’s not just backup compression; both Idera SQLsafe and Quest Litespeed offer tons of advantages that save me time and money. I’ll demonstrate how DBAs can write an ROI proposal to show how the products really do pay for themselves, all sales BS aside.

Update on February 10, 2007 – I get an email at least once a week asking whether a new version of SQLsafe fixed the bugs. We stopped using SQLsafe, so I can’t say, but you can get an evaluation version from Idera and test it in your own environment.
