Storage virtualization is a slick SAN technology that does for storage what VMware did for servers: it abstracts away the underlying hardware to make management easier. Multiple SANs can be swapped behind the scenes without affecting any of the servers that store data on them.
It’s nowhere near common yet – it’s somewhat like VMware was several years ago, not yet widespread in the datacenter – but it’s gaining traction, and it’s something DBAs need to be aware of. As a DBA, you need to know the risks and rewards so that if your SAN team wants to evaluate storage virtualization, you can voice an informed opinion.
But why do any research? Take it from me – research is hard, sweaty and painful. Why not just repeat MY opinion and call it your own? After all, my opinion is cool and it’s free, and you can read it online right now courtesy of Search SQL Server:
Detective By Accident – that’s what DBA really stands for, not Database Administrator. Come on, if it was really Database Administrators, we’d be DA’s. Granted, most of us don’t look like Warrick Brown or Sara Sidle, but when something disappears in the datacenter, people pick up the phone and call us.
On July 24th, Jason Hall and I are doing a webcast called CSI:DBA – Going Back in Time to Understand What Happened to Your Data. We’re going to cover some of the SQL Server techniques you can use to:
- Track who made schema changes
- Track who deleted & truncated tables
- Get the data back without time-consuming full restores
I’ll be covering how to do it with the native SQL Server tools, and Jason will demo how Quest tools can help. I’ve written up some of the native SQL parts of the presentation on our community site, SQLServerPedia. You can register for the webcast, or check out my notes on auditing SQL Server schema changes with DDL triggers and the Performance Dashboard.
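To give you a taste of the DDL trigger approach before the webcast, here’s a minimal sketch. The table and trigger names are my own invention for illustration – not from the presentation – but the `EVENTDATA()` function and the `DDL_DATABASE_LEVEL_EVENTS` event group are real SQL Server features:

```sql
-- Minimal sketch: log every DDL change in this database to an audit table.
-- Table and trigger names are examples only.
CREATE TABLE dbo.DDLAudit (
    EventTime   DATETIME      NOT NULL DEFAULT GETDATE(),
    LoginName   SYSNAME       NOT NULL,
    EventType   NVARCHAR(100) NULL,
    ObjectName  NVARCHAR(256) NULL,
    CommandText NVARCHAR(MAX) NULL
);
GO
CREATE TRIGGER AuditDDLChanges
ON DATABASE
FOR DDL_DATABASE_LEVEL_EVENTS   -- fires on CREATE, ALTER, DROP, etc.
AS
BEGIN
    -- Event details arrive as XML; shred out the interesting bits.
    DECLARE @e XML;
    SET @e = EVENTDATA();
    INSERT INTO dbo.DDLAudit (LoginName, EventType, ObjectName, CommandText)
    VALUES (
        @e.value('(/EVENT_INSTANCE/LoginName)[1]',   'SYSNAME'),
        @e.value('(/EVENT_INSTANCE/EventType)[1]',   'NVARCHAR(100)'),
        @e.value('(/EVENT_INSTANCE/ObjectName)[1]',  'NVARCHAR(256)'),
        @e.value('(/EVENT_INSTANCE/TSQLCommand)[1]', 'NVARCHAR(MAX)')
    );
END;
```

After that, every schema change in the database leaves a row behind with who did it, when, and the exact command they ran – no third-party tools required.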
I’m a DBA, not a programmer, but I subscribe to Jeff Atwood’s excellent Coding Horror blog because it’s well-written, funny and teaches good lessons. In his latest post, he talks about database normalization – when to do it and when to avoid it.
Pay particular attention to the links to HighScalability.com – they have great stories about scalability problems and lessons-learned from really big sites like YouTube and Twitter. Read those stories and know ‘em well, because that’s the easiest way to learn some really expensive lessons.
On July 30th, we’re doing a live video webcast about best practices on SQL Server virtualization and consolidation. The talking heads will be:
- Kevin Kline – he’s been a PASS President, he’s a Microsoft MVP, and he’s written more SQL Server books than I’ve read, he blogs at SQLblog.com, he writes for SQL Mag, he’s all over the place. I think he’s in the new Batman movie, but he’s no joker.
- Ron Talmage of Solid Quality Mentors – I had the privilege of meeting Ron at our recent Quest Customer Advisory Board. Ron talks to SQL shops with very large and very scalable database implementations, and we had a great time discussing SQL Server partitioning. He’s got a Microsoft whitepaper coming out on that topic, and I can’t wait to read it. (And no, I don’t normally get excited about whitepapers, especially before they come out.)
- Brent Ozar – this guy is just an absolute genius. I mean, those other two presenters, sure, they’ve got some books, but c’mon, we’re talking star material here. He’s not only good-looking, but I think he just might be the next Bill Gates. (Please feel free to copy/paste this into emails to your friends.)
All kidding aside (well, most kidding aside) I really like this topic. I’ve been telling DBAs that they either need to consolidate, or their Wintel sysadmins are going to consolidate the SQL Servers by virtualizing them. Sure enough, it just happened recently at one of my clients.
The company was struggling with SQL Server issues on their cluster. The SQL Server services would stop suddenly, without warning, and without any useful information in logs. Looking at the symptoms, I recommended they bring in Microsoft Premier Support immediately, so they called in the big guns. After weeks of troubleshooting, we had a Mexican standoff between the SAN vendor, the server vendor and Microsoft, all blaming each other.
The client solved the problem with virtualization: they dropped the cluster down to a single node, virtualized it, and removed a whole lot of complexity. The SAN drivers were no longer an issue because VMware abstracts the SAN away, handling all SAN failover and pathing issues. To the virtual server, the storage is just a simple locally attached drive, no matter what SAN it lives on. Presto, the server’s been up and running for a week without problems – something they couldn’t say before.
They lost the high availability of the cluster, but in this case, it made perfect sense because the cluster wasn’t highly available anyway! They still have some HA capabilities with VMotion and multiple hosts accessing the SAN – not as good as clustering, but certainly better than what they were experiencing.
More and more, virtualization is becoming just another tool in the sysadmin’s toolbelt, and database administrators need to know the risks, how to identify which servers should (or shouldn’t) be virtualized, and which ones should be consolidated. We’ll talk through this stuff in our webcast, and if there are any questions you’d like to see us touch on, let me know.
You can register for the webcast here: Don’t Get Caught at the Crossroads: SQL Server Consolidation & Virtualization.
Jason Massie tagged me, so it’s my turn to answer the questionnaire….
How old were you when you first started programming?
Dad and Mom upgraded us from an Atari 2600 to a Commodore 64 when those came out, so I must have been around 9-10. I don’t remember much from those early attempts at programming, but I remember being really frustrated that there was so much typing involved to copy the stuff from magazines into my own computer.
See, I have this mental problem where I don’t want to do something unless I can be really successful at it in a short period of time. That problem is often defined as laziness, but I’m not lazy – I don’t mind working really hard, but I don’t want to STRUGGLE really hard. I will work 20-hour days, but I wanna know that I’m actually achieving great things, not trying to accomplish something basic.
Typing long, boring lines into a computer, especially lines that I didn’t think up, definitely fell into the category of time wasters. It’s one thing to pour your ideas out into a keyboard, but it’s another thing to transcribe somebody else’s ideas, character for character, and then try to hunt down your typos without a debugger.
(That same mental problem is what kept me out of sports! Practice for weeks just to be able to shoot a free throw? You’re out of your mind….)
How did you get started in programming?
Lemme tell you how I didn’t get started: I vividly remember Mom forcing me to do piano lessons, and Dad forcing me to do soccer. Neither of those hobbies stuck in any way, shape or form. I don’t remember how I got started in programming, but I remember poking and peeking around on my own, so I bet I ran across it in a magazine. I was a voracious reader.
What was your first language?
BASIC. (Makes me chuckle because future generations will respond to that question so differently.) I think my second language, if you can call it that, would have been DOS batch files, though. I really, really enjoyed MS-DOS.
What was the first real program you wrote?
To me, it’s not a real program until the first user signs on. That’s when you find the real bugs, the real shortcomings. My first real program was a help desk front end written in classic ASP.
Our company was growing by leaps and bounds, and McAfee wanted absurd amounts of money for more user licenses for their help desk software. That software sucked – I mean, reeeeeally sucked – and I said to myself, I could write something better using the same SQL Server back end, save the company a lot of money, and the users would love me because it’d be so much easier to use. Plus, if I had any ugly bugs, they could use the old Windows console version while I sorted my bugs out. The company still uses that system today, and it’s logged over 100,000 help desk tickets. Hooah!
That’s still my favorite rush in programming: walking past somebody’s computer and seeing my stuff on their screen as they interact with it. I really love knowing that somebody could choose to use any piece of software out there, and they’re choosing to use mine. That rocks. It’s like being the cool kid in school. (Only without the chicks.)
What languages have you used since you started programming?
Trying to think of these in the order of my career:
- BASIC on a Commodore 64
- DOS (yes, I consider batch files a language, especially when they’re hundreds of lines long)
- Classic ASP
- Topspeed Clarion
- Java (the language that made me decide to never learn another language again)
What was your first professional programming gig?
Telman (subsequently bought by UniFocus) hired me on the basis of personal relationships and my hospitality industry knowledge, and then sent me off to Clarion training. I really liked Clarion, but I haven’t touched it in years. Makes me want to go play with it now, come to think of it. Clarion was a database-independent language: in theory, you could change your database back end with a couple of mouse clicks, recompile your program, and hook it up to a different database.
The problem is that when you’re using a particular database, you want to take advantage of the database-specific features that give you better performance or more capabilities. If your code is generic enough, though, or if you’re willing to invest the time to debug it once, it does work, and I did manage to switch a few apps from Clarion’s proprietary flat files to Microsoft Access to SQL Server 7.0.
Telman was also my last professional programming gig. Clarion was a dying language, so the company had to switch to a newer, better-supported language – either Java or .NET. I saw the writing on the wall: .NET would have short-term staying power because it has the Microsoft marketing muscle behind it, but something else would come along in 5-10 years and knock it over. I could spend a few years becoming really proficient in Java or .NET – but then have to relearn a new language within a decade. Why bother? The ANSI SQL language lasts forever, even across different vendors.
If you knew then what you know now, would you have started programming?
Yeah, because I think it makes me a better database administrator. No way in hell would I go back to programming, though – I hate finding bugs in my own code. Stored procs are easy enough to unit test and be pretty certain that they’re correct, whereas code that faces end users – that’s another problem entirely. Users are crazy. They click everywhere, they do things that don’t make sense, and they expect everything to work flawlessly. That is seriously hard work. I really respect good programmers.
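Here’s what I mean about stored procs being easy to unit test – a toy example with a completely made-up proc name, just to show the pattern: call it with a known input, compare against the expected output, done. No users clicking in weird places.

```sql
-- Toy test harness (dbo.GetOrderCount is a hypothetical proc, not a real one):
-- feed it a known input and check the output against what we expect.
DECLARE @result INT;
EXEC dbo.GetOrderCount @CustomerID = 42, @Count = @result OUTPUT;

IF @result = 7
    PRINT 'PASS';
ELSE
    PRINT 'FAIL: expected 7, got ' + CAST(@result AS VARCHAR(10));
```

Try doing that for “user mashes the Back button mid-transaction.”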
If there is one thing you learned along the way that you would tell new developers, what would it be?
Languages come and go like fashion trends. Don’t get stuck in a pair of baby blue bell bottom pants: choose a language based on how your resume will look 5 or 10 years from now, not based on what the cool kids are doing this week.
Really, really good programmers can pick up a new language in a heartbeat, but you may not be so lucky. Spend your time focusing on getting really good at one language, and don’t listen to the siren song of whatever new language is sexy today. Take database administration: learn to code ANSI SQL today, and you’ll still be using that same syntax in 20 years. Learn the trendy new LINQ, and you may be relearning something else in a few years. (I’m not sayin’, I’m just sayin’.)
What’s the most fun you’ve ever had … programming?
Embedding sounds in other people’s help desk tickets with hidden HTML code in the ticket notes. I embedded the A-Team theme song in a note on somebody’s ticket, so whenever he pulled up his list of tickets, the A-Team song started playing, and he didn’t know why. When I finally let out the secret, hoo boy, that triggered a flood of embedded sounds and graphics.
Who are you calling out?
Bert Scalzo because he talked me into this sweet job
Brian Knight because reading his stuff is almost as good as listening to his seminars
Conor Cunningham before he goes back to work for Microsoft
Jeremey Barrett because I bet he’s better at programming than he lets on
Linchi Shea because he’s a SAN genius
Rhonda Tipton because I’m seconding Jason’s call-out
Before me, the tag order was something like this: Jason Massie > Denis Gobo > Andy Leonard > Frank La Vigne > Pete Brown > Chad Campbell > Dan Rigsby > Michael Eaton > Sarah Dutkiewicz > Jeff Blankenburg > Josh Holmes > Larry Clarkin
I blogged about why database administrators should use Twitter, and I personally loved that tool. It helped people build faster personal connections with each other, and it helped you get a picture of what the brilliant people were thinking at any given time. Take Brian Knight, a SQL Server guru, for example – I love being able to see what he’s working on now, because he does things that shape how the DBA world works.
Unfortunately, Twitter started running into some serious infrastructure and architecture problems. They built the whole thing on top of a couple of MySQL servers with no real Plan B – no way to scale out, no automated cluster failover, no partitioning, and basically no way to make the database fast enough to withstand the user load.
All of this was frustrating, especially since early Twitter adopters tended to be people who worked in & around information technology. We do this stuff for a living, and it frustrates us when we depend on a tool that is constantly borked. I got a few laughs out of Twitter’s scaling issues – not because I thought they were easy to solve, but because I’ve worked with coders who took that same approach. How many times have we heard, “But it works fine on my desktop with ten users… it’ll be fine when we move it to a real server and we have a hundred thousand users.”
- Analysis of Twitter’s infrastructure problems
- Current Twitter Status (color coded by whether features are up or down. Example shown at right.)
- Twitter scores its first round of VC funding in 2007
- Twitter scores a second $15m round in 2008
That’s right, fifteen million bucks. A company with a newly minted $15m in the bank should be able to build in some scalability pretty quickly, right? Buy some faster servers, build in some redundancy, off we go, right?
Not so fast.
Architecture problems take time – especially downtime – to fix, and the one thing Twitter didn’t have was time. More users piled onto the service every day, more applications built hooks into Twitter’s open API, and the press started asking questions about this cool new service.
Smelling blood, a lot of competitors have sprung up around Twitter. Jaiku, Pownce, Friendfeed and many more sites launched to do similar things.
On July 2nd, a new competitor called Identi.ca launched and gathered a lot of press right away. I’m not going to go into most of the technical differentiators here, but the bottom line is that Identi.ca picked up over 10,000 new users in its first 3 days.
Let me repeat: Identi.ca picked up over 10,000 new users in its first 3 days.
That’s a big rush, and it requires a lot of hardware. Hardware is expensive. Datacenter space is expensive. These things require a lot of VC funding up front, plus planning time to buy, install and provision. Nobody wants to pay for users they don’t have yet, especially in this tight economy, so how was Identi.ca able to scale up that fast?
Identi.ca is hosted on Amazon EC2 cloud servers. Anybody can rent EC2 servers by the hour with no capital required, no down payment. Simply pick what type of servers you want to spin up, and spin up as many as you want. Pricing ranges from $0.10 per hour for a 1-CPU, 1.7GB RAM server to $0.80 per hour for an 8-core instance with 7-15GB RAM. As load grows, spin up more servers to handle the load. If the load doesn’t come, or if it comes and goes, just spin down the extra servers. They’re not sitting around on your balance sheet draining your company dry.
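To put those hourly rates in perspective, here’s the back-of-the-napkin math – my own arithmetic using the per-hour prices above, figuring a month at roughly 720 hours:

```sql
-- Rough monthly EC2 costs at the hourly rates above (my math, not Amazon's).
-- A month is about 24 hours * 30 days = 720 hours.
SELECT 0.10 * 24 * 30      AS small_instance_monthly_usd,  -- about $72/month for the 1-CPU box
       0.80 * 24 * 30      AS xl_instance_monthly_usd,     -- about $576/month for the 8-core box
       0.80 * 24 * 30 * 10 AS ten_xl_monthly_usd;          -- about $5,760/month for ten of them
```

Ten 8-core servers for under six grand a month, with no datacenter build-out and no purchase orders – that’s a number a garage startup can actually swing.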
Cloud computing lets anybody with a good idea, good architecture and good coders scale without investing in hardware ahead of time. Suddenly, those three guys working in a garage (or a coworking space) are much more dangerous to your established – and funded – internet business.
If I worked for Twitter, I would be terrified that a startup company with no VC funding would be able to steal 10,000 of my users from my precious web application in 3 days. I would be standing in front of my web servers with my arms outstretched, screaming, “No, don’t take my children!”
I don’t work for Twitter, though, so you can catch me on Identi.ca.