Is your SQL Server slowing you down? Are bad queries giving your application a case of the blues? Before you go looking for help with SQL Server, make sure that you’ve done everything you can to solve the problems on your own. Many of SQL Server’s ills can be solved with some easy preventative maintenance, patching, and TLC.
5. Upgrade TO THE LATEST SQL SERVER VERSION
If you aren’t on a recent version of SQL Server, what are you waiting for? SQL Server 2000 and 2005 are no longer under active development – the query engines are intolerably old and diagnostic support has improved by leaps and bounds in newer versions of SQL Server. Once you have the licensing figured out, use our SQL Server setup and post installation checklists to make sure you configure SQL Server the right way the first time through.
Why is a new version of SQL Server going to make things faster? New versions of SQL Server get new versions of the SQL Server query optimizer. While Microsoft sometimes sneaks big performance fixes into a service pack, the biggest improvements come in major version releases. New versions of SQL server also contain fixes for bugs, take advantage of new CPU instruction sets, and are filled with the latest and greatest in software development techniques. In short: you wouldn’t expect a 10 year old sports car to be as fast as a brand new sports car, why would you expect the same thing from SQL Server?
Some vendor applications can’t be upgraded to SQL Server 2008/R2/2012. Odds are these applications are also sitting on older physical hardware. When you’re stuck running old versions of SQL Server on old hardware, it’s a no brainer to virtualize SQL Server and assign enough resources to keep things running quickly.
Even if you can only upgrade from 32-bit to 64-bit SQL Server, take the time to make the change. It’s well worth it to remove the limits of a 32-bit memory space.
4. Upgrade Your Memory
SQL Server 2008 R2 Standard Edition maxes out at 64GB of memory. If you don’t have 64GB of memory in your SQL Server, what are you waiting for? Head on over to your vendor of choice and expense account your way to better performance. Since SQL Server uses memory to cache data and avoid additional trips to disk, you might as well drop a bit of coin and max out SQL Server’s memory. You’ll get the added benefit of additional memory to cache query plans, perform bigger joins and sorts in memory, and may even see a reduction in CPU and disk utilization to boot.
Make sure that you increase SQL Server’s max memory setting after you add more memory, too, or else SQL server won’t take advantage of that new RAM that you just added.
3. Check Task Manager
You’ve upgrade to the latest and greatest SQL Server and you have 64GB of RAM in the server. Things are still slow, what now?
Open up task manager and sort first by CPU and then by memory. If there is anything running that you don’t know about, kill it. If anti-virus is installed, configure exceptions. If other users are RDPing into the server to develop SSIS packages, revoke their access. Get rid of every extra piece of software that is slowing the server down or eating up memory.
If you’re on Windows Server 2008 or newer, you should also make sure that the Windows file system cache isn’t eating up all of your memory. Windows will cache data for file system access when you drag and drop a file, copy it with xcopy, or push a backup across the network. This behavior is by design because it’s supposed to add to the feeling of faster performance for the end user. Unfortunately, it also steals RAM from server-side processes like SQL Server. You can check this by opening up Task Manager, switching to the Performance tab, and checking the Cached number under “Physical Memory (MB)”.
2. Look in the Event Log
Check both the SQL Server error log and the Windows Event Log. Both locations contain a potential motherlode of information that most teams ignore. If SQL Server or Windows is running into any kind of problem, they’re going to be complaining about it. In both locations you’ll be able to see if SQL Server is dumping core, waiting on disks for a long period of time, or encountering any hardware related issues. You’ll also be able to see what other services might be running into problems on the server – just because anti-virus isn’t running right now doesn’t mean it hasn’t been running in the past. The Event Log is another place to look to get an idea about what’s running on your server, what kind of problems it’s having, and what you can do about it.
Filter down the Event Log to just warnings and errors and you may find out if you have hardware that’s on the verge of failing. Most hardware and drivers should report problems up to Windows and the Event Log makes it easy to track down these issues, diagnose the problem, and potentially find a solution. Eliminating hardware problems can be an easy way to eliminate causes of poor performance – you never know what kind of bugs are lurking in out of date drivers and firmware.
While you’re thinking about SQL Server and the Event Log, take a minute to make sure that you have some SQL Server alerts configured. These alerts will make sure that you’re being kept up to date on all of the goings on inside your SQL Server. As the person responsible, you should know if your storage is failing or if SQL Server is encountering serious errors.
1. Run sp_Blitz™
The last thing you should do before going to get help is run sp_Blitz™. Adapted from our consulting, sp_Blitz™ is a fast way to diagnose any configuration problems with your server. The procedure looks at configuration, database design, statistics, and index status to give you a set of general recommendations for improving server help. Don’t freak out after you run sp_Blitz™; work through the output, check out the reference links, and make the changes that you need to get your server in working order. Download it now.
When All Else Fails, Ask for Help
If you’ve run through these 5 steps and you still can’t figure out what’s wrong with SQL Server, here’s your options.
For free, to-the-point questions that aren’t urgent, narrow down the question as tightly as possible, then post it on http://StackOverflow.com or DBA.StackExchange.com (StackOverflow for DBAs). Include as much example code as you can so that others can quickly run it and get an idea of what you’re asking, and they might even post your code back with fixes.
To learn to solve the problems yourself, check out our First Aid section where we’ve got webcast recordings, articles, and scripts to help.
Check out our SQL Server performance tuning class. It’s 3 days in person with us, MVPs and MCMs, and we teach performance tuning for developers and database administrators. If you can’t wait for that, check out our 6-hour video class, Make SQL Server Apps Go Faster.
If you’d like personalized, hands-on help from Microsoft MVPs and Microsoft Certified Masters, we’ve got SQL Critical Care™. In just 3 days, we can get to the bottom of your toughest SQL Server pains and help you get relief. You get all the scripts we use, and it’s like a conference training session on your own server. Contact us to learn more.
What are the costs when you ask SQL Server to recompile a statement or a stored procedure each time it runs? In this 30 minute talk Kendra Little explains the impact on query execution– and on your SQL Server instance– of forcing recompilation. This talk is appropriate for DBAs and Developers who have a working knowledge of execution plans in SQL Server.
Want to use the scripts from the video? Scroll on down to the bottom of the page.
Learn more about Recompilation in SQL Server
- Download the demo database used today: Contoso Retail DW
- The DMV Script used in the demo was adapted from the SQL Server Books Online Entry for sys.dm_exec_sql_text
Scripts used today:
USE ContosoRetailDW; GO SET NOCOUNT ON; GO -- Demo 1: -- Recompile Hogs Shared Resources. -- Run this batch. How long does it take? USE ContosoRetailDW; SET STATISTICS TIME, IO OFF; SET NOCOUNT ON; GO DECLARE @p NUMERIC(10,2); SELECT @p=SUM(TotalCost) FROM dbo.FactOnlineSales os JOIN dbo.dimProduct p ON os.ProductKey = p.ProductKey WHERE StoreKey = 1 GROUP BY ProductName OPTION (RECOMPILE); GO 5000 -- Now run this batch. -- The only thing different is the lack of RECOMPILE hint. USE ContosoRetailDW; SET STATISTICS TIME, IO OFF; SET NOCOUNT ON; GO DECLARE @p NUMERIC(10,2); SELECT @p=SUM(TotalCost) FROM dbo.FactOnlineSales os JOIN dbo.dimProduct p ON os.ProductKey = p.ProductKey WHERE StoreKey = 1 GROUP BY ProductName; GO 5000 -- DEMO 2: -- Recompile Doesn't Tell You It Used Your Stuff USE ContosoRetailDW; GO DBCC FREEPROCCACHE; GO -- First, let's create a simple stored procedure. No hints here! IF OBJECT_ID('dbo.rptStoreSales') IS NULL EXEC sp_executesql N'CREATE PROC dbo.rptStoreSales AS RETURN 0;'; GO ALTER PROC dbo.rptStoreSales @StoreKey INTEGER AS SELECT ProductName, SUM(TotalCost) AS TotalSales FROM dbo.FactOnlineSales os JOIN dbo.dimProduct p ON os.ProductKey=p.ProductKey WHERE StoreKey=@StoreKey GROUP BY ProductName GO -- Give it a run. EXEC dbo.rptStoreSales @StoreKey=199; -- Now check out the details. SELECT ( SELECT TOP 1 SUBSTRING(st.text, statement_start_offset / 2 + 1, ( ( CASE WHEN statement_end_offset = -1 THEN ( LEN(CONVERT(NVARCHAR(MAX), st.text))* 2 ) ELSE statement_end_offset END ) - statement_start_offset ) / 2 + 1) )AS sql_statement, st.1 AS [sql_batch], ps.execution_count, ps.creation_time, ps.last_execution_time, ps.total_elapsed_time, ps.total_worker_time, ps.total_logical_reads, ps.total_physical_reads, ps.total_logical_writes, ps.plan_generation_num, qp.query_plan FROM sys.dm_exec_query_stats ps CROSS APPLY sys.dm_exec_sql_text(ps.sql_handle) AS st OUTER APPLY sys.dm_exec_query_plan(ps.plan_handle) AS qp WHERE (st.text LIKE '%SELECT ProductName, SUM(TotalCost) AS TotalSales%' OR st.text LIKE '%dbo.rptStoreSales%') AND st.text NOT LIKE '%sys.dm_exec_query_stats%'; GO -- Now, alter the procedure to use RECOMPILE ALTER PROC dbo.rptStoreSales @StoreKey INTEGER WITH RECOMPILE AS SELECT ProductName, SUM(TotalCost) AS TotalSales FROM dbo.FactOnlineSales os JOIN dbo.dimProduct p ON os.ProductKey=p.ProductKey WHERE StoreKey=@StoreKey GROUP BY ProductName GO -- Give it a run. EXEC dbo.rptStoreSales @StoreKey=199; -- Now check out the details. SELECT ( SELECT TOP 1 SUBSTRING(st.text, statement_start_offset / 2 + 1, ( ( CASE WHEN statement_end_offset = -1 THEN ( LEN(CONVERT(NVARCHAR(MAX), st.text))* 2 ) ELSE statement_end_offset END ) - statement_start_offset ) / 2 + 1) )AS sql_statement, st.1 AS [sql_batch], ps.execution_count, ps.creation_time, ps.last_execution_time, ps.total_elapsed_time, ps.total_worker_time, ps.total_logical_reads, ps.total_physical_reads, ps.total_logical_writes, ps.plan_generation_num, qp.query_plan FROM sys.dm_exec_query_stats ps CROSS APPLY sys.dm_exec_sql_text(ps.sql_handle) AS st OUTER APPLY sys.dm_exec_query_plan(ps.plan_handle) AS qp WHERE (st.text LIKE '%SELECT ProductName, SUM(TotalCost) AS TotalSales%' OR st.text LIKE '%dbo.rptStoreSales%') AND st.text NOT LIKE '%sys.dm_exec_query_stats%'; GO -- What if we move the RECOMPILE hint? ALTER PROC dbo.rptStoreSales @StoreKey INTEGER AS SELECT ProductName, SUM(TotalCost) AS TotalSales FROM dbo.FactOnlineSales os JOIN dbo.dimProduct p ON os.ProductKey=p.ProductKey WHERE StoreKey=@StoreKey GROUP BY ProductName OPTION (RECOMPILE); GO --Give it a run. Three times. EXEC dbo.rptStoreSales @StoreKey=199; EXEC dbo.rptStoreSales @StoreKey=199; EXEC dbo.rptStoreSales @StoreKey=199; --Now check out the details in the cache. SELECT ( SELECT TOP 1 SUBSTRING(st.text, statement_start_offset / 2 + 1, ( ( CASE WHEN statement_end_offset = -1 THEN ( LEN(CONVERT(NVARCHAR(MAX), st.text))* 2 ) ELSE statement_end_offset END ) - statement_start_offset ) / 2 + 1) )AS sql_statement, st.1 AS [sql_batch], ps.execution_count, ps.creation_time, ps.last_execution_time, ps.total_elapsed_time, ps.total_worker_time, ps.total_logical_reads, ps.total_physical_reads, ps.total_logical_writes, ps.plan_generation_num, qp.query_plan FROM sys.dm_exec_query_stats ps CROSS APPLY sys.dm_exec_sql_text(ps.sql_handle) AS st OUTER APPLY sys.dm_exec_query_plan(ps.plan_handle) AS qp WHERE (st.text LIKE '%SELECT ProductName, SUM(TotalCost) AS TotalSales%' OR st.text LIKE '%dbo.rptStoreSales%') AND st.text NOT LIKE '%sys.dm_exec_query_stats%'; GO -- Compare Execution Count and Plan Generation Number
If you’re using SQL Server 2012′s hot new AlwaysOn Availability Groups feature, your databases will go offline when your network connection does – even if you’re using asynchronous replication.
This is not a bug. This is working as designed – and it’s important to understand the underlying concepts.
AlwaysOn Availability Group functionality relies on Windows Failover Clustering technology to know when things are going well, or when the poop has hit the fan. A core concept of failover clustering is quorum – the voting mechanism that lets each individual node understand whether it’s online or isolated. Windows Server 2008 has a variety of quorum methods, but in the vast majority of configurations, each server needs to be able to see the network in order to reserve its IP address, network name, and see other nodes in the cluster.
Take the following scenario:
- SQL2012PROD1 – primary active node. All read/write connections are going here.
- SQL2012PROD2 – secondary node with asynchronous replication. I could let users query this server, or just let it run in standby – it doesn’t matter for this scenario.
If I disconnect the network connection for SQL2012PROD1 – even for a brief moment – all of the databases in my availability groups roll back all open transactions and then go offline. The informational messages in the SQL Server event log are shown at right for humor purposes.
The SQL Server itself is still up – but the databases in the availability group aren’t because they have a dependency on the availability group’s listener and IP address. Since those aren’t available without a valid network connection, the databases are taken offline.
This is a dramatic departure from database mirroring or replication on a standalone (not clustered) database server. Both of those technologies leave the primary SQL Server’s databases up and running when the network drops.
Bottom line – when deploying AlwaysOn Availability Groups, make sure you understand the risks of everything presented in the cluster validation wizard. Some of the alerts (like some storage alerts) can be skipped for shared-nothing AlwaysOn Availability Group deployments, but others (like the network redundancy alert) definitely can’t. Just a general tip: mission-critical SQL Servers should be connected to two separate network switches, as should all other mission-critical servers. That way when one network switch fails, the mission-critical servers can still all talk to each other and nothing will go down.
If you want to learn more about how to design, test, and manage this slick feature that offers high availability, disaster recovery, and scale-out reads, visit our AlwaysOn Availability Groups resources page.
Let’s jump in the time machine and go back to April 2010. I attended my first SQL Saturday, in Chicago, IL. One of the speakers was a blogger I had been reading, named Brent Ozar. He was showing off a script he’d written, sp_Blitz, to diagnose issues with SQL Servers. I loved the script and have used it at various times throughout my career.
Running sp_Blitz is like sending your SQL Server instance to the doctor. It checks your vitals – it asks if backups are being taken. It tells you who has sysadmin privileges. It checks security. It looks at what non-default options are turned on. It finds tables without clustered indexes. It warns you if tables have triggers on them. It’s a comprehensive overview of which best practices could be implemented.
Making sp_Blitz Better
Throughout the years, the SQL Server community has contributed to making sp_Blitz better (because the community is awesome like that). Suggestions have been submitted to check for mis-matched collations, looking for database owners, and even to fix spelling errors!
It’s a great give and take project. There have been over 10,000 downloads of it so far! Brent gave it to the community, they have used it, and they’ve given suggestions to make it better.
sp_Blitz Results Can Now Be Shown As A Report!
In May of 2012, I joined Brent Ozar PLF! Now, I’m able to make my own contribution to sp_Blitz. I didn’t know this when they hired me, but all those years of playing with SQL Server Reporting Services came in handy right away. I’ve created a SQL Server Management Studio custom report to display the results of sp_Blitz!
You’ve been looking at results displayed in this format:
Now, when you install the custom report, you can view your results in this format:
You can download the report here.
Instructions to install the sp_Blitz custom report:
- Open SQL Server Management Studio and connect to an instance.
- Right-click the instance name and select Reports > Custom Report. (This will create the appropriate folder – C:\Users\username\Documents\SQL Server Management Studio\Custom Reports.)
- Copy the .rdl file to the Custom Reports folder just created.
- In SSMS, right-click the instance name, select Reports > Custom Report > sp_blitz.
- If you see a Warning, “You are about to run a custom report….”, click Run.
Running a custom report by accessing it through the instance name will make the report execute in the context of the master database. If you create stored procedures in a different database, simply right-click the database name and select Reports > Custom Reports to execute it in the context of that database.
Interested in learning more? I’ve made a short video that explains how to run and use the report!
What’s slowing down production? It’s easy to find out in 60 seconds by running a simple plan cache query. See, in order to turn your beautifully written query into a set of results, SQL Server needs a plan – an execution plan. It caches these plans in memory along with statistics like the number of times a query has been run, how long it’s taking on average, and the execution plan itself. Whether you’re a junior DBA or a senior developer, you can use the three free tools covered in this video to find the culprit:
The links we discuss in the video are:
- ClearTrace – a free utility to slice and dice your Profiler trace files
- Plan cache check with sp_Blitz™ – this replaces the plan cache query shown in the video. Use @CheckProcedureCache = 1 to achieve the same results.
- sp_WhoIsActive – replacement for sp_who that actually shows you the queries and execution plans
All four of us – Brent, Jeremiah, Jes, and Kendra – got sessions approved for the PASS Summit in November! Here’s what we’ll be covering:
A Developer’s Guide to Dangerous Queries – Jeremiah Peschka
Jeremiah says: “SQL Server does a good job of working with whatever garbage we throw at it, but sometimes a helping hand is needed to smooth things out. You may have meant well when you wrote that sloppy SQL, but that predicate lurking under the surface is poisoning your performance. This talk will expose your worst habits, anti-patterns, and bad practices. By the time we’re done, your T-SQL will be on the path to make things right again. This talk will be therapeutic for anyone from a junior DBA or a senior developer.”
Diagnose T-SQL Performance Problems Fast with sp_Blitz (Spotlight) – Brent Ozar
Brent says: “Every week, people tell me our sp_Blitz helped them quickly figure out what’s dangerously wrong with a server. I wanted to take it to the next level and start making complex performance troubleshooting easy, too. I’m building the next major version of sp_Blitz to analyze what queries have run recently on the server and help you find out why they’re slow. At the PASS Summit, I’ll unveil this publicly for the first time and explain how it works. I want to make you look like a genius – again!”
Real-Life SQL 2012 Availability Group Lessons Learned – Brent Ozar
Brent says: “SQL Server 2012′s new AlwaysOn features kick ass, and I’m deploying it in production for clients like StackOverflow.com, AllRecipes.com, and Discovery Education. I still can’t believe I get paid to have this much fun, but it’s not all unicorns and bacon – there’s some gotchas too. I’ll share some of the complexities involved with combinations of failover clustering and availability groups, solutions with all local solid state storage, and how to avoid painful surprises at go-live time.”
SQL Server First Responder Kit (Spotlight) – Kendra Little
Kendra says: “Know that horrible feeling you get when your critical SQL Servers are having a heart attack? In this session I’ll outfit you with a kit to abolish that not-so-fresh feeling. The SQL Server First Responder Kit sets you up to diagnose the toughest problems with confidence. I’ll share the proven methods that I train my clients to use when triaging in a crisis. Demos will cover SQL 2005, 2008, and will include new tools for SQL 2012 that will revolutionize the way we troubleshoot performance problems in SQL Server.”
Index Psychiatry: Diagnose and Treat the Top Five Disorders – Kendra Little
Kendra says: “Do you have the right indexes in place to make queries run faster without dragging down write performance? I find major disorders in indexes everywhere I turn– and I’m not just talking about fragmentation. At the PASS Summit I’ll introduce you to the top five index problems I find in the wild, from multiple personalities to narcissism. I’ll give you the scripts to demonstrate when your indexes are just plain crazy, and the steps to help you fix up the patient without calling Sigmund Freud.”
The What, Why, and How of Filegroups – Jes Schultz-Borland
Jes says: “Filegroups aren’t the new hotness, but they’re a reliable and flexible method for organizing data in your database. They’re often misunderstood. Come to my session at PASS Summit to learn about creating, managing, and maintaining Filegroups. I’ll discuss best practices for them. I’ll even wow you with a walk-through of a piecemeal restore!”
Wanna join us? Register before June 30th and it’s just $1,395. If you want help pitching it to your manager, read Brent’s post on How to Get Budget Approval for Conferences.
Last week, I soaked up sun, rain, and knowledge in Orlando, Florida at the annual Microsoft TechEd conference. It’s a huge conference with something like 10,000 attendees, and it covers development and administration for Windows, SQL Server, SharePoint, and more. Between the keynotes, sessions, and attendee chats, here’s what I learned:
Microsoft’s cranking out the server hits. Windows Server 2012 has great new features like better file copy handling, improved clustering, and slick SAN integration. When your file servers and virtualization hosts are backed by an ODX-savvy SAN, you’re going to be shell-shocked at the performance improvements. The SharePoint, BI, and Visual Studio folks seemed just as excited about their upcoming stuff too.
DBAs are pretty sensitive about SQL 2012′s Enterprise Edition licensing. Microsoft switched to core-based licensing in this release. The short story is that the price is still about the same for quad-core CPUs, but above that, cost rises fast. Many DBAs told me their shops were gritting their teeth at paying $140,000 in licensing to run SQL Server on a $25,000 piece of hardware. Whenever I talked about a new feature in SQL Server 2012, people instantly asked, “Is that feature available in Standard, or just Enterprise?”
Hyper-V and System Center are coming on strong. The vast, vast majority of my clients happily use VMware, but they’re not so happy about the continued price increases. Hyper-V’s feature sets catch up to (and in some areas, even surpass) VMware – as long as you’re willing to buy into the whole System Center suite to get the full feature set. System Center’s deployment costs in terms of hardware, software, and manpower mean that there’s not an immediate ROI to replace VMware with Hyper-V, but if your company hasn’t deployed virtualization in a wide scale yet, they’re going to. I talked to less and less holdouts.
Attendees weren’t excited about Windows 8 clients. Turns out I’m not the only one who hates the clunky, low-res, low-information-density Metro interface on high-resolution desktops and laptops. I heard a lot of Windows Vista jokes, and I heard a lot of sysadmins saying, “We’re still running a lot of XP. We held out as long as we could, but right now, we’re going to deploy Windows 7 across the board.” I’m not sure that’s a good idea, though; if you have to retrain all your users on a new UI anyway, I’d skip to 8. If I was already on 7, though, I think I’d hold pat for a couple of years.
We still aren’t hearing release dates. Microsoft missed a big opportunity here because thousands of geeks are about to go home to their coworkers and do free marketing for Microsoft. They’re going to say, “I love what’s coming, but even though the year is half over, we still can’t make any plans because I have no idea when it’ll drop.” I know Microsoft understands the power of good word-of-mouth marketing, so the fact that we still don’t have release dates makes me think the products still have a ways to go before Microsoft can even announce a shipping date.
There’s a huge demand for good PowerShell basic training. The majority of sysadmins and DBAs that I spoke with still aren’t using PowerShell. They’re vaguely interested in it, but they’re not getting the education they need. Don Jones and Jeffry Snover ran an incredibly well-attended PowerShell crash course session with almost 1,500 attendees and got very, very positive feedback. Thanks to Microsoft, you can watch the PowerShell session recording now. (You may have to turn the audio *way* up in the video, and on your computer.)
You like me. You really like me. Attendees voted my session on Building the Fastest SQL Servers as one of the top five sessions at TechEd – not just the SQL Server track, but out of all 419 sessions at the entire show! I’m really humbled by that, given the quantity and quality of presenters like Mark Russinovich, Mark Minasi, and Paula Januszkiewicz. I’m still walking on air about that one, and I find myself asking – what do I do to take things to the next level?
Remember that weirdo in high school who had no social skills? Couldn’t get a date, worked at the fast food joint as a fry cook, face covered with zits – you know the one.
Okay, actually, it was us. Anyway, the point is, we got our act together, didn’t we? So did Windows Failover Clustering. When you weren’t looking, Windows Server 2008 cleaned up its clustering, and now it’s the new hotness that gets all the dates. It’s time to revisit what you thought you knew about Windows clusters.
Clusters Require Identical Hardware and Configuration
When I was your age, I had to look up every single piece of server hardware on the Windows Hardware Compatibility List (HCL) to make sure it was tested and approved. I either had to buy approved clusters as a package, or assemble them from detailed hardware lists. The new servers I wanted were never on the HCL, or they were way too expensive. Even when I got the goods, I had to assemble everything and then just hope it worked right. Inevitably, it didn’t, but the hardware usually wasn’t to blame – it was my own stupidity.
With Windows 2008 and newer, you can slap together pretty much any old hardware, run the Validate a Cluster wizard, and know right away that…uh, you’ve got a lot of work to do. I know you’ve got passionate feelings about how wizards are evil, but the Validate a Cluster wizard is AWESOME. It tests just about all of the requirements for a failover cluster and gives you a simple report of what’s broken and why you need to fix it.
You don’t need identical hardware on each node anymore – not by a long shot – but the configuration rules are still really, really specific. Some rules are guidelines (like the suggestion of multiple network cards to mitigate risks of a patch cable coming loose) but some are outright requirements.
See, this is one of my favorite things about the wizard: by default, if your cluster doesn’t pass the validation wizard, SQL Server won’t install. This is a DBA’s best friend in the war for systems excellence. If your company has separate Windows, storage, and networking teams, you can run the wizard before installing SQL Server. If it doesn’t pass, you can shrug, pass the ball back to the other teams to get the setup right, and work with them to get ‘er done.
Clusters Need a Heartbeat Network
Cluster nodes used to keep tabs on each other, and if they couldn’t reach a node, they’d freak out. To minimize the freakiness, we used a separate heartbeat network that didn’t handle any other traffic than just cluster chatter. In simple two-node clusters, this was often done by running a crossover cable between the two nodes, which even eliminated the possibility of switch failures. This was a giant pain, and almost nobody got the configuration quite right – 258750 was one of the few Microsoft knowledge base article numbers I actually knew by heart.
Windows Server 2008′s failover cluster networking is less freaky and more friendly: it’ll use whatever networks it can to reach the other nodes. This has its own drawbacks – we need to make sure that any node can reach any other node over any available network, and we need to make sure that all of our networks are highly available. That highly available part is key – preferably we’ve got two network cards in a teamed pair, and we test before go-live to make sure the cluster stays up if a patch cable goes down.
Clusters Require Shared Storage (SAN)
The entire history of Windows clustering has revolved around a shared set of drives that all of the nodes could access. If one server crashed, another node reset the drive ownership, took control, and fired up SQL Server.
You can still build shared storage failover clusters, but Windows Server 2008 and 2012 both manage to run some clustered applications with no shared storage devices. The app has to be designed to work without shared storage, like SQL Server 2012′s new AlwaysOn Availability Groups. Heck, we can even fake-out traditional shared-disk clustering solutions by using UNC paths for our databases. Jonathan Kehayias wrote an in-depth tutorial on how to build a SQL Server cluster on a NAS.
Cluster Quorums Manage Themselves
Back in the days of the quorum drive, all of the cluster nodes got together and decided who was boss simply based on who could see the quorum drive on shared storage. We could move the cluster ownership around by passing the quorum drive around.
Today, since we don’t necessarily have shared storage, we can’t rely on a quorum drive. Windows Server now offers a variety of quorum options including node majority, node and disk majority, and my favorite at the moment, node and file share majority. This means a file share can act as a voting member of the team, enabling two-node clusters with no shared storage.
Configuring quorum – especially managing non-voting members of the quorum – is a tricky but necessary part of building a solid cluster. I’ve already helped a few folks bring their clusters back online after they accidentally took the whole thing down due to rebooting just one (seemingly) passive node. We have to understand our cluster’s quorum method, document what happens if one of the members is rebooted, and ensure that all team members know what needs to happen during patch windows.
Cluster Management is Painful and Obscure
If you’ve had the misfortune of memorizing cluster.exe commands just to get your job done, raise your hand. No, wait, put your finger back down, that’s not appropriate. This is a family web site, and we don’t need to hear your horror stories about reading obscure knowledge base articles in the dead of night.
Unfortunately, the bad news is that this particular point is still true. You’re still going to be managing clusters at the command line.
The good news is that for the most part, you can use PowerShell instead of cluster.exe. This means that as you learn to manage clusters, you’ll also be learning a language that can be used to manage more SQL Servers simultaneously, plus Windows, VMware, Exchange, and lots of other things that you probably didn’t want to have to learn. Okay, so that’s also still kinda bad news – but the good news is that sysadmins will find cluster management more intuitive, because they can use the language they already know.
Whether you want help choosing between an active/passive and an active/active cluster, or if you’re the kind of DBA who knows that’s not even the right name for failover clustered instances anymore, check out our SQL Server clustering training page.
Ever wanted an independent measurement for how “fast” a given SQL Server can run? Benchmarks allow you to measure and compare base performance between instances using a standard simulated workload. In this 30 minute webcast, Kendra Little will introduce you to a free benchmarking tool and demonstrate how to run a TPC-C benchmark against SQL Server. If you’re a DBA or developer who wonders whether or not your instance can turn it up to eleven — this session is for you.
Want links from the video? Scroll on down to the bottom of the post.
Links from the Video for TPC, Hammerora, and More
TPC Site: http://www.tpc.org/
I just returned from SQLCruise – a seven-day cruise to Alaska. I even got my company to pay for it because it included SQL Server training. Yes, you read that correctly. SQL Server training…on an Alaskan cruise!
What is SQLCruise?
It’s SQL Server training…on a cruise ship. Eleven people that work with SQL Server in various ways received training from six excellent instructors while on a cruise to Alaska. The training was held while we were at sea, allowing us to enjoy the at-port days, and network (or not) in the evenings. It also included the usual cruise fun – all-you-can-eat buffets, shows, fantastic scenery, and less-than-calm seas.
I spent the week with ten other database professionals. It was interesting that we ranged from people that had been in IT for decades to those that were just starting. Some came from business backgrounds, others were management. A wide variety of job roles and industries were represented in the room.
The best part was, though, that everyone got something out of the training. SQL Server has become such a large product that no one person can know everything about it. Seeing people that have been working with it for years learning new things reminds me that everyone needs to keep taking classes, keep reading, and keep growing.
The Instructors and Sessions
Over seven full days on the ship, four of those having formal class times, we had seven classes worth eighteen hours of training from six knowledgeable and highly-respected instructors – Brent Ozar, Kevin Kline, Jeremiah Peschka, Allen White, Kendra Little, and Robert Davis. That group includes three SQL Server MCMs and five SQL Server MVPs.
How does that compare to the amount of time spent at a large conference, such as PASS Summit? Summit 2011 spanned three full days, with twelve sessions. Most of those were 75 minutes in length, which amounts to 15 hours of training – if you went to a session in every time slot. Thus, the amount of time is very similar. SQLCruise also has the benefit of much smaller class sizes, plus time after to talk with the instructors, often one-on-one.
A wide range of topics were covered: Procedure Cache, Leadership Skills for the Database Professional, Killer Techniques for Database Performance, Automate and Manage SQL Server with PowerShell, Sports Medicine for High-Transaction Databases, and Recovering Data and Databases. Can you say, “WOW!”? I have pages of notes to reference!
One of the best parts for me was that on the ship, unless you paid (and paid dearly), there was no internet access. That means no email , Twitter, IM, or Facebook interruptions. Class time was focused time. I’ve gotten a lot better at shutting off email and Twitter while I’m working, to focus more. It was great to be able to extend that to training. (Worried your boss won’t let you go if you won’t be reachable? Now is the perfect time to start talking about training a backup, reviewing your documentation, and making sure you can go on a vacation undisturbed! It’s a great feeling!)
Every past cruiser I talked to listed networking as the top feature of SQLCruise, and now I know why. With eleven cruisers, six instructors, three MCMs, seven MVPs (it wasn’t just the instructors!), decades of experiences, and a week to get to know each other, we had plenty of time for networking.
The first day we had a great icebreaker, the Red Gate Software Search the Ship Contest. We were divided into teams and had to find items on the ship. This was a hilarious and fast-paced introduction to the ship and each other. The winners were awarded a license for Red Gate’s awesome software, too – you can’t beat that!
We had dinners together in the buffet, the dining rooms, and at some of the specialty restaurants. At breakfast, you were sure to catch at least one other person to chat with over coffee (or tea). Several nights, we had Office Hours at one of the lounges.
The Office Hours were fantastic. The format was simply, “Ask a question – about your job, SQL Server, your career, blogging, anything.” I learned more about virtualization, encouraged others to start presenting or blogging, chatted about user group leadership, and really got to know my fellow cruisers. This time is invaluable. It’s what I try to get out of user groups, but that can be hard when people want to get home to their families. It’s what I try to get out of SQL Saturdays, but organized networking time at those is not consistent. It’s what I try to get out of PASS Summit, but can be difficult with such a large crowd. It worked perfectly on SQLCruise.
Thank you. That is all I can say to the wonderful companies that sponsored this great event. Without their generous support, we would not have had these opportunities. SQL Sentry held the Hairy Execution Plan contest, and sent Allen White along to demo Plan Explorer. Brent Ozar PLF sponsored not only me, but also the SQL Storytellers competition. The lucky winner received a Kindle! Red Gate gave us the Search the Ship contest, and great books. Idera sponsored the Iditarod Challenge (of which no further details shall be released, to surprise future cruisers), and two more people won Kindles! Quest Software provided free licenses for their software to cruisers (hooray!) and sent Kevin Kline. I can’t forget to mention B-Side Consulting, run by our great Cruise Director Tim Ford. Without their generosity, this even would not have been as awesome. (And without their software tools and services, our jobs would be a lot harder.)
Get Your Cruise On!
SQLCruise is a great way to combine training and a vacation. You can bring your significant other, kids, parents – just not the dog. At sea, while they’re taking a class on towel-animal-folding, playing in the casino, or sitting by the pool, you can learn and network.
I highly recommend SQLCruise to anyone that wants to learn more about SQL Server and truly connect with others who want to do the same. Check http://sqlcruise.com for upcoming dates and locations!