
Memory and IO in Tempdb: SQL 2014 and SQL 2012 SP1 CU10 Improvement


Update on 5/25/2014: At least parts of this improvement have been backported to SQL Server 2012 SP1 CU 10. Read the KB here. After applying CU10 to SQL Server 2012 SP1, the test query shown in this post performed as it did against SQL Server 2014.

I was excited to read Bob Dorr’s recent post about a hidden gem in tempdb performance in SQL Server 2014. I had a copy of SQL Server 2014 Developer Edition handy, as well as a workload that beat up tempdb, so I decided to do a quick performance test. I wanted to find out how much this might help performance when tempdb was already on SSDs.

The tempdb improvement (in a nutshell)

There’s been a myth for a long time that if you populate a temp table or table variable, the data just resides in memory. That hasn’t been true in the past: pages belonging to temp tables and table variables could get written to disk like anything else. In SQL Server 2014, though, tempdb is not as “eager” to flush those pages to disk.

The test setup

I ran a simple workload using HammerDB. My workload “warmed up” for one minute, then ran for three minutes. The workload ran with 1 watcher thread and 6 worker threads. Each worker thread selects data into a temp table as fast as it possibly can, over and over again. The workload isn’t much like a normal production workload: it just beats up tempdb.

I ran this test a few times independently against two VMs on my laptop. Both VMs have 6000MB RAM, 4 vCPUs, Windows Server 2012 R2, and all their storage is on SSD.

  • Mister: SQL Server 2012 SP1
  • ChairmanMeow: SQL Server 2014 RTM

Both SQL Server instances were configured with ‘Cost Threshold for Parallelism’ set to 50, which effectively makes this workload single threaded. Each instance had 40 equally sized tempdb data files configured. (That’s a lot more than you’d usually have with 4 vCPUs! This is a contrived workload with an especially high amount of tempdb contention.)

During each test run, I took a 30 second sample of waits and file io stats using sp_BlitzFirst®. (I ran: exec sp_BlitzFirst @ExpertMode=1, @Seconds=30)

The Test Query

The workload runs this test query. The execution plan costs out as 16.71 on both of my instances and the plan is the same shape.
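The exact query isn’t reproduced here, but its shape matters more than the specifics. Here’s a rough sketch of the SELECT ... INTO pattern each worker thread hammers (the table and column names below are placeholders, not the actual HammerDB schema):

    -- Hypothetical sketch: each worker loops over something like this as fast as it can.
    SELECT TOP (5000)
           o.OrderId,
           o.CustomerId,
           o.OrderTotal
    INTO   #WorkerScratch            -- lands in tempdb
    FROM   dbo.Orders AS o
    WHERE  o.OrderDate >= DATEADD(DAY, -30, GETDATE());

    DROP TABLE #WorkerScratch;       -- then do it all over again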

How did the instances do?

I got quite consistent results over multiple tests. Here’s a comparison:

tempdb 2012 2014 comparison

Wait stats comparison

My SQL Server configuration isn’t perfect. Sure enough, there’s still some tempdb contention: I’ve got PAGELATCH waits because my queries are fighting over PFS and GAM pages.

PAGEIOLATCH waits show up more in the SQL Server 2012 sample. That’s the first hint that disk access is more of an issue there.

MB written by file comparison

The real validation of what this feature is doing comes from the two columns on the right. sp_BlitzFirst® peeks at sys.dm_io_virtual_file_stats, and when you run it in ExpertMode it shows you the top 5 files you’ve read from and the top 5 files you’ve written to.
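If you’d rather eyeball the raw numbers yourself, here’s a minimal sketch of a query against sys.dm_io_virtual_file_stats for tempdb (the column pick and math are mine, not sp_BlitzFirst’s):

    -- Cumulative IO stats (since startup) for tempdb files; database_id 2 = tempdb.
    SELECT mf.name                                          AS file_name,
           mf.type_desc                                     AS file_type,
           vfs.num_of_writes,
           vfs.num_of_bytes_written / 1024 / 1024           AS mb_written,
           vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
    FROM   sys.dm_io_virtual_file_stats(2, NULL) AS vfs
    JOIN   sys.master_files AS mf
           ON mf.database_id = vfs.database_id
          AND mf.file_id     = vfs.file_id
    ORDER BY vfs.num_of_bytes_written DESC;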

We can clearly see that in SQL Server 2012, I was doing a lot of write IO. That write IO was going to every single tempdb file evenly (that’s expected if they’re evenly sized), as well as to the tempdb log. My average write latency was 7 ms, which isn’t really bad given what I was doing to my SSDs.

And sure enough, that IO just about disappears in SQL Server 2014! That adds up to a notable improvement in transactions per minute.

You’ll really love this feature if you can’t put SSDs underneath tempdb on all of your instances. But even if you’ve already tuned up your storage, this feature should help you out.


Introduction to Extended Events (video)


Extended Events were introduced with SQL Server 2008. Extended Events is a replacement for Server Side Traces and so much more – it’s a lightweight way to look at SQL Server performance, events, deadlocks, locking, blocking, and more. In this webcast, Jeremiah Peschka provides a high-level introduction to Extended Events. You’ll learn about how Extended Events are structured, how to find out what’s available, and where to get started.

Find out more at our Extended Events resource page.


“If You Can’t Do ____, You’re Not a Production DBA”

Recently I saw a tweet that said, “If you can’t read a query execution plan, you’re not a production DBA.”

I love execution plans, and there are really great reasons to learn to read execution plans, but I disagree completely.

Database Administration is not all about performance

Performance is only one component to database administration. Let’s take Jill as an example.

Jill is a SQL Server DBA who focuses on High Availability and Disaster Recovery for a large company. She works to keep her knowledge and skills sharp on backups and restores, Windows Failover Clustering, and Availability Groups. She designs mission critical systems to meet high uptime requirements. She also designs and runs a lab where she proactively causes outages and troubleshoots them, corrupts databases and repairs them. She uses these experiences to build processes for front-line engineers to respond to production incidents and trains them to keep systems online as much as possible. Jill is excited about learning more about virtualization and SAN replication in the next year and has created an aggressive learning plan.

Jill’s pretty awesome, isn’t she? Is Jill magically not a Production DBA if she looks at an execution plan and doesn’t know what it means?

If you’re just getting into database administration, we don’t recommend that you start with performance as your primary focus. We recommend that you start at the base of the database Hierarchy of Needs. As you work your way up, decide where to specialize.

Don’t de-legitimize DBAs who are Different than You Are. Talk to Them!

It’s great to love what you do. Just remember that database administration is a really broad field, and the way you DBA isn’t necessarily the way everyone has to DBA. (Sure, DBAing is a verb. Why not?)

Let’s not fall into the trap of thinking “I’ve got to have this one skill so I can get into the cool club.” Instead, let’s get excited about the fact that there’s so much to learn that we all have to make tough choices. Let’s seek out the people who choose differently than us, and learn from them, too.


Happy Third Birthday to Us, Presents for You


Three years ago this month, we turned this blog into a consulting company. This year, we’re … well, we’re still not adults.

Brent Ozar Unlimited Class of 2014

It’s been an incredible three years. Sure, it feels just “normal” now, but when we look at the numbers, suddenly it feels like a big deal compared to where we were in 2011:

  • Over 250 SQL Critical Care® patients
  • 1,322 training video customers
  • Thousands of in-person training attendees at dozens of classes and pre-cons
  • 190 YouTube videos watched by 635,471 people
  • 23,782 email subscribers
  • 3,333,741 unique website visitors
  • 2 awesome full time employees
  • Microsoft Silver Partner
  • 3 company retreats

Let’s celebrate. On Tuesday, April 29th, for exactly one day only, discount code Happy3rd gets you 33% off our training videos and in-person classes. Enjoy!


Advice to an IT newcomer

Women in Technology (yes, Clippy is a woman, too)

We recently got the following question for Kendra and Jes in the Brent Ozar Unlimited® mailbox:

Six months ago I stumbled into the IT world. Do you have any advice for someone (a woman) starting off in the IT industry, especially someone without a computer science degree? I really enjoy working with databases and data. I would love to one day be a Database Administrator and get into business intelligence.

JES SAYS…

There has been a lot written about women in technology – positive and negative. I’m happy to say that my experience as a woman in technology has been incredible. The SQL Server community has been accepting, helpful, and nurturing. If anyone, man or woman, came to me, asking how to succeed, I’d give them this advice.

Ask questions. Constantly learn. There is no such thing as too much knowledge. The most successful people I know in IT – not just the SQL Server world – are lifelong learners. They read books and blogs, attend user group meetings and conferences, and learn new programming languages in their spare time. Build your own computers. Know how to set up a network. You don’t have to be an expert at every facet of technology, but knowing enough to talk to other people in “their” language will go a long way.

Speaking of user groups…find the closest user group, and attend it regularly. Don’t have one nearby? Start one, or join a virtual user group (like those at http://www.sqlpass.org/PASSChapters/VirtualChapters.aspx). Not only will you learn about things you may not be exposed to at work, you’ll have a chance to build a network. I can’t tell you the number of questions I’ve gotten answered through this channel, or the number of people I know that have gotten new (better!) jobs this way.

Never, ever be afraid to say, “I don’t know, but I can find the answer.” People will respect you far more if you are honest. Don’t try to be a know-it-all. If you haven’t dealt with a particular technology or situation before, acknowledge that, make note of what the person is asking for, and research it later. You’ll learn something new, and you won’t get caught giving bad – or wrong – answers.

Don’t think, “I’ll never know as much as him/her.” Yes, there are some people in IT that started building computers or robots or software before they could drive a car. Instead of thinking that you will never know as much as they do, remind yourself how many years of experience they have – and how they can help you. Ask to pick their brains. Ask them what books they read or what tools they use. Learn from them.

Most of all, don’t fall victim to the so-called “impostor syndrome”. Someone will always appear smarter, faster, more organized, more accomplished, or less stressed, or will seem like she hasn’t spilled coffee all over herself yet today. Don’t let that invalidate where you started from and what you have accomplished. Keeping a blog – even if it’s private – that you can go back and reference over the years is a great way to value yourself. I know I’ve seen a dramatic change in my writing over five years, from both a technical and editorial perspective.

Good luck! Being in IT – and especially the SQL Server field – is exciting and rewarding.

KENDRA SAYS…

Remind yourself that you’re qualified to be a DBA. You mention that you don’t have a computer science degree. Great news: that’s not required. There’s no degree or certification that is the Official DBA Training Program or Proving Ground.

I had a period where I felt I wasn’t “legitimately a DBA” because I also worked with some C# and drove some non-SQL software development processes. But I was doing some really cool things with SQL Server administration, too. I should have felt awesome about being a DBA, regardless of my job title.

Never feel that your background or your current job have to fit a specific mold for you to be a “real” DBA. There is no mold!

Remind yourself that you are designing your own career as you go. Just like there’s no set educational and certification “solution” to end up with all the right skills, job progression for DBAs is usually not linear. It’s a rare person who starts with a “Junior DBA” job title and works their way up to “Mid Level DBA” and then “Senior DBA”.

Instead, most people these days struggle to find the right training, the right mentor, and discover if they want to specialize (Virtualization? Performance Tuning? Data modeling?), be a generalist, or dive off into uncharted waters (Hadoop? Different Platforms? Business Intelligence?). There are many paths, and there are new paths all the time.

Expect to repeatedly redefine your interests and redesign your career. Make sure that every six months you have a conversation with your manager about what kind of work makes you excited, and where you’d like to be in a year or two.

Remind yourself that you do great stuff – and write it down. For a few years, I had a formal review process where I was regularly required to write out my accomplishments. And you know what? That was great for me! Each item on the list seemed small, but when I put it all together it gave me a new view of myself.

This may be difficult, but it’s worth it. Keeping track of the great things you do boosts your confidence and makes you ready when opportunity comes around.

GET INSPIRED

Is this advice only for women? Heck no! Sometimes it’s just nice to ask advice from someone who’s part of a group you also identify with and hear their perspective.

Want to learn more about how to build a great career working with data? Hop on over to Brent’s classic post, “Rock Stars, Normal People, and You.”


Going, Going, Gone: Chicago Performance Class Selling Out


Your application wants to go fast, so I guess that means our tickets go fast.

For our upcoming Chicago training classes in May, the 3-day SQL Server Performance Troubleshooting class is just about sold out. (There are still about 15 seats left for our How to Be a Senior DBA training class, though.)

If you missed out this round, we’re doing the same classes in Philadelphia this September. Read more about the classes here, or read about how our San Diego attendees liked it earlier this year.

Don’t hold out for additional cities or dates – this is our entire 2014 training calendar. We’re working on our 2015 lineup as we speak, though – let us know what you’re looking for, and you might just influence our training class decisions.

Brent Teaching with Hardware

SQL Server 2014 Buffer Pool Extensions


SQL Server 2014 contains some interesting new features. Although SQL Server Standard Edition is limited to 128GB of memory, teams deploying on Standard Edition have an option to fit more of their working set in low latency storage – SQL Server Buffer Pool Extensions.

How SQL Server Normally Deals With Data

During SQL Server’s normal operations, data is read from disk into memory. At this point, the data is clean. Once the data is changed, it is marked as dirty. Eventually the dirty pages are written to disk and marked as clean; clean pages may be flushed from memory when the data cache (the buffer pool) comes under pressure. At this point the data is gone from memory and SQL Server has to read it from disk the next time a user runs a query that requires the data.
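If you want to watch this happen, sys.dm_os_buffer_descriptors exposes which pages are currently in the buffer pool and whether they’re dirty. A quick sketch (counts only, nothing fancy):

    -- Clean vs. dirty pages currently in the buffer pool, per database.
    SELECT DB_NAME(database_id)                             AS database_name,
           SUM(CASE WHEN is_modified = 1 THEN 1 ELSE 0 END) AS dirty_pages,
           SUM(CASE WHEN is_modified = 0 THEN 1 ELSE 0 END) AS clean_pages
    FROM   sys.dm_os_buffer_descriptors
    GROUP BY database_id
    ORDER BY dirty_pages DESC;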

As long as you have less data than you have memory, this isn’t a problem. As soon as you have more data than you have memory, you’re at the mercy of physical storage.

Right now, some of you are probably saying “So what? I have an awesome EMC/3PAR/Hitachi VX6.” You also probably have SQL Server Enterprise Edition and a pile of unused RAM sticks. This blog post isn’t for you. Go away.

The rest of you, the 99%ers, listen up.

Speeding Up Data Access with Buffer Pool Extensions

SQL Server 2014 Buffer Pool Extensions are our new secret weapon against not having enough memory. Like most secret weapons, there’s a lot of hype surrounding Buffer Pool Extensions (BPE).

The idea behind BPE is a lot like the idea behind virtual memory (better known as swap space): fast, low latency persistent storage is used to stand in for a portion of memory. In the case of BPE, SQL Server will use the disk space to store clean buffers – specifically, BPE will hold unmodified data pages that would otherwise have been pushed out of RAM.
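Turning BPE on is a one-liner. The file path and size below are placeholders, not what I used for this test:

    -- Point the extension at fast local SSD; SIZE can be several times max server memory.
    ALTER SERVER CONFIGURATION
    SET BUFFER POOL EXTENSION ON
        (FILENAME = 'E:\BPE\bpe_cache.bpe', SIZE = 64 GB);

    -- To turn it back off:
    -- ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION OFF;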

To see just how fast this would perform, I created a test instance of SQL Server and decided to find out.

Test Methodology

I ran SQL Server through a relatively boring test harness – TPC-C running through HammerDB. The database was created at a scale factor of 444 warehouses – this yields 44GB of data on disk, or near enough. SQL Server 2014 RTM was installed on Windows Server 2012 on an Amazon i2.8xlarge instance. To find out more about how the instance was physically configured, you can check out the instance type details page.

SQL Server was set up in a fairly vanilla way:

  • Max degree of parallelism was set to 8
  • Max server memory was left alone
  • tempdb was given 4 data files located on a local all SSD RAID 0 of four drives
  • A second all SSD RAID 0 of four drives was reserved for BPE

Tests were run multiple times and the results of the first test were discarded – many changes during this process could clear the buffer pool, so the first test’s results were assumed to be anomalous. The remaining results were averaged to produce the following chart:

It’s faster than not having enough RAM.

Conclusions

Having an appropriately sized buffer pool was far more effective than allocating considerable space to buffer pool extensions. BPE improved performance by 42.27%. This is not an insignificant performance gain, but BPE is no substitute for memory.

BPE will shine for customers deploying in dense, virtualized environments where memory is constrained but SSD is cheap and plentiful. Given the near ubiquity of SSD, and high speed consumer grade SSD being available for as little as $0.50 per GB, BPE may seem tempting. It may even provide some respite from performance issues. However, BPE is no substitute for RAM.


Collecting Detailed Performance Measurements with Extended Events


Analyzing a workload can be difficult. There are a number of tools on the market (both free and commercial). These tools universally reduce workload analysis to totals and averages – details and outliers are smeared together. I’m against using just averages to analyze workloads; averages and totals aren’t good enough, especially with the tools we have today.

Collecting Performance Data with Extended Events

A note to the reader: SQL Server Extended Events let us collect detailed information in a lightweight way. These examples were written on SQL Server 2012 and SQL Server 2014 CTP2 – they may work on SQL Server 2008 and SQL Server 2008R2, but I have not tested on either version.

We need to set up an Extended Events session to collect data. There are three Extended Events that provide information we want:

  • sqlos.wait_info
  • sqlserver.sp_statement_completed
  • sqlserver.sql_statement_completed

In addition, add the following actions to each event, just in case we need to perform deeper analysis at a later date:

  • sqlserver.client_app_name
  • sqlserver.client_hostname
  • sqlserver.database_id
  • sqlserver.database_name
  • sqlserver.plan_handle
  • sqlserver.query_hash
  • sqlserver.query_plan_hash
  • sqlserver.session_id
  • sqlserver.sql_text

In a production environment you should enable sampling. Sampling is just a predicate that filters out random events, typically by using the modulus of the session identifier (e.g. SPID modulo 5 = 0 will sample 20% of activity). Enabling sampling makes sure that you don’t collect too much data (yes, too much data can be a bad thing). During a routine 5 minute data load and stress test, I generated 260MB of event data – be careful about how often you run this and how long you run it for.
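Here’s a trimmed-down sketch of what such a session can look like, with a roughly 20% sample on session_id. The events and actions are the ones listed above; the session name and file target path are placeholders:

    CREATE EVENT SESSION [workload_detail] ON SERVER
    ADD EVENT sqlos.wait_info (
        ACTION (sqlserver.client_app_name, sqlserver.client_hostname,
                sqlserver.database_id, sqlserver.database_name,
                sqlserver.plan_handle, sqlserver.query_hash,
                sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text)
        WHERE (package0.divides_by_uint64(sqlserver.session_id, 5))   -- ~20% sample
    ),
    ADD EVENT sqlserver.sp_statement_completed (
        ACTION (sqlserver.client_app_name, sqlserver.client_hostname,
                sqlserver.database_id, sqlserver.database_name,
                sqlserver.plan_handle, sqlserver.query_hash,
                sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text)
        WHERE (package0.divides_by_uint64(sqlserver.session_id, 5))
    ),
    ADD EVENT sqlserver.sql_statement_completed (
        ACTION (sqlserver.client_app_name, sqlserver.client_hostname,
                sqlserver.database_id, sqlserver.database_name,
                sqlserver.plan_handle, sqlserver.query_hash,
                sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text)
        WHERE (package0.divides_by_uint64(sqlserver.session_id, 5))
    )
    ADD TARGET package0.event_file (SET filename = N'C:\XE\workload_detail.xel')
    WITH (MAX_DISPATCH_LATENCY = 30 SECONDS,
          EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS);

    -- Start it when you're ready:
    -- ALTER EVENT SESSION [workload_detail] ON SERVER STATE = START;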

To get started, download all of the scripts mentioned in this post.

Reporting on Extended Events

Go ahead and set up the Extended Events session and run a workload against that SQL Server for a few minutes. If you’re analyzing a production server, copy the files to a development SQL Server or to a SQL Server that has less workload – you’ll thank me later. Once you’ve collected some data in the Extended Events session, stop the session, and run the results processing script. On my test VM, processing the data takes a minute and a half to run.

The data processing script creates three tables to materialize the XML data and speed up processing; shredding XML takes a long time. Relevant data is extracted from each of the different events that were collected and persisted into the processing tables. If you’re going to be doing multiple analysis runs across the data, you may even want to put some indexes on these tables.

Once the data is loaded, run the analysis query. This doesn’t take as long as the processing script, but it does take some time; it’s worth it. The analysis script collects a number of statistics about query performance and waits. Queries are grouped into 1 minute time blocks and metrics around reads, writes, duration, and CPU time are collected. Specifically, each metric has the following statistics built up:

  • Total
  • Average
  • Standard Deviation
  • Minimum
  • Maximum
  • Percentiles – 50th, 75th, 90th, 95th, and 99th

The same thing happens for each wait as well – each wait is time boxed on the minute and then both signal and resource waits are analyzed with the same statistics as query duration metrics.

In both analyses, the work is performed on a query-by-query basis. At the end of the analysis you get a multi-dimensional view of the data by time and query. It should be easy to perform additional analysis on the data to create broader time windows or to analyze the entire dataset at once.
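To give a feel for the analysis query, here’s a cut-down sketch of how per-minute duration percentiles can be built with PERCENTILE_CONT. The #statements table and its columns are stand-ins for the processing tables the downloadable scripts create:

    -- Time-box each statement to the minute it completed in, then compute
    -- distribution statistics per query_hash per minute.
    SELECT DISTINCT
           s.query_hash,
           DATEADD(MINUTE, DATEDIFF(MINUTE, 0, s.event_time), 0) AS minute_block,
           PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY s.duration_ms)
               OVER (PARTITION BY s.query_hash,
                     DATEADD(MINUTE, DATEDIFF(MINUTE, 0, s.event_time), 0)) AS median_duration_ms,
           PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY s.duration_ms)
               OVER (PARTITION BY s.query_hash,
                     DATEADD(MINUTE, DATEDIFF(MINUTE, 0, s.event_time), 0)) AS p95_duration_ms
    FROM   #statements AS s;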

Want to see what the output looks like? Check out these two screenshots:

Query Performance by the Numbers
Query Waits

Why Produce All of These Metrics?

All of these metrics give us insight into how these queries really run; an average just doesn’t help. Standard deviation alone lets us be aware of the variability of a particular metric – high standard deviation on a particular wait type means that we have a lot of variability in how long we wait on a resource. We also collect percentiles of all of these different metrics to help understand the distribution of data.

With this data at our disposal, we can make a better analysis of a workload. Now we can identify variations of a query that are producing bad plans, taking too long, or just reading an excessive amount of data. Or, better yet, if query performance is consistently bad, we know the code is just plain awful.

How’s that for a parting thought?


Five Reasons a Production DBA Should Learn to Read Execution Plans


While chatting with attendees before my Don’t Fear the Execution Plan webcast, a question was posed: “As a DBA who mainly worries about backups, index fragmentation and checkdb, should I also be looking at execution plans?”

YES! Here’s why.

  1. Performance is important. A production DBA’s main focus should be the safety and security of the data, but the performance of servers is also something to be concerned with. Companies pay a lot of money in hardware, support, and licensing costs for these servers. Being able to squeeze every bit of performance from them will save the company money – which helps everyone in the company in the long run.
  2. Be a superhero. Your first response to an ongoing, persistent performance problem may be a hardware fix. Add more memory, add more disks, ask the SAN team to give you dedicated LUNs, beg for a new server. But at some point, it is going to be worth your time – and the company’s money – to tune bad queries. By being the person that can identify the bad queries and provide ways to improve them, you are putting on your superhero cape and saving the day.
  3. You’re being proactive, not reactive. We all need to be reactive sometimes – that’s a facet of the job. But being proactive – identifying top resource-consuming queries on a server, identifying problem spots in the queries, and suggesting fixes – makes us look better. We show we are bringing value to the company.
  4. Grow your knowledge of SQL Server. You never know when you may need the information. A new developer may have been able to ask the DBAs questions at her last job; you may need the skill at your next job. You will also learn things along the way about SQL Server internals, and that’s great knowledge to have.
  5. Increase communication between DBAs and developers. If your developers already use execution plans to tune queries, you will be able to speak their language. If they don’t, you can teach them as you learn – and there is no better way to learn than to teach something! Breaking down communication barriers is a goal my clients typically have. It involves some work and willingness to give, but will make your job better.

The SQL 2014 Cardinality Estimator Eats Bad TSQL for Breakfast

Is this really happening?

Sometimes I run into a query plan that just shocks me. It’s like looking up and suddenly seeing an adult dressed as a squirrel riding a unicycle down a busy city street. You have to stop and ask yourself, “Did that really just happen?” (I live in Portland, Oregon, so yeah, I’ve seen that.)

A while back I blogged about how to write a query to demonstrate a big memory grant against the AdventureWorks2012 database. The trick was to stuff the query’s joins full of functions so that the query optimizer became baffled about how many rows might come out of the join. The query is terrible, but it’s useful for demonstrating some problematic situations with memory on a test server.

Recently I dug out the query to set up a demo on SQL Server 2014 and something strange happened: it saw through the awfulness of my query. It made my horrible TSQL fast.

Let’s feed my terrible query to SQL Server 2014

First, make sure the new cardinality estimator will be used for your query. You do this by setting the database compatibility level to 120:
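For AdventureWorks2012, that’s the standard syntax:

    -- 120 = SQL Server 2014; enables the new cardinality estimator for this database.
    ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL = 120;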

Now we’ll run this terrible, awful, no good query:

Shazam, it finishes using only 63 ms of CPU time. It asks for a 27MB memory grant and estimates 290 rows (quite accurately).

Top of Execution Plan-New Cardinality Estimator

The execution plan contains warnings that “Type conversion in expression … may affect “CardinalityEstimate” in query plan choice”, but wow, it really did a remarkably good job with this!

Using OPTION(QUERYTRACEON 9481) to test Compatibility Level 110

You can go back to the old cardinality estimator with SQL Server 2014 in two ways: You could change the whole database’s compatibility level back to 110 like this:
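    -- Back to the SQL Server 2012 behavior for the whole database.
    ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL = 110;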

But that changes it for everything in the whole database. We might just want to see how THIS query would behave using the old cardinality estimator, but still on SQL Server 2014. You can do that by adding OPTION(QUERYTRACEON 9481) to the very end of our gruesome query:
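The hint just gets tacked onto the end of the statement. Sketched here with a stand-in query, since the gruesome original lives in the earlier post (and note QUERYTRACEON requires sysadmin-level permissions):

    -- Placeholder query; the OPTION clause is the part that matters.
    SELECT p.FirstName, p.LastName
    FROM   Person.Person AS p
    OPTION (QUERYTRACEON 9481);   -- per-query opt-in to the legacy cardinality estimator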

Retesting the query with the old cardinality estimator… Ouch! The query uses 84,109 ms of CPU time and the execution plan is back to a world of confusion, thinking it’s going to have to handle a kazillion rows:

Execution Plan-High Estimations

Hey there new cardinality estimator, I’d like to get to know you better.

I’m not saying the new cardinality estimator will be better at every query, that it won’t have any regressions, or that you should start putting functions around all your joins.

But it’s pretty remarkable when the optimizer takes code that you wrote to be as terrible as possible, and suddenly makes it fast. Sign me up for more of that.


Updated “How to Think Like SQL Server” Videos


On my East Coast user group tour last month, I presented my How to Think Like the SQL Server Engine course to a few hundred folks. I’m always trying to learn and adapt my delivery, and I noted a lot of attendee questions this round, so I updated the course.

This is my favorite course I’ve ever done. In about an hour and a half, I cover:

  • Clustered and nonclustered indexes
  • Statistics, and how they influence query plans
  • How sargability isn’t just about indexes
  • T-SQL problems like table variables and implicit conversions
  • How SQL turns pages into storage requests

If you’ve always wanted to get started learning about SQL internals, but you just don’t have the time to read books, this course is the beginning of your journey.

Attendees give it rave reviews, but this one is one of my favorites:

Go check it out and let me know what you think.


Why Index Fragmentation and Bad Statistics Aren’t Always the Problem (Video)

Do you rely on index rebuilds to make queries run faster? Or do you always feel like statistics are “bad” and are the cause of your query problems? You’re not alone– it’s easy to fall into the trap of always blaming fragmentation or statistics. Learn why these two tools aren’t the answer to every problem in these two videos from Kendra Little.

Does Rebuilding Indexes Make Queries Faster in SQL Server?

Learn if rebuilding indexes is a good way to help queries run faster in SQL Server in this 13 minute video.

Not convinced that fragmentation isn’t really your problem? Read Brent’s post, “Stop Worrying about Index Fragmentation”.

Why are Statistics Always the Problem? (Or ARE They?)

Once you’ve moved beyond an addiction to rebuilding indexes, you may find that you become an obsessive Statistics Blamer. Learn why you might mistakenly think your stats are “bad” when something else is really the problem in this 12 minute video.

Want to learn more about statistics, parameter sniffing, and hints? Read “Optimize for… Mediocre?” and “Recompile Hints and Execution Plan Caching”.


SQL Server 2014 Licensing Changes


With the release of SQL Server 2014, we get to learn all kinds of new licensing changes. While I don’t work for Microsoft legal, I do have a PDF reader and a web browser. You can follow along in the SQL Server 2014 Licensing Datasheet… if you dare.

Server + CAL Licensing is Still Around

It’s only for Standard Edition and BI Edition.

Microsoft is highly recommending that VMs be licensed as Server + CAL (rather than per core). This can make a lot of sense when there are small, single-application SQL Servers that cannot be consolidated for security reasons. Having a number of Server + CAL licenses for 1 or 2 vCPU instances can be much more cost effective than buying a large number of core-based licenses.

Of course, it makes even more sense to just license the entire VM host…

How much money would it take to give you assurance?

Standby Servers Require Software Assurance

Prior to SQL Server 2014, many shops were able to deploy a single standby server without licensing SQL Server. Log shipping, mirroring, and even failover clustering allowed for an unlicensed passive node, provided that the passive node didn’t become the primary for more than 28 days.

That’s gone.

If you want to have a standby node, you’ve got to pony up and buy software assurance. Head over to the SQL Server 2014 Licensing Datasheet; at the bottom of page three, it reads “Beginning with SQL Server 2014, each active server licensed with SA coverage allows the installation of a single passive server used for fail-over support.” The passive secondary server doesn’t need to have a complete SQL Server license, but Software Assurance is a pricey pill to swallow. In short, Software Assurance (SA) is a yearly fee that customers pay to get access to the latest and greatest versions of products as well as unlock additional features that may have complex deployment scenarios.

In case you were confused, high availability is officially an enterprise feature. Note: I didn’t say Enterprise Edition. I mean enterprise with all of the cost and trouble that the word “enterprise” entails in our modern IT vernacular.

All Cores Must Be Licensed

You heard me.

To license a physical server, you have to license all of the cores. Don’t believe me? Check out this awesome screenshot:

License ALL THE CORES

It’s even more important to consider alternatives to having a number of SQL Servers spread throughout your environment. SQL Server consolidation and virtualization are going to become even more important as SQL Server licensing changes.

Finding new ways to analyze, tune, and consolidate existing workloads is going to be more important than ever before. Your ability to tune SQL Server workloads is going to be critical in successful SQL Server deployments. The days of worrying when the server hit 25% capacity are fading into history – as licensing costs increase, expect server density and utilization to increase, too.

Standard Edition has a new definition

“SQL Server 2014 Standard delivers core data management and business intelligence capabilities for non-critical workloads with minimal IT resources.” You can read between the lines a little bit on this one – SQL Server Standard Edition isn’t getting back mirroring or anything like it. In fact – SQL Server Standard Edition sounds an awful lot like the database that you use to run your non-critical ISV applications, SharePoint, and TFS servers.

Software Assurance Gives You Mobility

If you want to move your SQL Server around inside your VM farm, you need to buy Software Assurance. VM mobility lets teams take advantage of VMware DRS or System Center Virtual Machine Manager. This isn’t new for anyone who has been virtualizing SQL Servers for any amount of time. What is explicitly spelled out, though, is that each VM licensed with SA “can be moved frequently within a server farm, or to a third-party hoster or cloud services provider, without the need to purchase additional SQL Server licenses.”

In other words – as long as you’re licensed for Software Assurance, those SQL Servers can go anywhere.

SQL Server 2014 Licensing Change Summary

Things are changing. DBAs need to take stock of their skills and help the business get more value from a smaller SQL Server licensing footprint. Realistically, these changes make sense as you look at the broader commercial IT landscape. Basic features continue to get cheaper. More resources are available in SQL Server 2014 Standard Edition, but complex features that may require a lot of implementation time, and Microsoft support time, come with a heavy price tag.


What happens to in-flight data compression in an emergency?


Data compression can have many uses and advantages, but it also has its drawbacks. It’s definitely not a one-size-fits-all strategy. One of the things to be aware of is that initial compression of a table or index can take quite some time, and will be resource-intensive. It also is an offline operation, so the object can’t be accessed while it’s being compressed or uncompressed. (Clarification: by default, an ALTER TABLE statement is an offline operation. You can declare it online, but, as David notes in the comments, “Although the operation is ONLINE, it’s not completely idiot-proof.”)

So what would happen if your SQL Server service or server restarted while you were in the middle of (or, as it usually goes, 90% of the way through) compressing an index? Let’s investigate.

I have a table that is 1.1 GB in size, with a 1.0 GB nonclustered index.

Table name              Index name                      total_pages   PagesMB
bigTransactionHistory   pk_bigTransactionHistory        143667        1122
bigTransactionHistory   IX_ProductId_TransactionDate    131836        1029
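A query along these lines will show index sizes like the ones above (this is my sketch using sys.dm_db_partition_stats, not necessarily how these numbers were captured):

    SELECT OBJECT_NAME(ps.object_id)           AS [Table name],
           i.name                              AS [Index name],
           SUM(ps.used_page_count)             AS total_pages,
           SUM(ps.used_page_count) * 8 / 1024  AS PagesMB
    FROM   sys.dm_db_partition_stats AS ps
    JOIN   sys.indexes AS i
           ON i.object_id = ps.object_id
          AND i.index_id  = ps.index_id
    WHERE  ps.object_id = OBJECT_ID('dbo.bigTransactionHistory')
    GROUP BY ps.object_id, i.name;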

I need to reduce the size of the nonclustered index, so I decide to compress it. After using sp_estimate_data_compression_savings, I determine that I will benefit more from page compression.
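The estimate call looks roughly like this (index_id 2 is an assumption for the nonclustered index; check sys.indexes for the real value):

    EXEC sys.sp_estimate_data_compression_savings
         @schema_name      = 'dbo',
         @object_name      = 'bigTransactionHistory',
         @index_id         = 2,       -- the nonclustered index (verify in sys.indexes)
         @partition_number = NULL,    -- all partitions
         @data_compression = 'PAGE';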

I apply page compression, and allow my server to work.
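Applying it is an index rebuild with a compression option – something like this sketch, run as a plain offline rebuild (the default):

    ALTER INDEX IX_ProductId_TransactionDate
        ON dbo.bigTransactionHistory
        REBUILD WITH (DATA_COMPRESSION = PAGE);
        -- Add ONLINE = ON (Enterprise Edition) if you can't afford the offline window.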

Now, let’s say there is an emergency during this operation. Perhaps a component in the server breaks; maybe the data center loses power. (Or I restart the SQL Server service.) Uh-oh! What happened?

Once the service is running again, I check SSMS and see the following error.

What happened to the data? Is compression all or nothing, or is it possible that some pages are compressed and others aren’t? I first check the index size.

Table name              Index name                      total_pages   PagesMB
bigTransactionHistory   pk_bigTransactionHistory        143667        1122
bigTransactionHistory   IX_ProductId_TransactionDate    131836        1029

Nothing has changed – the index is not one byte smaller. This tells me the operation was not successful.

I also check the error log to see what information it can provide.

error log rollback

Two transactions were rolled back. As I’m the only person in this instance right now (one of the benefits of a test environment), I know those were mine.

SQL Server has treated my data compression operation as a transaction. If there is a restart at any point, the operation will be rolled back to maintain data integrity.


Are Table Variables as Good as Temporary Tables in SQL 2014?

There are a couple of new features in SQL Server 2014 that provide options for how you work with temporary objects. Will inline index creation or memory optimized temporary tables forever change the way you code? Let’s take a look!

Inline Index Creation

SQL Server 2014 brings us a TSQL improvement called “inline specification of CLUSTERED and NONCLUSTERED indexes.” This feature works in Standard Edition and applies to many types of table. This improves the functionality of table variables. But does it make table variables as good as temp tables?

First, let’s take a look at the feature, then take it for a test drive.

Creating a table with an inline nonclustered index

Here’s a simple look at the new feature– when I create a table (nothing temporary about it), I can name and define an index on multiple columns. In this case, I’m creating a nonclustered index on the columns j and k:
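Here’s a sketch of that syntax (the table and index names are mine):

    CREATE TABLE dbo.InlineIndexDemo (
        i INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
        j INT NOT NULL,
        k INT NOT NULL,
        INDEX ix_InlineIndexDemo_j_k NONCLUSTERED (j, k)   -- named, multi-column, declared inline
    );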

I also have the option to put all my constraint and index create statements at the end of the table, like this:
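Roughly like so, keeping the column list clean and stacking the constraint and index definitions at the bottom:

    CREATE TABLE dbo.InlineIndexDemo2 (
        i INT IDENTITY(1,1) NOT NULL,
        j INT NOT NULL,
        k INT NOT NULL,
        CONSTRAINT pk_InlineIndexDemo2 PRIMARY KEY CLUSTERED (i),
        INDEX ix_InlineIndexDemo2_j_k NONCLUSTERED (j, k)
    );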

What about temp tables?

The same syntax for inline index create listed above works just fine for me on temporary tables in my CTP of SQL Server 2014.

This is a good thing for some people! One of the issues with temporary tables in stored procedures is that creating an index on the temp table can prevent caching of the temporary item. For most people, this is no big deal, but for some frequently run stored procedures, it might make a difference.

Creating the nonclustered indexes on the temp table at creation time (instead of afterward) can improve temp table caching in stored procedures.

Before you rejoice, there’s some fine print. If you change your temp table syntax to use inline index creation and it enables caching, you might run into issues where statistics aren’t updated on your temporary tables when you think they would be. (The short version: statistics are also cached for temp tables, and Gremlins keep them from updating very frequently.) Just test carefully if you’ve got very frequently run stored procedures you’re modifying.

Inline index creation on table variables

The new syntax works on table variables, too! This means that with SQL Server 2014 and higher, you can create non-unique nonclustered indexes on table variables. You can even set the fillfactor option. (Note: I’m not saying you should use a lower fillfactor– I was just surprised that option was available for table variables.)

With the new SQL 2014 syntax, you will also have the option to create a non-unique clustered index on a table variable:
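A sketch of the table variable syntax (column names chosen to line up with the test further down):

    DECLARE @Addresses TABLE (
        AddressID INT NOT NULL,
        City NVARCHAR(30) NOT NULL,
        StateProvinceID INT NOT NULL,
        INDEX cx_Addresses CLUSTERED (AddressID),                        -- non-unique clustered index
        INDEX ix_Addresses_StateProvinceID NONCLUSTERED (StateProvinceID)
    );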

So, are table variables as good as temp tables now?

Well, sorry, not generally.

First, there are some limitations to the “inline index create” feature. You can’t inline every kind of index you might want — you can’t add included columns or filters to indexes created with an inline create statement. Since you can add indexes with those options to a temporary table after it’s created (and you CANNOT do that with a table variable), the temp table has an advantage.

But temporary tables still have another advantage.

Statistics help temporary tables

When looking at the new feature, I wondered if these new indexes declared inline might secretly allow some sort of populated statistic on table variables– which hasn’t worked before.

But, unfortunately no. Even using the inline indexes, table variables do NOT get statistics to help the optimizer psychically predict how many rows will come back based on a specific predicate value. That can cause you to get less efficient execution plans. Let’s look at an example.

First up, the temp table:
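The test in the screenshots uses AdventureWorks-style address data; here’s a sketch of the shape of it (the exact tables and predicate value in the original may differ):

    CREATE TABLE #Addresses (
        AddressID INT NOT NULL PRIMARY KEY CLUSTERED,
        City NVARCHAR(30) NOT NULL,
        StateProvinceID INT NOT NULL,
        INDEX ix_Addresses_StateProvinceID NONCLUSTERED (StateProvinceID)
    );

    INSERT #Addresses (AddressID, City, StateProvinceID)
    SELECT AddressID, City, StateProvinceID
    FROM   Person.Address;

    SELECT AddressID, City
    FROM   #Addresses
    WHERE  StateProvinceID = 163;    -- placeholder value; pick one with a small row count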

For the temporary table, SQL Server uses statistics associated with the nonclustered index to estimate that it will get 25 rows back (which is right on). Based on this it decides to seek to the rows in the nonclustered index, then do a nested loop lookup to fetch the City column from the clustered index. It does 52 logical reads:

Key Lookup

Now let’s run the same code, but with a table variable with an inline index:
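Same sketch, table variable edition:

    DECLARE @Addresses TABLE (
        AddressID INT NOT NULL PRIMARY KEY CLUSTERED,
        City NVARCHAR(30) NOT NULL,
        StateProvinceID INT NOT NULL,
        INDEX ix_Addresses_StateProvinceID NONCLUSTERED (StateProvinceID)
    );

    INSERT @Addresses (AddressID, City, StateProvinceID)
    SELECT AddressID, City, StateProvinceID
    FROM   Person.Address;

    SELECT AddressID, City
    FROM   @Addresses
    WHERE  StateProvinceID = 163;    -- same placeholder predicate as the temp table test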

Oddly enough, it gets a clustered index scan. It estimates that only one row will be found — that’s because for table variables, statistics associated with the nonclustered index still can’t be populated. So it doesn’t know to estimate the 25 rows, and it just guesses one.

With a one row estimate, I thought the optimizer would surely go for the nested loop lookup. Looking up just one row is easy, right? But instead it decided to scan the clustered index. (Note: The nonclustered index is functional– it will use it if I omit the City column from the query. But it does not seem to want to do a key lookup from it in my testing here. Wacky!)

Clustered Index Scan

Finding: Inline Index creation is the most awesome for temp tables

I like the new syntax for its added flexibility. I do wish it allowed included columns, but for indexes with just key columns it can make for tidier, more compact code.

This feature doesn’t fix all the flaws of table variables– it’s interesting that I didn’t get a nested loop lookup in my testing, which makes me wonder if the optimizer has as many options with the indexes on table variables.

The ability of temporary tables to have column and index related statistics still gives them a great advantage in most situations. Using the inline index creation script on temporary tables in stored procedure to improve caching is a nice little bonus. Even with the gotcha I linked to above about statistics updates on temp tables, I think this feature makes the case for temporary tables even stronger.

What about Memory Optimized Table Variables?

First off, this feature is more expensive. SQL Server 2014 adds in new “memory optimized” tables, AKA “Project Hekaton”. This feature is only available in Enterprise Edition. I won’t come close to covering the whole feature here– I’ll just scratch the surface of one of its uses: the “Memory Optimized Table Variable”.

The first thing of note is that memory optimized tables DO support statistics— so does that mean that a Memory Optimized Table Variable might have them? Let’s take a look!

First, let’s enable memory optimized objects

To test this feature out, I need to make some changes to my test database:
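Roughly like this (the database name and file path are placeholders):

    ALTER DATABASE TestDB
        ADD FILEGROUP TestDB_memopt CONTAINS MEMORY_OPTIMIZED_DATA;

    ALTER DATABASE TestDB
        ADD FILE (NAME = 'TestDB_memopt_dir',
                  FILENAME = 'C:\MSSQL\Data\TestDB_memopt_dir')
        TO FILEGROUP TestDB_memopt;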

Memory_Optimized = ON !

Now I can start creating my memory optimized table variable. First, I must create a table type with the table variable’s definition. I’m not going to test this in a natively compiled stored procedure just yet, so I’m leaving off the identity (it’s not supported in this scenario). I also use some special collations and don’t allow nullability on some columns to get the whole thing to work.
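Here’s a sketch of the table type. The BIN2 collation on the string column and the index are the kinds of things SQL Server 2014 insists on for memory optimized objects; the names are mine:

    CREATE TYPE dbo.AddressesMemOpt AS TABLE (
        AddressID INT NOT NULL PRIMARY KEY NONCLUSTERED,
        City NVARCHAR(30) COLLATE Latin1_General_100_BIN2 NOT NULL,
        StateProvinceID INT NOT NULL,
        INDEX ix_StateProvinceID NONCLUSTERED (StateProvinceID)
    )
    WITH (MEMORY_OPTIMIZED = ON);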

All right, now that our table type exists, we can put it to use! Let’s populate it and query it and check out what plan we get:
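And then, roughly:

    DECLARE @Addresses dbo.AddressesMemOpt;

    INSERT @Addresses (AddressID, City, StateProvinceID)
    SELECT AddressID, City, StateProvinceID
    FROM   Person.Address;

    SELECT AddressID, City
    FROM   @Addresses
    WHERE  StateProvinceID = 163;    -- same placeholder predicate as before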

There’s a little surprise in the execution plan:

Index Seek on Memory Optimized Index

Estimated rows is still 1– statistics still aren’t working here. But magically we get an index seek on the nonclustered index instead of a scan on the clustered index. This is remarkable because back in the normal world of non-memory-optimized indexes, the City column would not be in that nonclustered index– we only asked for it to contain StateProvinceID!

Memory Optimized Indexes are Always Covering

In the brave new world of Memory Optimized tables, indexes have a motto: Always Be Covering. We’re not in Kansas any more Toto, and there’s no such thing as a Key Lookup or Bookmark Lookup anymore.

This means a couple of things: for these tables, it really doesn’t matter that you don’t have the INCLUDE syntax on inline index creation. Nonclustered indexes will always have all the columns, so why bother with an INCLUDE clause? (I’m guessing that’s why it’s not possible in the new syntax!) Also, memory optimized indexes may be larger than you’d assume because they cover everything.

Summing Up

For 99% of the use cases I see in the wild these days, temporary tables beat table variables easily because their ability to have statistics improves query optimization dramatically. That remains true with the enhancements in SQL Server 2014.

For the 1% of cases where stored procedures using temporary objects are run at high rates of frequency, there are a few new choices that are attractive. They both have gotchas and require testing, but they could both turn out pretty well:

Option 1: Inline Nonclustered Indexes on Temp Tables: This works in Standard Edition, so I think this will be the most widely adopted. Instead of creating a nonclustered index after the temp table exists, you can improve caching of your temp tables by creating nonclustered indexes with the temp table itself. Just make sure you don’t get into trouble with out of date statistics.

Option 2: Memory Optimized Table Variables: For you lucky people with Enterprise Edition, you can test these out. You won’t want to do exactly what I did above — you’ll probably want to use natively compiled stored procedures. You’ll definitely want to test at scale and make sure you can monitor performance of these new structures at load. That may be a little tricky: you don’t get actual execution plan or cost information for natively compiled procedures. Given the fact that the memory optimized table variables didn’t show any statistics in my testing, in complex implementations you could end up with some strange execution plans. (This post hasn’t covered questions like “How much memory are these structures using?” “How much memory do you have left?” “What happens if you run out?”) In short: there’s much to learn here– don’t rush in without making sure you can support it.


Microsoft Cloud Rebranded as Microsoft Pale Blue


It’s time to learn another new set of acronyms.

Effective today, Microsoft’s as-a-service brand is changing names again. As recently as last week, the product’s name had been changed from Windows Azure to Microsoft Azure, but industry observers noted that Microsoft’s web pages actually referred to a different name – Microsoft Cloud.

“Our research found that the primary barrier to Azz..As..Accu..cloud adoption was pronunciation,” said an inside source. “No one could say the damn word correctly, and nobody wanted to look stupid, so they just recommended Amazon Web Services instead.”

Thus the new name, Microsoft Cloud – but it ran into more branding problems right away, said the source. “We tried to trademark our virtual machines and databases, but you-know-who had a problem with our names, MC-Compute and MC-Database. People kept calling them McCompute and McDatabase. It probably didn’t help that our combination program was called the Value Menu.”

Enter the New Brand: Microsoft Pale Blue

The new Microsoft Pale Blue logo
The new Microsoft Pale Blue logo

Satya Nadella, Microsoft’s new CEO, picked the name himself. “Microsoft Pale Blue captures the wide-open possibilities of the empty sky. Everybody knows that blue is the best color for logos, so why not take it to the next level? Let’s use the color name as our brand.”

“Microsoft has learned to play to their core strength – product rebranding,” said industry analyst Anita Bath. “Nobody goes through product names like they do. Metro, Vista, Zune, PowerWhatever, Xbone, you name it, this is a company that understands brands are meaningless.”

Nadella has realigned Microsoft’s organizational structure to support the new mission. “Developers are building more and more applications with cloud-based services and Javascript. We have to understand that it’s the right combination for today’s agile startups.” The new Pale Blue & Javascript division will be led by John Whitebread, a developer widely known in the community as the beginning and end of this kind of work.

“We’re also announcing new datacenters – or as we call them, Pale Blue Regions – in China, North Korea, and Iran,” said Microsoft spokesperson Pat McCann. “We don’t believe politics should stop people from having access to the best technology, and we’re committed to aggressively growing our regions. Anytime we see new cloud demand, we’ll open a PBR.”

Today’s announcements did not include any numbers about customers or revenues, however, and questions remain. A few European reporters at today’s announcement asked Nadella if he thought security concerns around Microsoft reading customer data or NSA back doors might be barriers to cloud adoption. Nadella paused for a moment, then said, “No, no way. That can’t be it. It’s gotta be the brand name.”


Refactoring T-SQL with Windowing Functions


You’ve been querying comparative numbers like Year To Date and Same Period Last Year by using tedious CTEs and subqueries. Beginning with SQL Server 2012, getting these numbers is easier than ever! Join Doug for a 30-minute T-SQL tune-up using window functions that will cut down dramatically on the amount of code you need to write.
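As a teaser, here’s a hedged sketch of the kind of refactoring the video walks through: a running year-to-date total written once with a correlated subquery and once with a window function (the table and column names are placeholders, not the demo scripts below):

    -- The old way: a correlated subquery per row.
    SELECT s.SaleDate,
           s.Amount,
           (SELECT SUM(s2.Amount)
            FROM   dbo.Sales AS s2
            WHERE  YEAR(s2.SaleDate) = YEAR(s.SaleDate)
              AND  s2.SaleDate <= s.SaleDate) AS YearToDate
    FROM   dbo.Sales AS s;

    -- The new way (SQL Server 2012+): one pass with a window function.
    SELECT s.SaleDate,
           s.Amount,
           SUM(s.Amount) OVER (PARTITION BY YEAR(s.SaleDate)
                               ORDER BY s.SaleDate) AS YearToDate
    FROM   dbo.Sales AS s;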

Looking for the scripts? Grab them below the video!

Script 1: Create Windowing View

Script 2: The Old Way of Querying

Script 3: The New Way of Querying


How to Add Nonclustered Indexes to Clustered Columnstore Indexes

SQL Server 2012 introduced nonclustered columnstore indexes, but I never saw them used in the wild simply because once created, they made the underlying table read-only. Not a lot of folks like read-only tables. (Bad news, by the way – that limitation hasn’t disappeared in 2014.)

SQL Server 2014 brings clustered columnstore indexes, and they’re totally writeable – you can insert, update, or delete into them. They’re best suited for wide data warehouse fact tables that have lots of columns, but your queries might only want a few of those columns, and they might pick any of the columns for filtering. These types of tables are notoriously difficult to index, and columnstore indexes can give you dramatic performance improvements here.

Inventor of The Locke Technique
Books Online says you’d better be sure your access patterns all benefit from columnstore indexes, because you can’t add any nonclustered indexes to your columnstore tables. The CREATE CLUSTERED COLUMNSTORE INDEX syntax page explains that the clustered columnstore “is the only allowable index on the table,” meaning you can’t add non-clustered indexes.

Or can you?

Allow me to demonstrate what I call the Locke Technique:
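The original demo script isn’t reproduced here, but the shape of the trick is an indexed, schema-bound view over the columnstore table. Here’s a sketch with made-up table and column names:

    -- A clustered columnstore fact table: no other indexes allowed directly on it.
    CREATE TABLE dbo.FactSales (
        SaleId     BIGINT NOT NULL,
        SaleDate   DATE   NOT NULL,
        CustomerId INT    NOT NULL,
        Amount     MONEY  NOT NULL
    );
    CREATE CLUSTERED COLUMNSTORE INDEX ccsi_FactSales ON dbo.FactSales;
    GO

    -- The Locke Technique: a schema-bound view over the table...
    CREATE VIEW dbo.vFactSales
    WITH SCHEMABINDING
    AS
    SELECT SaleId, SaleDate, CustomerId, Amount
    FROM   dbo.FactSales;
    GO

    -- ...materialized with a unique clustered index, after which ordinary
    -- nonclustered indexes can be piled onto the view.
    CREATE UNIQUE CLUSTERED INDEX cx_vFactSales ON dbo.vFactSales (SaleId);
    CREATE NONCLUSTERED INDEX ix_vFactSales_CustomerId ON dbo.vFactSales (CustomerId, SaleDate);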

Presto – you can have as many nonclustered indexes as you need – and technically, in a strange twist, the clustered index on the view is a nonclustered index on the table as well.

These have the same drawbacks as any other nonclustered index: slower inserts/updates/deletes on the underlying table, more complex query tuning, more space requirements, and so on. They also have even more drawbacks because the schema-bound view means you can’t alter the columnstore table without first dropping the nonclustered indexes and the view.


After You’ve Been a DBA, What Do You Do?

You’ve been managing database servers for a few years.

Now what?

Get more technical – every time a new project comes in, sharpen your skills. Get more involved in failover clustering, storage, virtualization, or the cloud. Write technical presentations to teach your own company’s staff, and then the public.

Business intelligence – if you’re constantly surrounded by valuable data, and you’ve got a creative eye, you can help make sense out of all those letters and numbers. There’s never been a better time to help people get actionable insight out of your tables. It’s a wildly different career path than DBA – it involves more time with people, and less time with maintenance plans.

Consult – get technical enough, and develop a reputation for solving tough problems quickly, and you can change gears entirely. Instead of working for a single company, you can move around from company to company, giving guidance on how to put out more fires faster.

Contract – in some markets, you can keep doing what an employee does, but as an outsider. Consultants tell companies what to do, and contractors do what they’re told. If you enjoy the job duties and don’t really care for the big picture, contracting can be more lucrative.

Management – if you have enough IT pros around you, and you’ve got a knack for people skills, you can really shift gears and manage people instead of servers. Caution, though: this has nothing to do with databases whatsoever. The skills that served you so well troubleshooting a cluster may not help you motivate an employee you don’t like to do a task they hate.

Architecture – your constant interactions with developers teach you more and more about building large systems. Because you’re more senior than other folks in the shop, you end up giving design guidance. Eventually you spend more time doing guidance than managing databases, or they hire a junior DBA in to backfill your production duties.

Stagnate – keep doing exactly the same thing. You’re going to get all of the servers into pretty good shape – not perfect, but good enough that your phone isn’t ringing every night. You’ll have more time to spend playing 2048 at work, but you’re not going to make dramatically more money if you’re not doing dramatically better work.

So what’s your plan?