It’s tempting to think that table partitioning will improve query performance. After all, it’s an Enterprise Edition feature– it must have a lot of magic, right?
Table partitioning does have magic for the right situations. It shines when you want to add a large amount of data to a table or remove a large amount of data in chunks! The “partition switching” feature can essentially help you avoid doing inserts and deletes against a live production table, and it pairs beautifully with cool new features like ColumnStore indexes. For things like fact tables in a data warehouse, it’s drool-worthy.
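To make the switching idea concrete, here's a rough sketch of what partition switching looks like (the table and partition numbers are hypothetical, just for illustration). The switch is a metadata-only operation, so the live table never sees the big insert or delete:

```sql
-- Load new data into a staging table, then switch it into the live table.
-- dbo.FactSales_Staging must match dbo.FactSales in schema, and live on
-- the right filegroup with a matching check constraint.
ALTER TABLE dbo.FactSales_Staging
    SWITCH TO dbo.FactSales PARTITION 42;

-- Purging old data is just as cheap: switch the stale partition out,
-- then truncate or drop the archive table at your leisure.
ALTER TABLE dbo.FactSales
    SWITCH PARTITION 1 TO dbo.FactSales_Archive;
```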
But for other uses, table partitioning can be a lot of work to implement and it can slow your queries down. And it may not always be obvious to you why your queries are slower. Let’s take a look at a simple example.
Let’s partition the Posts table
We’re using the StackOverflow sample database. The Posts table contains questions, answers, and other types of posts, each of which has a different PostTypeId.
Let’s say that our workload does many point lookups on a post based on an integer Id column in the table. We have some aggregate queries that run looking for top scoring questions, recent answers, and things like that, and most of those queries specify PostTypeId. We might be tempted to partition the table on PostTypeId.
For comparison purposes, we create two tables with the exact same columns, data types, and rowcounts:
- dbo.PostsNarrow has a unique clustered index on the Id column
- dbo.PostsPartitioned is partitioned on PostTypeId and has a unique clustered index on Id, PostTypeId
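For reference, here's roughly how the partitioned table could be set up. (The boundary values and filegroup choices here are assumptions for illustration, not necessarily the exact demo setup.)

```sql
-- One partition per PostTypeId value, RANGE RIGHT so each boundary value
-- starts its own partition.
CREATE PARTITION FUNCTION pf_PostTypeId (int)
    AS RANGE RIGHT FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8);
GO
-- Keeping everything on PRIMARY for simplicity.
CREATE PARTITION SCHEME ps_PostTypeId
    AS PARTITION pf_PostTypeId ALL TO ([PRIMARY]);
GO
-- A unique index on a partitioned table must include the partitioning
-- column -- which is why the clustered index is on (Id, PostTypeId),
-- not just Id.
CREATE UNIQUE CLUSTERED INDEX cx_PostsPartitioned
    ON dbo.PostsPartitioned (Id, PostTypeId)
    ON ps_PostTypeId (PostTypeId);
GO
```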
For queries that do specify PostTypeId, we’re hoping that we can get partition elimination. (This can be tricky sometimes due to bugs with min/max syntax, or weird implicit conversion issues, or even optimizer issues with joins). But performance can also be tricky for simple queries which don’t specify PostTypeId, even when partition elimination isn’t an issue.
Our example query
Our example query pulls the people who are most active– they write the most questions and answers. Here’s the version of the query that runs against PostsNarrow:
WITH freq AS (
    SELECT TOP 100 OwnerUserId, COUNT(*) AS PostCount
    FROM dbo.PostsNarrow
    GROUP BY OwnerUserId
    ORDER BY COUNT(*) DESC
)
SELECT u.DisplayName, freq.PostCount
FROM freq
JOIN dbo.Users u ON u.Id = freq.OwnerUserId;
GO
We have the following index on both PostsNarrow and PostsPartitioned. (The included columns are to support other queries.)
CREATE INDEX ix_Posts_OwnerUserId_INCLUDES
    ON dbo.PostsPartitioned (OwnerUserId)
    INCLUDE (PostTypeId, AcceptedAnswerId, CreationDate, LastEditorUserId);
GO
Run against the non-partitioned table, this query uses 3344 ms of CPU time and has an elapsed time of 1039 ms.
Just changing the query to run against dbo.PostsPartitioned, the query goes to 5672 ms of CPU time and the elapsed time is 2437 ms– it’s twice as slow!
The secret is in the execution plans
Comparing the plans for the two queries, we can see that SQL Server is doing something different inside the CTE where it groups by OwnerUserId.
Non-Partitioned Plan Excerpt (showing actual row counts):
Partitioned Plan Excerpt (showing actual row counts):
The Non-Partitioned plan is able to stream the data from the index scan directly into a stream aggregate operator to do the group by OwnerUserId. The partitioned plan has to repartition the streams, then it has to put all the data in a Hash Match aggregate and compare all the buckets.
The hash match aggregate is more work
The hash match operator is what’s slowing this query down– it requires a larger memory grant and it has to do more work. But SQL Server has to use it for our query because our non-clustered index on OwnerUserId is partitioned.
In other words, the data in PostsPartitioned’s OwnerUserId index is like this:
- Partition 1: all rows for its PostTypeId, ordered by OwnerUserId, with included columns
- Partition 2: all rows for its PostTypeId, ordered by OwnerUserId, with included columns
- Partition 3: all rows for its PostTypeId, ordered by OwnerUserId, with included columns… and so on, one sorted range per partition
Any given OwnerUserId can be in multiple partitions, and we have to find them all. That’s why we can’t use the stream aggregate operator.
Can’t we create a non-aligned index?
We can! We have the option to create a non-aligned (non-partitioned) index on dbo.PostsPartitioned’s OwnerUserId column. Since it’s not partitioned, everything is purely sorted by OwnerUserId and we can get the stream aggregate back.
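Here's a sketch of what that non-aligned index could look like (the index name is mine). The key is the ON clause: specifying a filegroup instead of the partition scheme is what makes the index non-aligned.

```sql
-- Built ON [PRIMARY] rather than ON the partition scheme, so the whole
-- index is a single structure sorted purely by OwnerUserId.
CREATE INDEX ix_Posts_OwnerUserId_NonAligned
    ON dbo.PostsPartitioned (OwnerUserId)
    INCLUDE (PostTypeId, AcceptedAnswerId, CreationDate, LastEditorUserId)
    ON [PRIMARY];
```

Forget the ON clause and the index silently defaults to the table's partition scheme — it comes out aligned, which is easy to miss.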
But, wait a second. Let’s pause and think about this.
The whole point of using table partitioning was to partition things. Creating a bunch of non-aligned indexes is tantamount to admitting defeat: we can’t get partition elimination against those indexes for queries that DO use the partitioning key. (There are other downsides, too, like preventing partition switching.) And we will get into a whole different set of problems if we have to keep partitioned and non-partitioned copies of the same index.
This is just a simple example. What if it were more complex?
I’m going to let you in on a little secret: it took me a bit of time looking at this example to figure out why the second query was slower. Before writing this post, I had to fix several red herrings involving statistics differences between the tables – they distracted me for a few more minutes than I’m willing to admit.
And I had to think back to some of Jeremiah’s modules in our Advanced Querying and Indexing course to remember the difference between those two operators.
And the first time I created a non-aligned index on the partitioned table to verify I could get the stream aggregate operator back, I accidentally created it as aligned, and got even more confused.
Admittedly, I enjoy this kind of thing and am saving another example of a query that goes from 1 ms to 11 seconds for a treat to dig into and write a fun demo with, but most people would rather just keep the 1ms query and not worry about it!
Let’s sum it up
Here’s what to take away from all of this:
- Table partitioning can make some queries slower
- Partitioning adds complexity to troubleshooting and tuning queries – it takes more time and more effort
- Non-aligned indexes aren’t an easy answer: if we need to resort to a lot of them, how much are we actually benefitting from table partitioning?
For these reasons, table partitioning is typically not a great fit for SQL Servers with an OLTP pattern where slow queries are the biggest pain point. Traditional index tuning and query rewrites will usually get you better performance with less hassle.
After reading through yesterday’s announcements about VMware vCloud Air SQL, here’s a quick boiled-down version for busy people with jobs:
- It’s Microsoft SQL Server 2008R2 and 2012 Enterprise Edition hosted in VMware’s data centers (not on-premises.)
- Only 3 instance sizes are available: small (2vCPU, 8GB RAM), medium (4vCPU, 16GB RAM) and large (8 vCPU).
- Drive space is limited to 150GB today (including data, log, and tempdb) but will be 2TB at official launch.
- Licensing is not included yet – you have to bring your own EEs. (Unfortunately, due to the way MS License Mobility works, that means you have to move your licenses to vCloud Air SQL for a minimum of 90 days. Not many of us have spare licenses for this type of thing.)
- The instance will need a public IP address if you want to access it from outside of VMware’s data centers. (See the “secure access” section on page 3 of the Solutions Brief PDF.)
- The instances appear to be single virtual machines – the only high availability is the VMware host protection. (I can’t find mention of an SLA for data loss.)
- You get a dedicated VM (not a dedicated host), but you can’t access the VM. You can only access SQL Server through your chosen port (typically 1433).
- To get your data into vCloud Air, you’ll be pushing it through ETL products or insert statements. Restores are not supported.
In summary, if it had been announced three years ago, it still wouldn’t have competed with Amazon RDS for SQL Server back then – here’s our initial 2012 post about the RDS release. Amazon has more options for instance sizing, drives, licensing, connectivity, high availability – heck, even database platforms, because AWS supports MySQL, Oracle, SQL Server, and even their own flavor, Aurora.
I don’t get why VMware would enter this market, at this time, with this product.
5. Turning on auto-shrink
It might make disk space alerts go away for a minute, but it can cause lots of problems.
4. Enabling xp_cmdshell and giving the SQL Server account administrative rights in Windows
Young DBAs often do this to get a quick fix in place to manage backup files. Later on, they realize they could manage all those things with PowerShell or cmd SQL Agent job steps and grant more limited permissions. But by then they’re not 100% sure what will break if they disable the option, so they live in fear of a security audit.
3. Running transaction log backups every 30 minutes, only during the day
“How often should we run these backups?” (Does internet search) “30 minutes seems popular.”
Your backup frequency should be driven by RPO and RTO. And don’t turn them off at night– usually people can still make changes in the system, and there’s no reason to have a giant log backup in the morning.
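A log backup itself is one statement; frequency is the real decision. Here's a minimal sketch (the database name and backup path are hypothetical) of the kind of statement an Agent job would run on whatever schedule your RPO demands:

```sql
-- Back up the log to a timestamped file. In practice you'd schedule this
-- in a SQL Agent job (or use a maintenance solution) around the clock,
-- not just during the day.
DECLARE @stamp nvarchar(20) =
    CONVERT(nvarchar(8), GETDATE(), 112) + N'_' +
    REPLACE(CONVERT(nvarchar(8), GETDATE(), 108), N':', N'');
DECLARE @file nvarchar(260) =
    N'D:\Backups\MyDatabase_' + @stamp + N'.trn';

BACKUP LOG MyDatabase TO DISK = @file;
```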
2. Looking for one magical feature that will fix all performance problems
When you first start out, it seems like all those expensive enterprise features must make everything blazing fast. Later on, you realize that humble things like indexes, memory, and carefully written TSQL are worth all the effort.
1. Being Rude to All the Developers
As a beginner DBA, it’s easy to work long hours, get stressed out, and start to feel like you’re the only one protecting the SQL Server. You start to say “no” a lot. And sometimes you lose your respect for other people.
But you can’t let that keep happening: if you go down that road, it’ll ruin your job. You’ve got to regularly step back, see things from other people’s perspective, and be part of the team.
For this year’s company retreat at the Oregon coast, we asked awesome photographer Justin Bailie to shoot updated team pictures for the site and this year’s Christmas cards. Enjoy:
In Azure SQL Database, no one can hear you scream
…or query the common system objects that tell you when a restart happened. You don’t get the access you need to sys.dm_os_sys_info, sys.dm_exec_sessions, sys.traces, or sys.databases.
The closest I’ve been able to get is to query sys.dm_os_wait_stats for several common wait types that seem to correspond with clock time – meaning, for every minute on the clock, we get about one minute of these waits.
If we subtract those waits from today’s date, we can get a rough idea of when the server restarted:
SELECT DATEADD(ms, AVG(-wait_time_ms), GETDATE()) AS approximate_restart_date
FROM sys.dm_os_wait_stats w
WHERE wait_type IN (
    'DIRTY_PAGE_POLL',
    'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
    'LAZYWRITER_SLEEP',
    'LOGMGR_QUEUE',
    'REQUEST_FOR_DEADLOCK_SEARCH',
    'XE_DISPATCHER_WAIT',
    'XE_TIMER_EVENT');
I’m using an average here because each of those waits actually reports slightly different numbers as shown here:
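To see the spread for yourself, the same query without the AVG returns one estimate per wait type:

```sql
-- One estimated restart time per wait type; the estimates cluster within
-- a few minutes of each other, which is why averaging them works.
SELECT wait_type,
       DATEADD(ms, -wait_time_ms, GETDATE()) AS estimated_restart_date
FROM sys.dm_os_wait_stats
WHERE wait_type IN (
    'DIRTY_PAGE_POLL',
    'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
    'LAZYWRITER_SLEEP',
    'LOGMGR_QUEUE',
    'REQUEST_FOR_DEADLOCK_SEARCH',
    'XE_DISPATCHER_WAIT',
    'XE_TIMER_EVENT')
ORDER BY estimated_restart_date;
```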
Best I can tell, this SQL Server restarted on July 14th at around 3:09-3:16AM, and the average reports 2015-07-14 03:11:51.570.
Who cares when the server restarted? sp_AskBrent® cares – because folks have started asking me why their Azure SQL Database is slow, so I made it Azure-friendly. Now when you use the @Seconds = 0 parameter, you can see overall wait stats since server restart – just like your on-premises SQL Servers. (ASD also has sys.dm_db_wait_stats, but I like seeing what the server’s overall bottleneck is, too.)
Database Mirroring still matters– a lot.
SQL Server’s Database Mirroring feature is a powerful tool and much easier to manage than Availability Groups. It can give you simple high availability with relatively low support requirements, and it even works in Standard Edition.
But nothing is perfect. Database Mirroring can have big problems like “split brain”, suspended mirrors, and poor performance.
Never fear: if you know what to look out for, you can get all the high availability with none of the headaches.
Get One Hour of Free Training on Mirroring!
Join me on August 19th at 10 am Central for a free Idera Geek Sync webcast! You’ll learn when Database Mirroring is the right choice, common rescue scenarios, and special setup required for mirrors. Register here.
There are two kinds of recruiters. The first kind is the Relationship Recruiter. These are the great ones. These recruiters take time to listen, not just to you but to their clients. They try their best to match you and your skills with a client and their needs. If the two sides don’t match well, they don’t try to force it to work. People-based recruiters stay up-to-date on your career. They ask what kind of work you want to be doing — not what you have been doing — and try to place you somewhere that will help you get there.
Then there’s the second kind of recruiter. The Shotgun Recruiter. These are the people whose email you reflexively trash because you already know how bad it’s going to be. The ones who will send you an opening for a Visual Studio Forms Designer when you’ve been an operations DBA for twelve years and only listed Visual Studio on your LinkedIn profile because you had to emergency repair an SSIS package late one night. I feel for these people; recruiting is hard work. But it’s still work (or at least it should be):
We want to hear your Shotgun Recruiter stories. We know you have them. We know they’re amazing. Send us your worst/funniest/strangest recruiter stories and we’ll share the cream of the crop.
But that’s not all.
We don’t want to destroy your faith in humanity without building it back up again. Therefore, we also want your stories of surprisingly spectacular recruiters — those who went out of their way to make people (or at least the two of you) happy. Like we said, there are two kinds of recruiters. We want to hear about them both.
If you’ve got a great recruiter story and want to share it on Dear $firstname, please send it to email@example.com. Thanks!
Brent says: I get so many of these emails that I had to come up with an email template for recruiter replies.
I’m delighted and honored to be presenting two sessions at the 2015 PASS Summit in Seattle: “Living and Dying by Dynamic SQL” and “SQL Server Mystery: Dead Reports Don’t Talk”. Since I’ve never done either of these sessions at the Summit before, here’s a little more information about the sessions to help you decide whether to attend:
- Although the titles differ, these are both murder mystery sessions, modeled after a murder mystery dinner party. Except we won’t be serving food.
- Like a murder mystery party, you’ll be conversing with the people around you. You’ll discuss clues and work as a group to solve the mystery.
- The suspects will be sitting in the crowd too; you may end up next to one of them.
- The mystery loops through a pattern 3-4 times: topic – interview – clue. This means every 2-3 minutes, we’re moving on to the next step and keeping things lively.
- The two main goals of the session are for you to learn more about 1) the technical topic and 2) the people you’re sitting with. The PASS Summit is as much (if not more so) a networking event as it is an educational event. I want to emphasize both points in the same session.
- In the coming months, I’ll be retweeting people at Cromulent Technologies — people you’ve never heard of. If you’re thinking of coming to my session(s), pay special attention to these retweets. Cromulent Technologies is the workplace of the principal characters in our murder mysteries.
- If you’re familiar with other speakers in the SQL Server community, you’ll probably recognize a few of the suspects.
I promise you, you’ve never seen SQL Server presentations quite like these. I hope to see you there! Got questions? Ask away in the comments!
Brent says: I love stuff like this that breaks up the monotony of conference sessions. Make me think in new ways and keep the conference lively – that keeps me engaged.
… is to use a SQL Server where a robot does it for you!
Let’s take a look at how you can do this in Amazon RDS’s hosted SQL Server these days.
Normally, restoring transaction logs is super tedious.
You’ve got to restore all the files in the right order, for every single database. Even if you script it, restoring a bunch of databases is a giant pain. Amazon RDS has essentially done all the scripting for you.
Amazon RDS makes sure it can always restore your SQL Server to a point in time
When you set up an RDS SQL Server instance, it asks you how long you’d like to keep backups, and if you have a preferred full backup window. And that’s all it asks. Transaction log backups are taken every 5 minutes to support point in time restores, no matter what you want.
So I wondered, what happens if someone creates a database in the wrong recovery model by accident? Does this break the transaction log backups?
Nope! I created a database named DB1_CreatedInSIMPLE using the SIMPLE recovery model. Check out what showed up in the RDS logs right afterward:
How Recently Can I Restore?
You can see log backups in the “Recent Events & Logs” above, or you can just look at the latest restore time for your instance. Since transaction log backups are every five minutes, you can’t always restore to one minute ago.
OK, Let’s Restore This Thing! How Easy Is it?
I wasn’t lying when I said it was easy. You just highlight your instance in the console and select “Restore to Point in Time” under instance actions.
This is going to create a whole new instance using the backups they’ve been taking. You get to pick the time you want to restore from; the available range depends on how long the instance has existed and what your backup retention is set to. (Mine’s at 7 days; it can go up to 35 days.)
You also get to pick the instance name you’re going to restore to, along with all sorts of properties and security settings:
Setting up the restored instance isn’t immediate. RDS has to set up a new Windows instance and apply a whole lot of your backups, and that takes time.
After you get all your settings done, you click Launch to get your new RDS instance in motion to be created:
Our instance is Cooking
The instance goes through multiple phases. It automatically backs up the instance as part of the process if you’ve got backups enabled, just like any other instance.
Once that completes, I can connect
Here I am, connected to my original and restored instances via SSMS on my desktop!
SQL Server in RDS Isn’t Perfect, But It Does a Lot For You
I am not a founding member of Team Cloud. But I meet a lot of people who struggle making sure that backups are being taken properly and that they can restore. They’re frequently teams of developers who don’t want to learn to deal with backups and restore. They also want things like database mirroring without having to learn it, and RDS handles that for them, too.
This isn’t for everyone. RDS limits the number of databases you can have on an instance. It doesn’t do all maintenance for you — you have to set up your own CHECKDB jobs, for instance, which you can do in the SQL Server Agent. You don’t get tools like Database Mail inside the SQL Server, or the ability to see the SQL Server Error log via SSMS (don’t worry, you can get to that and other logs in the AWS Console). You can’t use every version of SQL Server out there (2008R2 and 2012 only, currently). You can’t do Availability Groups in Amazon RDS, either, just database mirroring. (I’m a big fan of mirroring, so I’m not all that sad about that.)
Putting it All Together
Database as a service is a pretty compelling option these days. It doesn’t take away the role of the DBA, and it doesn’t handle everything for you — but the flip side of that is that you get a lot of flexibility and customization.
And dang, I love the ease of restoring that instance to a point in time.
Brent says: what I love about the cloud is that it takes away the parts of the DBA job that I never really liked in the first place.
Jeremiah says: I’m with Brent – I love getting rid of the parts of the job I never liked. Cloud lets me do that. And RDS lets me ease up management on a certain class of servers even more.
Erik says: I, for one, welcome our new Robot DBA overlords.
I recently came across a curious case where a SQL Server was suffering from a number of long-running queries coming from an application written with Entity Framework. When I measured the average query execution times, I got some unexpected results:
- CPU time: 12 milliseconds
- Reads: 273
- Elapsed time: 6800 milliseconds
Looking at the wait stats for these queries, I saw there was a lot of ASYNC_NETWORK_IO — often 1000+ milliseconds. That didn’t make any sense either! How can a query with so little CPU time and so few reads take so long to complete? It’s not like the application was asking for millions of rows and couldn’t consume the results fast enough.
I ran the query myself in Management Studio:
- CPU time: 17 milliseconds
- Reads: 273
- Elapsed time: 155 milliseconds
These numbers made a lot more sense, and confirmed that the application was to blame. But what exactly was the application doing for 6645 milliseconds?
Entity Framework will try to be clever about connections
After some research, I found that Entity Framework can be sneaky about how it manages database connections. An article about EF connection management on MSDN proved both enlightening and slightly terrifying:
Entity Framework will handle database connections automatically by default. Note two things here: EF will open the connection if you specify any LINQ or ObjectQuery method, and that connection won’t be closed until the ObjectResult has been completely consumed or disposed.
In this case, the EF code invoked the Where method, then went on to do a number of foreach loops, assuming the connection had been closed when it hadn’t. SQL Server was left holding the phone until EF said good-bye and hung up. The solution here was to open the connection, do as little as necessary, and then close the connection.
UPDATE: EF MVP Julie Lerman (@julielerman on Twitter) mentions in the comments below that the MSDN documentation is outright wrong about some methods opening a connection, and has passed this on to Microsoft so they can correct the error.
This would be the end of the connection management story, except…
Entity Framework would like to announce it will no longer be clever about connections (when you open them yourself)
Another MSDN article about EF connection management points out a change in Entity Framework 6 and later: if you open a connection yourself, Entity Framework will no longer close it for you.
Again, the seemingly innocuous and trivial “Note” is anything but. It’s a reasonable assumption on Entity Framework’s part; if you’re going to open it yourself, you’ll be the one who closes it. Still, it means we have to be careful with Entity Framework code when it comes to database connections. And now, depending on the Entity Framework version, we’ll see one of two different connection management behaviors.
How to spot Entity Framework keeping connections open
The tell-tale signs, as we discovered in this case, are:
- When running the query from the application:
  - Relatively low CPU time but high elapsed time
  - ASYNC_NETWORK_IO waits for the query
- When running the query from SQL Server Management Studio:
  - Relatively similar CPU time and elapsed time
- Significant amounts of application code executing between the connection open event and close event. To prove the connection is left waiting between those events, step through the code in a debugger and pause before the connection is closed. You should see the query racking up ASYNC_NETWORK_IO waits. (Remember, the events that open and close the connection may not be doing so explicitly.)
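From the SQL Server side, you can catch sessions stuck in this state as it happens. Here's a sketch using the execution DMVs:

```sql
-- Live requests currently waiting on ASYNC_NETWORK_IO: SQL Server has
-- results ready and is waiting for the client to consume them. Long wait
-- times here with a trivial query text point at the application.
SELECT r.session_id,
       s.host_name,
       s.program_name,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       t.text      AS query_text
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s
    ON s.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.wait_type = N'ASYNC_NETWORK_IO';
```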
It’s not a SQL Server problem. It’s an Entity Framework problem.
Entity Framework is great for developers who don’t have the spare time or motivation to learn SQL Server querying, but that convenience comes with costs. One of those costs is keeping a suspicious eye on how it manages database connections. It’s tempting to look at a long-running query and blame SQL Server for being slow. However, if this happens to you and your Entity Framework-based application, it’s worth investigating further to see who’s leaving whom waiting.
Brent says: don’t be ashamed of using EF, either! I’m all about getting to market as quickly as possible. After all, if you don’t get your app out there, you don’t get paid, and DBAs need to get paid too.