SQL Server learning materials seem to live at two extremes. Blog posts are short and to the point, but they don’t cover material in depth. Books are long and detailed, but to quote a famous philosopher, ain’t nobody got time for that.
Enter two resources that have been out for quite a while. They aim to cover subjects from start to finish, but in a way that you can digest in an hour.
Microsoft Books Online (Yes, Books Online!)
Microsoft Books Online earned a terrible reputation for being the last place you’d ever want to look for help. Look up the sys.databases view, for example, and you get gems like this:
No context, no hyperlinks to more details for a particular setting. These aren’t even new settings – they’ve been out for years. Since at least SQL 2000, there’s been a Books Online page for ANSI NULLS that they could have linked to.
However, when you weren’t looking, Books Online went to finishing school. It’s now chock full of great explanations of concepts. When you’re looking to implement a feature, check out these explanations:
Books Online still tends to focus on features rather than tasks. For example, if you need to find out why your server is slow, heaven help you if your only resource is Books Online.
SQL Server Central Stairways
SSC’s Stairways series covers topics start to finish with 5-15 tutorials from one or two authors. I love the consistency on these – you can settle in with one author and really dig into a topic with a logical flow. Think of it as an interactive book chapter, often with lots of demos you can run to illustrate concepts.
They’ve added stairways for T-SQL, indexes, transaction logs, PowerShell, replication, SSRS, and other good foundational topics. The existing stairways keep getting better as the authors add more posts.
It was a dark and stormy… Oh, wrong story. It was actually a warm, sunny afternoon in Charlotte, NC. I was presenting “Index Methods You’re Not Using” at PASS Summit. In this talk, I discussed how indexed views, filtered indexes, and compressed indexes can improve your query performance by reducing I/O.
From stage right, an intrepid audience member raised his hand and asked, “Can you think of an example of when you would use filtered indexes instead of partitioning?”
This question left me speechless (temporarily).
My first thought was one of requirements and practicality: “What if you have Standard Edition? You can’t use table partitioning, but you could create an indexed view” I said.
The better question would have been, “What are you trying to accomplish?”
On the surface, filtered indexes and partitioning may appear similar: both are designed to hold only portions of a table’s data in separate structures. However, how they go about doing this, the use cases for each, and the requirements for each are very different.
Filtered indexes store only a portion of the data in a table. For example, in an orders table, instead of storing all orders in a nonclustered index on the “quantity” field, we could create a filtered index to only store orders with a quantity greater than or equal to 20, or between 10 and 19. In this case, the only column stored in the index leaf pages is the quantity (and the clustered index key).
Sample of data from the table
What a nonclustered index on Quantity would store
What a nonclustered index on quantity, filtered to include quantity between 10 and 19
Let’s say the data in the underlying table changes – someone wants to increase their order quantity from 750 to 1,250. The row is updated at the data level – the clustered index. At the same time, the nonclustered index must also be updated.
Table partitioning breaks the whole of the data into separate physical structure based on a partitioning key. It’s generally used to improve query performance or make administration of large tables easier. Partitioning is based on a schema and a function. The table contains a partitioning key – the field in the table that contains the value you’ll split your partitions on. Most commonly, this is a date field of one sort or another – tables are partitioned by day, month, or year.
To directly compare this to a filtered index: could you use an order quantity as a partitioning key? Yes, as long as it’s a valid data type. The difference is that the partition is going to store all columns from the table – not just the key. When you insert or update a row, SQL Server would have to look at that value to determine which partition to place it on. It would then perform the update on the appropriate partition.
A table, partitioned on quantity – 1-10, 11-20, 21-30, etc
Want to know more about partitioning? Start here.
Comparing How They’re Used
When it comes to reading the data, if you have a filtered index for a specific value, and the query optimizer can use that index, you can often reduce I/O by orders of magnitude because you are storing less of the data in the index itself. With partitioning, SQL Server has to determine which partition the data is stored on, then access it. This can be helpful on very, very large tables – but the care and time taken to implement it and the upkeep required mean it must be very carefully considered and maintained.
Which Should I Choose?
There are many features in SQL Server that may appear similar, but once you look at how they work and what they do, they are very different. Database mirroring is not the same as replication. Change Data Capture is not the same as Change Tracking. Filtered indexes and partitioning solve two very different problems. When considering implementing any new feature, stop to ask, “What is the problem I am trying to solve?”
Because why not?
Starting at midnight Eastern time (5AM GMT) on Friday, November 29th, we’ll post a new sale every four hours from our @BrentOzarULTD Twitter account. To give you an idea of what kinds of deals to expect, here’s the first one:
At midnight, all of our in-person training courses will have exactly two tickets available for just $495. First come, first serve.
No trampling, people. Stay safe out there. Good luck!
Midnight Deal Update: Congratulations to Anne Hills, David Yard, Jeff Willett, Kiran Makkapati, Mark Parfrey, Oleg Bulay, Sam Bryant, and Sam Sparks for getting in on $495 training class tickets. Several of ‘em even bought two so that they could attend back-to-back classes in the same city – nice move, because those tickets went quick!
4AM Deal Update: Congratulations to Brian Han, Brian Moore, John Crawshaw, Manoj Badamikar, and Tae Jin Kim. We posted a 50% off coupon good for any online video course, but only for the first 5 takers, and they were the quickest on the draw.
8AM Deal Update: Coupon code BlackFriday200off was good for $200 off our in-person training classes until noon Eastern. Congrats to the buyers!
Noon Deal Update: Coupon code BlackFridayFree29 got Andre Ranieri, Brian Han, Brian Hendrickson, Eric Klovning, Javier Castrillon, Lily Chiu, Manoj Badamikar, Mark Wilkinson, and Unai Garcia Herguedas a free $29 credit in our training videos.
4PM Deal Update: For just one hour, if you register for the 2-day How to Be a Senior DBA class at regular price ($1,395), you get our 3-day SQL Server Performance Troubleshooting class in that same city for free. That means for $1,395, you get a full week of training class goodness! Register for the city you want, and after we’ve received your payment confirmation, we’ll send you registration information to sign up for the 3-day class for free. This is a manual process that won’t show anything special during your 2-day class registration process. You must register between 4PM and 5PM Eastern (GMT -5) today to be eligible for this deal, and travel/hotel/expenses are still your responsibility. Read about the classes, or register for the 2-day class in San Diego (Feb), Chicago (May), or Philly (Sept).
5PM Final Deal Update: We got such a good response to our Black Friday deals that we’re launching the last one now and leaving it up for the rest of the day. Until midnight Eastern time today, all our in-person training classes are half off. Enjoy!
Thanks for spending Black Friday with us. See you in San Diego, Chicago, and Philadelphia. Enjoy the holiday weekend, and thanks for having fun with us on Black Friday.
You’ve heard the term ‘parameter sniffing’, but you’re always a little bit fuzzy on the details. Is this a bug in SQL Server? Could it be happening to you? Join Kendra in this free 30 minute video to learn the truth about this complex and often confusing feature in SQL Server.
Download demo scripts from the webcast here– the script contains commands that are only safe for test environments (not production): Parameter Sniffing Demo.sql.
The best way to improve your skills as a database professional is to practice. If you’re like me, you might even take that practice outside of the workplace and do a bit of homework on the side. You can’t take a copy of your production database home with you, so what can you do? Thankfully, the fine people Microsoft have put together a set of databases that you can download and play with.
Introducing the AdventureWorks OLTP Sample Database
AdventureWorks is a transactional database for a fictional bicycle and outdoor supply company. It’s not very exciting data and it’s not terribly big data, but it gives developers a highly normalized schema to work with for testing different querying techniques. How Microsoft managed to cram all of the relevant SQL Server features into a 215MB database is beyond me. All that aside, AdventureWorks is the first place that database professionals should go to get started practicing their skills.
There are a number of downloads available for AdventureWorks, but only a few that matter to the OLTP crowd. To get started, database professionals should download one of the following:
- AdventureWorks 2012 Data File – this is just an MDF file. You’ll need to drop it into a folder and tell SSMS to create a log file.
- AdventureWorks 2012 data and log file – there’s an MDF and an LDF zipped up in a file.
- AdventureWorks 2012 OLTP Script – these scripts will create AdventureWorks from scratch.
For a local database, go with one of the first two options – just drop the files in their appropriate folders, attach the database, and then you’re off to the races. If you want something for Azure, use the script file to create the database and deploy data.
Protip: There’s usually a CS option for each of these. The CS lets you know it’s a case sensitive option. If you want to deal with international databases, pick the CS version.
A full list of AdventureWorks OLTP options is available at http://msftdbprodsamples.codeplex.com/. There are some light weight data warehouse options available, but they are still relatively small. If you’re just getting started, those are a good option (especially since there’s an SSAS tutorial available).
Introducing the Contoso Data Warehouse
Let’s say you want to do something a bit more adventurous and you want to branch out into the bigger BI world that Microsoft has to offer. The AdventureWorks data sets are a great starting place, but the data set size doesn’t pose many challenges.
The Contoso Retail DW data set is several times the size of AdventureWorks and comes as a pre-built star schema. While it’s not the biggest database (my copy is around 1.6GB), ContosoRetailDW provides a large enough data set where aspiring database professionals can really start to push the limits of a local SQL Server.
ContosoRetailDW is also a good place to try out Enterprise Edition features like table partitioning and ColumnStore indexes. The data set is large enough that it’s possible to make use of these features but it’s still small enough that you won’t need enterprise grade hardware to accomplish these things in a reasonable amount of time.
What Are You Waiting For?
Database professionals, this is your chance to start exploring the features and functionality that SQL Server has to offer. The sample databases from Microsoft give you plenty of opportunities to work with different techniques for both OLTP and data warehouse and even familiarize yourself with Azure without having to come up with your own data.
Our Hierarchy of Database Needs training email plan has been a lot of fun. Thousands of SQL Server professionals have signed up to get an email in their in-box every Wednesday for 6 months. It’s like a part-time college course – the one you should have been given when you first got shuffled into this DBA job thingy.
Now, we’ve taken some of the content and turned it into a free 38-page PDF ebook.
It starts at the base of Ozar’s Hierarchy of Database Needs, covering backups, security, and then moves up to capacity planning and performance.
It’s not meant to be doctoral-level – this is just the intro course that we all wish we’d have gotten before management started screaming at us about why the database is so slow. And expensive. And unpredictable.
It’s like a prerequisite of things we want to make sure people know before they move up to our training classes.
Let us know what you think, and enjoy!
Views are logical objects in SQL Server databases that present you with a “virtual table”. Views are typically created for one of three reasons: security, simplification, or aggregation.
- Security: we create views so that a user can read specific columns out of certain tables, but not all the data.
- Simplification: sometimes, it’s easier to write a SELECT against one view than a SELECT against multiple tables, with joins – repeatedly.
- Aggregation: again, it can be easier to SELECT already-aggregated data than to write the query – repeatedly.
The down side to views is that when you query them, you’re still reading data from all of the underlying tables. This can result in large amounts of I/O, depending on the number and size of tables. However, you can create a unique clustered index on the view – referred to as an indexed view – to persist the data on disk. This index can then be used for reads, reducing the amount of I/O.
There are some limitations to indexed views, but when you can create one and it improves performance, it really improves performance. But, as with all features in SQL Server, indexed views aren’t the answer to everything. Here’s a quick look at some things you can and can’t do with them.
The view definition can reference one or more tables in the same database.
Once the unique clustered index is created, additional nonclustered indexes can be created against the view.
You can update the data in the underlying tables – including inserts, updates, deletes, and even truncates.
The view definition can’t reference other views, or tables in other databases.
It can’t contain COUNT, MIN, MAX, TOP, outer joins, or a few other keywords or elements.
You can’t modify the underlying tables and columns. The view is created with the WITH SCHEMABINDING option.
You can’t always predict what the query optimizer will do. If you’re using Enterprise Edition, it will automatically consider the unique clustered index as an option for a query – but if it finds a “better” index, that will be used. You could force the optimizer to use the index through the WITH NOEXPAND hint – but be cautious when using any hint.
Choose the Right Tool
If you’re looking to have a complicated, aggregated query persisted to disk to reduce I/O, an indexed view may be the right tool for your job. Test it as an option, and if it works, put it to use!
Learn more about indexed views in my video The Okapis of SQL Server Indexes!
Back in 2009 (wow, seems like only yesterday!), I wrote about designing a recovery strategy for Stack Overflow. Back then, I wrote:
With these answers in mind, Stack Overflow’s decisions not to do transaction log backups, offsite log shipping, database mirroring, and so on make good business sense. Us geeks in the crowd may not like it, and we might demand the latest and greatest in backup & recovery technology, but at the same time we want Stack Overflow to remain free. As their volunteer DBA, I’d love to do 24×7 log shipping or database mirroring to a secondary server at another colo facility – but I wouldn’t be willing to pay out of my own pocket for expenses like that.
Today, the situation is totally different. They’re in the top 50 web networks, 4.4 million users, a job posting network, dozens of crazy smart employees, and I’m not even close to a volunteer DBA for them anymore. (Heck, we’ve even paid to advertise on Stack Exchange.) I hop into the company chat rooms now and then, and I swing by the offices whenever I’m in New York, but these guys don’t need me. I jump in whenever I can for fun, because it really is fun working with engineers this sharp.
That means this time, I’m blogging about designing Stack’s recovery strategy more as an outsider’s perspective. I know you folks like reading real-life case studies, and Stack’s pretty open about their infrastructure, so this will be fun for all.
What Changed? Downtime became more expensive.
If you’ve looked at the Stack Exchange team page, you’ll notice dozens of people with the word “sales” in their job title. Stack now sells ads on the Q&A sites, plus sells job postings to companies on Careers.StackOverflow.com.
There’s real money going through the network now, and downtime starts to cost more money. If the sites are down, people may go back to Google, get their answers from another site <shudder>, and there goes some ad revenue.
This meant that at least a few databases – for ads and careers – we needed to do full recovery mode in SQL Server, and start doing transaction log backups. This didn’t start across-the-board – it started only with the most high-value databases.
As the company grew, the relative cost of standby SQL Servers in a different location started to drop. Downtime seemed more expensive, and interestingly, the actual price of the standby SQL Servers started to drop. As Stack Exchange added more systems administrators, it wasn’t really much extra work for these guys to manage a few extra database servers in other locations. And as long as we’ve got extra database servers somewhere else, kept up to date with the most recent data, we might as well put ‘em to use.
What else Changed? We Scaled Out.
Stack Exchange’s API lets the public run queries against the databases in real time (and starting with API v2.1, they can even write). For a demo of how it works (but not using the live database), try out Data.StackExchange.com. For example, here’s the most recent 10 questions from Stack Overflow:
Yes, Virginia, that’s really an Execution Plan tab at the bottom with the actual plan from your query:
Like many of Stack Exchange’s tools, Data Explorer is completely open source, so you can install this same tool in your own environment if you’d like to let internal power users query your databases without having to install SQL Server Management Studio or a reporting tool.
Enter SQL Server 2012’s AlwaysOn Availability Groups
SQL Server 2012’s AlwaysOn Availability Groups allow for multiple replicas to serve 2 purposes at Stack Exchange: easier failover to remote data centers with minimal data loss, and read-only capabilities out of those remote data centers.
I’m a huge fan of AlwaysOn AGs, but they’re like the opposite of “the easy button.” Sure, you can offload read-only queries, backups, and DBCCs to secondary replicas, but you also have to introduce a lot of new dependencies like clustering, Windows Active Directory, heartbeats, and quorums. After Stack and a few of my other clients went live with the early 2012 versions, I started jokingly calling it OftenOn – the high availability functionality ended up being a source of a lot of downtime.
Stack worked with Microsoft for months to troubleshoot sporadic outages, resulting in this Windows 2008R2 hotfix and some other fun discoveries. After a lot of challenges, Stack (and most of my other AG clients) ended up moving to Windows Server 2012 for their SQL clusters because so many of the clustering problems were fixed with rewritten clustering code.
The big gotcha, though, was that if any replica loses its network connectivity to any of the other replicas, all of the involved databases will drop offline. This is not a bug – this is the desired result.
Well, this was Microsoft’s desired result. It sure wasn’t anybody else’s, ha ha ho ho, and thankfully those nice folks at Microsoft decided the behavior would be different in SQL Server 2014. Now, if a node loses connectivity, its AG-involved databases continue to stay online.
Stack Exchange’s Upgrade to SQL Server 2014
As Nick Craver writes in his post about running SQL Server 2014 in production at Stack, this advantage was absolutely killer for Stack’s uptime goals. The Stack developers have done a killer job at coding the sites – if they detect that the SQL Server databases are in read-only mode, all of the involved sites fail into a read-only mode too, along with a nice polite banner informing the reader that they can’t make any changes right now.
The Stack engineers, God bless ‘em, are so dedicated to making IT better that they’re not just willing to document their infrastructure to help others, plus build open source tools to help people, but they’re even willing to put their production web site in the hands of preview code. So earlier in November, Nick Craver and Steven Murawski did a rolling upgrade of their production clusters to SQL Server 2014.
Each cluster involves 3 nodes. Take the StackOverflow cluster:
- NY-SQL01 – the primary replica handling all incoming write connections. Sits in Manhattan.
- NY-SQL02 – the asynchronous secondary replica sitting next to it in the same cage. NY-SQL01 copies transaction log data off to this server gradually in the background. In the event of a localized failure on NY-SQL01, the admins can manually fail over to NY-SQL02 with some data loss.
- OR-SQL01 – the asynchronous secondary replica sitting in Oregon. NY-SQL01 also copies data here in the background, but due to the vagaries of wide area networks, it can be farther behind than NY-SQL02. To leverage extra hardware in Oregon, Data Explorer can be hosted in Oregon’s web servers using this read-only capacity.
All three were running SQL Server 2012 on Windows Server 2012. Unfortunately, with Windows clustering, we can’t upgrade any of the nodes in-place to a new version of Windows (2012 R2), so if we wanted to deploy that, we’d have to tear down some of the existing nodes temporarily. That was a lot of work for relatively little gain. There wasn’t a need for new hardware, either – the database servers typically run under 10% CPU, and they can cache everything they need in memory.
Since we could keep the same OS, the SQL Server 2014 upgrade process looked like this:
- Upgrade one of the readable replicas (in our case, NY-SQL02) to SQL 2014. From this point forward, replication stops to the other nodes. They can’t apply data from a newer SQL Server version, so they’ll just kinda freeze in limbo. We could still fail back to those, but we would lose all data changes made from this point forward.
- Test the apps in read-only against the newly upgraded replica.
- Test the apps in writeable mode against the newly upgraded replica.
- Make a go/no-go decision. If no-go, fail back to the original replica (NY-SQL01) and send the upgraded replica to detention. If go, start upgrading the other readable replicas. As they come online with SQL Server 2014, the AlwaysOn AG replication will catch them back up.
- Optionally, fail back to NY-SQL01 as a primary.
- Monitor the servers and the plan cache looking for queries getting unexpectedly poor execution plans. (Remnant of our SQL 2008 upgrade, and subsequent full text search problems.)
The upgrade went really smoothly, barring one technical issue involving a combination of the installer and Windows Core. The installer assumes that a particular feature will be installed (whether you need it or not), and that feature requires the full Windows GUI, so it fails on Windows Core. The “fix” was to tell the installer to skip feature compatibility checks.
I was online purely for comic relief. I think I made the sum total of one technical contribution during the entire call, and I left shortly after the go/no-go decision. Nick and Steven are top notch admins. Having said that, I wouldn’t recommend this DO-IT-LIVE! upgrade approach for most companies.
Should You Follow Suit with SQL 2014?
The Stack Exchange team is very in tune with what’s happening in their SQL Server environment. They know exactly which queries are the top 20 resource users, what those execution plans look like, and when a new query pops up to the top of the list. If SQL Server suddenly produces different query plans for a frequent query, these guys know how to work around it. They have great relationships with the dev teams, instant access to source code, and easy, automatic deployments to production to fix query problems.
I’m thankful that there’s cutting-edge shops out there like Stack Exchange that dedicate so much talent to managing their database stack and give so much back to the community (and to Microsoft). They’re helping make sure SQL 2014 is a solid product when it eventually hits RTM, and hopefully they’ll iron out any surprise bugs before folks like you deploy it in production.
If you believe your shop has what it takes to run SQL 2014 in production, you can join Microsoft’s early-access programs to get access to more frequent builds, extra support contacts, and community experts to help with the deployments. Contact us to get started.
In Michigan where I grew up, we pull the boats out of the water at the beginning of the fall. It’s a bit of a sad time, realizing we’re done having fun on the water, and the seasons are about to change.
To prepare a boat for a few months of storage, we drain some fluids, replace others, give it a good cleaning, do some basic maintenance, and put on a tight waterproof cover.
Databases need winterizing too. I bet you’ve got an application that’s no longer used, but you still have to keep the data online just in case. Or maybe you’ve got archived sales data that we still read, but we don’t modify anymore.
Here’s how to winterize a database:
Rebuild all of the indexes with 100% fill factor. Sometimes we set lower fill factors to prevent page split problems during heavy concurrency, but when a database is going into its winter, we don’t need to worry as much about that. By setting fill factor to 100% and rebuilding the indexes, we get everything packed in tightly. This means denser data – less free space, faster reads off disk.
Update statistics with fullscan. In case we don’t rebuild the indexes, we still probably need to update our statistics. I’d recommend fullscan here because we get a great picture of the data, and then we never have to update stats again on a frozen database.
Create a read-only database login using an Active Directory group. This way, if you need to add additional users to the database for read-only permissions, you can simply add them to the Active Directory group. We don’t have to write to the database in order to pull this off – which comes in important for the next step.
Make the database read-only. Let’s be really confident that the data’s not going to change underneath us. This can also get us a modest performance gain from avoiding locking, and it’ll save me time if I’ve got an inefficient index rebuild script that keeps trying to rebuild indexes even when data isn’t changing.
Do a complete DBCC CHECKDB. We want to know that we’ve got a good, clean copy of all of the database pages.
Test the full backups. Restore it somewhere else, and know that we’ve got a really good backup that can actually be restored. Once I’m confident in that, I may even consider no longer backing up this database – especially if it’s over a terabyte – as long as I know I’ll always have that backup available in multiple places.
At the end of a database’s life, winterizing it gives me one less database to worry about. I know it’s good, I know I’ve got a backup, and I can spend my time focusing on databases that are much more volatile.
I’ve been told that an attendee at the PASS Summit pre-conference event asked about using dynamic SQL to get around some of the problems with filtered indexes. I’m not entirely sure what the question was, but it did give me the opportunity to play around with filtered indexes and write some simple demo code to illustrate just the kind of shenanigans that you can get up to.
Creating a Filtered Index
The first step for our demo is to create a filtered index. Grab yourself a copy of Adventure Works and get ready for some fun. Here’s our filtered index:
CREATE INDEX jp_filters_not_offline ON Sales.SalesOrderHeader (OrderDate) INCLUDE (SalesOrderID) WHERE OnlineOrderFlag = 1 ; GO
That creates our initial filtered index. We want to search by
OrderDate and find only those orders that were placed online. This is a realistic index to create in the AdventureWorks database, too – there are 27,659 online sales in my copy of AdventureWorks, but only 3,806 offline sales.
Want to prove that the index works? Run this SQL and take a look at the execution plan:
SELECT SalesOrderID FROM Sales.SalesOrderHeader AS soh WHERE OnlineOrderFlag = 1 ;
There’s No Problem Here!
Let’s create a pair of stored procedures:
CREATE PROCEDURE jp_online_only_date ( @OrderDate DATETIME ) AS BEGIN SELECT SalesOrderID FROM Sales.SalesOrderHeader AS soh WHERE OnlineOrderFlag = 1 AND OrderDate > @OrderDate END GO CREATE PROCEDURE jp_online_only_date_2 ( @OrderDate DATETIME ) AS BEGIN DECLARE @sql NVARCHAR(MAX) = N''; SET @sql += 'SELECT SalesOrderID FROM Sales.SalesOrderHeader AS soh WHERE OnlineOrderFlag = 1 AND OrderDate > ''' + CAST(@OrderDate AS NVARCHAR(MAX)) + ''';' EXEC sp_executesql @sql; END
If you were to run both of these, you’d notice that they both make use of the filtered index. What gives?
Both of these stored procedures are able to take advantage of the filtered index because the filter condition (
OnlineOrderFlag = 1) is matched by predicate in the query. In other words – SQL Server can easily make this match.
The Problem with Parameters
There’s a problem, though. When we switch over to using a parameter, SQL Server isn’t able to make effective use of the filtered index. Check it out:
CREATE PROCEDURE jp_online_only_params ( @OrderDate DATETIME, @OnlineFlag BIT ) AS BEGIN SELECT SalesOrderID FROM Sales.SalesOrderHeader AS soh WHERE OnlineOrderFlag = @OnlineFlag AND OrderDate > @OrderDate ; END GO EXEC jp_online_only_params '2007-12-12', 0 ;
Oh no! This procedure doesn’t use our filtered index any more. There’s got to be a way to trick SQL Server into using our filtered index. We could resort to an index hint, but that would mean we have to keep using the same index or keep re-using the name of that index. I’m not a big fan of that, so let’s think of something else.
What if we were to use string concatenation? That sounds like magic, it has a lot of syllables.
Hooray for Strings!
We can solve the problem by forcing SQL Server to see the literal value that we’re passing in:
CREATE PROCEDURE jp_online_only_strings ( @OrderDate DATETIME, @OnlineFlag BIT ) AS BEGIN DECLARE @sql AS NVARCHAR(MAX) = N''; SET @sql += 'SELECT SalesOrderID FROM Sales.SalesOrderHeader AS soh WHERE OnlineOrderFlag = ' + CAST(@OnlineFlag AS NVARCHAR(1)); SET @sql += ' AND OrderDate > ''' + CAST(@OrderDate AS NVARCHAR(MAX)) + ''';' PRINT @sql EXEC sp_executesql @sql; END
If you were to run that stored procedure, you’d get effective use of the filtered index. SQL Server sees the
OnlineOrderFlag predicate as a literal value, matches that up to the index definition, and we’re off to the races.
There is one problem, though – because we’re also using a literal value for
OrderDate, there could be significant bloat in the plan cache. SQL Server will create one execution plan for every possible combination of parameters that we pass in. In AdventureWorks there are 1,124 distinct values in the
OrderDate column. That’s a lot of execution plans.
Avoiding Plan Cache Bloat
We want to avoid bloat in the execution plan cache. In order to do that, we’re going to need to parameterize part of the query but still leave the filter intact. That’s really easy to do:
CREATE PROCEDURE jp_online_mixed ( @OrderDate DATETIME, @OnlineFlag BIT ) AS BEGIN DECLARE @sql AS NVARCHAR(MAX) = N''; SET @sql += 'SELECT SalesOrderID FROM Sales.SalesOrderHeader AS soh WHERE OnlineOrderFlag = ' + CAST(@OnlineFlag AS NVARCHAR(1)); SET @sql += ' AND OrderDate > @order_date;' PRINT @sql EXEC sp_executesql @sql, N'@order_date DATETIME', @OrderDate; END
By adding the
@OnlineFlag as just another string, but still parameterizing the
@order_date we end up with only two execution plans. One plan will be created for each value of
@OnlineFlag, but we’ll still get use of the filtered index. This is all but impossible to accomplish with forced parameterization and difficult to accomplish with regular parameteriziation (usually you have to resort to index hints).
Summing It Up
Dynamic SQL and filtered indexes are a great match. By carefully making use of dynamic SQL, you can coerce SQL Server into behaving well and providing you with an execution plan that uses the filtered index. Through fully parameterized dynamic SQL you can get good plan re-use and avoid SQL injection attacks. By making careful use of dynamic SQL and filtered indexes, you should be able to get performance out of SQL Server without having to resort to tricks like index hints or plan guides.