Blog

The Top 3 Indexing Mistakes in SQL Server (video)

It’s incredibly easy to make big mistakes with indexes in SQL Server. In this free 30-minute session, Kendra Little will share the three most common mistakes she finds people make with their indexing. You’ll learn how these mistakes can have disastrous impacts, and you’ll get practical steps you can take right away to identify whether these problems exist in your environment.

Hungry for More Indexes?

Check out sp_BlitzIndex®, our free index tool that helps you diagnose index insanity.

Or take a tour of our new 6.5 hour video training on SQL Server index tuning here.

The Elephant and the Mouse, or, Parameter Sniffing in SQL Server

Imagine you work for a zoo, and you need to ship two animals to another zoo. If you are told one will be a mouse and one will be an elephant, you will need to handle them differently. The Detroit Zoo used a moving van to move two elephants, while a mouse can be put in a box and driven or flown.

Something Smells Funny

And it’s not the elephant. SQL Server uses a process called parameter sniffing when it executes stored procedures that have – you guessed it – parameters. When the procedure is compiled or recompiled, the value passed into the parameter is evaluated and used to create an execution plan. That value is then stored with the execution plan in the plan cache. On subsequent executions, that same value – and same plan – is used.

This is a normal, expected behavior in SQL Server. Because compiling queries is expensive, you want plans stored in the cache. You want SQL Server to re-use them as much as possible.

But what happens when the values in a table you’re querying aren’t evenly distributed? What if one value would return 10 rows and another value would return 10,000 rows, or 10 million rows? I call this the elephant and the mouse problem. You would handle one animal differently than the other; SQL Server might create different plans for the queries. But it doesn’t, because you’re using parameters.

What will happen is that the first time the procedure is run and the plan is compiled, whatever value is passed in is stored with the plan. Every time it’s executed, until it’s recompiled, the same value and plan will be used – regardless of whether it is the fastest or best plan for that value. If this is happening to you, and causing performance problems, there are ways to deal with it.

Parameter Sniffing in Action

I’m going to run a few queries to show you this behavior. I’ll be using AdventureWorks2012 on a SQL Server 2012 instance.

I’m going to query the Sales.SalesOrderDetail table. I want to know the order numbers and quantity ordered for specific products. My first set of queries will use literal values in the WHERE clause. When using literal values, SQL Server will compile each separately, and store a separate execution plan for each.

SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = 897;

SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = 945; 

SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = 870;

The same query was run three times with three different values. The queries returned 2, 257, and 4,688 rows, respectively. Two different execution plans were created, even though the query remained the same!

Elephant Mouse 1

At some point, the optimizer decided it was faster to do a clustered index scan, instead of a nonclustered index seek and a key lookup.
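
If you want to confirm that each literal query got its own entry in the plan cache, here’s one way to peek. This is a sketch of my own, not part of the original demo – it assumes you have VIEW SERVER STATE permission:

-- Each ad-hoc literal query should show up as its own cached plan:
SELECT st.text, cp.usecounts, cp.objtype
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%Sales.SalesOrderDetail%'
  AND st.text NOT LIKE '%dm_exec_cached_plans%'; -- filter out this query itself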

How does this behave when parameterized? I create a stored procedure to test this.

CREATE PROCEDURE Get_OrderID_OrderQty 
 @ProductID INT
AS 
 
SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = @ProductID; 

I’m going to execute this stored procedure for the first time with the “elephant” – product ID 870, which returns 4,688 rows.

EXEC Get_OrderID_OrderQty @ProductID=870

The data is retrieved through a clustered index scan once again.

Elephant Mouse 2

Now, I’ll execute this stored procedure with the “mouse” – product ID 897, which returns 2 rows.

EXEC Get_OrderID_OrderQty @ProductID=897

This time, instead of using a nonclustered index seek and a key lookup, the values are retrieved with a clustered index scan. Also note that the estimated number of rows and actual number of rows are very different!

Elephant Mouse 3

What happened? This is parameter sniffing in action. The value from the first execution was sniffed at compile time and stored with the execution plan, and that plan is reused regardless of what value is passed in later. I can verify this by right-clicking the execution plan and selecting “Show Execution Plan XML”. In the XML, I search for “ParameterCompiledValue” and find the following line.

Elephant Mouse 4

The compiled value is 870. Until the stored procedure is recompiled, this is the value that will be used.
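
If you’re following along and want to clear that sniffed value out so you can test executions in a different order, one lightweight option (my addition, not part of the original walkthrough) is to mark the procedure for recompilation:

-- Flushes this procedure's plan so the next execution compiles fresh:
EXEC sp_recompile 'Get_OrderID_OrderQty';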

Is This Bad?

The real question is, “Is this bad?” It isn’t – until it is. Depending on the query and the data, each execution of this query may return the results in an acceptable amount of time. It is only when query performance reaches an unacceptable threshold that you may need to do something to prevent it from happening.

When Parameter Sniffing Stinks

When parameter sniffing is negatively affecting a stored procedure, what can you do to fix it?

The first option is to do nothing. You can explain the behavior of SQL Server to your end users. They might even listen! But they won’t be happy. They need their data, and they need it faster.

Other options you have are to recompile the stored procedure each time, use a query hint to produce a good enough plan, or do some creative coding. None of these are perfect options – they all have drawbacks. Understand the benefits and drawbacks of each option. Test the variations carefully in a development environment. The last thing you want to do is make the problem worse!

Recompiling

You can force SQL Server to recompile the stored procedure each time it is run. The benefit here is that the best query plan will be created each time it is run. However, recompiling is a CPU-intensive operation. This may not be an ideal solution for stored procedures that are run frequently, or on a server that is constrained by CPU resources already. Another thing to remember is that the plans won’t be stored in the cache, which makes them harder to find if they are problematic.

To show this, I’m going to alter my stored procedure to include the WITH RECOMPILE option.

ALTER PROCEDURE Get_OrderID_OrderQty 
 @ProductID INT 
 WITH RECOMPILE 
AS 
 
SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = @ProductID; 

What plan will executing this with the “elephant” – product ID 870 – produce?

EXEC Get_OrderID_OrderQty @ProductID=870

Elephant Mouse 5

A clustered index scan has been performed.

What plan will executing this with the “mouse” – product ID 897 – produce?

EXEC Get_OrderID_OrderQty @ProductID=897

Elephant Mouse 7

The nonclustered index seek and key lookup are being performed, as expected. This is happening because WITH RECOMPILE tells SQL Server, “Don’t store a plan in cache for me. I’m smart enough to figure this out on my own each time.” Remember, the cost of this is increased CPU usage each time the stored procedure is run.
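
A variation worth knowing about (not shown in the demo above) is recompiling just the statement instead of the whole procedure, using OPTION (RECOMPILE). The trade-off is the same – CPU spent compiling on every run – but it’s scoped to the one query that suffers. A sketch:

ALTER PROCEDURE Get_OrderID_OrderQty
 @ProductID INT
AS

SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail
WHERE ProductID = @ProductID
OPTION (RECOMPILE); -- recompile this statement only, on every execution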

Query Hinting

Another option is to use the OPTIMIZE FOR query hint. This tells SQL Server to use a specified value when compiling the plan. If, through testing, you can find a value that produces a “good enough” plan each time, and the performance is acceptable for both mice and elephants, this is a good option for you.

However, understand that you are bossing the query optimizer around. You are telling it what you think is best. The biggest drawback with OPTIMIZE FOR is on tables where the distribution of data changes. The faster it changes, the more out of date this hint could become. What if the value you provide is not optimal in a month, or a year? You need to have a method in place to regularly review and revise this.

I know that using product ID 945 produces a “good enough” plan for this query, so I alter the procedure to include the OPTIMIZE FOR query hint.

ALTER PROCEDURE Get_OrderID_OrderQty 
 @ProductID INT
AS 
 
SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail 
WHERE ProductID = @ProductID 
OPTION (OPTIMIZE FOR (@ProductID=945)); 

What plan will the “elephant” – product ID 870 – produce?

EXEC Get_OrderID_OrderQty @ProductID=870

Elephant Mouse 7

A nonclustered index seek and key lookup have been performed. Viewing the XML shows us that the compiled value is 945.

The “mouse” – product ID 897 – has the same execution plan.

EXEC Get_OrderID_OrderQty @ProductID=897

Elephant Mouse 9
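
If you’d rather verify the compiled value without clicking through SSMS, here’s a sketch that pulls the cached plan XML for the procedure (assumes SQL Server 2008 or later):

-- Open the XML and search for ParameterCompiledValue; it should show 945:
SELECT qp.query_plan
FROM sys.dm_exec_procedure_stats AS ps
CROSS APPLY sys.dm_exec_query_plan(ps.plan_handle) AS qp
WHERE ps.database_id = DB_ID()
  AND ps.object_id = OBJECT_ID('Get_OrderID_OrderQty');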

Be Creative

Another option may be to create separate stored procedures for each value, or at least for those values you know will produce a specific execution plan. I’m not going to show this. Why? Because it’s ugly, it’s cumbersome, and I don’t recommend it. Who wants to maintain multiple stored procedures, when you can maintain one? What do you do when the distribution of data changes? How do you make the logic clear and easily understood?

Parameter Sniffing Isn’t a Bad Thing

Understand that parameter sniffing is not a bad thing – it is an integral part of SQL Server. Sometimes, though, the way your data is distributed and the way your queries are written mean the sniffed plan doesn’t make sense for every value. Understanding how to identify when it is a problem, and knowing various ways to fix it, are valuable skills to have. When parameter sniffing stinks, test methods to correct it and determine the best way to freshen the air in the server.

Announcing Our New SQL Server Video Training Classes

We noticed a hole in the SQL Server training market:

sql-server-training-options

We wanted to build a better way to learn complex topics. We’d experimented with video training last year, and we loved it, but we wanted to make it even better:

  • Handouts, scripts, quizzes to keep the classes interactive and interesting
  • Real-world useful topics, not trivia to pass certification tests
  • High definition videos – so you’re not struggling with grainy screenshots
  • Mobile-friendly – watch on your phone or tablet (looks fantastic on iPads!)
  • Re-watch the training again and again for a year and a half
  • One-time pricing that managers will actually approve

We’re really excited to show you what we’ve built. Here are our first four classes:

How to Think Like the SQL Server Engine – 90 minutes, $29 – You’re a developer or a DBA, and you’re comfortable writing queries to get the data you need – but you don’t understand why they’re unpredictably slow. In this series of videos with Brent Ozar, you’ll learn how the SQL Server engine looks at your indexes and builds your query results. Learn more and watch a preview, or buy it now.

Virtualization, SANs, and Hardware for SQL Server – 5 hours, $299 – When you’re the one who has to plan, buy, set up, and troubleshoot production SQL Servers, you need to know about hardware, SAN storage, SSDs, virtualization, and more. Microsoft Certified Master Brent Ozar will save you thousands of dollars on bad hardware decisions and wasted troubleshooting time. Learn more and watch a preview, or buy it now.

DBA Job Interview Question and Answer Kit – 90 minutes, $29 – Good interviewing is not a talent: it’s a skill you build with study and practice. Microsoft Certified Master Kendra Little trains you to be a rockstar in SQL Server database administration job interviews in this set of videos. You get practice video interviews – 100 real-world DBA interview questions – and a kit of checklists and exercises to help you land the right job. Check out a preview or buy it now.

How to Tune Indexes and Speed Up SQL Server – 6.5 hours, $299 – You want to become an expert at index tuning, but you can’t find a good start-to-finish training course. In this 6.5-hour course packed with demos and quizzes, Microsoft Certified Master Kendra Little teaches you how to design clustered, non-clustered, and filtered indexes to speed up your SQL Server queries – and how to gather the right data to prove your changes made a difference. Learn more and watch a preview, or buy it now.

We’re really proud of what we’ve built, and we know you’ll enjoy it. Let us know what you think!

New DMVs in SQL Server 2014 CTP1

When you download and start playing with SQL Server 2014 CTP1, here are a few of the new instrumentation tables and views:

New Internal Tables

  • plan_persist_context_settings
  • plan_persist_plan – Hey, lookie there! Implies that we can persist execution plans beyond a SQL Server service restart.
  • plan_persist_query
  • plan_persist_query_text
  • plan_persist_runtime_stats
  • plan_persist_runtime_stats_interval

New Views

  • column_store_row_groups
  • dm_db_merge_requests
  • dm_db_xtp_checkpoint – Note that both the DMVs and the SSMS UI imply that Hekaton will be called Extreme Transaction Processing.
  • dm_db_xtp_checkpoint_files
  • dm_db_xtp_gc_cycle_stats
  • dm_db_xtp_hash_index_stats
  • dm_db_xtp_index_stats
  • dm_db_xtp_memory_consumers
  • dm_db_xtp_object_stats
  • dm_db_xtp_table_memory_stats
  • dm_db_xtp_transactions
  • dm_io_cluster_shared_volumes – Management data about the new Cluster Shared Volumes (CSV) support for clusters.
  • dm_os_buffer_pool_extension_configuration – Management data about buffer pool extensions, storing the buffer pool on SSD. You can read more about that in my Almost Everything You Need to Know About SQL Server 2014 post.
  • dm_resource_governor_resource_pool_volumes
  • dm_xe_database_session_event_actions
  • dm_xe_database_session_events
  • dm_xe_database_session_object_columns
  • dm_xe_database_session_targets
  • dm_xe_database_sessions
  • dm_xtp_consumer_memory_usage
  • dm_xtp_gc_queue_stats
  • dm_xtp_gc_stats
  • dm_xtp_memory_stats
  • dm_xtp_system_memory_consumers
  • dm_xtp_threads
  • dm_xtp_transaction_recent_rows
  • dm_xtp_transaction_stats
  • event_session_actions
  • event_session_events
  • event_session_fields
  • event_session_targets
  • event_sessions
  • hash_indexes
  • selective_xml_index_namespaces
  • selective_xml_index_paths

New Functions

  • fn_dblog_xtp
  • fn_dump_dblog_xtp
  • fn_hadr_is_primary_replica – Makes it much easier to identify when you’re running T-SQL on the primary replica in an AlwaysOn Availability Group. Right now I have to jump through some hoops to do this, so yay!
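
As a quick illustration of why that last one is handy, here’s a hedged sketch of the kind of guard clause you could put at the top of an Agent job step. This is CTP1, so details may change before release, and 'MyDatabase' is a placeholder name:

-- Only do primary-only work when this replica owns the database:
IF sys.fn_hadr_is_primary_replica('MyDatabase') = 1
BEGIN
    PRINT 'This replica is primary - safe to run primary-only maintenance here.';
END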

New Functions in User Databases/MSDB

  • fn_sysdac_get_currentusername
  • fn_sysdac_get_username
  • fn_sysdac_is_currentuser_sa
  • fn_sysdac_is_login_creator

New Stored Procedures in Master

  • sp_cci_tuple_mover
  • sp_db_ebcdic277_2
  • sp_db_enable_clustered_columnstores
  • sp_db_selective_xml_index
  • sp_MSgetgenstatus4rows
  • sp_xtp_merge_checkpoint_files
  • sp_set_cardinality_estimation_model_110 – This one’s only in the user databases, not master. It would imply that there’s a new cardinality estimation model. (Never exactly clear on what I’m allowed to say due to NDAs, so… I’ll just leave that there.)

New Fields in Existing Views

  • all_parameters – new field: is_nullable
  • all_sql_modules – new field: uses_native_compilation
  • availability_databases_cluster – new field: truncation_lsn
  • databases – new field: containment_desc
  • databases – new field: target_recovery_time_in_seconds
  • dm_exec_query_stats – new field: statement_sql_handle
  • dm_exec_query_stats – new field: statement_context_id
  • dm_exec_requests – new field: statement_sql_handle
  • dm_exec_requests – new field: statement_context_id
  • dm_hadr_database_replica_states – new field: low_water_mark_for_ghosts
  • dm_io_pending_io_requests – new field: io_handle_path
  • dm_logpool_stats – new field: total_pages
  • dm_logpool_stats – new field: private_pages
  • dm_os_buffer_descriptors – new field: is_in_bpool_extension
  • dm_os_memory_cache_entries – new field: time_to_generate
  • dm_os_memory_cache_entries – new field: use_count
  • dm_os_memory_cache_entries – new field: average_time_between_uses
  • dm_os_memory_cache_entries – new field: time_since_last_use
  • dm_os_memory_cache_entries – new field: probability_of_reuse
  • dm_os_memory_cache_entries – new field: value
  • dm_os_worker_local_storage – new field: filestream_address
  • dm_os_worker_local_storage – new field: qe_cc_address
  • dm_os_worker_local_storage – new field: xtp_address
  • dm_resource_governor_configuration – new field: max_outstanding_io_per_volume
  • dm_resource_governor_resource_pools – new field: min_iops_per_volume
  • dm_resource_governor_resource_pools – new field: max_iops_per_volume
  • dm_resource_governor_resource_pools – new field: read_io_queued_total
  • dm_resource_governor_resource_pools – new field: read_io_issued_total
  • dm_resource_governor_resource_pools – new field: read_io_completed_total
  • dm_resource_governor_resource_pools – new field: read_io_throttled_total
  • dm_resource_governor_resource_pools – new field: read_bytes_total
  • dm_resource_governor_resource_pools – new field: read_io_stall_total_ms
  • dm_resource_governor_resource_pools – new field: read_io_stall_queued_ms
  • dm_resource_governor_resource_pools – new field: write_io_queued_total
  • dm_resource_governor_resource_pools – new field: write_io_issued_total
  • dm_resource_governor_resource_pools – new field: write_io_completed_total
  • dm_resource_governor_resource_pools – new field: write_io_throttled_total
  • dm_resource_governor_resource_pools – new field: write_bytes_total
  • dm_resource_governor_resource_pools – new field: write_io_stall_total_ms
  • dm_resource_governor_resource_pools – new field: write_io_stall_queued_ms
  • dm_resource_governor_resource_pools – new field: io_issue_violations_total
  • dm_resource_governor_resource_pools – new field: io_issue_delay_total_ms
  • dm_xe_sessions – new field: session_source
  • master_files – new field: credential_id
  • parameters – new field: is_nullable
  • resource_governor_resource_pools – new field: min_iops_per_volume
  • resource_governor_resource_pools – new field: max_iops_per_volume
  • sql_modules – new field: uses_native_compilation
  • system_parameters – new field: is_nullable
  • system_sql_modules – new field: uses_native_compilation
  • table_types – new field: is_memory_optimized
  • tables – new field: is_memory_optimized
  • tables – new field: durability
  • tables – new field: durability_desc
  • xml_indexes – new field: xml_index_type
  • xml_indexes – new field: xml_index_type_description
  • xml_indexes – new field: path_id

New Fields in Existing Functions

  • dm_exec_cursors – new field: statement_sql_handle
  • dm_exec_cursors – new field: statement_context_id
  • dm_logpool_consumers – new field: log_consumer_ref_counter
  • fn_dblog – new field: Log Record
  • fn_dump_dblog – new field: Log Record
  • fn_get_audit_file – new field: audit_schema_version
  • fn_get_audit_file – new field: sequence_group_id

My Free Practical Performance Troubleshooting Webcast: 24 Hours of PASS

You’re a developer or DBA stuck with a database server that’s not going fast enough. You’ve got a hunch that the biggest bottleneck is inside the database server somewhere, but where?

In my 24 Hours of PASS live webcast, you’ll learn how to use a powerful script to identify what your SQL Server is waiting on, and then you can focus on the queries (or settings) that are holding you back.

You’ll learn:

  • What SQL Server waits on when it’s executing queries
  • How to query DMVs to identify your server’s bottlenecks
  • How to identify the most common wait types and their causes

Register here and join me live Wednesday, July 31st at 8AM Eastern, 7AM Central, and yes, that’s 5AM Pacific. (Quit complaining – you’re the one who chose to live out with the granola-eaters and the earthquakes.)

Put in your personal info and click Proceed, and then you get a list of checkboxes for each session. Check out the ones immediately after mine:

24-Hours-of-PASS

AWWW yeah, that’s some good free training – and that’s only the first four hours! Register now.

Optimize for… Mediocre?

Some query hints sound too good to be true. And, unfortunately, usually they aren’t quite as magical as they might seem.

Frustration with unpredictable execution times

People often learn about parameter sniffing when query execution times stop being predictable. Occasionally you’ll hear about a stored procedure taking much longer than normal, but the next time you look, it might be faster.

One of the causes of this uneven execution is parameter sniffing. Let’s take a look at how it works– and why it can actually be a very good thing!

Let’s start with two simple queries.

We work at a fine company named AdventureWorks. We occasionally need to query a list of distinct cities by State/Province. We get our list by running a simple query – here are two commonly run statements:

SELECT DISTINCT City FROM Person.Address WHERE StateProvinceID=80;
SELECT DISTINCT City FROM Person.Address WHERE StateProvinceID=79;

When we look at the execution plans for these, we can see that SQL Server executes them very differently. For each query, it uses statistics to estimate how many rows it’s going to get back. For the first query it estimates four rows, so it does a little nested loop to pull back the data. For the second query, it estimates it’s going to get 2,636 rows back, so it decides it’s worth it to scan a nonclustered index.

1-Literals-Plans

What if we’re using a stored procedure?

Let’s say we run this query often, and we create a stored procedure to handle our logic. Our stored procedure looks like this:

CREATE PROCEDURE dbo.GetCities
    @StateProvinceID int
AS
SELECT DISTINCT City
FROM Person.Address
WHERE StateProvinceID=@StateProvinceID;
GO

We execute the stored procedure with the same two values as before, and in the exact same order:

EXEC dbo.GetCities @StateProvinceID=80;
EXEC dbo.GetCities @StateProvinceID=79;
GO

We’re going to get the same execution plans, right?

Well, no. Instead we get:

2-ParameterSniffing-Plans

Hey, what the heck, SQL Server? When we ran this as simple queries using literal values (no parameters), you realized that one of these got a lot more rows than the other, and you created a special execution plan for each one. Now you’ve decided to use one plan as “one size fits all!”

If we right click on the second execution plan and scroll to the bottom, we can see what happened.

3-ShowMeXML

This execution plan was compiled for one value, but was run with another value. The parameter value was “sniffed” during the first run!

4-Compiled vs Runtime

This means that the execution plan for this query will vary, depending on what values were used the last time it was compiled. Periodically, recompilation may be triggered by all sorts of things, including changes in data distribution, index changes, SQL Server configuration changes or SQL Server restarts. Whatever parameters are used when the query is first called upon recompilation will heavily influence the shape of the plan.

Enter ‘Optimize for Unknown’

Many times, people get frustrated with unpredictable execution times. They hear about a feature that was introduced in SQL Server 2008, and they apply it to solve the problem. The feature is called ‘Optimize for Unknown’. Suddenly, hints are popping up on queries everywhere!

To use the feature, you plug this query hint into your stored procedure like this:

DROP PROCEDURE dbo.GetCities
GO
CREATE PROCEDURE dbo.GetCities
    @StateProvinceID int
AS
SELECT DISTINCT City
FROM Person.Address
WHERE StateProvinceID=@StateProvinceID
    OPTION (OPTIMIZE FOR UNKNOWN)
GO

Now, let’s run our queries again:

EXEC dbo.GetCities @StateProvinceID=80;
EXEC dbo.GetCities @StateProvinceID=79;
GO

Did we get great execution plans? Well, maybe not:

5-Optimize For Unknown-Plans

Maybe this is a decent plan for some values, but maybe it isn’t. The query that wanted the “small” nested loop plan did 216 reads in scanning a nonclustered index instead of the 10 it did when it had its ideal plan. If this was a query that ran really frequently, we might not want this plan, either. (Particularly if we had a much larger table).

‘Optimize for Unknown’ has blinders on

The ‘Optimize for Unknown’ feature follows the premise that trying to get a consistent execution time for a given set of parameters and re-using a stable execution plan is better than spending CPU to compile a special, unique flower of an execution plan every time a query runs. That is sometimes the case. But you should also know that it produces a pretty mediocre plan.

Let’s take a closer look at what it did.

For query 1, it estimated it was going to return 265 rows, when it actually returned 4:

6-OptimizeForUnknown-Query1-Estimate

For query 2, it estimated that it was going to return 265 rows, when it actually returned 2,636:

7-OptimizeForUnknown-Query2-Estimate

In both cases, it didn’t look at the value for @StateProvinceID that was passed in at all. It pretended it was unknown– just like we told it to. This means that when the plan was created, it didn’t customize it for the values as they were passed in for that execution.

Instead, SQL Server checked out the table and realized that we have an index on the StateProvinceID column. That index has associated statistics. It looked at the statistics to get a measure of how the values are distributed in the index – this is called the “density vector” – read more about it here in a great post by Benjamin Nevarez. It then multiplied the “density vector” by the number of rows in the table to create a generic estimate of how many rows might be returned for any sort of “average” value that was used to query the table.

Check it out– we can peek at the statistics header and the density vector:

8-DensityVector-StateProvinceID
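
The command behind that screenshot looks something like this – the index name here is the AdventureWorks default, so adjust it if yours differs:

-- Statistics header in the first result set, density vector in the second:
DBCC SHOW_STATISTICS ('Person.Address', 'IX_Address_StateProvinceID');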

The density vector for StateProvinceID is in the second result set, first row under “All density”– it’s 0.01351351. Multiply 0.01351351 * 19614 (the number of rows) and you get 265.05398514. And so the plan was created based on an “average” or “mediocre” estimate of 265 rows– which was very different than either of the queries we ran.
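
You can even reproduce that generic estimate yourself – a tiny sketch multiplying the density value from the screenshot above by the current row count:

-- Should come out around 265 when the table has 19,614 rows:
SELECT 0.01351351 * COUNT(*) AS generic_row_estimate
FROM Person.Address;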

Well, just how bad is a mediocre plan?

In this case, this isn’t terrible. I’m just running a few test queries against a small database.

But you could say the same thing about the parameter sniffing example above. The plans I got there weren’t terrible, either! In both cases SQL Server was able to re-use execution plans without creating a freshly compiled plan for each run. That’s good, because CPU is expensive: I don’t want to compile all the time.

But in a larger, more complex plan, ‘optimize for unknown’ may make execution times more consistent, but it may also produce a very inefficient plan. The “blind” style of estimate may not match ANY real values well, depending on how my data is distributed. I may end up with a plan which thinks it’s dealing with way more or less data than it’s actually handling. Both of these can present real problems with execution.

So what’s the ideal fix?

I’ve got good news and I’ve got bad news.

The bad news is that there is no single setting that will always produce the perfect execution plan without possibly having extreme adverse impacts on your performance. I can’t tell you “Just do X and it’ll always be fine.”

Instead, you should let parameter sniffing happen most of the time. It’s not a bug, it’s a feature. (Really!) Don’t default to using any query hints– just stay simple and let the optimizer figure it out.

You will sometimes find that stored procedures and other parameterized queries (run via sp_executesql or sp_prepare) have uneven execution times. In those cases, you need to figure out how often they’re executing and whether there’s a single plan that helps them the most. You may solve the problem with an ‘Optimize for Unknown’ hint, or you may use another hint. No matter what you do, you’ll be making a trade-off between performance gains and possible different performance pains. Cross each bridge when you come to it.
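
How do you find those uneven performers in the first place? One hedged starting point – my addition, and the thresholds are arbitrary, so tune them to taste – is to look for cached queries whose best and worst runtimes are wildly far apart:

SELECT TOP 20 qs.execution_count,
       qs.min_elapsed_time, qs.max_elapsed_time,
       st.text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE qs.execution_count > 10                        -- runs often enough to matter
  AND qs.max_elapsed_time > 10 * qs.min_elapsed_time -- wide swing: sniffing suspect
ORDER BY qs.max_elapsed_time DESC;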

What was the good news?

The bright side is that fixing these issues is fun! How queries are optimized and executed against different sets of data is super interesting to dive into. These little things mean that we’ll always need people to help make applications faster – and a little job security never hurt anyone.

Mirror Mirror, On the Server – Part 2 [Video]

Sometimes our identical twin turns out to be evil – and sometimes our SQL Server does, too. When we encounter data corruption, hardware failure, and OS errors, we need to fail over from our principal to our better mirrored twin. I’ll show you how to handle database mirroring failovers in both high safety and high performance setups in this week’s webcast:

Sample SQL Server Database Administrator (DBA) Job Descriptions

Hiring a DBA? Need to get a job description for the human resources folks? Here’s how to get started.

First, decide whether it’s a production or development DBA. Think of the database in terms of a fridge. When you run a restaurant, you need at least one (and probably several) refrigerators to keep your raw ingredients and your prepared dishes cold.

Your chefs rely on the fridges to get their jobs done. They have tons of training to pick the right ingredients to put in the fridge, prepare the food correctly, and know when to take things in & out of the fridge.

If your restaurant absolutely, positively cannot go down, you’ll end up hiring a handyman or facilities guy. He has to know how fridges work, and if a fridge can’t keep the food cold enough, he steps in to diagnose and fix it.

The chefs are your developers.

Development DBAs build your data archiving strategy.

When you have a LOT of chefs, you hire a development DBA to organize the fridge and clean it out. They don’t usually write code, but if they do, the code is inside the database – they’re not writing presentation-layer code in C# or Java.

The handyman or facilities guy is your production DBA. He’s more concerned about the back side of the fridge than the front side. He doesn’t do any cooking (coding).

They all work with the fridges, but the similarities end there. Small shops might indeed hire one guy to buy the food, put it in the fridge, cook it, and fix the fridge when it breaks. However, those shops aren’t going to win any awards for food quality, and when the fridge breaks, the cooking stops while he fixes the fridge.

Sample Production Database Administrator Job Description

The open source databases say information wants to be free as in beer.

This position’s job duties and responsibilities include:

  • Ensure all database servers are backed up in a way that meets the business’s Recovery Point Objectives (RPO) – see the sketch after this list for a starting-point check
  • Test backups to ensure we can meet the business’s Recovery Time Objectives (RTO)
  • Troubleshoot SQL Server service outages as they occur, including after-hours and weekends
  • Configure SQL Server monitoring utilities to minimize false alarms
  • As new systems are brought in-house, choose whether to use clustering, log shipping, mirroring, Windows Azure, or other technologies
  • Install and configure new SQL Servers
  • Deploy database change scripts provided by third party vendors
  • When performance issues arise, determine the most effective way to increase performance including hardware purchases, server configuration changes, or index/query changes
  • Document the company’s database environment
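
For that first bullet, here’s a hedged starting-point query a candidate might use to eyeball RPO compliance. It just lists the most recent backup finish time per database – it doesn’t distinguish full, differential, and log backups, which a real check would:

SELECT d.name,
       MAX(b.backup_finish_date) AS last_backup_finish
FROM master.sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
GROUP BY d.name
ORDER BY last_backup_finish; -- NULLs (never backed up) sort first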

To do a great job in this position, experience should include:

  • On-call troubleshooting experience with at least one production SQL Server for a year. You don’t have to be the only DBA or have DBA in your job description, but you should have been the one person that the company would call if the SQL Server service stopped working.
  • Finding DMV queries to answer questions about server-level performance
  • Using free tools like sp_Blitz® and sp_WhoIsActive to diagnose server reliability and performance issues

The following skills aren’t strictly necessary, but will make you a well-rounded candidate for bonus points:

  • Tuning T-SQL queries to improve performance
  • Troubleshooting hardware using tools like Dell OpenManage, HP System Insight Manager, and IBM Director

Sample Development Database Administrator Job Description

Ever since we had a BI developer get locked in the data warehouse, we’ve taken precautions.

This position’s job duties and responsibilities include:

  • Ensure that new database code meets company standards for readability, reliability, and performance
  • Each week, give developers a list of the top 10 most resource-intensive queries on the server and suggest ways to improve performance on each
  • Design indexes for existing applications, choosing when to add or remove indexes
  • When users complain about the performance of a particular query, help developers improve the performance of that query by tweaking it or modifying indexes
  • Conduct SQL Server lunch-and-learn sessions for application developers
  • Advise developers on the most efficient database designs (tables, datatypes, stored procedures, functions, etc)

To do a great job in this position, experience should include:

  • Writing and improving SQL Server T-SQL queries for at least a year. You may have technically had “C# Developer” or “Java Developer” on your job title, but you were known amongst the office as the go-to person for T-SQL questions.
  • Designing tables and picking datatypes
  • Using Profiler traces and other tools to find the most frequently run queries
  • Using free tools like sp_BlitzIndex® and DMV queries to answer questions about index usage

The following skills aren’t strictly necessary, but will make you a well-rounded candidate for bonus points:

  • On-call troubleshooting for SQL Server service outages
  • Deciding whether clustering, log shipping, mirroring, replication, etc are the right fit to solve a business problem

Things I Didn’t Include In These DBA Job Descriptions

You can always tell who’s using peer-to-peer replication.

If you’re using any of the following technologies, mention it in your job description so that the candidates know what to expect:

  • Failover clustering, SAN replication, and other high availability technologies
  • SQL Server merge, peer to peer, or transactional replication
  • LINQ, Entity Framework, NHibernate, or other ORMs
  • Service Broker
  • Analysis Services, Integration Services, or Reporting Services

There’s nothing wrong with having your production or development DBA work with those technologies, by the way – but they’re special technologies that require prominent placement in job descriptions.

Learn More: Our DBA Job Interview Q&A Kit

Good interviewing is not a talent: it’s a skill you build with study and practice. This set of online videos trains you to be a rockstar in SQL Server database administration job interviews by giving you practice video interviews where you can warm up with 100 real-world questions. You get:

  • 70 technical SQL Server DBA interview practice questions (and answers)
  • 10 “situational” DBA interview practice questions with strategies on how to answer
  • 20 questions for YOU to ask your prospective employer

Buy SQL Server Interview Training
18 months of access for $29

You’ll learn:

  • 5 things to always do in a DBA interview
  • 3 Rules of a good DBA resume: how to write your DBA story
  • Resume Anti-Patterns to avoid
  • 3 Ways to Get in Front of Hiring Managers
  • How to prepare for technical screenings and in-person interviews
  • 3 things to NEVER do in a DBA interview
  • How to handle a DBA job offer

Buy it now and get started! For $29, you can watch it for 18 months on your desktop, laptop, iPad, or iPhone.

The Index Your Queries will Never Request (A Clustered Index!)

When it comes to indexes, SQL Server is really helpful. It lets you see what indexes your queries are asking for, both in execution plans and in the missing index dynamic management views (“DMVs”). I like to look at the DMV missing index requests using sp_BlitzIndex®.

When you look at missing index requests, it’s important to remember one big thing: these missing index requests won’t ever ask for or recommend a specific clustered index.

Let’s take a look at what this might mean against a slightly modified version of the AdventureWorks2012 sample database.

Hey, we have a high value missing index!

Let’s say we run sp_BlitzIndex® and it diagnoses that we have a high value missing index.

--To diagnose a database, we would run:
EXEC dbo.sp_BlitzIndex
    @database_name='AdventureWorks'

sp_BlitzIndex® Missing Index Diagnosis - 01

We scroll to the right on this line to get more info and see that the index request has an overall “benefit” of over one million. That benefit is a made-up number – it’s a combination of the number of times the index could have been used, the percentage by which SQL Server thought it would help the queries generating the request, and the estimated “cost” of those requests. These factors are all multiplied together to help bubble up the biggest potentially useful requests.

sp_BlitzIndex® Missing Index Diagnosis - 02 - Estimated Benefit

We also see that the index request is really quite narrow. It only wants an index with a single key column on the “City” column! That seems really reasonable.

Scrolling over farther, we can see that this could have potentially been used about 49 thousand times. The queries that could have used it (it’s quite possibly more than one), were rather cheap– their costs were less than one on average– and it would have improved those queries a whole lot (around 93% overall). We currently don’t have ANY nonclustered indexes on the table in question, so we don’t have to worry about creating a duplicate nonclustered index.

sp_BlitzIndex® Missing Index Diagnosis - 03 uses and other indexes

This index seems really reasonable. It would be used a lot, and it would help those queries out.

If we keep going to the right, we see that there’s some sample TSQL to create the suggested index at the far right:

    CREATE INDEX [ix_Address_City] ON [AdventureWorks].[Person].[Address] ([City]);

Sure enough, that syntax will create a nonclustered index. That seems really good, but hold on a second!

Always look at the whole table

Just to the left of the “Create TSQL” column is a really important helper– the “More Info” column.

sp_BlitzIndex® Missing Index Diagnosis - 04 more info

Copy the command from the appropriate row out. It will look something like this:

 EXEC dbo.sp_BlitzIndex
    @database_name='AdventureWorks',
    @schema_name='Person',
    @table_name='Address'

This command will help you look at the whole table itself and assess if there might be something you’re missing.

When you run this command, you’ll see a big picture view of the indexes on the table, the missing index requests for the table, and the number and types of the columns in the table:

sp_BlitzIndex® Missing Index Diagnosis - 05 table detail overview

In this table, note that we don’t even have a clustered index. This table is a heap! Heaps have all sorts of wacky problems in SQL Server.

sp_BlitzIndex® Missing Index Diagnosis - 05 table detail- heap

In this case, we have an OLTP database and we definitely want to avoid heaps to keep our sanity.

Our queries are requesting an index on the City column, but it looks an awful lot like our table was modeled with another clustered index in mind (AddressID). Don’t just assume that the missing index request itself will always make the best clustered index. Look at your overall workload and the queries that use the table. Based on the whole workload and overall schema, decide what the clustered index should be, whether you have a primary key, and whether the primary key is the same as the clustered index or different. After you have that design, then make decisions for your nonclustered indexes.
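
If the workload review confirms AddressID as the natural clustered key, the fix might look something like this sketch. The constraint and index names are my own, it assumes the demo table has no primary key yet, and building the clustered index rewrites the heap, so run it in a maintenance window:

-- Cluster on the narrow, ever-increasing key the table was modeled around:
ALTER TABLE Person.Address
    ADD CONSTRAINT PK_Address_AddressID PRIMARY KEY CLUSTERED (AddressID);

-- Then add the nonclustered index the missing index DMVs asked for:
CREATE NONCLUSTERED INDEX ix_Address_City
    ON Person.Address (City);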

The clustered index is special

The clustered index in any table has special uses and significance. This index is the data itself, and it will be used in every nonclustered index in the table. If you are defining a new clustered index or changing a clustered index, SQL Server will need to do IO on every nonclustered index on the table as well. Always make sure to test your indexing changes and choose your index definitions carefully.

WANT TO BE AN INDEXING PRO? WE’VE GOT TRAINING!

For $299 you can get 18 months of access on your desktop, laptop, iPad, or iPhone to more than six hours of video training on SQL Server indexes, chock full of demo scripts and quizzes. Get more info or buy it now!

The @DBAreactions Guide to Database Administration [Video]

Sometimes, the best stories are the horror stories. Join Brent as he talks about some of the worst situations he’s seen in his years of database administration. He’ll share the inspiration behind some of his favorite entries at http://DBAreactions.tumblr.com. We’ll either laugh or cry. Or both.

Liked that webcast? We’ve got many more coming up – just check the boxes and put in your email address.
