Blog

You know how to write TSQL queries, but do you know how to make them faster? In this free 30 minute talk, Kendra Little will cover the basics of clustered and non-clustered indexes in SQL Server. She’ll show how you can tell which indexes your queries are using and show warning signs that mean perhaps your query needs a re-write. If you’ve ever wanted to learn how you can use indexes to make your queries run faster, this session is for you.

Want the scripts from the webcast? Scroll on down, the code is below the video.

http://youtu.be/wAhr91FBlx4

Sample Databases from Microsoft

The sample code below uses the Contoso Retail DW database, but you can also play around with AdventureWorks. Here are where you can grab ‘em:

Demo Code

<br />--------------------------------------------------<br />-- Demo script copyright Brent Ozar Unlimited®, 2012<br />-- This script is ***NOT*** suitable for production use!<br />-- The script contains commands that could significantly<br />-- harm/change performance, so use on a dedicated test<br />-- instance only, and always research any commands<br />-- you aren't familiar with.<br />--------------------------------------------------<br /><br />--Set up the script;<br />SET NOCOUNT OFF;<br />SET STATISTICS IO OFF;<br />SET STATISTICS TIME OFF;<br />GO<br /><br />--Restore the db if needed<br />--This is restoring to my rotational drive<br />--Time estimate: 30 seconds<br />USE [master]<br />GO<br />IF DB_ID('ContosoRetailDW') IS NOT NULL<br /><%%KEEPWHITESPACE%%>	BEGIN<br /><%%KEEPWHITESPACE%%>		ALTER DATABASE ContosoRetailDW SET SINGLE_USER WITH ROLLBACK IMMEDIATE;<br /><%%KEEPWHITESPACE%%>	END<br />RESTORE DATABASE [ContosoRetailDW] FROM  DISK = N'S:_INSTALLContosoRetailDW.bak'<br /><%%KEEPWHITESPACE%%>	WITH<br /><%%KEEPWHITESPACE%%>		MOVE N'ContosoRetailDW2.0' TO N'T:MSSQLDataContosoRetailDW.mdf',<br /><%%KEEPWHITESPACE%%>		MOVE N'ContosoRetailDW2.0_log' TO N'T:MSSQLDataContosoRetailDW.ldf',<br /><%%KEEPWHITESPACE%%>		REPLACE, NOUNLOAD,  STATS = 5<br />GO<br /><br />-- Run single threaded. This is just to simplify the demo--<br />-- We don't have time in 30 minutes to cover the index examples<br />-- AND also all the impacts of parallelism on query cost and execution.<br />-- This is not a production configuration recommendation!<br />EXEC sp_configure 'max degree of parallelism', 1;<br />GO<br />RECONFIGURE<br />GO<br /><br />--- Let's clear data out of our SQL Server's memory.<br />--- This is NOT a safe command for production servers!<br />--- We use this in the test so it causes a "fair playing field"<br />--- Otherwise we'd read from disk the first time, then read<br />--- From memory on subsequent runs-- which wouldn't be fair.<br />DBCC DROPCLEANBUFFERS;<br />GO<br /><br />-- Get into our test database.<br />USE ContosoRetailDW;<br />GO<br /><br />--- Let's say we're running this query.<br />--- We need to know total sales by day<br />--- Just for one store<br />--- For seven days<br />--- And this is just a sample of our data!<br />SELECT	CAST(datekey AS DATE) AS [Date of Sales],<br /><%%KEEPWHITESPACE%%>		SUM(salesQuantity) AS [Sum of SalesQuantity],<br /><%%KEEPWHITESPACE%%>		SUM(SalesAmount) AS [Sum of SalesAmount]<br />FROM	dbo.FactOnlineSales AS fs<br />WHERE	StoreKey = 306<br /><%%KEEPWHITESPACE%%>		AND datekey BETWEEN '2009-11-01' AND '2009-11-07'<br />GROUP BY CAST(datekey AS DATE)<br />ORDER BY [Date of Sales] DESC;<br />GO<br /><br />----------------------------------<br />--Question 1:<br />--Why did that take so long?<br />----------------------------------<br />--- Let's clear data out of our SQL Server's memory.<br />--- This is NOT a safe command for production servers!<br />DBCC DROPCLEANBUFFERS;<br />GO<br /><br />--These commands tell SQL Server to report back on how my query runs.<br />--Results will be on the 'Messages' tab.<br />SET STATISTICS IO ON;<br />SET STATISTICS TIME ON;<br />GO<br /><br />-- Also turn on "actual execution plans"<br />-- The execution plan will show up in its own tab.<br />-- This can be done with ctrl+m<br /><br />SELECT	CAST(datekey AS DATE) AS [Date of Sales],<br /><%%KEEPWHITESPACE%%>		SUM(salesQuantity) AS [Sum of SalesQuantity],<br /><%%KEEPWHITESPACE%%>		SUM(SalesAmount) AS [Sum of SalesAmount]<br />FROM	dbo.FactOnlineSales AS fs<br />WHERE	StoreKey = 306<br /><%%KEEPWHITESPACE%%>		AND datekey BETWEEN '2009-11-01' AND '2009-11-07'<br />GROUP BY CAST(datekey AS DATE)<br />ORDER BY [Date of Sales] DESC;<br />GO<br /><br />--- How much did we read?<br />-- What does the execution plan tell us?<br /><br />-- SQL Server says this is the ideal index for THIS query:<br />USE [ContosoRetailDW]<br />GO<br />CREATE NONCLUSTERED INDEX [&lt;Name of Missing Index, sysname,&gt;]<br />ON [dbo].[FactOnlineSales] ([StoreKey],[DateKey])<br />INCLUDE ([SalesQuantity],[SalesAmount])<br />GO<br /><br />-- But do we have to have that exact index?<br />-- First, let's look at all the indexes on the table:<br /><br />EXEC sp_helpindex 'dbo.factOnlineSales'<br /><br />-- OK, well we're starting fresh!<br />-- I talk my DBA into creating ONE very narrow index.<br />-- In the real world, we won't always have the perfect index for *every* query--<br />-- And there is very good reason for that!<br /><br />-- My DBA wants the index to start with DateKey, and they won't agree to all my includes.<br /><br />CREATE NONCLUSTERED INDEX [ix_FactOnlineSales_DateKeyStoreKey]<br />ON [dbo].[FactOnlineSales] ([DateKey],[StoreKey]) WITH (MAXDOP=2);<br />GO<br /><br />-- What happens to our query?<br />-- Will it use the not-so-perfect index?<br />-- Turn on actual execution plans: CTRL+M<br /><br />--- This is NOT a safe command for production servers!<br />DBCC DROPCLEANBUFFERS;<br />GO<br /><br />SELECT	CAST(datekey AS DATE) AS [Date of Sales],<br /><%%KEEPWHITESPACE%%>		SUM(salesQuantity) AS [Sum of SalesQuantity],<br /><%%KEEPWHITESPACE%%>		SUM(SalesAmount) AS [Sum of SalesAmount]<br />FROM	dbo.FactOnlineSales AS fs<br />WHERE	StoreKey = 306<br /><%%KEEPWHITESPACE%%>		AND datekey BETWEEN '2009-11-01' AND '2009-11-07'<br />GROUP BY CAST(datekey AS DATE)<br />ORDER BY [Date of Sales] DESC;<br />GO<br /><br />--- Hmm. Well, what if I forced it? I can use an INDEX HINT!<br />--- (is this a good idea???)<br />--- This is NOT a safe command for production servers!<br />DBCC DROPCLEANBUFFERS;<br />GO<br /><br />SELECT	CAST(datekey AS DATE) AS [Date of Sales],<br /><%%KEEPWHITESPACE%%>		SUM(salesQuantity) AS [Sum of SalesQuantity],<br /><%%KEEPWHITESPACE%%>		SUM(SalesAmount) AS [Sum of SalesAmount]<br />FROM	dbo.FactOnlineSales AS fs WITH ( INDEX=ix_FactOnlineSales_DateKeyStoreKey )<br />WHERE	StoreKey = 306<br /><%%KEEPWHITESPACE%%>		AND datekey BETWEEN '2009-11-01' AND '2009-11-07'<br />GROUP BY CAST(datekey AS DATE)<br />ORDER BY [Date of Sales] DESC;<br />GO<br /><br />-- Bad news: Index hints are risky business over time.<br />-- Is there a better way?<br />-- Let's look at those indexes again.<br /><br />EXEC sp_helpindex 'dbo.factOnlineSales';<br />GO<br /><br />-- There's a secret about the OnlineSalesKey column<br />-- It is the clustering key, so it's automatically in my nonclustered index.<br />-- I can use the knowledge to rewrite my query.<br /><br />--- This is NOT a safe command for production servers!<br />DBCC DROPCLEANBUFFERS;<br />GO<br /><br />WITH	mykeys<br /><%%KEEPWHITESPACE%%>		  AS ( SELECT	OnlineSalesKey<br /><%%KEEPWHITESPACE%%>			   FROM		dbo.FactOnlineSales AS fs<br /><%%KEEPWHITESPACE%%>			   WHERE	datekey BETWEEN '2009-11-01' AND '2009-11-07'<br /><%%KEEPWHITESPACE%%>						AND StoreKey = 306)<br /><%%KEEPWHITESPACE%%>	SELECT	CAST(datekey AS DATE) AS [Date of Sales],<br /><%%KEEPWHITESPACE%%>			SUM(salesQuantity) AS [Sum of SalesQuantity],<br /><%%KEEPWHITESPACE%%>			SUM(SalesAmount) AS [Sum of SalesAmount]<br /><%%KEEPWHITESPACE%%>	FROM	dbo.FactOnlineSales AS fs<br /><%%KEEPWHITESPACE%%>			JOIN mykeys ON fs.OnlineSalesKey = mykeys.OnlineSalesKey<br /><%%KEEPWHITESPACE%%>	GROUP BY CAST(datekey AS DATE);<br />GO<br /><br />-- What if I used a temp table instead?<br /><br />--- This is NOT a safe command for production servers!<br />DBCC DROPCLEANBUFFERS;<br />GO<br /><br />IF OBJECT_ID('tempdb..#mykeys') IS NOT NULL<br /><%%KEEPWHITESPACE%%>	DROP TABLE #myKeys;<br />CREATE TABLE #myKeys<br /><%%KEEPWHITESPACE%%>	(<br /><%%KEEPWHITESPACE%%>	  onlinesaleskey INT<br /><%%KEEPWHITESPACE%%>	);<br /><br />INSERT	#myKeys ( onlinesaleskey )<br /><%%KEEPWHITESPACE%%>		SELECT	OnlineSalesKey<br /><%%KEEPWHITESPACE%%>		FROM	dbo.FactOnlineSales AS fs<br /><%%KEEPWHITESPACE%%>		WHERE	datekey BETWEEN '2009-11-01' AND '2009-11-07'<br /><%%KEEPWHITESPACE%%>				AND StoreKey = 306;<br /><br />SELECT	CAST(DateKey AS DATE) AS [Date of Sales],<br /><%%KEEPWHITESPACE%%>		SUM(salesQuantity) AS [Sum of SalesQuantity],<br /><%%KEEPWHITESPACE%%>		SUM(SalesAmount) AS [Sum of SalesAmount]<br />FROM	dbo.FactOnlineSales AS fs<br /><%%KEEPWHITESPACE%%>		JOIN #mykeys ON fs.OnlineSalesKey = #mykeys.OnlineSalesKey<br />GROUP BY CAST(DateKey AS DATE)<br />ORDER BY [Date of Sales] DESC;<br />GO<br /><br />-- Which of these is better?<br />-- I have a definite preference: watch the webcast to find out why.<br />
↑ Back to top
  1. First, thanks for the redirect to the Paul White post on temp tables vs table variables – that was my question too. Second, thanks for your guys dedication to the community. These webcasts are great. I started this one thinking it was going to be a review… but as always learned some really good new ideas. Keep it up.

  2. Thanks Kendra.

  3. Pingback: Something for the Weekend - SQL Server Links 12/10/12

  4. Thanks for this informative tutorial. I have one question and looks like I am confused here.

    When you do the DBCC dropcleanbuffers before using the index hint at a point in the demo, the results show high logical reads and low physical reads. I am confused because if the memory is clean, the first time it should show high physical reads because the pages needed to be picked from the disk but the high logical reads show something opposite as I thought.

    PS: Sorry if I am asking something stupid.

    Thanks
    Chandan

    • Not a stupid question at all!

      With just 30 minutes I had a hard choice– I could either show less things and include the step of tracking/comparing the results better, or cover more things and not do as good a job comparing them. In this case I picked the second one, and I think that can make it a bit confusing.

      When I am doing this in real life, I like to keep a scorecard (and I also testing things multiple times, etc of course)

      Here’s an example the scorecard for the queries. (I reran the queries so the exact physical read vs readahead count may not match 100% with the video, as these can vary. I haven’t included parse and compile CPU time in these– just execution time).

      Let me know if this doesn’t answer your question 100%. The logical reads are going up because of what the nested loop operator is doing in those plans.

      1) First query with no modifications
      logical reads 46535, physical reads 1, read-ahead reads 43346
      CPU time = 1578 ms

      2) Query after index is created, but with no hint or query changes. (Index is not used.)
      logical reads 46535, physical reads 2, read-ahead reads 46524
      CPU time = 1750 ms

      3) Query after index hint is used to force the index
      logical reads 74777, physical reads 19, read-ahead reads 282
      CPU time = 188 ms

      4) Query after rewritten with CTE
      logical reads 74797, physical reads 19, read-ahead reads 263
      CPU time = 172 ms

      5) Query after rewritten with temp table
      logical reads 261, physical reads 1, read-ahead reads 258
      logical reads 74417, physical reads 19, read-ahead reads 16
      CPU time = 31 ms
      CPU time = 187 ms

  5. Thanks Kendra. My question was based on the high number of logical reads when during the first time you cleaned the buffers because I was surprised to see reads from memory where there was not data yet.
    As per your answer, first the index or data pages are read from the disc but then due to nested loop,these pages are read again from memory hence the high number of logical reads.
    Am I correct in the interpretation?

    Thanks
    Chandan

  6. Hi Kendra, I loved the video. I dug the ending part where you rewrote the CTE as a series of simpler statements. I like the philosophy of it being easier to tune several smaller statements rather than a single large, complex one.

  7. Why did the SQL Server not use the non clustered index automatically the first time it was created ?

    Why do we have to try the tricks ?

    Is it because we are having SUM(Other columns which are not included in non clustered)

    So is it safe to say, “when u retrieve columns which are not part of non clustered index, the sql server may or may not use the create non clustered index”

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

css.php