“How many rows exist in a table?”
It seems like such an innocent request, and it isn’t hard to answer with SQL Server. But before you open SSMS and whip out a quick query, understand that there are multiple methods to get this information – and none of them is perfect!
COUNT(*) or COUNT(1)
The seemingly obvious way to get the count of rows from the table is to use the COUNT function. There are two common ways to do this – COUNT(*) and COUNT(1). Let’s look at COUNT(*) first.
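If you want to follow along and see the same I/O numbers shown below, turn on STATISTICS IO in your session first:

SET STATISTICS IO ON;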
SELECT COUNT(*) FROM dbo.bigTransactionHistory;
The STATISTICS IO output of this query shows that SQL Server is doing a lot of work! Over 130,000 logical reads – plus physical reads and read-ahead reads – are needed to satisfy this query.
(1 row(s) affected)
Table 'bigTransactionHistory'. Scan count 5, logical reads 132534, physical reads 3, read-ahead reads 131834, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Looking at the execution plan, we can see an Index Scan returning over 31 million rows. This means that SQL Server is reading every row in the index, then aggregating and counting them – finally arriving at our result set. The cost of this query? 123.910000.
The query results: 31,263,601 rows.
Now, let’s look at the behavior of COUNT(1).
SELECT COUNT(1) FROM dbo.bigTransactionHistory;
We can see from STATISTICS IO that we again have a large number of logical reads – over 130,000.
(1 row(s) affected)
Table 'bigTransactionHistory'. Scan count 5, logical reads 132531, physical reads 3, read-ahead reads 131849, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The execution plan again shows an index scan returning over 31 million rows for processing. The query cost is the same, 123.910000.
The results here are the same – 31,263,601 rows.
The benefit of using COUNT is that it is an accurate indicator of exactly how many rows exist in the table at the time query processing begins. However, as the table is scanned, locks are being held. This means that other queries that need to access this table have to wait in line. This might be acceptable on an occasional basis, but I frequently see applications issuing these types of queries hundreds or thousands of times per minute.
sys.tables + sys.indexes + sys.partitions
We can also join several SQL Server catalog views to count the rows in a table or index. sys.tables returns the objects that are user-defined tables; sys.indexes returns a row for each index on a table; and sys.partitions returns a row for each partition in a table or index. I am going to query for the table ID, name, and count of rows in all partitions.
DECLARE @TableName sysname;
SET @TableName = 'bigTransactionHistory';

SELECT TBL.object_id, TBL.name, SUM(PART.rows) AS rows
FROM sys.tables TBL
INNER JOIN sys.partitions PART ON TBL.object_id = PART.object_id
INNER JOIN sys.indexes IDX ON PART.object_id = IDX.object_id
    AND PART.index_id = IDX.index_id
WHERE TBL.name = @TableName
    AND IDX.index_id < 2 -- heap or clustered index only, so rows aren't double-counted
GROUP BY TBL.object_id, TBL.name;
The output of STATISTICS IO here shows far fewer reads – 15 logical reads total.
(1 row(s) affected)
Table 'syssingleobjrefs'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysidxstats'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysrowsets'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The execution plan is more complex, but much less work – the query cost here is 0.0341384.
The results of the query are also the same – 31,263,601 rows.
The benefits of using this method are that the query is much more efficient, and it doesn’t lock the table you need the count of rows for.
However, you need to be cautious when counting the rows on a table that is frequently inserted into or deleted from. The TechNet documentation for sys.partitions.rows says it indicates the “approximate number of rows for this partition”. How approximate? That information isn’t documented. Understand, though, that if you use this method, you potentially sacrifice up-to-the-moment accuracy for performance.
sys.dm_db_partition_stats
A third option is to use the dynamic management view sys.dm_db_partition_stats. It returns one row per partition for an index.
DECLARE @TableName sysname;
SET @TableName = 'bigTransactionHistory';

SELECT OBJECT_NAME(object_id), SUM(row_count) AS rows
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID(@TableName)
    AND index_id < 2
GROUP BY OBJECT_NAME(object_id);
The STATISTICS IO output of this query is even lower – this time, only two logical reads are performed.
(1 row(s) affected)
Table 'sysidxstats'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The execution plan is less complex than our second example involving the three system views. This query also has a lower cost – 0.0146517.
The query results are the same as the previous examples – 31,263,601 rows.
Using this DMV has the same benefits as the system views – fewer logical reads and no locking of the target table. The query is also simpler to write, involving only one object.
But again, the TechNet documentation for sys.dm_db_partition_stats.row_count says it is “the approximate number of rows in the partition”, and when this information is updated is not documented. Here, you are also potentially sacrificing accuracy for performance.
Time to do some digging
The questions that you need to work with the business to answer are, “How up-to-date must the row count be? What is the business purpose? How often do you insert into or delete from that table, and how often do you count the rows?” If the accuracy of the row count is crucial, work to reduce the amount of updates done to the table. If performance is more important, and the row count could be approximate, use one of the system views.
There will come a time when your database backups are not as fast as you would like them to be. Over time, the volume of data you’re storing grows, and backups take longer – but your weekly or daily maintenance windows don’t grow with it. Backups are also resource-intensive, consuming I/O and network bandwidth – and if they’re running while users are trying to work, the users may notice the performance impact.
Here are two ways to make your backups go faster.
Option A: Full and Differential backups
A typical backup scenario that I see is a full backup daily and, if the database is in Full recovery model, log backups throughout the day. But what happens when that full backup begins to take hours, instead of minutes? A differential backup may help.
A full backup contains all the data in the database, and enough of the log to recover it. A differential backup contains only the extents – groups of eight 8-KB data pages – that have changed since the last full backup. Because a differential is usually smaller than a full backup, it can take less time to complete.
The commands to run the backups are very similar. A full backup:
BACKUP DATABASE AdventureWorks2012
TO DISK = N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full.bak';
To perform a differential backup, you add the WITH DIFFERENTIAL option:
BACKUP DATABASE AdventureWorks2012
TO DISK = N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Diff.bak'
WITH DIFFERENTIAL;
Combining full and differential backups throughout the day or week can reduce the time you spend waiting for backups to complete. There are two scenarios I usually see. The first is a full backup one or two days per week, with differential backups on the other days, and regular transaction log backups. This makes sense when a full backup every day would be too time-consuming. The second is a full backup daily, differentials every few hours, and regular log backups. This usually makes sense when fast recovery is necessary, and it would be better to be able to restore a full backup, a differential backup, and a subset of log backups – rather than the full backup and all log backups for the day.
This can add a layer of complexity to a restore strategy – instead of restoring the last full backup and the subsequent log backups, the most recent differential must be accounted for as well. You should practice this restore sequence so you understand the steps involved. You also need to monitor the size of the differential backups closely – depending on the rate of change in the database, you may not be saving as much space as hoped.
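For example, here’s a minimal sketch of that restore sequence, using the full and differential backup files from the commands above plus a hypothetical log backup file name – the full and the most recent differential are restored WITH NORECOVERY, then any log backups taken after the differential, and the database is finally brought online WITH RECOVERY:

RESTORE DATABASE AdventureWorks2012
FROM DISK = N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full.bak'
WITH NORECOVERY;

RESTORE DATABASE AdventureWorks2012
FROM DISK = N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Diff.bak'
WITH NORECOVERY;

-- hypothetical log backup taken after the differential
RESTORE LOG AdventureWorks2012
FROM DISK = N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Log.trn'
WITH NORECOVERY;

RESTORE DATABASE AdventureWorks2012 WITH RECOVERY;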
However, when the window to run backups without impacting your users is shrinking while the database keeps growing, the full + differential strategy can be used effectively.
Option B: Backing up to multiple files
The standard database or log backup is done to one backup file:
BACKUP DATABASE AdventureWorks2012
TO DISK = N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_20140109.bak'
WITH NAME = 'AdventureWorks2012 Full One File';
When a backup is written to one file, one thread is assigned. You can back up to multiple files – in one or more locations – and multiple threads will be used. This can make a backup take less time – sometimes significantly. Here’s the command to create a full backup across four disks:
BACKUP DATABASE AdventureWorks2012
TO DISK = N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_1__20140109.bak',
   DISK = N'F:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_2__20140109.bak',
   DISK = N'G:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_3__20140109.bak',
   DISK = N'H:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_4__20140109.bak'
WITH NAME = 'AdventureWorks2012 Full 4 Files';
Having multiple backup files can make a restore more complicated. All of the files must be available, and all must be referenced in the restore session. To restore the backup files created above, use the following command:
RESTORE DATABASE AdventureWorks2012
FROM DISK = N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_1__20140109.bak',
     DISK = N'F:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_2__20140109.bak',
     DISK = N'G:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_3__20140109.bak',
     DISK = N'H:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_4__20140109.bak'
WITH NORECOVERY;

RESTORE DATABASE AdventureWorks2012 WITH RECOVERY;
However, the benefits can be significant. I performed five tests – one file on one disk, two files on one disk, four files on one disk, two files on two disks, and four files on four disks. The results show that adding more disks decreases the backup time.
Note: these tests don’t cover all scenarios. If you have a group of disks in a RAID 5 or RAID 10 array, multiple threads writing to multiple files could be faster than one thread writing to one file. Lesson learned: test backing up to multiple files in your environment to determine what is best for you!
Backups Are Your Best Friend
And just like your best friend, you want to spend some time with them on a regular basis, getting to know them better. Learning tricks like the two I’ve given you here can help you when you face a situation where you need to improve performance. Having experience with these options – even if you try them in a sandbox environment or test them on AdventureWorks – is a huge asset.
Want to know even more about backups, restores, and cool things you can do with them? Check out my Backup & Recovery Step by Step training!
There are a lot of things to think about. What recovery model is best for your database? How can you automate backups? What steps need to be taken to restore a database that had full and log backups taken?
To help answer some of the most common questions, I’ve put together a 115-minute, five-module course on Backup and Recovery, Step by Step!
Problems My Training Will Solve
You’re new to this whole SQL Server thing. You’ve heard people say you need to be backing up your data, but you aren’t sure where to start. This training will guide you step by step through the process.
You’re the “reluctant DBA” and you know you have regular full and log backups. However, you’ve never had to restore a series of them to solve a production issue. Yet. The time to learn is not when there is a problem. Learn the steps now and be prepared when you need them!
You’ve heard of the different recovery models, like simple, full, and bulk-logged, but you aren’t sure what the differences are and when they are appropriate. All of these will be explained.
Start here if you’re interested in learning more!
The year: 2005.
What was happening with the Brent Ozar Unlimited crew?
I was working on the help desk for a small company while attending Fox Valley Tech College to earn my associate degree. I think I still wanted to go into programming at that point in time! I’d never been to a user group meeting or posted on a forum. I wasn’t even a runner!
Brent took his first full-time DBA gig at Southern Wine & Spirits. He worked on his first cluster and his first 1 TB+ database. He’d never written about Perfmon or attended a conference.
Kendra worked for a dot-com in Seattle. She spent a lot of time with 32-bit servers, automating installs, restores, and configuration of SQL Server for large environments.
Doug was working on a VB6 app for insurance adjusters. At home he had a smokin’ PC with a 1.0 GHz AMD Athlon and a 19” CRT monitor. He and his wife didn’t have kids yet, and he surprised her with a trip to Venice for Christmas.
Jeremiah was (don’t laugh) a developer. While sitting on the bench, he decided to revise the company’s project management software. He rapidly prototyped a new system using SQL Server instead of Access and the .NET Framework with an ORM instead of ad hoc SQL written in VBScript and classic ASP.
The Technology World
Star Wars Episode III was released. YouTube was launched. The Xbox 360 was released.
And, at long last, Microsoft released SQL Server 2005. It had been a long five years since SQL Server 2000, and there were drastic changes between the two versions. DMVs (dynamic management views), CLR integration, hot-add memory, ranking functions, and the XML data type were introduced. Database Mirroring, a new HA feature, was fully supported as of SP1. SQL Server Management Studio: need I say more? These were big changes!
Jump in the time machine. 9 years later, I’m writing this blog post. It’s January, and a new year is beginning.
I’ve finished college, I’ve moved up from the help desk (way up), I’ve been to – and spoken at – a couple user group meetings just in the last month, and I’ve run a couple marathons.
Brent? He’s done OK. He’s attained Microsoft Certified Master (MCM) status in SQL Server. He’s gone from DBA to software evangelist to consultant. He’s spoken all over the world.
Kendra has also attained MCM status and become a consultant. She’s learned a lot about hardware, query tuning, and when it pays to upgrade your environment instead of sinking countless people-hours into solving a problem.
Doug is the newest member of the Brent Ozar Unlimited crew. He’s spent his 9 years learning everything he knows about SQL Server, and becoming a blogger, presenter, and user group leader.
Jeremiah used his software project to set the course for his career. He learned he had a knack for databases, and he ran with it. He too has become a successful consultant, blogger, and speaker.
Technology? It’s changed a bit, too.
Our iPads are now as powerful as Doug’s PC was back in the day. The Xbox One and PlayStation 4 launched this past year. Instead of carrying around a phone, a laptop, a camera, and a scanner, we carry one smartphone that does everything.
SQL Server, 9 Years, and You
SQL Server has changed dramatically, too!
SQL Server 2012 has been out for well over a year, and SP1 has already been released. This release brought some big changes. Internally, the way memory is managed was changed, improving operations and efficiency. T-SQL gained new functions, including more windowing functions. AlwaysOn Availability Groups were introduced as the latest HA/DR technology. Extended Events, now with a GUI, is ready to take over Profiler’s job. Columnstore indexes were introduced to make data warehouse storage and retrieval more efficient.
What’s next? We’re awaiting word on a release date for SQL Server 2014. This release is going to have the usual improvements, and then some. There’s a new method for cardinality estimation. A new engine is being introduced – In-Memory OLTP. Backups can be done directly to “the cloud” in Windows Azure. Clustered columnstore indexes will be updateable. There’s more – check out my recent webcast!
Let Me Ask You a Question
And let’s focus on your job. Are you still using the same laptop you did in 2005? If you’re on call, are you still using the same phone – or, help us all, the same pager – you had 9 years ago? Has your company’s main business application undergone change since then?
You have a newer laptop. You have a newer phone. The applications have been updated, or changed entirely. Why are you still on SQL Server 2005?
Yes, the process of testing a database upgrade is time-consuming. But it’s well worth it. The changes to the internals and the new features available are nothing but positive. And let’s not forget that in just over two years – Microsoft is currently stating April 12, 2016 – extended support for SQL Server 2005 ends.
Start making plans to upgrade today, and reap the benefits!
It’s coming…the next version of SQL Server. What surprises are in store for you? “In-memory” is a hot buzzword. Windows Azure capabilities are expanding. Data Explorer is being talked about. Jes will give you an idea of some of the new features and enhancements coming next!
Get an overview of new backup capabilities, integration with Windows Azure, new capabilities in Resource Governor, incremental statistics, and an overview of In-Memory OLTP (the artist formerly known as Hekaton)!
One question came up several times: which features will be Enterprise Edition only, and which will be available in all editions? As of today, Microsoft hasn’t released that information – but keep an eye out for “Features Supported by the Editions of SQL Server 2014” on msdn.microsoft.com.
It was a dark and stormy… Oh, wrong story. It was actually a warm, sunny afternoon in Charlotte, NC. I was presenting “Index Methods You’re Not Using” at PASS Summit. In this talk, I discussed how indexed views, filtered indexes, and compressed indexes can improve your query performance by reducing I/O.
From stage right, an intrepid audience member raised his hand and asked, “Can you think of an example of when you would use filtered indexes instead of partitioning?”
This question left me speechless (temporarily).
My first thought was one of requirements and practicality: “What if you have Standard Edition? You can’t use table partitioning, but you could create a filtered index,” I said.
The better question would have been, “What are you trying to accomplish?”
On the surface, filtered indexes and partitioning may appear similar: both are designed to hold only portions of a table’s data in separate structures. However, how they go about doing this, the use cases for each, and the requirements for each are very different.
Filtered indexes store only a portion of the data in a table. For example, in an orders table, instead of storing all orders in a nonclustered index on the quantity column, we could create a filtered index that stores only orders with a quantity greater than or equal to 20, or only orders with a quantity between 10 and 19. In this case, the only column stored in the index leaf pages is the quantity (plus the clustered index key).
Sample of data from the table
What a nonclustered index on Quantity would store
What a nonclustered index on quantity, filtered to include only quantities between 10 and 19, would store
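Here’s a sketch of what that filtered index could look like, assuming a hypothetical dbo.Orders table with a Quantity column (the index name is made up, too):

CREATE NONCLUSTERED INDEX IX_Orders_Quantity_10_19
ON dbo.Orders (Quantity)
WHERE Quantity >= 10 AND Quantity <= 19; -- >= and <= are used because BETWEEN isn't allowed in a filter predicate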
Let’s say the data in the underlying table changes – someone wants to increase their order quantity from 750 to 1,250. The row is updated at the data level – the clustered index. At the same time, any nonclustered index on quantity must also be updated – and if the new value moves the row into or out of a filtered index’s predicate, the row is added to or removed from that index.
Table partitioning breaks the whole of the data into separate physical structures based on a partitioning key. It’s generally used to improve query performance or make administration of large tables easier. Partitioning is based on a partition scheme and a partition function. The table contains a partitioning key – the column whose value you’ll split your partitions on. Most commonly, this is a date column of one sort or another – tables are partitioned by day, month, or year.
To directly compare this to a filtered index: could you use an order quantity as a partitioning key? Yes, as long as it’s a valid data type. The difference is that the partition is going to store all columns from the table – not just the key. When you insert or update a row, SQL Server would have to look at that value to determine which partition to place it on. It would then perform the update on the appropriate partition.
A table, partitioned on quantity – 1-10, 11-20, 21-30, etc
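For comparison, here’s a sketch of what partitioning that same hypothetical table on quantity could look like. A partition function defines the boundary values, a partition scheme maps the partitions to filegroups, and the table is created on the scheme (all names here are made up):

CREATE PARTITION FUNCTION pfQuantity (INT)
AS RANGE LEFT FOR VALUES (10, 20, 30); -- RANGE LEFT creates ranges of 1-10, 11-20, 21-30, and 31+

CREATE PARTITION SCHEME psQuantity
AS PARTITION pfQuantity ALL TO ([PRIMARY]);

CREATE TABLE dbo.OrdersPartitioned (
    OrderID INT NOT NULL,
    Quantity INT NOT NULL,
    OrderDate DATETIME NOT NULL
) ON psQuantity (Quantity);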
Want to know more about partitioning? Start here.
Comparing How They’re Used
When it comes to reading the data, if you have a filtered index for a specific value, and the query optimizer can use that index, you can often reduce I/O by orders of magnitude because the index stores so much less data. With partitioning, SQL Server has to determine which partition the data is stored on, then access it. This can be helpful on very, very large tables – but the time it takes to implement and the ongoing upkeep mean it must be carefully planned and maintained.
Which Should I Choose?
There are many features in SQL Server that may appear similar, but once you look at how they work and what they do, they are very different. Database mirroring is not the same as replication. Change Data Capture is not the same as Change Tracking. Filtered indexes and partitioning solve two very different problems. When considering implementing any new feature, stop to ask, “What is the problem I am trying to solve?”
Views are logical objects in SQL Server databases that present you with a “virtual table”. Views are typically created for one of three reasons: security, simplification, or aggregation.
- Security: we create views so that a user can read specific columns out of certain tables, but not all the data.
- Simplification: sometimes, it’s easier to write a SELECT against one view than a SELECT against multiple tables, with joins – repeatedly.
- Aggregation: again, it can be easier to SELECT already-aggregated data than to write the query – repeatedly.
The downside to views is that when you query them, you’re still reading data from all of the underlying tables. This can result in large amounts of I/O, depending on the number and size of the tables. However, you can create a unique clustered index on the view – referred to as an indexed view – to persist the data on disk. This index can then be used for reads, reducing the amount of I/O.
There are some limitations to indexed views, but when you can create one and it improves performance, it really improves performance. But, as with all features in SQL Server, indexed views aren’t the answer to everything. Here’s a quick look at some things you can and can’t do with them.
You can:
- Reference one or more tables in the same database in the view definition.
- Create additional nonclustered indexes against the view once the unique clustered index exists.
- Update the data in the underlying tables – including inserts, updates, deletes, and even truncates.

You can’t:
- Reference other views, or tables in other databases, in the view definition.
- Use COUNT, MIN, MAX, TOP, outer joins, or a few other keywords or elements in the view definition.
- Modify the schema of the underlying tables and columns while the view exists – the view is created with the WITH SCHEMABINDING option.
- Always predict what the query optimizer will do. If you’re using Enterprise Edition, it will automatically consider the unique clustered index as an option for a query – but if it finds a “better” index, that will be used. You could force the optimizer to use the indexed view through the WITH (NOEXPAND) hint – but be cautious when using any hint.
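Here’s a minimal sketch of an aggregating indexed view, using AdventureWorks2012’s Sales.SalesOrderDetail table (the view and index names are made up). Note the required WITH SCHEMABINDING, the two-part table name, and COUNT_BIG(*) standing in for the disallowed COUNT:

CREATE VIEW dbo.vOrderQtyByProduct
WITH SCHEMABINDING
AS
SELECT ProductID,
    SUM(OrderQty) AS TotalQty,
    COUNT_BIG(*) AS NumRows -- required when the view definition uses GROUP BY
FROM Sales.SalesOrderDetail
GROUP BY ProductID;
GO

CREATE UNIQUE CLUSTERED INDEX CIX_vOrderQtyByProduct
ON dbo.vOrderQtyByProduct (ProductID);

Once the unique clustered index exists, the aggregated results are persisted on disk, and reads can be served from the index instead of re-aggregating the base table.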
Choose the Right Tool
If you’re looking to have a complicated, aggregated query persisted to disk to reduce I/O, an indexed view may be the right tool for your job. Test it as an option, and if it works, put it to use!
Learn more about indexed views in my video The Okapis of SQL Server Indexes!
When you write a query, you (usually) don’t want to return all the rows from the table(s) involved – so you add a WHERE clause to the statement. This ensures that fewer rows are returned to the client – but doesn’t reduce the amount of I/O done to get the results.
When you create a nonclustered index, you can add a WHERE clause to reduce the number of rows that are stored at the leaf level – a filtered index. By having fewer rows in an index, less I/O is done when that index is used.
Filtered indexes are great performance boosters when a value is used in a predicate very frequently, but that value accounts for only a small share of the rows in the table. (Say that ten times, fast.) For example: I have an orders table that contains a Status column. Valid statuses are Open, Processing, Packing, Shipping, Invoicing, Disputed, and Closed. When the business first starts using this table, there’s a good chance there is a fairly even distribution of orders across these statuses. However, over time, a majority of orders will be in Closed status – but the business wants to query for Open or Disputed orders, which are only a small percentage.
This is where a filtered index can come in.
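For example, here’s a sketch against a hypothetical dbo.Orders table (the column and index names are made up):

CREATE NONCLUSTERED INDEX IX_Orders_Status_Open
ON dbo.Orders (OrderDate)
INCLUDE (CustomerID)
WHERE Status = 'Open'; -- only rows with Status = 'Open' are stored at the leaf level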
When you add a WHERE clause to the index creation statement, you’re limiting which rows are stored in the index. The index is smaller in size and more targeted. It can be trial and error to get the query optimizer to use the filtered index, but when you do, gains can be significant. In one case study I have, logical reads dropped from 88 to 4 – a 95% improvement! What are some of the things you can – and can’t – do with them?
You can:
- Use equality or inequality operators, such as =, >=, and <, in the WHERE clause.
- Use IN to create an index for a set of values.
- Create multiple filtered indexes on one column. In my order status example, I could have an index WHERE Status = 'Open', and I could have another index WHERE Status = 'Shipping'.
- Combine multiple predicates in the same WHERE clause with AND – and because IN is allowed, WHERE Status IN ('Open', 'Shipping') gives you what an OR would.
- Create a filtered index for all NOT NULL values – or all NULL values.

You can’t:
- Create filtered indexes in SQL Server 2005 – they were introduced in SQL Server 2008.
- Use certain expressions, such as BETWEEN, NOT IN, or a CASE statement.
- Use date functions such as DATEADD for a rolling date range – the value in the WHERE clause must be exact.
The query optimizer won’t consider filtered indexes if you’re using local variables or parameterized SQL. This is one of the hardest limitations to work around – so I usually reference Tim Chapman’s The Pains of Filtered Indexes for help with this!
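One common workaround: because a parameterized plan must be safe for any parameter value, the optimizer skips the filtered index – but adding OPTION (RECOMPILE) compiles the statement with the actual runtime value, so the filtered index can be considered. A sketch, using the hypothetical dbo.Orders table and index from above:

DECLARE @Status VARCHAR(20) = 'Open';

SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE Status = @Status
OPTION (RECOMPILE); -- compiled with the runtime value, so the filtered index can be matched

Weigh the recompilation cost against the I/O savings before using this on a hot code path.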
Filter All The Things!
Using either a WHERE clause in a query to limit the rows returned or a WHERE clause in a filtered index to reduce the rows stored is beneficial to performance. You want to reduce the amount of data read and the amount of data returned to clients. Filtered indexes can help!
Memory is one of the most heavily used resources in SQL Server. Generally, the more you have, the better query performance you’ll get. This blog post isn’t about the magic of the buffer pool or the plan cache, but you should understand how important they are to your server’s performance. Since memory is such an important resource, you want to know how much of it you’re using at any time.
How can you track your server’s memory usage? One way is to use the Performance Monitor (Perfmon) counters exposed through the sys.dm_os_performance_counters DMV. One indicator of memory performance is Page Life Expectancy (PLE). You can capture basic memory usage over time by setting up a SQL Server Agent job to query this DMV, inserting the results into a table, and reporting on the table results.
I’ll show you how to collect this data and report on it!
Collecting the Data
I have a “DBAInfo” database on my instance that I use to track metrics and other information. I create a new table, MemoryHistory.
USE DBAInfo;

CREATE TABLE MemoryHistory (
    ID INT IDENTITY NOT NULL,
    CollectionDateTime DATETIME,
    PerfmonObjectName NCHAR(128),
    CounterName NCHAR(128),
    CounterValue BIGINT);
Then, I create a new SQL Server Agent job.
The only step in this job is the below query, which queries the DMV and inserts the results into the table I created.
INSERT INTO MemoryHistory
SELECT CURRENT_TIMESTAMP, object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name = 'SQLServer:Buffer Manager';
-- on a named instance, the counter object is 'MSSQL$InstanceName:Buffer Manager'
I schedule the job to run every five minutes.
Viewing The Data
Now, this data isn’t going to do me any good unless I view it, and make a decision or perform an action based on what I learn.
To view the data I’ve collected, I run the following query:
SELECT ID, CollectionDateTime, CounterName, CounterValue FROM MemoryHistory;
That’s a lot of junk to sort through when all I want to see is PLE, so I narrow down the query a bit.
SELECT ID, CollectionDateTime, CounterName, CounterValue FROM MemoryHistory WHERE CounterName = 'Page life expectancy';
But who wants to read through results like that each time there’s a problem to see when PLE rose or fell? Not me. I’d rather see it in a graphical format. How can I do that?
SQL Server Reporting Services
I have SSRS at my disposal. I’m going to create a very simple report that will allow me to enter start and end dates, and will display a line chart for PLE during that time.
Reporting on the Data
I set up my report to have DBAInfo as my data source. In order to choose dates, I use the following query as my dataset.
SELECT ID, CollectionDateTime, CounterName, CounterValue
FROM MemoryHistory
WHERE CounterName = 'Page life expectancy'
    AND CONVERT(DATE, CollectionDateTime) >= @Start
    AND CONVERT(DATE, CollectionDateTime) <= @End;
I change my @Start and @End parameters to “Date/Time” so I get a date picker.
I drag a Line Chart onto the design surface and add the CounterValue as my Value and CollectionDateTime as my Category Group.
I can preview the report to view it:
Last but not least, I’ll deploy this report to Report Manager so that I and others can run it, or even schedule a regular subscription.
There are several ways to improve this report. How can you modify the query to capture date and time data individually? How do you add parameters to the report so the user running it can choose their own date range? How would you collect and display data for different instances?
Want to know more about how to use Reporting Services to create reports for your environment? Check out my 90-minute SQL Server Reporting Services Basics training video!
Let me ask you a question: do you want to boss the SQL Server query optimizer around?
If you answered no: good. You’re willing to let the query optimizer do its job, which is to find the least expensive way to run a query, as quickly as possible.
If you answered yes: you’re a brave person.
Maybe you’ve hit upon the perfect index for a specific query, but for some reason the optimizer won’t use it. But you know it improves performance. How can you make the query optimizer listen to you?
Index hints are a powerful, yet potentially dangerous, feature of SQL Server.
Let’s look at an example. I’m working with AdventureWorks2012. The database has two related tables, HumanResources.Employee and Person.Person. They are related through the BusinessEntityID column. I want to retrieve information about the users and their logins.
USE AdventureWorks2012;
GO

SELECT PER.FirstName, PER.LastName, EMP.LoginID
FROM HumanResources.Employee EMP
INNER JOIN Person.Person PER ON PER.BusinessEntityID = EMP.BusinessEntityID;
Let’s look at the execution plan.
A nonclustered index seek is performed on Employee.AK_Employee_LoginID, a nonclustered index on the LoginID column. A clustered index seek is performed on Person. Note the cost is 0.217439.
I notice that the Employee table has another index, PK_Employee_BusinessEntityID, which is on the BusinessEntityID column. I want to force my query to use this index instead. I can do this by using the WITH (INDEX) hint.
SELECT PER.FirstName, PER.LastName, EMP.LoginID
FROM HumanResources.Employee EMP WITH (INDEX (PK_Employee_BusinessEntityID))
INNER JOIN Person.Person PER ON PER.BusinessEntityID = EMP.BusinessEntityID;
Let’s look at this execution plan.
Now, a clustered index scan is performed on Employee. Note, though, that the query cost has increased – to 0.220402.
In your case, the index you force the query to use might yield the better cost – for now. But what happens when more data is added to the table and the statistics change? What happens when you upgrade SQL Server and the query optimizer changes?
Eventually, the index you’re hinting may no longer be the best one for the job – but SQL Server is going to continue to use it anyway. You’ve told it to do so, and it’s going to keep doing it.
Think about how you would get rid of it. Is there an easy way to search all of your code – application code, stored procedures, report definitions – for this specific index hint, to remove it?
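There’s no single answer for application code, but for objects stored in the database, a query like this sketch – searching module definitions for the index name – is one place to start:

SELECT OBJECT_SCHEMA_NAME(object_id) AS SchemaName,
    OBJECT_NAME(object_id) AS ObjectName
FROM sys.sql_modules
WHERE definition LIKE '%PK[_]Employee[_]BusinessEntityID%'; -- [_] escapes the underscore wildcard
-- finds stored procedures, functions, and views, but not application code or report definitions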
Another point to consider: what would happen to that code if the index were disabled or deleted? Would it continue to run? Let’s give it a try. I issue a disable index command.
ALTER INDEX PK_Employee_BusinessEntityID ON HumanResources.Employee DISABLE;
Then I run the same query as before, and what happens? I get an error:
Msg 315, Level 16, State 1, Line 1
Index “PK_Employee_BusinessEntityID” on table “HumanResources.Employee” (specified in the FROM clause) is disabled or resides in a filegroup which is not online.
Index hints can be a powerful feature, but use them with caution. Look for other ways to optimize your query first – perhaps a different index can be created, or the query can be rewritten. If you can’t find a better way, document the use of the index hint and its purpose. When you upgrade SQL Server, test whether the hinted index is still the most effective option.
Remember, the query optimizer does its job really well – let it.