We’ve all found those databases. They’re on your production SQL Server instance, but nobody seems to know if they’re being used, or what’s using them.
You could run some sort of trace, but you’d probably drag down performance on your SQL Server. What you need is a quicker way to find out if the database is in use.
Is a Login Using the Database?
And if so, what query did they run last? My favorite way to see this is with Adam Machanic's sp_whoisactive.
You want to run sp_whoisactive with a couple of special parameters, so it shows you sessions that are connected even if they aren’t running, and limits the output to sessions using that database:
exec sp_whoisactive
    @show_sleeping_spids = 2,
    @filter_type = 'database',
    @filter = 'AdventureWorks2012';
GO
- Pros: Super fast, super easy. Lets you see the host name and the last query they ran.
- Cons: Whatever’s using the database might not be connected at the moment, or might be using a different database context.
- Verdict: A good first step.
Are Reads and Writes Happening on Tables in the Database?
Sure, it’s possible that stored procedures or functions are in use that don’t touch any local tables, but just looking at whether data access is happening tells you a lot, fast.
My favorite way to do this is with sp_BlitzIndex®. Run it with the @mode=2 parameter, copy and paste the data into a spreadsheet, and look at the “Index Usage” column. You can also see the timestamps of most recent seeks, scans, and writes.
exec sp_BlitzIndex
    @DatabaseName = 'AdventureWorks2012',
    @mode = 2;
GO
- Pros: Very lightweight, and lets you know which tables are in use.
- Cons: If you have fulltext indexes, you may see reads from the fulltext indexing service on those tables (even if users aren’t querying the fulltext index).
- Verdict: Very useful information. A good second step.
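If you don’t have sp_BlitzIndex handy, you can pull similar last-touched timestamps straight from the DMV it reads. A minimal sketch (the database name is just the running example; remember these stats are cumulative only since the last instance restart):

```sql
-- Most recent read/write activity per table, from the index usage DMV.
SELECT OBJECT_NAME(ius.object_id, ius.database_id) AS table_name,
    MAX(ius.last_user_seek)   AS last_seek,
    MAX(ius.last_user_scan)   AS last_scan,
    MAX(ius.last_user_lookup) AS last_lookup,
    MAX(ius.last_user_update) AS last_write
FROM sys.dm_db_index_usage_stats AS ius
WHERE ius.database_id = DB_ID('AdventureWorks2012')
GROUP BY ius.object_id, ius.database_id
ORDER BY table_name;
```

The same fulltext-indexing caveat applies here, since it’s the same underlying data.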
Is the Transaction Counter Going Up for the Database?
SQL Server keeps a cumulative track of activity, and you can check this to see if it’s currently going up:
SELECT *
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE 'Transactions/sec%'
    AND instance_name LIKE 'AdventureWorks2012%';
GO
- Pros: Super lightweight, no extra tools needed.
- Cons: Some reads will occur just in starting up SQL Server. You can’t tell reads from maintenance from reads from user transactions. (Full and log backups will increment the counter.) Also, strangely enough, just plain SELECT statements outside of an explicitly defined transaction don’t cause this counter to go up. (I know, weird, right???)
- How to use it: Take a couple of samples and see if the counter is going up dramatically. If it is going up, that means something’s fairly busy in there. I wouldn’t try to get too scientific with this one, though– it’s just too limited.
- Verdict: Kind of clunky, but could be useful along with the other methods here.
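Taking those samples can be as simple as reading the counter twice with a short delay and diffing. A rough sketch (the 10-second window and database name are arbitrary choices):

```sql
DECLARE @t1 BIGINT, @t2 BIGINT;

-- First sample of the cumulative counter
SELECT @t1 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE 'Transactions/sec%'
    AND instance_name = 'AdventureWorks2012';

WAITFOR DELAY '00:00:10';  -- sample window

-- Second sample
SELECT @t2 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE 'Transactions/sec%'
    AND instance_name = 'AdventureWorks2012';

SELECT @t2 - @t1 AS transactions_in_window;
```

A delta of zero over a few business-hours samples is a hint, not proof, given the SELECT-statement caveat above.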
Are there user Execution Plans in the Cache for the Database?
You can also ask the execution plan cache if queries have been running against the database. This query takes advantage of the dm_exec_text_query_plan DMV (as recommended here by the very clever Grant Fritchey):
SELECT
    SUBSTRING(tx.[text], (qs.statement_start_offset / 2) + 1,
        (CASE WHEN qs.statement_end_offset = -1
            THEN DATALENGTH(tx.[text])
            ELSE qs.statement_end_offset
        END - qs.statement_start_offset) / 2 + 1) AS QueryText,
    CASE WHEN pl.query_plan LIKE '%<MissingIndexes>%' THEN 1 ELSE 0 END AS [Missing Indexes?],
    qs.execution_count,
    qs.total_worker_time / qs.execution_count AS avg_cpu_time,
    qs.total_worker_time AS total_cpu_time,
    qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
    qs.total_logical_reads,
    qs.creation_time AS [plan creation time],
    qs.last_execution_time AS [last execution time],
    CAST(pl.query_plan AS XML) AS sqlplan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS pl
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS tx
WHERE pl.query_plan LIKE '%[AdventureWorks2012]%'
ORDER BY qs.execution_count DESC
OPTION (RECOMPILE);
GO
- Pros: Gives you insight into what is using the database when it finds results.
- Cons: Doesn’t catch anything that’s not in the cache due to RECOMPILE hints or memory pressure over time. You’ll probably have to slog through some rows of system procedures.
- Verdict: not perfect, but very information-rich if something does turn out to be using the database (and a whole lot quicker and more lightweight than tracing).
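One way to cut down on the system-procedure noise is to filter on the dbid column that sys.dm_exec_sql_text returns, instead of pattern-matching the plan XML. A sketch; note that dbid can be NULL for some ad hoc statements, so this variation can miss things the XML search would catch:

```sql
-- Recently executed cached statements whose text resolves to this database
SELECT TOP (50)
    tx.[text] AS query_text,
    qs.execution_count,
    qs.last_execution_time
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS tx
WHERE tx.dbid = DB_ID('AdventureWorks2012')
ORDER BY qs.last_execution_time DESC
OPTION (RECOMPILE);
```

It’s cheaper than LIKE-scanning every plan, so it works as a quick first pass before the fuller query above.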
I know there’s a way I haven’t THOUGHT of.
Do you have a lightweight technique you use for this which I haven’t mentioned here? Let me know in the comments!
51 Comments
Another solution: take the database offline, and wait until the phone rings…
-Klaus
This is what in the army is known as “recon by fire” 🙂
LOL!
Wenn Nutzer kommt, flach auf den Boden legen und auf Hilfe warten. (“When a user shows up, lie flat on the floor and wait for help.”)
(Whenever I see anything from Klaus I think of a great joke in German)
Thanks. Just spit out my coffee!
I call it the “rip and ring” method.
I’ve always heard this approach called “the scream test.”
It’s not wrong tho. I certainly find out who is using what, and when you ‘own’ over 500 databases, that’s helpful to know.
Also the best way to be fired
I had to do this a lot in the SQL 2000 days. Some things I would look at: was the transaction log growing, or I’d take differential backups and see how big they were. Is this too simplistic? Obviously that only shows transaction activity, which may not be enough to prove a database is used.
Oh interesting! Yeah, it only covers modification and if you have index maintenance running that could cause some false positives, but it definitely is a possible source of info.
True. If there was index maintenance. These were mostly tiny databases that never got an index and we were just trying to tell if they could be safely dropped.
If you suspect that a given table isn’t being used (or, worse, is only being written to, implying that the data isn’t ever read), another option is SQL Audit. Granted, you have to have some idea as to whether or not it’s being used or the audit file gets large quickly, but I’ve trimmed some junk using this process. As a bonus, you get the statements that are running.
Heh, I’m not sure I’d call auditing simple OR easy, but if you’ve got EE and the know-how to get Audit rigged up then it absolutely gives a lot of info. Cool to hear that you’ve used this successfully!
Timely stuff… just started looking at Audit today to solve that very problem (auditing specific table access and specific proc execution when the devs assume they aren’t used any longer). I was first worried about performance impact compared to a server-side trace, but it sounds like it’s actually more lightweight than a trace, being built off of XEvents.
Is dm_db_index_usage_stats not a good idea? We keep a daily snapshot of index usage stats for 2 months; if we see no usage, we take the database offline for a month and then delete it after an external tape backup with 12-month retention.
We have a semi-automated process for SQL 2000. We had a trace running in a 30-minute window to capture which databases are used, and the next trace filters out the databases that have already been captured as used. The more databases we captured, the more we increased the time the trace would run. We automated stopping the trace after 30 minutes, reading the trace to capture the databases used, and adding filters to the new trace.
If you have any better ideas please let us know — we’ve removed 30 SQL 2000 instances, 20 more to go :).
Thanks
Anand
Usage stats can definitely be helpful– that’s basically what I’m referring to in the “Are reads and writes happening?” section.
On SQL Server 2012 and higher, index rebuilds can clear out those index usage stats, so that might give you bad info if you sampled right after index maintenance and you’re using rebuilds.
For SQL Server 2000, that’s a pretty tricky situation. I like how you’ve approached it: you don’t have a ton of tools or options, and you’ve designed a way where you’re trying to minimize your trace footprint.
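The daily-snapshot approach guards against exactly that: persist the DMV to a table on a schedule, and a restart or rebuild only costs you one sample. A minimal sketch, assuming a DBA utility database and a snapshot table you’d create yourself (both names are placeholders):

```sql
-- Assumes a DBA.dbo.IndexUsageSnapshot table with matching columns,
-- populated by an Agent job (e.g. once a day).
INSERT INTO DBA.dbo.IndexUsageSnapshot
    (sample_time, database_name, object_id, index_id,
     user_seeks, user_scans, user_lookups, user_updates)
SELECT GETDATE(), DB_NAME(database_id), object_id, index_id,
    user_seeks, user_scans, user_lookups, user_updates
FROM sys.dm_db_index_usage_stats
WHERE database_id > 4;  -- skip system databases
```

Comparing consecutive samples then shows whether the counters moved between snapshots, even if they were reset in between.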
So I have this run every 5 to 10 min. It does 2 things…
1. It shows what hosts are connected (but not necessarily what’s being executed) and how many connections each host has to each db.
2. If a host exceeds some threshold x of connections, it starts to dump what is being executed by those connections.
It’s my “go to” when I want to find what’s connected and whether there’s a service that isn’t letting go of the db connection like it should.
FYI … db name and author were removed to protect … well, just to protect 😉
USE []
GO
/****** Object: StoredProcedure [dbo].[ConnectionsStatus] Script Date: 05/05/2014 13:58:55 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:
-- Create date: 04/07/2012
-- Description: Inserts current connections, active and possibly inactive, into .dbo.ConnectionsActive & .dbo.ConnectionsInactive tables
-- =============================================
ALTER PROCEDURE [dbo].[ConnectionsStatus]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- connection count by database, host, login
INSERT INTO .dbo.ConnectionsActive
(DatabaseName, HostName, LoginName, ConnectionCount)
SELECT d.name as DatabaseName, p.hostname, loginame, count(*) as ConnectionCount
FROM master..sysprocesses p
join master..sysdatabases d on p.dbid = d.dbid
GROUP BY d.name, p.hostname, loginame
ORDER BY 4 desc
-- inactive connection count by database, host, login
INSERT INTO .dbo.ConnectionsInactive
(DatabaseName, HostName, LoginName, ConnectionCount)
SELECT d.name as DatabaseName, p.hostname, loginame, count(*) as ConnectionCount
FROM master..sysprocesses p
JOIN master..sysdatabases d on p.dbid = d.dbid
WHERE datediff (minute,p.last_batch, getdate()) > 2
GROUP BY d.name, p.hostname, loginame
ORDER BY 4 desc
-- Connections Active Detail
DECLARE @MaxConnectionCount INT = 95 -- max connection count before we start capturing last executed statements
DECLARE @cntr int =1
DECLARE @max INT
DECLARE @spid INT
DECLARE @spids AS TABLE (id INT IDENTITY (1,1), spid INT, DatabaseName VARCHAR(255), HostName VARCHAR(255), LoginName VARCHAR(255), textdata VARCHAR(Max))
--SELECT d.name as DatabaseName, p.hostname, loginame, count(*) as ConnectionCount
--FROM master..sysprocesses p
--JOIN master..sysdatabases d on p.dbid = d.dbid
--GROUP BY d.name, p.hostname, loginame
--ORDER BY 4 desc
INSERT INTO @spids
(
spid ,
DatabaseName ,
HostName ,
LoginName
)
SELECT spid, d.name AS databasename, p.hostname, p.loginame
FROM master..sysprocesses p
JOIN master..sysdatabases d on p.dbid = d.dbid
JOIN
( SELECT d.name as DatabaseName, p.hostname, loginame
FROM master..sysprocesses p
join master..sysdatabases d on p.dbid = d.dbid
GROUP BY d.name, p.hostname, loginame
HAVING COUNT(*) > @MaxConnectionCount
) RU
ON RU.DatabaseName = d.name AND RU.hostname = p.hostname AND RU.loginame = p.loginame
SELECT @max = (SELECT MAX(id) FROM @spids)
WHILE @cntr <= @max
BEGIN
DECLARE @sqltext VARBINARY(128)
SELECT @sqltext = sp.sql_handle
FROM sys.sysprocesses sp
JOIN @spids s
ON s.id = @cntr
WHERE sp.spid = s.spid
SELECT @spid = (SELECT spid FROM @spids WHERE id = @cntr)
UPDATE s
SET Textdata = txt.textdata
FROM @spids s
JOIN (
SELECT TEXT AS textdata, @spid spid
FROM sys.dm_exec_sql_text(@sqltext)
) txt
ON txt.spid = s.spid
SELECT @cntr = @cntr + 1
END
INSERT INTO .dbo.ConnectionsActiveDetail
( DatabaseName ,
HostName ,
LoginName ,
spid ,
TextData
)
SELECT
DatabaseName,
HostName,
LoginName,
spid,
textdata
FROM @spids
WHERE textdata IS NOT NULL
DELETE
FROM .dbo.ConnectionsActiveDetail
WHERE TimeStamp < GETDATE() - 60
END
Looks like db name got removed… well you should get the idea anyway
Clunky or not, I’ve really come to like the Transactions/sec counter this year. It’s been especially useful when filtering on tempdb. (Even though I find it hard to understand what that counter is doing without experimenting.)
BTW: You know you’ve truly made it when you illustrate a blog article with a picture of a burning database.
I always start by running a ‘netstat’. Of course this is server-wide, not per database.
Stefan – right, how does that help here with multiple DBs?
Here’s my solution, based on the sys.dm_db_index_usage_stats view.
Please comment: is it safe?
/****************************************************************/
/* Script: report databases unused since the last SQL service restart */
/****************************************************************/
use master
go
SELECT DB_NAME() as 'dbname',
convert(varchar(255), t.name) AS 'Table',
SUM(i.user_seeks + i.user_scans + i.user_lookups) AS 'Total accesses',
SUM(i.user_seeks) AS 'Seeks',
SUM(i.user_scans) AS 'Scans',
SUM(i.user_lookups) AS 'Lookups'
INTO ##tb_tables_used
FROM sys.dm_db_index_usage_stats i
RIGHT OUTER JOIN sys.tables t ON (t.object_id = i.object_id)
GROUP BY i.object_id, t.name
ORDER BY [Total accesses] DESC
go
delete ##tb_tables_used
go
exec sp_msforeachdb 'use ?;
insert into ##tb_tables_used
SELECT ''?'' as ''dbname'',
t.name AS ''Table'',
SUM(i.user_seeks + i.user_scans + i.user_lookups) AS ''Total accesses'',
SUM(i.user_seeks) AS ''Seeks'',
SUM(i.user_scans) AS ''Scans'',
SUM(i.user_lookups) AS ''Lookups''
FROM sys.dm_db_index_usage_stats i
RIGHT OUTER JOIN sys.tables t ON (t.object_id = i.object_id)
GROUP BY i.object_id, t.name
ORDER BY [Total accesses] DESC'
go
SELECT DATEDIFF(D, create_date, GETDATE()) as 'Running Days',
create_date AS 'Restarted On'
FROM sys.databases
WHERE name = 'tempdb'

select name as 'Databases without Use'
from sys.databases
where name not in ('model')
except
select dbname
from ##tb_tables_used
order by 1
Hey there,
That’s essentially what I’m describing in the section, “ARE READS AND WRITES HAPPENING ON TABLES IN THE DATABASE?” So it has the same pros and cons.
Kendra
Thank you so much for the interesting post.
Please note that sp_whoIsActive doesn’t tell you if the database is being queried via SSAS.
Oh, interesting! If there’s a cube that’s populated from the database, you should see the queries while the cube build is running though, right?
That’s exactly what I would have expected, but didn’t see happen… I’ll check the other scripts and let you know how they go, though!
I’ve seen the queries from cube builds before in sp_whoisactive. There’s going to be a period after the data is read in while it’s being processed in memory in Analysis Services, I bet– so the query might not show up the whole time the cube is being processed.
I’ve run the script straight after processing the cube, so I don’t think it is that..
How much data is being retrieved?
It shows up like any other query. So if it’s fast or if you don’t happen to catch it while it’s running, it’s hard to see. But it’s just like any other query even though it’s coming from some other SQL Server component.
Great post. I’m going to use some of these. Thanks for sharing!
How about enabling AutoClose?
Oh, interesting– I’m betting there’s some metadata which would tell you last open/close date, eh? (I don’t enable this a lot so I can’t quite remember what it does to the DMVs, but I’m thinking it updates sys.databases or at least the default trace.)
Autoclose can give you a big perf hit, and of course a maintenance job could open it, but if you’re relatively certain a database isn’t in use and you’ve cleaned up maintenance, this does seem like a clever trick!
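One place the evidence does show up is the SQL Server error log: each time a closed database is opened, a “Starting up database” message is written. A sketch using the (undocumented but widely used) xp_readerrorlog; the database name is just the running example:

```sql
-- Turn on auto-close for the suspect database (beware the perf hit).
ALTER DATABASE [AdventureWorks2012] SET AUTO_CLOSE ON;

-- Later: search the current error log (0), type SQL Server (1),
-- for startup messages mentioning the database.
EXEC sys.xp_readerrorlog 0, 1, N'Starting up database', N'AdventureWorks2012';
```

Frequent startup messages mean something keeps opening the database; none over a sampling period supports the “nobody’s using it” theory.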
Hi.
Any good approaches for SQL Server 2000 to recomend?
[]’s
Renato – SQL Server 2000 is unsupported by Microsoft. You’ll want to migrate to a supported version ASAP. Just as it’s hard to hire for a position on the Titanic, it’s hard to find people interested in working on SQL Server 2000.
Kendra,
I agree with your point of view. Focus on majority 🙂
Brent,
I agree. Nice analogy, btw.
Unfortunately, we don’t always have 100% of the new ships in the dock…
Sad, but true. Actually, we have some approaches for migrating SQL Server 2000 databases, but very limited possibilities for discovering unused databases. Nothing as cool in 2000 as those methods above for 2008+.
Renato – right, don’t bother making changes or taking risks on the 2000 boxes. Just migrate the databases to something supported.
We are working at it.
Thanks for your observation.
I know this thread is a little older but I have been referring to it, so it’s still current for me. Another great little resource on my go-to website for all my don’t-fall-over-the-learning-curve requirements.
In addition to the queries you mention I also find the following useful, for a serverwide glimpse:
-- get the current connection count
-- databases that have connection count 0 for extended periods are not being used.
use master
go
SELECT @@ServerName AS server
,NAME AS dbname
,COUNT(STATUS) AS number_of_connections
,GETDATE() AS timestamp
FROM sys.databases sd
LEFT JOIN sysprocesses sp ON sd.database_id = sp.dbid
WHERE database_id > 4
GROUP BY NAME
Nice!
I would just add at the end
ORDER by 3,2
I think the following should answer the question:
WITH agg AS
(
SELECT
last_user_seek,
last_user_scan,
last_user_lookup,
last_user_update,database_id
FROM
sys.dm_db_index_usage_stats
)
SELECT
database_id, DB_NAME(database_id) DatabaseName,
last_read = MAX(last_read),
last_write = MAX(last_write)
FROM
(
SELECT database_id,last_user_seek, NULL FROM agg
UNION ALL
SELECT database_id,last_user_scan, NULL FROM agg
UNION ALL
SELECT database_id,last_user_lookup, NULL FROM agg
UNION ALL
SELECT database_id,NULL, last_user_update FROM agg
) AS x (database_id,last_read, last_write)
group by database_id
what do you think?
Except usage stats gets reset in 2012/2014 (up to a certain patch level) when an index gets rebuilt.
awesome
would this work?
SELECT
DB_NAME(database_id) AS dbName,
MAX(last_user_seek) AS last_user_seek,
MAX(last_user_scan) AS last_user_scan,
MAX(last_user_lookup) AS last_user_lookup,
MAX(last_user_update) AS last_user_update
FROM sys.dm_db_index_usage_stats
GROUP BY DB_NAME(database_id)
I use this executed by an agent job every 5 mins or whatever I feel is required. Is there anything wrong with using sp_who2?
— Log if databases are in use
USE DBA;
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.Tables WHERE TABLE_NAME = 'DatabasesInUse')
CREATE TABLE DBA.dbo.DatabasesInUse ( [DBName] varchar(200), [HostName] varchar(200), [Login] varchar(200), [Lastbatch] varchar(200), [PollDate] datetime )
IF EXISTS (SELECT * FROM tempdb..sysobjects WHERE id=OBJECT_ID('tempdb..#spwho2')) DROP TABLE #spwho2
CREATE TABLE #spwho2(
[SPID] INT
,[Status] VARCHAR(200)
,[Login] VARCHAR(200)
,[HostName] VARCHAR(200)
,[BlkBy] VARCHAR(200)
,[DBName] VARCHAR(200)
,[Command] VARCHAR(200)
,[CPUTime] INT
,[DiskIO] INT
,[LastBatch] VARCHAR(200)
,[ProgramName] VARCHAR(200)
,[SPID2] INT
,[RequestID] INT
)
INSERT INTO #spwho2
EXEC sp_who2
INSERT DBA.dbo.DatabasesInUse
SELECT DISTINCT
s.DBName
,s.HostName
,s.Login
,MAX(s.LastBatch) AS Lastbatch
,PollDate = GETDATE()
FROM
#spwho2 AS s
WHERE s.DBName IS NOT NULL
AND s.DBName NOT IN ('msdb','master','model')
GROUP BY s.HostName
,s.DBName
,s.[Login];
DROP TABLE #spwho2
It won’t catch short-lived sessions.
As a lightweight trace it’s quite handy though. I run a similar thing every minute. I find if it’s been running for a few weeks it will predict unused databases with 95% accuracy, so like all these things, it provides a good place to start further investigation.
Another way is to look for the values of datetime columns; it’s just another way to build up a bigger picture before doing a full trace.
function Get-LastdattimeValues ($instance, $Database, [switch]$GetLatest)
{
$q = @"
IF OBJECT_ID('tempdb..#Temp') IS NOT NULL DROP TABLE #Temp
CREATE TABLE #Temp (LastDate datetime, SchemaName sysname, TableName sysname, ColName sysname)
DECLARE @Statement nvarchar(500)
declare @SchemaName SYSNAME
DECLARE @TableName sysname
DECLARE @ColName SYSNAME
DECLARE Columns CURSOR FOR SELECT 'INSERT INTO #Temp SELECT TOP 1 [' + c.name + '] , ''' + SCHEMA_NAME(t.schema_id) + ''' AS SchemaName, ''' + t.name + ''' AS TableName, ''' + c.name + ''' AS ColName FROM [' + SCHEMA_NAME(t.schema_id) + '].[' + OBJECT_NAME(t.Object_id) + '] WHERE [' + c.name + '] IS NOT NULL and [' + c.name + '] < getdate() ORDER BY [' + c.name + '] DESC' AS Statement, SCHEMA_NAME(t.schema_id) as SchemaName, t.name, c.name FROM sys.columns c JOIN sys.tables t ON t.object_id = c.object_id WHERE c.user_type_id = 61 AND t.is_ms_shipped = 0
OPEN Columns
FETCH NEXT FROM Columns INTO @STatement, @SchemaName, @TableName, @ColName
WHILE @@FETCH_STATUS = 0
BEGIN
PRINT @statement
EXEC sp_executesql @statement
FETCH NEXT FROM Columns INTO @STatement, @SchemaName, @TableName, @ColName
END
DEALLOCATE Columns
"@
if ($GetLatest)
{
$q = $q + " select top 1 LastDate from #temp order by lastdate DESC"
}
Else
{
$q = $q + " select * from #temp order by lastdate desc"
}
Invoke-Sqlcmd -ServerInstance $instance -Database $Database -Query $Q
}
Is there a way to tell from a transactional replication subscriber if the replication is lagging? I only have user level permissions on the subscriber database with no access to the publisher. I can’t seem to find any fields in the database itself that will assure me that it is up to date. It is a fairly high use database, so I can use certain update fields, but I am afraid that on weekends the database may not have enough activity to use this method with any confidence.
Bob – for general questions, head to a Q&A site like https://DBA.stackexchange.com.
The crux of the problem is how we define “used” and “lightweight”. All of the methods above involve inferring user activity by measuring resource usage for subjectively sufficient length of time.
I’d suggest enabling login auditing for maybe 15 minutes during business hours on what we believe should be a busy oltp database – or x days on something like an occasionally used data warehouse or financial reporting database – and then look in the SQL Server log for login events coming from user accounts.
But still there could be zombie polling or report refresh services that hit the database consistently – even though real users stopped using the application or caring about the data a long time ago.
Eric – I’m guessing you come from an environment outside of SQL Server. In SQL Server, you can log into database A, but query database B (especially with cross-database queries and joins.) I wish it was as easy as logins, but that’s not how SQL Server works. Thanks for the idea though!
Check out the Get-LastIndexUpdateForDatabses cmdlet in my DBALibrary PowerShell script.
https://github.com/MikeyMing/powershell-sql/blob/master/DBALibrary.ps1
Unless we can do a full trace, a multi-tool approach is needed.
Anyone have any ideas on how extended events can help in this problem?
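One hedged sketch of how Extended Events could fit: a session that captures batch completions filtered to the suspect database, writing to a small ring buffer so it stays cheap. The session name and database name here are placeholders, not anything from the post:

```sql
-- Lightweight XEvents session: who's running batches in this database?
CREATE EVENT SESSION [WhoUsesThisDb] ON SERVER
ADD EVENT sqlserver.sql_batch_completed
(
    ACTION (sqlserver.client_hostname, sqlserver.username, sqlserver.sql_text)
    WHERE sqlserver.database_name = N'AdventureWorks2012'
)
ADD TARGET package0.ring_buffer (SET max_memory = 4096)  -- KB
WITH (STARTUP_STATE = OFF);

ALTER EVENT SESSION [WhoUsesThisDb] ON SERVER STATE = START;
```

You’d then query the ring buffer target periodically; like the plan-cache method, it only proves use when it catches something, but it does catch the short-lived sessions that polling scripts miss.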