4 Lightweight Ways to Tell if a Database is Used

We’ve all found those databases. They’re on your production SQL Server instance, but nobody seems to know if they’re being used, or what’s using them.

[Image: a burning database. Caption: Is anyone using this?]

You could run some sort of trace, but you’d probably drag down performance on your SQL Server. What you need is a quicker way to find out if the database is in use.

Is a Login Using the Database?

And if so, what query did they run last? My favorite way to see this is with Adam Machanic’s sp_whoisactive.

You want to run sp_whoisactive with a couple of special parameters, so it shows you sessions that are connected even if they aren’t running, and limits the output to sessions using that database:
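Something like this (a sketch using sp_whoisactive’s documented @show_sleeping_spids, @filter_type, and @filter parameters; ‘YourDatabaseName’ is a placeholder to swap in):

exec sp_WhoIsActive
    @show_sleeping_spids = 2,
    @filter_type = 'database',
    @filter = 'YourDatabaseName';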

  • Pros: Super fast, super easy. Lets you see the host name and the last query they ran.
  • Cons: Whatever’s using the database might not be connected at the moment, or might be using a different database context.
  • Verdict: A good first step.

Are Reads and Writes Happening on Tables in the Database?

Sure, it’s possible that stored procedures or functions are in use that don’t touch any local tables, but just looking at whether data access is happening tells you a lot, fast.

My favorite way to do this is with sp_BlitzIndex®. Run it with the @mode=2 parameter, copy and paste the data into a spreadsheet, and look at the “Index Usage” column. You can also see the timestamps of most recent seeks, scans, and writes.
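The call looks something like this (‘YourDatabaseName’ is a placeholder):

EXEC dbo.sp_BlitzIndex
    @DatabaseName = 'YourDatabaseName',
    @Mode = 2;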

  • Pros: Very lightweight, and lets you know which tables are in use.
  • Cons: If you have fulltext indexes, you may see reads from the fulltext indexing service on those tables (even if users aren’t querying the fulltext index).
  • Verdict: Very useful information. A good second step.

Is the Transaction Counter Going Up for the Database?

SQL Server keeps a cumulative track of activity, and you can check this to see if it’s currently going up:
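A sketch against sys.dm_os_performance_counters (swap in your database name):

SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Transactions/sec'
    AND instance_name = 'YourDatabaseName';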

  • Pros: Super lightweight, no extra tools needed.
  • Cons: Some reads will occur just from SQL Server starting up, and you can’t tell reads caused by maintenance apart from reads caused by user transactions. (Full and log backups will increment the counter.) Also, strangely enough, plain SELECT statements outside of an explicitly defined transaction don’t cause this counter to go up. (I know, weird, right???)
  • How to use it: Take a couple of samples and see if the counter is going up dramatically. If it is, something’s fairly busy in there. I wouldn’t try to get too scientific with this one, though; it’s just too limited.
  • Verdict: Kind of clunky, but could be useful along with the other methods here.

Are There User Execution Plans in the Cache for the Database?

You can also ask the execution plan cache if queries have been running against the database. This query takes advantage of the dm_exec_text_query_plan DMV (as recommended here by the very clever Grant Fritchey):
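A sketch of that style of query (sys.dm_exec_text_query_plan exposes a dbid column you can filter on; ‘YourDatabaseName’ is a placeholder):

SELECT cp.usecounts, cp.objtype, st.text AS query_text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_text_query_plan(cp.plan_handle, DEFAULT, DEFAULT) qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE qp.dbid = DB_ID('YourDatabaseName');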

  • Pros: Gives you insight into what’s using the database when it finds results.
  • Cons: Doesn’t catch anything that’s fallen out of the cache due to RECOMPILE hints or memory pressure over time. You’ll probably have to slog through some rows of system procedures.
  • Verdict: Not perfect, but very information-rich if something does turn out to be using the database (and a whole lot quicker and more lightweight than tracing).

I know there’s a way I haven’t THOUGHT of.

Do you have a lightweight technique you use for this which I haven’t mentioned here? Let me know in the comments!


Comments

  • Another solution: take the database offline, and wait until the phone rings…

    -Klaus

  • Andrew Notarian
    May 5, 2014 9:49 am

    I had to do this a lot in the SQL 2000 days, and some of the things I would look at were whether the transaction log was growing, or I’d do differential backups and see how big they were. Is this too simplistic? Obviously that only shows transaction activity, which may not be enough to prove a database is used.
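    If you go the differential-backup route, msdb keeps the history, so a quick sketch like this (type ‘I’ marks differential backups) shows whether recent differentials have anything in them:

    SELECT database_name, backup_start_date,
        backup_size / 1048576.0 AS backup_size_mb
    FROM msdb.dbo.backupset
    WHERE type = 'I'
    ORDER BY backup_start_date DESC;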

    • Kendra Little
      May 5, 2014 11:17 am

      Oh interesting! Yeah, it only covers modification and if you have index maintenance running that could cause some false positives, but it definitely is a possible source of info.

      • Andrew Notarian
        May 5, 2014 12:05 pm

        True. If there was index maintenance. These were mostly tiny databases that never got an index and we were just trying to tell if they could be safely dropped.

  • If you suspect that a given table isn’t being used (or, worse, is only being written to, implying that the data isn’t ever being read), another option is SQL Audit. Granted, you have to have some idea as to whether or not it’s being used or the audit file gets large quickly, but I’ve trimmed some junk using this process. As a bonus, you get the statements that are running.
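    A minimal sketch of that setup, assuming Enterprise Edition and hypothetical names (TableUsageAudit, TableUsageSpec, dbo.SuspectTable):

    USE master;
    GO
    CREATE SERVER AUDIT TableUsageAudit
        TO FILE (FILEPATH = 'D:\Audits\');
    GO
    ALTER SERVER AUDIT TableUsageAudit WITH (STATE = ON);
    GO
    USE YourDatabase;
    GO
    CREATE DATABASE AUDIT SPECIFICATION TableUsageSpec
        FOR SERVER AUDIT TableUsageAudit
        ADD (SELECT, INSERT, UPDATE, DELETE ON OBJECT::dbo.SuspectTable BY public)
        WITH (STATE = ON);
    GO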

    • Kendra Little
      May 5, 2014 11:18 am

      Heh, I’m not sure I’d call auditing simple OR easy, but if you’ve got EE and the know-how to get Audit rigged up, then it absolutely gives a lot of info. Cool to hear that you’ve used this successfully!

    • Nic Neufeld
      May 5, 2014 3:57 pm

      Timely stuff… just started looking at Audit today to solve that very problem (auditing specific table access and specific proc execution when the devs assume they aren’t used any longer). I was first worried about the performance impact compared to a server-side trace, but it sounds like it’s actually more lightweight than a trace, being built on XEvents.

  • Is dm_db_index_usage_stats not a good idea? We keep a daily snapshot of index usage stats for 2 months; if we see no usage we take the database offline for a month, and then delete it after an external tape backup with 12-month retention.

    We have a semi-automated process for SQL 2000. We had a trace running for a 30-minute window to capture which databases are used, and the next trace filters out the databases that have already been captured as used. The more databases we captured, the more we increased the time for which the trace would run. We automated stopping the trace after 30 minutes, reading the trace to capture the databases used, and adding filters to the new trace.

    If you have any better ideas, please let us know. We have removed 30 SQL 2000 instances; 20 more to go :).

    Thanks
    Anand

    • Kendra Little
      May 5, 2014 11:22 am

      Usage stats can definitely be helpful; that’s basically what I’m referring to in the “Are reads and writes happening?” section.

      On SQL Server 2012 and higher, index rebuilds can clear out those index usage stats, so that might give you bad info if you sampled right after index maintenance and you’re using rebuilds.

      For SQL Server 2000, that’s a pretty tricky situation. I like how you’ve approached it: you don’t have a ton of tools or options, and you’ve designed a way where you’re trying to minimize your trace footprint.

  • So I have this run every 5 to 10 minutes. It does 2 things…
    1. It will show what hosts are connected (but not necessarily what’s being executed) and how many connections each host has to each db.
    2. If there are more than x threshold connections, it starts to dump what is being executed by said connections.

    It’s my “go to” when I want to find what’s connected and whether there’s a service that isn’t letting go of the db connection like it should.

    FYI … dbname and author were removed to protect … well just to protect 😉

    USE []
    GO
    /****** Object: StoredProcedure [dbo].[ConnectionsStatus] Script Date: 05/05/2014 13:58:55 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    -- =============================================
    -- Author:
    -- Create date: 04/07/2012
    -- Description: Inserts current connections, active and possibly inactive, into .dbo.ConnectionsActive & .dbo.ConnectionsInactive tables
    -- =============================================
    ALTER PROCEDURE [dbo].[ConnectionsStatus]
    AS
    BEGIN
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON;

        -- connection count by database, host, login
        INSERT INTO .dbo.ConnectionsActive
            (DatabaseName, HostName, LoginName, ConnectionCount)
        SELECT d.name AS DatabaseName, p.hostname, loginame, COUNT(*) AS ConnectionCount
        FROM master..sysprocesses p
        JOIN master..sysdatabases d ON p.dbid = d.dbid
        GROUP BY d.name, p.hostname, loginame
        ORDER BY 4 DESC

        -- inactive connection count by database, host, login
        INSERT INTO .dbo.ConnectionsInactive
            (DatabaseName, HostName, LoginName, ConnectionCount)
        SELECT d.name AS DatabaseName, p.hostname, loginame, COUNT(*) AS ConnectionCount
        FROM master..sysprocesses p
        JOIN master..sysdatabases d ON p.dbid = d.dbid
        WHERE DATEDIFF(minute, p.last_batch, GETDATE()) > 2
        GROUP BY d.name, p.hostname, loginame
        ORDER BY 4 DESC

        -- Connections Active Detail
        DECLARE @MaxConnectionCount INT = 95 -- set max amount of connection count to begin showing last executed sprocs
        DECLARE @cntr INT = 1
        DECLARE @max INT
        DECLARE @spid INT
        DECLARE @spids AS TABLE (id INT IDENTITY (1,1), spid INT, DatabaseName VARCHAR(255), HostName VARCHAR(255), LoginName VARCHAR(255), textdata VARCHAR(MAX))

        INSERT INTO @spids
            (spid, DatabaseName, HostName, LoginName)
        SELECT spid, d.name AS databasename, p.hostname, p.loginame
        FROM master..sysprocesses p
        JOIN master..sysdatabases d ON p.dbid = d.dbid
        JOIN (
            SELECT d.name AS DatabaseName, p.hostname, loginame
            FROM master..sysprocesses p
            JOIN master..sysdatabases d ON p.dbid = d.dbid
            GROUP BY d.name, p.hostname, loginame
            HAVING COUNT(*) > @MaxConnectionCount
        ) RU ON RU.DatabaseName = d.name AND RU.hostname = p.hostname AND RU.loginame = p.loginame

        SELECT @max = (SELECT MAX(id) FROM @spids)

        WHILE @cntr <= @max
        BEGIN
            DECLARE @sqltext VARBINARY(128)
            SELECT @sqltext = sp.sql_handle
            FROM sys.sysprocesses sp
            JOIN @spids s ON s.id = @cntr
            WHERE sp.spid = s.spid

            SELECT @spid = (SELECT spid FROM @spids WHERE id = @cntr)

            UPDATE s
            SET Textdata = txt.textdata
            FROM @spids s
            JOIN (
                SELECT TEXT AS textdata, @spid AS spid
                FROM sys.dm_exec_sql_text(@sqltext)
            ) txt ON txt.spid = s.spid

            SELECT @cntr = @cntr + 1
        END

        INSERT INTO .dbo.ConnectionsActiveDetail
            (DatabaseName, HostName, LoginName, spid, TextData)
        SELECT DatabaseName, HostName, LoginName, spid, textdata
        FROM @spids
        WHERE textdata IS NOT NULL

        DELETE
        FROM .dbo.ConnectionsActiveDetail
        WHERE TimeStamp < GETDATE() - 60
    END

  • Clunky or not, I’ve really come to like the Transactions/sec counter this year. It’s been especially useful when filtering on tempdb. (Even though I find it hard to understand what that counter is doing without experimenting.)

    BTW: You know you’ve truly made it when you illustrate a blog article with a picture of a burning database.

  • I always start by running a 'netstat'. Of course this is server-wide, not per database.

  • Jaime González
    May 12, 2014 12:20 pm

    Here’s my solution, based on the sys.dm_db_index_usage_stats view.
    Please comment: is it safe?

    /****************************************************************/
    /* SCRIPT: REPORT DATABASES UNUSED SINCE THE LAST SQL SERVICE RESTART */
    /****************************************************************/
    use master
    go
    SELECT DB_NAME() as 'dbname',
        convert(varchar(255), t.name) AS 'Table',
        SUM(i.user_seeks + i.user_scans + i.user_lookups) AS 'Total accesses',
        SUM(i.user_seeks) AS 'Seeks',
        SUM(i.user_scans) AS 'Scans',
        SUM(i.user_lookups) AS 'Lookups'
    INTO ##tb_tables_used
    FROM sys.dm_db_index_usage_stats i
    RIGHT OUTER JOIN sys.tables t ON (t.object_id = i.object_id)
    GROUP BY i.object_id, t.name
    ORDER BY [Total accesses] DESC
    go
    delete ##tb_tables_used
    go

    exec sp_msforeachdb 'use ?;
    insert into ##tb_tables_used
    SELECT ''?'' as ''dbname'',
        t.name AS ''Table'',
        SUM(i.user_seeks + i.user_scans + i.user_lookups) AS ''Total accesses'',
        SUM(i.user_seeks) AS ''Seeks'',
        SUM(i.user_scans) AS ''Scans'',
        SUM(i.user_lookups) AS ''Lookups''
    FROM sys.dm_db_index_usage_stats i
    RIGHT OUTER JOIN sys.tables t ON (t.object_id = i.object_id)
    GROUP BY i.object_id, t.name
    ORDER BY [Total accesses] DESC'
    go

    SELECT DATEDIFF(D, create_date, GETDATE()) as 'Running Days', create_date AS 'Service restarted on'
    FROM sys.databases
    WHERE name = 'tempdb'

    select name as 'Databases without Use'
    from sys.databases
    where name not in ('model')
    except
    select dbname
    from ##tb_tables_used
    order by 1

    • Kendra Little
      May 12, 2014 2:16 pm

      Hey there,

      That’s essentially what I’m describing in the section, “ARE READS AND WRITES HAPPENING ON TABLES IN THE DATABASE?” So it has the same pros and cons.

      Kendra

  • Thank you so much for the interesting post.
    Please note that sp_whoIsActive doesn’t tell you if the database is being queried via SSAS.

    • Kendra Little
      May 20, 2014 11:36 pm

      Oh, interesting! If there’s a cube that’s populated from the database, you should see the queries while the cube build is running though, right?

      • That’s exactly what I would have expected, but didn’t see happen… I’ll check out the other scripts and let you know how they’re going, though!

        • Kendra Little
          May 20, 2014 11:42 pm

          I’ve seen the queries from cube builds before in sp_whoisactive. There’s going to be a period after the data is read in while it’s being processed in memory in Analysis Services, I bet– so the query might not show up the whole time the cube is being processed.

          • I’ve run the script straight after processing the cube, so I don’t think it is that…

          • Kendra Little
            May 20, 2014 11:46 pm

            How much data is being retrieved?

            It shows up like any other query. So if it’s fast or if you don’t happen to catch it while it’s running, it’s hard to see. But it’s just like any other query even though it’s coming from some other SQL Server component.

  • Great post. I’m going to use some of these. Thanks for sharing!

    How about enabling AutoClose?

    • Kendra Little
      May 23, 2014 2:35 pm

      Oh, interesting! I’m betting there’s some metadata which would tell you last open/close date, eh? (I don’t enable this a lot, so I can’t quite remember what it does to the DMVs, but I’m thinking it updates sys.databases or at least the default trace.)

      Autoclose can give you a big perf hit, and of course a maintenance job could open it, but if you’re relatively certain a database isn’t in use and you’ve cleaned up maintenance, this does seem like a clever trick!
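      For reference, it’s a one-line option to flip, and sys.databases shows where it’s already on (a sketch; YourDatabase is a placeholder):

      ALTER DATABASE [YourDatabase] SET AUTO_CLOSE ON;

      SELECT name, is_auto_close_on
      FROM sys.databases
      WHERE is_auto_close_on = 1;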

  • Hi.

    Any good approaches for SQL Server 2000 to recommend?

    []’s

    • Renato – SQL Server 2000 is unsupported by Microsoft. You’ll want to migrate to a supported version ASAP. Just as it’s hard to hire for a position on the Titanic, it’s hard to find people interested in working on SQL Server 2000.

      • Kendra,

        I agree with your point of view. Focus on majority 🙂

        Brent,

        I agree. Nice analogy, btw.
        Unfortunately, we don’t always have 100% of the new ships in the dock…
        Sad, but true. Actually, we have some approaches for migrating SQL Server 2000 databases, but only very limited possibilities for discovering unused databases. Nothing in 2000 is as cool as those methods above for 2008+.

  • I know this thread is a little older, but I have been referring to it, so it’s still current for me. Another great little resource on my go-to website for all my “don’t fall over the learning curve” requirements.

    In addition to the queries you mention, I also find the following useful for a server-wide glimpse:

    -- get the current connection count
    -- databases that have connection count 0 for extended periods are not being used.
    use master
    go
    SELECT @@ServerName AS server
        ,name AS dbname
        ,COUNT(status) AS number_of_connections
        ,GETDATE() AS timestamp
    FROM sys.databases sd
    LEFT JOIN sysprocesses sp ON sd.database_id = sp.dbid
    WHERE database_id > 4
    GROUP BY name

  • I think the following should answer the question:

    WITH agg AS
    (
        SELECT last_user_seek,
               last_user_scan,
               last_user_lookup,
               last_user_update,
               database_id
        FROM sys.dm_db_index_usage_stats
    )
    SELECT database_id,
           DB_NAME(database_id) AS DatabaseName,
           last_read = MAX(last_read),
           last_write = MAX(last_write)
    FROM
    (
        SELECT database_id, last_user_seek, NULL FROM agg
        UNION ALL
        SELECT database_id, last_user_scan, NULL FROM agg
        UNION ALL
        SELECT database_id, last_user_lookup, NULL FROM agg
        UNION ALL
        SELECT database_id, NULL, last_user_update FROM agg
    ) AS x (database_id, last_read, last_write)
    GROUP BY database_id

    what do you think?

  • would this work?
    SELECT
    DB_NAME(database_id) AS dbName,
    MAX(last_user_seek) AS last_user_seek,
    MAX(last_user_scan) AS last_user_scan,
    MAX(last_user_lookup) AS last_user_lookup,
    MAX(last_user_update) AS last_user_update
    FROM sys.dm_db_index_usage_stats
    GROUP BY DB_NAME(database_id)

  • Paul Hewson
    May 22, 2019 7:50 am

    I use this, executed by an agent job every 5 minutes or whatever interval I feel is required. Is there anything wrong with using sp_who2?

    -- Log if databases are in use
    USE DBA;

    IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.Tables WHERE TABLE_NAME = 'DatabasesInUse')
        CREATE TABLE DBA.dbo.DatabasesInUse ( [DBName] varchar(200), [HostName] varchar(200), [Login] varchar(200), [Lastbatch] varchar(200), [PollDate] datetime )

    IF EXISTS (SELECT * FROM tempdb..sysobjects WHERE id = OBJECT_ID('tempdb..#spwho2')) DROP TABLE #spwho2
    CREATE TABLE #spwho2 (
        [SPID] INT
        ,[Status] VARCHAR(200)
        ,[Login] VARCHAR(200)
        ,[HostName] VARCHAR(200)
        ,[BlkBy] VARCHAR(200)
        ,[DBName] VARCHAR(200)
        ,[Command] VARCHAR(200)
        ,[CPUTime] INT
        ,[DiskIO] INT
        ,[LastBatch] VARCHAR(200)
        ,[ProgramName] VARCHAR(200)
        ,[SPID2] INT
        ,[RequestID] INT
    )
    INSERT INTO #spwho2
    EXEC sp_who2

    INSERT DBA.dbo.DatabasesInUse
    SELECT DISTINCT
        s.DBName
        ,s.HostName
        ,s.Login
        ,MAX(s.LastBatch) AS Lastbatch
        ,PollDate = GETDATE()
    FROM #spwho2 AS s
    WHERE s.DBName IS NOT NULL
        AND s.DBName NOT IN ('msdb', 'master', 'model')
    GROUP BY s.HostName
        ,s.DBName
        ,s.[Login];

    DROP TABLE #spwho2

      • It won’t catch short-lived sessions.

        • As a lightweight trace it’s quite handy, though. I run a similar thing every minute. I find if it’s been running for a few weeks it will predict unused databases with 95% accuracy, so like all these things, it provides a good place to start further investigation.

          Another way is to look at the values of datetime columns; it’s just another way to build up a bigger picture before doing a full trace.

        function Get-LastdattimeValues ($instance, $Database, [switch]$GetLatest)
        {
            $q = @"
IF OBJECT_ID('tempdb..#Temp') IS NOT NULL DROP TABLE #Temp
CREATE TABLE #Temp (LastDate datetime, SchemaName sysname, TableName sysname, ColName sysname)

DECLARE @Statement nvarchar(max)
DECLARE @SchemaName sysname
DECLARE @TableName sysname
DECLARE @ColName sysname
DECLARE Columns CURSOR FOR SELECT 'INSERT INTO #Temp SELECT TOP 1 [' + c.name + '] , ''' + SCHEMA_NAME(t.schema_id) + ''' AS SchemaName, ''' + t.name + ''' AS TableName, ''' + c.name + ''' AS ColName FROM [' + SCHEMA_NAME(t.schema_id) + '].[' + OBJECT_NAME(t.object_id) + '] WHERE [' + c.name + '] IS NOT NULL AND [' + c.name + '] < getdate() ORDER BY [' + c.name + '] DESC' AS Statement, SCHEMA_NAME(t.schema_id) AS SchemaName, t.name, c.name FROM sys.columns c JOIN sys.tables t ON t.object_id = c.object_id WHERE c.user_type_id = 61 AND t.is_ms_shipped = 0
OPEN Columns
FETCH NEXT FROM Columns INTO @Statement, @SchemaName, @TableName, @ColName
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @Statement
    EXEC sp_executesql @Statement
    FETCH NEXT FROM Columns INTO @Statement, @SchemaName, @TableName, @ColName
END
CLOSE Columns
DEALLOCATE Columns
"@
            if ($GetLatest)
            {
                $q = $q + " select top 1 LastDate from #temp order by LastDate DESC"
            }
            else
            {
                $q = $q + " select * from #temp order by LastDate DESC"
            }
            Invoke-Sqlcmd -ServerInstance $instance -Database $Database -Query $q
        }

  • Is there a way to tell from a transactional replication subscriber if the replication is lagging? I only have user-level permissions on the subscriber database, with no access to the publisher. I can’t seem to find any fields in the database itself that will assure me that it is up to date. It is a fairly high-use database, so I can use certain update fields, but I am afraid that on weekends the database may not have enough activity to use this method with any confidence.

  • The crux of the problem is how we define “used” and “lightweight”. All of the methods above involve inferring user activity by measuring resource usage for a subjectively sufficient length of time.
    I’d suggest enabling login auditing for maybe 15 minutes during business hours on what we believe should be a busy OLTP database (or x days on something like an occasionally used data warehouse or financial reporting database), and then looking in the SQL Server log for login events coming from user accounts.
    But still there could be zombie polling or report-refresh services that hit the database consistently, even though real users stopped using the application or caring about the data a long time ago.
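    Once auditing of successful logins is enabled, a quick sketch of the log check (xp_readerrorlog’s second parameter 1 reads the SQL Server error log; the third is a search string):

    EXEC master.dbo.xp_readerrorlog 0, 1, N'Login succeeded';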

    • Eric – I’m guessing you come from an environment outside of SQL Server. In SQL Server, you can log into database A, but query database B (especially with cross-database queries and joins.) I wish it was as easy as logins, but that’s not how SQL Server works. Thanks for the idea though!

  • Mike FLEMING
    April 24, 2021 7:26 am

    Check out the Get-LastIndexUpdateForDatabses cmdlet in my DBALibrary PowerShell script.

    https://github.com/MikeyMing/powershell-sql/blob/master/DBALibrary.ps1

    Unless we can do a full trace, a multi-tool approach is needed.

    Anyone have any ideas on how Extended Events can help with this problem?

