Performance Tuning SQL Server Change Tracking


Change Tracking is a developer tool introduced in SQL Server 2008 to help you sync data between SQL Servers, or between SQL Servers and devices. It’s designed to help you answer the request, “I need to get the most recent value for rows that have been modified since I last grabbed data.”

First, I’ll give you the top three reasons people use Change Tracking. Next, I’ll give you queries! If you’re using Change Tracking, these will help you document how it’s configured and measure how big the secret internal tables behind Change Tracking have grown.

Then we’ll peek into query plans for queries using Change Tracking and talk about some common pitfalls that can impact performance.

Finally, we’ll talk blocking and data consistency. There may be some tradeoffs you need to make to get consistent results out of change tracking.

Why do people use Change Tracking?

Here are the top three reasons that Change Tracking with SQL Server sounds attractive to people:

1. Change Tracking doesn’t send every change over the wire. With a technology like transactional replication, every change that’s made is sent off into replication. If I update a single row a million times, that usually means at least a million “commands” head off into my distribution database. But I don’t always want all of those values. Often, I just want the most recent value, so Change Tracking offers a lightweight solution.

2. Change Tracking doesn’t care what platform your client is running. With a technology like merge replication, all the partners need to run merge replication. With Change Tracking, your client application just needs to come in and talk to a web server. The web server then queries out data that’s changed and sends it back to the client. The client doesn’t need to know a thing about SQL Server. (Clarification: there’s a part of merge replication itself which is also called “Change Tracking”. That’s totally different than what I’m talking about in this post.)

3. Change Tracking is cheap. It works in every edition of SQL Server (Standard Edition, Web Edition, you name it). This isn’t true for all your replication options – Change Data Capture requires Enterprise Edition, for example. Licensing SQL Server can be a huge part of your IT budget, so features that work with Standard Edition are often very attractive.

But as you’ll see in this post, there are tradeoffs, and Change Tracking doesn’t fit every scenario.

Are you cleaning up, and do you have a retention period set?

Change Tracking isn’t free. As you’ll see up ahead, it creates secret tables in your database. The first thing to find out is where you have change tracking turned on, whether you have cleanup enabled, and what it cleans up. Check that out by running this query against any database on your SQL Server instance:
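A query along these lines does the trick (a sketch using the documented sys.change_tracking_databases catalog view; it returns one row per database with Change Tracking enabled):

SELECT db.name AS database_name,
    ctdb.is_auto_cleanup_on,
    ctdb.retention_period,
    ctdb.retention_period_units_desc
FROM sys.change_tracking_databases AS ctdb
JOIN sys.databases AS db ON ctdb.database_id = db.database_id;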

[Screenshot: one of these databases is not like the other.]

Are the right tables in Change Tracking?

You need to know which tables are being tracked. The more tables you track changes on, the more SQL Server has to hoard in other secret tables, and the more it will have to clean up. (Assuming you’re cleaning up. If you don’t clean up, that could be bad.)

Run this query against every database that has Change Tracking enabled to find out which tables have changes tracked:
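Something like this works (a sketch built on the documented sys.change_tracking_tables catalog view, which you query from inside each database):

SELECT sc.name + N'.' + so.name AS tracked_table,
    ctt.is_track_columns_updated_on,
    ctt.min_valid_version,
    ctt.begin_version,
    ctt.cleanup_version
FROM sys.change_tracking_tables AS ctt
JOIN sys.objects AS so ON ctt.object_id = so.object_id
JOIN sys.schemas AS sc ON so.schema_id = sc.schema_id;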

[Screenshot: I just have two tables in Change Tracking. You might have a LOT more.]

How many committed transactions show in sys.dm_tran_commit_table?

We’ve looked at which tables you’re tracking and how long you’re retaining changes. But how many changes have been committed in that time period? This is the number committed across all the tables in Change Tracking (so the more tables you’re tracking, the bigger this can get).
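You can count it with a query like this (a sketch; the count plus the commit time range tells you how much history is sitting around):

SELECT COUNT(*) AS committed_transactions,
    MIN(commit_time) AS minimum_commit_time,
    MAX(commit_time) AS maximum_commit_time
FROM sys.dm_tran_commit_table;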

[Screenshot: at this moment, things were VERY clean.]

How big are your hidden Change Tracking tables?

There are two primary places where Change Tracking keeps data about what’s changed:

  • sys.syscommittab – this is the system table behind the sys.dm_tran_commit_table view, which you saw above
  • sys.change_tracking_objectid tables – there’s one of these for each table you track.

This query will show you all the internal tables with their size and rowcount:
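Here’s one way to write it (a sketch: sys.syscommittab and the per-table change_tracking internal tables both show up in sys.all_objects, and sys.dm_db_partition_stats has their sizes):

SELECT sao.name AS internal_table_name,
    SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count,
    SUM(ps.reserved_page_count) * 8. / 1024. AS reserved_MB
FROM sys.all_objects AS sao
JOIN sys.dm_db_partition_stats AS ps ON sao.object_id = ps.object_id
WHERE sao.name = N'syscommittab'
    OR sao.name LIKE N'change[_]tracking[_]%'
GROUP BY sao.name
ORDER BY reserved_MB DESC;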

[Screenshot: Oooo, at this moment things were NOT cleaned up! Compare the size and rows in my sys.syscommittab table to everything else!]

Change Tracking Cleanup: No, you’re not crazy

We’ve just looked at two things that might not completely line up: your minimum commit time, and your cleanup/retention period.

If these don’t match, the root cause might be a few things:

  1. You’ve hit a bug in SQL Server 2008 and cleanup just isn’t running. See KB 973696, “FIX: Change tracking cleanup does not clean up the sys.syscommittab system table in SQL Server 2008”
  2. The Auto Cleanup Thread hasn’t run yet. This is a background task. I have observed that if I change the retention period on an existing database with change tracking, it can take quite a while – about 30 minutes in my tests – for CHANGE_TRACKING_MIN_VALID_VERSION to update for a tracked table. The best documentation I can find for this is the comment from Jan 5, 2010 on this Microsoft Connect Item.

If you adjust cleanup thresholds and you don’t see cleanup happening right away, you’re not crazy.

sys.sp_flush_commit_table_on_demand

I bet I know what you’re thinking right now. “How do I make cleanup run when I want it to?”

Well, I can’t give you a documented way to do it. However, Nic Cain will show you how to use the undocumented sys.sp_flush_commit_table_on_demand procedure. Nic has also written on some limitations that he found in Change Tracking cleanup here, which might require you to run cleanup manually if you have a lot of changes flowing through your system.

This is undocumented, so test carefully and use it at your own risk. You can review the code it runs by running:
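EXEC sp_helptext 'sys.sp_flush_commit_table_on_demand';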

[Screenshot: in search of a sysadmin who loves undocumented procedures and long walks on the Change Tracking internal tables.]

As Nic warns, your manual cleanup could conflict with automatic cleanup. So if you choose to use this undocumented procedure, you may need to disable auto-cleanup for a time period. That requires running an ALTER DATABASE command, which might impact your execution plan cache and will require its own locks.
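If you go that route, the moving parts look something like this (a sketch only – MyDatabase is a placeholder, and @numrows is the batch-size parameter Nic describes):

-- Turn off the background cleanup task while manual cleanup runs
ALTER DATABASE MyDatabase SET CHANGE_TRACKING (AUTO_CLEANUP = OFF);

-- Undocumented! Deletes eligible rows from sys.syscommittab in batches
EXEC sys.sp_flush_commit_table_on_demand @numrows = 10000;

-- Turn the background cleanup task back on when you're done
ALTER DATABASE MyDatabase SET CHANGE_TRACKING (AUTO_CLEANUP = ON);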

Note: even using sp_flush_commit_table_on_demand, I have not seen cleanup start removing rows immediately after I change the retention period on a restored database. The cleanup procedure relies on results from the hidden internal functions change_tracking_hardened_cleanup_version() and safe_cleanup_version(), and I have not found a way to force the values they return to update on SQL Server 2012 SP1 – I’ve had to wait about 30 minutes, as I mention above.

So again, even using this procedure, you might at times wonder if you’re crazy, or if Change Tracking is haunted. Be patient, be careful, and write some lightweight code to monitor and alert you if those internal tables get too big for your tastes.

Is your code using the CHANGE_TRACKING_MIN_VALID_VERSION function?

Now that you know if cleanup is turned on and how it’s working, check if your code is calling CHANGE_TRACKING_MIN_VALID_VERSION() at any point.

This is important to make sure the data you’re pulling is correct. Let’s say I’m using Change Tracking to move data to a secondary SQL Server. Normally it polls data every hour, but there’s a hardware problem and my secondary server is offline for 3 days. The next time it polls, it needs to make sure that all the data it requires is still on the publishing server.

This means your clients need to store off the highest version they pull down, and check it against CHANGE_TRACKING_MIN_VALID_VERSION() each time they poll.
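The check itself is small. In this sketch, @last_sync_version is a hypothetical value the client stored after its previous pull:

DECLARE @last_sync_version bigint = 1000000; -- hypothetical stored value from the last sync

IF @last_sync_version < CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID(N'Person.Person'))
BEGIN
    -- Changes this client needs have already been cleaned up: full re-sync required
    RAISERROR (N'Last sync version is no longer valid; reinitialize this subscriber.', 16, 1);
END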

Your client may also need to use a special isolation level called SNAPSHOT. (More on this soon!)

How are you querying with CHANGETABLE?

Here’s an example of the columns you can work with when querying:
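The screenshot below came from a query against AdventureWorks along these lines (a reconstruction – the LEFT JOIN matters, because deleted rows won’t have a match in the base table):

SELECT ct.BusinessEntityID,
    p.FirstName,
    p.MiddleName,
    p.LastName,
    ct.SYS_CHANGE_VERSION,
    CHANGE_TRACKING_CURRENT_VERSION() AS current_version
FROM CHANGETABLE(CHANGES Person.Person, 2) AS ct
LEFT JOIN Person.Person AS p ON ct.BusinessEntityID = p.BusinessEntityID;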

For each row that’s changed in the Person.Person table since revision number 2, I get the latest version of the FirstName, MiddleName, and LastName for that row. I can see what the highest version number of those changes was, and I can also see the current highest version number for all tracked changes in the current database.

[Screenshot: query results comparing SYS_CHANGE_VERSION to CHANGE_TRACKING_CURRENT_VERSION().]

Could Trace Flag 8295 Help Your Performance?

Trace Flag 8295 is documented in KB 2476322 and applies to SQL Server 2008 R2 CU7 / SQL Server 2008 SP2 CU4. After the trace flag is turned on, an additional nonclustered index is created on the internal change tracking table whenever you enable Change Tracking for a table.

There’s a big old disclaimer on that KB that says you need to test thoroughly in a non-production environment before making this change. (Hey, that’s a good rule for trace flags in general.) I personally have not tested this trace flag, but it is used by default to support Change Tracking features in some System Center / SCOM products. This blog post by Santos Martinez gives some details.

(Thanks to Adam Machanic for letting me know about this trace flag, and to André Kamman and Edmund Poillion for helping track down the relevant documentation.)

What do your execution plans look like?

I’ve been giving you lots of ways to look at the internal objects for Change Tracking for a reason: the number of rows in the internal tables impacts performance.

Let’s say that my client last synced at revision number 1 million. A lot of changes have happened since then – 4 million in this table, and a handful of changes in other tables. I run a query like this to see what’s changed:
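Here’s a reconstruction of that query (the table and version number are illustrative):

DECLARE @last_sync_version bigint = 1000000; -- the revision my client last synced

SELECT ct.BusinessEntityID,
    p.FirstName,
    p.MiddleName,
    p.LastName,
    ct.SYS_CHANGE_VERSION
FROM CHANGETABLE(CHANGES Person.Person, @last_sync_version) AS ct
LEFT JOIN Person.Person AS p ON ct.BusinessEntityID = p.BusinessEntityID;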

The execution plan looks like this:

[Screenshot: the execution plan for this query.]

All of the tables with thick lines to the right side of the plan are the change tracking related tables. Looking at our STATISTICS IO and STATISTICS TIME output, here’s where we spent all our CPU usage:

(For readability, I removed the printed messages for readahead, lob, etc. – those were all 0.)
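If you want to take the same measurement against your own queries, turn the counters on in your session first:

SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Run the CHANGETABLE query, then read the Messages tab for per-table IO and CPU time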

Here’s the story the execution plan told me as I stepped through it:

  • First, SQL Server pulled the commit_ts (it’s a timestamp!) and xdes_id (transaction identifier) columns from sys.syscommittab for every commit that occurred since our revision number. It used a function, an index scan, and a merge join to do this. It found 4,001,000 rows with these columns.
  • SQL Server then looked for the corresponding TransactionID values in the sys.change_tracking_574625090 table. Unfortunately, it did an index seek 4,001,000 times to do this. It found 4,000,000 values matching the commit_ts and xdes_ids. (The other 1K were changes made to a different tracked table.)
  • The 4 million rows were pushed into a hash operation to find the unique TransactionIDs behind all those revisions. The result was a single row. (It had just been changed 4 million times.)
  • The TransactionID key was then used to look up the related columns in a single, efficient, clustered index seek. Hey, that part was fast!

The whole process took 7.5 seconds. Of this, 99% of the work was just identifying what had changed.

Oops.

This may seem like an extreme example, but it’s really not. If you’re using Change Tracking on a very frequently modified table, or you’re not cleaning up very often (or at all), your overhead may in fact be far greater. I’ve seen undersized hash joins and tempdb spills in queries like this where revisions have piled up.

Change tracking, statistics, and parameter sniffing

Sometimes Change Tracking can be smart enough to give you a better execution plan based on the revision number you pass in. Whether or not it can do this well depends on two things:

  1. Whether statistics on the Change Tracking internal tables are up to date. If a flood of changes has come in since the last statistics update, SQL Server may badly misestimate how many rows your revision number will return.
  2. Whether the plan in cache was compiled (“sniffed”) for a revision number that returns a very different number of rows than the one you’re passing in now.

If you have highly variable runtimes on reads from your Change Tracking tables, either of these might be part of the problem.

Blocking, Cleanup, and Snapshot Isolation

Let’s talk about two problem scenarios that can happen with Change Tracking. They’re both related to that pesky background cleanup thread that you need to run so that your tables don’t bloat terribly, but which you can’t schedule.

Problem 1: blocking. When the cleanup thread runs, things get slow and you see locking and blocking. Investigation shows that it’s on the Change Tracking tables.

Problem 2: inconsistent results. A client comes in and says, “Hi, I need everything that changed after revision 99!” It checks the minimum valid revision, finds it’s below 99, and begins a transaction and starts pulling down data from multiple tables. While that’s happening, new data is coming into some tables. On top of that, cleanup starts before the sync finishes, and cleans up through revision 105 on some tables. Oh, we’ve got problems — we will have pulled down MORE revisions for some tables than others. And we will also have missed some revisions for some tables. We’re out of sync.

Because of these two problems, Microsoft strongly recommends that you use transactions and Snapshot isolation when reading and synchronizing data. It will protect you from those inconsistencies, which makes coding simpler, but of course there are tradeoffs. Learn more about implementing Snapshot isolation here.
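The documented pattern looks roughly like this (a sketch – MyDatabase is a placeholder, and the commented steps stand in for your real sync queries):

-- One-time setup: allow snapshot isolation in the database
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In the sync code:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    -- 1. Verify the stored sync version against CHANGE_TRACKING_MIN_VALID_VERSION() for each table
    -- 2. Pull changes for each table with CHANGETABLE(CHANGES ..., @last_sync_version)
    -- 3. Store CHANGE_TRACKING_CURRENT_VERSION() as the new sync point
COMMIT TRANSACTION;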

Backups, Checkpoints, and Change Tracking

There’s a bug where duplicate rows in change tracking tables can cause checkpoints and backups to fail. Get information on the fix in KB 2682488.

Takeaways: Performance Tuning Change Tracking

Here are the steps we’ve covered that will help you make Change Tracking go faster:

  1. Make sure you’re only tracking changes on tables that need it. Extra tables in change tracking will bloat internal tables, and this can impact query performance.
  2. Make sure that your code properly checks CHANGE_TRACKING_MIN_VALID_VERSION so that you can run cleanup (or you’ll get troublesome bloat in internal tables).
  3. Validate that your cleanup jobs can keep up (and monitor it).
  4. Review and test using the undocumented procedure sys.sp_flush_commit_table_on_demand for cleanup if needed, but be sure to get approval since… well, it’s undocumented.
  5. Use query execution plans and STATISTICS IO and STATISTICS TIME to measure how much of your query runtime is being spent on Change Tracking tables.
  6. Look out for out-of-date statistics on Change Tracking internal tables, and parameter sniffing on procedures using Change Tracking.
  7. Use SNAPSHOT isolation to ensure you get consistent sets of data, even with cleanup running and new data coming in. It will also help you avoid pesky blocking situations.

Should you use Change Tracking?

As you may have guessed from this post, Change Tracking isn’t something that just configures itself and works perfectly for every application. But do you know of any other built-in data replication features which are like that?

Synchronizing data is hard. Change Tracking offers some built-in features that may save you from writing custom code, but it’s not a great fit for databases with a very high rate of transaction commits to the tracked tables. For those databases, there are major downsides when it comes to cleaning up the internal tables. You should also consider that the changes written to all the internal tables are fully logged, which adds further overhead if you have a high level of writes. So you’ve got to choose wisely.

Could writing entirely custom code do better than Change Tracking? That’s up to you!


26 Comments

  • “(Clarification: there’s a part of merge replication itself which is also called “Change Tracking”. That’s totally different than what I’m talking about in this post.)”

    This is true. When I started analyzing Merge Replication and Change Tracking, I first thought that Merge Replication was built on ‘this’ Change Tracking feature. Maybe if Microsoft ever renews Merge Replication, they’ll make it use their own allegedly better features.

    Indeed, MSDN encourages us to use Change Tracking for synchronizing tasks, instead of triggers and timestamps or tracking tables: http://msdn.microsoft.com/en-us/library/cc305322(v=sql.105).aspx

    Jaime

    Reply
    • Kendra Little
      June 27, 2014 9:31 am

      Yeah. I think their perspective is “we baked this cake for you professionally, why make your own cake?”

      But maybe I just need a cupcake. Or maybe I’m making changes so often that their cake explodes. (My metaphor is clearly breaking down here!)

      Reply
  • samualhassi
    June 30, 2014 4:36 am

    Agreed with you, Jaime.
    But I also found this fantastic library from the MSDN team that looks perfect for making this tracking task much easier: http://msdn.microsoft.com/en-us/library/bb726006(v=sql.105)
    However, I would also like to share another useful piece of software named Lepide SQL Server auditor (http://www.lepide.com/sql-server-audit/), which seems to be a comprehensive approach when you need to monitor a SQL Server database with real-time alerts at a granular level.

    Reply
    • Kendra Little
      July 7, 2014 12:47 pm

      Oooo, I wouldn’t use Change Tracking for Auditing purposes at all — that’s not what it’s designed for.

      Reply
  • RE: “…. it’s not a great fit for databases with a very high rate of transaction commits to the tracked tables. For these databases there are major downsides when it comes to cleaning up those internal tables…..”

    For those folk that fall into this category (needing to track changes in a set of high volume transaction tables) – any advice? [In the SQL 2012 environment]

    Thanks!
    –Erik

    Reply
  • Just a follow up – I think Nic’s articles have moved here: http://sirsql.net/content/2014/05/

    Reply
  • Thanks for a great article! Got many answers by reading this!

    Reply
  • Any impact for having change tracking enabled on the database but not enabled for any tables? I think this happened in one of my databases which uses CDC but not change tracking. CDC retention was supposed to be set but change tracking retention period was set instead.

    Reply
    • Jeff – that’s a great question, but it’s not something that we’ve benchmarked. Generally I’d only recommend enabling features you’re going to use. 😀

      Reply
  • Thank you for this Kendra! If we had change tracking enabled for the same table in multiple databases (same schema, just customer specific)… Is there a way to select all of the changes at the SERVER level? If that’s not built-in, what would your recommendations be to track changed records for any one table that is tracked in multiple databases? Dynamic SQL (ugh) or maybe a system view of some sort or maybe a trigger on a change tracking table to insert a record into a shared database or maybe service broker?

    Ha.. I know each has its pros and cons, but would you lean toward one over the other to accomplish the task?

    Reply
    • Kendra Little
      August 31, 2015 11:04 am

      There’s no system-wide view to pick up all the changes at once. As you can tell from the article, change tracking has a lot of baggage, isn’t documented very well, and has a lot of gotchas, so I wouldn’t be eager to recommend it without a pretty heavy evaluation against other options. Same thing with Service Broker. But there is no general “this is always better” option in this case; it does require a pretty lengthy evaluation to recommend the best architecture.

      Reply
  • Thanks a lot for sharing such valuable information. I am in a situation where my CDC jobs (5 of them) are spiking CPU to 95%+ and Disk Queue Length is getting high. We don’t need CDC data immediately (we have a process that sends CDC table backups to a developer every night). My question is: what can I do to avoid the CPU spikes? Should I change polling or lower the number of @maxtrans (we are using the defaults now)? Will really appreciate your help.
    Thanks

    Reply
    • Hi Abbas,

      Couple observations – you’re using some non-SQL metrics for performance, which can be misleading, and you’re grasping at settings changes in a way that shows you’re a bit lost. You don’t want to just go bashing at buttons in a case like this; you need to analyze more closely what’s happening at the time to find the root cause (and to establish whether it’s even a problem – high CPU is a symptom, and might be harmless).

      I would start with the checklist here for incidents like this: https://www.brentozar.com/responder/

      Reply
  • Angelo Carzis
    December 7, 2015 8:16 pm

    Kendra, my colleagues and I are working on a solution revolving around change tracking…

    Our goal is to use SQL Server change tracking to track changes to be pushed to another non-SQL system.

    Our database is a subscriber (transactional replication). Numerous publishers are pushing transactions into numerous tables.

    During my testing, we observed that the entries recorded by change tracking are grouped into ‘chunks’, i.e. there are many records with the same value of SYS_CHANGE_VERSION.

    This is based on the query of select * from changetable(CHANGES my_table, 0)

    Observation: When we execute the above query, some of the rows are ‘interleaved’.

    For example, we may have 50 rows with SYS_CHANGE_VERSION = 100, 1 row with SYS_CHANGE_VERSION = 101, and then more rows with SYS_CHANGE_VERSION = 100.

    Our question:

    When rows are inserted into the change tracking table/mechanism, can rows be inserted with a lower value of SYS_CHANGE_VERSION than the current maximum value?

    We are trying to determine if all we need to do is have a sort order on SYS_CHANGE_VERSION and simply ignore records for the current max value of SYS_CHANGE_VERSION.

    Furthermore, once data is read from the change tracking table, can subsequent reads contain entries which have the same ‘maximum’ SYS_CHANGE_VERSION that was read previously?

    We are worried about whether or not we can continue/resume the process of pushing transactions to the non-sql system.

    As an alternative, we have considered using the replication stored procedures on my subscriber to push changes to my non-SQL system.

    Any answers greatly appreciated.

    Reply
  • For a three-tier financial application (Sql Server, Asp.Net application server, http/json clients), we are looking for the best (read: lightest weight) solution for one-way cache validation and synchronisation Sql Server –> client .

    From what I’ve read, Sql Server Change Tracking is overkill, since we only need the most recent date/time (or version) of the Sql Server table, which can then be compared with the version in the client cache.

    We’ve also looked at
    select top 1 user_updates from sys.dm_db_index_usage_stats where database_id=@dbid and object_id=@objid

    and also

    select top 1 leaf_insert_count + leaf_delete_count + leaf_update_count + nonleaf_insert_count + nonleaf_delete_count + nonleaf_update_count from sys.dm_db_index_operational_stats(@dbid, @objid, default, default)

    which both might serve our purpose, but we’re not convinced by the performance.

    Any thoughts here?

    Reply
    • A rectification and partial answer to my own question: whereas sys.dm_db_index_usage_stats performs very badly, sys.dm_db_index_operational_stats() performs like a miracle, at sub-microsecond speed! So it would seem the latter is perfect for our purpose 🙂

      Any other thoughts are still welcome…

      Reply
      • Nick – for general questions, head on over to a Q&A site like http://dbastackexchange.com.

        Reply
      • Those indexes are unreliable for change detection. They can’t handle “slow-moving SQL updates”: UPDATE TAB SET COL=GETDATE() for lots of records causes the index to be updated immediately, but records keep getting modified after that. (For large tables, this can run for minutes.)
        I never found a better way than EXISTS CHANGETABLE(CHANGES TAB, XXX). We have 1000 tables in our db, and sometimes checking them all from 6 threads finishes in between 0.5s and 1.5s.
        I would really like to find a quicker way, but there just isn’t one.

        Reply
  • […] Looking back, even if we had Change Tracking, we benefited from the granular control we had over this custom system, and the elegant simplicity of the implementation. The Change Tracking feature in SQL Server has remained somewhat of an ‘edge feature’, and people using it have had to deal with a variety of issues over time – a need to customize the indexes on the change tables, statistics estimate problems with change functions, and problems with cleanup. (I wrote about these a while back over here.) […]

    Reply
  • Stephen Mandeville
    April 16, 2019 11:38 am

    Is this thread still relevant?

    As the DBA here, I can’t reach the Microsoft support guy directly, because the system isn’t managed by us:

    Microsoft Configuration Manager, which uses change tracking.

    The Microsoft guy sent this to the admin, and now the admin is asking me:

    After checking the information you provided, I was able to see that the watermarks are effectively updating with each run of the cleanup thread, which means that the issue is not related to watermarks, however, it is possible that the size of the tables to clean can interfere with the cleanup process, so I may require you to implement the following index:
    CREATE NONCLUSTERED INDEX [si_commit_time]
    ON [sys].[syscommittab] (
        [commit_time] DESC,
        [commit_ts] DESC
    )
    WITH (SORT_IN_TEMPDB = ON);
    To implement it, you may need to do it through a DAC, and once it is in place, please proceed to run the SP [dbo].[sp_ManualChangeTrackingMetaDataCleanupProc].

    how do you add an index to an in-memory system table? That is my question. For the love of god, why is this so complicated…

    any help or hints appreciated

    Reply
