Replication Won’t Refresh Your Dev and Pre-Production Environments

It’s almost like the real thing, except it’s totally different.

At first glance, SQL Server’s transactional replication seems like it’s useful for moving data around in all sorts of situations: it works in Standard Edition, it’s not synchronous, and you can have multiple subscribers.

Why People Want Replication to Test and Pre-Production Environments

Setting up a test environment sounds simple at first. But it’s pretty tricky. Here are typical requirements:

  1. Data in the test environment needs to be fresh, recent production data to support query tuning
  2. Data and schema in the test environment need to be writable for testing purposes
  3. Data and schema in the test environment need to be periodically reset back to production-like configuration

The first of those requirements are why replication sounds like a great idea. There’s a lot of gotchas, even with that issue, but the second two requirements are where replication just clearly doesn’t meet the bar.

Replication isn’t Friendly to Changing Subscriber Tables

Transactional replication won’t like it if you go around changing data and schema on its subscriber tables. If you change the data, updates and deletes flowing through replication may start to fail. Well, let’s just be realistic– they WILL fail, and fixing it will be manual and a major pain point.

Schema changes are even dicier. You can’t just drop or change the data type in a replicated column, you need to modify replication for that– and that means adding filters to the publication.

Resetting Replication isn’t Fun, Either

Re-initializing a replication subscriber is often very impactful on the publisher: running a replication snapshot is an intensive, time consuming process that causes CPU, IO, and lock overhead on the publisher. (Warning: this snapshot is a totally different thing from the database snapshot I talk about a bit later. This one basically copies out all the contents of the published articles on the publisher into flat files, copies everything, and then loads it into the subscriber.)

You can initialize a replication subscriber from a full backup, but that’s not lightweight or fast for large databases, and it’s very dependent on great custom scripting and timing. And usually people are looking at replication because they want to avoid big backups and restores.

What About a Creative Solution Involving Replication?

Refreshing test and staging environments is a tough problem. So tough that people often try to get creative, like this:

  • ProductionDB in Denver is a transactional replication publisher on PROD01
  • SubscriberDB in Portland is a transactional replication subscriber on STG01
  • Periodically, some process is run against subscriber DB to reset other databases on STG01. Ideas usually involve database snapshots, backups, and custom scripts.

But when you dig into the details, this is never great. Here’s why.

Transactional Replication Impacts the Publisher, Your Developers, and Your DBAs

If there’s one thing that most DBAs and developers can agree on, it’s that replication has made them work late and curse a lot.

Replication requires that every table have a primary key. This may mean schema changes just to get it set up.

Replication slows down your software deployment, because it’s easy to mess up publishing tables when you make schema changes to them. The only way to not mess this up is to also implement replication in your development environment, which developers hate. And in our scenario, where the test database is the subscriber, what would that even mean? Recursive replication torture?

Replication impacts performance on the publisher, particularly under high load. It frequently requires tuning storage, setting up a scale out distributor ($$$ hardware and licensing $$$), fine tuning your publisher and subscriber properties, implementing custom monitoring, and basically having at least one team member obtain a Masters Degree in Replication at the School of Late Nights. I went to that school a long time ago, and I didn’t keep the yearbook, but I did write about it here.

Database Snapshots are Read-Only

There’s no such thing as a writable database snapshot in SQL Server, even though it’s an Enterprise Edition feature. (Fancy SANs have options for that, SQL Server itself does not.) That pretty much takes most of the uses out of it for a staging environment.

If You’re Writing a Custom Script in the Staging Environment, Why Use Replication?

If you have to basically write a custom ETL to read from the subscriber to refresh staging databases, transaction log shipping is much easier to manage than replication, and it allows a read only standby mode on the subscriber.

If You’re Using Backup and Restore, Replication is Also Overkill

While it’s technically possible to back up a replication subscriber and restore it, then remove replication from it, it’s not all that great.  This is a lot of complexity without a lot of payoff.

  • You’ve still got the time to do the backup and restore (which you were trying to avoid in the first place)
  • You haven’t validated that the database you’ve restored has a schema that matches production (it could be very different on the replication subscriber)

You’re much better off basing the restore off production in this case. Even if the pre-production and testing databases are in a different datacenter, you can optimize the types of backups used, compression and network copy time, and restore schedule. Even setting up multiple test instances that restore “in rounds” and which can be used at different times a day is often operationally preferable to the performance, monitoring, and caretaking needs of replication.

There’s No Easy Answer to this Problem

Refreshing staging and development environments doesn’t have a single right answer. This is done many different ways based on application requirements. Just don’t pin all your hopes on replication and start investing in it without looking at the details of exactly how it will work — because it’s probably not what you think it is at first. Always make sure you factor in:

  • Performance impact on the publisher and hardware/storage/licensing costs to mitigate this
  • Any restrictions or requirements on schema changes at the publisher and impact on code development
  • Operational cost of upkeep of the technology

Don’t give up! Having staging and development environments with recent data that works well for you is totally possible, you just need to consider more technologies than replication.

Need Help?

We’ve got lots of resources, for money and for love. Or just for free.

Brent says: Yeah, replication isn’t what I call “refreshing.”

Previous Post
The Fastest Way to Reconfigure a Bunch of Servers
Next Post
An Update to SQL Server Management Studio is Available

19 Comments. Leave new

  • “If there’s one thing that most DBAs and developers can agree on, it’s that replication has made them work late and curse a lot.”

    This is an essential truth of the universe!

  • Matt Salsbury
    July 2, 2015 10:07 am

    We use a nightly script that restores the nightly full backup to a QA server. This gives QA and developers a place to point the latest build, and doubles as a test of the nightly backups.

    This works well in an environment where the business keeps regular hours and DBAs are the only people with access to the critical production environments….. which is a rarity.

  • Nice article.

  • Indeed, replication is an amazing and very useful tool, and at some point of my life i spent more time with it than with my wife. However, keep the process of refreshing dev environment “NOT FOR REPLICATION”

  • Robert French
    July 2, 2015 11:31 am

    Thank goodness, its not just me that has one of two challenges with replication :-> Feeling better already !

  • Brett Clyde
    July 2, 2015 12:50 pm

    Excellent post! I laughed. I cried. I had terrifying flashbacks. Good times. Good times.

    Seriously, this was a great post that lends credence to statements I have been making for some time. Replication is a powerful tool, but it has gotchas one must understand and be prepared for. Thank you for all the useful information and links!

  • Hello, Kendra! Great article as always!
    It’s funny when I see bizarre solutions using replication. I’ve seen – more than once!!! – people considering replication as a HA solution…
    It’s true there is no simple and easy solution, but I have a success case where I had to refresh +1TB data for both from production to production and from production to homolog/dev in few minutes instead of few hours. I needed more than the typical DBA skills to develop the solution and went dirty at metal and SAN level. Thanks to storage support I could make snapshots from LUNs with perfectly healthy database files that went straight to their destination servers for attach. The storage had a good Powershell support so I managed to get the best from both SAN and SQL Server worlds in a single error proof script.
    As I used storage level snapshot there was an additional bonus because we saved a lot of space! Of course we had to make sure that working with snapshots wouldn’t cause performance problem. Fortunately my customer SAN did a great job handling multiples snapshots without any perfomance hiccup.
    In the end everybody was very happy with a efficient solution plus a big storage savings!

  • James Anderson
    July 2, 2015 3:05 pm

    I’m thinking of doing nightly refreshes using something I have written for migrations.

    For these migrations I log ship to the destination server and when I want to perform pre go live tests that update the data I run my PowerShell script that ‘pauses’ log shipping.

    It does this by backing up the master DB, stopping the instance, ‘cold copying’ the MDF, NDF and LDFs and then starting the instance bringing the database online (with the original files). Log shipping is broken but the tests can be performed. When testing is done I run my second PowerShell script that stops the instance, deletes the original (now edited) files, copies the copied files in their place, restores the master DB (so the instance is tricked into thinking the DB was never restored, which has never happened to these files) and bring the instance backup. At this point log shipping can be resumed from where you left of. The copying of the files takes time but it’s a over night solution and it far out weighs the impact of pulling fresh backups from production.

    I’ll write a blog post at some point so people can let me know how bad an idea they think it is.

  • I’m about to migrate a 2008 merge\transactional architecture to 2014. Wish me luck!

    • Good luck, man:) I`ve recently migrated a set of 2000 replications to 2014. Fortunately the number of servers has been reduced which led to less replications needed.

  • I’ve worked with replication on 2000/2005/2008/R2/2012 (not a boast BTW) and it never gets any better. After a few beers I’ll show you the scars.

  • As for performance of transactional replication and reducing the load on the publisher it is a good practice to use a separate distributor and separate distribution database on that distributor for each publisher, but that will cost you additional money as mentioned in the article.
    Another tip: set snapshot always available to false and reduce subscription expiration interval.
    Above allowed me to reduce latency a lot.
    However, regarding dev and test environments I prefer to avoid replication and update them via scheduled SSIS packages.

  • We only refresh Dev, QA and/or Staging on-demand – replication certainly would be a poor choice here. Developers, BA’s and Testers often have a ton of new data loaded and/or are testing out scenarios, so automating restores isn’t feasible either. It takes coordination across many, many teams and end users where I’m at.

  • Any thoughts on using SAN clone to non prod environments? we have been using it for really large databases, It has its downsides but we found its not too bad if you can afford it 🙂

  • Nice post.
    We use replication of our ERP DB for BI tools to churn on.
    However we setup a test replication DB and now I need to shutdown replication and toss the DB, however I have read about issues with doing this properly but I can’t seem to find any step by steps on how to do it properly.


Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.