The best part about transactional replication is that it’s flexible. And the worst thing about transactional replication is that it’s flexible.
A poor replication configuration will cause you performance pains and increase your risks of development and administrative errors. Here’s one important design principle: isolate your replication subscriber tables from one another, and from non-subscriber tables.
Our example environment
We’ve got an environment with two SQL Server databases used for OLTP purposes: SalesDB and CRMDB. Each database uses SQL Server Standard Edition. The databases support different applications, and they’re on totally separate SQL Server instances (and failover clusters).
The SalesDB and CRMDB support completely different applications, but data from selected tables in them both is needed for internal reports. SQL Server transactional replication has been chosen as the technology to keep a copy of the data up to date on the reporting instance. Replication is already in use in the environment and a dedicated SQL Server has been configured for distribution.
There’s still choices to make: how do you configure the subscriber? My recommendation is that each replication subscription be isolated in individual databases. Let’s take a look at why.
The all-in-one subscriber (Avoid This!)
In this model, we have one big database named ReportDB on the Warehouse01 subscriber server. This database holds reporting tables as well as tables from replication subscriptions.
SQL Server will let you do this, but it’s got some downsides:
1. You’ve got a single transaction log (and a single recovery model) for ReportDB. Not all of the data in ReportDB may be recoverable from other sources. If for any reason you need to be in the FULL recovery model for ReportDB, you’re in a sticky situation if you have to reinitialize replication and reload the subscriber tables: you’ve got to do it fully logged, and your transaction log backups will be bigger and slower.
2. It’s easy for DBAs and Developers to screw this up. You may want to drop all the replication subscriber tables at some point if you have to re-initialize replication. I would much rather isolate drop table commands to a database where everything in the database is replication related, and I don’t have to worry about accidentally dropping the wrong thing! Similarly, it’s much easier to set up security so that application accounts can only read subscriber tables (not write to them), when they’re in their own database.
3. You’ve got fewer options when it comes to reinitialization. The ability to initialize a replication subscriber from a restored backup is huge– running snapshots on your publishers makes your users sad.
Isolated subscribers (Much Better Over Time)
Here’s the alternate model, where each subscriber has its own database:
Note that the subscriber databases are named in a way that you can identify their role. (As noted by David in the comments, it’s much easier if you don’t re-use the name of a publishing database.)
This model has a lot of benefits:
- More options to re-initialize
- Supports different recovery models on replication subscriber databases and ReportDB (which can allow minimal logging during bulk inserts if you’re re-initializing that way)
- Safer for administrators
- Easier to set up security
- Easier for new DBAs to and developers to understand (the architecture is much more transparent)
What If I’m Not Doing it This Way? Do I Have to Change All My Code?
Not necessarily. I’ve worked with this with two different models.
Synonyms: ReportDB could contain synonyms for all the replication subscriber tables. This makes the tables seem like they’re in ReportDB, while they’re actually in other databases.
Dynamically configured procedures. In one environment I worked in where replication was common, there was a configuration table where you configured the name of the replication subscriber databases needed by an application. When code was deployed, the installation process for stored procedures checked this configuration table and dynamically compiled the stored procedures referencing the name of the database in that environment. This sounds like more work, but it was very useful for the developers, who might need to deploy builds to instances configured differently than production.
Want To Learn More About Replication?
We’ve got you covered. Start with these articles:
- Introduction to SQL Server Transactional Replication
- Performance Tuning SQL Server Transactional Replication
- Monitoring SQL Server Transactional Replication