“We’d like to offload our reporting queries to a separate SQL Server.”
The first costs are fairly obvious.
Hardware and storage – even if you’re running it in a virtual machine, you need to account for the costs of say, 4 cores and 32GB RAM. Not only will you need storage for the databases, but you’ll also need to decide whether this server gets backed up, and copied to a disaster recovery data center.
Software licensing – Standard Edition is ~$2k per core, and Enterprise Edition is ~$7k per core. Toss in Windows (especially now that it’s licensed per-core), your management/backup/antivirus tools, and your monitoring software.
Project planning – you’ll need to design how to get the data from production to the reporting server, like with Always On Availability Groups, log shipping, or transactional replication.
App modifications – the app running reporting queries will need a new connection string. Even with Always On Availability Groups, reads aren’t automatically offloaded to readable replicas – you have to use the connection string parameter ApplicationIntent = ReadOnly to tell SQL Server that you promise not to try to write anything. If you have a single app that does both reads and writes, and you only want to offload some of the queries, you’ll need to go through the code to switch those queries over to the new connection string.
The rest of the costs are surprises.
Adding a troubleshooting process – sooner or later, the data replication process will break. Depending on the method (AGs, log shipping, replication) and failure type, it’ll fail in different ways – maybe all of the data is old, maybe just some of it is, or maybe the reports aren’t accessible at all. You’ll want to list out the failure methods and explain what symptoms will look like. This helps business users recognize when their reports are wrong, and react appropriately. If you don’t do this step, then after the first failure, people are just always going to expect that there’s a bug in the report data.
Prepare for failure – for each of those failure methods, decide how you’re going to react. For example, if AG replication breaks and reports are out of date, will you point reports at the primary until the problem is resolved, or will users just have to deal with unavailable reports while you troubleshoot or resync the replicas? If you don’t do this step, then you’re going to be winging it every time, and you’ll look unprepared while reports are wrong or down.
Set realistic expectations for RPO and RTO – based on your process and preparation, make sure the business users understand how long their reports will be down when things break.
Measure the overhead of replication – AGs and transactional replication can add performance slowdowns beyond what the reports used to cost. For example, if you were only running a few reports an hour, and only hitting a subset of the data, then suddenly replicating every individual delete/update/insert operation can have a huge overhead.
Add monitoring – you need to start monitoring how far behind the reporting server is, and how performance is doing on both. Performance troubleshooting becomes a lot harder, too – for example, when you’re doing index tuning, you have to combine data across both the primary and the reporting servers in order to find the right mix of indexes across the board.
Are you sure you really need to offload reporting?
Before you embark on this expensive project, ask:
- What’s the primary wait type that we’re facing?
(Find out with sp_BlitzFirst @SinceStartup = 1)
- What’s the cheapest/easiest way to reduce that wait type?
Time and again, I see people facing PAGEIOLATCH waits (which mean waiting to read data pages from a data file), and they’re juggling a 1TB database with 16-32GB RAM. Don’t spend tens of thousands of dollars to fix that problem – buy $1,000 of RAM and spend some time doing index tuning.