… is to use a SQL Server where a robot does it for you!
Let’s take a look at how you can do this in AmazonRDS’s hosted SQL Server these days.
Normally, restoring transaction logs is super tedious.
You’ve got to restore all the files in the right order, for every single database. Even if you script it, if you’ve got a bunch of databases you need to restore, it’s a giant pain. AmazonRDS has essentially done all the scripting for you.
Amazon RDS makes sure it can always restore your SQL Server to a point in time
When you set up an RDS SQL Server instance, it asks you how long you’d like to keep backups, and if you have a preferred full backup window. And that’s all it asks. Transaction log backups are taken every 5 minutes to support point in time restores, no matter what you want.
So I wondered, what happens if someone creates a database in the wrong recovery model by accident? Does this break the transaction log backups?
Nope! I created a database named DB1_CreatedInSIMPLE using the SIMPLE recovery model. Check out what showed up in the RDS logs right afterward:
How Recently Can I Restore?
You can see log backups in the “Recent Events & Logs” above, or you can just look at the latest restore time for your instance. Since transaction log backups are every five minutes, you can’t always restore to one minute ago.
OK, Let’s Restore This Thing! How Easy Is it?
I wasn’t lying when I said it was easy. You just highlight your instance in the console and select “Restore to Point in Time” under instance actions.
This is going to create a whole new instance using the backups they’ve been taking. You get to pick the time you want to restore from. This will vary depending on how long the instance has existed, and what your backup window is set at. (Mine’s at 7 days, it can go up to 35 days.)
You also get to pick the instance name you’re going to restore to, along with all sorts of properties and security settings:
Setting up the restored instance isn’t immediate. RDS has to set up a new Windows instance and apply a whole lot of your backups, and that takes time.
After you get all your settings done, you click Launch to get your new RDS instance in motion to be created:
Our instance is Cooking
The instance goes through multiple phases. It automatically backs up the instance as part of the process if you’ve got backups enabled, just like any other instance.
Once that completes, I can connect
Here I am, connected to my original and restored instances via SSMS on my desktop!
SQL Server in RDS Isn’t Perfect, But It Does a Lot For You
I am not a founding member of Team Cloud. But I meet a lot of people who struggle making sure that backups are being taken properly and that they can restore. They’re frequently teams of developers who don’t want to learn to deal with backups and restore. They also want things like database mirroring without having to learn it, and RDS handles that for them, too.
This isn’t for everyone. RDS limits the number of databases you can have on an instance. It doesn’t do all maintenance for you — you have to set up your own CHECKDB jobs, for instance, which you can do in the SQL Server Agent. You don’t get tools like Database Mail inside the SQL Server, or the ability to see the SQL Server Error log via SSMS (don’t worry, you can get to that and other logs in the AWS Console). You can’t use every version of SQL Server out there (2008R2 and 2012 only, currently). You can’t do Availability Groups in Amazon RDS, either, just database mirroring. (I’m a big fan of mirroring, so I’m not all that sad about that.)
Putting it All Together
Database as a service is a pretty compelling option these days. It doesn’t take away the role of the DBA, and it doesn’t handle everything for you — but the flip side of that is that you get a lot of flexibility and customization.
And dang, I love the ease of restoring that instance to a point in time.
Brent says: what I love about the cloud is that it takes away the parts of the DBA job that I never really liked in the first place.
Jeremiah says: I’m with Brent – I love getting rid of the parts of the job I never liked. Cloud lets me do that. And RDS lets me ease up management on a certain class of servers even more
Erik says: I, for one, welcome our new Robot DBA overlords.
I’ve worked with source control quite a bit over the years — everything from ye olde Visual Source Safe to Subversion to TFS. I even supported the Source Depot system at Microsoft as an engineer for a year back in the day!
These days I don’t use source control a ton. We keep repositories of scripts in GitHub to help manage our scripts as a team, but I don’t really use it often enough to learn the command line well.
Even so, I’ve learned the basics of branching and merging. Git can be really confusing because there’s a ton of features, but here’s an outline of how you can work with a team in an uber-simple branching scenario safely.
When I Say 101, I Mean 101
We’re talking GUI screenshots and simple concepts, here. I’m not talking about how to do source control for schema and diffs, I’m just talking about how to save all those scripts you wrote about checking for that weird thing that happens in production every third Tuesday.
1. Create a Branch
Branches are great, because they keep you safe. They give you a sandbox to work on your code where you don’t disrupt the main line of code. You can work in iterations and make multiple commits to your code, so you can do / undo incremental changes.
2. Check in Your Code (Commit to your branch)
You can make changes to multiple files. For this example, I made changes to just one file. Make sure that you’re committing to the new branch you created!
3. Create a Pull Request (to merge into another Branch)
Once all your commits are done and you’ve tested your code and you’re ready to roll, create a pull request. You’re essentially proposing your change for someone else to review and merge to a destination branch of your choice.
The big advantage to pull requests is that it makes it simple for your reviewer to see exactly the changes you’ve made and know clearly what will be impacted by merging.
Your pull request will automatically send out an email and notify other users of your repo that you’re ready for things to happen, and it’s time to review.
4. Reviewing the Pull Request (Someone else does this!)
Now it’s time for someone else to be responsible. The good news is, your pull request makes it easier for them to review your code. Here’s what it looks like in the GitHub web application:
To get into the details, it’s super easy to click on the “Commits” tab and see all the changes. The reviewer can easy open all the files, leave comments, and even edit changes if they really wanna get up in your business. Or they can push it back to you.
5. When the Reviewer is Happy, they Confirm the Merge
This pushed the change down to the branch you targeted, nice and easy.
6. Delete the Branch
Right after merging this, it’s very easy to delete the branch and cleanup. You can also leave the branch there if you want, and there’ll be a history of what happened with the pull request. No more wondering, “what happened with this branch?”
It’s so easy, even a manager can do it
You don’t have to be sloppy and save all your scripts in a file share!
Jeremiah says: pull requests make collaborative development easier – you can bundle up related changes, comment on why you did what you did, and collaborate with your team to get the best fix in place.
Doug says: This also answers a question we get frequently: “What can I use for source control for SQL Server?” It’s not integrated with SQL Server, but it’s free and not too shabby.
At first glance, SQL Server’s transactional replication seems like it’s useful for moving data around in all sorts of situations: it works in Standard Edition, it’s not synchronous, and you can have multiple subscribers.
Why People Want Replication to Test and Pre-Production Environments
Setting up a test environment sounds simple at first. But it’s pretty tricky. Here are typical requirements:
- Data in the test environment needs to be fresh, recent production data to support query tuning
- Data and schema in the test environment need to be writable for testing purposes
- Data and schema in the test environment need to be periodically reset back to production-like configuration
The first of those requirements are why replication sounds like a great idea. There’s a lot of gotchas, even with that issue, but the second two requirements are where replication just clearly doesn’t meet the bar.
Replication isn’t Friendly to Changing Subscriber Tables
Transactional replication won’t like it if you go around changing data and schema on its subscriber tables. If you change the data, updates and deletes flowing through replication may start to fail. Well, let’s just be realistic– they WILL fail, and fixing it will be manual and a major pain point.
Schema changes are even dicier. You can’t just drop or change the data type in a replicated column, you need to modify replication for that– and that means adding filters to the publication.
Resetting Replication isn’t Fun, Either
Re-initializing a replication subscriber is often very impactful on the publisher: running a replication snapshot is an intensive, time consuming process that causes CPU, IO, and lock overhead on the publisher. (Warning: this snapshot is a totally different thing from the database snapshot I talk about a bit later. This one basically copies out all the contents of the published articles on the publisher into flat files, copies everything, and then loads it into the subscriber.)
You can initialize a replication subscriber from a full backup, but that’s not lightweight or fast for large databases, and it’s very dependent on great custom scripting and timing. And usually people are looking at replication because they want to avoid big backups and restores.
What About a Creative Solution Involving Replication?
Refreshing test and staging environments is a tough problem. So tough that people often try to get creative, like this:
- ProductionDB in Denver is a transactional replication publisher on PROD01
- SubscriberDB in Portland is a transactional replication subscriber on STG01
- Periodically, some process is run against subscriber DB to reset other databases on STG01. Ideas usually involve database snapshots, backups, and custom scripts.
But when you dig into the details, this is never great. Here’s why.
Transactional Replication Impacts the Publisher, Your Developers, and Your DBAs
If there’s one thing that most DBAs and developers can agree on, it’s that replication has made them work late and curse a lot.
Replication requires that every table have a primary key. This may mean schema changes just to get it set up.
Replication slows down your software deployment, because it’s easy to mess up publishing tables when you make schema changes to them. The only way to not mess this up is to also implement replication in your development environment, which developers hate. And in our scenario, where the test database is the subscriber, what would that even mean? Recursive replication torture?
Replication impacts performance on the publisher, particularly under high load. It frequently requires tuning storage, setting up a scale out distributor ($$$ hardware and licensing $$$), fine tuning your publisher and subscriber properties, implementing custom monitoring, and basically having at least one team member obtain a Masters Degree in Replication at the School of Late Nights. I went to that school a long time ago, and I didn’t keep the yearbook, but I did write about it here.
Database Snapshots are Read-Only
There’s no such thing as a writable database snapshot in SQL Server, even though it’s an Enterprise Edition feature. (Fancy SANs have options for that, SQL Server itself does not.) That pretty much takes most of the uses out of it for a staging environment.
If You’re Writing a Custom Script in the Staging Environment, Why Use Replication?
If you have to basically write a custom ETL to read from the subscriber to refresh staging databases, transaction log shipping is much easier to manage than replication, and it allows a read only standby mode on the subscriber.
If You’re Using Backup and Restore, Replication is Also Overkill
While it’s technically possible to back up a replication subscriber and restore it, then remove replication from it, it’s not all that great. This is a lot of complexity without a lot of payoff.
- You’ve still got the time to do the backup and restore (which you were trying to avoid in the first place)
- You haven’t validated that the database you’ve restored has a schema that matches production (it could be very different on the replication subscriber)
You’re much better off basing the restore off production in this case. Even if the pre-production and testing databases are in a different datacenter, you can optimize the types of backups used, compression and network copy time, and restore schedule. Even setting up multiple test instances that restore “in rounds” and which can be used at different times a day is often operationally preferable to the performance, monitoring, and caretaking needs of replication.
There’s No Easy Answer to this Problem
Refreshing staging and development environments doesn’t have a single right answer. This is done many different ways based on application requirements. Just don’t pin all your hopes on replication and start investing in it without looking at the details of exactly how it will work — because it’s probably not what you think it is at first. Always make sure you factor in:
- Performance impact on the publisher and hardware/storage/licensing costs to mitigate this
- Any restrictions or requirements on schema changes at the publisher and impact on code development
- Operational cost of upkeep of the technology
Don’t give up! Having staging and development environments with recent data that works well for you is totally possible, you just need to consider more technologies than replication.
We’ve got lots of resources, for money and for love. Or just for free.
- Watch Brent’s free video on HA/DR basics
- Read more free articles about replication
- Buy our 6+ hour online course on High Availability and Disaster Recovery in SQL Server
- Get custom advice from us directly about your environment in our SQL Critical Care® Service
- Attend our in-person course for Senior DBAs to learn advanced tricks to wrangle your SQL Servers
Brent says: Yeah, replication isn’t what I call “refreshing.”
But Kendra, it can’t be that hard… after all, we have synchronous modes in Database Mirroring and Availability Groups, right?
Synchronous Commit doesn’t mean “zero data loss”
When we think about limiting data loss, the first thing we think of is a technology that lets us reduce points of failure. If every transaction must be written to two separate storage systems, we have a pretty good chance to have no data loss, right?
Let’s say you’re using a SQL Server Availability Group in SQL Server 2014 with a synchronous replica to do this. The secondary replica fails and is offline, but you don’t lose quorum. If you want 0 data loss, the primary needs to stop accepting writes immediately, right?
It doesn’t do that. The primary replica keeps going and writes can continue. Here’s what that looks like:
You could write custom scripts to detect the situation and stop the primary replica, but there’s a couple of problems with that. First, you’re offline, and you probably don’t want that. And second, it’s going to take some time to get that done, and that means that you don’t have zero data loss– you could lose anything written in the meanwhile. You could add another synchronous commit replica, but there’s obvious cost and support impacts, and you still aren’t guaranteed zero data loss.
Synchronous writes don’t necessarily guarantee zero data loss, you’ve got to dig into the details.
This stuff isn’t obvious
I’ll be straight up: I’ve been working with high availability and disaster recovery for a long time, and I hadn’t actually thought very critically about this until a recent chat room conversation with Brent discussing why it’s not super easy for cloud hosting providers to offer zero data loss in return for a lot of dollar signs.
Crazy facts: you can learn things from chat rooms and from the cloud. Who knew?
NEED TO PROTECT YOUR DATA? YOU NEED A FULL TIME EMPLOYEE WHO IS RESPONSIBLE FOR THAT.
If data loss is important to you, don’t just assume that you’ve got it under control because you’re paying a vendor to take care of it. If you look closely, you may find that nothing’s working like you think! When your data is important, you need to make someone responsible for ensuring that you’re meeting your RPO and RTO, and have them prove that it works on a scheduled basis. Their job title doesn’t have to be “Database Administrator,” but they need to work for you, and they need to take their responsibility seriously.
Want to Learn More About High Availability and Disaster Recovery?
We just launched our new DBA’s Guide to SQL Server High Availability and Disaster Recovery – a 6-hour online video course that teaches you about clustering, AlwaysOn AGs, quorum, database mirroring, log shipping, and more.
Head on over to our new free quiz to find out! Take the SQL Server High Availability and Disaster Recovery Quiz.
Everyone’s code is terrible. But exactly how terrible is yours?
First Circle: Code Limbo
We can’t say this is good code, but for the most part nobody notices it.
Second Circle: Code Lust
Third Circle: Gluttonous Code
There’s always one simple piece of logic that’s been written hundreds of times, all over the codebase, but in slightly different ways. And then, one day, you need to change it.
Fourth Circle: Angry Code
Your application has periodic timeouts, which disappear as soon as users report them.
Fifth Circle: Greedy Code
This code was only supposed to run once when a user logged in and their profile was loaded. But sp_BlitzCache® says it runs 4,000 times per minute when you’re hardly doing anything.
Sixth Circle: Code Heresy
This code looks incredibly bad. It violates common sense, but it works right now. You’re afraid to touch it, for fear it will burst into flames.
Seventh Circle: Violent Code
Your application is deadlocking, and literally killing itself.
Eighth Circle: Code Fraud
That day you find out that large portions of your source code don’t actually belong to your company. And now you need to fix it.
Ninth Circle: Treacherous Code
When your database corrupts itself.
The QUERYTRACEON hint can be particularly useful for testing the new cardinality estimator in SQL Server 2014 on individual queries using Trace Flag 2312, if your database is still using compatibility level 110:
SELECT ClosedDate, Id FROM dbo.Posts WHERE LastActivityDate > '2013-09-05 11:57:38.690' OPTION (QUERYTRACEON 2312); GO
Conversely, if you’re using the new estimator everywhere by having database compatibility set to 120, you can use the old estimator for an individual query by using QUERYTRACEON 9481:
SELECT ClosedDate, Id FROM dbo.Posts WHERE LastActivityDate > '2013-09-05 11:57:38.690' OPTION (QUERYTRACEON 9481); GO
There’s a little problem. Turning on a trace flag requires high permissions. Let’s say that I’m attempting to use this hint from an account which doesn’t have superpower permissions. Here we create a login and user for app_account and grant it data reader, and execute on the dbo schema:
USE [master] GO CREATE LOGIN [app_account] WITH PASSWORD=N'DontBeLikeMeUseWindowsAuth', DEFAULT_DATABASE=StackOverflow, CHECK_EXPIRATION=ON, CHECK_POLICY=ON GO use StackOverflow; GO CREATE USER [app_account] FOR LOGIN [app_account]; GO ALTER ROLE [db_datareader] ADD MEMBER [app_account]; GO GRANT EXECUTE ON SCHEMA::dbo to [app_account]; GO
We can now impersonate app_account to run some tests!
EXECUTE AS LOGIN='app_account'; GO
Meet Error 2561, User Does Not Have Permission to Run DBCC TRACEON
When I try to run the query, things screech to a halt:
Even if I try to get tricky, I can’t sneak this into a temporary procedure as app_account, either:
Stored Procedures to the Rescue
We need a little help from a stored procedure. We’re going to revert back to our sa role, and create a procedure around our query:
REVERT GO CREATE PROCEDURE dbo.RecentPosts @DateVal DATETIME AS SELECT ClosedDate, Id FROM dbo.Posts WHERE LastActivityDate > @DateVal OPTION (QUERYTRACEON 2312, RECOMPILE); GO /* Now we go back to running as app_account again */ EXECUTE AS LOGIN='app_account'; GO
Due to the magic of ownership chaining, we can now run the query as app_account:
This Means You Don’t have to Give Your applications Sysadmin Rights
… but you may have problems if those applications need to run a lot of adhoc queries.
Want to learn more about statistics and the new cardinality estimator? Join me and Jeremiah Peschka for our Advanced Querying and Indexing course!
Want to test your knowledge on indexes and query tuning in SQL Server?
Curious how your query tuning style compares to others?
Looking to have a little fun?
Interested in a coupon code?
Your Monday just got better. Check out our new free online quizzes right now:
Find and Fix Your Worst Query Patterns (Sponsored by Idera Software)
Friday, June 12, at 11:00AM Central
You’d love to make your queries faster, but you’re not sure what to change. Kendra Little will teach you how to identify bad patterns in the execution plans of the biggest, baddest queries running against your SQL Server. You’ll see Kendra triage bad queries in action and get an easy-to-use reference you can use to diagnose problems in your query execution plans back at work. Register here!
Missed the Webcast or Want to Learn More on the Content?
- Index articles: http://www.brentozar.com/sql/index-all-about-sql-server-indexes/
- Scalar functions and the observer effect: http://www.brentozar.com/archive/2015/02/measuring-performance-can-slow-sql-server/
- Statistics, execution plans, and parameter sniffing: http://www.brentozar.com/archive/2013/11/why-parameter-sniffing-can-slow-down-queries-video/
- Crazy joins and cardinality estimation: http://www.brentozar.com/sql/index-all-about-sql-server-indexes/
Want More Free Training?
We had some great questions about trivial execution plans in SQL Server in our Advanced Querying and Indexing class a few weeks ago. Here’s a little glimpse into what we talked about.
For really simple queries, SQL Server can use “trivial optimization”. If there’s a very limited number of ways to run the query, why do a bunch of fancy, CPU burning cost-based optimization? Just chuck the plan at the query and let it go!
Downside: Trivial plans don’t ask for indexes
With a trivial plan, SQL Server never gets to the phase of optimization where it would ask for an index. This means you might have a really frequently executing query running against your SQL instance, burning lots of CPU which might be vastly improved by an index, but nothing ever registers in the missing index dynamic management views.
How Trivial Does A Query Need to be for this Optimization?
I find trivial plans frequently, but it’s also a little trickier than you’d think. Let’s look at some examples:
I’m starting off with a simple query, running against a large table:
SELECT Id FROM dbo.Posts WHERE ParentId=3; GO
The Posts table has only a clustered primary key on on the Id column. This query gets FULL optimization, and SQL Server asks for an index.
Why wasn’t this plan trivial? Well, even though SQL Server didn’t have its choice of indexes to use, notice the parallel lines in there. SQL Server did have to decide whether or not to use more than one processor!
Let’s Take Away SQL Server’s Choices
Sorry SQL Server, I’m writing this blog post, and I know that I can remove this query’s chance to go parallel by raising one setting: Cost Threshold for Parallelism. This setting is the “estimated cost bar” for who gets to use multiple cores. I’m going to pump this setting up so my query doesn’t qualify.
exec sp_configure 'cost threshold for parallelism', 500 GO reconfigure GO
Be careful, this setting impacts all queries on the instance, and maxdop hints don’t override it.
Now, rerunning my query:
This time I got trivial optimization – the query’s estimated cost for running a single threaded plan doesn’t even qualify for parallelism, so that choice doesn’t exist.
No missing index shows up, even though SQL Server did 346,913 logical reads for this query.
What if I Add an Unrelated Nonclustered Index?
Now that we have a trivial plan, let’s give SQL Server a choice– but not a very good choice. We’re going to create an unrelated non-clustered index:
CREATE NONCLUSTERED INDEX ix_Posts_LastActivityDate ON dbo.Posts (LastActivityDate) GO
Our query doesn’t reference the LastActivityDate column at all.
Rerunning our query…
Full optimization is back! Even though the LastActivityDate index seems really unrelated to what we’re doing, just adding it puts us back in FULL optimization.
Let’s Clean Up That Cost Threshold
Otherwise I’ll forget about it and not understand the weird plans on my test machine later. I’m just setting it back to the default here:
exec sp_configure 'cost threshold for parallelism', 5 GO reconfigure GO
Takeaways: Beware the Creeping Trivial Plan
While most of your tables may be indexed well, it’s easy for changes in code to result in poorly indexed changes creeping out. While SQL Server’s default setting of 5 for “Cost Threshold for Parallelism” is generally far too low for modern processors, understand that raising it may increase your chances of getting trivial execution plans, which won’t ever ask for indexing help.
Want to find out if you’ve got trivial plans among your top queries? Our free sp_BlitzCache™ procedure sniffs around in your query XML and warns you right away.