Dev, Test and Production SQL Server Environments

In a perfect world, my test/QA servers get restored nightly from production. Let’s say every night at 9pm, the production full backups kick off, and they’re finished by 10pm. At 11pm, the QA box kicks off a restore job that grabs the latest full backups off the file share and restores them. The production backups are written to a file share, never local storage, so there’s no additional overhead on the production box for this restore process.

Now when QA tests deployment scripts in the QA environment, they can be reasonably sure that the scripts will perform the same way in production.  Sometimes developers and QA staff assume that production is exactly like development: they’ll make small tweaks over time to the dev server and schema, and their deployment scripts will fail because the production server is missing those tweaks.  If the QA server is restored every night, that possibility of failure is reduced.

Developers love this fresh-QA-box approach because it gives them fresh data every morning for testing.  Want to find out how a new query would perform with real production data?  Development environments are rarely refreshed from production because people might lose work, but an automated QA restore system means there’s always a non-production box with the freshest data possible.

The next perk is that now I’ve got an automated fire drill restore system going.  Every night, I know for sure that my production backups worked.  Unfortunately, this can also bite me in the rear: if my production backups fail for some reason, then my QA system is unusable until I fix it.  Backup failures suddenly become much more time-sensitive, and I may not have enough time to rerun the production backups.  To prevent problems, I recommend keeping at least two days of backups online so that you can restore from the previous (and successful) backup if the normal one fails.  I wasn’t ever ambitious enough to implement an automated restore-the-second-backup script because my production backups so rarely failed.

This approach doesn’t work if:

  • You’ve got sensitive data in production (credit card data, medical data, and other things you don’t want developers to have free reign over)
  • You can’t restore your production database fast enough (in which case, you probably want to work on tuning that anyway in order to meet your RPO and RTO)
  • Your QA box isn’t big enough to handle production (in which case you don’t know that dev code will scale to production anyway)
  • Your QA team needs 24/7 access to the QA box (in which case, I have questions)
  • You don’t do full backups nightly (if you’re doing diffs, you’ll just need a more complex auto-restore script like sp_DatabaseRestore)

Never Restore QA From Development

Some developers like to have a QA server that they restore from development periodically, and then tell the QA people to test their work on the QA server.

The only time I allow that scenario is when they’re going to take the same approach with the production server.  If their production upgrade plan is to take down production and restore the dev database onto it, then restoring QA from dev is a valid approach.  Otherwise, they’re cheating!

Restoring QA from production forces the developers to script out their deployments instead of point-and-click-table-changes in SSMS.

The Perfect Test/QA Server is Identical to Production

Keeping identical sets of CPUs, memory and storage in both environments means that you can run performance testing with confidence before a new dev query knocks the production server over.  This means query execution plans on each environment should be identical.  If the QA server has a different CPU count or memory size, your query plans can be different, making it harder to predict production performance.

For example, I worked in one environment where the servers were configured as:

  • Production – dual quad-core CPUs (2×4, 8 cores total) with 128GB RAM
  • QA – four single-core CPUs (4×1, 4 cores total) with 16GB RAM

Unfortunately, when the SQL Server engine builds an execution plan for a query, it’ll build differently based on the number of CPUs and the amount of memory.  In this environment, the queries and indexes were tuned for the QA box, but production performance never seemed to match up with their expectations.  Keeping identical hardware and identical database configurations mitigates that risk.

In the real world, we can’t always afford to keep these two environments identical, especially in infrastructure-as-a-service cloud providers when we’re paying every moment that a resource sits idle. Everything’s a compromise. The closer you can make test/QA to production, the more accurate your testing will be – but the more you’ll spend doing it.

Previous Post
Another backup failure: Carbonite
Next Post
Getting me to speak at your user group

77 Comments. Leave new

  • It seems like the nightly refresh of QA environment is key, even if you don’t have a production environment yet (as in a greenfield development project). Then, once the team makes that first release, you don’t have to change anything but the source of the backup to restore from in order to invoke your ideal process.

    Reply
  • I wish I could do the nightly refresh. The problem is that the ERP system I administer here would also require a daily application to run to re-sync the Visual FoxPro and SQL databases. I haven’t been able to figure out how to automate that.

    Oh, and Brent I am regularly amazed and humbled by the amount of time you put into blogging. It inspires me to write more.

    Reply
  • I loved the backup-auto-restore, when it’s working, it’s beautiful!

    Another not so fun scenario is external Database hosting for production while QA/DEV are in-house
    Then I had to compress (or LiteSpeed/SQL Backup), download via FTP hoping it’s fast enough, then restore in-house

    Downloading 6GB (compressed) nightly is no fun, imagine some bigger files!

    Reply
  • I love the idea of the nightly restore – but my production environment is just way too big, and now that I’ve suddenly been thrust into the world of SAN management (hence my sudden departure from the internets), I’m even more sensitive to the space I’m allocating.

    We really haven’t sorted out a good solution just yet. Our QA folks have managed to make “skinny” versions of the databases, but obviously this doesn’t allow for equal performance testing, only functionality. They still keep one fully loaded QA environment, but we aren’t in that perfect realm of matching hardware.

    That said, thanks for yet another insightful post!

    Reply
  • I played around with this idea in the past, but always had issues on the secondary box that the .bak files were being restored on. I wrote a custom script but could never quite get it to work. Guess I’ll be playing with it again.

    Reply
  • Brent and everyone, is it NOT considered “Best Practice” to put SQL Test and Production environments (databases) on the same hardware or even the same VM server?
    One vendor doesn’t see anything wrong with this. I work at a large hospital. Thanks eveyone, Cheryl

    Reply
    • Cheryl – putting test and production on the same hardware is a bad idea for safety purposes. If for some reason someone screws up and picks the wrong database or has the wrong connection string set up, it’s easier to trash records on the production database. Setting them up as different virtual OS instances on the same VM server may not be so bad, though – if you’re sharing hardware anyway, it doesn’t really matter if they’re on the same host. You want to be careful to avoid performance bottlenecks, but that’s true no matter where you put them in virtual machines.

      Reply
  • what about the stored procedures that are different in prod and qa?

    Reply
    • Laurie – can you give me an example? Usually your code should be identical between servers, and it should fetch configuration variables from data sources (like config tables). Having different code from prod to QA to dev introduces risk for problems.

      Reply
  • Hi Brent,

    We recently just ordered 2 brand new servers: HP Prliant DL380 G7 – Xeon X 5690 3.46 Ghz X 2 with 144GB Memory and a 160gb Fusion iODrive for TempDB. these 2 new servers will replace the current production server and DR. This is for a central data warehouse environment.

    As we don’t have enough capex to buy the third one right now, my question is:

    is it better to use these 2 new servers as one for Production and one for DR or one for production and one for Test/QA?

    In worst situation, a down time of 2-3 hours is OK as our reporting servers are on defferent OLAP servers.

    what is the best way to utilize these 2 new servers?

    Thanks a lot.

    Reply
    • Hi, Steven. If you used the second one for test/QA, how long would it take you to make it the DR server in the event of a disaster? (And don’t think about when everything’s going right – think about a true disaster when phone lines are down.)

      Reply
  • Nelson Viggiani
    October 21, 2013 2:31 pm

    that is a good approach refreshing QA every morning but in my case I have different users on QA (more actually then Prod) and every time I do a refresh I need to fix all users again…Any ideas?
    thanks Brent.
    PS.. we work together on Southern wine and spirits (florida) a long time ago…Regards
    Nelson

    Reply
    • Hi Nelson! Good to hear from you. The easiest way to do it is to just script it out once, and then set it up as a SQL Agent job that runs after the refresh step finishes.

      Reply
  • Hi Brent,

    Your article was quite insightful and after going through it I realized we are following a very similar approach, but are still having some challenges. I was hoping to get some suggestions from you on how we can improve our work model.

    The biggest aspect in our set up is that we have 3 different teams (geographically and time zone separated) working on the same database. We have 3 environments – Dev, QA, Prod. QA gets refreshed on an overnightly basis with Prod.

    Until a few months back, the Dev was being used by all the developers to develop code and then use the QA to test out with the lastest data. However the issue came when we needed an environment for UAT testing – an environment which shouldn’t get refreshed daily. The UAT testing usually goes on for weeks. So we created a 4th environment called UAT server where the developers started putting their developed code so that the business users could test it. The data in UAT was an on-demand copy from Prod environment. So now when there are issues reported by the business users, the developers are fixing the code in UAT environment.

    Then there is another team of developers who work during the business hours. They get certain Production issues where some development is required. The changes need to be done in a short time. So they quickly make changes in the QA environment and once its working, they move it to Production. Now these changes are never pulled back into either the UAT or Dev version.

    Now when the developers who originally developed through the DEV-UAT route want to release something in QA-Prod, they face challenges as their DEV-UAT don’t match with QA-Prod. So they need to reverse engineer to make their own code work in QA-Prod.

    With 3 different teams, it is causing all the more confusion and I was wondering if you could give some input to better the existing process. Thanks!

    Regards,
    Nayan

    Reply
    • Uh, whoa. At first glance, it doesn’t sound like we’re really doing source code control here. I’d start by reviewing the development source code control process and making sure that we’re checking changes in, checking them, and then deploying them to QA.

      Reply
  • Hi Brent,
    First thanks a lot for your great work and generosity.

    We are upgrading SQL Server’s hardware. The new production server is going to be 80 Cores, 1TB of RAM and the database is over 2 TB. But Dev and QA environments will be VMs with a subset of the data, not the whole database.

    Is there a way to scale non-production servers so that they will closely behalf like production?

    e.g. Assuming that the work load in development is 10% of that in production and the database size in Development is scaled down to 200GB, then the development machine should have 10 Cores (10% of production + background processes) and 16 GB RAM, which is (Procedure cache, etc size (6 GB) + (10% * (200 GB Disk / 2 TB DISK ) * 1 TB RAM)

    does that make any sense. is there a better way for doing it?

    Best Regards
    Nabila

    Reply
    • Hi, Nabila. 80 cores of production licensing, wow, that’s roughly half a million US dollars worth of licensing alone. It must be a pretty important application. If performance is so important that you’re willing to put half a million dollars into licensing, why not use actual physical servers for QA? (Remember, as long as it’s QA, you can likely get by with SQL Server Developer Edition licensing on the QA box, which is practically free.)

      Reply
      • Thanks a lot Brent.

        Yes, it is a mission critical application. And there are some issues with the application, and adding more hardware is the fastest way to fix them. But the company is growing and more hardware will be needed soon anyways.

        I might be able to get away with physical QA servers, I’ll try :), but not with the same kind of hardware we will have in production. So I still need to scale down non-Production servers. Even if I have the 80 cores in QAs, it will be hard to generate the same work load that we have in production, so I think scaling down may have some advantages.

        Reply
  • Hi Brent,

    Thanks for sharing your knowledge on such a real time issue , I’m trying to develop an approach way to bring one of our non prod environment as close as possible to Production server. I’m having difficulty in figuring out a solution because of the replication setup, for one of our critical applications there is a transactional replication setup between two prod servers , and I manage three sets of non prod environments with replication (Dev ,Test ,UAT ,Training environments) .

    As sqlserver erases replication related schema every time a restore is performed, it’s impossible to take the restore approach for mimicking a NonProd_Prod server 🙂 i.e., QA as you referred in this article .

    so on every refresh from Prod I had to reinitialize replication , which takes about 6 hr’s for two databases (even if I bump up the hardware to 8 core and 64 GB RAM), is a quite backbreaking work.

    Please can you advise any best approached to follow for this scenario.

    Appreciate your help to the SQL community.

    Regards,
    Kiran

    Reply
    • Kiran – dumb question. Why do you need replication in dev, test, UAT, and training? Seriously, for each of the four environments – are you testing replication? Testing replication is kinda Microsoft’s job, not yours.

      Reply
      • There is a reason to have replication in all non prod environments , a in-house developed payroll process depends heavily on the data warehouse databases that pull data from the subscription dB’s which also serve as reporting database. in summary there are regular development needs that require the entire replication process to test complete functionality., i.e. for reporting, payroll , dashboards depending on data warehouse , developers need to test the entire functionality before production deployment.

        Reply
        • Gotcha. I’ve got a client with that exact same scenario, and they have 3 full time database administrators dedicated to managing the different environments. Payroll’s a great example of that because you also have to make sure the personally identifiable data is masked before it ever gets to development or QA, and you have to make sure the backups don’t get offsite. There’s no simple best practice here.

          Reply
  • We have a table that currently holds 650,000,000 records and growing. There is no need for this data in non-production databases. How would you recommend handling something like this? Filegroups? Selective replication?

    Reply
    • Brian – if you don’t need the data, I’d recommend not putting it in the database.

      Sounds snarky, but seriously, what am I missing?

      Reply
      • You’re missing the “non-production” part. The data is essential for performance reasons on production as it contains a snapshot of user entitlements that is referenced frequently but would be inefficient to recalculate each time. However, there is no need to transfer that data to a test environment and it represents a sizable percentage of a particular database. It is fairly dynamic data that is repopulated for a given user each time they log into the supported application. Honestly it wouldn’t be missed if it wasn’t included in the backup at all.

        Reply
  • HI Brian,

    If it is meant to grow frequently , writing a achieve script and deleting the data once a year would be a good Idea > less data > Less storage & CPU cost > More performance > faster recoverability in case of a disaster/data corruption.

    Reply
  • Sorry Typo , I mean Archive script

    Reply
  • Hi experts,

    I am new to RedShift.
    Can anybody tell me how to copy data from one database server to another in Redshift?

    I have to copy data from Dev to QA.
    I want best approach to copy data. Any help would be appreciated.
    Regards,
    reet

    Reply
  • Dear Brent, hello.
    Thank you for the great article.
    Could you please advise? We are a bit limited in licenses, (only 4 cores MS SQL Enterprise lic) and right now i’m trying to understand what would be the better way to build the DEV / QA / PROD environments.
    While DEV – is ok, we would use separate DEV licenses
    There is a trouble with QA / PROD as we still have only 4 cores, the only option I see, is using the same VM, but different instances for QA & PROD, which of course sound a bit weird.

    Reply
  • At the risk of being “that guy” commenting on an old post…

    There is one scenario that has always prevented me from implementing the auto-restore, and that is when testing is happening in QA involving structural changes to the database. The deployment has updated the DB, and when the restore comes it, it destroys those changes, causing all sorts of issues in the testing environment. How do you handle that scenario? The only solution I’ve hit on would be to auto-deploy the application again once the restore is done – basically continually verifying the deployment plan will always work. There’s a hefty level of orchestration that goes into that sort of setup, especially in the QA data setup – these aren’t bad things (in fact, I’d argue being able to redeploy all components of an application to QA in one click is my ideal scenario), but it is certainly time and resource consuming to get there. Is that just the cost of entry, or is there a better way to do this?

    Reply
  • Alexandre Araujo
    November 21, 2014 7:08 am

    Hi Brent, i really appreciate your dedication to improve Sql Server .

    My problem is a little different . I have 08 QA’s teams and each one with a special kind of workload. Even a sort of tables that you must disable PK,FK, and other stuffs thus a little overhead to admin scripts individually and jobs . My question is … Is there a good tool that makes this job easily ? or a good process ?

    Thanks in advance.

    PS. I’m still regreting not taking your 2 days training classes in Seattle !!! 🙁 I was with you in the same flight from Chicago to Seattle, i’ll tell you about the training but my company had already payed preconfs but next year i won’t miss )

    Reply
    • Alexandre – thanks, glad you like our work.

      Dumb question – why do you need to disable all the keys in the database for QA?

      Reply
      • Alexandre Araujo
        November 21, 2014 1:22 pm

        Brent, sorry i wrote QA but specifically what i meant was TEST environment.
        There are a lot of problems with dev’s team … lot projects ran together … and unfortunately i need solve the problem . 🙁
        Thanks.

        Reply
        • Alexandre – I still don’t understand why you would disable all the keys for testing, either.

          Reply
          • Alexandre Araujo
            November 21, 2014 2:06 pm

            I know it’s sounds strange. Imagine 08 differents workloads absolutely specifcs with a lot of jobs and scripts that prepare a environment to comply with each team. Is there a tool that you can make this job easily choosing objects and workload for each environment ?

          • Alexandre – sorry, I’m not understanding what you mean, so no, I don’t know of a tool to make that easy.

  • It’s ok , maybe my only solution is admin scripts to mount the environments.
    Thanks for your attention.

    Reply
  • why Database server is used to QA test environment?

    Reply
  • Very informative blog. Do you have a script to refresh QA from Prod? Thanks!

    Reply
  • Brian Cariveau
    March 13, 2015 8:47 am

    Brent,

    I am keeping your post alive!

    I have a scenario I would like the communities recommendations on.

    We are on SQL 2012, and use LightSpeed for backup/restore. The environment is primarily used for reporting purposes. We have a PROD/UAT(QA)/DEV/BCP instance of SQL server.

    Here is our current issue. We have the space to have the same data in the different environments, and our development team wants a mirror of production in UAT. As we do this our team is stepping on each other. We have code (new stored procedures/tables/views, etc) moving up from the development environment, and we have current data moving down from production with backup restore on a daily basis. Therefore, we are over-writing the code that is moved up from DEV.

    How have you seen people manage this so that the QA/UAT environment has up to date data, but scripts / new data is available for testing, that is not yet available in production?

    Interested in your insights.
    Thanks!

    Reply
    • Brian – typically this is accomplished by checking your development code into source control, and then scripting your deployments to the QA/UAT environment. That way you can rehearse deployment into production – except you’re deploying to QA/UAT first. Then, by the time you’re deploying into production for real, your deployments are well-rehearsed.

      Reply
      • Brian Cariveau
        March 13, 2015 8:56 am

        Brent,

        Thanks for the quick reply!

        We are using SVN for source code control. So would you be re-running your scripts after every restore?

        We have been doing some of that. Where we are running into issues with that is our restores take a bit of time due to data volume, and a lot of times in the UAT environment we have large sets of test data that would have to be re-loaded in tables via SSIS. With the restores/new UAT data loads on a daily basis we are having challenges fitting it into the overnight process, as we extend to 10-11 in the AM, and often times are doing manual processes to get things cleaned up in UAT after the restores.

        We have been challenging the team to come up with something fast/automated, but thus far are coming up dry.

        Reply
  • Is this work if I have merge replicatin on all databases (test , QA an dproduction ) . thanks

    Reply
  • Can Production data be stored to the Test box for troubleshoot.

    Reply
  • Hi Team,

    My question is not relevant to the refresh activities but placing the environments for better maintenance and use.

    We have two data centers and 4 types of environments 1. Prod 2. Stage 3. Reporting 4. Test. Which is the best option to keep place the environments like First data center with Prod and Sage and Second data center with Reporting and Test environments? Do we need to consider any best practices before we build these servers in two different data centers and we are planning to start with VMs.

    Thanks,
    Vijay

    Reply
  • Hello Team,

    As we planning to migrate database from SQL 2012 and windows 2008 r2 to SQL 2014 with windows 2012.
    could you please let me know step by step from scratch where we need to start from. we are building QA environment when every thing goes well we move to Prod Environment.

    Thanks in Advance.

    Reply
    • Sandy – sure, that’s exactly the kind of thing that we do in our consulting engagements. We analyze where you’re currently at, look at where you need to get to, and then help you build a plan to get there. Feel free to click Contact at the top of the page and we can get started!

      Reply
  • What are the legal/licencing implications of restoring from Prod to UAT? If you’re paying the full whack for a Prod licence, I heard somewhere that you can’t (or shouldn’t) take data from your Prod box and put it on a box where you’re only paying for a development licence. Is this true?

    Reply
    • John – hahaha, no, you should track down where you read that from, make a note of it, and never believe anything they say again. 😀

      Reply
    • As per my understanding, if you have a valid MSDN subscription then you are legitimated to use SQL Server licensed for testing, development and training purposes, but absolutely NOT PRODUCTION purpose.

      Have that said, you can copy the data from production environment to any testing, development and training servers, just make sure those servers are not used for production purpose. And in my organization, we constantly do that too and our dev/test/training env have the same exact version (2008/2008r2/2012 etc) and edition (standard/enterprise etc) following the production server settings. Therefore you do not need to worry about any features that are exclusive to enterprise (e.g. TDE, resource governor, partition table etc)

      Please correct me if i am wrong.

      Reference: https://msdn.microsoft.com/en-us/subscriptions/cc150618.aspx

      Reply
  • If you make a new database on a development box, and you want to start using that box for commercial purposes while you’re still playing around with the data (before putting it into production), is this allowed? Do Microsoft mind?

    Reply
  • As a developer, I am often in a situation where it would be quicker to run a package on my machine than deploy a package to Production and then run it there. Am I licensed to use my Developer version of SSIS to create a file on my computer, if I’m doing this for business/commercial purposes? Its not a test file. It is being produced so that it can be used for production purposes.

    Reply
  • how would you do the refresh of an environment with transactional replication involved?

    Reply
    • I’d script out the replication so that I could rebuild it each time I restored production data into development. (Generally speaking, though, I don’t do replication in development environments.)

      Reply
      • Thanks Brent, at this time we remove replication manually, do the restores and then put back replication. I was looking for a way to automate it. Do you know of any tool I could use for this.

        Reply
  • Would you just do a regular nightly backup on production and then just take that file and run a script on the development system to restore from that backup?
    Does anyone have a sample script – I have a script that backups the production database with compression and on the development sql server I have another that takes the backup and just replaces the current database, the reason for replacing is because everytime I tried to do a standard backup I would also get an error that the file was in use or something – any suggestions I can provide the scripts I am using upon request. Or any suggestions or ideas would be helpful

    Reply
  • Hi Brent, Any suggestions for sharing the SSRS development environment? My team is three developers and I hear all the time that sharing of the cube data for development is impossible. The outcome is they develop in both QA and Development.

    Reply
  • laxminarain
    May 8, 2017 2:41 am

    Hi Brent,

    Historically within our enterprise/department we perform data refresh for close to 20 applications in one go on a semi annual basis across each non production environment. All these 20 systems are closely related to each other from data standpoint. All the underlying databases of the application are refreshed from Production.

    The above process is not very efficient as we end up spending close to 50 man days worth in a duration of 3-4 days. This huge effort really goes because some of the pre-refresh/post-refresh tasks needs to be performed in an sequential manner rather than parallel fashion. For Database refresh we are exploring to on-board Delphix and virtual databases to speed up the overall refresh time. With the current refresh backup-restore technique we end up spending at least 1 day per application.

    Pls advise if you think Delphix will help in lowering the time. Or should explore the opportunity of near real time data push from prod box to no prod box

    We have started exploring

    Reply
  • A colleague is considering Snapshot replication to copy databases down to the testing/UAT and development servers. It is an OLTP database and it is constantly updated. We are currently pulling from a third party backup tool to do so. I read that this is not a good way to do this when data changes frequently. Is this correct? What would be the best way to refresh UAT and development daily?

    Reply
    • Dave – for architecture design, that’s the kind of question we answer with our consulting. If you’d like free architecture advice, head on over to DBA.StackExchange.com.

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}