Migrating from SQL Server to MySQL, PostgreSQL, and Aurora

I’m getting an increasing number of requests these days that say:

“We’re considering migrating our application from SQL Server to a cheaper, easier-to-scale database. Can you help?”

Well, kinda: why are you doing it?

If you’re doing it to get easier scalability, we start by taking a step back and looking at the performance problems in the current setup. Usually, I get these kinds of calls from teams of developers who don’t have a full-time DBA. They’ve built a software-as-a-service application, and they’re hitting performance issues. The simple fixes are:

  • Measure the SQL Server’s current bottleneck with wait stats (like with sp_BlitzFirst @SinceStartup = 1)
  • Find the queries causing those bottlenecks (like with sp_BlitzCache @SortOrder = 'reads')
  • Remove the bottlenecks by tuning indexes, queries, SQL Server config settings, or hardware
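
To make that concrete, here’s a rough sketch of the first two steps using the procedures mentioned above – the parameter values are just examples, so adjust them for your own environment:

    /* Step 1: see what the server has been waiting on since startup */
    EXEC sp_BlitzFirst @SinceStartup = 1;

    /* Step 2: find the queries driving that bottleneck - sort by
       whatever the wait stats pointed at (reads, cpu, duration, etc.) */
    EXEC sp_BlitzCache @SortOrder = 'reads';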

If you don’t have a full-time database administrator, it’s easy to implement a round of quick fixes in just a few days that take care of the low-hanging fruit. This usually buys the company a year’s worth of runway, at which point they can either knock out another round of low-hanging fruit or reconsider the database back-end decisions.

If you want lower SQL Server licensing costs, then we take a step back and look at how you’re buying SQL Server today. I usually hear this goal from companies renting their licensing from their hosting provider (like Rackspace or AWS), in which case we’ve got a few options:

  • Can we performance tune the database to run on a smaller instance size?
  • Can we drop from Enterprise Edition to Standard Edition?
  • Can we move non-relational data out of the database? (Like full text search, XML/JSON storage)
  • Can we implement a free/cheap caching layer like hosted Redis?
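
For the Enterprise-versus-Standard question in particular, one quick sanity check (a sketch, not a full licensing review) is to ask each database which edition-specific features it has persisted to disk:

    /* Lists edition-specific features (compression, partitioning,
       transparent data encryption, etc.) that could keep this database
       from being restored onto a lower edition - run it in each database */
    SELECT feature_name
    FROM sys.dm_db_persisted_sku_features;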

Those projects are usually way cheaper and faster than trying to rewrite the entire application’s back end. That’s what these change-the-database projects usually involve – as soon as you take even a cursory look at the queries, you find they’re written in a way that ties them to a specific database platform.

Don’t get me wrong – I’m not saying SQL Server is the best answer for every app.

But I *am* saying that rewriting the entire app almost never is.

Erik says: If you use common table expressions, re-writing your code for MySQL will be fun. By fun I mean “use Postgres instead”.

19 Comments

  • Had to help a coworker delete some records from MySQL – first time I touched one – you can’t delete from a table if the WHERE clause uses a subquery against the same table…

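    For anyone hitting the same wall: that’s MySQL error 1093 (“You can’t specify target table … for update in FROM clause”), and the usual workaround is to wrap the subquery in a derived table so MySQL materializes it first. A minimal sketch with made-up table and column names:

        -- Fails with error 1093 in MySQL
        DELETE FROM orders
        WHERE id IN (SELECT id FROM orders WHERE status = 'cancelled');

        -- Works: the extra derived table forces MySQL to materialize
        -- the subquery before the DELETE touches the table
        DELETE FROM orders
        WHERE id IN (
            SELECT id FROM (
                SELECT id FROM orders WHERE status = 'cancelled'
            ) AS doomed
        );
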
  • My current job is half SQL Server, half Postgres. A big drawback to Postgres is the lack of a robust community like SQL Server’s.

  • Not to mention if you only have experience on Windows. It seems that most of the tools and third-party apps to help with the transition aren’t available on Windows. You get to learn PowerShell very well. From vendors, you get the whole… “it’s a Linux world anyways” response from everyone. The bright side of all of this is that you get some new skills to add to your resume.

  • Excellent post. For bullet 3 under simple fixes, I’d change the order to put config settings and query tuning before index tuning. Can’t count the number of DBs I’ve seen with the files configured badly, parallelism threshold too low, etc. Procs written to be procedural, abusing tempdb, needless re-querying of the same data, SQL client calls inside loops fetching too little data. All this kind of stuff needs to be addressed before index tuning.

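    On the config-settings point: the parallelism setting the commenter is most likely referring to is “cost threshold for parallelism,” whose default of 5 is widely considered too low. A sketch of checking and raising it – the value 50 is just a common starting point, not a recommendation for any particular workload:

        /* It's an advanced option, so expose it first */
        EXEC sp_configure 'show advanced options', 1;
        RECONFIGURE;

        /* Show the current value, then raise it from the default of 5 */
        EXEC sp_configure 'cost threshold for parallelism';
        EXEC sp_configure 'cost threshold for parallelism', 50;
        RECONFIGURE;
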
  • At my work, we chose SQL Server 2014 over PostgreSQL for a data warehouse migration from our Oracle 11g cluster (with 2 instances)… If licensing is your problem, prepare yourself for a journey full of pain trying to get the same or better performance and efficiency out of PostgreSQL… With SQL Server’s clustered columnstore indexes, the space savings, and the query parallelization options (hint: Postgres right now doesn’t have parallel query natively – only in custom distributions like Postgres-XL, and even then limited to one thread per node in your cluster, aaaaaand… you need to distribute your data the right way to get performance improvements), going to SQL Server we got great performance gains over Oracle (query response times improved by something like 500% in some cases, without rewriting the statement, thanks to the columnstore indexes… remember, it’s a data warehouse)… Even rewriting the code for SQL Server was a lot less painful than trying to do the same for Postgres…
    We even dropped one of our Enterprise Edition clusters for a Standard Edition cluster… We just needed to fix the things that only work on Enterprise Edition and optimize the queries to run on a more constrained server…
    Some things don’t work the same way, or needed some hacks to make them work – like implementing some analytical functions that exist in Oracle for our data mining team, and certain UDFs in SQLCLR, for example.
    But we are all in on SQL Server right now… Sometimes we discover things Oracle handled better than SQL Server (Oracle’s MVCC and SQL Server’s RCSI don’t work the same way for us), but hey, the savings – even compared to two Enterprise Edition clusters – were big enough to justify the effort.
    And we are glad to have chosen SQL Server…
    I don’t have anything against PostgreSQL… but this time, even SQL Server Standard Edition was good…
    I’m waiting for the CTP of SQL Server on Linux to try it…
    Sorry if my English isn’t understandable… I’m writing this with the help of Google Translate haha

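    For readers who haven’t used the feature credited above: a clustered columnstore index replaces a table’s row storage with compressed columnar storage, which is where the space savings and fast scans come from. A minimal sketch with a hypothetical fact table:

        /* Rebuilds the table into compressed columnar storage -
           typical for large data warehouse fact tables (SQL Server 2014+) */
        CREATE CLUSTERED COLUMNSTORE INDEX ccix_FactSales
            ON dbo.FactSales;
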
  • I’m administering 6 types of databases: Oracle, DB2, SQL Server, MySQL, Postgres and MaxDB.
    I have to say that, in terms of performance and scalability, Oracle 12c Enterprise Edition is far ahead of any other relational DBMS, followed by DB2 and SQL Server. There’s a good reason why they are the most expensive.
    If you want a cheap and fast DBMS for small to mid-size databases, you are better off with MySQL.

    • Gabriel – interesting, thanks for the note. Can you elaborate on the specific features for performance and scalability that you find most helpful in Oracle and DB2? Thanks!

  • Oracle 12c EE provides powerful SQL advisors (like SQL Tuning Advisor and SQL Access Advisor) and memory advisors that you can use to help improve database performance. It also features the powerful In-Memory Column Store (IM column store), which allows you to store columns, tables, partitions and materialized views in memory in a columnar format, rather than the typical row format. The advantage of having data in memory is obvious, but the columnar storage lends itself extremely well to the analytic queries found in business intelligence products.
    DB2 10.5 also features a powerful in-memory capability called BLU Acceleration, which offers a vast improvement in query performance, a dramatic reduction in database tuning, and significant storage savings because of data compression. Because BLU Acceleration’s in-memory computing does not need tuning or indexes, database tuning times declined by approximately 85%, resulting in savings in DBA time to tune and administer the database.
    SQL Server also features In-Memory OLTP, which can significantly improve OLTP database application performance. It improves throughput and reduces latency for transaction processing, and can help improve performance of transient data scenarios such as temp tables and ETL.

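    To illustrate the SQL Server piece of that comparison, a memory-optimized table is declared like a regular table plus a MEMORY_OPTIMIZED option. A rough sketch with invented names – the database also needs a MEMORY_OPTIMIZED_DATA filegroup before this will run:

        CREATE TABLE dbo.SessionState
        (
            SessionId   INT NOT NULL
                PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
            UserData    NVARCHAR(4000) NULL,
            LastTouched DATETIME2 NOT NULL
        )
        WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
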
  • An interesting view here:
    https://www.pg-versus-ms.com/
    (I have nothing to do with it, just to fuel the debate.)

  • In the money-no-object performance category, Oracle Exadata is quite something.

    I did 12 months on it. It’s impressive. No data-shunting fudge like PDW. Some predicates are dealt with in the storage systems, so there’s a load of main IO path stuff that never happens. In fact, counterintuitively, for some queries full scans are *faster* – brute force of the storage layer. Other stuff… the storage systems only send the requested columns up the main IO path, not the full row, so that’s neat. Oh, and bloom filters happen at the storage level too. But hey, developing on Oracle is horrible in many ways; glad to be back on home turf with MS-SQL.

    Going the other direction, MySQL. The thing that hobbles it is no parallelism. For pure fine-grained OLTP, maybe that’s ok. But for heavy lifting, it’s *slow*. All those cores, doing *nothing*, whilst one fries itself. One good thing though… you can do proper active-active clustering. It’s a bit of fudge (data duplication) compared with Oracle RAC, but it’s ACID compliant and works. That’s nice. But expensive of course (e.g. duplicating SSD for each node).

    But yeah, migrate? Really? From any RDBMS to any other, the dev cost is usually enough to make it a non-starter.

  • Just for note: common table expressions were introduced in MariaDB 10.2.1 (https://mariadb.com/kb/en/mariadb/mariadb-1021-release-notes/), and from my very limited testing they seem to work fine when moving from SQL Server. But I can definitely respect Erik when he says “have fun” – the MariaDB CTE feature is very new, and your mileage will likely vary in your conversion process depending on the complexity of the CTEs.

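    For context, the non-recursive form ports most cleanly – the same statement runs unchanged on SQL Server and MariaDB 10.2+ (table and column names invented for the example):

        WITH recent_orders AS (
            SELECT customer_id, COUNT(*) AS order_count
            FROM orders
            WHERE order_date >= '2016-01-01'
            GROUP BY customer_id
        )
        SELECT customer_id, order_count
        FROM recent_orders
        WHERE order_count > 10;
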
  • “Can we move non-relational data out of the database? (Like full text search, XML/JSON storage)”

    I would love to move full-text search out of SQL Server for a few reasons. Does anyone have any info or ideas on how to do this? Thanks

    • Elasticsearch is a popular option.

      • Have you done a lot with it? I have an application that uses it in its stack along with Redis, MySQL, and Postgres. I haven’t had the time to do anything with it other than making sure it’s running correctly. I’m sure I need to find the time to do more with it.

    • Tim – generally speaking, I discuss it with the developers, we agree to move it out of the database, and…that’s the end of it from the database angle. The developers go find the right platform to store the data, and since it’s not a database, the DBA stays out of it.

