What Do You Think About ORMs?

I was recently asked what I thought about ORMs (Entity Framework, NHibernate, Dapper, etc) while we were looking at implicit conversion warnings in execution plans.

Before I answered the question, I let them know that my answer is based on being a production DBA and the numerous performance problems I have had to troubleshoot that were caused by improper usage of an ORM.

What I don’t like about ORMs:

  • Implicit conversions due to nvarchar variables vs varchar columns
  • Queries grab every single column, regardless if they are needed or not
    • Expensive key lookups
    • Very wide missing index recommendations
  • The resulting query is hard to read

I first came across the implicit conversion problem 10 years ago and blogged about it (back then I didn’t realize the culprit was the ORM and instead blamed the database driver). I had been battling very high CPU utilization for a few weeks and finally figured out it was due to an implicit conversion that caused the plan to scan a very large index. Why did it take several weeks to figure this out? I’ll have to chalk that up to inexperience back then. I’m a much better performance tuner these days.

Based on the servers that I see as a consultant, implicit conversions is a very common problem. Sometimes it’s due to the ORM, but sometimes it’s due to a data type mismatch in the join conditions (Table1.ColumnA is varchar, Table2.ColumnA is nvarchar) or in the stored procedure (@var1 is nvarchar and is being compared to a varchar column). Avoid the data type mismatch issue by being diligent with the data types.

If you are using an ORM, be sure you know about the implicit conversion issue and how to work around it. Like I said, I’m a production DBA and not a developer. I can’t tell you how to fix it, I can tell you why it’s slow and how to find queries with this issue.

To help figure out if you are experiencing implicit conversions, look at the Warnings column when you run sp_BlitzCache. For more information, check this out. It links to a good article about implicit conversions.

This is not an anti-ORM blog post. This is the “I’m on-call 24×7 and am sick of being woken up at 2am because a query generated by an ORM is causing a severe performance problem” blog post.

Brent says: It’s not that ORMs are inherently bad – I’m usually for anything that gets products to market faster – but the problems hit when you scale the app without putting additional work into performance along the way. ORMs are like any other technical debt: as the app becomes popular, you’re gonna need to pay that debt back down sooner or later.

Previous Post
Last Season’s Performance Tuning Techniques: Slides & Demos
Next Post
sp_AllNightLog: Poll Vaulting

26 Comments. Leave new

  • remi bourgarel
    July 20, 2017 9:16 am

    You described the main problems, but ORM like Entity Framework have some tools for making it easier to overcome these problems : you can put your own sql and the orm just fills your object with it or you can even call a stored procedure. But I think the problem is not with ORM : the problem with RDBMS is that it not compatible with OOP or Layered architecture : if I create a data layer for loading with optimized SQL a Product object, I will always load all the columns because I don’t know what the caller will do with it, and the caller doesn’t know how I do it because there is abstraction and decoupling between layers. If you want to get rid of this you put your data layer directly under the GUI layer and every bit of SQL is done for one business purpose (“update the product name” , “update the product price” …). The money you’ll save in hosting will be spend in developpers and long time to market.

    Reply
  • As a long time application developer, having worked with multiple tools, languages, and frameworks, but never a real DBA, I’m a big fan of ORMs. Big, big fan.

    I spent around a decade working on a project in which I was the data modeler, and the main SQL guy and pseudo-DBA for a team of between that varied between 5 and 50 developers, ranging from very junior to relatively senior, and never went live during my time there. For most of that time, all database structure changes went through me, I kept all development and QA databases up-to-date, and I helped other team members write their SQL queries when they needed help. I’d say SQL was the biggest source of bugs, especially from junior developers.

    After that I was a developer on a project that used an ORM for all database access. Also, with a range of junior to senior developers, but database stuff was no longer a major source of bugs. The ORM simply eliminated most of them. Developers focused on the Java application code, with the database being mostly out-of-sight, out-of-mind, and getting data into or out of the database was relatively bug-free.

    Now I’m a developer on some legacy applications 90% of which was written before I joined the company, and most of the SQL is either hardcoded in Java strings or in stored procedures, some of which are thousands of lines long with little structure or indentation or consistent style. And again SQL is a huge source of bugs. If these apps used an ORM, we probably wouldn’t have most of them.

    From my experience, only a few, generally experienced, developers can write good SQL for applications, can get the joins right to return the right rows, and can structure their SQL so that it can be maintained as needs change.

    If I was given the task of starting a new application, as much as I enjoy programming in SQL, and I’m good at it myself, I’d still wave it good-bye and go with an ORM for sure.

    Reply
    • Rémi BOURGAREL
      July 21, 2017 3:35 am

      Indeed there is a lot of chance you’ll get better SQL with ORM than with average developper !

      Reply
  • Chris Porter
    July 20, 2017 12:27 pm

    My company uses an in-house built ORM, created mostly by a previous developer that wasn’t satisfied with the other systems available on the free market. Fast forward a couple years and some particular plans were found with very expensive VARCHAR to NVARCHAR implicit conversions. The source of the issue is how the ORM interacts with the SqlCommand and SqlParameter objects. When I dug deeper into the issue I found it very difficult to convert the use of SqlParameter with a DbType.String to use VARCHAR instead of NVARCHAR. Under the covers of Sql.Data, the use of NVARCHAR is a very painful behavior to override.

    Reply
  • Chris Porter
    July 20, 2017 12:30 pm

    On this topic, is anyone familiar with a method to get the number of rows a query with FETCH/OFFSET would return if those statements weren’t used? Our ORM uses a CTE to gather a COUNT(*) so we know how many rows would be returned and allow for proper paging between results. My research has led to me to nothing of value but it is a problem I still hope to solve. The CTE does it, as would a partition approach, but both have a performance hit I that shouldn’t be necessary.

    Reply
    • If I understand, you would like to have a way to know the count of rows that a particular sql statement (joins, where clauses, groupings, etc) would return without the sql engine actually performing the joins, filtering, and groupings?

      Reply
  • Thomas Cortes
    July 20, 2017 1:06 pm

    ORMs can be good as much as they can be bad :). It all depends on the project (size and time) .

    No matter what, ORM will have a cost somewhere. Easier to write but slower to perform as said in the articles and the comments.
    Pure SQL is harder and longer to write, but isn’t it a good idea to dig in and make sure it performs at its best ?

    I had a project when I passed through all these phases (looks like Brent was saying about debt):
    – ORM everywhere for everything
    – ORM with custom SQL / SP
    – No ORM, SP only and pure ADO.NET

    At first, I was pretty happy with the ORM I used (ORM Lite, second fastest after dapper few years ago). Using it hides all the SQL thingy that is not really fun to do (query and mapping). But it also hides the possibility to optimize.

    After few months and few thousands users more, SQL Server and the website started to eat a lot of CPU. So I started to write classic SQL queries and SP to replace the lovely expressions used with the ORM. The difference was noticeable, but the code started to look a bit ugly.

    When you write SP and queries yourself, maybe you’ll start to think: so I just use the ORM for mapping ? Why wouldn’t I do that myself. And that’s what I’ve done. It requires more time to write and is more error-prone and could end badly if you don’t structure your code correctly, but at the end it’s not that difficult to do and it has an incredible advantage: speed, can’t get any faster than barebone ADO.NET ^^.

    So yes, I basically wrote things 3 times ^^.

    Based on this experience, definitely not a fan of ORM. That said, I made numerous projects with EntityFramework, but limited to projects with dead simple queries and where performance was not a real problem (limited number of users).

    I could have made the choice of staying with the ORM, I would have paid more for servers to handle the requests, but instead I optimized everything and that’s why I discovered BrentOzar (thanks parameters sniffing ^^), it was a good decision, right ? XD

    Reply
  • We’ve previously used Linq-SQL and had mixed results. Some developers used it well and got good results out of it; those were the ones who used profiler to capture the generated SQL statements for anything more than a simple query, and then run it though management studio examining the execution plans.

    Those developers that just used it and hoped for the best, caused some serious problems (e.g. select * from Wide50kRowTable – no where clause, and then filter it in memory), and because we didn’t have fast-release cycle for the application code, it became very hard to track down and remedy problems in production. Tracing queries against the “core” schema was very hard – which bit of C# wrote this god-awful SQL??? – and we could not do anything but throw indexes at slow queries until we could arrange a release.
    Because of this no new development using ORM’s is allowed, everything is via Stored Procedures these days.

    Oh, and finding TransactionScope using serializable isolation level by default on the (pooled) connections wasn’t nice either, with procs every one has “set transaction isolation level read committed”, where as the linq-sql queries got stuck with serializable sometimes.

    Reply
  • Specific to the implicit unicode conversion issue, less about ORMS – at my last place I was at first cursed by my java devs that the db was slow on what should have been a singleton select by PK value, until I not only showed them the implicit conversion but showed them the connection property to change, it happens to be called sendStringParametersAsUnicode for both MSFT and jTDS drivers, because it’s ON BY DEFAULT. (Because strings in java are always unicode as I was told.) Here’s the MSFT and jTDS links respectively. https://docs.microsoft.com/en-us/sql/connect/jdbc/setting-the-connection-properties and http://jtds.sourceforge.net/faq.html (And yes, once they turned it off, everything as as fast as they wanted, I got to look like a hero for a short time, and there was much rejoicing.)

    Reply
  • I am a developer, fairly senior etc (i.e. not young). But I’ve always liked SQL as a language and I have just about enough understanding of, for example, indexes to keep me out of major trouble. Most other devs, even the senior ones, really have a dislike of SQL. So the ORM’s get used as a way of avoiding what is a core language after C# and JavaScript. I have worked for two companies who didn’t just run into performance problems, they had to rewrite all of their data access code. It easily becomes baked in. I think the core problem is a business one though. It’s very similar to security, and how bad most developers are at security basics. The problems arise so far down the line that it’s someone else’s problem. Devs have no incentive to want to solve the problem, because the wider business doesn’t. But I work for myself now so I make everyone use Linq2SQL, but use sprocs for 99% of the selects and use linq for inserts etc. Even though it’s deprecated it’s so much easier to use sprocs in strongly typed way than EF. You still keep a lot of the productivity of an ORM, but it’s easy to see the execution plan. Solves the problem for us but most devs would go nuts because it’s not “clean” or “elegant”. Critical thinking doesn’t come into it. But again it comes back to incentives, “clean code” looks good on CV and is great to talk about in an interview, being the “guy who argues with everyone because they don’t know what an execution plan is” is a trickier one 🙂

    Reply
  • With the advent of SQL Server Data Tools (SSDT) I don’t think there is much reason to use ORMs anymore. I still use Dapper, but mainly just to make sure I don’t miss any of the garbage collection code. I wrote a code generator that I put the schema(s) into and it spits out the code I need to call the stored procedures in the specified schema. This makes it so if I change anything in the stored procs I will be able to find it out really easily in my code. I think EF is supposed to do this but it doesn’t work with .NET Core. I’m a big fan of FSharp.Data.SqlClient too. But it doesn’t work with .NET Core either (until .NET Core 2.0 is released – then it should be supported.).

    Since back end code has become so light weight now that everything is put into SPAs on the front end I don’t think a knowledge of C# is as important as it used to be (still important but lower on the list). I think the knowledge for languages should be SQL, JavaScript, C#/F# (if you have any complicated logic it should go into F#). If you only do back end then you can cut out the JS. I actually generate the JSON in the SQL too, unless the code is really complex then I’ll bump it into C#, but it is extremely rare that it is, if I have to hit multiple databases then that would be a problem and I would just return tables.

    Having said all that I do think SSDT does need some serious love from Microsoft. The static analyzer for the code misses quite a bit of errors and some of the errors aren’t caught until deploying the code.

    Another reason to only do stored procedures is that if you need to refactor a table you will know exactly what stored procedures/functions need to change also (using SSDT). They are like the API to your database, a necessary decoupling. I put each application’s code into its own schema. Ideally each app would own its own database on its own server but that’s not always reality.

    I’ll put the link to the code generation if anyone is interested. I still haven’t released it yet but my boss said it would be fine. It needs some refining but it works for my limited needs right now.

    My last comment on SQL :-). Part of the problem of SQL itself is that it doesn’t always work the best for code reuse. I have some code that I need to use for multiple languages in a single table and I need to sort by different collations. I want to only have a single place where I write the different possible languages. But, is cumbersome in SQL and slow. So, I ended up having to write a PowerShell script that would search for that code and replace it with the update. Not ideal, but it’s the limitations of SQL.

    Also, occasionally I want to reuse code but then I would need to put an option (recompile) on it, which can be a pain depending on what the code is doing.

    Reply
    • Jon – what does SSDT have to do with application code like Java?

      Reply
      • Brent,

        If I understand what you are asking. If you use SSDT (which means you are using SQL Server) then SSDT is independent of Java. You can then use a code generator or ORM in Java that can create the types you need to interact with the stored procedures you created.

        I’m not that familiar with Java, so I don’t know what it offers in regards to ORMs as this post is talking about and my comment was talking about.

        Reply
        • Jon – using SSDT doesn’t mean you use stored procedures.

          You can use stored procedures without using SSDT, and you can use SSDT without using stored procedures.

          I think what you meant to say is, “With the advent of stored procedures, I don’t think there is much reason to use ORMs anymore.” I wouldn’t say that’s true either – the beauty of ORMs is that you can do your development without having to do T-SQL work. If you don’t have a stored procedure developer on board, then ORMs let you bypass that expensive complexity.

          I’m with you in that if every shop had great T-SQL developers who were awesome at building stored procedures for every access of every object, you could get by without ORMs. That’s just not the reality today, though.

          Reply
          • Brent,

            What I mean to say is that SSDT allows you to have a good way to track your code through Git, et. al. It also allows you to have some form of static typing so when you do make mistakes in your code it will tell you before you deploy (but like I said in my original comment it is far from perfect and needs a lot of work). So, I meant to say what I said. I’m not sure why what I wrote isn’t conveying what I mean.

            I know you don’t need SSDT to write and use Stored Procedures.

            I’m also advocating that if you are a developer SQL should be one of your top languages you know. Yes, they can be complex sometimes and it would be nice to have a complete rewrite of SQL to make it easier to code in. But most queries are fairly simple. ORMs don’t take away that much complexity, it is just a matter of sitting down and learning it. And if you do need have a more complex query you will need to write it in SQL regardless. Developers are smart people, they can figure it out.

          • Jon – gotcha. I understand that you feel that “if you are a developer SQL should be one of your top languages you know” – but managers would often trade development speed for elegant code.

          • Brent, You can develop pretty fast with SQL, especially now that you can return JSON in SQL Server. You have hot code swapping. You have SSDT. All these tools push development towards SQL rather than apps. Maybe in the past it was more difficult to develop with SQL but those days are gone. There are still some strides that SQL Server tools can make, but they have definitely improved to the point where you might as well develop with native SQL.

            A quick question, if you are still reading my responses :-). Is there a light weight SQL Server-like database I could use on Linux? I will be developing an app that will need a database on the same machine, it will be pretty light weight database so I don’t need full SQL Server but the new SQL Server 2017 is still a heavy weight. I was thinking of using MySQL since the rest of my team uses it but my boss said to research to see if there is SQL Server light weight database I could use.

          • Jon – no, we focus on SQL Server here. Thanks!

  • People who are struggling with ORM limitations should take a hard look at LLBLGen Pro. It’s been around for more than 10 years and is a mature product with tons of great features. It does not have the issue with implicit conversions that you are describing here.

    Reply
    • The problem is that as a production DBA, and now a consultant, I don’t have a say in what the developers use. I used to have a say but rapid development became more important. So I’m struggling to support the databases that are used by applications that are using an ORM. I’ve never heard of LLBLGen, but I’m not a developer.

      Reply
      • @Tara,

        The cool thing about developing directly with stored procedures is hot code swapping. Talk about rapid! 🙂 I guess a lot of places have their databases locked down. But if you don’t or at least allow access to specific schemas then it can be much faster to just develop directly in SQL.

        You can tell I’m enjoying programming in SQL :-). Although it would be nice if SQL were a little more flexible it isn’t that bad of a language!

        Reply
  • Pim Brouwers
    July 25, 2017 8:01 am

    Great article. Nothing like an ORM discussion to spurn a heated debate!

    My take is this, ORMs as a means of generating SQL are bad bad bad. My history with them has shown my exactly all of the major problems you’ve outlined above (implicit conversions, over selection etc.).

    In almost all cases, I use sprocs. How I invoke those sprocs is by using plain-old ADO.NET (SqlDataReader obviously!) and something like FastMember (or reflection if you like things to go slow) to convert them representational objects. All abstracted away in a tiny helper to stay DRY.

    The only major problem with this approach is that there’s a lot of extra app-level coding involved to generate the classes needed. You could always use dynamics, but history has shown me it’s more trouble than it’s worth the further you go up the stack.

    In this sense, you could say that I’m still using an “ORM”. However, given that all of the SQL executed is sproc-based. The majority of the “evil” is taken away.

    Thoughts?

    Reply
  • An ORM is a tool, and like any tool it can be used well or not. But when it generates code like

    CAST( CAST( cast(1 as tinyint) ))

    I am immediately skeptical.

    Reply
  • My issue is this… We have an ENORMOUS data warehouse that most of our c# applications interface with. We are talking millions of rows with hundreds of columns in the fact tables.. Maybe I’m paranoid, but I don’t trust an ORM to efficiently navigate these data structures in our db’s, and even less, the developers. For simple db’s, I can see the value in full ORM implementation like EF letting it handle gets, insert, updates, deletes… Has anyone used EF right out of the box on a VLDB?

    Reply
  • Just on this:-

    > Queries grab every single column, regardless if they are needed or not

    That’s certainly not implicitly true. In LINQ for example, developers can and should control what columns are coming back. The problem is as it was even before ORM’s existed. Many developers, using high level languages are not sufficiently interested in what goes on under the covers. This is true in OS terms, networking terms, memory terms and most definitely SQL/Oracle etc terms

    It’s perfectly possible for a sufficiently motivated developer to make an ORM approach fast – but you need to actually decide to do that rather than expecting a fairy to do it for you

    Reply
    • Simon,

      We know it’s possible, but all too often we find people just NOT doing it.

      It’s either get-it-out-the-door development, not being sure what columns they need, and unfortunately, we still run into people who either don’t know how to pick columns, or don’t know that choosing every column in a table is bad.

      So while it’s not implicitly true, we often find that it’s true in practice.

      It’s time to have a long talk with your people 🙂

      Thanks!

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}